Category Archives: data mining

33 bits, 5 Degrees of Separation, Facebook and You

Now, I am sure many of you are wondering what in the hell I am talking about when it comes to the 33 bits and 5 Degrees of Separation in the title of this post. Let me explain.

There are 6.6 billion people in the world. So in order to figure out who you are, only 33 bits of data about you (or any person) are needed (more accurately it is 32.6 bits, but who’s counting?). People online still act under the false assumption that they have some level of anonymity, which may have been true to a degree only a few years ago. But in today’s world of cheap computing power and massive databases of user information, actions, likes, dislikes, fan pages, etc., it is becoming easier and easier to figure out the who, what, when, where, why and how of YOU.
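For anyone who wants to check the arithmetic, here is a rough sketch of where the 32.6 comes from: it is simply the base-2 logarithm of the population. The second half shows how quickly individual facts eat into that budget. The traits and their fractions are made up purely for illustration, and adding their bits together assumes the traits are independent of each other, which real data rarely is.

```python
import math

WORLD_POPULATION = 6.6e9  # the figure used in this post


def bits_to_identify(population):
    """Bits needed to single out one member of a population of this size."""
    return math.log2(population)


def bits_revealed(fraction_sharing_trait):
    """Bits of information learned from a trait shared by this fraction of people."""
    return -math.log2(fraction_sharing_trait)


needed = bits_to_identify(WORLD_POPULATION)
print(f"Bits needed to identify one person: {needed:.1f}")  # 32.6

# Hypothetical traits with invented fractions (assumptions, not real statistics).
hypothetical_traits = {
    "lives in one particular city": 1 / 1000,
    "born on one particular day of the year": 1 / 365,
    "gender": 1 / 2,
}

# Summing the bits is only valid if the traits are independent (an assumption).
revealed = sum(bits_revealed(f) for f in hypothetical_traits.values())
print(f"Bits revealed by these traits: {revealed:.1f}")
print(f"Bits still missing: {needed - revealed:.1f}")
```

The takeaway is that each fact only has to halve the pool of candidates to be worth a full bit, so a handful of ordinary details covers a large share of the 33.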

Now let’s look at the 5 degrees, which refers to the 5 degrees of separation.  This is in reference to an article I just read about Twitter and how the average separation of users is just 5 (4.67 to be exact).  Granted, most people probably think of the famous Six Degrees of Separation which mainly became popular with the Six Degrees of Kevin Bacon.  But all fun aside, this helps to bolster the argument that privacy and anonymity on the Internet may be a thing of the past soon, if not already.

This is where Facebook and you get involved. Recently Facebook has launched a few new services that impact these two concepts greatly. First is their “Like” everything model (which I even have on this site below this post). This allows sites to add the button to everything, be it blog posts or items on an e-commerce site. The somewhat scary part is that this data is sent to Facebook and its partners, and in some cases added to their data warehouses. Now you are helping to paint an even more detailed picture of the things you “like” and the sites you visit without the need for cookies or other tracking means. Basically you are tracking yourself. So “like” at your own risk.

Another service that seems to be growing in popularity exponentially is Foursquare. This is a location-based service where people “check in” when they are out and about town, using their GPS-enabled cell phones. So now we are voluntarily tracking our literal movements and posting what we do and like for others to see. The scary part is that I have been reading about Facebook’s disregard for privacy and for users’ say in the sharing of what they do. So getting those 33 bits of entropy is not very hard at all and is becoming easier by the day.

There is a popular reaction from most people, who say, “well, I am not doing anything wrong”. This is one of the most annoying things people can say on the topic, since it shows no concern for their own freedoms. I am sure many Londoners said the same thing when a CCTV network was installed across the whole city, and for what?

Common Data Mining Techniques

I was reading an interesting article that is a good introduction to popular data mining techniques. The true beauty of the study of data mining is that it is not really about the tools used. It is more about what you do with these tools, in terms of how you tie them together and build the models. You could compare it to a painter. Anyone can go out and get an easel and paint. But not everyone can paint a wonderful work of art.

Marketing and sales professionals are beginning to capture and analyze many different types of customer data—attitudinal, behavioral, and transactional—related to purchasing and product preferences to make predictions about future buying behavior.

Today’s challenging environment is forcing more organizations to explore predictive analytics. Commonly used by market researchers when analyzing survey data, predictive analytics can also be applied in real-time scenarios, such as personalizing offers to customers or improving an online customer experience.

There are various approaches to predictive analytics, and most depend on clean databases and the ability to mine data to look for patterns or to create classifications. It is important to understand the various approaches so you know when to use which one.

Read more (off site)