In 2012, the New York Times published an article detailing how Target collected data on its shoppers to determine whether they were pregnant. By examining a woman’s shopping patterns, Target could determine whether she was expecting and then send her promotional material and coupons. Target became a pioneer in advanced data mining and personalized marketing. Before the age of the Internet, marketers advertised to the general public. Since the introduction of social media, online shopping and GPS, marketing has become personal.
The problem many companies face is not that they have too little of your data; it’s that they have too much. This is known as data exhaust, and it’s the reason companies hire scores of data analysts and marketers to analyze this metadata. In 2010, we as a species were creating more data per day than we had created from the beginning of time until 2003. In 2015 alone, 76 exabytes of data will travel across the Internet.
Companies collect this data using various methods (which I’ll talk about shortly), and begin to build profiles around each consumer. For every consumer who uses a credit card, downloads an app, or sends a text, there is some corporate record of that transaction. You might be log #688330 in Target’s database or customer #0023837 within Uber’s system. As time passes, companies can begin to build big data sets.
Big data sets derive value, in part, from the inferences that can be made from them. Some of these can be obvious. If you have someone’s detailed location data over the course of a year, you can infer what their favorite stores are. If you have a list of their calls and emails, you can infer who their friends are. Some companies, as in the case of Target, can discover more subtle facts about you. Ethnicity, religion, drinking history, sexual orientation: these are all inferences companies can draw from your transactions and actions.
It may seem quite surprising that a team of data analysts could draw so many inferences from your online data, but there is an even more surprising number of ways companies can collect this data.
Location, Location, Location
In recent years, collecting location-based data has been a proverbial gold rush for corporations. Practically everyone and their mother has a smartphone with GPS location tracking. Location-based apps have given us a multitude of benefits, from online delivery to ride-sharing services. However, many companies aren’t exactly transparent about when they’re collecting your location data.
In 2012, the free flashlight app Brightest Flashlight Free came under scrutiny when it was found to be collecting location data from its 50 million users and selling it to third parties. The app’s makers had snuck a clause into the user agreement that almost all of its users failed to read. The conflict was resolved when the US Federal Trade Commission got involved, but it showed consumers that even something as simple as a flashlight app might be collecting information on its users.
Other apps and services have taken similarly sneaky approaches to collecting your data. In 2013, Jay-Z and Samsung teamed up to offer people who downloaded an app the ability to listen to Jay-Z’s new album early. However, the app required the ability to view all accounts on the phone, track the phone’s location and track who the user was talking to. Amazon quietly and constantly collects location data on you via Kindle. The Angry Birds app even collects location data when the app is off.
With this location data, marketers can use a technique called “geofencing” to identify people who are near a particular business and deliver an ad to them. A single geofencing company, Placecast, delivers location-based ads to ten million phones for retailers like Kmart, Starbucks, and Subway. Microsoft does the same thing to people passing within ten miles of its stores.
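To make the idea concrete, here is a rough sketch of what a geofence check might look like. It is not how Placecast or any real ad network works; the coordinates, phone IDs and the phones_inside_fence helper are all made up for illustration. It simply measures the distance from a store to each phone and keeps the ones inside a chosen radius.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def phones_inside_fence(store, radius_km, phones):
    """Return the IDs of phones whose last known position is within radius_km of the store."""
    store_lat, store_lon = store
    return [
        phone_id
        for phone_id, (lat, lon) in phones.items()
        if haversine_km(store_lat, store_lon, lat, lon) <= radius_km
    ]


# Hypothetical data: one store and the last reported location of three phones.
store = (40.7580, -73.9855)
phones = {
    "phone-a": (40.7590, -73.9845),   # about a block away
    "phone-b": (40.6892, -74.0445),   # a few miles away
    "phone-c": (34.0522, -118.2437),  # on the other side of the country
}

nearby = phones_inside_fence(store, 1.0, phones)      # tight one-kilometer fence
within_ten_miles = phones_inside_fence(store, 16.1, phones)  # ten miles is roughly 16.1 km
print(nearby, within_ten_miles)
```

A real system would also have to deal with stale or imprecise location fixes, but the core of the technique is just this distance test repeated for every phone.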
Some retailers take it a step further and actually track your physical location within their stores. Using a combination of Bluetooth IDs, MAC addresses and security cameras, retailers can view which aisles you walk down and which displays you stop at. The goal of this monitoring is to combine the patterns of hundreds of shoppers into a heat map, which can reveal the effectiveness of certain displays and aisles.
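As a simplified illustration of how such a heat map could be assembled, the sketch below divides a store floor into a grid and counts how many position samples land in each cell. Everything here is an assumption for the sake of the example: the build_heat_map function, the one-meter cells, and the idea that each shopper's path arrives as a list of (x, y) positions sampled at regular intervals.

```python
from collections import Counter


def build_heat_map(paths, cell_size_m=1.0):
    """Count position samples per grid cell across all shopper paths.

    Each path is a list of (x, y) positions in meters. If samples are taken
    at regular intervals, a higher count means more time spent in that cell.
    """
    counts = Counter()
    for path in paths:
        for x, y in path:
            cell = (int(x // cell_size_m), int(y // cell_size_m))
            counts[cell] += 1
    return counts


# Hypothetical paths for two shoppers walking down the same aisle.
paths = [
    [(0.5, 0.5), (1.5, 0.5), (2.5, 0.5)],
    [(0.2, 0.8), (1.1, 0.4), (1.9, 0.6)],  # lingers in the second cell
]

heat = build_heat_map(paths)
hottest_cell, hottest_count = heat.most_common(1)[0]
print(hottest_cell, hottest_count)
```

With hundreds of shoppers instead of two, the cells with the highest counts would point to the displays where people tend to stop.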
Clicks and Quizzes
Location-based data collection is only one facet of how companies can collect your information. Another tool data miners use is the “cookie.” A cookie is a small piece of data that a website stores in your web browser, which lets the site recognize you and keep a record of the pages you’ve visited and the links you’ve clicked on. For a standard fee, companies can set their own cookies on pages belonging to other sites. This is known as the “third-party cookie.” Have you ever wondered why you keep seeing the same ad over and over again, no matter what website you’re on? This is a third-party cookie tracking your behavior across multiple sites. Companies like Rubicon Project and DoubleClick help advertisers target individual users across multiple sites.
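Here is a toy simulation of how a third-party cookie lets one company follow you from site to site. It is only a sketch of the general mechanism, not the actual system used by any ad company; the Tracker class, the tracker_id cookie name and the example site names are all invented. The key point is that the same ID comes back to the tracker from every page that embeds its content.

```python
import uuid
from collections import defaultdict


class Tracker:
    """A pretend ad company whose content is embedded on many unrelated sites."""

    def __init__(self):
        # Maps each cookie ID to the list of sites where that browser was seen.
        self.history = defaultdict(list)

    def serve_ad(self, browser_cookies, site):
        """Handle an ad request made while the browser is visiting `site`."""
        cookie_id = browser_cookies.get("tracker_id")
        if cookie_id is None:
            # First time this browser is seen: assign a random ID and set the cookie.
            cookie_id = uuid.uuid4().hex
            browser_cookies["tracker_id"] = cookie_id
        self.history[cookie_id].append(site)
        return cookie_id


tracker = Tracker()
browser = {}  # this browser's cookies for the tracker's domain

# The same browser visits three unrelated sites that all embed the tracker.
for site in ["news.example", "shoes.example", "recipes.example"]:
    tracker.serve_ad(browser, site)

profile = tracker.history[browser["tracker_id"]]
print(profile)
```

None of the three sites shares anything with the others directly, yet the tracker ends up with a single browsing history tied to one ID.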
One of the easiest ways for companies to collect data is to simply have you give it to them. Believe it or not, BuzzFeed has saved almost every answer and response to a majority of its quizzes. BuzzFeed can then take this free information and sell it to third parties without the quiz taker’s knowledge. Most of the information garnered from these quizzes is useless (e.g. Which Ousted Arab Spring Ruler Are You?), yet some of it can provide valuable information on an individual’s financial status and identity. In 2014 BuzzFeed published a quiz titled “How Privileged Are You?” containing questions regarding job security, sexual orientation, race and a multitude of other characteristics. Other websites like WebMD and Trivago collect health and travel information on their users. This is the kind of information marketers work so hard to collect, analyze and utilize; websites like BuzzFeed have just made their job a lot easier.
*Much of the information used in this blog comes from Bruce Schneier’s 2015 Data and Goliath.