What the Cambridge Analytica story and last week’s congressional hearings with Facebook’s CEO are really about is that people (and not just social media users, voters and policymakers) are waking up to the meaning of “big data.”
It’s a big story, and not only because Facebook has more than 2.2 billion users or because Cambridge Analytica may have helped Donald Trump become president, as mind-bending as that data point and that possibility are. It’s big because it’s personal. People weren’t going to understand the implications of big data until Facebook was in the story. Countless retailers, banks, publishers, campaigns, governments and bad actors benefit from big data, but Facebook brought it home to us because the data there is so visibly us: our own and our significant others’ everyday likes and lives, in our own words, photos and videos, posted by us.
Of course, “big data” has been about a lot more than that for a long time. But add the personal part to the results of two game-changing votes in the UK and US, and to the confusing mix of political news, information, misinformation, disinformation and advertising on Facebook that appeared to affect those votes, and you get what may well turn out to be a story we’ll tell our grandchildren as well as our children.
So here are some talking points for family and classroom conversations about this pivotal moment:
First, what is “big data”? Well, the dictionary definition is: “extremely large data sets that may be analyzed computationally [like with machine learning] to reveal patterns, trends, and associations, especially relating to human behavior and interactions.” Data is just information, and it comes in all kinds of forms: text, numbers, photos, videos, etc. Not all of it needs to stay private. What we’re finding out, though, is that it’s hard to know how much people and companies can learn about us when the kind of data that’s fine to make public gets blended with other data that’s stored or private. That unknown concerns us, which is why we’re hearing more and more calls for “transparency.” So you can tell from the definition that “big data” is about a whole lot more than lots of information; it’s more about what can be discovered from the data than about the data itself. Those discoveries can be put to all kinds of uses, good and bad: banks finding patterns of fraud, governments stopping infectious diseases from spreading, and companies like Cambridge Analytica using people’s information to create and place ads aimed at getting them to vote a certain way.
So is social media big data? It’s only part of it. It’s just the very visible part that regular people like us contribute to. When we post comments, photos and videos, “like” others’ content, click on ads, buy things online, visit other sites, etc., we’re adding all kinds of information (called “psychographic data,” which I’ll explain in a minute) to the databases at social media companies. Sometimes that information ends up elsewhere too, whether unethically, criminally or just mistakenly. That’s what happened with Cambridge Analytica, which obtained data on as many as 87 million people from someone who Facebook says violated its policy. Facebook says it doesn’t sell data to other companies. It makes its money from advertisers, who use its ad placement system, which draws on our detailed data, to put their ads on the pages of users who will really like the ads (and maybe buy the thing being advertised). Does that make sense? All that detailed information we share, along with the technology I’ll tell you about in a minute, makes it possible for advertising to be more relevant, or more “highly targeted,” than ever before in the history of advertising. That makes it more valuable than ever to advertisers, because it’s more likely to lead to a purchase. Some companies, called data brokers, do sell our data, so that the buyers will have even more information about us to help them get even better at placing ads that will make us want to buy stuff.