
The perils of big data

How was Cambridge Analytica able to glean the psychological tendencies of Facebook users through likes?
April 4, 2018
Facebook is in trouble. For the first time in six years, its stock price fell below $160, and the Federal Trade Commission has launched an investigation. Congress is asking CEO Mark Zuckerberg to testify at a hearing. The data breach crisis appears to be spreading to the entire industry, with some calling for the chief executives of Google and Twitter to testify as well if Zuckerberg appears.

At the core of the problem is a leak of user information. Because of Facebook’s lax management policies, the personal information of countless users fell into the wrong hands. Many have reacted strongly because this leak is different from the information leaks we are familiar with.

What we usually consider sensitive personal information includes ID numbers, contact information, credit card numbers and passwords. When these are stolen, the people affected can suffer serious, direct damage.

But the information leaked through Facebook is different. It consists of seemingly insignificant data points that anyone can see and access — reactions to posts, page likes, comments — yet together they are powerful pieces of information.

Cambridge Analytica, the company at the center of the controversy, analyzed the activity of tens of millions of Facebook users and provided the information to Donald Trump’s campaign during the 2016 election. After his victory, the U.S. government detected signs of Russian meddling in the election, and there are allegations that Cambridge Analytica played a part in it.

If the suspicion proves true, what Cambridge Analytica provided to the campaign will be crucial evidence in the ongoing investigation into Trump’s connections to Russia.

How was Cambridge Analytica able to glean the psychological tendencies of Facebook users through likes? Simply put, it is the power of big data. Facebook has more than two billion users worldwide, and likes may seem like trivial information. But the sheer size of this data pool reveals the tendencies of users with an accuracy that could not have been imagined in the past.

What is noteworthy here is that in the era of big data, not much information about an individual is needed to understand his or her character. According to one study, an analysis of just 10 of a person’s Facebook likes can describe that person more accurately than a colleague can. To do better than a friend, a computer needs only 70 likes. Exceeding a parent’s judgment requires 150, and with 300 likes, the computer can outperform a spouse. This is possible because of the efficiency of big data.

Facebook, which had been lax in managing its users’ information, recognized the danger of big data abuse in 2014 and revised its policy. But the data of 50 million users had already been leaked; it was thoroughly analyzed and then used for targeted advertisements during the 2016 election.

What can we learn from the crisis? We need to understand that the problem of big data cannot be solved simply with tighter management, as previous personal information leaks were. Facebook cannot offer a clear solution because providing user information, collected in the form of big data, to third parties is its business model.

This is not an issue limited to Facebook. To varying degrees, global platforms like Amazon, Google and Twitter share similar information with other companies. In the age of the fourth industrial revolution, there is virtually no way to completely stop the proliferation of big data.

There needs to be a social consensus on how user information is collected and managed. Since the most valuable resource in the world is no longer petroleum but user information, every company is eager to obtain and exploit it. Once collected, user information does not disappear easily, and it is difficult to verify who has received it.

Facebook initially announced that it had tracked down and destroyed all the user information Cambridge Analytica had transferred to third parties, but that turned out to be untrue. To avoid repeating the fiasco, the process of collecting and providing information must be made transparent, and users must be given more control over it. This applies not only to Facebook but to many other companies as well.


Translation by the Korea JoongAng Daily staff.

JoongAng Ilbo, April 3, Page 29

*The author is a tech commentator and director of the content lab at Mediati.

Park Sang-hyun