Cyber Security

Personal Data Scraping: The Social Media Scandal Worth Caring About

Leo Hynett

If someone were to compile a database of all the information you share on different social media platforms, how much would they know about you?

Last month, a hacker who goes by the name of Tom Liner did exactly that. He took the names of 700 million LinkedIn users from around the world and scraped their other social media accounts for further information to add to the enormous list. Once he had gathered phone numbers, email addresses, and other information, he made the database available for purchase. Unfortunately for those whose data is on the list, he has already had multiple buyers.

Buying the database

A copy of the database can be purchased for around $5,000 (£3,600). Tom Liner won’t disclose the details or the motivations of people who have purchased copies, but he has said that ‘the data is likely being used for further malicious hacking campaigns.’

PrivacySharks, a company specialising in cyber security and online privacy, has highlighted that the database could also be used by companies wishing to better target their advertising. Many of the likely uses are more malicious than that, though, ranging from spam emailing to identity theft. Although the database doesn’t contain details such as credit card numbers, skilled hackers may be able to track down more sensitive information using what they have already obtained.

Tom Liner has said it does ‘bother him’ that the database will likely be used for malicious purposes, but he would not tell the BBC journalist he was in contact with why he continues the operation. He has stated that he does this ‘for fun’ and as a hobby, so perhaps it was done simply to prove it’s possible.

To prove the legitimacy of the database’s contents, Tom Liner ‘provided a sample of 1m records which PrivacySharks’ researchers analyzed for authenticity to discover that they contained a wealth of personal information including full names, gender, email addresses, phone numbers and industry information.’ PrivacySharks has not been able to verify the entire database, as this would require purchasing a copy of the stolen data.

It seems much of the data came from LinkedIn, but the company has issued a statement saying that this is ‘not a data breach and no private LinkedIn member data was exposed.’ LinkedIn is, unfortunately, no stranger to data breach accusations, having faced a similar issue in April of this year. In both instances, the data taken from its site was publicly available, so its use did not constitute a breach of privacy.

Publicly available data

The data in Tom Liner’s database was all freely available and is the kind of information that one could find simply through strategic web searches. The existence of the database does not highlight the risk of data breaches; instead, it shines a light on the quantity of information that people share online without thinking twice.

Many people list their birthday on social media so that their friends remember it, or share posts to celebrate their anniversary – these are perfectly normal social media habits, but if your security questions on your email are ‘what is your date of birth?’ or ‘when is your anniversary?’, you may have just unwittingly given someone a way into your inbox. These are tiny things we often don’t think about, but this database has proven that they can add up.
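To make the ‘adding up’ concrete, here is a minimal Python sketch of how harmless-looking details from separate profiles combine into answers to common security questions. Everything in it is an assumption for illustration: the platform names, the person, the profile fields and the list of security questions are all invented, and it says nothing about how Tom Liner’s database was actually built.

```python
# Hypothetical public profile snippets; every name and value here is made up.
PUBLIC_PROFILES = [
    {"platform": "network_a", "name": "Alex Example", "employer": "Acme Ltd"},
    {"platform": "network_b", "name": "Alex Example", "birthday": "1990-04-12"},
    {"platform": "network_c", "name": "Alex Example", "anniversary": "2018-06-30"},
]

# Hypothetical mapping from a security question to the profile field that answers it.
SECURITY_QUESTIONS = {
    "What is your date of birth?": "birthday",
    "When is your anniversary?": "anniversary",
    "What was the name of your first pet?": "first_pet",
}


def aggregate(profiles):
    """Merge per-platform snippets into one record per person, keyed by name."""
    merged = {}
    for profile in profiles:
        record = merged.setdefault(profile["name"], {})
        for key, value in profile.items():
            if key not in ("platform", "name"):
                record[key] = value
    return merged


def exposed_questions(record, questions=SECURITY_QUESTIONS):
    """Return the security questions whose answers appear in the merged record."""
    return [q for q, field_name in questions.items() if field_name in record]


people = aggregate(PUBLIC_PROFILES)
exposed = exposed_questions(people["Alex Example"])
print(exposed)
```

No single profile in this example answers both questions, but the merged record does. Running the same check against your own public details is one way to decide which fields to hide or which security questions to change.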

Sharing these small details was not always such an issue – after all, compiling all your information would take even a determined person quite some time. It’s only thanks to powerful scraping algorithms that such a vast database can be compiled in just a matter of months. Data scraping is now, unfortunately, incredibly easy.

LinkedIn is far from the only target of scraping activities such as these. Facebook has also been hit over the past few months, with the latest incident being in April. In this incident, 33 million users had their data compromised, ‘including Facebook IDs, full names, phone numbers, and dates of birth.’ Since then, Facebook has sought to frame data leaks as a normal aspect of the industry. This came to light after an internal memo was leaked in which Facebook acknowledged that these leaks will continue to form a normal part of online life.

What can you do to protect your data?

Enabling two-factor authentication on your accounts is good practice, as is frequently changing passwords. And though it may be tempting to have the same password on multiple accounts, avoid this where you can. Microsoft Regional Director Troy Hunt has created a site, Have I Been Pwned, where you can enter your email address to check whether it has been associated with any data breaches.
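If you keep your own list of logins, spotting reused passwords is easy to automate. The following is a minimal sketch, assuming a hypothetical dictionary that maps account names to passwords; the function name and the sample accounts are invented for illustration, and a password manager’s built-in audit would do the same job more safely.

```python
import hashlib
from collections import defaultdict


def find_reused_passwords(accounts):
    """Return groups of account names that share the same password.

    `accounts` is a hypothetical mapping of account name -> password.
    Passwords are grouped by SHA-256 fingerprint so the grouping keys
    are not the plaintext passwords themselves.
    """
    groups = defaultdict(list)
    for account, password in accounts.items():
        digest = hashlib.sha256(password.encode("utf-8")).hexdigest()
        groups[digest].append(account)
    # Keep only fingerprints that are shared by more than one account.
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Made-up sample data for illustration only.
sample_accounts = {
    "email": "hunter2",
    "bank": "correct horse battery staple",
    "forum": "hunter2",
}
reused = find_reused_passwords(sample_accounts)
print(reused)
```

Any group this returns is a set of accounts that would all be exposed if one of them were compromised, so those are the passwords to change first.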

However, this is only relevant to data that sites deem private. If you are worried about data scraping, it is worth considering how much data you make freely available about yourself. You can make entire profiles private on Facebook and Instagram, and some platforms allow users to customise which data is public. Twitter, for example, makes it possible to share the day and month of your birthday but hide the year, so people know to wish you a happy birthday without having access to your full date of birth.

It may also be worth brushing up on the most common scams as they have increased in frequency during the pandemic. Some scammers are even creating attacks that resemble vaccine invitations, so always verify senders before entering any personal details.


The database of 700 million social media users created by Tom Liner consists entirely of publicly available data. According to the firms involved, it therefore does not represent a breach of privacy.

Aggregated publicly available data can still be enough to provide hackers with a way to access more sensitive information, so it is always worth considering what information you make available online and setting profile information to private where you can.

About the Author: Leo Hynett

Leo Hynett is a contributing Features Writer, with a particular interest in Culture, the Arts and LGBTQ+ Politics.