The security research team at Comparitech today disclosed how an unsecured database left almost 235 million Instagram, TikTok and YouTube user profiles exposed online in what can only be described as a massive data leak.
Recently there has been a spate of reports concerning account data appearing on dark web cybercrime forums, from the dark web audit suggesting there are currently 15 billion stolen logins from 100,000 breaches out there to the hacker giving away 386 million stolen records for free. Not all of this data will have been hacked, at least not in the usual sense of the word: some, as was likely the case in the Utah Gun Exchange incident, will have been exposed by an unsecured database.
The unsecured database problem
Unsecured databases are fast becoming such a huge data protection problem that it's thought a vigilante security researcher is behind the spate of "Meow" attacks that have overwritten the indexes of thousands of such databases. And it was such an unsecured database that the Comparitech researchers, led by Bob Diachenko, discovered on August 1, leaving the personal profile data of nearly 235 million Instagram, TikTok and YouTube users up for grabs.
The data was spread across several datasets. The two most significant, each coming in at just under 100 million records, contained profile data apparently scraped from Instagram. The third-largest was a dataset of some 42 million TikTok users, followed by just under 4 million YouTube user profiles.
Comparitech says that, based on the samples it collected, one in five records contained either a telephone number or email address. Every record also included at least some, and sometimes all, of the following information:
Full real name
Statistics about follower engagement, including:
Number of followers
Follower growth rate
Last post timestamp
"The information would probably be most valuable to spammers and cybercriminals running phishing campaigns," Paul Bischoff, Comparitech editor, says. "Even though the data is publicly accessible, the fact that it was leaked in aggregate as a well-structured database makes it much more valuable than each profile would be in isolation," Bischoff adds. Indeed, Bischoff told me that it would be easy for a bot to use the database to post targeted spam comments on any Instagram profile matching criteria such as gender, age or number of followers.
Tracing the source of the leaked data
So, where did all this data originate? The researchers suggest that the evidence, including dataset names, pointed to a company called Deep Social. However, Deep Social was banned by both Facebook and Instagram in 2018 after scraping user profile data. The company was wound down sometime after this.
A Facebook company spokesperson told me that "scraping people's information from Instagram is a clear violation of our policies. We revoked Deep Social's access to our platform in June 2018 and sent a legal notice prohibiting any further data collection."
Once the researchers found the database and the clues to its origin, "we sent an alert to Deep Social, assuming the data belonged to them," Bischoff says. The administrators of Deep Social then forwarded the disclosure to a Hong Kong-registered social media influencer data-marketing company called Social Data. "Social Data shut down the database about three hours after our initial email," Bischoff says.
Social Data responds to the database exposure incident
Social Data has denied any connection between itself and Deep Social, according to the Comparitech report. It should also be made clear that the leaked data, public social media profile data, is available to anyone who visits the accounts of the users concerned. However, the phishing risk is clearly amplified once such a hoard of profiles is collected together in a well-structured database. It isn't known at this time how long the database was exposed without a password before the August 1 discovery. The Comparitech report points out that: "Our honeypot experiments show that hackers can find and attack unsecured databases within hours of being exposed."
I reached out to Social Data, and a spokesperson provided the following statement:
"We collect data and enrich it with additional useful insights solely on behalf of our reputable customers, who use it strictly for the intended purposes. It is extremely sad that this incident has occurred due to a mixture of unfortunate events. However, as soon as we learned of the incident, we fixed it immediately. We have since been closely working with the information security experts on auditing our security infrastructure and increasing the required levels of information security to avoid similar occurrences in the future."
A TikTok spokesperson told me: "TikTok places the highest priority on user privacy, and we have anti-scraping policies in place. Our Terms of Service prohibit third parties from running automated scripts to collect information from our services, including public profile information. If we identify any such practices, we will take rapid action, including seeking legal redress."
I have also reached out to Google, which, at the time of publication, was still looking into the matter and unable to provide a statement. I will, of course, update this story if this changes.
Advice for concerned Instagram, TikTok and YouTube users
Meanwhile, I would advise users of all the services affected, Instagram, TikTok and YouTube, to be especially alert to phishing scams by email or posted as social media comments.
If your company has any databases "in the cloud," I would strongly recommend you audit the access permissions and make sure these are not open to anyone who comes looking. Elastic has an excellent guide to securing Elasticsearch deployments.
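As a starting point for that kind of audit, here is a minimal sketch of a check you could run against a database endpoint you own. It is an illustration under stated assumptions, not a description of how the exposed database was configured or found: it assumes the database exposes an HTTP API, as Elasticsearch does (by default on port 9200), and that a server with authentication enabled rejects anonymous requests with a 401 or 403 status. The function names `classify_status` and `probe` are hypothetical, chosen for this example.

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Map the HTTP status of an unauthenticated request to a rough verdict."""
    if status == 200:
        return "open"        # the server answered without any credentials
    if status in (401, 403):
        return "protected"   # authentication or authorization is being enforced
    return "unknown"         # anything else needs a manual look


def probe(url, timeout=5.0):
    """Send a single unauthenticated GET to `url` and classify the response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the code is still useful.
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout and similar network errors.
        return "unreachable"
```

Calling `probe("http://localhost:9200/")` against a test instance, where the address is a placeholder for your own host, would return "open" if the server hands back data to anyone who asks. A "protected" or "unreachable" result from one vantage point is not proof of a secure deployment, so treat this as a quick first check alongside the vendor's own hardening guidance, and only run it against systems you are authorized to test.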
By: Davey Winder, Senior Contributor