PatientsLikeMe.com noticed suspicious activity at 1 a.m. in one of its forums, where users share highly personal information about their experiences with medical conditions, including depression and other emotional disorders. A new user was seemingly copying every single entry on the forum.
Problem Being Addressed
Upon identifying the suspicious activity, PatientsLikeMe decided to block the user. Note: they were not using ScrapeDefender at this time.
PatientsLikeMe was using confidential software that monitors for unusual activity and notifies site administrators of suspicious behavior. After the behavior was flagged, the chief marketing officer, David Williams, identified the suspicious user as a bot running an automated script. He blocked and then shut down the user’s account. However, it was unclear how long this suspicious behavior had been occurring.
PatientsLikeMe identified three other suspect sites and blocked their access by the next afternoon. All suspect accounts were traced back to Nielsen Co., the media-research firm that provides monitoring services for various mass media including the Internet and collects data for its clients.
PatientsLikeMe sent a cease-and-desist letter to Nielsen. Ten days later, Nielsen agreed to stop scraping but then said it was unable to remove the scraped data from its database. A company spokesman later said Nielsen had found a way to quarantine the PatientsLikeMe data to prevent it from being included in its reports for clients.
PatientsLikeMe’s president, Ben Heywood, disclosed the break-in to the site’s 70,000 members in a blog post, in which he reminded users that PatientsLikeMe sells its data in an anonymous form, without attaching users’ names to it. That set off a debate on the site about the legality of selling sensitive health-related information. The company says most of the 350 responses to the blog post were supportive. But PatientsLikeMe says 218 members subsequently quit, apparently concerned that their real names could be traced through usernames that may have been copied. “I felt totally violated,” says Bilal Ahmed, a 33-year-old resident of Sydney, Australia, who used PatientsLikeMe to connect with other people suffering from depression. He used a pseudonym on the message boards, but his PatientsLikeMe profile linked to his blog, which contains his real name.
Trends Identified: Data scraping is now an accepted practice among mainstream companies that have ample resources to collect and use a website’s confidential information.
The use of a commercially accepted anti-scraping solution, such as ScrapeDefender, would have identified the attack immediately and blocked all four suspect accounts at the start of the attack rather than after it was too late.
The scraping threat will only worsen in the future. According to the Wall Street Journal (Oct. 12, 2010), firms will double spending on data scraped from the internet in 2012 ($840 million), and larger, blitzkrieg-like attacks will plague sites.