Maintain Your Business’s Value By Blocking Screen Scraping

Posted on Nov 21, 2013 in Screen Scraping

In an information economy, the key to building value is controlling who has access to your web content.  To have this control, you need a way to allow the right people just enough access to prove that what you have is valuable, but not so much access that they can take the information without paying.

Unfortunately, there will always be people who want to take content without paying for it.  To do this, many software developers rely on screen scraping to harvest content that is published on web sites.  These developers access a web site programmatically and process large numbers of pages to steal any information that they want.

Defending against screen scraping is hard because, to your web server, a scraper looks exactly like an ordinary user accessing your site with a web browser.  This is where ScrapeDefender comes into play.  By monitoring all your web traffic, ScrapeDefender can identify the users who are stealing information from your site and stop them.

That being said, not all screen scrapers are bad for your business.  Google indexes and ranks your web site by using its GoogleBot to perform a type of benevolent screen scraping on every page.  When building a defense against screen scrapers, it is important to realize that some scrapers drive value, while others take without giving anything back.

So, how do we tell good bots from bad?  ScrapeDefender generates a diverse collection of anti-scraping metrics about each suspicious user to determine what their goals are.  For example, the GoogleBot clearly announces who it is when it attempts to crawl your site.  Most poorly behaved bots try to hide their identity by claiming to be well-known browsers like Internet Explorer or Firefox.  By combining the behavior of the user with the declared identity of the browser, ScrapeDefender is able to clearly determine whether it is dealing with a person or a computer.

With this information, you have to decide whether the scraper is a net negative or a net positive for your company and your bottom line.  The ScrapeDefender team can help you figure that out, but for each suspicious scraper identified, there is a business decision to make.  For example, you might have a scraper grabbing one page of data once a day.  That is probably not costing you enough to justify the effort of blocking it.  Or you could have a scraper pulling down your entire site every day, which is worth blocking.  Our unique dashboard makes it easy for you to determine what type of screen scraper you are dealing with, and our experts can help you determine how best to deal with it.
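To make the idea of combining declared identity with observed behavior concrete, here is a minimal Python sketch.  It is not ScrapeDefender's actual method; the token lists, the 60-requests-per-minute threshold, and the names ClientStats and classify are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical values for illustration; real thresholds depend on your traffic.
DECLARED_BOT_TOKENS = ("googlebot", "bingbot")
BROWSER_TOKENS = ("firefox", "msie", "trident", "chrome", "safari")
HUMAN_MAX_REQUESTS_PER_MINUTE = 60


@dataclass
class ClientStats:
    user_agent: str             # identity the client declares in its User-Agent header
    requests_per_minute: float  # behavior observed in your traffic logs


def classify(stats: ClientStats) -> str:
    """Combine declared identity with observed behavior into a rough label."""
    ua = stats.user_agent.lower()

    # A client that openly announces itself as a crawler.
    if any(token in ua for token in DECLARED_BOT_TOKENS):
        return "declared_bot"

    claims_browser = any(token in ua for token in BROWSER_TOKENS)

    # Claims to be a browser but requests pages faster than a person plausibly could.
    if claims_browser and stats.requests_per_minute > HUMAN_MAX_REQUESTS_PER_MINUTE:
        return "suspected_scraper"
    if claims_browser:
        return "likely_human"
    return "unknown"
```

A User-Agent string is easy to fake, so a "declared_bot" label would still need to be verified by other means before it is trusted.  The label is only an input to the business decision described above, not the decision itself.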

Once you have identified a suspicious scraper and determined that you want to stop it from crawling your site, the easiest solution is to block its IP address or limit how many requests that IP address can make in a given period of time.  This is most easily done through your firewall.  ScrapeDefender also has plugins to block scrapers at various points in your infrastructure.
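If you would rather limit requests in your application than in a firewall, per-IP rate limiting can be sketched in a few lines of Python.  This is an illustrative sliding-window limiter, not a ScrapeDefender plugin; the class name SlidingWindowLimiter and its parameters are hypothetical, and the state is kept in memory for a single process only.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most max_requests per IP address within window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # IP address -> timestamps of recent requests

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        if now is None:
            now = time.monotonic()
        hits = self._hits[ip]

        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()

        if len(hits) >= self.max_requests:
            return False  # over the limit: reject or delay this request
        hits.append(now)
        return True
```

For example, SlidingWindowLimiter(100, 60.0) would allow each IP address up to 100 requests in any 60-second window.  Scrapers that rotate through many IP addresses will slip past a limiter like this, which is one reason to also look at behavior rather than at addresses alone.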

There are no easy answers to stopping screen scraping, but it is a challenge that every web site has to deal with.  The ScrapeDefender team is here to ensure that you know when somebody is taking your data without giving anything back.