Screen scraping in OSS

“…the Court of Justice of the EU (CJEU) is due to hear arguments from Ryanair and a Dutch price comparison business about the extent to which rules contained in the EU’s Database Directive apply to data that is not protected by copyright or a ‘sui generis’ database right.
The CJEU’s judgment on the matter, which is unlikely to be issued for many months, will determine the extent to which businesses can apply contractual restrictions, in the absence of having copyright or database rights protection for their data, to prevent others from using that data. Screen scraping involves the use of software to automatically collect information from websites and systems…
“The issue of the legality of screen scraping is of huge interest to many industries which use the internet as their primary sales channel,” Connor said. “The ability to stop a competitor, intermediary and/or any other commercial organisation using the information a company posts on its website, is fundamental to the operation of the internet. If this can be stopped by intellectual property rights or website terms and conditions, a number of businesses, for example comparison websites, will have to change their business models.”
An article on

Although OSS is far from the target of this case, the ruling described above could potentially affect the OSS industry too. The implication is that any organisation procuring an OSS should seek contractual terms that give it the right to access the data held in that OSS via screen scraping or other mechanisms besides official APIs.

Screen scraping is a common mechanism used by OSS integrators to gather data from systems that don’t offer an easily accessible interface. These systems are generally on-net (inside the operator’s own network) rather than on the Internet, as in the examples described above, but the concept is the same. OSS screen scraping might be used to access information from legacy systems where documentation isn’t available, or even where there are no published interfaces to work with. Alternatively, it could be a means of accessing data from a product that is no longer licensed or officially supported by its vendor.
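
To make the idea concrete, here is a minimal sketch of what a screen scraper does, using only Python’s standard library. The page content, device names and column headings are invented for illustration and don’t come from any real OSS product; a real legacy screen would almost certainly need more defensive parsing than this.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collects the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return one dict per data row, keyed by the table's header row."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical page from a legacy inventory screen (invented for illustration).
SAMPLE_PAGE = """
<html><body>
<table>
  <tr><th>Device</th><th>Port</th><th>Status</th></tr>
  <tr><td>router-01</td><td>ge-0/0/1</td><td>up</td></tr>
  <tr><td>router-02</td><td>ge-0/0/3</td><td>down</td></tr>
</table>
</body></html>
"""

if __name__ == "__main__":
    for record in scrape_table(SAMPLE_PAGE):
        print(record)
```

The scraper depends entirely on the layout of the screen it reads, with no interface contract behind it, which is why access of this kind is worth covering explicitly in a contract.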

In cases like these, a vindictive vendor could potentially seek to restrict an organisation from using other OSS tools to scrape data from its product (or even from using other data access mechanisms such as command line interfaces [CLI]). Some OSS tools rely heavily on these “agentless” interfaces and would be more at risk than others.

The risk is probably quite small for the OSS industry, but a simple contractual clause could provide mitigation.
