Web scraping's No. 1 question: Is it legal to scrape a website?

Web scraping is one of the most controversial topics when it comes to online activity. Because the process is quite technical, people face a number of dilemmas about it. The most common one: is web scraping legal?

Since justLikeAPI (JLA) is based on web scraping, we think that provides a pretty certain answer, right? 🙂

But, let’s explain where this common misconception comes from.

What is web scraping?

It’s safe to say that today’s business is more present online than in the offline world. Therefore, many companies are trying their luck in the eCommerce sector. Being successful in eCommerce means being able to obtain as much data as possible. That’s when web scraping comes into the picture.

Web scraping is the process of extracting data from a certain website. In other words, you’ll be able to gather all the publicly available information. It’s important to point this out because it’s a major part of the answer that we’re trying to provide in this blog post.

Web scraping can be done by your own company, or you can outsource it. For example, JLA is currently able to scrape more than 45 platforms/websites. These websites cover different industries such as tourism and hospitality, health, review platforms, household, etc.

What clients mostly want from a web scraping tool is help with data analysis. They're trying to get to know their customers (and competition) better, compare prices, and check their overall market position. The process can therefore be used for scraping prices, images, reviews, and any other information available on a website. These are the most common client requests that JLA deals with.

Are web scraping and web crawling the same thing?

Web scrapers go by many names, such as data extraction tools, bots, robots, spiders, and the one that's probably used the most: web crawlers. However, it's a common misconception that web scraping and web crawling are the same thing.

As we said, web scraping means extracting specific data from a website, while web crawling is what search engines do. Crawlers scan and index whole pages, meaning that they are not on the hunt for specific information.
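
To make the difference more concrete, here is a minimal Python sketch that uses only the standard library. The sample HTML, the class names `PriceScraper` and `LinkCollector`, and the `price` CSS class are all made up for illustration; this is not how JLA or any search engine is actually implemented. The scraper pulls out one specific kind of data (prices), while the crawler-style collector simply gathers every link so that more pages could be visited later.

```python
from html.parser import HTMLParser

# Hypothetical sample page; a real tool would download this HTML instead.
SAMPLE_HTML = """
<html><body>
  <div class="product"><span class="name">Kettle</span><span class="price">24.99</span></div>
  <div class="product"><span class="name">Toaster</span><span class="price">39.50</span></div>
  <a href="/about">About</a>
  <a href="/contact">Contact</a>
</body></html>
"""


class PriceScraper(HTMLParser):
    """Scraping: extract one specific kind of data (here, prices)."""

    def __init__(self):
        super().__init__()
        self.prices = []
        self._in_price = False

    def handle_starttag(self, tag, attrs):
        # Only <span class="price"> elements are of interest.
        if tag == "span" and dict(attrs).get("class") == "price":
            self._in_price = True

    def handle_data(self, data):
        if self._in_price and data.strip():
            self.prices.append(float(data.strip()))

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False


class LinkCollector(HTMLParser):
    """Crawling: collect every link so further pages can be visited."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def scrape_prices(html):
    parser = PriceScraper()
    parser.feed(html)
    return parser.prices


def collect_links(html):
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    print("Scraped prices:", scrape_prices(SAMPLE_HTML))
    print("Links to crawl:", collect_links(SAMPLE_HTML))
```

The scraper ignores everything except the price fields, whereas the link collector does not care what the page says and only looks for where to go next.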

Why do people think that web scraping is illegal?

Without a doubt, the idea that web scraping is illegal is the number one misconception. If you google the term, this question will almost certainly pop up.

So, why is everyone questioning this?

Well, mostly because there are tools that don't respect legal regulations. Web scraping isn't illegal in itself, but there are still some rules that need to be followed. Even though many countries are still in the process of making clear laws, there are already regulations and court cases that address this topic.

Issues with web scraping - LinkedIn vs. hiQ Labs

Clients also often ask to scrape things like email addresses, or posts and information from social media channels in general. This brings us to a very interesting legal situation in which LinkedIn and a data analysis company called hiQ Labs found themselves.

The dispute dates back to 2017. hiQ Labs had been scraping data from publicly available LinkedIn profiles and using it to provide employers with analysis about their employees. This had been going on for a few years when, in 2017, LinkedIn sent hiQ Labs a cease-and-desist letter claiming that the scraping violated the CFAA (Computer Fraud and Abuse Act), and the dispute went to court. In the same year, the court sided with hiQ Labs. LinkedIn filed an appeal, and in 2019 the appeals court upheld that decision, finding that the Computer Fraud and Abuse Act likely does not apply to information available to the general public.

What’s even more interesting is that the court prohibited LinkedIn from interfering with hiQ's scraping process, which could be very important in similar cases in the future.

JLA is also able to scrape public information from LinkedIn, and we have lots of clients who approach us with that request.

Therefore, web scraping is legal, but you must know that some information cannot be scraped:

Any private data that sits behind a username and password cannot be scraped;
Sometimes a website's Terms of Service explicitly prohibit any web scraping.
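
These two limits could be turned into a simple pre-flight check that runs before a scraper touches a page. The Python sketch below is only an illustration: the function names `looks_login_gated` and `may_scrape` are hypothetical, and it assumes that a 401/403 status code or a password field on the page signals login-protected content. Whether a site's terms prohibit scraping cannot be detected automatically here, so that answer has to come from a person who has actually read them.

```python
from html.parser import HTMLParser


class _PasswordFieldFinder(HTMLParser):
    """Flags pages that contain an <input type="password"> field."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        input_type = (dict(attrs).get("type") or "").lower()
        if tag == "input" and input_type == "password":
            self.found = True


def looks_login_gated(status_code, html):
    # Assumption: 401/403 responses or a password field mean private content.
    if status_code in (401, 403):
        return True
    finder = _PasswordFieldFinder()
    finder.feed(html)
    return finder.found


def may_scrape(status_code, html, tos_prohibits_scraping):
    # tos_prohibits_scraping must be set by a human who has read the terms.
    if tos_prohibits_scraping:
        return False
    return not looks_login_gated(status_code, html)


if __name__ == "__main__":
    print(may_scrape(200, "<p>Public reviews</p>", False))
    print(may_scrape(200, '<form><input type="password"></form>', False))
```

A heuristic like this is deliberately cautious: a public page that happens to include a login form would also be skipped, which is the safer mistake to make.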

Conclusion

It’s clear that everyone wants to keep their business tricks a secret, but web scrapers work like the best detectives, who rarely miss anything. Of course, it’s crucial to always know the legal limits and behave accordingly.

With JLA you can rest assured that none of these kinds of issues will happen. As we mentioned in the introduction - data is important for success, but integrity and fair play are too.

In case you want to know more about the platforms that JLA works with, or you’re interested in the tool in general, we invite you to reach out at system@justlikeapi.io.

Would you like to share your thoughts about web scraping? Let us know in the comments below.

HostHubs Blog