I Tested 5 Free Web Scraping Tools: An honest review of extracting product data from Amazon, from a non-tech professional’s perspective
As a marketing manager with no prior experience in web scraping or programming, I embarked on a mission to extract data from Amazon. My primary objective was to scrape the top best-selling products from each Amazon department, capturing each product’s position, name, and price. To achieve this, I put 5 free web scraping tools to the test, evaluating their user-friendliness, learning curve, and the effectiveness of their free features.
Although I work as a marketing manager for an enterprise web scraping company, I consciously refrained from seeking my colleagues’ assistance, determined to explore the capabilities of each tool independently. I wanted to see how easy it would be for someone without technical expertise to use web scraping tools and extract data without outside help. I also aimed to assess how useful the information gathered this way would actually be.
Whether you’re a marketing professional looking for competitive intelligence or a beginner without technical knowledge, I hope this review provides useful insights and guidance for putting free web scraping tools to work on your own data-gathering projects.
Exploring the Web Scraping Tools:
During my exploration of web scraping tools, I encountered a variety of options available online. These tools can be categorized into 3 different types:
Desktop applications (require downloading and installing on your device)
Web extensions
Web-based services (Cloud-based)
In total, I tested 5 different web scraping tools and briefly considered a couple more. Among them, 3 stood out as particularly user-friendly for non-tech professionals: ParseHub, Webscraper.io, and Octoparse. It is worth noting, however, that most of the tools I encountered required programming skills, which I lack, and others proved quite complicated to use.
Before you start scraping:
As I tested various web scraping tools, I noticed that they became easier to use not because of the tools themselves but because my understanding of web scraping improved. Along the way, I picked up the basic scraping terminology and, just as importantly, learned how the website I was scraping was organized. That understanding was key to my success.
From my experience, I learned that before starting any scraping project, it is crucial to:
Have a clear vision of the specific data you want to extract: this clarity helps in selecting the right tool and defining the parameters for scraping effectively.
Understand the structure of the website: this includes knowing how the pages are organized, how the categories are structured, how the navigation system works, and pinpointing the exact location of the desired information. Familiarizing yourself with these aspects allows for more efficient and accurate scraping.
By mastering these elements, you can optimize your scraping process and achieve better results with whichever tool you choose.
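To make this concrete, here is how I would sketch such a plan in simple notes, written as a tiny Python snippet for illustration. The CSS selectors are hypothetical placeholders (Amazon changes its markup often), so treat them as things to verify in your browser’s “Inspect” tool, not as working values.

```python
# Planning sketch: the exact fields I want and a guess at where each one
# lives on the page. The CSS selectors are illustrative placeholders --
# always verify them with your browser's "Inspect" tool first.
target_fields = {
    "position": "span.rank-badge",    # e.g. the "#1" badge on a best-seller card
    "name": "div.product-title",      # the product title text
    "price": "span.product-price",    # the displayed price
}

for field, selector in target_fields.items():
    print(f"{field:>8} -> {selector}")
```

Even if you never run a line of code, writing down the fields and their likely locations this way forces exactly the clarity that every tool below depends on.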
The 3 best free web scraping tools for non-tech professionals scraping Amazon:
Ranking
WebScraper (Chrome plugin):
Overall score: 7.5
User-friendliness: 8
Learning curve: 8
Effectiveness: 7
The Web Scraper plugin proved to be a reliable option for e-commerce web scraping requirements. With the assistance of a tutorial, I was able to set it up and get started quite easily.
Within around 45 minutes, I familiarized myself with the basic features of the Web Scraper plugin. The tutorial provided step-by-step instructions, which helped me grasp the tool’s functionality. However, it could have explained pagination more clearly, as that caused some confusion when dealing with multiple pages of data. After watching all the videos and restarting my work, I was able to understand and find the structure that best suited my needs.
The selector graph feature of the plugin was helpful for visualizing the organization of the website before running the job. However, the preview feature could be improved, as you cannot really grasp what the final result will look like before actually running the job.
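For the curious: Web Scraper stores each task as a JSON “sitemap” that you can export and import through the plugin. The following sketch, written in Python for illustration, builds a hypothetical sitemap showing the self-referencing pagination pattern the tutorials teach; all selectors are placeholders, not Amazon’s real markup.

```python
import json

# A minimal, hypothetical Web Scraper sitemap. All CSS selectors are
# placeholders; inspect the real page to find working ones. Note how the
# "pagination" link selector lists itself as a parent: that is the common
# pattern for making the scraper follow "next page" links.
sitemap = {
    "_id": "amazon-bestsellers-demo",
    "startUrl": ["https://www.amazon.com/Best-Sellers/zgbs"],
    "selectors": [
        {"id": "pagination", "type": "SelectorLink", "multiple": True,
         "parentSelectors": ["_root", "pagination"], "selector": "ul.a-pagination a"},
        {"id": "product", "type": "SelectorElement", "multiple": True,
         "parentSelectors": ["_root", "pagination"], "selector": "div.product-card"},
        {"id": "position", "type": "SelectorText", "multiple": False,
         "parentSelectors": ["product"], "selector": ".rank-badge"},
        {"id": "name", "type": "SelectorText", "multiple": False,
         "parentSelectors": ["product"], "selector": ".product-title"},
        {"id": "price", "type": "SelectorText", "multiple": False,
         "parentSelectors": ["product"], "selector": ".product-price"},
    ],
}

# Paste the printed JSON into the plugin via "Create new sitemap > Import Sitemap".
print(json.dumps(sitemap, indent=2))
```

The “pagination” entry listing itself as a parent is what makes the scraper keep following next-page links, which was precisely the part I found confusing at first.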
I was able to achieve my goal of scraping the best sellers of each department, with each product’s position, name, and price. It should be noted, though, that this plugin may not be ideal for more complex scraping tasks.
Also, it’s important to mention that the Web Scraper plugin offers a cloud automation tool for free, although I didn’t explore or utilize this feature during my review.
In conclusion, the Web Scraper plugin proved to be a reliable tool for web scraping, particularly for simpler scraping tasks. It had a moderate learning curve and provided organized data in a convenient format. Among all the platforms I reviewed, it was one of the easiest to learn and use. While there is room for improvement in handling more complex scenarios and offering advanced features, the plugin serves as a solid foundation for beginners venturing into web scraping projects.
Octoparse (Desktop-based):
Overall score: 8
User-friendliness: 8
Learning curve: 7
Effectiveness: 9
Octoparse, a desktop-based web scraping tool, offers a convenient option by providing a downloadable dashboard directly on your desktop.
During my testing, I explored the templates available for Amazon: pre-made web scraping tasks for common projects. The templates are keyword-based, so I tried the keyword “bestsellers”, but unfortunately the template did not generate any data, so I proceeded to create a custom task.
One notable feature of Octoparse is its smart functionality: the “auto detect” tool automatically recognizes items on the web page. This automation saves time and effort by eliminating the need for manual selection. Although the task-setting interface is not immediately self-explanatory, it is more intuitive than some of the other tools I tried.
Octoparse provides a help center with a variety of resources, including 101 guides, case tutorials, and frequently asked questions. These resources can assist users in understanding and navigating the tool effectively.
In summary, Octoparse offers a convenient desktop-based solution for web scraping needs. The platform offers powerful tools for more complex web scraping tasks that I did not explore. The task setting interface may require some initial exploration, but the availability of the help center with comprehensive guides and tutorials enhances the overall user experience.
In the end, I successfully accomplished my goal of scraping the bestsellers from Amazon for each department, retrieving the product’s position, name, and price. However, it is important to note that the learning curve for Octoparse was slightly steeper compared to other tools. It took me approximately 75 minutes to become familiar with its features and settings.
ParseHub (Desktop-based):
Overall score: 9
User-friendliness: 10
Learning curve: 9
Effectiveness: 9
ParseHub, a desktop-based web scraping tool, provides clear and concise installation instructions, making the setup process seamless.
One standout feature of ParseHub is its intuitive command system. The commands are designed in a way that is easy to understand and navigate. This makes creating scraping tasks a straightforward process, even for users with limited technical expertise like myself. I found their instructions to be highly informative and user-friendly.
ParseHub’s relative selection feature is a smart and intuitive selection option that allows users to extract data accurately and efficiently, which enhances the overall usability of the tool and contributes to a positive user experience. You can also preview the data as you go, see whether you are setting up the scraper correctly, and make corrections along the way.
What sets ParseHub apart is its comprehensive approach to user guidance. The platform is built in such a way that users are guided at every step of the scraping process. From tutorials to interactive instructions, ParseHub ensures that users are well-supported throughout their web scraping journey.
For my specific goal of scraping the best sellers from Amazon for each category, including the product’s position, name, and price, ParseHub proved to be the best tool. Its capabilities aligned perfectly with my requirements, and I was able to achieve the desired results effortlessly.
ParseHub is a powerful desktop-based web scraping tool. With clear installation instructions, intuitive commands, smart relative selection, and a user-centric approach, it stands out as a top choice. It excelled in helping me accomplish my goal of scraping Amazon’s best sellers for each category with ease and precision.
Other Tools I Tested:
Apify (web-based):
Apify is a web-based platform that initially appears promising upon logging into the portal. However, its dashboard lacks self-explanatory features, particularly in the browser version.
The platform provides a help center, which is a valuable resource for users seeking guidance. However, I found that the “Getting Started” section lacked clear instructions on how to actually begin scraping. As a result, I had to resort to external search engine queries to find instructions on how to use the tool effectively. The documentation provided in the Apify Academy was not well-organized or straightforward.
Another drawback is the reliance on third-party apps and the need for coding knowledge. It is not explicitly clear if the expected results can be achieved by following the instructional videos. This creates uncertainty and adds complexity to the scraping process.
In terms of difficulty, Apify presents a steep learning curve. It took me several hours to grasp the tool’s functionality. However, I still had doubts about whether I would be able to accomplish my scraping task, so I did not proceed further.
Data Scraper/Data Miner (Chrome extension):
The Data Scraper extension requires the creation of a recipe (web scraping task) and has limitations in terms of scraping beyond the main page.
This tool was relatively easy to use and learn, but on a page-by-page basis rather than for scraping across multiple pages. In other words, Data Scraper is easy to use but comes with restrictions on its scraping capabilities: it can be genuinely useful if you need data from a single scrolling page, but if you need data from a number of categories and want the crawler to enter each page, it is not the ideal tool. It was therefore not a fit for my goal.
Webz.io and ScrapingBot:
When selecting the tools to test, I also considered Webz.io as a potential option. However, I found it to be too difficult to start with, which hindered my progress. Additionally, upon researching the tool, it became apparent that its primary focus is on scraping blog, forum, and review data. The lack of mention regarding e-commerce data made me hesitant to proceed further with Webz.io.
Another tool I considered was ScrapingBot. However, it is primarily designed for developers, which may limit its usability for non-technical users like myself. Due to this specialization, I decided not to explore ScrapingBot further in the context of my web scraping project.
An honest note:
When considering web scraping for daily data collection, keep in mind that running the scraper manually every day can be time-consuming and impractical. Some web scraping tools, however, offer paid automation and scheduling features that run the scraping process for you on a daily basis.
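For a sense of what scheduling means under the hood, here is a bare-bones Python illustration. A real setup would rely on cron, Windows Task Scheduler, or a tool’s built-in cloud scheduler; run_scrape() is a hypothetical placeholder for whatever actually triggers your job.

```python
import time
from datetime import datetime, timedelta

def run_scrape():
    # Hypothetical placeholder: in practice this would trigger your scraping
    # job or export, e.g. a command-line call or a request to a tool's API.
    print(f"[{datetime.now():%Y-%m-%d %H:%M}] daily scraping job runs here")

# Run once a day, forever. A crash-resistant setup would use a real scheduler.
while True:
    run_scrape()
    time.sleep(timedelta(days=1).total_seconds())
```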
It is crucial to assess the complexity and quality requirements of your scraping job, especially if the data is time-sensitive. One example of a complex web scraping project is product matching for an e-commerce site with competitive pricing: it involves scraping data from multiple online stores, collecting product information such as name, description, price, and availability, and then matching similar products across the different websites, all under time pressure. If you have a more complex task, or cannot afford any compromise on data quality, it may be advisable to seek the services of a professional web scraping company. These companies have the expertise and track record to handle difficult-to-scrape data and can deliver reliable, timely results.
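To illustrate why product matching is the genuinely hard part, here is a toy Python sketch that compares two made-up product names by string similarity. Real matching pipelines combine many more signals (brand, model numbers, images, price history), so treat this purely as an illustration.

```python
from difflib import SequenceMatcher

def name_similarity(a: str, b: str) -> float:
    """Rough 0-to-1 similarity between two product names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Made-up listings for what is probably the same product in two stores.
store_a = "Acme Wireless Earbuds Pro, 2nd Generation, Noise Cancelling"
store_b = "Acme Earbuds Pro (2nd Gen), Wireless, Noise Cancelling"

print(f"similarity: {name_similarity(store_a, store_b):.2f}")  # high score: likely a match
```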