Web scraping without code
With subscriptions to both Zyte API and Scrapy Cloud, you can use Zyte’s no-code web scraping solution, which can extract product data from any e-commerce website.
Note
Support for additional types of websites, such as news sites, is coming soon.
Follow the steps below to get your product data now.
1. Create a project
You first need a Scrapy Cloud project, which stores a web scraping code base. Zyte provides an AI-powered code base that you can use with no coding on your part.
Open the Start a Project page.
Enter a Project Name.
Click Select under Zyte’s AI-Powered Spiders.
Click Create project.
2. Create a spider
Next you need to define a spider. Your project provides a template for an e-commerce spider that you can use to create a spider for any e-commerce website by specifying a target URL and a few optional parameters.
On the Create Spider page:
Click Select under E-commerce.
Enter a Name for your spider.
Enter the initial URL.
You can use the home page of an e-commerce website to get all products from the website, or you can point to a category to get only products from that category and subcategories.
Geolocation lets you customize the country from which the target website will be crawled.
If unspecified, Zyte API will automatically select the right geolocation to use based on the target website.
You can use Max Requests to limit the number of Zyte API requests that your spider can make. Because a spider job is billed by the Zyte API requests it makes, this limit caps the cost of the job. See Zyte API pricing.
For example, to generate a small sample, you can set the limit to 100 requests.
By default, your spider follows pagination, subcategories, and product detail pages.
If you are not happy with the resulting coverage, you can switch Crawl Strategy to Full to make your spider follow every link found.
Under Extraction Source, you can select httpResponseBody to lower costs and run time on websites that do not require browser rendering. When in doubt, keep the default value.
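To make the effect of the Max Requests limit concrete, here is a minimal Python sketch of the arithmetic. It assumes a flat price per 1,000 requests, which is a simplification. The function name and the price used are hypothetical placeholders, not actual Zyte API rates; see Zyte API pricing for real figures.

```python
from decimal import Decimal


def max_job_cost(max_requests: int, price_per_1000_requests: Decimal) -> Decimal:
    """Return an upper bound on the cost of one job.

    Assumes every request costs the same, which is a simplification.
    """
    if max_requests < 0:
        raise ValueError("max_requests must not be negative")
    return Decimal(max_requests) * price_per_1000_requests / Decimal(1000)


# Placeholder price for illustration only, NOT a real Zyte API rate.
HYPOTHETICAL_PRICE_PER_1000 = Decimal("1.00")

# Upper bound for a small sample job limited to 100 requests.
sample_cap = max_job_cost(100, HYPOTHETICAL_PRICE_PER_1000)
print(f"A 100-request job costs at most {sample_cap} at the placeholder price")
```

The sketch uses Decimal rather than float so that currency amounts are not affected by binary rounding.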
You can now click Save and run to finish creating your spider and start your first spider job.
Once the job finishes, download the extracted product data (items).
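If someone on your team later wants to inspect the downloaded items programmatically, a short script is enough. The following Python sketch assumes that the items were downloaded in JSON Lines format (one JSON object per line) and that each product item has fields named url, name, and price. Those field names and the sample data are illustrative assumptions, so check them against your actual download.

```python
import json


def load_items(jsonl_text):
    """Parse JSON Lines text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def summarize(items):
    """Count all items and those that carry a price (assumed field name)."""
    priced = [item for item in items if item.get("price") is not None]
    return {"total": len(items), "with_price": len(priced)}


# Made-up sample data standing in for a downloaded file.
sample = (
    '{"url": "https://example.com/p/1", "name": "Blue mug", "price": "9.99"}\n'
    '{"url": "https://example.com/p/2", "name": "Red mug"}\n'
)

items = load_items(sample)
print(summarize(items))
```

For a real download, read the file contents into a string first and pass that string to load_items.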
You can now run a new job of your new spider whenever you want, and create as many additional spiders as you want.
A programmer can use zyte-spider-templates-project as a template to create a new Scrapy project that includes Zyte’s AI-powered spiders. From there, they can customize those spiders, or even implement new spiders and spider templates from scratch, and then deploy the resulting code to your Scrapy Cloud project so that you can use it from the Scrapy Cloud UI.
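For a programmer working in such a project locally, the fields from the Create Spider page would typically be passed as spider arguments on the Scrapy command line. The Python sketch below only assembles a scrapy crawl command; it does not run it. The spider name ecommerce and the argument names url, max_requests, crawl_strategy, and geolocation are assumptions about how the template exposes those fields, and build_crawl_command is a hypothetical helper, so verify the names against the project’s own documentation.

```python
import shlex


def build_crawl_command(spider_name, **spider_args):
    """Build a `scrapy crawl` argument list, one `-a key=value` per argument.

    Arguments set to None are skipped, so the spider falls back to its defaults.
    """
    command = ["scrapy", "crawl", spider_name]
    for key, value in spider_args.items():
        if value is None:
            continue
        command += ["-a", f"{key}={value}"]
    return command


# Spider and argument names below are assumptions, not confirmed by this page.
command = build_crawl_command(
    "ecommerce",
    url="https://example.com",
    max_requests=100,        # small sample, as in the Max Requests example
    crawl_strategy="full",   # assumed value matching the Full crawl strategy
    geolocation=None,        # left unset so that it is selected automatically
)

# Print a shell-safe version of the command for copying into a terminal.
print(shlex.join(command))
```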