Web scraping without code
With subscriptions to both Zyte API and Scrapy Cloud, you can use AI spiders, Zyte’s no-code solution for web scraping.
Follow the steps below to get web data now with an AI spider.
1. Create a project
You first need a Scrapy Cloud project, which stores a web scraping code base. Zyte provides an AI-powered code base that you can use with no coding on your part.
Open the Start a Project page.
Enter a Project Name.
Click Select under Zyte’s AI-Powered Spiders.
Click Create project.
2. Create a spider
Next you need to define a spider.
Your project provides a template for an e-commerce spider that you can use to create a spider for any e-commerce website by specifying a target URL and a few optional parameters.
On the Create Spider page:
Click Select under E-commerce.
Enter a Name for your spider.
Enter the initial URL.
You can use the home page of an e-commerce website to get all products from the website, or you can point to a category page to get only the products from that category and its subcategories.
Max Requests sets a limit on the number of Zyte API requests that your spider can make.
Zyte API charges per request. The default limit, 100, is meant to prevent accidental extra costs during tests. You need to increase this limit for larger crawls.
Under Extraction Source, you can select a specific type of request for extraction.
httpResponseBody is cheaper, but on JavaScript-heavy websites it might miss some data. browserHtml can give you that extra data, but has a higher cost. If no extraction source is specified, Zyte API automatically decides which extraction source to use.
You can now click Save and run to finish creating your spider and start your first spider job.
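No code is needed for any of the steps above, but it can help to see the form fields as data when planning a crawl budget. The following is a minimal sketch, not part of Zyte's tooling: `SpiderSettings` and `worst_case_cost` are hypothetical names, and the per-request price is a placeholder you would replace with your own pricing. It also assumes a single flat price, which is a simplification, since the text notes that browserHtml requests cost more than httpResponseBody requests.

```python
from dataclasses import dataclass
from typing import Optional

# Extraction sources named in the text above.
EXTRACTION_SOURCES = ("httpResponseBody", "browserHtml")


@dataclass(frozen=True)
class SpiderSettings:
    """Hypothetical container mirroring the Create Spider form fields."""

    name: str
    url: str
    max_requests: int = 100  # default limit stated in the text
    extraction_source: Optional[str] = None  # None: let Zyte API decide

    def __post_init__(self) -> None:
        if not self.url.startswith(("http://", "https://")):
            raise ValueError(f"not an absolute URL: {self.url!r}")
        if self.max_requests < 1:
            raise ValueError("max_requests must be a positive integer")
        if (self.extraction_source is not None
                and self.extraction_source not in EXTRACTION_SOURCES):
            raise ValueError(
                f"unknown extraction source: {self.extraction_source!r}"
            )


def worst_case_cost(settings: SpiderSettings, price_per_request: float) -> float:
    """Upper bound on spend: assumes every allowed request is made and billed."""
    return settings.max_requests * price_per_request


if __name__ == "__main__":
    settings = SpiderSettings(
        name="demo-shop",
        url="https://example.com/",
        max_requests=500,
        extraction_source="httpResponseBody",
    )
    # 0.001 is a placeholder price, not a real Zyte API rate.
    print(worst_case_cost(settings, price_per_request=0.001))
```

Because Max Requests caps the number of billed requests, multiplying it by your highest expected per-request price gives a conservative ceiling for a single job.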
Once the job finishes, download the extracted product data (items).
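If you later want to inspect the downloaded items outside the dashboard, a few lines of Python may be enough. The sketch below assumes you downloaded the items in JSON Lines format (one JSON object per line) and that product items carry `name` and `price` fields; both the format choice and the field names are assumptions to check against your own download. The function names are hypothetical.

```python
import json
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_items(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one dict per non-empty JSON Lines record."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def names_with_prices(items: Iterable[Dict]) -> List[Tuple[str, str]]:
    """Keep only items that have both assumed fields, 'name' and 'price'."""
    return [
        (item["name"], item["price"])
        for item in items
        if "name" in item and "price" in item
    ]


if __name__ == "__main__":
    # In-memory sample standing in for a downloaded file.
    sample = [
        '{"name": "Blue mug", "price": "9.99", "url": "https://example.com/p/1"}',
        "",
        '{"name": "Item without a price", "url": "https://example.com/p/2"}',
    ]
    for name, price in names_with_prices(parse_items(sample)):
        print(name, price)
```

To read a real download, pass an open file object instead of the sample list, for example `parse_items(open("items.jl", encoding="utf-8"))`, where `items.jl` is a placeholder file name.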
You can now run a new job for your spider whenever you want, and create as many additional spiders as you want.
See AI spiders for more information.