Automatic Extraction is a service for automatically extracting information from web pages.
You provide the URLs of the pages you are interested in and the type of content you expect to find there: article, article list, comments, forum posts, job posting, product, product list, real estate, reviews or vehicle.
The service will then fetch the content, and apply a number of techniques behind the scenes to extract as much information as possible. Finally, the extracted information is returned to you in structured form.
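As a rough illustration of this workflow, the sketch below builds one query per URL, each pairing the URL with the expected page type, and posts them in a single request. The endpoint URL, the `url` and `pageType` field names, and the use of HTTP Basic authentication with the API key as the user name are assumptions made for this sketch, and the helper names are hypothetical; check the Automatic Extraction API reference for the actual details.

```python
import base64
import json
import urllib.request

# Assumed endpoint, not confirmed by this page; check the API reference.
API_URL = "https://autoextract.scrapinghub.com/v1/extract"


def build_queries(urls, page_type):
    """Pair every URL with the page type expected there (assumed field names)."""
    return [{"url": url, "pageType": page_type} for url in urls]


def build_request(api_key, queries):
    """Prepare a POST request carrying the queries as a JSON array.

    Assumes HTTP Basic auth with the API key as user name and an empty password.
    """
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(queries).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract(api_key, urls, page_type):
    """Send the queries and return the decoded JSON response."""
    request = build_request(api_key, build_queries(urls, page_type))
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a valid key, a call such as `extract("YOUR_API_KEY", ["https://example.com/some-article"], "article")` would return the structured data as decoded JSON. Keeping query construction separate from the network call makes the payload easy to inspect before anything is sent.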
The following page types are supported:
Price intelligence & ecommerce
Media & discussion monitoring
In addition, Automatic Extraction returns some general information about each web page.
If you just want to extract data from the command line, use the zyte-autoextract client library.
Otherwise, see our code examples:
See Automatic Extraction API for a detailed description of the Automatic Extraction HTTP API.