Zyte Automatic Extraction will be discontinued starting April 30th, 2024. It is replaced by Zyte API. See Migrating from Automatic Extraction to Zyte API.

Get started

Automatic Extraction is a service that automatically extracts information from web pages.

You provide the URLs of the pages you are interested in and the type of content you expect to find there: article, article list, comments, forum posts, job posting, product, product list, real estate, reviews or vehicle.

The service then fetches the content and applies a number of techniques behind the scenes to extract as much information as possible. Finally, it returns the extracted information to you in structured form.
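The input described above (a set of URLs plus one expected page type) can be sketched as a small helper that builds one query per URL. This is only an illustration: the helper name `build_queries`, the `url` and `pageType` field names, and the camelCase identifiers are assumptions not confirmed by this page, so check them against the Automatic Extraction API reference.

```python
import json

# Page types named in the text, mapped to the camelCase identifiers that this
# sketch ASSUMES the API expects. The identifiers are not confirmed here.
PAGE_TYPES = {
    "article": "article",
    "article list": "articleList",
    "comments": "comments",
    "forum posts": "forumPosts",
    "job posting": "jobPosting",
    "product": "product",
    "product list": "productList",
    "real estate": "realEstate",
    "reviews": "reviews",
    "vehicle": "vehicle",
}


def build_queries(urls, page_type):
    """Return one query dict per URL for the given page type (hypothetical helper)."""
    if page_type not in PAGE_TYPES:
        raise ValueError(f"unsupported page type: {page_type!r}")
    # ASSUMPTION: each query is an object with "url" and "pageType" fields.
    return [{"url": url, "pageType": PAGE_TYPES[page_type]} for url in urls]


queries = build_queries(["https://example.com/some-article"], "article")
print(json.dumps(queries, indent=2))
```

Rejecting unknown page types before any request is made keeps a typo from turning into a failed or wasted API call.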

Page types

The following page types are supported:

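Price intelligence & ecommerce

Product, product list and reviews.

Media & discussion monitoring

Article, article list, comments and forum posts.

Market research

Job posting, real estate and vehicle.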

In addition, Automatic Extraction returns some general information about each web page.

Getting started

If you just want to extract data from the command line, use the zyte-autoextract client library.

Otherwise, see our code examples:
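As a minimal sketch of calling the HTTP API from Python using only the standard library, the code below builds a POST request and, separately, sends it. Several details are assumptions rather than facts stated on this page: the endpoint URL, HTTP Basic authentication with the API key as user name and an empty password, and the `url` and `pageType` field names. The function names `make_request` and `extract` are hypothetical. Verify all of these against the Automatic Extraction API reference before relying on them.

```python
import base64
import json
import urllib.request

# ASSUMPTION: endpoint URL; confirm it in the Automatic Extraction API reference.
API_URL = "https://autoextract.scrapinghub.com/v1/extract"


def make_request(api_key, queries, api_url=API_URL):
    """Build (but do not send) a POST request carrying the queries as JSON."""
    # ASSUMPTION: HTTP Basic auth, API key as user name, empty password.
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        api_url,
        data=json.dumps(queries).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract(api_key, queries):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(make_request(api_key, queries)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Under these assumptions, `extract("YOUR_API_KEY", [{"url": "https://example.com/some-article", "pageType": "article"}])` would submit a single article query. Keeping request construction separate from sending makes it possible to inspect the request without touching the network.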

See Automatic Extraction API for a detailed description of the HTTP API.