# Zyte Documentation

How would you like to get public data from the Internet?

> ##### Write your own code
>
> New to web scraping? Learn here!
>
> Have some experience? Improve your stack with our solutions below.

> ##### [We do it for you](https://www.zyte.com/data-extraction/)
>
> We can design, build, monitor, and maintain custom web scraping
> solutions.
>
> [Talk to us!](https://www.zyte.com/data-extraction/)

At Zyte we have solutions for every web scraping need:

> ##### Antiban
>
> Get HTTP responses without bans with Zyte
> API.

> ##### Hosting
>
> Run your code on Scrapy Cloud.

> ##### Browser automation
>
> Get powerful browser automation with
> Zyte API.

> ##### Coding Agent Add-Ons
>
> Use Coding Agent Add-Ons to build better web scraping projects faster.

> ##### Automatic extraction
>
> Let AI do the parsing for you with Zyte API.

## Get started with web scraping

**Web scraping** is the download of data from websites in a structured format
that you can process.

Use cases include price intelligence, market analysis, competitor intelligence,
vendor management, lead generation, investment research, and brand monitoring.

Below we expand on the steps and challenges of web scraping, and the solutions
that we offer or recommend.

> ###### TIP
>
> For a more hands-on experience, see the tutorial.

### Steps in web scraping

Getting structured data from a website involves the following steps:

1. Building a list of **target URLs** from which you want to get structured
   data.

   You can build it manually or get it from an external source. However, you
   will often want to use web scraping to find all URLs of interest on a given
   website.

   This process, known as **web crawling**, is a web scraping process in
   itself, with the same steps (target URLs, download, parsing), where the
   target URL is often the homepage of the target website, and the output is
   usually the list of target URLs.
2. **Downloading** the webpages at those URLs.

   The main challenge at the download stage is avoiding bans. However, the
   complexity of this step can also be influenced by your output needs (e.g.
   screenshots) and parsing choices (e.g. browser automation).
3. **Parsing** those webpages to extract data of interest in a structured data
   format as output.

   Parsing can be a complex step, and it may involve downloading additional
   URLs, or using browser automation.

   In the long term, however, the main challenge of the parsing stage is
   dealing with breaking changes in the target website.

   To make parsing easier, use our Coding Agent Add-Ons.
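
The crawling variant of these steps can be sketched with the Python standard library alone. This is a minimal illustration, not production code: `LinkParser` and `find_target_urls` are hypothetical names, the sample HTML and the `example.com` URL are placeholders, and the download step is left out so that the example runs offline.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect the href of every <a> element (the parsing step)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def find_target_urls(page_url, html):
    """Turn the links of a downloaded page into absolute target URLs."""
    parser = LinkParser()
    parser.feed(html)
    return [urljoin(page_url, href) for href in parser.hrefs]


# Stand-in for a downloaded page; a real crawler would fetch this over HTTP.
sample_html = '<a href="/catalogue/page-2.html">next</a> <a href="book_1.html">Book</a>'
target_urls = find_target_urls("http://example.com/catalogue/index.html", sample_html)
print(target_urls)
```

A framework such as Scrapy takes care of the download step, link following, and scheduling for you, which is why the tutorial uses it instead of code like this.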

### Choosing a framework

Choosing the right technology to write your code is key for the long-term
success of your web scraping project. To make your choice, you should
consider aspects like development speed, performance, maintainability, and
vendor lock-in.

At Zyte we use and maintain [Scrapy](https://scrapy.org), a popular open source web scraping
framework written in Python. Scrapy is a powerful, extensible framework that
favors writing maintainable code.

For most Zyte products and services we provide Scrapy plugins that make
integration with Scrapy seamless.

### Avoiding bans

An increasing number of websites ban some of their traffic.

What you need to do to avoid bans depends on the target websites, and it can
vary wildly. Proxy rotation is often necessary, but on top of that you may need
extra logic, including cookie and session handling, browser-like JavaScript
execution, and browser-like HTTP protocol handling.

While you can implement ban avoidance on your own by combining different
services and tools, it can be time-consuming to implement, maintain, and scale.

To avoid bans, we provide Zyte API, an API that
automatically avoids bans cost-efficiently.
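
As a rough idea of what using Zyte API looks like without Scrapy, the sketch below builds (but does not send) an HTTP request for the body of a webpage. It assumes the `https://api.zyte.com/v1/extract` endpoint and HTTP Basic authentication with the API key as the username; check the Zyte API reference before relying on these details. `build_zyte_api_request` is a hypothetical helper name.

```python
import base64
import json
from urllib.request import Request


def build_zyte_api_request(api_key, url):
    """Build (but do not send) a Zyte API request for the body of `url`."""
    # Assumption: HTTP Basic auth, API key as username, empty password.
    token = base64.b64encode(f"{api_key}:".encode()).decode()
    payload = {"url": url, "httpResponseBody": True}
    return Request(
        "https://api.zyte.com/v1/extract",  # assumed endpoint
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_zyte_api_request("YOUR_ZYTE_API_KEY", "http://books.toscrape.com/")
print(request.full_url, request.get_method())
```

Passing the request to `urllib.request.urlopen` would send it. In a Scrapy project you would normally use the scrapy-zyte-api plugin instead, as the tutorial does.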

### Saving time with browser automation

For websites that use JavaScript to load content, there are 2 approaches you
can take:

- Reverse-engineering and recreating the JavaScript code that loads the
  content.
- Letting a browser automation tool run the JavaScript code.

Reverse-engineering usually requires more development time, but requires fewer
resources once implemented, and can uncover useful, hidden data.

Browser automation usually saves development time, but requires additional
resources, and can be hard to scale.

Zyte API provides browser automation features and, unlike regular browser automation tools, it:

- Scales easily, offsetting one of the main drawbacks of using browser
  automation tools.
- Supports special actions, high-level actions
  built with website-specific knowledge, such as searching or filling
  location data, to save you even more time.

### Taking screenshots

Sometimes you want a screenshot of the webpage from which you are extracting
data.

Screenshots can be handy as a visual representation of the extracted data, but
they can also be used, for example, to perform random quality checks where you
compare the screenshot with the extracted data.

To take webpage screenshots at scale, use Zyte API.

### Running your code

When running your web scraping code, you usually want a system where you can
easily start, schedule, monitor, and inspect your web scraping jobs, where they
can run uninterrupted for as long as needed, and where you can run as many
parallel jobs as you wish.

Scrapy Cloud is our solution for running web scraping code
in the cloud.

### Avoiding breaking website changes

Websites change, and when they do they can break your parsing code.

Monitoring your web scraping solution for breaking website changes, and
addressing those changes, can be very time-consuming, and it scales up as you
target more websites.

One way to avoid this issue altogether is *not* to write parsing code to begin
with. Instead, you can let Zyte API handle parsing
for you.

Alternatively, you can make addressing those changes less time-consuming with
our Coding Agent Add-Ons.

## Web scraping tutorials

> ##### Web scraping tutorial
>
> Build a production-ready web-scraping project from scratch.

> ##### Tutorial for Zyte Web Data for Claude Code
>
> Use **Zyte Web Data for Claude Code** to generate better web scraping
> code faster.

> ##### Tutorial for Web Scraping Copilot
>
> Use **Web Scraping Copilot** to generate better web scraping code
> faster with **Visual Studio Code**.

## Web scraping tutorial

In this tutorial you will build a [production-ready web-scraping project](https://github.com/zytedata/web-scraping-tutorial-project) from
scratch:

> ##### 1. Start a Scrapy project
>
> Install Python and Scrapy, create a Scrapy project, and write your
> first spider.

> ##### 2. Deploy and run on Scrapy Cloud
>
> Deploy your project to Scrapy Cloud, run a job, and download the
> results.

> ##### 3. Enable Zyte API to avoid bans
>
> Install scrapy-zyte-api, and configure your project to use it in
> transparent mode.

> ##### 4. Handle JavaScript content
>
> Reproduce JavaScript code with HTTP requests, or execute it with
> browser automation.

> ##### 5. Automate parsing
>
> Use automatic extraction to get structured data without writing parsing
> code.

If you want to learn more, check out our guides!

## Start a Scrapy project

To build your web scraping project, you will use [Scrapy](https://scrapy.org/), a popular open source
web scraping framework written in [Python](https://www.python.org/) and maintained by Zyte.

### Set up your project

#### Claude Code

> ###### NOTE
>
> Uses claude.

1. Install Zyte Web Data for Claude Code.
2. Create a `web-scraping-tutorial` folder and start a **Claude
   Code** session from it:
   ```shell
   mkdir web-scraping-tutorial
   cd web-scraping-tutorial
   claude
   ```
3. Prompt **Claude Code** to:
   > Create a Scrapy project named `web-scraping-tutorial` in the
   > current folder.

#### Copilot

> ###### NOTE
>
> Uses copilot.

1. Install Web Scraping Copilot.
2. On the Web Scraping Copilot sidebar view, select **Start building ›
   Create new project**.
3. On the **Create new Scrapy project** page, set the **Scrapy project
   name** to `web-scraping-tutorial`, select a projects folder, and
   click **Create**.

   Your new `web-scraping-tutorial` workspace will be created and
   set up.

#### CLI

1. [Install Python](https://wiki.python.org/moin/BeginnersGuide/Download), version `3.10` or higher.
2. Create a `web-scraping-tutorial` folder and make it your working
   folder:
   ```bash
   mkdir web-scraping-tutorial
   cd web-scraping-tutorial
   ```
3. Create and activate a [Python virtual environment](https://docs.python.org/3/tutorial/venv.html#creating-virtual-environments).

   #### Windows

   ```batch
   python3 -m venv venv
   venv\Scripts\activate.bat
   ```

   #### macOS, Linux

   ```bash
   python3 -m venv venv
   . venv/bin/activate
   ```
4. Install Scrapy:
   ```bash
   pip install scrapy==2.14.2
   ```
5. Make `web-scraping-tutorial` a [Scrapy](https://scrapy.org/) project folder:
   ```bash
   scrapy startproject web_scraping_tutorial .
   ```

Your `web-scraping-tutorial` folder should now contain at least the following
folders and files:

```text
web-scraping-tutorial/
├── venv/
│   └── …
├── web_scraping_tutorial/
│   ├── spiders/
│   │   └── __init__.py
│   ├── __init__.py
│   ├── items.py
│   ├── middlewares.py
│   ├── pipelines.py
│   └── settings.py
└── scrapy.cfg
```

### Create your first spider

Now that you are all set up, you will write code to extract data from all books
in the Mystery category of [books.toscrape.com](http://books.toscrape.com/).

Create a file at `web_scraping_tutorial/spiders/books_toscrape_com.py`
with the following code:

```python
from scrapy import Spider

class BooksToScrapeComSpider(Spider):
    name = "books_toscrape_com"
    custom_settings = {
        "CONCURRENT_REQUESTS_PER_DOMAIN": 8,
        "DOWNLOAD_DELAY": 0.01,
    }
    start_urls = [
        "http://books.toscrape.com/catalogue/category/books/mystery_3/index.html"
    ]

    def parse(self, response):
        next_page_links = response.css(".next a")
        yield from response.follow_all(next_page_links)
        book_links = response.css("article a")
        yield from response.follow_all(book_links, callback=self.parse_book)

    def parse_book(self, response):
        yield {
            "name": response.css("h1::text").get(),
            "price": response.css(".price_color::text").re_first("£(.*)"),
            "url": response.url,
        }
```

In the code above:

- You define a [Scrapy spider class](https://docs.scrapy.org/en/latest/topics/spiders.html), `BooksToScrapeComSpider`, whose spider
  name is `books_toscrape_com`.
- You set custom values for `CONCURRENT_REQUESTS_PER_DOMAIN` and
  `DOWNLOAD_DELAY` to speed up crawls during the tutorial.
  [https://toscrape.com](https://toscrape.com) is a test site, so it is safe to do so.
- Your spider starts by sending a request for the Mystery category URL,
  [http://books.toscrape.com/catalogue/category/books/mystery_3/index.html](http://books.toscrape.com/catalogue/category/books/mystery_3/index.html)
  (`start_urls`), and parses the response with the default callback method:
  `parse`.
- The `parse` callback method:
  - Finds the link to the next page and, if found, yields a request for it,
    whose response will also be parsed by the `parse` callback method.

    As a result, the `parse` callback method eventually parses all pages
    of the Mystery category.
  - Finds links to book detail pages, and yields requests for them, whose
    responses will be parsed by the `parse_book` callback method.

    As a result, the `parse_book` callback method eventually parses all
    book detail pages from the Mystery category.
- The `parse_book` callback method extracts a record of book information
  with the book name, price, and URL.
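
The price extraction in `parse_book` relies on a regular expression. The sketch below reproduces it with the standard `re` module, assuming that `re_first` returns the first capturing group of the first match. `extract_price` is a hypothetical helper and the price strings are illustrative.

```python
import re


def extract_price(price_text):
    """Return the part of a price string after the pound sign, or None."""
    match = re.search("£(.*)", price_text)
    return match.group(1) if match else None


print(extract_price("£51.77"))
print(extract_price("Free"))
```

The extracted price is still a string. Convert it, for example with `decimal.Decimal`, if you need to do arithmetic on it.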

> ###### TIP
>
> What if, instead of writing parsing code manually, you could use AI to
> generate it? See the tutorials of our Coding Agent Add-Ons.

Now run your spider:

#### Claude Code

> ###### NOTE
>
> Uses claude.

In a separate terminal, *not* in your **Claude Code** session, run:

```bash
scrapy crawl books_toscrape_com -O books.csv
```

#### Copilot

> ###### NOTE
>
> Uses copilot.

1. Select **Web Scraping Copilot** on the sidebar.
2. Expand the **Spiders** view. Click the **Refresh** button if your
   spider is not listed.
3. Click the **Run Spider Locally** button of your spider.
4. Paste the following in the **Arguments** field:
   ```none
   -O books.csv
   ```
5. Click **Run Spider**.

#### CLI

```bash
scrapy crawl books_toscrape_com -O books.csv
```

Once execution finishes, the generated `books.csv` file will contain records
for all books from the Mystery category of [books.toscrape.com](http://books.toscrape.com/) in [CSV](https://en.wikipedia.org/wiki/Comma-separated_values)
format. You can open `books.csv` with any spreadsheet app.
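
If you prefer to inspect the output from Python, a sketch like the following works with the standard `csv` module. The sample rows are made-up placeholders shaped like the spider output, and `load_books` is a hypothetical helper. It assumes that every `price` value is a plain decimal number.

```python
import csv
import io

# Placeholder rows shaped like the spider output; the values are made up.
sample_csv = io.StringIO(
    "name,price,url\n"
    "Example Book A,10.00,http://books.toscrape.com/catalogue/a/index.html\n"
    "Example Book B,25.50,http://books.toscrape.com/catalogue/b/index.html\n"
)


def load_books(csv_file):
    """Read book records and convert the price column to a float."""
    books = []
    for row in csv.DictReader(csv_file):
        row["price"] = float(row["price"])
        books.append(row)
    return books


books = load_books(sample_csv)
print(len(books), max(book["price"] for book in books))
```

To read the real file, open it with `open("books.csv", newline="", encoding="utf-8")` and pass the file object to `load_books`.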

Continue to the next chapter to learn how you can
easily deploy and run your web scraping project on the cloud.

## Deploy and run on Scrapy Cloud

You now have a working Scrapy project that you have
been running locally. Running your code locally is fine during development, but
for production you usually want something better.

You will now deploy and run your code on Scrapy Cloud,
which you can do for free.

### Deploy to Scrapy Cloud

#### Claude Code

> ###### NOTE
>
> Uses claude.

1. Create a Scrapy Cloud project on the [Zyte dashboard](https://app.zyte.com/).
2. Once created, copy your Scrapy Cloud project ID from the browser
   URL bar.

   For example, if the URL is
   `https://app.zyte.com/p/000000/deploy?state=deploy`, `000000`
   is your Scrapy Cloud project ID.
3. Prompt **Claude Code** to:
   > Deploy to Scrapy Cloud project `000000`

   Replacing `000000` with your actual project ID.

#### Copilot

> ###### NOTE
>
> Uses copilot.

1. Create a Scrapy Cloud project on the [Zyte dashboard](https://app.zyte.com/).
2. Back in **Visual Studio Code**, select **Web Scraping Copilot** on
   the sidebar.
3. On the **Spiders** view title, click the **Deploy to Scrapy Cloud**
   button.
   ![image](web-scraping/tutorials/main/images/cloud/deploy-button.png)
4. Complete the interactive Scrapy Cloud setup steps.
   ![image](web-scraping/tutorials/main/images/cloud/interactive-steps.png)
5. Add the following to `web-scraping-tutorial/scrapinghub.yml`:
   ```yaml
   stacks:
       default: scrapy:2.14-20260217
   ```
6. Click the **Deploy to Scrapy Cloud** button again, and confirm.

Once your Scrapy project has been deployed to your Scrapy Cloud
project, you will see a `Run your spiders at: <link>` line in the
output.

#### CLI

1. Create a Scrapy Cloud project on the [Zyte dashboard](https://app.zyte.com/).
2. Install the latest version of `shub`, the Scrapy Cloud
   command-line application:
   ```bash
   pip install --upgrade shub
   ```
3. Create a YAML file at `web-scraping-tutorial/scrapinghub.yml`
   with the following content:
   ```yaml
   stacks:
       default: scrapy:2.14-20260217
   ```
4. Copy your [Scrapy Cloud API key](https://app.zyte.com/o/settings/apikey) (*not* a Zyte API key) from the
   Zyte dashboard.
5. Run the following command and, when prompted, paste your API key
   and press `Enter`:
   ```bash
   shub login
   ```
6. On the [Zyte dashboard](https://app.zyte.com/), select your Scrapy Cloud project under
   **Scrapy Cloud Projects**, and copy your Scrapy Cloud project ID
   from the browser URL bar.

   For example, if the URL is `https://app.zyte.com/p/000000/jobs`,
   `000000` is your Scrapy Cloud project ID.
7. Make sure `web-scraping-tutorial` is your current working
   directory.
8. Run the following command, replacing `000000` with your actual
   project ID:
   ```bash
   shub deploy 000000
   ```

Your Scrapy project has now been deployed to your Scrapy Cloud project.

### Run a Scrapy Cloud job

Now that you have deployed your Scrapy project to your Scrapy Cloud project, it
is time to run one of your spiders on Scrapy Cloud:

1. On the [Zyte dashboard](https://app.zyte.com/), select your Scrapy Cloud project under **Scrapy
   Cloud Projects**.
   ![](web-scraping/tutorials/main/images/cloud/select-project.png)
2. On the **Dashboard** page of your project, select **Run** on the top-right
   corner.
   ![](web-scraping/tutorials/main/images/cloud/run.png)
3. On the **Run** dialog box:
   1. Select the **Spiders** field and, from the spider list that appears,
      select your spider name.
   2. Select **Run**.
      ![image](web-scraping/tutorials/main/images/cloud/run-run.png)

   A new Scrapy Cloud job will appear in the **Running** job list:
   ![](web-scraping/tutorials/main/images/cloud/running.png)

   Once the job finishes, it will move to the **Completed** job list:
   ![](web-scraping/tutorials/main/images/cloud/completed.png)
4. Follow the link from the **Job** column, **1/1**.
   ![](web-scraping/tutorials/main/images/cloud/job-link.png)
5. On the job page, select the **Items** tab.
   ![](web-scraping/tutorials/main/images/cloud/items.png)
6. On the **Items** page, select **Export › CSV**.
   ![](web-scraping/tutorials/main/images/cloud/export-csv.png)

The downloaded file will have the same data as the `books.csv` file that you
generated locally with your first spider.

Continue to the next chapter to learn how to avoid
website bans.

## Enable Zyte API to avoid bans

Now that you have run your project in Scrapy Cloud, it
is time to improve the project itself, starting with handling website bans.

Your target domain in this tutorial, [toscrape.com](http://toscrape.com/), does not ban traffic.
However, when targeting other websites, sooner or later you will get bans.

You will now configure your web scraping code to use Zyte API
to avoid bans on any website:

1. [Sign up for Zyte API](https://app.zyte.com/account/signup/zyteapi).

   You get $5 free for a month, and you should only need a
   fraction of that to complete this tutorial.
2. Set up your project to use Zyte API with your key:

   #### Claude Code

   > ###### NOTE
   >
   > Uses claude.

   Add `ZYTE_API_KEY = "YOUR_API_KEY"` to
   `web-scraping-tutorial/web_scraping_tutorial/settings.py`, and replace
   `YOUR_API_KEY` with [your Zyte API key](https://app.zyte.com/o/zyte-api/api-access).

   #### Copilot

   > ###### NOTE
   >
   > Uses copilot.

   Remove `#` from the `#ZYTE_API_KEY = "YOUR_API_KEY"` line in
   `web-scraping-tutorial/web_scraping_tutorial/settings.py`, and replace
   `YOUR_API_KEY` with [your Zyte API key](https://app.zyte.com/o/zyte-api/api-access).

   #### CLI

   1. Install the latest version of scrapy-zyte-api:
      ```bash
      pip install --upgrade scrapy-zyte-api
      ```
   2. Configure scrapy-zyte-api in transparent mode by adding the following code at the end of
      `web-scraping-tutorial/web_scraping_tutorial/settings.py`, replacing
      `YOUR_ZYTE_API_KEY` with [your Zyte API key](https://app.zyte.com/o/zyte-api/api-access):
      ```python
      ZYTE_API_KEY = "YOUR_ZYTE_API_KEY"
      ```

      Then, edit the `ADDONS` section at the start of the
      `web-scraping-tutorial/web_scraping_tutorial/settings.py` file:
      ```python
      ADDONS = {
          "scrapy_zyte_api.Addon": 500,
      }
      ```
3. Get `scrapy-zyte-api` installed when running in Scrapy Cloud:

   #### Claude Code

   > ###### NOTE
   >
   > Uses claude.

   Skip this step.

   **Zyte Web Data for Claude Code** handled this automatically when
   you deployed your project to Scrapy Cloud.

   #### Copilot

   > ###### NOTE
   >
   > Uses copilot.

   1. Create `web-scraping-tutorial/requirements.txt` with the
      following content:
      ```none
      scrapy-zyte-api
      ```
   2. Add the following to
      `web-scraping-tutorial/scrapinghub.yml`:
      ```yaml
      requirements:
          file: requirements.txt
      ```

   #### CLI

   1. Create `web-scraping-tutorial/requirements.txt` with the
      following content:
      ```none
      scrapy-zyte-api
      ```
   2. Add the following to
      `web-scraping-tutorial/scrapinghub.yml`:
      ```yaml
      requirements:
          file: requirements.txt
      ```

If you run your code again, it will work the same as before, except that
requests will now be sent through Zyte API, which avoids bans
cost-efficiently.

Continue to the next chapter to learn about browser
automation.

> ###### TIP
> - We closely monitor the success rate for the most popular websites, but
>   less popular websites may slip under our radar. If you ever find a
>   website for which Zyte API does not work as expected (e.g. gives you a
>   ban response or too many errors), you can [reach
>   out to our expert anti-ban team](https://support.zyte.com/support/tickets/new).
> - If you get an SSL error, install the Zyte CA certificate on
>   your system and try again.

## Handle JavaScript content

Now that you know how to handle bans, you will learn
how to handle websites that load content dynamically using JavaScript.

You will first reproduce what the JavaScript code does with regular HTTP
requests, then you will use browser automation to achieve the same, and finally
you will interact with a page.

### Reproduce JavaScript requests

Your next target will be [http://quotes.toscrape.com/scroll](http://quotes.toscrape.com/scroll), from which you will
extract 100 quotes.

However, the HTML code of that page contains no quotes at all. All 100 quotes
are loaded dynamically: the webpage uses JavaScript code to send requests to
its own API. To get all 100 quotes, you will need to reproduce those requests.

Create a file at
`web_scraping_tutorial/spiders/quotes_toscrape_com_scroll_api.py` with
the following code:

```python
import json
from scrapy import Spider

class QuotesToScrapeComScrollAPISpider(Spider):
    name = "quotes_toscrape_com_scroll_api"
    custom_settings = {
        "CONCURRENT_REQUESTS_PER_DOMAIN": 8,
        "DOWNLOAD_DELAY": 0.01,
    }
    start_urls = [
        f"http://quotes.toscrape.com/api/quotes?page={n}" for n in range(1, 11)
    ]

    def parse(self, response):
        data = json.loads(response.text)
        for quote in data["quotes"]:
            yield {
                "author": quote["author"]["name"],
                "tags": quote["tags"],
                "text": quote["text"],
            }
```

The code above sends 10 requests to the API of [quotes.toscrape.com](http://quotes.toscrape.com/),
reproducing what JavaScript code at [http://quotes.toscrape.com/scroll](http://quotes.toscrape.com/scroll) does, and
then parses the JSON response to extract the desired data.
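
To see the URL list and the parsing logic in isolation, the sketch below runs them outside Scrapy. The payload is made up and includes only the keys that the spider reads; the real API response may contain more. `parse_quotes` is a hypothetical helper.

```python
import json

# The same 10 API URLs that the spider lists in start_urls.
api_urls = [f"http://quotes.toscrape.com/api/quotes?page={n}" for n in range(1, 11)]

# Made-up payload with only the keys that the spider reads.
sample_body = json.dumps(
    {
        "quotes": [
            {
                "author": {"name": "Example Author"},
                "tags": ["example", "placeholder"],
                "text": "An example quote.",
            }
        ]
    }
)


def parse_quotes(body):
    """Flatten the nested API records into the dicts that the spider yields."""
    data = json.loads(body)
    return [
        {
            "author": quote["author"]["name"],
            "tags": quote["tags"],
            "text": quote["text"],
        }
        for quote in data["quotes"]
    ]


records = parse_quotes(sample_body)
print(len(api_urls), records[0]["author"])
```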

Now run your new `quotes_toscrape_com_scroll_api`
spider with `-O quotes.csv`.

After all 10 requests are processed, all 100 quotes can be found in
`quotes.csv`.

When the information that you want to extract is not readily available in the
response HTML, but is loaded by JavaScript, reproducing the JavaScript code
manually, as you did above by sending those 10 requests, is one option. Next
you will try a few alternative approaches.

### Use browser automation

You will now ask Zyte API to use browser automation to
render the page contents and return browser HTML,
instead of raw HTML, and you will get Zyte API to render all 100 quotes with a
single Zyte API request.

Create a file at
`web_scraping_tutorial/spiders/quotes_toscrape_com_scroll_browser.py`
with the following code:

```python
from scrapy import Request, Spider

class QuotesToScrapeComScrollBrowserSpider(Spider):
    name = "quotes_toscrape_com_scroll_browser"

    async def start(self):
        yield Request(
            "http://quotes.toscrape.com/scroll",
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                    "actions": [
                        {
                            "action": "scrollBottom",
                        },
                    ],
                },
            },
        )

    def parse(self, response):
        for quote in response.css(".quote"):
            yield {
                "author": quote.css(".author::text").get(),
                "tags": quote.css(".tag::text").getall(),
                "text": quote.css(".text::text").get()[1:-1],
            }
```

The code above sends a single request to [http://quotes.toscrape.com/scroll](http://quotes.toscrape.com/scroll), but
this request includes some metadata. That is why the `start` method is used
instead of `start_urls`, since the latter does not allow defining request
metadata.

The specified metadata indicates to Zyte API that you want the URL to be loaded
in a web browser, that you want to execute the `scrollBottom` action, and that
you want the HTML rendering of the webpage [DOM](https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model/Introduction) after that. The `scrollBottom`
action keeps scrolling to the bottom of a webpage until that webpage stops
loading additional content, so that you get all 100 quotes, and not only the
first 10.
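
The `[1:-1]` slice in the `parse` method is not explained above. Presumably it removes the typographic quotation marks that wrap each quote on the page, so that the text matches what the API returns; treat that as an assumption. The sketch below shows the effect, and also guards against `.get()` returning `None` when a selector matches nothing. `strip_quote_marks` is a hypothetical helper.

```python
def strip_quote_marks(text):
    """Drop the first and last character, as the `[1:-1]` slice does."""
    if text is None:  # .get() returns None when the selector matches nothing
        return None
    return text[1:-1]


print(strip_quote_marks("\u201cAn example quote.\u201d"))
```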

Now run your new
`quotes_toscrape_com_scroll_browser` spider with `-O quotes.csv`.

`quotes.csv` will have the same data as before, only that now it has been
generated through browser rendering.

### Use network capture

What if you could have the best of both worlds, i.e. use browser rendering to
avoid reverse engineering, and get the API responses and not only what is loaded
into the DOM?

You will now ask Zyte API to use network capture
to render the page contents and *capture* the API responses.

Create a file at
`web_scraping_tutorial/spiders/quotes_toscrape_com_scroll_capture.py`
with the following code:

```python
import json
from base64 import b64decode

from scrapy import Request, Spider

class QuotesToScrapeComScrollCaptureSpider(Spider):
    name = "quotes_toscrape_com_scroll_capture"

    async def start(self):
        yield Request(
            "http://quotes.toscrape.com/scroll",
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                    "actions": [
                        {
                            "action": "scrollBottom",
                        },
                    ],
                    "networkCapture": [
                        {
                            "filterType": "url",
                            "httpResponseBody": True,
                            "value": "/api/",
                            "matchType": "contains",
                        },
                    ],
                },
            },
        )

    def parse(self, response):
        for capture in response.raw_api_response["networkCapture"]:
            text = b64decode(capture["httpResponseBody"]).decode()
            data = json.loads(text)
            for quote in data["quotes"]:
                yield {
                    "author": quote["author"]["name"],
                    "tags": quote["tags"],
                    "text": quote["text"],
                }
```

The specified metadata indicates that you want to capture the body of any
network response that contains `/api/` in its URL.
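
Captured response bodies are Base64-encoded, which is why the spider calls `b64decode` before parsing the JSON. The following sketch shows that round trip on a made-up capture entry that includes only the key the spider reads. `decode_capture` is a hypothetical helper.

```python
import json
from base64 import b64decode, b64encode

# Made-up capture entry; only the key that the spider reads is included.
api_payload = {"quotes": [{"author": {"name": "Example Author"}, "tags": [], "text": "Hi"}]}
sample_capture = {
    "httpResponseBody": b64encode(json.dumps(api_payload).encode()).decode(),
}


def decode_capture(capture):
    """Decode a Base64-encoded response body and parse it as JSON."""
    text = b64decode(capture["httpResponseBody"]).decode()
    return json.loads(text)


decoded = decode_capture(sample_capture)
print(decoded["quotes"][0]["author"]["name"])
```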

Now run your new
`quotes_toscrape_com_scroll_capture` spider with `-O quotes.csv`.

`quotes.csv` will have the same data as before, only that now it has been
generated through network capture.

Which option is best (reproducing JavaScript code manually, using
browser-rendered HTML, or using network capture) depends on the scenario. To
choose the right option, you need to factor in website specificity, development
time, run time, request count, and request cost, among other things.

### Use an action sequence

Sometimes, it can be really hard to reproduce JavaScript code manually, or the
resulting code can break too easily, making the browser automation option a
clear winner.

You will now extract a quote from [http://quotes.toscrape.com/search.aspx](http://quotes.toscrape.com/search.aspx) by
interacting with the search form through browser actions.

Create a file at
`web_scraping_tutorial/spiders/quotes_toscrape_com_search.py` with the
following code:

```python
from scrapy import Request, Spider

class QuotesToScrapeComSearchSpider(Spider):
    name = "quotes_toscrape_com_search"

    async def start(self):
        yield Request(
            "http://quotes.toscrape.com/search.aspx",
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                    "actions": [
                        {
                            "action": "select",
                            "selector": {"type": "css", "value": "#author"},
                            "values": ["Albert Einstein"],
                        },
                        {
                            "action": "waitForSelector",
                            "selector": {
                                "type": "css",
                                "value": "[value=\"world\"]",
                                "state": "attached",
                            },
                        },
                        {
                            "action": "select",
                            "selector": {"type": "css", "value": "#tag"},
                            "values": ["world"],
                        },
                        {
                            "action": "click",
                            "selector": {"type": "css", "value": "[type='submit']"},
                        },
                        {
                            "action": "waitForSelector",
                            "selector": {"type": "css", "value": ".quote"},
                        },
                    ],
                },
            },
        )

    def parse(self, response):
        for quote in response.css(".quote"):
            yield {
                "author": quote.css(".author::text").get(),
                "tags": quote.css(".tag::text").getall(),
                "text": quote.css(".content::text").get()[1:-1],
            }
```

The code above sends a request that makes Zyte API load
[http://quotes.toscrape.com/search.aspx](http://quotes.toscrape.com/search.aspx) and perform the following actions:

1. Select Albert Einstein as author.
2. Wait for the “world” tag to load.
3. Select the “world” tag.
4. Click the **Search** button.
5. Wait for a quote to load.

From the HTML rendering of the DOM after those actions are executed, your code
extracts all displayed quotes.
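
If you write many action sequences, small builder functions can make them easier to read. The helpers below (`css`, `select`, `wait_for`, `click`) are hypothetical and not part of Zyte API or scrapy-zyte-api; they only produce the same dictionaries that the spider above writes by hand.

```python
def css(value, **extra):
    """Build a CSS selector dict in the shape used by the spider above."""
    return {"type": "css", "value": value, **extra}


def select(selector_value, option):
    """Build a `select` action that picks one option."""
    return {"action": "select", "selector": css(selector_value), "values": [option]}


def wait_for(selector_value, **extra):
    """Build a `waitForSelector` action, with optional extra selector keys."""
    return {"action": "waitForSelector", "selector": css(selector_value, **extra)}


def click(selector_value):
    """Build a `click` action."""
    return {"action": "click", "selector": css(selector_value)}


actions = [
    select("#author", "Albert Einstein"),
    wait_for('[value="world"]', state="attached"),
    select("#tag", "world"),
    click("[type='submit']"),
    wait_for(".quote"),
]
print([action["action"] for action in actions])
```

The resulting `actions` list can be used as the value of the `"actions"` key in the `zyte_api_automap` request metadata.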

Now run your new `quotes_toscrape_com_search`
spider with `-O quotes.csv`.

`quotes.csv` will have 1 quote from Albert Einstein about the world.

If you were to try and write alternative code that, instead of relying on the
browser HTML feature from Zyte API, reproduces the underlying JavaScript code
with regular requests, it may take you a while to build a working solution, and
your solution may be more fragile, i.e. more likely to break with server code
changes.

Continue to the next chapter to learn how you can
avoid the need to write and maintain parsing code in the first place.

## Automate parsing

Now that you are familiar with browser automation, it is
time to learn about *parsing automation*.

> ###### TIP
>
> This page covers AI-powered parsing on every request. See also the
> tutorials of our Coding Agent Add-Ons for an alternative approach where AI is used to
> generate parsing code instead. See also zapi-extract-vs-ai-code.

Your first spider parsed 3 fields from book webpages of
[books.toscrape.com](http://books.toscrape.com/): `name`, `price`, `url`.

When targeting other websites, there are 2 challenges you are going to face:

- You will probably want more fields.

  For example, our [automatic extraction product schema](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/product) has more than 25 fields. You need to write parsing
  logic for every combination of target field *and* target website.
  Coding Agent Add-Ons can speed up this work significantly, but they cannot
  make it go away.
- Websites change, and when they do they can break your parsing code.

  You need to monitor your web scraping project for breaking website changes,
  and update your parsing code accordingly when they occur.

These issues are time-consuming and scale up with additional fields and
websites. To avoid them altogether, you can let Zyte API handle parsing for
you.

Create a file at
`web_scraping_tutorial/spiders/books_toscrape_com_extract.py` with the
following code:

```python
from scrapy import Spider

class BooksToScrapeComExtractSpider(Spider):
    name = "books_toscrape_com_extract"
    custom_settings = {
        "CONCURRENT_REQUESTS_PER_DOMAIN": 8,
        "DOWNLOAD_DELAY": 0.01,
    }
    start_urls = [
        "http://books.toscrape.com/catalogue/category/books/mystery_3/index.html"
    ]

    def parse(self, response):
        next_page_links = response.css(".next a")
        yield from response.follow_all(next_page_links)
        book_links = response.css("article a")
        for request in response.follow_all(book_links, callback=self.parse_book):
            request.meta["zyte_api_automap"] = {"product": True}
            yield request

    def parse_book(self, response):
        yield response.raw_api_response["product"]
```

The code above is a modification of your first spider
that uses automatic extraction, where:

- In requests for book URLs, at the end of the `parse` callback method, you
  include request metadata to have Zyte API give you structured data for an
  e-commerce product.
- The `parse_book` callback method yields the product data from the Zyte
  API response.

Now run your new `books_toscrape_com_extract`
spider with `-O books.csv`.

Your code will now extract many more fields from each book, all without you
having to write a single line of parsing code.

> ###### NOTE
>
> zapi-extract requires you to specify the kind of data you
> want to extract.
>
> Your spider above uses [product](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/product) to request the data of a
> single e-commerce product, but automatic extraction supports many
> other types of data extraction.
>
> For example, if you need to extract a news article or a blog post, use the
> [article](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/article) data extraction type instead.
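
The request metadata used in the spider above follows the same shape for any extraction type. As a minimal sketch, the helper below builds that metadata for a given data type; `extraction_meta` is a hypothetical name for illustration, not part of any library. The extracted data would then be read from the matching key of `response.raw_api_response`:

```python
def extraction_meta(data_type: str) -> dict:
    """Build request metadata asking Zyte API for one type of extracted data.

    Hypothetical helper: it only mirrors the {"product": True} pattern
    used in the spider above.
    """
    return {"zyte_api_automap": {data_type: True}}


# Same metadata as in the spider above.
product_meta = extraction_meta("product")
# Metadata for a news article or a blog post.
article_meta = extraction_meta("article")
print(article_meta)
```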

This concludes our web scraping tutorial. The tutorial code is available [on
GitHub](https://github.com/zytedata/web-scraping-tutorial-project). To learn
more, check out our web scraping guides, our documentation for
Zyte API and Scrapy Cloud, and the
Scrapy documentation. You can also visit our [Support
Center](https://support.zyte.com/support/home) or reach out to the wider [web
scraping](https://discord.gg/GjB8dHCCJS) and [Scrapy](https://scrapy.org/community/) communities.

## Tutorial for Zyte Web Data for Claude Code

With [Claude Code](https://code.claude.com/docs/en/overview) and Zyte Web Data for Claude Code,
you can write better web scraping code faster:

1. Install Zyte Web Data for Claude Code.
2. Create a new folder and start a **Claude Code** session:
   ```shell
   mkdir claude-tutorial
   cd claude-tutorial
   claude
   ```
3. Prompt **Claude Code** to:
   > Scrape books.toscrape.com

**Claude Code** will take care of the rest, interacting with you only when
needed. For example:

- It will ask you how to get a detail page to analyze. You can, for example,
  choose to **Explore the site** to let Claude Code find a detail page on its
  own.
- It will analyze detail pages and propose fields to extract with example
  data, and you can adjust the extracted data schema to fit your preferences.

  Later on, it will also give you the option to open a **browser review**, a
  local web app where you can review data extracted with the earlier schema
  and provide feedback to the model about it.

## Tutorial for Web Scraping Copilot

With [GitHub Copilot](https://github.com/features/copilot) and Web Scraping Copilot you
can write maintainable web scraping code using AI:

> ###### WARNING
>
> The **GitHub Copilot Free** plan is *not* recommended for
> AI-assisted web scraping. See codegen-requirements for details.

> ##### 1. Set up a project
>
> Prepare Visual Studio Code and a Scrapy project with all prerequisites.

> ##### 2. Generate parsing code
>
> Use AI to generate parsing code for a webpage.

> ##### 3. Generate crawling code
>
> Use AI to generate crawling code for a website.

## Set up an AI web scraping project

Set up a project for AI-assisted web scraping:

1. Install Web Scraping Copilot.
2. On the Web Scraping Copilot sidebar view, select **Start building ›
   Create new project**.
3. On the **Create new Scrapy project** page, set the **Scrapy project name**
   to `copilot-tutorial`, select a projects folder, and click
   **Create**.

Your new `copilot-tutorial` workspace will be created and set up with the
following folders and files:

```
copilot-tutorial/
├── .venv/
│   └── …
├── copilot_tutorial/
│   ├── pages/
│   │   └── __init__.py
│   ├── spiders/
│   │   └── __init__.py
│   ├── __init__.py
│   ├── items.py
│   ├── middlewares.py
│   ├── pipelines.py
│   └── settings.py
└── scrapy.cfg
```

You now have everything you need to start generating parsing code with AI.

## Generate parsing code with AI

Now that your project is ready, you will use AI
to generate code to parse book webpages from [https://books.toscrape.com](https://books.toscrape.com).

### 1. Generate an item class

First, you need to define the type of data that you want to parse from each
book page.

> ###### TIP
>
> Select a capable model in the chat view, e.g. **GPT-5** or similar.
> **GPT-5 mini** is OK if you prefer a non-premium model. GPT-4.1 is
> problematic for web scraping.

Ask the AI to:

> Define a dataclass item called Book with title, price and url fields. Make
> them optional and of type str | None.

The AI should edit `copilot-tutorial/copilot_tutorial/items.py` to add:

`copilot-tutorial/copilot_tutorial/items.py`
```python
from dataclasses import dataclass

@dataclass
class Book:
    url: str | None = None
    title: str | None = None
    price: str | None = None
```

> ###### TIP
>
> You can use any item type supported by Scrapy;
> `dataclass` is one of many options.
>
> You can also use a pre-made item type from
> zyte-common-items, like
> `zyte_common_items.Product`, instead of writing your own item type
> from scratch.

### 2. Generate parsing code

Select **Web Scraping Copilot › Page Objects › Generate Parsing Code with AI**:

![image](_static/copilot/generate-0.1.0.png)

The chat view will open with the **WebScraping** agent, a prompt will be sent,
and the AI will start assisting. It should:

1. Ask you for some **input**.

   It usually detects the right item type to use and the right path to save
   your page objects (more on them later), but it always needs you to specify
   **example target URLs**.

   You are generating a page object for book detail pages, so choose a few
   such URLs and share them in chat.
   For example:
   [https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html](https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html)
   [https://books.toscrape.com/catalogue/tipping-the-velvet_999/index.html](https://books.toscrape.com/catalogue/tipping-the-velvet_999/index.html)
   [https://books.toscrape.com/catalogue/soumission_998/index.html](https://books.toscrape.com/catalogue/soumission_998/index.html)
2. Create
   `copilot-tutorial/copilot_tutorial/pages/books_toscrape_com.py` with
   something like:
   ```python
   from copilot_tutorial.items import Book
   from web_poet import Returns, WebPage, field, handle_urls

   @handle_urls("books.toscrape.com")
   class BooksToscrapeComBookPage(WebPage, Returns[Book]):
       pass
   ```

   > ###### NOTE
   >
   > This is a page object class. It
   > defines how to extract a given type of data
   > (e.g. `Book`) from a given URL pattern (e.g.
   > the `books.toscrape.com` domain).
3. Generate tests for the target example URLs.
   > ###### NOTE
   >
   > web-poet tests are example inputs and
   > expected outputs for a page object class. They can also assert that a
   > given input should raise an expected exception. You can use them to
   > test your code, and the AI can use them to generate the right parsing
   > code.
4. Populate test expectations.
5. Generate parsing code for your new page object class.
6. Run the generated tests to check that the generated parsing code extracts
   the expected data.

By the end, you should have a working page object class that can extract book
data from any book URL from [https://books.toscrape.com](https://books.toscrape.com).

### 3. Create a spider

Now that you have a working page object, it is time to implement a Scrapy
spider that uses it.

Create the following file:

`copilot-tutorial/copilot_tutorial/spiders/books.py`
```python
from scrapy import Request, Spider

from copilot_tutorial.items import Book

class BookSpider(Spider):
    name = "book"
    url: str

    async def start(self):
        yield Request(self.url, callback=self.parse_book)

    async def parse_book(self, _, book: Book):
        yield book
```

The spider expects a `url` argument, which you can pass to a spider with the
`-a url=<url>` syntax.

When a request targets the `parse_book` callback, scrapy-poet sees the `Book` type hint and injects a `book`
parameter built with your page object class.
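
The standard-library sketch below illustrates the general idea of filling callback parameters based on their type hints. It is a simplified illustration, not how scrapy-poet is actually implemented: `PROVIDERS`, `build_book`, and `call_with_injection` are hypothetical names, and a plain string stands in for the response.

```python
import inspect
from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class Book:
    title: str


def build_book(response: str) -> Book:
    # Stand-in for a page object: turns a "response" into an item.
    return Book(title=response.strip().title())


# Maps a type hint to the function that knows how to build that type.
PROVIDERS = {Book: build_book}


def call_with_injection(callback, response):
    """Call callback(response, ...), filling extra parameters by type hint."""
    hints = get_type_hints(callback)
    params = list(inspect.signature(callback).parameters)
    kwargs = {
        name: PROVIDERS[hints[name]](response)
        for name in params[1:]  # The first parameter is the response itself.
        if hints.get(name) in PROVIDERS
    }
    return callback(response, **kwargs)


def parse_book(_, book: Book):
    return book


result = call_with_injection(parse_book, "  soumission ")
print(result)
```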

Your spider can now extract book data from any book details page from
[https://books.toscrape.com](https://books.toscrape.com). For example, try running your `book` spider with the following arguments:

```bash
-a url=https://books.toscrape.com/catalogue/soumission_998/index.html
-o books.jsonl
```

It will generate a `books.jsonl` file with the following JSON object:

```json
{
    "url": "https://books.toscrape.com/catalogue/soumission_998/index.html",
    "title": "Soumission",
    "price": "50.10"
}
```
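
In the actual JSON Lines file, each item occupies a single line. To load such a file in your own code, a standard-library sketch like the following may be enough; `read_jsonl` is a hypothetical helper, and the temporary file only exists to make the example self-contained:

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path):
    """Return the JSON objects stored one per line in the given file."""
    with Path(path).open(encoding="utf-8") as lines:
        return [json.loads(line) for line in lines if line.strip()]


sample = {
    "url": "https://books.toscrape.com/catalogue/soumission_998/index.html",
    "title": "Soumission",
    "price": "50.10",
}
with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "books.jsonl"
    # Write one JSON object per line, as a JSON Lines export does.
    path.write_text(json.dumps(sample) + "\n", encoding="utf-8")
    books = read_jsonl(path)
print(books[0]["title"])
```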

You can also repeat step 2 for other book stores, and this spider will work for
them as well; there is no need for a separate spider per website.

Continue to the next chapter to use AI to
generate *crawling* code, to be able to write a spider that can crawl an entire
book store, and not just a single book URL.

## Generate crawling code with AI

Now that you have generated parsing code with AI, you will use AI to generate code that discovers book URLs
by parsing navigation webpages (homepage, categories) from
[https://books.toscrape.com](https://books.toscrape.com).

### 1. Generate navigation code

This time around, you can try generating an item class and a page object with a
single prompt:

> Now I want you to create a new item type, BookNavigation, for navigation
> data from the homepage and categories with the following optional fields:
> url, book_urls, next_page_url.

> Then create a page object for that item type and books.toscrape.com.

The workflow will be the same as before.

The generated item should look something like:

`copilot-tutorial/copilot_tutorial/items.py`
```python
@dataclass
class BookNavigation:
    url: str | None = None
    book_urls: list[str] | None = None
    next_page_url: str | None = None
```

As input URLs, use some variety. For example:
[https://books.toscrape.com/index.html](https://books.toscrape.com/index.html)
[https://books.toscrape.com/catalogue/category/books/paranormal_24/index.html](https://books.toscrape.com/catalogue/category/books/paranormal_24/index.html)
[https://books.toscrape.com/catalogue/category/books/mystery_3/index.html](https://books.toscrape.com/catalogue/category/books/mystery_3/index.html)
[https://books.toscrape.com/catalogue/category/books/fiction_10/page-2.html](https://books.toscrape.com/catalogue/category/books/fiction_10/page-2.html)
[https://books.toscrape.com/catalogue/category/books/historical-fiction_4/page-2.html](https://books.toscrape.com/catalogue/category/books/historical-fiction_4/page-2.html)

### 2. Create a crawling spider

Finally, add a new spider to
`copilot-tutorial/copilot_tutorial/spiders/books.py` that uses the new
`BookNavigation` item to implement crawling:

```python
from copilot_tutorial.items import BookNavigation

class BookNavigationSpider(Spider):
    name = "books"
    url: str

    async def start(self):
        yield Request(self.url, callback=self.parse_navigation)

    async def parse_navigation(self, response, navigation: BookNavigation):
        if navigation.next_page_url:
            yield response.follow(navigation.next_page_url, callback=self.parse_navigation)
        for url in navigation.book_urls or []:
            yield response.follow(url, callback=self.parse_book)

    async def parse_book(self, _, book: Book):
        yield book
```

Your new spider expects a navigation page as its `url` argument, and can
follow pagination and extract all relevant books.
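
`response.follow` accepts relative URLs and resolves them against the URL of the current response, so it does not matter whether your navigation page object returns relative or absolute URLs. The sketch below shows the same kind of resolution with `urljoin` from the standard library; the relative links are assumed examples of what a category page might contain, not output from your generated code:

```python
from urllib.parse import urljoin

page_url = (
    "https://books.toscrape.com/catalogue/category/books/mystery_3/index.html"
)

# A pagination link relative to the current category page.
next_page = urljoin(page_url, "page-2.html")
# A book link that climbs up to the catalogue folder.
book = urljoin(page_url, "../../../sharp-objects_997/index.html")

print(next_page)
print(book)
```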

Before you run it, however, you should add the following at the end of
`copilot-tutorial/copilot_tutorial/settings.py`:

`copilot-tutorial/copilot_tutorial/settings.py`
```python
DOWNLOAD_SLOTS = {
    "books.toscrape.com": {
        "delay": 0.01,
        "concurrency": 16,
    },
}
```

By default, Scrapy rate-limits requests. For [https://books.toscrape.com](https://books.toscrape.com),
however, it is safe to use a higher concurrency and a lower delay, and it will
make running the spider much quicker.

Now run the `books` spider with the
following **Arguments**:

```bash
-a url=https://books.toscrape.com/catalogue/category/books/mystery_3/index.html
-o books.jsonl
```

It will add the 32 books from the [Mystery category](https://books.toscrape.com/catalogue/category/books/mystery_3/index.html) to `books.jsonl`.

### Next steps

Congratulations! You have successfully used AI to generate maintainable web
scraping code.

This concludes our **Web Scraping Copilot** tutorial.

If you are wondering what to do next, consider enabling Zyte API to avoid bans.
See tutorial-zapi in tutorial.

## Web scraping guides

> ##### Exporting scraped data
>
> Learn to download or export your scraped data however and wherever you
> like.

## Exporting scraped data

How would you like to get your scraped data?

> ##### Scrapy Cloud
>
> Download from Scrapy Cloud.

> ##### File storage
>
> Use Scrapy to export to a file storage service, like Amazon S3 or
> Google Cloud Storage.

> ##### Item storage
>
> Use Scrapy to export to a database, message queue, indexer, or similar
> service.

### File storage

Choose the file storage service that you want to export to using [Scrapy](https://scrapy.org):

> ##### Amazon S3

> ##### Azure Storage

> ##### Dropbox

> ##### FTP servers

> ##### Google Cloud Storage

> ##### Google Drive

> ##### Google Sheets

> ##### SFTP servers

File storage exporting with Scrapy also provides [many options](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feed-options),
including
[batching](https://docs.scrapy.org/en/latest/topics/feed-exports.html#std-setting-FEED_EXPORT_BATCH_ITEM_COUNT),
[field customization](https://docs.scrapy.org/en/latest/topics/feed-exports.html#std-setting-FEED_EXPORT_FIELDS),
[item filtering](https://docs.scrapy.org/en/latest/topics/feed-exports.html#item-filter), and
[compression](https://docs.scrapy.org/en/latest/topics/feed-exports.html#post-processing).
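
As a rough sketch of how some of these options combine, the `FEEDS` value below enables batching, fixes the exported fields, and compresses the output. The local output path, the batch size, and the field names are assumptions made for the example:

```python
FEEDS = {
    # With batching enabled, the URI must include a batch placeholder
    # such as %(batch_id)d so that each batch gets its own file.
    "exports/books-%(batch_id)d.jsonl.gz": {
        "format": "jsonlines",
        # Start a new file every 1000 items.
        "batch_item_count": 1000,
        # Export only these fields, in this order.
        "fields": ["url", "title", "price"],
        # Compress each file with gzip.
        "postprocessing": ["scrapy.extensions.postprocessing.GzipPlugin"],
    },
}

# Scrapy fills in the placeholder itself; this line only illustrates
# the printf-style syntax with a made-up batch number.
first_batch = next(iter(FEEDS)) % {"batch_id": 1}
print(first_batch)
```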

You can also create your own [Scrapy storage backend](https://docs.scrapy.org/en/latest/topics/feed-exports.html#storage-backends).
Check the [code of existing storage backends](https://github.com/scrapy/scrapy/blob/721df895f9ea9d8073c13fbd2f75a6fbdc75ffc7/scrapy/extensions/feedexport.py#L258-L281)
to learn more.

### Item storage

Choose the item storage service that you want to export to using [Scrapy](https://scrapy.org):

> ##### Google BigQuery

You can also [create a custom Scrapy item pipeline](https://docs.scrapy.org/en/latest/topics/item-pipeline.html#writing-your-own-item-pipeline)
to implement item-based storage, for example using an existing Python asyncio
client library for a [database](https://github.com/timofurrer/awesome-asyncio#database-drivers) or a
[message queue](https://github.com/timofurrer/awesome-asyncio#message-queues)
service.
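
For illustration only, here is a minimal sketch of a custom item pipeline. To stay self-contained, it uses the synchronous `sqlite3` module from the standard library rather than an asyncio client library, and it only handles dictionary and dataclass items. `SqlitePipeline`, the `items` table, and the default database path are hypothetical names:

```python
import json
import sqlite3
from dataclasses import asdict, is_dataclass


class SqlitePipeline:
    """Store every scraped item as a JSON string in a SQLite table."""

    def __init__(self, path="items.db"):
        self.path = path
        self.connection = None

    # spider defaults to None so that the sketch can also be called directly.
    def open_spider(self, spider=None):
        self.connection = sqlite3.connect(self.path)
        self.connection.execute("CREATE TABLE IF NOT EXISTS items (data TEXT)")

    def close_spider(self, spider=None):
        self.connection.commit()
        self.connection.close()

    def process_item(self, item, spider=None):
        # Only dictionaries and dataclass items are handled in this sketch.
        data = asdict(item) if is_dataclass(item) else dict(item)
        self.connection.execute(
            "INSERT INTO items (data) VALUES (?)", (json.dumps(data),)
        )
        return item  # Pass the item on to the next pipeline.


# Exercise the pipeline by hand with an in-memory database.
pipeline = SqlitePipeline(":memory:")
pipeline.open_spider()
pipeline.process_item({"title": "Soumission"})
rows = pipeline.connection.execute("SELECT data FROM items").fetchall()
pipeline.close_spider()
print(rows)
```

In a Scrapy project, you would enable such a class through the `ITEM_PIPELINES` setting instead of calling its methods yourself.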

## Exporting to Amazon S3 with Scrapy

To configure a [Scrapy](https://scrapy.org) project or spider to export scraped data to
[Amazon S3](https://aws.amazon.com/pm/serv-s3/):

1. Install [boto3](https://github.com/boto/boto3):
   ```bash
   pip install boto3
   ```

   If you are using Scrapy Cloud, remember to add the
   following line to your `requirements.txt` file:
   ```none
   boto3
   ```
2. Add a [FEEDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#std-setting-FEEDS)
   setting to your project or spider, if not added yet.

   The value of `FEEDS` must be a JSON object (`{}`).

   If you have `FEEDS` already defined with key-value pairs, you can keep
   those if you want — `FEEDS` supports exporting data to multiple file
   storage service locations.

   To add `FEEDS` to a project, define it in your [Scrapy Cloud project
   settings](https://support.zyte.com/support/solutions/articles/22000200670-customizing-scrapy-settings-in-scrapy-cloud)
   or add it to your `settings.py` file:
   `settings.py`
   ```python
   FEEDS = {}
   ```

   To add `FEEDS` to a spider, define it in your Scrapy Cloud
   spider-specific settings (open a spider in Scrapy Cloud and select the
   **Settings** tab) or add it to your spider code with the [update_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.update_settings)
   method or the [custom_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.custom_settings) class variable:
   `spiders/myspider.py`
   ```python
   from scrapy import Spider

   class MySpider(Spider):
       name = "myspider"
       custom_settings = {
           "FEEDS": {},
       }
   ```
3. Add the following key-value pair to `FEEDS`:
   ```python
   {
       "s3://<BUCKET>/<PATH>": {
           "format": "<FORMAT>"
       }
   }
   ```

   Where:
   - `<BUCKET>` is your [bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) name, e.g. `mybucket`.
   - `<PATH>` is the path where you want to store the scraped data file, e.g.
     `scraped/data.csv`.

     The path can include [placeholders](https://docs.scrapy.org/en/latest/topics/feed-exports.html#storage-uri-parameters) that are replaced at run time, such
     as `%(time)s`, which is replaced by the current timestamp.
     > ###### WARNING
     >
     > Any pre-existing file in the specified path will be
     > overwritten. [Amazon S3 does not support appending to a file](https://stackoverflow.com/a/41783997).
   - `<FORMAT>` is the desired [output file format](https://docs.scrapy.org/en/latest/topics/feed-exports.html#serialization-formats).

     Possible values include: `csv`, `json`, `jsonlines`, `xml`. You can
     also [implement support for more formats](https://docs.scrapy.org/en/latest/topics/exporters.html).
     > ###### WARNING
     >
     > If you export in CSV format, and in your spider code you yield
     > items as Python dictionaries, only the fields present on the first yielded
     > item are exported for all items.
     >
     > One solution is to [customize output fields](https://docs.scrapy.org/en/latest/topics/exporters.html#scrapy.exporters.BaseItemExporter.fields_to_export) through the `fields` [feed
     > option](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feed-options) of [FEEDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feeds) or
     > through the [FEED_EXPORT_FIELDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feed-export-fields) Scrapy setting to explicitly indicate all
     > fields to export.
     >
     > You can alternatively yield something other than a Python dictionary that
     > supports declaring all possible fields, such as an [Item object](https://docs.scrapy.org/en/latest/topics/items.html#item-objects) or an
     > [attrs object](https://docs.scrapy.org/en/latest/topics/items.html#attr-s-objects).
4. Define the [AWS_ACCESS_KEY_ID](https://docs.scrapy.org/en/latest/topics/settings.html#std-setting-AWS_ACCESS_KEY_ID) and [AWS_SECRET_ACCESS_KEY](https://docs.scrapy.org/en/latest/topics/settings.html#std-setting-AWS_SECRET_ACCESS_KEY) Scrapy settings
   with your [access key](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html):
   `settings.py`
   ```python
   AWS_ACCESS_KEY_ID = "AKIAIOSFODNN7EXAMPLE"
   AWS_SECRET_ACCESS_KEY = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
   ```

   If you use [temporary security credentials](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#temporary-access-keys),
   also define the [AWS_SESSION_TOKEN](https://docs.scrapy.org/en/latest/topics/settings.html#std-setting-AWS_SESSION_TOKEN) setting.

   [Additional settings](https://docs.scrapy.org/en/latest/topics/feed-exports.html#s3)
   exist to define a target region, a custom access-control list, or a custom
   endpoint.

Running your spider now, locally or on Scrapy Cloud, will export your scraped
data to the configured Amazon S3 location.
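
Putting the steps above together, a `settings.py` fragment could look like the sketch below. The bucket name and path are placeholders, and reading the credentials from environment variables is an assumption about your deployment, not a requirement:

```python
import os

FEEDS = {
    # Placeholder bucket and path. Scrapy replaces %(name)s with the
    # spider name and %(time)s with a timestamp when the feed is created.
    "s3://mybucket/scraped/%(name)s/%(time)s.jsonl": {
        "format": "jsonlines",
    },
}

# Read the credentials from the environment instead of hard-coding them.
# Unset variables leave these settings as None.
AWS_ACCESS_KEY_ID = os.environ.get("AWS_ACCESS_KEY_ID")
AWS_SECRET_ACCESS_KEY = os.environ.get("AWS_SECRET_ACCESS_KEY")

# Scrapy fills in the placeholders itself; this line only illustrates
# the printf-style syntax with made-up values.
example_uri = next(iter(FEEDS)) % {"name": "books", "time": "2024-01-01T00-00-00"}
print(example_uri)
```

Using a timestamp placeholder in the path also avoids overwriting the output of earlier runs.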

## Exporting to Azure Storage with Scrapy

To configure a [Scrapy](https://scrapy.org) project or spider to export scraped data to
[Azure Storage](https://learn.microsoft.com/en-us/azure/storage/):

1. You need Python 3.8 or higher.

   If you are using Scrapy Cloud, make sure you are
   using [stack](https://support.zyte.com/support/solutions/articles/22000200402-changing-the-deploy-environment-with-scrapy-cloud-stacks) `scrapy:1.7-py38` or higher. Using the latest stack
   (`scrapy:2.14-20260217`) is generally recommended.
2. Install [scrapy-feedexporter-azure-storage](https://github.com/scrapy-plugins/scrapy-feedexporter-azure-storage):
   ```bash
   pip install git+https://github.com/scrapy-plugins/scrapy-feedexporter-azure-storage
   ```

   If you are using Scrapy Cloud, remember to add the
   following line to your `requirements.txt` file:
   ```none
   scrapy-feedexporter-azure-storage @ git+https://github.com/scrapy-plugins/scrapy-feedexporter-azure-storage
   ```
3. In your `settings.py` file, define `FEED_STORAGES` as follows:
   `settings.py`
   ```python
   FEED_STORAGES = {
       "azure": "scrapy_azure_exporter.AzureFeedStorage",
   }
   ```

   If the setting already exists in your `settings.py` file, modify the existing
   setting to add the key-value pair above, instead of re-defining the setting.
4. Add a [FEEDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#std-setting-FEEDS)
   setting to your project or spider, if not added yet.

   The value of `FEEDS` must be a JSON object (`{}`).

   If you have `FEEDS` already defined with key-value pairs, you can keep
   those if you want — `FEEDS` supports exporting data to multiple file
   storage service locations.

   To add `FEEDS` to a project, define it in your [Scrapy Cloud project
   settings](https://support.zyte.com/support/solutions/articles/22000200670-customizing-scrapy-settings-in-scrapy-cloud)
   or add it to your `settings.py` file:
   `settings.py`
   ```python
   FEEDS = {}
   ```

   To add `FEEDS` to a spider, define it in your Scrapy Cloud
   spider-specific settings (open a spider in Scrapy Cloud and select the
   **Settings** tab) or add it to your spider code with the [update_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.update_settings)
   method or the [custom_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.custom_settings) class variable:
   `spiders/myspider.py`
   ```python
   from scrapy import Spider

   class MySpider(Spider):
       name = "myspider"
       custom_settings = {
           "FEEDS": {},
       }
   ```
5. Add the following key-value pair to `FEEDS`:
   ```python
   {
       "azure://<ACCOUNT>.blob.core.windows.net/<CONTAINER>/<PATH>": {
           "format": "<FORMAT>"
       }
   }
   ```

   Where:
   - `<ACCOUNT>` is the name of your [storage account](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction#storage-accounts), e.g.
     `myaccount`.
   - `<CONTAINER>` is the name of your [container](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction#containers), e.g. `mycontainer`.
   - `<PATH>` is the path where you want to store the scraped data file, e.g.
     `scraped/data.csv`.

     The path can include [placeholders](https://docs.scrapy.org/en/latest/topics/feed-exports.html#storage-uri-parameters) that are replaced at run time, such
     as `%(time)s`, which is replaced by the current timestamp.
   - `<FORMAT>` is the desired [output file format](https://docs.scrapy.org/en/latest/topics/feed-exports.html#serialization-formats).

     Possible values include: `csv`, `json`, `jsonlines`, `xml`. You can
     also [implement support for more formats](https://docs.scrapy.org/en/latest/topics/exporters.html).
     > ###### WARNING
     >
     > If you export in CSV format, and in your spider code you yield
     > items as Python dictionaries, only the fields present on the first yielded
     > item are exported for all items.
     >
     > One solution is to [customize output fields](https://docs.scrapy.org/en/latest/topics/exporters.html#scrapy.exporters.BaseItemExporter.fields_to_export) through the `fields` [feed
     > option](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feed-options) of [FEEDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feeds) or
     > through the [FEED_EXPORT_FIELDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feed-export-fields) Scrapy setting to explicitly indicate all
     > fields to export.
     >
     > You can alternatively yield something other than a Python dictionary that
     > supports declaring all possible fields, such as an [Item object](https://docs.scrapy.org/en/latest/topics/items.html#item-objects) or an
     > [attrs object](https://docs.scrapy.org/en/latest/topics/items.html#attr-s-objects).
6. Define the `AZURE_ACCOUNT_URL` and `AZURE_ACCOUNT_KEY` settings with
   your [credentials](https://learn.microsoft.com/en-us/python/api/overview/azure/storage-blob-readme?view=azure-python#types-of-credentials):
   `settings.py`
   ```python
   AZURE_ACCOUNT_URL = "https://<ACCOUNT>.blob.core.windows.net"
   AZURE_ACCOUNT_KEY = "<KEY>"
   ```

   You can alternatively define the `AZURE_CONNECTION_STRING` setting to a
   [connection string](https://learn.microsoft.com/en-us/azure/storage/common/storage-configure-connection-string):
   `settings.py`
   ```python
   AZURE_CONNECTION_STRING = "DefaultEndpointsProtocol=https;AccountName=xxxx;AccountKey=xxxx;EndpointSuffix=core.windows.net"
   ```

   Or, if you have an [account URL that includes a SAS token](https://learn.microsoft.com/en-us/azure/ai-services/translator/document-translation/how-to-guides/create-sas-tokens), use the
   `AZURE_ACCOUNT_URL_WITH_SAS_TOKEN` setting instead:
   `settings.py`
   ```python
   AZURE_ACCOUNT_URL_WITH_SAS_TOKEN = "https://my.blob.core.windows.net/source-en/source-english.docx?sv=2019-12-12&st=2021-01-26T18%3A30%3A20Z&se=2021-02-05T18%3A30%3A00Z&sr=c&sp=rl&sig=d7PZKyQsIeE6xb%2B1M4Yb56I%2FEEKoNIF65D%2Fs0IFsYcE%3D"
   ```

Running your spider now, locally or on Scrapy Cloud, will export your scraped
data to the configured Azure Storage location.
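
As a sketch of how the settings above fit together, the fragment below derives the feed URI and the account URL from the same account name, so that the two cannot drift apart. The account, container, and path values are placeholders, and the environment variable name is an assumption of this example:

```python
import os

account = "myaccount"      # Placeholder storage account name.
container = "mycontainer"  # Placeholder container name.

FEED_STORAGES = {
    "azure": "scrapy_azure_exporter.AzureFeedStorage",
}
FEEDS = {
    f"azure://{account}.blob.core.windows.net/{container}/scraped/data.csv": {
        "format": "csv",
    },
}
AZURE_ACCOUNT_URL = f"https://{account}.blob.core.windows.net"
# Read the key from the environment instead of hard-coding it.
AZURE_ACCOUNT_KEY = os.environ.get("AZURE_ACCOUNT_KEY")

print(list(FEEDS))
```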

## Exporting to Dropbox with Scrapy

To configure a [Scrapy](https://scrapy.org) project or spider to export scraped data to [Dropbox](https://www.dropbox.com/):

1. Install [scrapy-feedexporter-dropbox](https://github.com/scrapy-plugins/scrapy-feedexporter-dropbox):
   ```bash
   pip install git+https://github.com/scrapy-plugins/scrapy-feedexporter-dropbox
   ```

   If you are using Scrapy Cloud, remember to add the
   following line to your `requirements.txt` file:
   ```none
   scrapy-feedexporter-dropbox @ git+https://github.com/scrapy-plugins/scrapy-feedexporter-dropbox
   ```
2. In your `settings.py` file, define `FEED_STORAGES` as follows:
   `settings.py`
   ```python
   FEED_STORAGES = {
       "dropbox": "scrapy_dropbox.DropboxFeedStorage",
   }
   ```

   If the setting already exists in your `settings.py` file, modify the existing
   setting to add the key-value pair above, instead of re-defining the setting.
3. Add a [FEEDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#std-setting-FEEDS)
   setting to your project or spider, if not added yet.

   The value of `FEEDS` must be a JSON object (`{}`).

   If you have `FEEDS` already defined with key-value pairs, you can keep
   those if you want — `FEEDS` supports exporting data to multiple file
   storage service locations.

   To add `FEEDS` to a project, define it in your [Scrapy Cloud project
   settings](https://support.zyte.com/support/solutions/articles/22000200670-customizing-scrapy-settings-in-scrapy-cloud)
   or add it to your `settings.py` file:
   `settings.py`
   ```python
   FEEDS = {}
   ```

   To add `FEEDS` to a spider, define it in your Scrapy Cloud
   spider-specific settings (open a spider in Scrapy Cloud and select the
   **Settings** tab) or add it to your spider code with the [update_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.update_settings)
   method or the [custom_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.custom_settings) class variable:
   `spiders/myspider.py`
   ```python
   from scrapy import Spider

   class MySpider(Spider):
       name = "myspider"
       custom_settings = {
           "FEEDS": {},
       }
   ```
4. Add the following key-value pair to `FEEDS`:
   ```python
   {
       "dropbox://<PATH>": {
           "format": "<FORMAT>"
       }
   }
   ```

   Where:
   - `<PATH>` is the path where you want to store the scraped data file, e.g.
     `scraped/data.csv`.

     The path can include [placeholders](https://docs.scrapy.org/en/latest/topics/feed-exports.html#storage-uri-parameters) that are replaced at run time, such
     as `%(time)s`, which is replaced by the current timestamp.
   - `<FORMAT>` is the desired [output file format](https://docs.scrapy.org/en/latest/topics/feed-exports.html#serialization-formats).

     Possible values include: `csv`, `json`, `jsonlines`, `xml`. You can
     also [implement support for more formats](https://docs.scrapy.org/en/latest/topics/exporters.html).
     > ###### WARNING
     >
     > If you export in CSV format, and in your spider code you yield
     > items as Python dictionaries, only the fields present on the first yielded
     > item are exported for all items.
     >
     > One solution is to [customize output fields](https://docs.scrapy.org/en/latest/topics/exporters.html#scrapy.exporters.BaseItemExporter.fields_to_export) through the `fields` [feed
     > option](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feed-options) of [FEEDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feeds) or
     > through the [FEED_EXPORT_FIELDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feed-export-fields) Scrapy setting to explicitly indicate all
     > fields to export.
     >
     > You can alternatively yield something other than a Python dictionary that
     > supports declaring all possible fields, such as an [Item object](https://docs.scrapy.org/en/latest/topics/items.html#item-objects) or an
     > [attrs object](https://docs.scrapy.org/en/latest/topics/items.html#attr-s-objects).
5. Define the `DROPBOX_API_TOKEN` setting with your [access token](https://dropbox.tech/developers/generate-an-access-token-for-your-own-account):
   `settings.py`
   ```python
   DROPBOX_API_TOKEN = "<ACCESS TOKEN>"
   ```

Running your spider now, locally or on Scrapy Cloud, will export your scraped
data to the configured Dropbox location.
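
If you export in CSV format and yield dictionaries, a sketch like the following shows how the `fields` feed option can list the columns explicitly. The path and the field names are assumptions for the example:

```python
FEED_STORAGES = {
    "dropbox": "scrapy_dropbox.DropboxFeedStorage",
}
FEEDS = {
    "dropbox://scraped/books.csv": {
        "format": "csv",
        # Export these columns for every item, even if the first
        # yielded item lacks some of them.
        "fields": ["url", "title", "price"],
    },
}

csv_columns = FEEDS["dropbox://scraped/books.csv"]["fields"]
print(csv_columns)
```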

## Exporting to an FTP server with Scrapy

> ###### NOTE
>
> Not to be confused with [SFTP](https://en.wikipedia.org/wiki/SSH_File_Transfer_Protocol) (see
> sftp) or [FTPS](https://en.wikipedia.org/wiki/FTPS).

To configure a [Scrapy](https://scrapy.org) project or spider to export scraped data to an [FTP
server](https://en.wikipedia.org/wiki/File_Transfer_Protocol):

1. Add a [FEEDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#std-setting-FEEDS)
   setting to your project or spider, if not added yet.

   The value of `FEEDS` must be a JSON object (`{}`).

   If you have `FEEDS` already defined with key-value pairs, you can keep
   those if you want — `FEEDS` supports exporting data to multiple file
   storage service locations.

   To add `FEEDS` to a project, define it in your [Scrapy Cloud project
   settings](https://support.zyte.com/support/solutions/articles/22000200670-customizing-scrapy-settings-in-scrapy-cloud)
   or add it to your `settings.py` file:
   settings.py
   ```python
   FEEDS = {}
   ```

   To add `FEEDS` to a spider, define it in your Scrapy Cloud
   spider-specific settings (open a spider in Scrapy Cloud and select the
   **Settings** tab) or add it to your spider code with the [update_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.update_settings)
   method or the [custom_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.custom_settings) class variable:
   spiders/myspider.py
   ```python
   import scrapy


   class MySpider(scrapy.Spider):
       custom_settings = {
           "FEEDS": {},
       }
   ```
2. Add the following key-value pair to `FEEDS`:
   ```python
   {
       "ftp://<USER>:<PASSWORD>@<HOST>/<PATH>": {
           "format": "<FORMAT>"
       }
   }
   ```

   Where:
   - `<USER>` and `<PASSWORD>` are your credentials for the FTP server,
     [percent-encoded](https://en.wikipedia.org/wiki/Percent-encoding).
   - `<HOST>` is the FTP server host, e.g. `ftp.example.com` or
     `203.0.113.123`.
   - `<PATH>` is the path where you want to store the scraped data file, e.g.
     `scraped/data.csv`.

     The path can include [placeholders](https://docs.scrapy.org/en/latest/topics/feed-exports.html#storage-uri-parameters) that are replaced at run time, such
     as `%(time)s`, which is replaced by the current timestamp.
   - `<FORMAT>` is the desired [output file format](https://docs.scrapy.org/en/latest/topics/feed-exports.html#serialization-formats).

     Possible values include: `csv`, `json`, `jsonlines`, `xml`. You can
     also [implement support for more formats](https://docs.scrapy.org/en/latest/topics/exporters.html).
     > ###### WARNING
     >
     > If you export in CSV format, and in your spider code you yield
     > items as Python dictionaries, only the fields present on the first yielded
     > item are exported for all items.
     >
     > One solution is to [customize output fields](https://docs.scrapy.org/en/latest/topics/exporters.html#scrapy.exporters.BaseItemExporter.fields_to_export) through the `fields` [feed
     > option](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feed-options) of [FEEDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feeds) or
     > through the [FEED_EXPORT_FIELDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feed-export-fields) Scrapy setting to explicitly indicate all
     > fields to export.
     >
     > You can alternatively yield something other than a Python dictionary that
     > supports declaring all possible fields, such as an [Item object](https://docs.scrapy.org/en/latest/topics/items.html#item-objects) or an
     > [attrs object](https://docs.scrapy.org/en/latest/topics/items.html#attr-s-objects).

Running your spider now, locally or on Scrapy Cloud, will export your scraped
data to the configured FTP server location.
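
The percent-encoding of credentials mentioned above can be sketched as follows. The `build_ftp_feed_uri` helper and the credentials are made up for illustration; only `urllib.parse.quote` comes from the Python standard library.

```python
from urllib.parse import quote


def build_ftp_feed_uri(user, password, host, path):
    """Build an ftp:// feed URI with percent-encoded credentials."""
    # safe="" also encodes "/", which quote() leaves untouched by default.
    encoded_user = quote(user, safe="")
    encoded_password = quote(password, safe="")
    return f"ftp://{encoded_user}:{encoded_password}@{host}/{path.lstrip('/')}"


# Hypothetical credentials containing characters that are reserved in URIs.
uri = build_ftp_feed_uri("scraper", "p@ss:w/rd", "ftp.example.com", "scraped/data.csv")

FEEDS = {
    uri: {"format": "csv"},
}
```

Without the encoding, the `@`, `:` and `/` characters in the password would be parsed as URI delimiters.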

## Exporting to Google Cloud Storage with Scrapy

To configure a [Scrapy](https://scrapy.org) project or spider to export scraped data to
[Google Cloud Storage](https://cloud.google.com/storage):

1. Install [google-cloud-storage](https://cloud.google.com/storage/docs/reference/libraries#client-libraries-install-python):
   ```bash
   pip install google-cloud-storage
   ```

   If you are using Scrapy Cloud, remember to add the
   following line to your `requirements.txt` file:
   ```none
   google-cloud-storage
   ```
2. Add a [FEEDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#std-setting-FEEDS)
   setting to your project or spider, if not added yet.

   The value of `FEEDS` must be a JSON object (`{}`).

   If you have `FEEDS` already defined with key-value pairs, you can keep
   those if you want — `FEEDS` supports exporting data to multiple file
   storage service locations.

   To add `FEEDS` to a project, define it in your [Scrapy Cloud project
   settings](https://support.zyte.com/support/solutions/articles/22000200670-customizing-scrapy-settings-in-scrapy-cloud)
   or add it to your `settings.py` file:
   settings.py
   ```python
   FEEDS = {}
   ```

   To add `FEEDS` to a spider, define it in your Scrapy Cloud
   spider-specific settings (open a spider in Scrapy Cloud and select the
   **Settings** tab) or add it to your spider code with the [update_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.update_settings)
   method or the [custom_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.custom_settings) class variable:
   spiders/myspider.py
   ```python
   import scrapy


   class MySpider(scrapy.Spider):
       custom_settings = {
           "FEEDS": {},
       }
   ```
3. Add the following key-value pair to `FEEDS`:
   ```python
   {
       "gs://<BUCKET>/<PATH>": {
           "format": "<FORMAT>"
       }
   }
   ```

   Where:
   - `<BUCKET>` is your [bucket](https://cloud.google.com/storage/docs/buckets) name, e.g. `mybucket`.
   - `<PATH>` is the path where you want to store the scraped data file, e.g.
     `scraped/data.csv`.

     The path can include [placeholders](https://docs.scrapy.org/en/latest/topics/feed-exports.html#storage-uri-parameters) that are replaced at run time, such
     as `%(time)s`, which is replaced by the current timestamp.
     > ###### WARNING
     >
     > Any pre-existing file in the specified path will be
     > overwritten. [Google Cloud Storage does not support appending to a
     > file](https://cloud.google.com/storage/docs/objects#immutability).
   - `<FORMAT>` is the desired [output file format](https://docs.scrapy.org/en/latest/topics/feed-exports.html#serialization-formats).

     Possible values include: `csv`, `json`, `jsonlines`, `xml`. You can
     also [implement support for more formats](https://docs.scrapy.org/en/latest/topics/exporters.html).
     > ###### WARNING
     >
     > If you export in CSV format, and in your spider code you yield
     > items as Python dictionaries, only the fields present on the first yielded
     > item are exported for all items.
     >
     > One solution is to [customize output fields](https://docs.scrapy.org/en/latest/topics/exporters.html#scrapy.exporters.BaseItemExporter.fields_to_export) through the `fields` [feed
     > option](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feed-options) of [FEEDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feeds) or
     > through the [FEED_EXPORT_FIELDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feed-export-fields) Scrapy setting to explicitly indicate all
     > fields to export.
     >
     > You can alternatively yield something other than a Python dictionary that
     > supports declaring all possible fields, such as an [Item object](https://docs.scrapy.org/en/latest/topics/items.html#item-objects) or an
     > [attrs object](https://docs.scrapy.org/en/latest/topics/items.html#attr-s-objects).
4. [Configure credential provision to ADC](https://cloud.google.com/docs/authentication/provide-credentials-adc#how-to).

   Also define the [GCS_PROJECT_ID](https://docs.scrapy.org/en/latest/topics/settings.html#std-setting-GCS_PROJECT_ID) Scrapy setting with your [project ID](https://cloud.google.com/resource-manager/docs/creating-managing-projects):
   settings.py
   ```python
   GCS_PROJECT_ID = "myproject"
   ```

   [Additional settings](https://docs.scrapy.org/en/latest/topics/feed-exports.html#google-cloud-storage-gcs)
   exist to define, for example, a custom access-control list.

Running your spider now, locally or on Scrapy Cloud, will export your scraped
data to the configured Google Cloud Storage location.
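
Because pre-existing files are overwritten, a timestamp placeholder in the path is one way to keep the output of every run. The sketch below only illustrates how such placeholders expand. It assumes Python printf-style formatting and a timestamp format that may differ from the one Scrapy produces. The `expand_uri` helper and the `mybucket` bucket are hypothetical.

```python
from datetime import datetime

URI_TEMPLATE = "gs://mybucket/scraped/%(name)s/%(time)s.jsonl"


def expand_uri(template, spider_name, now):
    """Fill in the placeholders the way printf-style formatting does."""
    params = {
        "name": spider_name,
        # Assumed timestamp format: ISO 8601 without microseconds, with
        # colons replaced so that the value is safe in file names.
        "time": now.replace(microsecond=0).isoformat().replace(":", "-"),
    }
    return template % params


# What the template could expand to for a spider named "books".
uri = expand_uri(URI_TEMPLATE, "books", datetime(2024, 1, 2, 3, 4, 5))

# In your settings you use the template itself; it is expanded at run time.
FEEDS = {
    URI_TEMPLATE: {"format": "jsonlines"},
}
```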

## Exporting to Google Drive with Scrapy

To configure a [Scrapy](https://scrapy.org) project or spider to export scraped data to
[Google Drive](https://www.google.com/drive/):

1. You need Python 3.8 or higher.

   If you are using Scrapy Cloud, make sure you are
   using [stack](https://support.zyte.com/support/solutions/articles/22000200402-changing-the-deploy-environment-with-scrapy-cloud-stacks) `scrapy:1.7-py38` or higher. Using the latest stack
   (`scrapy:2.14-20260217`) is generally recommended.
2. Install [scrapy-feedexporter-google-drive](https://github.com/scrapy-plugins/scrapy-feedexporter-google-drive):
   ```bash
   pip install git+https://github.com/scrapy-plugins/scrapy-feedexporter-google-drive
   ```

   If you are using Scrapy Cloud, remember to add the
   following line to your `requirements.txt` file:
   ```none
   scrapy-feedexporter-google-drive @ git+https://github.com/scrapy-plugins/scrapy-feedexporter-google-drive
   ```
3. In your `settings.py` file, define `FEED_STORAGES` as follows:
   settings.py
   ```python
   FEED_STORAGES = {
       "gdrive": "scrapy_gdrive_exporter.gdrive_exporter.GoogleDriveFeedStorage",
   }
   ```

   If the setting already exists in your `settings.py` file, modify the existing
   setting to add the key-value pair above, instead of re-defining the setting.
4. Add a [FEEDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#std-setting-FEEDS)
   setting to your project or spider, if not added yet.

   The value of `FEEDS` must be a JSON object (`{}`).

   If you have `FEEDS` already defined with key-value pairs, you can keep
   those if you want — `FEEDS` supports exporting data to multiple file
   storage service locations.

   To add `FEEDS` to a project, define it in your [Scrapy Cloud project
   settings](https://support.zyte.com/support/solutions/articles/22000200670-customizing-scrapy-settings-in-scrapy-cloud)
   or add it to your `settings.py` file:
   settings.py
   ```python
   FEEDS = {}
   ```

   To add `FEEDS` to a spider, define it in your Scrapy Cloud
   spider-specific settings (open a spider in Scrapy Cloud and select the
   **Settings** tab) or add it to your spider code with the [update_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.update_settings)
   method or the [custom_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.custom_settings) class variable:
   spiders/myspider.py
   ```python
   import scrapy


   class MySpider(scrapy.Spider):
       custom_settings = {
           "FEEDS": {},
       }
   ```
5. Add the following key-value pair to `FEEDS`:
   ```python
   {
       "gdrive://drive.google.com/<FOLDER ID>/<PATH>": {
           "format": "<FORMAT>"
       }
   }
   ```

   Where:
   - `<FOLDER ID>` is the ID of the target root folder, e.g.
     `1uWBpSBe3CvF8u21qTrzDqjZ6uexample`.
     > ###### TIP
     >
     > When you open a folder in Google Drive, its URL ends with the folder ID, e.g.
     > `https://drive.google.com/drive/folders/1uWBpSBe3CvF8u21qTrzDqjZ6uexample`.
   - `<PATH>` is the path where you want to store the scraped data file, e.g.
     `scraped/data.csv`.

     The path can include [placeholders](https://docs.scrapy.org/en/latest/topics/feed-exports.html#storage-uri-parameters) that are replaced at run time, such
     as `%(time)s`, which is replaced by the current timestamp.
     > ###### NOTE
     >
     > [scrapy-feedexporter-google-drive](https://github.com/scrapy-plugins/scrapy-feedexporter-google-drive) does not support
     > overwriting or appending to files; every export creates a new file.
   - `<FORMAT>` is the desired [output file format](https://docs.scrapy.org/en/latest/topics/feed-exports.html#serialization-formats).

     Possible values include: `csv`, `json`, `jsonlines`, `xml`. You can
     also [implement support for more formats](https://docs.scrapy.org/en/latest/topics/exporters.html).
     > ###### WARNING
     >
     > If you export in CSV format, and in your spider code you yield
     > items as Python dictionaries, only the fields present on the first yielded
     > item are exported for all items.
     >
     > One solution is to [customize output fields](https://docs.scrapy.org/en/latest/topics/exporters.html#scrapy.exporters.BaseItemExporter.fields_to_export) through the `fields` [feed
     > option](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feed-options) of [FEEDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feeds) or
     > through the [FEED_EXPORT_FIELDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feed-export-fields) Scrapy setting to explicitly indicate all
     > fields to export.
     >
     > You can alternatively yield something other than a Python dictionary that
     > supports declaring all possible fields, such as an [Item object](https://docs.scrapy.org/en/latest/topics/items.html#item-objects) or an
     > [attrs object](https://docs.scrapy.org/en/latest/topics/items.html#attr-s-objects).
6. Define the `GDRIVE_SERVICE_ACCOUNT_CREDENTIALS_JSON` setting as a Python
   string containing your [service account credentials](https://developers.google.com/identity/protocols/oauth2/service-account) in JSON format:
   settings.py
   ```python
   GDRIVE_SERVICE_ACCOUNT_CREDENTIALS_JSON = '{ "type": "service_account", "project_id": "myproject", "private_key_id": "…", "private_key": "…", "client_email": "…@email.iam.gserviceaccount.com", "client_id": "…", "auth_uri": "…", "token_uri": "…", "auth_provider_x509_cert_url": "…", "client_x509_cert_url": "…" }'
   ```

   Make sure you give your service account write access on the target folder.
   You can do that by sharing the folder with the email of the service account
   (`client_email` in the JSON above).

Running your spider now, locally or on Scrapy Cloud, will export your scraped
data to the configured Google Drive location.
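
If you prefer to derive the feed URI from the folder URL shown in your browser, a minimal sketch could look like this. The `gdrive_feed_uri` helper is hypothetical, and it assumes that the folder URL ends with the folder ID, as described in the tip above.

```python
from urllib.parse import urlparse


def gdrive_feed_uri(folder_url, path):
    """Build a gdrive:// feed URI from a Google Drive folder URL."""
    # The folder ID is assumed to be the last segment of the URL path.
    folder_id = urlparse(folder_url).path.rstrip("/").rsplit("/", 1)[-1]
    return f"gdrive://drive.google.com/{folder_id}/{path.lstrip('/')}"


uri = gdrive_feed_uri(
    "https://drive.google.com/drive/folders/1uWBpSBe3CvF8u21qTrzDqjZ6uexample",
    "scraped/data.csv",
)

FEEDS = {
    uri: {"format": "csv"},
}
```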

## Exporting to Google Sheets with Scrapy

To configure a [Scrapy](https://scrapy.org) project or spider to export scraped data to
[Google Sheets](https://www.google.com/sheets/about/):

1. You need Python 3.8 or higher.

   If you are using Scrapy Cloud, make sure you are
   using [stack](https://support.zyte.com/support/solutions/articles/22000200402-changing-the-deploy-environment-with-scrapy-cloud-stacks) `scrapy:1.7-py38` or higher. Using the latest stack
   (`scrapy:2.14-20260217`) is generally recommended.
2. Install [scrapy-feedexporter-google-sheets](https://github.com/scrapy-plugins/scrapy-feedexporter-google-sheets):
   ```bash
   pip install git+https://github.com/scrapy-plugins/scrapy-feedexporter-google-sheets
   ```

   If you are using Scrapy Cloud, remember to add the
   following line to your `requirements.txt` file:
   ```none
   scrapy_google_sheets_exporter @ git+https://github.com/scrapy-plugins/scrapy-feedexporter-google-sheets
   ```
3. In your `settings.py` file, define `FEED_STORAGES` as follows:
   settings.py
   ```python
   FEED_STORAGES = {
       "gsheets": "scrapy_google_sheets_exporter.gsheets_exporter.GoogleSheetsFeedStorage",
   }
   ```

   If the setting already exists in your `settings.py` file, modify the existing
   setting to add the key-value pair above, instead of re-defining the setting.
4. Add a [FEEDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#std-setting-FEEDS)
   setting to your project or spider, if not added yet.

   The value of `FEEDS` must be a JSON object (`{}`).

   If you have `FEEDS` already defined with key-value pairs, you can keep
   those if you want — `FEEDS` supports exporting data to multiple file
   storage service locations.

   To add `FEEDS` to a project, define it in your [Scrapy Cloud project
   settings](https://support.zyte.com/support/solutions/articles/22000200670-customizing-scrapy-settings-in-scrapy-cloud)
   or add it to your `settings.py` file:
   settings.py
   ```python
   FEEDS = {}
   ```

   To add `FEEDS` to a spider, define it in your Scrapy Cloud
   spider-specific settings (open a spider in Scrapy Cloud and select the
   **Settings** tab) or add it to your spider code with the [update_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.update_settings)
   method or the [custom_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.custom_settings) class variable:
   spiders/myspider.py
   ```python
   import scrapy


   class MySpider(scrapy.Spider):
       custom_settings = {
           "FEEDS": {},
       }
   ```
5. Add the following key-value pair to `FEEDS`:
   ```python
   {
       "gsheets://docs.google.com/spreadsheets/d/<SPREADSHEET ID>/edit#gid=<WORKSHEET ID>": {
           "format": "csv"
       }
   }
   ```

   Where:
   - You can find the right values for `<SPREADSHEET ID>` and
     `<WORKSHEET ID>` in the URL when you are looking at the target
     worksheet, e.g.
     `https://docs.google.com/spreadsheets/d/1fWJgq5yuOdeN3YnkBZiTD0VhB1MLzBNomz0s9YwBREo/edit#gid=1261678709`.
     > ###### NOTE
     >
     > If `/edit#gid=<WORKSHEET ID>` is omitted, the first
     > worksheet is used.

   To append to an existing worksheet, you should also:
   - Use the `fields` [feed option](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feed-options) of [FEEDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feeds) or the
     [FEED_EXPORT_FIELDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feed-export-fields) Scrapy setting to explicitly indicate all fields to
     export, in the expected order.
   - Set `item_export_kwargs.include_headers_line` to `False`, to not
     write the header row.

   For example:
   ```python
   {
       "gsheets://docs.google.com/spreadsheets/d/<SPREADSHEET ID>/edit#gid=<WORKSHEET ID>": {
           "format": "csv",
           "fields": ["field1", "field2"],
           "item_export_kwargs": {"include_headers_line": False}
       }
   }
   ```
6. Define the `GOOGLE_CREDENTIALS` setting as a Python dictionary with the
   contents of your [service account credentials](https://developers.google.com/identity/protocols/oauth2/service-account) JSON file:
   settings.py
   ```python
   GOOGLE_CREDENTIALS = {
       "type": "service_account",
       "project_id": "myproject",
       "private_key_id": "…",
       "private_key": "…",
       "client_email": "…@email.iam.gserviceaccount.com",
       "client_id": "…",
       "auth_uri": "…",
       "token_uri": "…",
       "auth_provider_x509_cert_url": "…",
       "client_x509_cert_url": "…"
   }
   ```

   Make sure you give your service account write access on the target
   spreadsheet. You can do that by sharing the spreadsheet with the email of
   the service account (`client_email` in the JSON above).

Running your spider now, locally or on Scrapy Cloud, will export your scraped
data to the configured Google Sheets worksheet.
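
Since the feed URI mirrors the worksheet URL, you can derive one from the other. The following sketch uses a hypothetical `gsheets_feed_uri` helper and made-up field names, and it configures the feed to append to an existing worksheet as described above.

```python
def gsheets_feed_uri(worksheet_url):
    """Turn a Google Sheets worksheet URL into a gsheets:// feed URI."""
    scheme, _, rest = worksheet_url.partition("://")
    if scheme != "https" or not rest.startswith("docs.google.com/spreadsheets/d/"):
        raise ValueError(f"Not a Google Sheets URL: {worksheet_url}")
    return "gsheets://" + rest


uri = gsheets_feed_uri(
    "https://docs.google.com/spreadsheets/d/"
    "1fWJgq5yuOdeN3YnkBZiTD0VhB1MLzBNomz0s9YwBREo/edit#gid=1261678709"
)

FEEDS = {
    uri: {
        "format": "csv",
        # Hypothetical field names, listed in the column order of the
        # existing worksheet.
        "fields": ["field1", "field2"],
        # Do not write a header row when appending.
        "item_export_kwargs": {"include_headers_line": False},
    }
}
```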

## Exporting to an SFTP server with Scrapy

> ###### NOTE
>
> Not to be confused with [FTP](https://en.wikipedia.org/wiki/File_Transfer_Protocol) (see
> *Exporting to an FTP server with Scrapy*) or [FTPS](https://en.wikipedia.org/wiki/FTPS).

To configure a [Scrapy](https://scrapy.org) project or spider to export scraped data to an [SFTP
server](https://en.wikipedia.org/wiki/SSH_File_Transfer_Protocol):

1. Install [scrapy-feedexporter-sftp](https://github.com/scrapy-plugins/scrapy-feedexporter-sftp):
   ```bash
   pip install git+https://github.com/scrapy-plugins/scrapy-feedexporter-sftp
   ```

   If you are using Scrapy Cloud, remember to add the
   following line to your `requirements.txt` file:
   ```none
   scrapy-feedexporter-sftp @ git+https://github.com/scrapy-plugins/scrapy-feedexporter-sftp
   ```
2. In your `settings.py` file, define `FEED_STORAGES` as follows:
   settings.py
   ```python
   FEED_STORAGES = {
       "sftp": "scrapy_feedexporter_sftp.SFTPFeedStorage",
   }
   ```

   If the setting already exists in your `settings.py` file, modify the existing
   setting to add the key-value pair above, instead of re-defining the setting.
3. Add a [FEEDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#std-setting-FEEDS)
   setting to your project or spider, if not added yet.

   The value of `FEEDS` must be a JSON object (`{}`).

   If you have `FEEDS` already defined with key-value pairs, you can keep
   those if you want — `FEEDS` supports exporting data to multiple file
   storage service locations.

   To add `FEEDS` to a project, define it in your [Scrapy Cloud project
   settings](https://support.zyte.com/support/solutions/articles/22000200670-customizing-scrapy-settings-in-scrapy-cloud)
   or add it to your `settings.py` file:
   settings.py
   ```python
   FEEDS = {}
   ```

   To add `FEEDS` to a spider, define it in your Scrapy Cloud
   spider-specific settings (open a spider in Scrapy Cloud and select the
   **Settings** tab) or add it to your spider code with the [update_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.update_settings)
   method or the [custom_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.custom_settings) class variable:
   spiders/myspider.py
   ```python
   import scrapy


   class MySpider(scrapy.Spider):
       custom_settings = {
           "FEEDS": {},
       }
   ```
4. Add the following key-value pair to `FEEDS`:
   ```python
   {
       "sftp://<USER>:<PASSWORD>@<HOST>/<PATH>": {
           "format": "<FORMAT>"
       }
   }
   ```

   Where:
   - `<USER>` and `<PASSWORD>` are your credentials for the SFTP server,
     [percent-encoded](https://en.wikipedia.org/wiki/Percent-encoding).
   - `<HOST>` is the SFTP server host, e.g. `sftp.example.com` or
     `203.0.113.123`.
   - `<PATH>` is the path where you want to store the scraped data file, e.g.
     `scraped/data.csv`.

     The path can include [placeholders](https://docs.scrapy.org/en/latest/topics/feed-exports.html#storage-uri-parameters) that are replaced at run time, such
     as `%(time)s`, which is replaced by the current timestamp.
   - `<FORMAT>` is the desired [output file format](https://docs.scrapy.org/en/latest/topics/feed-exports.html#serialization-formats).

     Possible values include: `csv`, `json`, `jsonlines`, `xml`. You can
     also [implement support for more formats](https://docs.scrapy.org/en/latest/topics/exporters.html).
     > ###### WARNING
     >
     > If you export in CSV format, and in your spider code you yield
     > items as Python dictionaries, only the fields present on the first yielded
     > item are exported for all items.
     >
     > One solution is to [customize output fields](https://docs.scrapy.org/en/latest/topics/exporters.html#scrapy.exporters.BaseItemExporter.fields_to_export) through the `fields` [feed
     > option](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feed-options) of [FEEDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feeds) or
     > through the [FEED_EXPORT_FIELDS](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feed-export-fields) Scrapy setting to explicitly indicate all
     > fields to export.
     >
     > You can alternatively yield something other than a Python dictionary that
     > supports declaring all possible fields, such as an [Item object](https://docs.scrapy.org/en/latest/topics/items.html#item-objects) or an
     > [attrs object](https://docs.scrapy.org/en/latest/topics/items.html#attr-s-objects).

Running your spider now, locally or on Scrapy Cloud, will export your scraped
data to the configured SFTP server location.
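
When you combine several storage plugins, `FEED_STORAGES` and `FEEDS` each get one entry per plugin or destination. The sketch below assumes that both the Google Drive and SFTP plugins described on this page are installed. The `<...>` placeholders must be replaced with your own values, and the chosen formats are only examples.

```python
# settings.py (sketch)

# One entry per custom storage backend; each key is a URI scheme.
FEED_STORAGES = {
    "gdrive": "scrapy_gdrive_exporter.gdrive_exporter.GoogleDriveFeedStorage",
    "sftp": "scrapy_feedexporter_sftp.SFTPFeedStorage",
}

# One entry per export destination; every run writes to all of them.
FEEDS = {
    "sftp://<USER>:<PASSWORD>@<HOST>/<PATH>": {"format": "jsonlines"},
    "gdrive://drive.google.com/<FOLDER ID>/<PATH>": {"format": "csv"},
}
```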

## Exporting to Google BigQuery with Scrapy

To configure a [Scrapy](https://scrapy.org) project or spider to export scraped data to
[Google BigQuery](https://cloud.google.com/bigquery/):

1. You need Python 3.7 or higher and Scrapy 2.4 or higher.

   If you are using Scrapy Cloud, make sure you are
   using [stack](https://support.zyte.com/support/solutions/articles/22000200402-changing-the-deploy-environment-with-scrapy-cloud-stacks) `scrapy:2.4` or higher. Using the latest stack
   (`scrapy:2.14-20260217`) is generally recommended.
2. Install [scrapy-bigquery](https://github.com/8W9aG/scrapy-bigquery):
   ```bash
   pip install scrapy-bigquery
   ```

   If you are using Scrapy Cloud, remember to add the
   following line to your `requirements.txt` file:
   ```none
   scrapy-bigquery
   ```
3. Define the `BIGQUERY_DATASET` and `BIGQUERY_TABLE` Scrapy settings to
   point to the target table. For example:
   settings.py
   ```python
   BIGQUERY_DATASET = "my-dataset"
   BIGQUERY_TABLE = "my-table"
   ```

   [Additional settings](https://github.com/8W9aG/scrapy-bigquery#bigquery_add_scraped_time-optional)
   are available.
   > ###### TIP
   >
   > To add Scrapy settings to a project, define them in your [Scrapy
   > Cloud project settings](https://support.zyte.com/support/solutions/articles/22000200670-customizing-scrapy-settings-in-scrapy-cloud)
   > or add them to your `settings.py` file.
   > settings.py
   > ```python
   > MY_SETTING = ...
   > ```
   >
   > To add settings to a spider, define them in your Scrapy Cloud
   > spider-specific settings (open a spider in Scrapy Cloud and select the
   > **Settings** tab) or add them to your spider code with the [update_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.update_settings)
   > method or the [custom_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.custom_settings) class variable:
   > spiders/myspider.py
   > ```python
   > import scrapy
   >
   >
   > class MySpider(scrapy.Spider):
   >     custom_settings = {
   >         "MY_SETTING": ...,
   >     }
   > ```
4. Define the `BIGQUERY_SERVICE_ACCOUNT` setting as a string with your
   [service account credentials](https://developers.google.com/identity/protocols/oauth2/service-account) in base64-encoded JSON format:
   settings.py
   ```python
   BIGQUERY_SERVICE_ACCOUNT = "eyJ0eX=="
   ```

   You can use the following command to generate the required value from your
   service account JSON file:
   ```shell
   jq -c . service-account.json | base64 | tr -d '\n'
   ```

   Make sure you give your service account write access on the target table.
   You can do that by sharing the table with the email of the service account
   (`client_email` in the service account JSON).

Running your spider now, locally or on Scrapy Cloud, will export your scraped
data to the configured Google BigQuery table.
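
If `jq` is not available, the base64-encoded value can also be produced with Python. This is a rough equivalent of the command above, not part of scrapy-bigquery; the `encode_service_account` helper and the sample credentials are hypothetical.

```python
import base64
import json


def encode_service_account(credentials):
    """Serialize credentials as compact JSON and base64-encode the result."""
    compact = json.dumps(credentials, separators=(",", ":"))
    return base64.b64encode(compact.encode("utf-8")).decode("ascii")


# Made-up, incomplete credentials, for illustration only. In practice, load
# the dictionary from your service account JSON file with json.load().
sample_credentials = {"type": "service_account", "project_id": "myproject"}

BIGQUERY_SERVICE_ACCOUNT = encode_service_account(sample_credentials)
```

Avoid committing real credentials to version control; reading the value from an environment variable is a common alternative.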

## Get started with Zyte API

Zyte API is a [web scraping API](https://www.zyte.com/zyte-web-scraping-api/) that avoids bans, enables
browser automation and automatic extraction, and much more, all
cost-efficiently.

### Get started

> ##### Sign up
>
> Sign up now and get $5 free for a month.
>
> [Try for free!](https://app.zyte.com/account/signup/zyteapi)

> ##### Follow the tutorial
>
> Complete the web scraping tutorial covering Zyte API.

### Learn more

> ##### Usage
>
> Learn to use Zyte API.

> ##### Reference
>
> See the complete API reference.

> ##### Proxy mode
>
> Use Zyte API as a proxy.

> ##### Migrate
>
> Migrate your existing web scraping code.

> ##### Zyte IDE
>
> Write browser scripts and build and debug requests interactively.

## Zyte API usage documentation

### Initial setup

How would you prefer to use Zyte API?

> ##### Scrapy
>
> Use scrapy-zyte-api (tutorial).

> ##### Python
>
> Use [python-zyte-api](http://python-zyte-api.readthedocs.io/).

> ##### HTTP clients
>
> `POST` to `https://api.zyte.com/v1/extract` with your [Zyte API key](https://app.zyte.com/o/zyte-api/api-access) and parameters:
>
> ```shell
> curl \
>     --user YOUR_ZYTE_API_KEY: \
>     --header 'Content-Type: application/json' \
>     --data '{"url": "https://toscrape.com", "httpResponseBody": true}' \
>     --compressed \
>     https://api.zyte.com/v1/extract
> ```

> ##### Proxy mode
>
> Use `https://api.zyte.com:8011` as your proxy endpoint, with your [Zyte API key](https://app.zyte.com/o/zyte-api/api-access) and proxy headers:
>
> ```shell
> curl \
>     --proxy api.zyte.com:8011 \
>     --proxy-user YOUR_ZYTE_API_KEY: \
>     --compressed \
>     https://toscrape.com
> ```

> ###### TIP
>
> Learn about the different features of the
> HTTP API and the proxy mode
> before you choose one.

> ###### TIP
>
> Got an SSL error? Install our CA certificate.

### Basic usage

What do you want to do with Zyte API?

> ##### HTTP requests
>
> Send low-level HTTP requests, with a custom method, headers, and body,
> the option to disable redirect following, device emulation, and more.
>
> HTTP

> ##### Browser automation
>
> Get browser-rendered HTML, take screenshots, interact with pages,
> capture background requests, and more.
>
> Browser

> ##### Automatic extraction
>
> Get structured data from single pages or entire websites, and
> enrich it with custom LLM prompts.
>
> Extraction

> ##### Search API
>
> Search Google with a typed interface and get back structured
> organic results — no URL construction needed.
>
> Search API

#### Additional features

Customize your Zyte API requests further to get what you want:

> ##### Geolocation
>
> Choose a location of origin for your request.

> ##### IP type
>
> Choose the type of IP address used by your request.

> ##### Cookies
>
> Get and set cookies to reproduce requests and maintain sessions.

> ##### Sessions
>
> Use the same IP address, cookie jar, network stack, etc. on multiple
> requests.

### Advanced topics

> ##### Proxy mode
>
> Use Zyte API as a proxy.

> ##### Rate limits
>
> Requests-over-time and concurrency limits.

> ##### Optimization
>
> Make the most out of Zyte API.

> ##### Error handling
>
> Error response handling.

> ##### API reference
>
> Complete API reference documentation.

> ##### Stats API
>
> Check your Zyte API usage details.

## Zyte API HTTP requests

To send HTTP requests through Zyte API, without browser rendering, set the [httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpResponseBody) request field to
`true`, and read the [Base64](https://en.wikipedia.org/wiki/Base64)-encoded response body from the
[httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/httpResponseBody) response field.
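
The request and response handling described above can be sketched in Python with only the standard library. This is a minimal illustration, not an official client. The helper names are hypothetical, the sample response is made up, and `fetch` is defined but not called because it performs a real, billable request.

```python
import base64
import json
from urllib.request import Request, urlopen

API_URL = "https://api.zyte.com/v1/extract"


def build_request(api_key, url):
    """Build, without sending, a Zyte API request for the body of a URL."""
    # HTTP Basic authentication: the API key is the user name, and the
    # password is empty.
    credentials = base64.b64encode(f"{api_key}:".encode()).decode()
    payload = json.dumps({"url": url, "httpResponseBody": True}).encode()
    return Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/json",
        },
    )


def decode_body(api_response):
    """Return the raw bytes from the httpResponseBody response field."""
    return base64.b64decode(api_response["httpResponseBody"])


def fetch(api_key, url):
    """Send the request over the network and return the decoded body."""
    with urlopen(build_request(api_key, url)) as response:
        return decode_body(json.load(response))


# Offline demonstration with a made-up API response.
sample_response = {"httpResponseBody": base64.b64encode(b"<html></html>").decode()}
body = decode_body(sample_response)
```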

#### Example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://toscrape.com"},
    {"httpResponseBody", true}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBody = System.Convert.FromBase64String(base64HttpResponseBody);
```

#### CLI client

input.jsonl
```json
{"url": "https://toscrape.com", "httpResponseBody": true}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    > output.html
```

#### curl

input.json
```json
{
    "url": "https://toscrape.com",
    "httpResponseBody": true
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    > output.html
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of("url", "https://toscrape.com", "httpResponseBody", true);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          System.out.println(httpResponseBody);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com',
    httpResponseBody: true
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com',
        'httpResponseBody' => true,
    ],
]);
$data = json_decode($response->getBody());
$http_response_body = base64_decode($data->httpResponseBody);
```

#### Proxy mode

In proxy mode, you always get a response body.

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    https://toscrape.com \
    > output.html
```

#### Python

```python
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://toscrape.com",
        "httpResponseBody": True,
    },
)
http_response_body: bytes = b64decode(api_response.json()["httpResponseBody"])
```

#### Python client

```python
import asyncio
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://toscrape.com",
            "httpResponseBody": True,
        }
    )
    http_response_body = b64decode(api_response["httpResponseBody"]).decode()
    print(http_response_body)

asyncio.run(main())
```

#### Scrapy

In transparent mode, when you target a text
resource (e.g. HTML, JSON), regular Scrapy requests work out of the
box:

```python
from scrapy import Spider

class ToScrapeSpider(Spider):
    name = "toscrape_com"
    start_urls = ["https://toscrape.com"]

    def parse(self, response):
        http_response_text: str = response.text
```

While regular Scrapy requests also work for binary responses at the
moment, they may stop working in future versions of
scrapy-zyte-api, so passing
[httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpResponseBody) is recommended when targeting binary
resources:

```python
from scrapy import Request, Spider

class ToScrapeSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request(
            "https://toscrape.com",
            meta={
                "zyte_api_automap": {
                    "httpResponseBody": True,
                },
            },
        )

    def parse(self, response):
        http_response_body: bytes = response.body
```

Output (first 5 lines):

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
        <title>Scraping Sandbox</title>
```

For HTTP requests, Zyte API also supports:

- HTTP request attributes for method, body, and headers.
- Redirection and device emulation.
- Geolocation, IP type, cookies, sessions, response headers, and metadata.

> ###### TIP
>
> HTTP responses do not reflect HTML content rendered by a web browser
> that executes JavaScript code. To get browser HTML, use a browser request.
> See also zapi-raw-vs-browser.

### Request method

HTTP requests use the `GET` [HTTP method](https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods) by default. Use the
[httpRequestMethod](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpRequestMethod) field to set a different HTTP method.

> ###### TIP
>
> When using `POST`, `PUT` or similar, you probably want to also
> set a request body.

#### Example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://httpbin.org/anything"},
    {"httpResponseBody", true},
    {"httpRequestMethod", "POST"}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBody = System.Convert.FromBase64String(base64HttpResponseBody);

var responseData = JsonDocument.Parse(httpResponseBody);
var method = responseData.RootElement.GetProperty("method").ToString();
```

#### CLI client

input.jsonl
```json
{"url": "https://httpbin.org/anything", "httpResponseBody": true, "httpRequestMethod": "POST"}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq .method
```

#### curl

input.json
```json
{
    "url": "https://httpbin.org/anything",
    "httpResponseBody": true,
    "httpRequestMethod": "POST"
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq .method
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://httpbin.org/anything",
            "httpResponseBody",
            true,
            "httpRequestMethod",
            "POST");
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          JsonObject data = JsonParser.parseString(httpResponseBody).getAsJsonObject();
          String method = data.get("method").getAsString();
          System.out.println(method);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://httpbin.org/anything',
    httpResponseBody: true,
    httpRequestMethod: 'POST'
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  const method = JSON.parse(httpResponseBody).method
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://httpbin.org/anything',
        'httpResponseBody' => true,
        'httpRequestMethod' => 'POST',
    ],
]);
$data = json_decode($response->getBody());
$http_response_body = base64_decode($data->httpResponseBody);
$method = json_decode($http_response_body)->method;
```

#### Proxy mode

In proxy mode, the method of your request is used automatically.

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    -X POST \
    https://httpbin.org/anything \
    | jq .method
```

#### Python

```python
import json
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://httpbin.org/anything",
        "httpResponseBody": True,
        "httpRequestMethod": "POST",
    },
)
http_response_body = b64decode(api_response.json()["httpResponseBody"])
method = json.loads(http_response_body)["method"]
```

#### Python client

```python
import asyncio
import json
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://httpbin.org/anything",
            "httpResponseBody": True,
            "httpRequestMethod": "POST",
        }
    )
    http_response_body: bytes = b64decode(api_response["httpResponseBody"])
    method = json.loads(http_response_body)["method"]
    print(method)

asyncio.run(main())
```

#### Scrapy

```python
import json

from scrapy import Request, Spider

class HTTPBinOrgSpider(Spider):
    name = "httpbin_org"

    async def start(self):
        yield Request(
            "https://httpbin.org/anything",
            method="POST",
        )

    def parse(self, response):
        method = json.loads(response.text)["method"]
```

Output:

```json
"POST"
```

### Request body

To include a body in your request, use one of the following fields:

- [httpRequestText](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpRequestText), for UTF-8-encoded text.
- [httpRequestBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpRequestBody), for anything else. It supports binary data
  as well, so the value must be [Base64](https://en.wikipedia.org/wiki/Base64)-encoded.
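
The choice between the two fields can be sketched in Python as follows. The `request_body_field` helper is hypothetical, not part of any Zyte library; it assumes that a `str` body is meant to be sent as UTF-8 text and that a `bytes` body is arbitrary binary data.

```python
from base64 import b64encode
from typing import Dict, Union


def request_body_field(body: Union[str, bytes]) -> Dict[str, str]:
    """Return the request field to use for the given body (hypothetical helper)."""
    if isinstance(body, str):
        # Text goes into httpRequestText as-is.
        return {"httpRequestText": body}
    # Anything else goes into httpRequestBody, Base64-encoded.
    return {"httpRequestBody": b64encode(body).decode("ascii")}


print(request_body_field('{"foo": "bar"}'))
print(request_body_field(b"foo"))
```

The returned dictionary can be merged into the rest of the request parameters, e.g. `{"url": ..., "httpRequestMethod": "POST", **request_body_field(b"foo")}`.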

#### `httpRequestText` example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://httpbin.org/anything"},
    {"httpResponseBody", true},
    {"httpRequestMethod", "POST"},
    {"httpRequestText", "{\"foo\": \"bar\"}"}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBody = System.Convert.FromBase64String(base64HttpResponseBody);

var responseData = JsonDocument.Parse(httpResponseBody);
var requestBody = responseData.RootElement.GetProperty("data").ToString();

Console.WriteLine(requestBody);
```

#### CLI client

input.jsonl
```json
{"url": "https://httpbin.org/anything", "httpResponseBody": true, "httpRequestMethod": "POST", "httpRequestText": "{\"foo\": \"bar\"}"}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq --raw-output .data
```

#### curl

input.json
```json
{
    "url": "https://httpbin.org/anything",
    "httpResponseBody": true,
    "httpRequestMethod": "POST",
    "httpRequestText": "{\"foo\": \"bar\"}"
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq --raw-output .data
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://httpbin.org/anything",
            "httpResponseBody",
            true,
            "httpRequestMethod",
            "POST",
            "httpRequestText",
            "{\"foo\": \"bar\"}");
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          JsonObject data = JsonParser.parseString(httpResponseBody).getAsJsonObject();
          String body = data.get("data").getAsString();
          System.out.println(body);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://httpbin.org/anything',
    httpResponseBody: true,
    httpRequestMethod: 'POST',
    httpRequestText: '{"foo": "bar"}'
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  const body = JSON.parse(httpResponseBody).data
  console.log(body)
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://httpbin.org/anything',
        'httpResponseBody' => true,
        'httpRequestMethod' => 'POST',
        'httpRequestText' => '{"foo": "bar"}',
    ],
]);
$data = json_decode($response->getBody());
$http_response_body = base64_decode($data->httpResponseBody);
$body = json_decode($http_response_body)->data;
echo $body.PHP_EOL;
```

#### Proxy mode

In proxy mode, the body of your request is used automatically, whether it is
plain text or binary.

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    -X POST \
    -H "Content-Type: application/json" \
    --data '{"foo": "bar"}' \
    https://httpbin.org/anything \
    | jq .data
```

#### Python

```python
import json
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://httpbin.org/anything",
        "httpResponseBody": True,
        "httpRequestMethod": "POST",
        "httpRequestText": '{"foo": "bar"}',
    },
)
http_response_body = b64decode(api_response.json()["httpResponseBody"])
body: str = json.loads(http_response_body)["data"]
print(body)
```

#### Python client

```python
import asyncio
import json
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://httpbin.org/anything",
            "httpResponseBody": True,
            "httpRequestMethod": "POST",
            "httpRequestText": '{"foo": "bar"}',
        }
    )
    http_response_body = b64decode(api_response["httpResponseBody"])
    body = json.loads(http_response_body)["data"]
    print(body)

asyncio.run(main())
```

#### Scrapy

```python
import json

from scrapy import Request, Spider

class HTTPBinOrgSpider(Spider):
    name = "httpbin_org"

    async def start(self):
        yield Request(
            "https://httpbin.org/anything",
            method="POST",
            body='{"foo": "bar"}',
        )

    def parse(self, response):
        body = json.loads(response.body)["data"]
        print(body)
```

Output:

```json
{"foo": "bar"}
```

#### `httpRequestBody` example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://httpbin.org/anything"},
    {"httpResponseBody", true},
    {"httpRequestMethod", "POST"},
    {"httpRequestBody", "Zm9v"}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBody = System.Convert.FromBase64String(base64HttpResponseBody);

var responseData = JsonDocument.Parse(httpResponseBody);
var requestBody = responseData.RootElement.GetProperty("data").ToString();

Console.WriteLine(requestBody);
```

#### CLI client

input.jsonl
```json
{"url": "https://httpbin.org/anything", "httpResponseBody": true, "httpRequestMethod": "POST", "httpRequestBody": "Zm9v"}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq --raw-output .data
```

#### curl

input.json
```json
{
    "url": "https://httpbin.org/anything",
    "httpResponseBody": true,
    "httpRequestMethod": "POST",
    "httpRequestBody": "Zm9v"
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq --raw-output .data
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://httpbin.org/anything",
            "httpResponseBody",
            true,
            "httpRequestMethod",
            "POST",
            "httpRequestBody",
            "Zm9v");
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          JsonObject data = JsonParser.parseString(httpResponseBody).getAsJsonObject();
          String body = data.get("data").getAsString();
          System.out.println(body);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://httpbin.org/anything',
    httpResponseBody: true,
    httpRequestMethod: 'POST',
    httpRequestBody: 'Zm9v'
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  const body = JSON.parse(httpResponseBody).data
  console.log(body)
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://httpbin.org/anything',
        'httpResponseBody' => true,
        'httpRequestMethod' => 'POST',
        'httpRequestBody' => 'Zm9v',
    ],
]);
$data = json_decode($response->getBody());
$http_response_body = base64_decode($data->httpResponseBody);
$body = json_decode($http_response_body)->data;
echo $body.PHP_EOL;
```

#### Proxy mode

In proxy mode, the body of your request is used automatically, whether it is
plain text or binary.

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    -X POST \
    -H "Content-Type: application/octet-stream" \
    --data foo \
    https://httpbin.org/anything \
    | jq .data
```

#### Python

```python
import json
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://httpbin.org/anything",
        "httpResponseBody": True,
        "httpRequestMethod": "POST",
        "httpRequestBody": "Zm9v",
    },
)
http_response_body = b64decode(api_response.json()["httpResponseBody"])
body: str = json.loads(http_response_body)["data"]
print(body)
```

#### Python client

```python
import asyncio
import json
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://httpbin.org/anything",
            "httpResponseBody": True,
            "httpRequestMethod": "POST",
            "httpRequestBody": "Zm9v",
        }
    )
    http_response_body: bytes = b64decode(api_response["httpResponseBody"])
    body = json.loads(http_response_body)["data"]
    print(body)

asyncio.run(main())
```

#### Scrapy

```python
import json

from scrapy import Request, Spider

class HTTPBinOrgSpider(Spider):
    name = "httpbin_org"

    async def start(self):
        yield Request(
            "https://httpbin.org/anything",
            method="POST",
            body=b"foo",
        )

    def parse(self, response):
        body = json.loads(response.body)["data"]
        print(body)
```

Output:

```none
foo
```

### Request headers

In HTTP requests, use [customHttpRequestHeaders](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/customHttpRequestHeaders) to set request
headers. You can set any header except `Cookie` (see the cookies
documentation).
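
If your headers live in a plain dictionary, a small conversion step produces the list of `name`/`value` objects that the field expects. The sketch below is illustrative only: `to_custom_http_request_headers` is a hypothetical helper, and rejecting `Cookie` on the client side (case-insensitively, since HTTP header names are case-insensitive) is an assumption about how you might want to fail early.

```python
from typing import Dict, List


def to_custom_http_request_headers(headers: Dict[str, str]) -> List[Dict[str, str]]:
    """Convert a header dictionary into name/value objects (hypothetical helper)."""
    for name in headers:
        if name.lower() == "cookie":
            raise ValueError("Cookie cannot be set through customHttpRequestHeaders")
    return [{"name": name, "value": value} for name, value in headers.items()]


payload = {
    "url": "https://httpbin.org/anything",
    "httpResponseBody": True,
    "customHttpRequestHeaders": to_custom_http_request_headers(
        {"Accept-Language": "fa"}
    ),
}
print(payload["customHttpRequestHeaders"])
```

The resulting `payload` dictionary has the same shape as the request parameters used in the examples of this section.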

> ###### TIP
>
> You can also set headers like `Accept`, `Accept-Encoding`,
> `Accept-Language` or `User-Agent`, but it is usually best to let Zyte
> API set those headers; it will use values consistent with the network stack
> and other request parameters (e.g. device,
> geolocation).

#### Example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://httpbin.org/anything"},
    {"httpResponseBody", true},
    {
        "customHttpRequestHeaders",
        new List<Dictionary<string, object>>()
        {
            new Dictionary<string, object>()
            {
                {"name", "Accept-Language"},
                {"value", "fa"}
            }
        }
    }
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBody = System.Convert.FromBase64String(base64HttpResponseBody);

var responseData = JsonDocument.Parse(httpResponseBody);
var headerEnumerator = responseData.RootElement.GetProperty("headers").EnumerateObject();
var headers = new Dictionary<string, string>();
while (headerEnumerator.MoveNext())
{
    headers.Add(
        headerEnumerator.Current.Name.ToString(),
        headerEnumerator.Current.Value.ToString()
    );
}
```

#### CLI client

input.jsonl
```json
{"url": "https://httpbin.org/anything", "httpResponseBody": true, "customHttpRequestHeaders": [{"name": "Accept-Language", "value": "fa"}]}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq .headers
```

#### curl

input.json
```json
{
    "url": "https://httpbin.org/anything",
    "httpResponseBody": true,
    "customHttpRequestHeaders": [
        {
            "name": "Accept-Language",
            "value": "fa"
        }
    ]
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq .headers
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Collections;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> customHttpRequestHeader =
        ImmutableMap.of("name", "Accept-Language", "value", "fa");
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://httpbin.org/anything",
            "httpResponseBody",
            true,
            "customHttpRequestHeaders",
            Collections.singletonList(customHttpRequestHeader));
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          JsonObject data = JsonParser.parseString(httpResponseBody).getAsJsonObject();
          JsonObject headers = data.get("headers").getAsJsonObject();
          Gson gson = new GsonBuilder().setPrettyPrinting().create();
          System.out.println(gson.toJson(headers));
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://httpbin.org/anything',
    httpResponseBody: true,
    customHttpRequestHeaders: [
      {
        name: 'Accept-Language',
        value: 'fa'
      }
    ]
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  const headers = JSON.parse(httpResponseBody).headers
  console.log(JSON.stringify(headers, null, 2))
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://httpbin.org/anything',
        'httpResponseBody' => true,
        'customHttpRequestHeaders' => [
            [
                'name' => 'Accept-Language',
                'value' => 'fa',
            ],
        ],
    ],
]);
$api = json_decode($response->getBody());
$http_response_body = base64_decode($api->httpResponseBody);
$data = json_decode($http_response_body);
$headers = $data->headers;
```

#### Proxy mode

In proxy mode, the headers of your request are used automatically.

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    -H "Accept-Language: fa" \
    https://httpbin.org/anything \
    | jq .headers
```

#### Python

```python
import json
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://httpbin.org/anything",
        "httpResponseBody": True,
        "customHttpRequestHeaders": [
            {
                "name": "Accept-Language",
                "value": "fa",
            },
        ],
    },
)
http_response_body = b64decode(api_response.json()["httpResponseBody"])
headers = json.loads(http_response_body)["headers"]
print(json.dumps(headers, indent=2))
```

#### Python client

```python
import asyncio
import json
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://httpbin.org/anything",
            "httpResponseBody": True,
            "customHttpRequestHeaders": [
                {
                    "name": "Accept-Language",
                    "value": "fa",
                },
            ],
        }
    )
    http_response_body: bytes = b64decode(api_response["httpResponseBody"])
    headers = json.loads(http_response_body)["headers"]
    print(json.dumps(headers, indent=2))

asyncio.run(main())
```

#### Scrapy

```python
import json

from scrapy import Request, Spider

class HTTPBinOrgSpider(Spider):
    name = "httpbin_org"

    async def start(self):
        yield Request(
            "https://httpbin.org/anything",
            headers={"Accept-Language": "fa"},
        )

    def parse(self, response):
        headers = json.loads(response.text)["headers"]
        print(json.dumps(headers, indent=2))
```

Output (first 5 lines):

```json
{
  "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
  "Accept-Encoding": "gzip, deflate, br",
  "Accept-Language": "fa",
  "Host": "httpbin.org",
```

### Redirection

HTTP requests follow [HTTP redirection](https://developer.mozilla.org/en-US/docs/Web/HTTP/Redirections) by default. Set
[followRedirect](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/followRedirect) to `false` to disable that behavior.

> ###### NOTE
>
> Redirection works differently in browser requests.
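
As an illustration, the request body below disables redirection, so the API
should return the redirect response itself instead of the page it points to.
This is a minimal sketch: the httpbin.org redirect URL is only an assumed test
target, and the payload is meant to be sent to
`https://api.zyte.com/v1/extract` like any other request body.

```python
import json

# Request body for POST https://api.zyte.com/v1/extract.
payload = {
    # Assumed test URL that answers with an HTTP redirect.
    "url": "https://httpbin.org/redirect-to?url=https%3A%2F%2Fexample.com",
    "httpResponseBody": True,
    # Python's False is serialized as JSON false.
    "followRedirect": False,
}
print(json.dumps(payload, indent=4))
```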

### Device emulation

In HTTP requests, use [device](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/device) to set a type of device emulation,
either `desktop` (default) or `mobile`, to use for your request.

This option exists because some websites return different content depending on
the type of device used to access them.

> ###### NOTE
>
> In a request where you set [device](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/device) to `mobile`, you
> cannot use [sessionContextParameters.actions](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/sessionContextParameters.actions).
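
If you build request parameters programmatically, you may want to catch that
combination before sending the request. The check below is a minimal sketch:
`check_device_parameters` is a hypothetical helper name, and it only mirrors
the rules described in this section; the API itself decides what is valid.

```python
def check_device_parameters(payload: dict) -> None:
    """Raise ValueError for parameter combinations this section rules out.

    Hypothetical client-side check, for illustration only.
    """
    device = payload.get("device", "desktop")  # desktop is the default
    if device not in ("desktop", "mobile"):
        raise ValueError(f"Unsupported device: {device!r}")
    actions = payload.get("sessionContextParameters", {}).get("actions")
    if device == "mobile" and actions:
        raise ValueError(
            "sessionContextParameters.actions cannot be used with a mobile device"
        )


# A valid payload passes silently.
check_device_parameters(
    {
        "url": "https://httpbin.org/user-agent",
        "httpResponseBody": True,
        "device": "mobile",
    }
)
```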

#### Example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://httpbin.org/user-agent"},
    {"httpResponseBody", true},
    {"device", "mobile"}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBody = System.Convert.FromBase64String(base64HttpResponseBody);

var responseData = JsonDocument.Parse(httpResponseBody);
var headerEnumerator = responseData.RootElement.EnumerateObject();
while (headerEnumerator.MoveNext())
{
    if (headerEnumerator.Current.Name.ToString() == "user-agent")
    {
        Console.WriteLine(headerEnumerator.Current.Value.ToString());
    }
}
```

#### CLI client

input.jsonl
```json
{"url": "https://httpbin.org/user-agent", "httpResponseBody": true, "device": "mobile"}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq --raw-output '.["user-agent"]'
```

#### curl

input.json
```json
{
    "url": "https://httpbin.org/user-agent",
    "httpResponseBody": true,
    "device": "mobile"
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq --raw-output '.["user-agent"]'
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url", "https://httpbin.org/user-agent", "httpResponseBody", true, "device", "mobile");
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          JsonObject data = JsonParser.parseString(httpResponseBody).getAsJsonObject();
          String userAgent = data.get("user-agent").getAsString();
          System.out.println(userAgent);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://httpbin.org/user-agent',
    httpResponseBody: true,
    device: 'mobile'
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  console.log(JSON.parse(httpResponseBody)['user-agent'])
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://httpbin.org/user-agent',
        'httpResponseBody' => true,
        'device' => 'mobile',
    ],
]);
$api = json_decode($response->getBody());
$http_response_body = base64_decode($api->httpResponseBody);
$data = json_decode($http_response_body);
echo $data->{'user-agent'}.PHP_EOL;
```

#### Proxy mode

In proxy mode, use the `Zyte-Device` header.

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    -H "Zyte-Device: mobile" \
    https://httpbin.org/user-agent \
    | jq --raw-output '.["user-agent"]'
```

#### Python

```python
import json
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://httpbin.org/user-agent",
        "httpResponseBody": True,
        "device": "mobile",
    },
)
http_response_body = b64decode(api_response.json()["httpResponseBody"])
user_agent = json.loads(http_response_body)["user-agent"]
print(user_agent)
```

#### Python client

```python
import asyncio
import json
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://httpbin.org/user-agent",
            "httpResponseBody": True,
            "device": "mobile",
        }
    )
    http_response_body: bytes = b64decode(api_response["httpResponseBody"])
    user_agent = json.loads(http_response_body)["user-agent"]
    print(user_agent)

asyncio.run(main())
```

#### Scrapy

```python
import json

from scrapy import Request, Spider

class HTTPBinOrgSpider(Spider):
    name = "httpbin_org"

    async def start(self):
        yield Request(
            "https://httpbin.org/user-agent",
            meta={
                "zyte_api_automap": {
                    "device": "mobile",
                }
            },
        )

    def parse(self, response):
        user_agent = json.loads(response.text)["user-agent"]
        print(user_agent)
```

Example output (may vary):

```none
Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Mobile Safari/537.36
```

### Submitting HTML forms

While it may be easier to submit HTML forms using a browser request with actions, it is also possible to
reproduce form-submission requests with HTTP requests.

Reproducing an HTML form request usually requires:

- Setting the right value of [httpRequestMethod](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpRequestMethod), often
  `POST`.
- Setting the `Content-Type` header to
  `application/x-www-form-urlencoded` through
  [customHttpRequestHeaders](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/customHttpRequestHeaders).
- Setting the right payload, i.e. key-value pairs set by the form.

  For `GET` requests, that means setting those key-value pairs in the URL
  query string.

  For `POST` requests, that means encoding those key-value pairs as a query
  string (without the starting `?`) and using that as
  [httpRequestText](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpRequestText) or [httpRequestBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpRequestBody).

  > ###### TIP
  >
  > Your key-value pairs may need to include hidden form fields, often
  > used for [CSRF tokens](https://en.wikipedia.org/wiki/Cross-site_request_forgery) or to keep
  > the state of stateful pages (e.g. ASP.NET’s `__VIEWSTATE` field).
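
The steps above can be sketched with the Python standard library alone. This
is a simplified illustration under stated assumptions, not a complete form
handler: `HiddenFieldParser` and `build_form_payload` are hypothetical names,
the example.com URL and the `token` field are made up, and the parser collects
hidden inputs from the whole HTML snippet rather than from a specific form.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collect name-value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and (attributes.get("type") or "").lower() == "hidden":
            name = attributes.get("name")
            if name:
                self.fields[name] = attributes.get("value") or ""


def build_form_payload(url, method, form_html, values):
    """Build Zyte API request parameters that reproduce a form submission."""
    parser = HiddenFieldParser()
    parser.feed(form_html)
    # Hidden fields first; values chosen by the caller override them.
    fields = {**parser.fields, **values}
    encoded = urlencode(fields)  # query string without the leading "?"
    if method.upper() == "GET":
        separator = "&" if "?" in url else "?"
        return {"url": f"{url}{separator}{encoded}", "httpResponseBody": True}
    return {
        "url": url,
        "httpResponseBody": True,
        "httpRequestMethod": method.upper(),
        "customHttpRequestHeaders": [
            {"name": "Content-Type", "value": "application/x-www-form-urlencoded"}
        ],
        "httpRequestText": encoded,
    }


form_html = '<form><input type="hidden" name="token" value="abc 123"></form>'
payload = build_form_payload(
    "https://example.com/search", "POST", form_html, {"q": "web scraping"}
)
print(payload["httpRequestText"])  # token=abc+123&q=web+scraping
```

Real forms can also involve default values of `select`, checkbox, and submit
elements, which this sketch ignores; a dedicated library is usually a safer
choice for those cases.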

#### Example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

The page at [https://quotes.toscrape.com/search.aspx](https://quotes.toscrape.com/search.aspx) contains an HTML form that
can be simplified to:

```html
<form action="/filter.aspx" method="post">
    <select name="author">
        <option>----------</option>
        <option value="Albert Einstein">
            Albert Einstein
        </option>
        <!-- [more options] -->
    </select>
    <select name="tag">
        <option>----------</option>
    </select>
    <input type="hidden" name="__VIEWSTATE" value="ZTYzZDZ…">
</form>
```

When you select an **Author** (e.g. Albert Einstein), a form request is sent,
and the **Tag** options fill up.

To reproduce that:

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using System.Xml.XPath;
using System.Web;
using HtmlAgilityPack;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input1 = new Dictionary<string, object>(){
    {"url", "https://quotes.toscrape.com/search.aspx"},
    {"httpResponseBody", true}
};
var inputJson1 = JsonSerializer.Serialize(input1);
var content1 = new StringContent(inputJson1, Encoding.UTF8, "application/json");

HttpResponseMessage response1 = await client.PostAsync("https://api.zyte.com/v1/extract", content1);
var body1 = await response1.Content.ReadAsByteArrayAsync();
var data1 = JsonDocument.Parse(body1);
var base64HttpResponseBody1 = data1.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBodyBytes1 = System.Convert.FromBase64String(base64HttpResponseBody1);
var httpResponseBody1 = System.Text.Encoding.UTF8.GetString(httpResponseBodyBytes1);
var htmlDocument1 = new HtmlDocument();
htmlDocument1.LoadHtml(httpResponseBody1);
var navigator1 = htmlDocument1.CreateNavigator();
var nodeIterator = (XPathNodeIterator)navigator1.Evaluate("//*[@name='__VIEWSTATE']/@value");
nodeIterator.MoveNext();
var viewState = nodeIterator.Current.ToString();

var httpRequestTextParameters = new Dictionary<string, string>
{
    { "author", "Albert Einstein" },
    { "tag", "----------" },
    { "__VIEWSTATE", viewState}
};
var httpRequestText = string.Join("&",
    httpRequestTextParameters.Select(kvp => $"{HttpUtility.UrlEncode(kvp.Key)}={HttpUtility.UrlEncode(kvp.Value)}"));

var input2 = new Dictionary<string, object>(){
    {"url", "https://quotes.toscrape.com/filter.aspx"},
    {"httpResponseBody", true},
    {"httpRequestMethod", "POST"},
    {
        "customHttpRequestHeaders",
        new List<Dictionary<string, object>>()
        {
            new Dictionary<string, object>()
            {
                {"name", "Content-Type"},
                {"value", "application/x-www-form-urlencoded"}
            }
        }
    },
    {"httpRequestText", httpRequestText}
};
var inputJson2 = JsonSerializer.Serialize(input2);
var content2 = new StringContent(inputJson2, Encoding.UTF8, "application/json");

HttpResponseMessage response2 = await client.PostAsync("https://api.zyte.com/v1/extract", content2);
var body2 = await response2.Content.ReadAsByteArrayAsync();
var data2 = JsonDocument.Parse(body2);
var base64HttpResponseBody2 = data2.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBodyBytes2 = System.Convert.FromBase64String(base64HttpResponseBody2);
var httpResponseBody2 = System.Text.Encoding.UTF8.GetString(httpResponseBodyBytes2);
var htmlDocument2 = new HtmlDocument();
htmlDocument2.LoadHtml(httpResponseBody2);
var navigator2 = htmlDocument2.CreateNavigator();
var nodeIterator2 = (XPathNodeIterator)navigator2.Evaluate("//*[@name='tag']//option");
int tagCount = 0;
while (nodeIterator2.MoveNext())
{
    tagCount++;
}
Console.WriteLine($"{tagCount}");
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.entity.UrlEncodedFormEntity;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.NameValuePair;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;
import org.apache.hc.core5.http.message.BasicNameValuePair;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters1 =
        ImmutableMap.of("url", "https://quotes.toscrape.com/search.aspx", "httpResponseBody", true);
    String requestBody1 = new Gson().toJson(parameters1);

    HttpPost request1 = new HttpPost("https://api.zyte.com/v1/extract");
    request1.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request1.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request1.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request1.setEntity(new StringEntity(requestBody1));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request1,
        (response1) -> {
          HttpEntity httpEntity1 = response1.getEntity();
          String httpApiResponse1 = EntityUtils.toString(httpEntity1, StandardCharsets.UTF_8);
          JsonObject httpJsonObject1 = JsonParser.parseString(httpApiResponse1).getAsJsonObject();
          String base64HttpResponseBody1 = httpJsonObject1.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes1 = Base64.getDecoder().decode(base64HttpResponseBody1);
          String httpResponseBody1 = new String(httpResponseBodyBytes1, StandardCharsets.UTF_8);
          Document document1 = Jsoup.parse(httpResponseBody1);
          String viewState = document1.select("[name='__VIEWSTATE']").attr("value");
          Map<String, String> params =
              ImmutableMap.of(
                  "author", "Albert Einstein",
                  "tag", "----------",
                  "__VIEWSTATE", viewState);
          List<NameValuePair> formParams = new ArrayList<>();
          for (Map.Entry<String, String> entry : params.entrySet()) {
            formParams.add(new BasicNameValuePair(entry.getKey(), entry.getValue()));
          }
          UrlEncodedFormEntity entity =
              new UrlEncodedFormEntity(formParams, StandardCharsets.UTF_8);
          String httpRequestText = EntityUtils.toString(entity);
          Map<String, Object> customHttpRequestHeader =
              ImmutableMap.of("name", "Content-Type", "value", "application/x-www-form-urlencoded");
          Map<String, Object> parameters2 =
              ImmutableMap.of(
                  "url",
                  "https://quotes.toscrape.com/filter.aspx",
                  "httpResponseBody",
                  true,
                  "httpRequestMethod",
                  "POST",
                  "customHttpRequestHeaders",
                  Collections.singletonList(customHttpRequestHeader),
                  "httpRequestText",
                  httpRequestText);
          String requestBody2 = new Gson().toJson(parameters2);

          HttpPost request2 = new HttpPost("https://api.zyte.com/v1/extract");
          request2.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
          request2.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
          request2.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
          request2.setEntity(new StringEntity(requestBody2));

          client.execute(
              request2,
              (response2) -> {
                HttpEntity httpEntity2 = response2.getEntity();
                String httpApiResponse2 = EntityUtils.toString(httpEntity2, StandardCharsets.UTF_8);
                JsonObject httpJsonObject2 =
                    JsonParser.parseString(httpApiResponse2).getAsJsonObject();
                String base64HttpResponseBody2 =
                    httpJsonObject2.get("httpResponseBody").getAsString();
                byte[] httpResponseBodyBytes2 = Base64.getDecoder().decode(base64HttpResponseBody2);
                String httpResponseBody2 =
                    new String(httpResponseBodyBytes2, StandardCharsets.UTF_8);
                Document document2 = Jsoup.parse(httpResponseBody2);
                Elements tags = document2.select("select[name='tag'] option");
                System.out.println(tags.size());
                return null;
              });

          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')
const cheerio = require('cheerio')
const querystring = require('querystring')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://quotes.toscrape.com/search.aspx',
    httpResponseBody: true
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  const $ = cheerio.load(httpResponseBody)
  const viewState = $('[name="__VIEWSTATE"]').get(0).attribs.value
  const httpRequestText = querystring.stringify(
    {
      author: 'Albert Einstein',
      tag: '----------',
      __VIEWSTATE: viewState
    }
  )
  axios.post(
    'https://api.zyte.com/v1/extract',
    {
      url: 'https://quotes.toscrape.com/filter.aspx',
      httpResponseBody: true,
      httpRequestMethod: 'POST',
      customHttpRequestHeaders: [
        {
          name: 'Content-Type',
          value: 'application/x-www-form-urlencoded'
        }
      ],
      httpRequestText
    },
    {
      auth: { username: 'YOUR_ZYTE_API_KEY' }
    }
  ).then((response) => {
    const httpResponseBody = Buffer.from(
      response.data.httpResponseBody,
      'base64'
    )
    const $ = cheerio.load(httpResponseBody)
    console.log($('select[name="tag"] option').length)
  })
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response_1 = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://quotes.toscrape.com/search.aspx',
        'httpResponseBody' => true,
    ],
]);
$data = json_decode($response_1->getBody());
$http_response_body = base64_decode($data->httpResponseBody);
$doc = new DOMDocument();
$doc->loadHTML($http_response_body);
$xpath_1 = new DOMXPath($doc);
$view_state = $xpath_1->query('//*[@name="__VIEWSTATE"]/@value')->item(0)->nodeValue;
$http_request_text = http_build_query(
    [
        'author' => 'Albert Einstein',
        'tag' => '----------',
        '__VIEWSTATE' => $view_state,
    ]
);
$response_2 = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://quotes.toscrape.com/filter.aspx',
        'httpResponseBody' => true,
        'httpRequestMethod' => 'POST',
        'customHttpRequestHeaders' => [
            [
                'name' => 'Content-Type',
                'value' => 'application/x-www-form-urlencoded',
            ],
        ],
        'httpRequestText' => $http_request_text,
    ],
]);
$data = json_decode($response_2->getBody());
$http_response_body = base64_decode($data->httpResponseBody);
$doc->loadHTML($http_response_body);
$xpath_2 = new DOMXPath($doc);
$tags = $xpath_2->query('//*[@name="tag"]/option');
echo count($tags).PHP_EOL;
```

#### Python

Install form2request, which makes it easier
to handle HTML forms in Python.

Then:

```python
from base64 import b64decode

from form2request import form2request
from parsel import Selector
import requests

api_response_1 = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://quotes.toscrape.com/search.aspx",
        "httpResponseBody": True,
    },
)
api_response_1_data = api_response_1.json()
http_response_body_1 = b64decode(api_response_1_data["httpResponseBody"])
selector_1 = Selector(body=http_response_body_1, base_url=api_response_1_data["url"])
form = selector_1.css("form")
request = form2request(form, {"author": "Albert Einstein"}, click=False)
api_response_2 = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": request.url,
        "httpRequestMethod": request.method,
        "customHttpRequestHeaders": [
            {"name": k, "value": v} for k, v in request.headers
        ],
        "httpRequestText": request.body.decode(),
        "httpResponseBody": True,
    },
)
http_response_body_2 = b64decode(api_response_2.json()["httpResponseBody"])
selector_2 = Selector(body=http_response_body_2)
print(len(selector_2.css("select[name='tag'] option")))
```

#### Python client

Install form2request, which makes it easier
to handle HTML forms in Python.

Then:

```python
import asyncio
from base64 import b64decode

from form2request import form2request
from parsel import Selector
from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response_1 = await client.get(
        {
            "url": "https://quotes.toscrape.com/search.aspx",
            "httpResponseBody": True,
        }
    )
    http_response_body_1 = b64decode(api_response_1["httpResponseBody"])
    selector_1 = Selector(body=http_response_body_1, base_url=api_response_1["url"])
    form = selector_1.css("form")
    request = form2request(form, {"author": "Albert Einstein"}, click=False)
    api_response_2 = await client.get(
        {
            "url": request.url,
            "httpRequestMethod": request.method,
            "customHttpRequestHeaders": [
                {"name": k, "value": v} for k, v in request.headers
            ],
            "httpRequestText": request.body.decode(),
            "httpResponseBody": True,
        }
    )
    http_response_body_2 = b64decode(api_response_2["httpResponseBody"])
    selector_2 = Selector(body=http_response_body_2)
    print(len(selector_2.css("select[name='tag'] option")))

asyncio.run(main())
```

#### Scrapy

Install form2request, which makes it easier
to handle HTML forms in Scrapy.

Then, use it and let transparent mode take care of
the rest:

```python
from form2request import form2request
from scrapy import Spider

class QuotesToScrapeComSpider(Spider):
    name = "quotes_toscrape_com"
    start_urls = ["https://quotes.toscrape.com/search.aspx"]

    def parse(self, response):
        form = response.css("form")
        request = form2request(form, {"author": "Albert Einstein"}, click=False)
        yield request.to_scrapy(callback=self.parse_tags)

    def parse_tags(self, response):
        print(len(response.css("select[name='tag'] option")))
```

Output (number of **Tag** options):

```json
25
```
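The PHP example above builds the form body by hand, while the Python examples delegate that work to form2request. As a rough sketch of the manual approach in Python, using only the standard library, the request fields could be assembled as shown below. The `view_state` value is a made-up placeholder; in a real run it would come from the `__VIEWSTATE` field of the first response.

```python
from urllib.parse import urlencode

# Placeholder: in a real run, read this value from the __VIEWSTATE
# input of the first response.
view_state = "PLACEHOLDER_VIEWSTATE"

form_data = {
    "author": "Albert Einstein",
    "tag": "----------",
    "__VIEWSTATE": view_state,
}

# urlencode() produces an application/x-www-form-urlencoded body.
http_request_text = urlencode(form_data)

# Same request fields as in the PHP example.
request_fields = {
    "url": "https://quotes.toscrape.com/filter.aspx",
    "httpResponseBody": True,
    "httpRequestMethod": "POST",
    "customHttpRequestHeaders": [
        {"name": "Content-Type", "value": "application/x-www-form-urlencoded"},
    ],
    "httpRequestText": http_request_text,
}
print(http_request_text)
```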

### Decoding HTML

HTML extracted as a response body needs to be decoded.

HTML content can be encoded with one of many character encodings, and
you must determine the character encoding used so that you can decode that
HTML content accordingly.

The best way to determine the encoding of HTML content is to follow the
[encoding sniffing algorithm](https://html.spec.whatwg.org/#determining-the-character-encoding) defined in the HTML standard.

In addition to the HTML content, the HTML encoding sniffing algorithm takes
into account any character encoding provided in the optional `charset`
parameter of media types declared in the `Content-Type` response header, so
make sure you get the response headers in
addition to the response body if you are following the HTML encoding sniffing
algorithm.
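To illustrate why the headers matter, here is a heavily simplified sketch of the order of precedence in that algorithm: byte order mark first, then the `Content-Type` charset, then a `<meta>` declaration near the start of the body, then a fallback. It is not a full implementation: it does not map encoding labels as the Encoding standard requires, and the `windows-1252` fallback is a simplifying assumption. The function names and the sample API response are hypothetical.

```python
import re
from base64 import b64decode, b64encode
from email.message import Message

BOMS = (
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xff\xfe", "utf-16-le"),
    (b"\xfe\xff", "utf-16-be"),
)
META_CHARSET = re.compile(rb"""<meta[^>]+charset\s*=\s*["']?([\w.:-]+)""", re.IGNORECASE)


def header_charset(headers):
    """Return the charset parameter of the Content-Type header, if any."""
    for header in headers:
        if header["name"].lower() == "content-type":
            message = Message()
            message["Content-Type"] = header["value"]
            return message.get_param("charset")
    return None


def sniff_encoding(body, headers):
    """Guess the encoding of HTML bytes (simplified, see the caveats above)."""
    # 1. A byte order mark wins over everything else.
    for bom, encoding in BOMS:
        if body.startswith(bom):
            return encoding
    # 2. Charset declared in the Content-Type response header.
    charset = header_charset(headers)
    if charset:
        return charset
    # 3. A <meta> charset declaration within the first 1024 bytes.
    match = META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii")
    # 4. Simplified fallback (assumption).
    return "windows-1252"


# Hand-written stand-in for an API response, not actual output.
api_response = {
    "httpResponseBody": b64encode("<p>café</p>".encode("iso-8859-1")).decode(),
    "httpResponseHeaders": [
        {"name": "Content-Type", "value": "text/html; charset=ISO-8859-1"},
    ],
}
body = b64decode(api_response["httpResponseBody"])
encoding = sniff_encoding(body, api_response["httpResponseHeaders"])
html = body.decode(encoding)
print(encoding, html)
```

Without the header, the sample body above would fall through to the fallback, which is why requesting the response headers alongside the body matters. For production use, prefer a library that implements the full algorithm, such as those in the examples below.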

#### Example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### curl

Use [file](https://www.darwinsys.com/file/) to guess the character encoding of a previously downloaded
response based solely on its body (i.e. without
following the HTML encoding sniffing algorithm).

```shell
file --mime-encoding output.html
```

#### JS

Use [content-type-parser](https://www.npmjs.com/package/content-type-parser), [html-encoding-sniffer](https://www.npmjs.com/package/html-encoding-sniffer) and [whatwg-encoding](https://www.npmjs.com/package/whatwg-encoding):

```js
const contentTypeParser = require('content-type-parser')
const htmlEncodingSniffer = require('html-encoding-sniffer')
const whatwgEncoding = require('whatwg-encoding')

// …

const httpResponseHeaders = response.data.httpResponseHeaders
let contentTypeCharset
httpResponseHeaders.forEach(function (item) {
  if (item.name.toLowerCase() === 'content-type') {
    contentTypeCharset = contentTypeParser(item.value).get('charset')
  }
})
const httpResponseBody = Buffer.from(response.data.httpResponseBody, 'base64')
const encoding = htmlEncodingSniffer(httpResponseBody, {
  transportLayerEncodingLabel: contentTypeCharset
})
const html = whatwgEncoding.decode(httpResponseBody, encoding)
```

#### Python

[web-poet](https://web-poet.readthedocs.io/en/stable/index.html) provides a response wrapper that automatically decodes the
response body following an encoding sniffing algorithm similar to the
one defined in the HTML standard.

Provided that you have extracted a response with both body and
headers, and you have Base64-decoded the
response body, you can decode the HTML bytes as
follows:

```python
from web_poet import HttpResponse

# …

headers = tuple(
    (item['name'], item['value'])
    for item in http_response_headers
)
response = HttpResponse(
    url='https://example.com',
    body=http_response_body,
    status=200,
    headers=headers,
)
html = response.text
```

#### Scrapy

In transparent mode, responses to regular Scrapy
requests that target HTML resources are decoded by default. See
zapi-text.

### HTML and browser HTML

HTML found in [httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/httpResponseBody) is usually different from HTML
found in [browserHtml](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/browserHtml) (browser HTML):

- [httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/httpResponseBody) does not reflect changes that a webpage
  makes at run time using JavaScript, such as loading content from additional
  URLs, or moving or reformatting content within the webpage.
- [browserHtml](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/browserHtml) includes a normalization of the HTML from the
  underlying HTTP response, which web browsers perform according to the HTML5
  specification. As a result, HTML and browser HTML can differ even when no
  JavaScript is involved.

  Parsing HTML from [httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/httpResponseBody) with libraries that do
  not implement HTML5 parsing, such as [lxml.html](https://lxml.de/lxmlhtml.html) (used by [Scrapy](https://scrapy.org/)
  by default), results in a different tree structure.

  With an HTML5-compatible parser the resulting tree structure would be the
  same, provided JavaScript does not cause any other difference.

Because of these differences, switching between these HTML inputs can break
your existing parsing code and require changes, such as updating XPath or CSS
selectors.
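As a small illustration of how normalization alone can break a selector, consider the implicit `<tbody>` that HTML5 parsing adds around table rows. The two snippets below are hand-written for illustration and are not actual API output. The standard-library XML parser is used only because both snippets happen to be well-formed; real HTML needs an HTML parser.

```python
import xml.etree.ElementTree as ET

# Hand-written snippets, not actual API output.
raw_html = "<table><tr><td>A</td></tr></table>"
# HTML5 parsing wraps table rows in an implicit <tbody>.
browser_html = "<table><tbody><tr><td>A</td></tr></tbody></table>"


def count_rows(markup, path):
    """Count the elements that match path, relative to the root <table>."""
    table = ET.fromstring(markup)
    return len(table.findall(path))


strict_path = "./tr"     # assumes rows are direct children of <table>
tolerant_path = ".//tr"  # matches rows at any depth

print(count_rows(raw_html, strict_path), count_rows(browser_html, strict_path))
print(count_rows(raw_html, tolerant_path), count_rows(browser_html, tolerant_path))
```

The strict path finds the row only in the first snippet, while the tolerant path finds it in both, so selectors that do not depend on exact parent-child relationships tend to survive a switch between the two inputs.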

## Zyte API browser automation

You can use browser automation through Zyte API to get browser-rendered
HTML, screenshots, or
both.

For browser requests, Zyte API also supports:

- Actions, network capture, request headers, and toggling JavaScript.
- Geolocation, IP type, cookies, sessions, redirection, response headers,
  and metadata.

Unlike HTTP requests, browser requests do not support:

- Setting an HTTP request method, body, or any header other than Referer.
  > ###### NOTE
  >
  > This only affects the initial request. During a browser request,
  > as a result of redirection, JavaScript, or actions, additional requests may be sent with no
  > limitation on method, body or headers, and may be captured.
- Returning non-HTML response data, other than a screenshot.

All browser request features are also available for automatic extraction requests that use a browser request as extraction
source.
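A client-side sanity check for the first limitation could look like the sketch below. The helper name is hypothetical, and the mapping of that limitation to these specific request field names is an assumption; Zyte API itself remains the authority on which combinations it accepts.

```python
# Assumption: these request fields set the method, body, and headers
# of the initial HTTP request.
HTTP_ONLY_FIELDS = ("httpRequestMethod", "httpRequestText", "customHttpRequestHeaders")
# Request fields that make a request a browser request in this sketch.
BROWSER_FIELDS = ("browserHtml", "screenshot")


def find_conflicts(request_fields):
    """Return the HTTP-only fields present in a browser request."""
    is_browser_request = any(request_fields.get(name) for name in BROWSER_FIELDS)
    if not is_browser_request:
        return []
    return [name for name in HTTP_ONLY_FIELDS if name in request_fields]


conflicts = find_conflicts(
    {
        "url": "https://toscrape.com",
        "browserHtml": True,
        "httpRequestMethod": "POST",
    }
)
print(conflicts)
```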

### Browser HTML

Browser HTML is the HTML representation of the [Document Object Model](https://en.wikipedia.org/wiki/Document_Object_Model) (DOM)
of a webpage after it has been rendered in a browser.

To get browser HTML, set the [browserHtml](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/browserHtml) request field to
`true`. The [browserHtml](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/browserHtml) response field is the browser HTML
as a string.

> ###### NOTE
>
> By default, [iframes](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe) in [browserHtml](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/browserHtml) are empty. Set
> [includeIframes](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/includeIframes) to `true` to embed iframe content in
> `browserHtml`.
>
> To access content from the [shadow DOM](https://developer.mozilla.org/en-US/docs/Web/Web_Components/Using_shadow_DOM), see the shadow DOM
> example under Actions.

See also HTML and browser HTML.
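As a minimal sketch of the iframe option mentioned in the note above, the request fields could be built as follows. The helper name is hypothetical; the field names are the ones described in this section.

```python
import json


def browser_html_request(url, include_iframes=False):
    """Build the request fields for a browser HTML request."""
    request_fields = {"url": url, "browserHtml": True}
    if include_iframes:
        # Ask for iframe content to be embedded in browserHtml.
        request_fields["includeIframes"] = True
    return request_fields


payload = browser_html_request("https://toscrape.com", include_iframes=True)
print(json.dumps(payload))
```

The resulting dictionary can be sent as the JSON body of the API request, in the same way as the request fields in the examples that follow.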

#### Example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://toscrape.com"},
    {"browserHtml", true}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var browserHtml = data.RootElement.GetProperty("browserHtml").ToString();
```

#### CLI client

input.jsonl
```json
{"url": "https://toscrape.com", "browserHtml": true}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .browserHtml
```

#### curl

input.json
```json
{
    "url": "https://toscrape.com",
    "browserHtml": true
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .browserHtml
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of("url", "https://toscrape.com", "browserHtml", true);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String browserHtml = jsonObject.get("browserHtml").getAsString();
          System.out.println(browserHtml);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com',
    browserHtml: true
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const browserHtml = response.data.browserHtml
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com',
        'browserHtml' => true,
    ],
]);
$api = json_decode($response->getBody());
$browser_html = $api->browserHtml;
```

#### Proxy mode

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    -H "Zyte-Browser-Html: true" \
    https://toscrape.com
```

#### Python

```python
import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://toscrape.com",
        "browserHtml": True,
    },
)
browser_html: str = api_response.json()["browserHtml"]
```

#### Python client

```python
import asyncio

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://toscrape.com",
            "browserHtml": True,
        }
    )
    print(api_response["browserHtml"])

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class ToScrapeSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request(
            "https://toscrape.com",
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                },
            },
        )

    def parse(self, response):
        browser_html: str = response.text
```

Output (first 5 lines):

```html
<!DOCTYPE html><html lang="en"><head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
        <title>Scraping Sandbox</title>
        <link href="./css/bootstrap.min.css" rel="stylesheet">
        <link href="./css/main.css" rel="stylesheet">
```

### Screenshot

To get a webpage screenshot in browser requests, set the
[screenshot](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/screenshot) request field to `true`. The
[screenshot](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/screenshot) response field is the [Base64](https://en.wikipedia.org/wiki/Base64)-encoded screenshot
file data.

#### Example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://toscrape.com"},
    {"screenshot", true}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64Screenshot = data.RootElement.GetProperty("screenshot").ToString();
var screenshot = System.Convert.FromBase64String(base64Screenshot);
```

#### CLI client

input.jsonl
```json
{"url": "https://toscrape.com", "screenshot": true}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .screenshot \
    | base64 --decode \
    > screenshot.jpg
```

#### curl

input.json
```json
{
    "url": "https://toscrape.com",
    "screenshot": true
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .screenshot \
    | base64 --decode \
    > screenshot.jpg
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of("url", "https://toscrape.com", "screenshot", true);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64Screenshot = jsonObject.get("screenshot").getAsString();
          byte[] screenshot = Base64.getDecoder().decode(base64Screenshot);
          try (FileOutputStream fos = new FileOutputStream("screenshot.jpg")) {
            fos.write(screenshot);
          }
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com',
    screenshot: true
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const screenshot = Buffer.from(response.data.screenshot, 'base64')
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com',
        'screenshot' => true,
    ],
]);
$api = json_decode($response->getBody());
$screenshot = base64_decode($api->screenshot);
```

#### Python

```python
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://toscrape.com",
        "screenshot": True,
    },
)
screenshot: bytes = b64decode(api_response.json()["screenshot"])
```

#### Python client

```python
import asyncio
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://toscrape.com",
            "screenshot": True,
        }
    )
    screenshot = b64decode(api_response["screenshot"])
    with open("screenshot.jpg", "wb") as f:
        f.write(screenshot)

asyncio.run(main())
```

#### Scrapy

```python
from base64 import b64decode

from scrapy import Request, Spider

class ToScrapeComSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request(
            "https://toscrape.com",
            meta={
                "zyte_api_automap": {
                    "screenshot": True,
                },
            },
        )

    def parse(self, response):
        screenshot: bytes = b64decode(response.raw_api_response["screenshot"])
```

Output:

![](zyte-api/usage/code-examples/output/screenshot.jpg)
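Before writing decoded screenshot data to disk, it can be useful to confirm that the bytes look like an image. The sketch below checks the well-known JPEG and PNG file signatures. The helper name is hypothetical, and the sample value is a made-up stand-in for the `screenshot` response field, not a real screenshot.

```python
from base64 import b64decode, b64encode

# Leading bytes ("magic numbers") of common image formats.
SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
}


def image_format(data):
    """Return the image format suggested by the leading bytes, or None."""
    for signature, name in SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None


# Made-up stand-in for the "screenshot" response field: a JPEG
# signature followed by filler bytes.
fake_field = b64encode(b"\xff\xd8\xff" + b"\x00" * 8).decode()
screenshot = b64decode(fake_field)
print(image_format(screenshot))
```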

### Actions

In browser requests, use the [actions](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/actions) request field to define a
sequence of browser actions to perform before output generation.

> ###### SEE ALSO
>
> Web scraping tutorial (tutorial-actions).

#### Example: scrollBottom

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using HtmlAgilityPack;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://quotes.toscrape.com/scroll"},
    {"browserHtml", true},
    {
        "actions",
        new List<Dictionary<string, object>>()
        {
            new Dictionary<string, object>()
            {
                {"action", "scrollBottom"}
            }
        }
    }
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var browserHtml = data.RootElement.GetProperty("browserHtml").ToString();
var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(browserHtml);
var navigator = htmlDocument.CreateNavigator();
var quoteCount = (double)navigator.Evaluate("count(//*[@class=\"quote\"])");
```

#### CLI client

input.jsonl
```json
{"url": "https://quotes.toscrape.com/scroll", "browserHtml": true, "actions": [{"action": "scrollBottom"}]}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .browserHtml \
    | xmllint --html --xpath 'count(//*[@class="quote"])' - 2> /dev/null
```

#### curl

input.json
```json
{
    "url": "https://quotes.toscrape.com/scroll",
    "browserHtml": true,
    "actions": [
        {
            "action": "scrollBottom"
        }
    ]
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .browserHtml \
    | xmllint --html --xpath 'count(//*[@class="quote"])' - 2> /dev/null
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Collections;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class Example {

  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> action = ImmutableMap.of("action", "scrollBottom");
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://quotes.toscrape.com/scroll",
            "browserHtml",
            true,
            "actions",
            Collections.singletonList(action));
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String browserHtml = jsonObject.get("browserHtml").getAsString();
          Document document = Jsoup.parse(browserHtml);
          int quoteCount = document.select(".quote").size();
          System.out.println(quoteCount);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')
const cheerio = require('cheerio')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://quotes.toscrape.com/scroll',
    browserHtml: true,
    actions: [
      {
        action: 'scrollBottom'
      }
    ]
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const browserHtml = response.data.browserHtml
  const $ = cheerio.load(browserHtml)
  const quoteCount = $('.quote').length
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://quotes.toscrape.com/scroll',
        'browserHtml' => true,
        'actions' => [
            ['action' => 'scrollBottom'],
        ],
    ],
]);
$data = json_decode($response->getBody());
$doc = new DOMDocument();
$doc->loadHTML($data->browserHtml);
$xpath = new DOMXPath($doc);
$quote_count = $xpath->query("//*[@class='quote']")->count();
```

#### Python

```python
import requests
from parsel import Selector

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://quotes.toscrape.com/scroll",
        "browserHtml": True,
        "actions": [
            {
                "action": "scrollBottom",
            },
        ],
    },
)
browser_html = api_response.json()["browserHtml"]
quote_count = len(Selector(browser_html).css(".quote"))
```

#### Python client

```python
import asyncio

from parsel import Selector
from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://quotes.toscrape.com/scroll",
            "browserHtml": True,
            "actions": [
                {
                    "action": "scrollBottom",
                },
            ],
        },
    )
    browser_html = api_response["browserHtml"]
    quote_count = len(Selector(browser_html).css(".quote"))
    print(quote_count)

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class QuotesToScrapeComSpider(Spider):
    name = "quotes_toscrape_com"

    async def start(self):
        yield Request(
            "https://quotes.toscrape.com/scroll",
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                    "actions": [
                        {
                            "action": "scrollBottom",
                        },
                    ],
                },
            },
        )

    def parse(self, response):
        quote_count = len(response.css(".quote"))
```

Output:

```none
100
```

#### Example: Read from the [shadow DOM](https://developer.mozilla.org/en-US/docs/Web/Web_Components/Using_shadow_DOM)

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

To get content from the [shadow DOM](https://developer.mozilla.org/en-US/docs/Web/Web_Components/Using_shadow_DOM), use the `evaluate` action to create an
invisible DOM element, which you will get in [browserHtml](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/browserHtml), and
fill it with the desired content from the shadow DOM.

> ###### TIP
>
> If your `evaluate` action does not work as expected, check the
> [actions](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/actions) response field for errors.
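A minimal sketch of that check, assuming that each entry of the `actions` response field names its action and carries an `error` key only when the action failed. That shape is an assumption, the helper name is hypothetical, and the sample response is hand-written rather than actual API output.

```python
def failed_actions(api_response):
    """Return (index, action name, error) for each action reporting an error.

    Assumption: entries of the "actions" response field carry an "error"
    key only when the action failed.
    """
    failures = []
    for index, result in enumerate(api_response.get("actions", [])):
        if result.get("error"):
            failures.append((index, result.get("action"), result["error"]))
    return failures


# Hand-written sample, not actual API output.
sample_response = {
    "browserHtml": "<html></html>",
    "actions": [
        {"action": "scrollBottom"},
        {"action": "evaluate", "error": "Hypothetical error message"},
    ],
}
for index, action, error in failed_actions(sample_response):
    print(f"Action {index} ({action}) failed: {error}")
```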

The following example code shows how to access the shadow DOM paragraph from
[a shadow DOM example in CodePen](https://cdpn.io/TLadd/fullpage/PoGoQeV?anon=true&view=) using the
`evaluate` action with the following `source`:

```js
const div = document.createElement('div')
div.setAttribute('id', 'shadow-root-content')
// Hide, in case you also want to take a screenshot.
div.style.display = 'none'
const iframe = document.getElementById('result')
div.innerText = iframe
  .contentWindow.document
  .getElementById('shadow-root')
  .shadowRoot.querySelector('p').textContent
document.body.appendChild(div)
```

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using System.Xml.XPath;
using HtmlAgilityPack;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://cdpn.io/TLadd/fullpage/PoGoQeV?anon=true&view="},
    {"browserHtml", true},
    {
        "actions",
        new List<Dictionary<string, object>>()
        {
            new Dictionary<string, object>()
            {
                {"action", "evaluate"},
                {"source", @"
                  const div = document.createElement('div')
                  div.setAttribute('id', 'shadow-root-content')
                  div.style.display = 'none'
                  const iframe = document.getElementById('result')
                  div.innerText = iframe
                    .contentWindow.document
                    .getElementById('shadow-root')
                    .shadowRoot.querySelector('p').textContent
                  document.body.appendChild(div)
                "}
            }
        }
    }
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var browserHtml = data.RootElement.GetProperty("browserHtml").ToString();
var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(browserHtml);
var navigator = htmlDocument.CreateNavigator();
var nodeIterator = (XPathNodeIterator)navigator.Evaluate("//*[@id=\"shadow-root-content\"]/text()");
nodeIterator.MoveNext();
var shadowText = nodeIterator.Current.ToString();
Console.WriteLine(shadowText);
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Collections;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class Example {

  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> actions =
        ImmutableMap.of(
            "action",
            "evaluate",
            "source",
            "const div = document.createElement('div')\n"
                + "div.setAttribute('id', 'shadow-root-content')\n"
                + "div.style.display = 'none'\n"
                + "const iframe = document.getElementById('result')\n"
                + "div.innerText = iframe\n"
                + "  .contentWindow.document\n"
                + "  .getElementById('shadow-root')\n"
                + "  .shadowRoot.querySelector('p').textContent\n"
                + "document.body.appendChild(div)");
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://cdpn.io/TLadd/fullpage/PoGoQeV?anon=true&view=",
            "browserHtml",
            true,
            "actions",
            Collections.singletonList(actions));
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String browserHtml = jsonObject.get("browserHtml").getAsString();
          Document document = Jsoup.parse(browserHtml);
          String shadowText = document.select("#shadow-root-content").text();
          System.out.println(shadowText);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')
const cheerio = require('cheerio')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://cdpn.io/TLadd/fullpage/PoGoQeV?anon=true&view=',
    browserHtml: true,
    actions: [
      {
        action: 'evaluate',
        source: `
          const div = document.createElement('div')
          div.setAttribute('id', 'shadow-root-content')
          div.style.display = 'none'
          const iframe = document.getElementById('result')
          div.innerText = iframe
            .contentWindow.document
            .getElementById('shadow-root')
            .shadowRoot.querySelector('p').textContent
          document.body.appendChild(div)
        `
      }
    ]
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const browserHtml = response.data.browserHtml
  const $ = cheerio.load(browserHtml)
  const shadowText = $('#shadow-root-content').text()
  console.log(shadowText)
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://cdpn.io/TLadd/fullpage/PoGoQeV?anon=true&view=',
        'browserHtml' => true,
        'actions' => [
            [
                'action' => 'evaluate',
                'source' => "
                    const div = document.createElement('div')
                    div.setAttribute('id', 'shadow-root-content')
                    div.style.display = 'none'
                    const iframe = document.getElementById('result')
                    div.innerText = iframe
                      .contentWindow.document
                      .getElementById('shadow-root')
                      .shadowRoot.querySelector('p').textContent
                    document.body.appendChild(div)
                ",
            ],
        ],
    ],
]);
$data = json_decode($response->getBody());
$doc = new DOMDocument();
$doc->loadHTML($data->browserHtml);
$xpath = new DOMXPath($doc);
$shadow_text = $xpath->query("//*[@id='shadow-root-content']")->item(0)->textContent;
echo $shadow_text.PHP_EOL;
```

#### Python

```python
import requests
from parsel import Selector

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://cdpn.io/TLadd/fullpage/PoGoQeV?anon=true&view=",
        "browserHtml": True,
        "actions": [
            {
                "action": "evaluate",
                "source": """
                    const div = document.createElement('div')
                    div.setAttribute('id', 'shadow-root-content')
                    div.style.display = 'none'
                    const iframe = document.getElementById('result')
                    div.innerText = iframe
                      .contentWindow.document
                      .getElementById('shadow-root')
                      .shadowRoot.querySelector('p').textContent
                    document.body.appendChild(div)
                """,
            },
        ],
    },
)
browser_html = api_response.json()["browserHtml"]
shadow_text = Selector(browser_html).css("#shadow-root-content::text").get()
print(shadow_text)
```

#### Python client

```python
import asyncio

from parsel import Selector
from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://cdpn.io/TLadd/fullpage/PoGoQeV?anon=true&view=",
            "browserHtml": True,
            "actions": [
                {
                    "action": "evaluate",
                    "source": """
                        const div = document.createElement('div')
                        div.setAttribute('id', 'shadow-root-content')
                        div.style.display = 'none'
                        const iframe = document.getElementById('result')
                        div.innerText = iframe
                          .contentWindow.document
                          .getElementById('shadow-root')
                          .shadowRoot.querySelector('p').textContent
                        document.body.appendChild(div)
                    """,
                },
            ],
        },
    )
    browser_html = api_response["browserHtml"]
    shadow_text = Selector(browser_html).css("#shadow-root-content::text").get()
    print(shadow_text)

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class CodePenSpider(Spider):
    name = "codepen"

    async def start(self):
        yield Request(
            "https://cdpn.io/TLadd/fullpage/PoGoQeV?anon=true&view=",
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                    "actions": [
                        {
                            "action": "evaluate",
                            "source": """
                                const div = document.createElement('div')
                                div.setAttribute('id', 'shadow-root-content')
                                div.style.display = 'none'
                                const iframe = document.getElementById('result')
                                div.innerText = iframe
                                  .contentWindow.document
                                  .getElementById('shadow-root')
                                  .shadowRoot.querySelector('p').textContent
                                document.body.appendChild(div)
                            """,
                        },
                    ],
                },
            },
        )

    def parse(self, response):
        shadow_text = response.css("#shadow-root-content::text").get()
        print(shadow_text)
```

Output:

```none
Shadow Paragraph
```

#### Action types

Zyte API supports 3 types of browser actions:

- **Generic actions** work on every website. They allow you to type text into
  input fields, emulate mouse input, and wait for events or time.

- **Special actions** expose functionality that requires specific knowledge
  of the target website, such as using their search box or filling a form.

  They are only available for certain websites. To find out if an action is
  available for a given website, send a test request using that action. If
  the action is not supported, you will get an error API response indicating
  so.

- **Browser scripts**.

#### Action limits

You are free to use as many browser actions as you wish, but total browser
execution time is limited to 60 seconds. If your actions are still running at
that point, the ongoing action is interrupted, follow-up actions are not
executed at all, and you get your requested output (browser HTML, screenshot)
as it was rendered at that time.

The Zyte API response includes an `actions` key that provides details about
action execution, including `elapsedTime`, `error`, and `status` fields,
to help you debug your actions, e.g. to find out which actions were executed
successfully and which were not.
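
As a minimal sketch of how that debugging might look, the helper below walks a list shaped like the `actions` response field and separates entries that carry an `error` from those that do not. The function name `summarize_actions`, the per-entry `action` key, the `status` values, and the sample data are all assumptions made for illustration; only the `elapsedTime`, `error`, and `status` field names come from the description above.

```python
def summarize_actions(actions):
    """Split action results into those without and with an ``error`` field."""
    succeeded, failed = [], []
    for index, result in enumerate(actions):
        entry = {
            "index": index,
            "action": result.get("action"),
            "elapsedTime": result.get("elapsedTime"),
            "status": result.get("status"),
        }
        if result.get("error"):
            entry["error"] = result["error"]
            failed.append(entry)
        else:
            succeeded.append(entry)
    return succeeded, failed


# Hypothetical sample, not real Zyte API output. The status strings and the
# error message are illustrative placeholders.
sample_actions = [
    {"action": "scrollBottom", "elapsedTime": 1.2, "status": "success"},
    {
        "action": "evaluate",
        "elapsedTime": 0.1,
        "status": "failed",
        "error": "example error message",
    },
]

succeeded, failed = summarize_actions(sample_actions)
for entry in failed:
    print(f"action #{entry['index']} ({entry['action']}) failed: {entry['error']}")
```

With a real response, you would pass the value of the `actions` key of the parsed API response instead of `sample_actions`.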

#### Action selectors

Browser actions that interact with a webpage element all have a `selector`
key that allows you to define how to find the target webpage element.

You must define a query to find the target webpage element in the
`selector.value` field.

You must specify the language of your query in the `selector.type` field,
which supports the following values: CSS Selector (`css`), XPath 1.0
(`xpath`). For information about these query languages, see [Learning CSS and
XPath](https://parsel.readthedocs.io/en/latest/usage.html#learning-css-and-xpath).

Note that selectors cannot interact with [iframes](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe) or with the [shadow DOM](https://developer.mozilla.org/en-US/docs/Web/Web_Components/Using_shadow_DOM).
Only the `evaluate` action and browser scripts can.
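
The following sketch illustrates the `selector` structure described above by building the same kind of action with a CSS query and with an XPath query. The helper `build_selector` and the `.quote` target are hypothetical and only serve the illustration; the code builds plain dictionaries and does not send any request.

```python
SUPPORTED_SELECTOR_TYPES = {"css", "xpath"}


def build_selector(query, query_type="css"):
    """Return a ``selector`` object with the ``type`` and ``value`` fields."""
    if query_type not in SUPPORTED_SELECTOR_TYPES:
        raise ValueError(f"unsupported selector type: {query_type!r}")
    return {"type": query_type, "value": query}


# Roughly the same target expressed in both query languages.
css_action = {
    "action": "waitForSelector",
    "selector": build_selector(".quote"),
}
xpath_action = {
    "action": "waitForSelector",
    "selector": build_selector('//*[@class="quote"]', "xpath"),
}

print(css_action)
print(xpath_action)
```

Either dictionary could then be placed in the `actions` list of a browser request.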

#### Wait actions

You can use the following browser actions to introduce wait times in your
browser action sequences or in your browser scripts:
`waitForSelector`, `waitForRequest`, `waitForResponse`, and
`waitForTimeout`.

Whenever you need to wait for something to happen on a webpage, you should
consider using `waitForSelector` first. It waits for an element matching a
given selector. By default, it waits for a matching
*visible* element, but you can change `selector.state` to `attached`, to
wait for an element to exist regardless of visibility, or to `hidden`, to
wait for a matching *invisible* element.

> ###### TIP
>
> For a usage example of `waitForSelector`, see the web scraping
> tutorial.

`waitForRequest` and `waitForResponse` wait for a request to be sent or for
a response to be received, filtering by URL pattern.

`waitForTimeout` pauses your sequence of actions or your browser script for
the specified amount of time. Because action run time is limited, you should avoid using this type of action when an
alternative waiting action can replace it. However, this action can be
necessary for certain scenarios, such as following organic website-access
patterns.
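
To make the `selector.state` options concrete, here is a small sketch that builds a `waitForSelector` action and a request payload around it. The helper `wait_for_selector` is hypothetical, and passing `visible` explicitly is an assumption (the text above only says that waiting for a visible element is the default). The payload is printed, not sent.

```python
import json

# "attached" and "hidden" come from the text above; "visible" as an explicit
# value is an assumption.
SELECTOR_STATES = {"visible", "attached", "hidden"}


def wait_for_selector(query, state=None, query_type="css"):
    """Build a ``waitForSelector`` action.

    When ``state`` is None, ``selector.state`` is omitted so that the default
    behavior (waiting for a visible element) applies.
    """
    selector = {"type": query_type, "value": query}
    if state is not None:
        if state not in SELECTOR_STATES:
            raise ValueError(f"unsupported selector state: {state!r}")
        selector["state"] = state
    return {"action": "waitForSelector", "selector": selector}


payload = {
    "url": "https://quotes.toscrape.com/scroll",
    "browserHtml": True,
    "actions": [
        # Wait until a quote exists in the DOM, regardless of visibility.
        wait_for_selector(".quote", state="attached"),
    ],
}
print(json.dumps(payload, indent=2))
```

Waiting on a selector like this ends as soon as the condition is met, which is why it is usually a better fit for the limited action run time than a fixed pause.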

### Network capture

In browser requests, use the [networkCapture](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/networkCapture) request field to
define filters to capture network responses received during browser rendering
(including action execution).

#### Example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://quotes.toscrape.com/scroll"},
    {"browserHtml", true},
    {
        "networkCapture",
        new List<Dictionary<string, object>>()
        {
            new Dictionary<string, object>()
            {
                {"filterType", "url"},
                {"httpResponseBody", true},
                {"value", "/api/"},
                {"matchType", "contains"}
            }
        }
    }
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var apiBody = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(apiBody);
var captureEnumerator = data.RootElement.GetProperty("networkCapture").EnumerateArray();
captureEnumerator.MoveNext();
var capture = captureEnumerator.Current;
var base64Body = capture.GetProperty("httpResponseBody").ToString();
var body = System.Convert.FromBase64String(base64Body);

var captureData = JsonDocument.Parse(body);
var quoteEnumerator = captureData.RootElement.GetProperty("quotes").EnumerateArray();
quoteEnumerator.MoveNext();
var quote = quoteEnumerator.Current;
var authorEnumerator = quote.GetProperty("author").EnumerateObject();
while (authorEnumerator.MoveNext())
{
    if (authorEnumerator.Current.Name.ToString() == "name")
    {
        Console.WriteLine(authorEnumerator.Current.Value.ToString());
        break;
    }
}
```

#### CLI client

input.jsonl
```json
{"url": "https://quotes.toscrape.com/scroll", "browserHtml": true, "networkCapture": [{"filterType": "url", "httpResponseBody": true, "value": "/api/", "matchType": "contains"}]}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output ".networkCapture[0].httpResponseBody" \
    | base64 --decode \
    | jq --raw-output ".quotes[0].author.name"
```

#### curl

input.json
```json
{
    "url": "https://quotes.toscrape.com/scroll",
    "browserHtml": true,
    "networkCapture": [
        {
            "filterType": "url",
            "httpResponseBody": true,
            "value": "/api/",
            "matchType": "contains"
        }
    ]
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output ".networkCapture[0].httpResponseBody" \
    | base64 --decode \
    | jq --raw-output ".quotes[0].author.name"
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonArray;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Collections;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {

  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> filter =
        ImmutableMap.of(
            "filterType",
            "url",
            "httpResponseBody",
            true,
            "value",
            "/api/",
            "matchType",
            "contains");
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://quotes.toscrape.com/scroll",
            "browserHtml",
            true,
            "networkCapture",
            Collections.singletonList(filter));
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          JsonArray captures = jsonObject.get("networkCapture").getAsJsonArray();
          JsonObject capture = captures.get(0).getAsJsonObject();
          byte[] bodyBytes =
              Base64.getDecoder().decode(capture.get("httpResponseBody").getAsString());
          String body = new String(bodyBytes, StandardCharsets.UTF_8);
          JsonObject data = JsonParser.parseString(body).getAsJsonObject();
          JsonObject quote = data.get("quotes").getAsJsonArray().get(0).getAsJsonObject();
          String authorName = quote.get("author").getAsJsonObject().get("name").getAsString();
          System.out.println(authorName);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://quotes.toscrape.com/scroll',
    browserHtml: true,
    networkCapture: [
      {
        filterType: 'url',
        httpResponseBody: true,
        value: '/api/',
        matchType: 'contains'
      }
    ]
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const capture = response.data.networkCapture[0]
  const data = JSON.parse(Buffer.from(capture.httpResponseBody, 'base64'))
  console.log(data.quotes[0].author.name)
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://quotes.toscrape.com/scroll',
        'browserHtml' => true,
        'networkCapture' => [
            [
                'filterType' => 'url',
                'httpResponseBody' => true,
                'value' => '/api/',
                'matchType' => 'contains',
            ],
        ],
    ],
]);
$api_response = json_decode($response->getBody());
$capture = $api_response->networkCapture[0];
$data = json_decode(base64_decode($capture->httpResponseBody));
echo $data->quotes[0]->author->name.PHP_EOL;
```

#### Python

```python
import json
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://quotes.toscrape.com/scroll",
        "browserHtml": True,
        "networkCapture": [
            {
                "filterType": "url",
                "httpResponseBody": True,
                "value": "/api/",
                "matchType": "contains",
            },
        ],
    },
)
capture = api_response.json()["networkCapture"][0]
data = json.loads(b64decode(capture["httpResponseBody"]).decode())
print(data["quotes"][0]["author"]["name"])
```

#### Python client

```python
import asyncio
import json
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://quotes.toscrape.com/scroll",
            "browserHtml": True,
            "networkCapture": [
                {
                    "filterType": "url",
                    "httpResponseBody": True,
                    "value": "/api/",
                    "matchType": "contains",
                },
            ],
        },
    )
    capture = api_response["networkCapture"][0]
    data = json.loads(b64decode(capture["httpResponseBody"]).decode())
    print(data["quotes"][0]["author"]["name"])

asyncio.run(main())
```

#### Scrapy

```python
import json
from base64 import b64decode

from scrapy import Request, Spider

class QuotesToScrapeComSpider(Spider):
    name = "quotes_toscrape_com"

    async def start(self):
        yield Request(
            "https://quotes.toscrape.com/scroll",
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                    "networkCapture": [
                        {
                            "filterType": "url",
                            "httpResponseBody": True,
                            "value": "/api/",
                            "matchType": "contains",
                        },
                    ],
                },
            },
        )

    def parse(self, response):
        capture = response.raw_api_response["networkCapture"][0]
        data = json.loads(b64decode(capture["httpResponseBody"]).decode())
        print(data["quotes"][0]["author"]["name"])
```

Output:

```none
Albert Einstein
```

See also the network capture section of the web scraping tutorial.
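
The examples above read only the first capture. When a filter matches several responses, a loop along these lines may be more convenient. This is a sketch under assumptions: `decode_captures` and `fake_capture` are hypothetical helpers, the sample data is made up to stand in for the `networkCapture` response field, and every captured body is assumed to be JSON.

```python
import json
from base64 import b64decode, b64encode


def decode_captures(network_capture):
    """Yield the JSON-decoded body of every capture that includes one."""
    for capture in network_capture:
        encoded = capture.get("httpResponseBody")
        if encoded is None:
            # Skip captures that carry no response body.
            continue
        yield json.loads(b64decode(encoded).decode())


def fake_capture(data):
    """Build a made-up capture entry with a Base64-encoded JSON body."""
    return {"httpResponseBody": b64encode(json.dumps(data).encode()).decode()}


# Hypothetical stand-in for api_response["networkCapture"].
sample = [
    fake_capture({"quotes": [{"author": {"name": "Albert Einstein"}}]}),
    fake_capture({"quotes": [{"author": {"name": "Example Author"}}]}),
]

for data in decode_captures(sample):
    print(data["quotes"][0]["author"]["name"])
```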

### Request headers

In browser requests, use the [requestHeaders.referer](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/requestHeaders.referer) request
field to set the [Referer header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referer).

#### Example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using System.Xml.XPath;
using HtmlAgilityPack;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://httpbin.org/anything"},
    {"browserHtml", true},
    {
        "requestHeaders",
        new Dictionary<string, object>()
        {
            {"referer", "https://example.org/"}
        }
    }
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var browserHtml = data.RootElement.GetProperty("browserHtml").ToString();
var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(browserHtml);
var navigator = htmlDocument.CreateNavigator();
var nodeIterator = (XPathNodeIterator)navigator.Evaluate("//text()");
nodeIterator.MoveNext();
var responseJson = nodeIterator.Current.ToString();
var responseData = JsonDocument.Parse(responseJson);
var headerEnumerator = responseData.RootElement.GetProperty("headers").EnumerateObject();
var headers = new Dictionary<string, string>();
while (headerEnumerator.MoveNext())
{
    headers.Add(
        headerEnumerator.Current.Name.ToString(),
        headerEnumerator.Current.Value.ToString()
    );
}
```

#### CLI client

input.jsonl
```json
{"url": "https://httpbin.org/anything", "browserHtml": true, "requestHeaders": {"referer": "https://example.org/"}}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .browserHtml \
    | xmllint --html --xpath '//text()' - 2> /dev/null \
    | jq .headers
```

#### curl

input.json
```json
{
    "url": "https://httpbin.org/anything",
    "browserHtml": true,
    "requestHeaders": {
        "referer": "https://example.org/"
    }
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .browserHtml \
    | xmllint --html --xpath '//text()' - 2> /dev/null \
    | jq .headers
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> requestHeaders = ImmutableMap.of("referer", "https://example.org/");
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://httpbin.org/anything",
            "browserHtml",
            true,
            "requestHeaders",
            requestHeaders);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String browserHtml = jsonObject.get("browserHtml").getAsString();
          Document document = Jsoup.parse(browserHtml);
          JsonObject data = JsonParser.parseString(document.text()).getAsJsonObject();
          JsonObject headers = data.get("headers").getAsJsonObject();
          Gson gson = new GsonBuilder().setPrettyPrinting().create();
          System.out.println(gson.toJson(headers));
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')
const cheerio = require('cheerio')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://httpbin.org/anything',
    browserHtml: true,
    requestHeaders: {
      referer: 'https://example.org/'
    }
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const $ = cheerio.load(response.data.browserHtml)
  const data = JSON.parse($.text())
  const headers = data.headers
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://httpbin.org/anything',
        'browserHtml' => true,
        'requestHeaders' => [
            'referer' => 'https://example.org/',
        ],
    ],
]);
$api = json_decode($response->getBody());
$doc = new DOMDocument();
$doc->loadHTML($api->browserHtml);
$data = json_decode($doc->textContent);
$headers = $data->headers;
```

#### Python

```python
import json

import requests
from parsel import Selector

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://httpbin.org/anything",
        "browserHtml": True,
        "requestHeaders": {
            "referer": "https://example.org/",
        },
    },
)
browser_html = api_response.json()["browserHtml"]
selector = Selector(browser_html)
response_json = selector.xpath("//text()").get()
response_data = json.loads(response_json)
headers = response_data["headers"]
```

#### Python client

```python
import asyncio
import json

from parsel import Selector
from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://httpbin.org/anything",
            "browserHtml": True,
            "requestHeaders": {
                "referer": "https://example.org/",
            },
        }
    )
    browser_html = api_response["browserHtml"]
    selector = Selector(browser_html)
    response_json = selector.xpath("//text()").get()
    response_data = json.loads(response_json)
    print(json.dumps(response_data["headers"], indent=2))

asyncio.run(main())
```

#### Scrapy

```python
import json

from scrapy import Request, Spider

class HTTPBinOrgSpider(Spider):
    name = "httpbin_org"

    async def start(self):
        yield Request(
            "https://httpbin.org/anything",
            headers={"Referer": "https://example.org/"},
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                },
            },
        )

    def parse(self, response):
        response_json = response.xpath("//text()").get()
        response_data = json.loads(response_json)
        headers = response_data["headers"]
```

Output (`"Referer"` line):

```json
  "Referer": "https://example.org/",
```

At the moment, only the [Referer header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referer) can be overridden this way. If you
need to override additional headers, use HTTP requests with their [customHttpRequestHeaders](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/customHttpRequestHeaders) request
field instead.
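
As a minimal sketch of that alternative, the request body below asks for an HTTP response (`httpResponseBody`) and sets headers through `customHttpRequestHeaders`, which takes a list of name/value objects. The helper name `build_http_request_body` and the header values are illustrative assumptions, not part of Zyte API.

```python
import json


def build_http_request_body(url, headers):
    """Build a Zyte API request body for an HTTP request with custom headers.

    `headers` is a plain dict; it is converted to the list of name/value
    objects that the customHttpRequestHeaders field expects.
    """
    return {
        "url": url,
        "httpResponseBody": True,
        "customHttpRequestHeaders": [
            {"name": name, "value": value} for name, value in headers.items()
        ],
    }


body = build_http_request_body(
    "https://httpbin.org/anything",
    {"Referer": "https://example.org/", "Accept-Language": "fr"},
)
# Send this JSON in a POST request to https://api.zyte.com/v1/extract,
# authenticating as in the examples above.
print(json.dumps(body, indent=2))
```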

### Redirection

Browser requests always follow [HTTP redirection](https://developer.mozilla.org/en-US/docs/Web/HTTP/Redirections) and other URL changes
triggered during browser rendering, e.g. by HTML or by JavaScript.

> ###### TIP
>
> HTTP requests support not following redirection.
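
For comparison, here is a sketch of an HTTP request body that does not follow redirection. It assumes the `followRedirect` request field of HTTP requests; the target URL is only an illustration.

```python
import json

# Request body for an HTTP request that returns the redirect response itself
# instead of following it. This option is not available for browser requests.
body = {
    "url": "https://httpbin.org/redirect/1",
    "httpResponseBody": True,
    "followRedirect": False,
}
print(json.dumps(body, indent=2))
```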

### JavaScript

Browser requests have JavaScript execution enabled by default for most
websites. For some websites, however, JavaScript execution is disabled by
default because doing so helps avoid bans or automate extraction.

Use the [javascript](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/javascript) request field to explicitly enable or disable
JavaScript execution on a browser request.

#### Example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using System.Xml.XPath;
using HtmlAgilityPack;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://www.whatismybrowser.com/detect/is-javascript-enabled"},
    {"browserHtml", true},
    {"javascript", false}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var browserHtml = data.RootElement.GetProperty("browserHtml").ToString();
var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(browserHtml);
var navigator = htmlDocument.CreateNavigator();
var nodeIterator = (XPathNodeIterator)navigator.Evaluate("//*[@id=\"detected_value\"]/text()");
nodeIterator.MoveNext();
var isJavaScriptEnabled = nodeIterator.Current.ToString();
```

#### CLI client

input.jsonl
```json
{"url": "https://www.whatismybrowser.com/detect/is-javascript-enabled", "browserHtml": true, "javascript": false}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .browserHtml \
    | xmllint --html --xpath '//*[@id="detected_value"]/text()' - 2> /dev/null
```

#### curl

input.json
```json
{
    "url": "https://www.whatismybrowser.com/detect/is-javascript-enabled",
    "browserHtml": true,
    "javascript": false
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .browserHtml \
    | xmllint --html --xpath '//*[@id="detected_value"]/text()' - 2> /dev/null
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://www.whatismybrowser.com/detect/is-javascript-enabled",
            "browserHtml",
            true,
            "javascript",
            false);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String browserHtml = jsonObject.get("browserHtml").getAsString();
          Document document = Jsoup.parse(browserHtml);
          String isJavaScriptEnabled = document.select("#detected_value").text();
          System.out.println(isJavaScriptEnabled);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')
const cheerio = require('cheerio')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://www.whatismybrowser.com/detect/is-javascript-enabled',
    browserHtml: true,
    javascript: false
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const $ = cheerio.load(response.data.browserHtml)
  const isJavaScriptEnabled = $('#detected_value').text()
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://www.whatismybrowser.com/detect/is-javascript-enabled',
        'browserHtml' => true,
        'javascript' => false,
    ],
]);
$api = json_decode($response->getBody());
$doc = new DOMDocument();
$doc->loadHTML($api->browserHtml);
$xpath = new DOMXPath($doc);
$is_javascript_enabled = $xpath->query("//*[@id='detected_value']")->item(0)->textContent;
```

#### Python

```python
import requests
from parsel import Selector

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://www.whatismybrowser.com/detect/is-javascript-enabled",
        "browserHtml": True,
        "javascript": False,
    },
)
browser_html = api_response.json()["browserHtml"]
selector = Selector(browser_html)
is_javascript_enabled: str = selector.css("#detected_value::text").get()
```

#### Python client

```python
import asyncio

from parsel import Selector
from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://www.whatismybrowser.com/detect/is-javascript-enabled",
            "browserHtml": True,
            "javascript": False,
        }
    )
    browser_html = api_response["browserHtml"]
    selector = Selector(browser_html)
    is_javascript_enabled = selector.css("#detected_value::text").get()
    print(is_javascript_enabled)

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class WhatIsMyBrowserComSpider(Spider):
    name = "whatismybrowser_com"

    async def start(self):
        yield Request(
            "https://www.whatismybrowser.com/detect/is-javascript-enabled",
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                    "javascript": False,
                },
            },
        )

    def parse(self, response):
        is_javascript_enabled: str = response.css("#detected_value::text").get()
```

Output:

```none
No
```

## Zyte API automatic extraction

**Automatic extraction** gets you structured data from web data.

Automatic extraction supports AI-powered extraction
of e-commerce, article and job posting data from any website, as well as
**non-AI extraction** of Google Search results.

You can use Zyte API requests to get structured data
from webpages.

### Structured data types

In a Zyte API request, enable any of the following
fields to get matching structured data:

> ###### NOTE
>
> You can only enable 1 of these fields per Zyte API request.

> ##### E-commerce
> [product](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/product) ([output](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/product)) ai
> [productList](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/productList) ([output](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/productList)) ai
> [productNavigation](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/productNavigation) ([output](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/productNavigation)) ai

> ##### Articles
> [article](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/article) ([output](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/article)) ai
> [articleList](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/articleList) ([output](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/articleList)) ai
> [articleNavigation](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/articleNavigation) ([output](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/articleNavigation)) ai
> [forumThread](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/forumThread) ([output](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/forumThread)) ai

> ##### Job postings
> [jobPosting](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/jobPosting) ([output](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/jobPosting)) ai
> [jobPostingNavigation](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/jobPostingNavigation) ([output](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/jobPostingNavigation)) ai

> ##### Generic
> [pageContent](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/pageContent) ([output](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/pageContent)) ai

> ##### Google Search
>
> [serp](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/serp) ([output](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/serp)) non-ai
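
Because only one of these fields can be enabled per request, it can help to check a request body before sending it. The sketch below is a client-side check under that assumption; the function names are hypothetical and Zyte API performs its own validation regardless.

```python
# Structured data fields listed above; only one may be enabled per request.
DATA_TYPE_FIELDS = {
    "product", "productList", "productNavigation",
    "article", "articleList", "articleNavigation", "forumThread",
    "jobPosting", "jobPostingNavigation",
    "pageContent",
    "serp",
}


def enabled_data_types(request_body):
    """Return the sorted structured data fields enabled in a request body."""
    return sorted(
        field for field in DATA_TYPE_FIELDS if request_body.get(field) is True
    )


def check_single_data_type(request_body):
    """Return the enabled data type, or raise ValueError unless there is exactly one."""
    enabled = enabled_data_types(request_body)
    if len(enabled) != 1:
        raise ValueError(f"expected exactly 1 data type field, got {enabled}")
    return enabled[0]


print(check_single_data_type({"url": "https://example.com", "product": True}))
```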

#### Example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html"},
    {"product", true}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var product = data.RootElement.GetProperty("product").ToString();

Console.WriteLine(product);
```

#### CLI client

input.jsonl
```json
{"url": "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html", "product": true}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .product
```

#### curl

input.json
```json
{
    "url": "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html",
    "product": true
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .product
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html",
            "product",
            true);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          JsonObject product = jsonObject.get("product").getAsJsonObject();
          Gson gson = new GsonBuilder().setPrettyPrinting().create();
          System.out.println(gson.toJson(product));
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html',
    product: true
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const product = response.data.product
  console.log(product)
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html',
        'product' => true,
    ],
]);
$data = json_decode($response->getBody());
$product = json_encode($data->product);
echo $product.PHP_EOL;
```

#### Python

```python
import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": (
            "https://books.toscrape.com/catalogue"
            "/a-light-in-the-attic_1000/index.html"
        ),
        "product": True,
    },
)
product = api_response.json()["product"]
print(product)
```

#### Python client

```python
import asyncio
import json

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": (
                "https://books.toscrape.com/catalogue"
                "/a-light-in-the-attic_1000/index.html"
            ),
            "product": True,
        }
    )
    product = api_response["product"]
    print(json.dumps(product, indent=2, ensure_ascii=False))

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class BooksToScrapeComSpider(Spider):
    name = "books_toscrape_com"

    async def start(self):
        yield Request(
            (
                "https://books.toscrape.com/catalogue"
                "/a-light-in-the-attic_1000/index.html"
            ),
            meta={
                "zyte_api_automap": {
                    "product": True,
                },
            },
        )

    def parse(self, response):
        product = response.raw_api_response["product"]
        print(product)
```

Output (first 5 lines):

```json
{
  "name": "A Light in the Attic",
  "price": "51.77",
  "currency": "GBP",
  "currencyRaw": "£",
```

### AI-powered extraction

Automatic extraction uses AI-powered extraction for the following structured
data types: [product](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/product), [productList](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/productList),
[productNavigation](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/productNavigation), [article](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/article),
[articleList](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/articleList), [articleNavigation](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/articleNavigation),
[forumThread](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/forumThread), [jobPosting](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/jobPosting),
[jobPostingNavigation](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/jobPostingNavigation), [pageContent](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/pageContent).

AI-powered extraction also supports LLM-based extraction of custom
attributes, as well as geolocation, IP type, cookies, sessions, redirection,
response headers, and metadata, plus additional features depending on your
extraction source.

#### Extraction source

Use the corresponding `extractFrom` option, e.g.
[productOptions.extractFrom](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/productOptions.extractFrom) when extracting a
[product](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/product), to indicate which sources to use for automatic
extraction:

- `httpResponseBody` extracts from [httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/httpResponseBody). It is
  usually faster and cheaper.
- `browserHtmlOnly` extracts from [browserHtml](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/browserHtml). It
  typically improves quality over `httpResponseBody` on JavaScript-heavy
  web pages.
- `browserHtml` extracts from both [browserHtml](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/browserHtml) and visual
  features of the rendered web page. It typically improves quality over
  `browserHtmlOnly`, but is not as robust in case of rendering issues.

If not specified, `browserHtml` is currently used by default for AI
extraction, while `httpResponseBody` is used by default for
non-AI extraction. In the future, the default value
may depend on the target website.

Automatic extraction using an HTTP request (`httpResponseBody`) supports HTTP
request attributes for method, body, and headers.

Automatic extraction using a browser request (`browserHtmlOnly` or
`browserHtml`) supports browser HTML,
screenshots, some request headers, actions, network
capture, and toggling JavaScript. The limitations of browser requests also apply in this case.
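
The following sketch builds a product extraction request that selects an extraction source through `productOptions.extractFrom`. The helper name `build_product_request` is hypothetical; the accepted values are the three listed above.

```python
import json

# Values documented above for the `extractFrom` option.
EXTRACTION_SOURCES = ("httpResponseBody", "browserHtmlOnly", "browserHtml")


def build_product_request(url, extract_from=None):
    """Build a product extraction request, optionally setting the source.

    When `extract_from` is None, the option is omitted and Zyte API applies
    its default extraction source.
    """
    body = {"url": url, "product": True}
    if extract_from is not None:
        if extract_from not in EXTRACTION_SOURCES:
            raise ValueError(f"unknown extraction source: {extract_from!r}")
        body["productOptions"] = {"extractFrom": extract_from}
    return body


body = build_product_request(
    "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html",
    extract_from="httpResponseBody",
)
print(json.dumps(body, indent=2))
```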

#### Model pinning

The AI models of AI-powered extraction are retrained regularly, usually a few
times per year. While new model versions aim to improve overall accuracy, they
may become less accurate for specific fields of specific websites.

For certain data types, we provide an option to pin a specific model version,
which allows you to postpone an update to the latest model.

To pin a model, use the corresponding `model` option, e.g.
[productOptions.model](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/productOptions.model) when extracting a [product](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/product).

Model versions remain available for at least 1 year after their release. For
example, a product model version `"2024-02-01"` would remain available at
least until the 1st of February 2025.

When we decide to remove a model version, we announce its end-of-life date by
email to its users at least 3 months in advance, and we list that date in the
table below.

| Data type   | Model name   | Description           |
|-------------|--------------|-----------------------|
| product     | 2024-02-01   |                       |
| product     | 2024-09-16   | Default product model |

## Custom attributes extraction

When you use AI-powered extraction, you can also extract
arbitrary additional attributes that you define: **custom attributes**.

The extraction of custom attributes uses a Large Language Model (LLM) operated
by Zyte that receives a schema defined by you, as well as text extracted from
the target webpage, and performs extraction of structured data according to
your schema.

When you request custom attributes extraction, you must also enable
a standard extraction field
(e.g. [product](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/product)).
That field determines which part of the webpage is passed to the LLM for custom attributes extraction.
For example, on a product page only the product information is passed,
and other parts of the page, such as the menu or footer, are ignored, which makes extraction cheaper and more accurate.
You can use any standard extraction field except [serp](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/serp).

The schema is passed in the
[customAttributes](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/customAttributes)
request field, and additional options can be customized in the
[customAttributesOptions](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/customAttributesOptions)
field.

Extracted values are available in the
[customAttributes.values](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/customAttributes.values)
field in the response.

Here is an example body of a request to Zyte API which performs custom attributes extraction,
adding “summary” and “article_sentiment” attributes:

```json
{
  "url": "https://www.zyte.com/blog/intercept-network-patterns-within-zyte-api/",
  "article": true,
  "customAttributes": {
    "summary": {
      "type": "string",
      "description": "A two sentence article summary"
    },
    "article_sentiment": {
      "type": "string",
      "enum": ["positive", "negative", "neutral"]
    }
  }
}
```

And here is an example response body, with “article” and “metadata” values omitted:

```json
{
  "url": "https://www.zyte.com/blog/intercept-network-patterns-within-zyte-api/",
  "statusCode": 200,
  "article": {

  },
  "customAttributes": {
    "values": {
      "summary": "The Zyte API now allows developers to intercept network patterns, enabling better web scraping and bypassing challenges posed by modern websites with dynamic content and anti-bot measures. This feature allows for enhanced ban-handling strategies and more efficient scraping.",
      "article_sentiment": "positive"
    },
    "metadata": {

    }
  }
}
```

Refer to examples of making Zyte API requests with different languages and libraries.
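
Once you have a response, the extracted values can be read from `customAttributes.values`. The sketch below parses a trimmed copy of the example response above (the summary is shortened for brevity) instead of calling Zyte API.

```python
import json

# Trimmed copy of the example response body above; the summary is shortened.
response_body = """
{
  "url": "https://www.zyte.com/blog/intercept-network-patterns-within-zyte-api/",
  "statusCode": 200,
  "article": {},
  "customAttributes": {
    "values": {
      "summary": "The Zyte API now allows developers to intercept network patterns.",
      "article_sentiment": "positive"
    },
    "metadata": {}
  }
}
"""

data = json.loads(response_body)
values = data["customAttributes"]["values"]
# Attributes that could not be extracted are absent from the response,
# so .get() is used to avoid a KeyError.
sentiment = values.get("article_sentiment")
summary = values.get("summary")
print(sentiment)
```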

### Method of extraction

[customAttributesOptions.method](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/customAttributesOptions.method)
lets you select the method of custom attribute extraction:

* “generate” (default) generates extracted data with the help of a generative Large Language Model (LLM).
  It is the most powerful and versatile extraction method, but also the most expensive one,
  with variable per-request cost.
* “extract” locates extracted data in the requested web page with the help of a non-generative LLM.
  It only supports a subset of the schema (only string, integer and number types),
  and can’t perform generative tasks such as summarization or data transformation.
  It is however much cheaper compared to the generative method and has a
  fixed per-request cost.
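
The sketch below builds a request body for either method and rejects schemas that the "extract" method cannot handle, based on the type restriction described above. The helper name `build_custom_attributes_request` is hypothetical, and the check only covers top-level attribute types.

```python
# Schema types that the "extract" method supports, per the list above.
EXTRACT_METHOD_TYPES = {"string", "integer", "number"}


def build_custom_attributes_request(url, data_type, schema, method="generate"):
    """Build a request body for custom attributes extraction."""
    if method not in ("generate", "extract"):
        raise ValueError(f"unknown method: {method!r}")
    if method == "extract":
        unsupported = {
            name: spec.get("type")
            for name, spec in schema.items()
            if spec.get("type") not in EXTRACT_METHOD_TYPES
        }
        if unsupported:
            raise ValueError(f"types not supported by 'extract': {unsupported}")
    return {
        "url": url,
        data_type: True,  # the standard extraction field, e.g. "product"
        "customAttributes": schema,
        "customAttributesOptions": {"method": method},
    }


schema = {
    "pockets": {
        "type": "integer",
        "description": "how many pockets the piece of clothing has",
    }
}
body = build_custom_attributes_request(
    "https://example.com/product", "product", schema, method="extract"
)
print(body["customAttributesOptions"])
```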

### Schema for the generative method

The schema of the custom attributes is passed in the
[customAttributes](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/customAttributes)
request field, and is a subset of the OpenAPI specification, using JSON syntax.
Here is an example custom attributes schema, showcasing the main features and good practices
with the default “generate” method:

```json
{
  "pockets": {
    "type": "integer",
    "description": "how many pockets the piece of clothing has"
  },
  "has_reflective_elements": {
    "type": "boolean",
    "description": "does the piece of clothing have reflective elements?"
  },
  "pattern_orientation": {
    "type": "string",
    "description": "if the piece of clothing has a pattern, the orientation of this pattern",
    "enum": ["horizontal", "vertical", "diagonal"]
  },
  "materials": {
    "type": "array",
    "description": "the materials the product is made of",
    "items": {"type": "string"}
  },
  "materials_details": {
    "type": "array",
    "description": "information about the materials the product is made of",
    "items": {
      "type": "object",
      "properties": {
        "name": {
          "type": "string",
          "description": "the name of the material"
        },
        "percentage": {
          "type": "number",
          "description": "the percentage of the material in the product"
        }
      }
    }
  },
  "price": {
    "type": "object",
    "properties": {
      "regular": {
        "type": "number",
        "description": "the regular price of the product. This is, without any discount"
      },
      "discounted": {
        "type": "number",
        "description": "the current price of the product, with the discount"
      },
      "unit": {
        "type": "string",
        "description": "the currency code of the price, usually given as a 3-letter code, e.g. USD, EUR, GBP, etc."
      }
    }
  }
}
```

An example output which may be produced for this schema would be:

```json
{
  "pockets": 3,
  "has_reflective_elements": false,
  "materials": ["cotton", "polyester", "elastane"],
  "materials_details": [
    {"name": "cotton", "percentage": 70},
    {"name": "polyester", "percentage": 25},
    {"name": "elastane"}
  ],
  "price": {
    "regular": 100,
    "discounted": 99,
    "unit": "EUR"
  }
}
```

Note that `"pattern_orientation"` is missing from the response, as is `"percentage"` for one of the materials.
All attributes are implicitly nullable: if an attribute cannot be extracted, it is not returned.

The returned value is guaranteed to conform to the requested schema.

The following attribute data types are supported:

- `string`
- `boolean`
- `number`
- `integer`
- `array` of any data type except for `array`
- `object` with `string`, `boolean`, `number` and `integer` sub-fields

When the type is `string`, `number` or `integer`, an `enum` can also be indicated,
and the extraction value for that attribute will always be one of these options,
or empty when it cannot be extracted - see the `"pattern_orientation"` in the example above.
This is especially useful in data analysis use cases, where one might need
to split the dataset into pre-defined groups.
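
As a rough illustration of that use case, the Python sketch below groups extraction results by an `enum` attribute and keeps records where the attribute could not be extracted under `None`. The `group_by_attribute` helper and the sample records (including the `"vertical"` value) are hypothetical.

```python
from collections import defaultdict


def group_by_attribute(records, attribute):
    """Group extraction results by an attribute; missing values go under None."""
    groups = defaultdict(list)
    for record in records:
        groups[record.get(attribute)].append(record)
    return dict(groups)


# Hypothetical extraction results for three product pages.
records = [
    {"pockets": 3, "pattern_orientation": "vertical"},
    {"pockets": 2},  # pattern_orientation could not be extracted
    {"pockets": 0, "pattern_orientation": "vertical"},
]
groups = group_by_attribute(records, "pattern_orientation")
print({key: len(items) for key, items in groups.items()})
```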

#### Generative attributes

Custom attributes are not restricted to extracting data verbatim as it appears on the website.
They can also perform operations on the data that only generative extraction makes possible.
The examples below take advantage of this.

##### Normalization

A custom attribute can be extracted following some data normalization,
when specified in the description, usually in some explicit format or via an example.

Normalized values are easier to parse later, e.g. for visualization or data analysis. Example schema:

```json
{
  "datetime_posted": {
    "type": "string",
    "description": "the date when the article was created, in the following format: YYYY/MM/DD"
  }
}
```

Example output:

```json
{
  "datetime_posted": "2021/12/30"
}
```
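
A normalized value like this can then be parsed directly. The sketch below is one possible way to do it in Python. The `parse_posted_date` helper is hypothetical, and the format string mirrors the `YYYY/MM/DD` format requested in the description.

```python
from datetime import date, datetime
from typing import Optional


def parse_posted_date(values: dict) -> Optional[date]:
    """Parse the normalized YYYY/MM/DD string, if the attribute was extracted."""
    raw = values.get("datetime_posted")
    if raw is None:
        return None
    return datetime.strptime(raw, "%Y/%m/%d").date()


posted = parse_posted_date({"datetime_posted": "2021/12/30"})
print(posted.isoformat())
```

The format is requested through the description rather than through the attribute type, so it may be worth catching `ValueError` from `strptime` in case a value deviates from it.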

##### Summarization

Sometimes, attributes that are summaries rather than the whole text can be useful,
especially to save tokens needed to generate them, or when some simplification of the content of the page is needed.

Example schema:

```json
{
  "summary": {
    "type": "string",
    "description": "a brief summary of the article. Max 2 phrases. Explain it as a third person, e.g. start like this: The article.."
  }
}
```

Example output:

```json
{
  "summary": "The article describes the scenic beauty and vast adventure opportunities of the Grand Canyon National Park, highlighting its colorful landscapes and the meandering Colorado River. It provides practical information for visitors, such as entrance fees, lodging options, and tips for hiking and rafting."
}
```

##### Translation

You can extract and translate on the fly. Just specify the conditions or details in the description.

Example text:

```
[...]
Couleurs du produit disponibles: jaune, rouge
[...]
```

Example schema:

```json
{
  "colors": {
    "type": "array",
    "description": "the available colors of the product. Translate to English if needed.",
    "items": {"type": "string"}
  }
}
```

Output:

```json
{
  "colors": ["yellow", "red"]
}
```

When a schema has several attributes,
an instruction like this given for one custom attribute may also be applied to later attributes.
If you want different behavior for those, we recommend stating it explicitly in the description of each attribute,
for example:

```json
{
  "colors": {
    "type": "array",
    "description": "the available colors of the product. Translate to English if needed.",
    "items": {"type": "string"}
  },
  "materials": {
    "type": "array",
    "description": "the materials the product is made of. Extract as they appear on the page, without translating them.",
    "items": {"type": "string"}
  }
}
```

##### Explanation

We can make the LLM perform an analysis and explain the page content (or other details)
before doing the actual extraction, in the same attribute or in another attribute after the explanatory one.

This is especially useful to force the LLM to develop a “logic” before doing the actual extraction,
which has been [demonstrated](https://arxiv.org/abs/2201.11903) to improve the final answer,
for example:

```json
{
  "explain is a toy": {
    "type": "string",
    "description": "analyze the content of the page and detailedly explain it, explaining if it is a single product page and if the product is a toy or not."
  },
  "is a toy": {
    "type": "boolean",
    "description": "whether the product is a toy or not"
  }
}
```

Example output:

```json
{
  "explain is a toy": "The content of the page is a product page for \"Roasted & Salted Plantain Chips\", which is a type of snack food. It includes details such as brand, price, ratings, ingredients, and product description. It does not mention anything about toys or games, so it is not a single product page for a toy.",
  "is a toy": false
}
```

Overall, we’d expect the extraction of the “is a toy” custom attribute in the example above to be more accurate
if we use the “explain is a toy” before it, especially in hard or ambiguous cases (e.g. the product is a manual for a toy).

> ###### NOTE
>
> Since this kind of explanation needs to generate a fair number of
> tokens, it can considerably increase the extraction cost.
>
> Also, it is important that the attribute that does the explanation/analysis
> (“explain is a toy”) comes before the final one where the final extraction
> is made (“is a toy”).
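
If you build schemas programmatically, the ordering can be kept by construction. The Python sketch below relies on the fact that dictionaries preserve insertion order and that `json.dumps` serializes keys in that order. The `with_explanation` helper is hypothetical.

```python
import json


def with_explanation(name: str, attribute: dict, explanation_prompt: str) -> dict:
    """Return a schema fragment where the explanatory attribute comes first."""
    return {
        f"explain {name}": {"type": "string", "description": explanation_prompt},
        name: attribute,
    }


schema = with_explanation(
    "is a toy",
    {"type": "boolean", "description": "whether the product is a toy or not"},
    "analyze the content of the page and explain whether the product is a toy or not.",
)
serialized = json.dumps(schema)
# Insertion order survives serialization, so the explanation precedes the answer.
print(list(json.loads(serialized)))
```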

#### Other tips and tricks

##### Avoiding mathematical transformations

We recommend doing a simple extraction when possible,
and then applying your rules or transformations as a post-processing step.

For example, imagine you want to extract the height of a product, always in inches,
but you are scraping many product pages, and some websites display the height in cm, m, ft, etc.
One option is to explicitly ask in the schema to “transform to X unit if found in Y unit”.
The LLM can generally do this conversion internally,
but we cannot ensure that the result will always be correct, and it complicates extraction for the LLM.

Example text:

```
Vacuum cleaner Turbo master 2000
Price: 200 $
Specified height by the manufacturer is 1.2 meters
```

Example schema:

```json
{
  "height": {
    "type": "number",
    "description": "height of the product, in inches. Transform it to inches if found in other metric"
  }
}
```

Extraction result:

```json
{
  "height": 47.24
}
```

The result is correct. However, when the schema is bigger (i.e. there are more custom attributes to extract),
the LLM's attention is more spread out, and it is more likely to get these internal conversions wrong, which cannot be verified.
For this reason, we recommend writing a schema that allows the LLM to extract the desired data verbatim from the page,
with the necessary fields to do your own transformation in your favorite programming language.
This extraction is easier for the LLM and less likely to be incorrect.

Example schema:

```json
{
  "product_height": {
    "type": "object",
    "description": "info about the height of the product",
    "properties": {
      "value": {
        "type": "number",
        "description": "the value of the height"
      },
      "unit_normalized": {
        "type": "string",
        "description": "the normalized unit of measurement for the height",
        "enum": ["cm", "m", "in", "ft", "mm", "other"]
      }
    }
  }
}
```

Extraction result:

```json
{
  "product_height": {
    "value": 1.2,
    "unit_normalized": "m"
  }
}
```

And then do the necessary transformation. For example, using Python:

```python
def height_in_inches(values):
    height = values["product_height"]
    if height["unit_normalized"] == "m":
        return height["value"] * 39.37  # meters to inches
```
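
A slightly fuller sketch could cover every unit in the `enum` above with a lookup table. The `to_inches` helper and the `INCHES_PER_UNIT` table are hypothetical, and the sketch assumes that `"other"` or a missing attribute should yield `None` rather than a guess.

```python
# Conversion factors to inches for the known units of "unit_normalized".
INCHES_PER_UNIT = {
    "mm": 1 / 25.4,
    "cm": 1 / 2.54,
    "m": 100 / 2.54,
    "in": 1.0,
    "ft": 12.0,
}


def to_inches(values: dict):
    """Convert the extracted height to inches, or return None if not possible."""
    height = values.get("product_height") or {}
    value = height.get("value")
    factor = INCHES_PER_UNIT.get(height.get("unit_normalized"))
    if value is None or factor is None:
        # Missing attribute, missing sub-field, or the "other" unit.
        return None
    return value * factor


result = to_inches({"product_height": {"value": 1.2, "unit_normalized": "m"}})
print(round(result, 2))
```

Keeping the conversion in your own code means that it can be unit-tested, which is not possible for a conversion done inside the LLM.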

##### Reducing the number of attributes

A lower number of attributes generally means better extraction quality.
The easier it is to solve a problem, the better the LLM will be at solving that problem.
For that reason, the more attributes there are in the schema,
the harder the overall extraction will be for the LLM,
which then tends to miss some details in the descriptions or in the web page.

Generally, the fewer the attributes, the more the LLM can focus on those, and
the better the extraction quality will be. This is especially important when some
attributes are already hard or complex on their own (e.g. array of objects).

### Schema for the extractive method

The main use case for the extractive method is the extraction of simple,
not-too-large attributes that do not require any transformation, such as memory
capacity or screen resolution for [product](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/product).

When you select the “extract” custom attributes extraction method,
and your schema contains attributes that
are not supported by the extractive method, e.g. objects, lists, or booleans,
those attributes are ignored during extraction.
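
To spot such attributes before sending a request, a small check along the following lines may help. It is only a sketch: the `ignored_by_extractive` helper is hypothetical, and the set of unsupported types contains only the ones named above, which are given as examples and may not be exhaustive.

```python
# JSON Schema types that the text above names as unsupported by the
# extractive method. Assumption: the list may not be exhaustive.
UNSUPPORTED_EXTRACTIVE_TYPES = {"object", "array", "boolean"}


def ignored_by_extractive(schema: dict) -> list:
    """Return the names of attributes that the extractive method would ignore."""
    return [
        name
        for name, attribute in schema.items()
        if attribute.get("type") in UNSUPPORTED_EXTRACTIVE_TYPES
    ]


schema = {
    "pockets": {"type": "integer"},
    "has_reflective_elements": {"type": "boolean"},
    "materials": {"type": "array", "items": {"type": "string"}},
}
print(ignored_by_extractive(schema))
```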

When creating a schema for the extractive method, we recommend starting
without attribute descriptions. If an attribute name alone is not enough to
reach the desired quality, write a description for that attribute, formulate
it as a question, and make it detailed enough to stand on its own, without
assuming that the attribute name provides implicit context for the
question.

For example, you might start with the following:

```json
{
  "number_of_pockets": {
    "type": "integer"
  }
}
```

And then make it more specific:

```json
{
  "pockets": {
    "type": "integer",
    "description": "What is the number of pockets in this garment?"
  }
}
```

But we don’t recommend having an incomplete description that relies on the attribute name,
or a description that is not a question:

```json
{
  "pockets": {
    "type": "integer",
    "description": "number of them in this garment"
  }
}
```

When extracting values with units, we recommend extracting the whole value as one attribute
instead of splitting the value and the unit. For example, do this:

```json
{
  "memory_capacity": {
    "type": "string"
  }
}
```

instead of this, which is less likely to work well:

```json
{
  "memory_capacity_value": {
    "type": "integer"
  },
  "memory_capacity_unit": {
    "type": "string"
  }
}
```
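
If you need the number and the unit separately, you can split the single string in post-processing. The Python sketch below assumes values shaped like `16 GB` or `1.5TB`, with `.` as the decimal separator, and returns `None` for anything else. The `split_value_and_unit` helper and the sample values are hypothetical.

```python
import re

# A number with an optional decimal part, optional spaces, then a unit that
# starts with a letter. Anything else is rejected rather than guessed.
_VALUE_WITH_UNIT = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([A-Za-z].*?)\s*$")


def split_value_and_unit(text: str):
    """Split a string such as '16 GB' into (16.0, 'GB'), or return None."""
    match = _VALUE_WITH_UNIT.match(text)
    if match is None:
        return None
    return float(match.group(1)), match.group(2)


extracted = {"memory_capacity": "16 GB"}  # hypothetical extraction result
print(split_value_and_unit(extracted["memory_capacity"]))
```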

## Zyte API shared features

Learn here about Zyte API features that you can use with HTTP requests,
browser requests, and automatic extraction: geolocation, IP type, cookies,
sessions, response headers, and metadata.

### Geolocation

The geographical point of origin of a request in terms of IP address can
influence the response content. Some websites adjust the language or currency
based on the country of origin. Some websites only allow traffic from specific
countries.

By default, Zyte API uses the most fitting geolocation based on the target
website. You can override the country of origin used for a given request with
the [geolocation](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/geolocation) request field.

> ###### NOTE
>
> Zyte API provides 2 sets of geolocations, standard and extended,
> listed in the reference documentation of [geolocation](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/geolocation).
>
> Setting [geolocation](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/geolocation) explicitly on a request using an
> extended geolocation, instead of letting Zyte API choose the right
> geolocation based on the target website, affects request cost.

#### Example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "http://ip-api.com/json"},
    {"httpResponseBody", true},
    {"geolocation", "AU"}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBody = System.Convert.FromBase64String(base64HttpResponseBody);

var responseData = JsonDocument.Parse(httpResponseBody);
var countryCode = responseData.RootElement.GetProperty("countryCode").ToString();
```

#### CLI client

input.jsonl
```json
{"url": "http://ip-api.com/json", "httpResponseBody": true, "geolocation": "AU"}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq .countryCode
```

#### curl

input.json
```json
{
    "url": "http://ip-api.com/json",
    "httpResponseBody": true,
    "geolocation": "AU"
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq .countryCode
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url", "http://ip-api.com/json", "httpResponseBody", true, "geolocation", "AU");
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          JsonObject data = JsonParser.parseString(httpResponseBody).getAsJsonObject();
          String countryCode = data.get("countryCode").getAsString();
          System.out.println(countryCode);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'http://ip-api.com/json',
    httpResponseBody: true,
    geolocation: 'AU'
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  const data = JSON.parse(httpResponseBody)
  const countryCode = data.countryCode
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'http://ip-api.com/json',
        'httpResponseBody' => true,
        'geolocation' => 'AU',
    ],
]);
$api = json_decode($response->getBody());
$http_response_body = base64_decode($api->httpResponseBody);
$data = json_decode($http_response_body);
$country_code = $data->countryCode;
```

#### Proxy mode

With the proxy mode, use the `Zyte-Geolocation` header.

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    -H "Zyte-Geolocation: AU" \
    http://ip-api.com/json \
    | jq .countryCode
```

#### Python

```python
import json
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "http://ip-api.com/json",
        "httpResponseBody": True,
        "geolocation": "AU",
    },
)
http_response_body: bytes = b64decode(api_response.json()["httpResponseBody"])
response_data = json.loads(http_response_body)
country_code = response_data["countryCode"]
```

#### Python client

```python
import asyncio
import json
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "http://ip-api.com/json",
            "httpResponseBody": True,
            "geolocation": "AU",
        }
    )
    http_response_body: bytes = b64decode(api_response["httpResponseBody"])
    response_data = json.loads(http_response_body)
    print(response_data["countryCode"])

asyncio.run(main())
```

#### Scrapy

```python
import json

from scrapy import Request, Spider

class IPAPIComSpider(Spider):
    name = "ip_api_com"

    async def start(self):
        yield Request(
            "http://ip-api.com/json",
            meta={
                "zyte_api_automap": {
                    "geolocation": "AU",
                },
            },
        )

    def parse(self, response):
        response_data = json.loads(response.body)
        country_code = response_data["countryCode"]
```

Output:

```none
AU
```

### IP type

IP addresses can be categorized into one of the following types:

- **Data center** IP addresses are server-hosted IP addresses provided by web
  hosting providers, ISPs, etc.
- **Residential** IP addresses are IP addresses provided by end-user devices
  with explicit user consent for bandwidth sharing.

The type of IP address of a request can influence the response content. Some
websites return different content depending on the IP type, or only allow
requests from residential IP addresses.

By default, Zyte API uses the most fitting IP type based on the target website.
You can override the IP type used for a given request by setting the
[ipType](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/ipType) request field to either `datacenter` or
`residential`.

> ###### WARNING
>
> Setting [ipType](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/ipType) explicitly to `residential`,
> instead of letting Zyte API choose the right IP type based on the target
> website, requires completing our KYC procedure and affects
> request cost.

#### Example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using System.Xml.XPath;
using HtmlAgilityPack;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

string[] ipTypes = { "datacenter", "residential" };
for (int i = 0; i < ipTypes.Length; i++)
{
    var input = new Dictionary<string, object>(){
        {"url", "https://www.whatismyisp.com/"},
        {"httpResponseBody", true},
        {"ipType", ipTypes[i]}
    };
    var inputJson = JsonSerializer.Serialize(input);
    var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

    HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
    var body = await response.Content.ReadAsByteArrayAsync();

    var data = JsonDocument.Parse(body);
    var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
    var httpResponseBodyBytes = System.Convert.FromBase64String(base64HttpResponseBody);
    var httpResponseBody = System.Text.Encoding.UTF8.GetString(httpResponseBodyBytes);
    var htmlDocument = new HtmlDocument();
    htmlDocument.LoadHtml(httpResponseBody);
    var navigator = htmlDocument.CreateNavigator();
    var nodeIterator = (XPathNodeIterator)navigator.Evaluate("//h1/span/text()");
    nodeIterator.MoveNext();
    var isp = nodeIterator.Current.ToString();

    Console.WriteLine(isp);
}
```

#### CLI client

input.jsonl
```json
{"url": "https://www.whatismyisp.com/", "httpResponseBody": true, "ipType": "datacenter"}
{"url": "https://www.whatismyisp.com/", "httpResponseBody": true, "ipType": "residential"}
```

```shell
zyte-api input.jsonl 2> /dev/null \
    | xargs -d\\n -n 1 \
    bash -c "
        jq --raw-output .httpResponseBody <<< \"\$0\" \
        | base64 --decode \
        | xmllint --html --xpath 'string(//h1/span/text())' --noblanks - 2> /dev/null
"
```

#### curl

input.jsonl
```json
{"url": "https://www.whatismyisp.com/", "httpResponseBody": true, "ipType": "datacenter"}
{"url": "https://www.whatismyisp.com/", "httpResponseBody": true, "ipType": "residential"}
```

```shell
cat input.jsonl \
    | xargs -P 2 -d\\n -n 1 \
    bash -c "
        curl \
                --user YOUR_ZYTE_API_KEY: \
                --header 'Content-Type: application/json' \
                --data \"\$0\" \
                --compressed \
                https://api.zyte.com/v1/extract \
            2> /dev/null \
            | jq --raw-output .httpResponseBody \
            | base64 --decode \
            | xmllint --html --xpath 'string(//h1/span/text())' --noblanks - 2> /dev/null
"
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class Example {

  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    String[] ipTypes = {"datacenter", "residential"};
    for (String ipType : ipTypes) {
      Map<String, Object> parameters =
          ImmutableMap.of(
              "url", "https://www.whatismyisp.com/", "httpResponseBody", true, "ipType", ipType);
      String requestBody = new Gson().toJson(parameters);

      HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
      request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
      request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
      request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
      request.setEntity(new StringEntity(requestBody));

      CloseableHttpClient client = HttpClients.createDefault();
      client.execute(
          request,
          response -> {
            HttpEntity entity = response.getEntity();
            String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
            JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
            String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
            byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
            String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
            Document document = Jsoup.parse(httpResponseBody);
            String isp = document.select("h1 > span:first-of-type").text();
            System.out.println(isp);
            return null;
          });
    }
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')
const cheerio = require('cheerio')

const ipTypes = ['datacenter', 'residential']
for (const ipType of ipTypes) {
  axios.post(
    'https://api.zyte.com/v1/extract',
    {
      url: 'https://www.whatismyisp.com/',
      httpResponseBody: true,
      ipType
    },
    {
      auth: { username: 'YOUR_ZYTE_API_KEY' }
    }
  ).then((response) => {
    const httpResponseBody = Buffer.from(
      response.data.httpResponseBody,
      'base64'
    )
    const $ = cheerio.load(httpResponseBody)
    const isp = $('h1 > span:first-of-type').text()
    console.log(isp)
  })
}
```

#### PHP

```php
<?php

error_reporting(E_ERROR | E_PARSE);
$client = new GuzzleHttp\Client();
$ip_types = ['datacenter', 'residential'];
foreach ($ip_types as &$ip_type) {
    $response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
        'auth' => ['YOUR_ZYTE_API_KEY', ''],
        'headers' => ['Accept-Encoding' => 'gzip'],
        'json' => [
            'url' => 'https://www.whatismyisp.com/',
            'httpResponseBody' => true,
            'ipType' => $ip_type,
        ],
    ]);
    $data = json_decode($response->getBody());
    $http_response_body = base64_decode($data->httpResponseBody);
    $doc = new DOMDocument();
    $doc->loadHTML($http_response_body);
    $xpath = new DOMXPath($doc);
    $isp = $xpath->query('//h1/span/text()')->item(0)->nodeValue;
    echo $isp.PHP_EOL;
}
```

#### Proxy mode

With the proxy mode, use the `Zyte-IPType` header.

```shell
for ip_type in datacenter residential
do
    curl \
        --proxy api.zyte.com:8011 \
        --proxy-user YOUR_ZYTE_API_KEY: \
        --header "Zyte-IPType: $ip_type" \
        --compressed \
        https://www.whatismyisp.com/ \
        2> /dev/null \
        | xmllint --html --xpath 'string(//h1/span/text())' --noblanks - 2> /dev/null
done
```

#### Python

```python
from base64 import b64decode

import requests
from parsel import Selector

for ip_type in ("datacenter", "residential"):
    api_response = requests.post(
        "https://api.zyte.com/v1/extract",
        auth=("YOUR_ZYTE_API_KEY", ""),
        json={
            "url": "https://www.whatismyisp.com/",
            "httpResponseBody": True,
            "ipType": ip_type,
        },
    )
    http_response_body_bytes = b64decode(api_response.json()["httpResponseBody"])
    http_response_body = http_response_body_bytes.decode()
    isp = Selector(http_response_body).css("h1 > span::text").get()
    print(isp)
```

#### Python client

```python
import asyncio
from base64 import b64decode

from parsel import Selector
from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    for ip_type in ("datacenter", "residential"):
        api_response = await client.get(
            {
                "url": "https://www.whatismyisp.com/",
                "httpResponseBody": True,
                "ipType": ip_type,
            },
        )
        http_response_body_bytes = b64decode(api_response["httpResponseBody"])
        http_response_body = http_response_body_bytes.decode()
        isp = Selector(http_response_body).css("h1 > span::text").get()
        print(isp)

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class WhatIsMyIspComSpider(Spider):
    name = "whatismyisp_com"

    async def start(self):
        for ip_type in ("datacenter", "residential"):
            yield Request(
                "https://www.whatismyisp.com/",
                meta={
                    "zyte_api_automap": {
                        "ipType": ip_type,
                    },
                },
            )

    def parse(self, response):
        print(response.css("h1 > span::text").get())
```

Output:

```none
[A web hosting company]
[An Internet service provider]
```

### Cookies

Some websites use cookies to track sessions and user preferences like
language, address, etc.

Use the [requestCookies](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/requestCookies) and [responseCookies](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/responseCookies)
request fields to set and get cookies. See **Example 1** below.

A common usage pattern with cookies is to send a browser request with the [responseCookies](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/responseCookies) request field set
to `true` to a webpage that requires a browser to generate a valid session
cookie, and then copy the [responseCookies](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/responseCookies) response field value
into the [requestCookies](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/requestCookies) request field of follow-up HTTP
requests. This allows using sessions on websites as long as
the target website only checks for the cookie presence, which is often the
case (if not, use sessions). See **Example 2** below.
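
The request bodies involved in this pattern can be sketched as follows, without sending anything. The helper names, the URL, and the sample cookie are hypothetical. `browserHtml` is used here as the field that makes the first request a browser request.

```python
def browser_request(url: str) -> dict:
    """Payload for a browser request that also returns the cookies it received."""
    return {"url": url, "browserHtml": True, "responseCookies": True}


def follow_up_request(url: str, browser_response: dict) -> dict:
    """Payload for an HTTP request that reuses cookies from a browser response."""
    payload = {"url": url, "httpResponseBody": True}
    cookies = browser_response.get("responseCookies")
    if cookies:
        # Copy the response field value into the request field as-is.
        payload["requestCookies"] = cookies
    return payload


# Hypothetical API response to the browser request, trimmed to the relevant field.
sample_browser_response = {
    "responseCookies": [
        {"name": "session", "value": "abc123", "domain": "example.com"},
    ],
}
print(browser_request("https://example.com/"))
print(follow_up_request("https://example.com/data", sample_browser_response))
```

Each payload would be sent to the same endpoint, in the same way as in the other examples in this section.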

If you do not set request cookies, Zyte API may set some request cookies anyway
to minimize bans. If you do not want that, set the
[cookieManagement](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/cookieManagement) request field to `"discard"`;
[requestCookies](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/requestCookies) will still be used if defined.

#### Example 1: Set a cookie and get it back

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

The following code example sends a cookie to [httpbin.org](https://httpbin.org) and prints the
cookies that [httpbin.org](https://httpbin.org) reports having received:

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://httpbin.org/cookies"},
    {"httpResponseBody", true},
    {
        "requestCookies",
        new List<Dictionary<string, string>>()
        {
            new Dictionary<string, string>()
            {
                {"name", "foo"},
                {"value", "bar"},
                {"domain", "httpbin.org"}
            }
        }
    }
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBody = System.Convert.FromBase64String(base64HttpResponseBody);
var result = System.Text.Encoding.UTF8.GetString(httpResponseBody);

Console.WriteLine(result);
```

#### CLI client

input.jsonl
```json
{"url": "https://httpbin.org/cookies", "httpResponseBody": true, "requestCookies": [{"name": "foo", "value": "bar", "domain": "httpbin.org"}]}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode
```

#### curl

input.json
```json
{
    "url": "https://httpbin.org/cookies",
    "httpResponseBody": true,
    "requestCookies": [
        {
            "name": "foo",
            "value": "bar",
            "domain": "httpbin.org"
        }
    ]
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
| jq --raw-output .httpResponseBody \
| base64 --decode
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Collections;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, String> cookies =
        ImmutableMap.of("name", "foo", "value", "bar", "domain", "httpbin.org");
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://httpbin.org/cookies",
            "httpResponseBody",
            true,
            "requestCookies",
            Collections.singletonList(cookies));
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          System.out.println(httpResponseBody);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://httpbin.org/cookies',
    httpResponseBody: true,
    requestCookies: [
      {
        name: 'foo',
        value: 'bar',
        domain: 'httpbin.org'
      }
    ]
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  console.log(httpResponseBody.toString())
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://httpbin.org/cookies',
        'httpResponseBody' => true,
        'requestCookies' => [
            [
                'name' => 'foo',
                'value' => 'bar',
                'domain' => 'httpbin.org',
            ],
        ],
    ],
]);
$api = json_decode($response->getBody());
$http_response_body = base64_decode($api->httpResponseBody);
echo $http_response_body;
```

#### Proxy mode

In proxy mode, the `Cookie` header of your request is automatically used to
set cookies for the target URL domain.

> ###### NOTE
>
> Setting cookies for additional domains is not supported.

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    -H "Cookie: foo=bar" \
    https://httpbin.org/cookies
```
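
If your cookies are stored in the same list-of-dictionaries shape as
`requestCookies`, you need to turn them into a `Cookie` header value before
using proxy mode. The sketch below is one way to do that; `cookie_header` is a
hypothetical helper, and its domain check is a simplified suffix match rather
than a full implementation of browser cookie rules.

```python
from urllib.parse import urlsplit


def cookie_header(cookies, url):
    """Build a Cookie header value from requestCookies-style dictionaries.

    Hypothetical helper: keeps only cookies whose domain matches the host of
    ``url``, because proxy mode only sets cookies for the target URL domain.
    """
    host = urlsplit(url).hostname or ""
    pairs = []
    for cookie in cookies:
        domain = cookie["domain"].lstrip(".")
        # Simplified match: exact host, or host is a subdomain of the domain.
        if host == domain or host.endswith("." + domain):
            pairs.append(f"{cookie['name']}={cookie['value']}")
    return "; ".join(pairs)


cookies = [
    {"name": "foo", "value": "bar", "domain": "httpbin.org"},
    {"name": "other", "value": "1", "domain": "example.com"},
]
print(cookie_header(cookies, "https://httpbin.org/cookies"))  # prints: foo=bar
```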

#### Python

```python
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://httpbin.org/cookies",
        "httpResponseBody": True,
        "requestCookies": [
            {
                "name": "foo",
                "value": "bar",
                "domain": "httpbin.org",
            },
        ],
    },
)
http_response_body = b64decode(api_response.json()["httpResponseBody"])
print(http_response_body.decode())
```

#### Python client

```python
import asyncio
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://httpbin.org/cookies",
            "httpResponseBody": True,
            "requestCookies": [
                {
                    "name": "foo",
                    "value": "bar",
                    "domain": "httpbin.org",
                },
            ],
        }
    )
    http_response_body = b64decode(api_response["httpResponseBody"]).decode()
    print(http_response_body)

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class HTTPBinOrgSpider(Spider):
    name = "httpbin_org"

    async def start(self):
        yield Request(
            "https://httpbin.org/cookies",
            meta={
                "zyte_api_automap": {
                    "requestCookies": [
                        {
                            "name": "foo",
                            "value": "bar",
                            "domain": "httpbin.org",
                        },
                    ],
                },
            },
        )

    def parse(self, response):
        print(response.text)
```

Output:

```json
{
  "cookies": {
    "foo": "bar"
  }
}
```

#### Example 2: Reuse browser cookies in HTTP requests

Send a browser request to the home page of a website, and use its response
cookies as request cookies in an HTTP request to a different URL of that
website.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var browserInput = new Dictionary<string, object>(){
    {"url", "https://toscrape.com/"},
    {"browserHtml", true},
    {"responseCookies", true}
};
var browserInputJson = JsonSerializer.Serialize(browserInput);
var browserContent = new StringContent(browserInputJson, Encoding.UTF8, "application/json");
HttpResponseMessage browserResponse = await client.PostAsync("https://api.zyte.com/v1/extract", browserContent);
var browserResponseBody = await browserResponse.Content.ReadAsByteArrayAsync();
var browserData = JsonDocument.Parse(browserResponseBody);

var httpInput = new Dictionary<string, object>(){
    {"url", "https://books.toscrape.com/"},
    {"httpResponseBody", true},
    {"requestCookies", browserData.RootElement.GetProperty("responseCookies")}
};
var httpInputJson = JsonSerializer.Serialize(httpInput);
var httpContent = new StringContent(httpInputJson, Encoding.UTF8, "application/json");
HttpResponseMessage httpResponse = await client.PostAsync("https://api.zyte.com/v1/extract", httpContent);
var httpResponseBody = await httpResponse.Content.ReadAsByteArrayAsync();
var httpData = JsonDocument.Parse(httpResponseBody);
var base64HttpResponseBodyField = httpData.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBodyField = System.Convert.FromBase64String(base64HttpResponseBodyField);
var result = System.Text.Encoding.UTF8.GetString(httpResponseBodyField);

Console.WriteLine(result);
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> browserParameters =
        ImmutableMap.of(
            "url", "https://toscrape.com/", "browserHtml", true, "responseCookies", true);
    String browserRequestBody = new Gson().toJson(browserParameters);

    HttpPost browserRequest = new HttpPost("https://api.zyte.com/v1/extract");
    browserRequest.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    browserRequest.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    browserRequest.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    browserRequest.setEntity(new StringEntity(browserRequestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        browserRequest,
        browserResponse -> {
          HttpEntity browserEntity = browserResponse.getEntity();
          String browserApiResponse = EntityUtils.toString(browserEntity, StandardCharsets.UTF_8);
          JsonObject browserJsonObject =
              JsonParser.parseString(browserApiResponse).getAsJsonObject();

          Map<String, Object> httpParameters =
              ImmutableMap.of(
                  "url",
                  "https://books.toscrape.com/",
                  "httpResponseBody",
                  true,
                  "requestCookies",
                  browserJsonObject.get("responseCookies"));
          String httpRequestBody = new Gson().toJson(httpParameters);

          HttpPost httpRequest = new HttpPost("https://api.zyte.com/v1/extract");
          httpRequest.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
          httpRequest.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
          httpRequest.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
          httpRequest.setEntity(new StringEntity(httpRequestBody));

          client.execute(
              httpRequest,
              httpResponse -> {
                HttpEntity httpEntity = httpResponse.getEntity();
                String httpApiResponse = EntityUtils.toString(httpEntity, StandardCharsets.UTF_8);
                JsonObject httpJsonObject =
                    JsonParser.parseString(httpApiResponse).getAsJsonObject();
                String base64HttpResponseBody =
                    httpJsonObject.get("httpResponseBody").getAsString();
                byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
                String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
                System.out.println(httpResponseBody);
                return null;
              });
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com/',
    browserHtml: true,
    responseCookies: true
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((browserResponse) => {
  axios.post(
    'https://api.zyte.com/v1/extract',
    {
      url: 'https://books.toscrape.com/',
      httpResponseBody: true,
      requestCookies: browserResponse.data.responseCookies
    },
    {
      auth: { username: 'YOUR_ZYTE_API_KEY' }
    }
  ).then((httpResponse) => {
    const httpResponseBody = Buffer.from(
      httpResponse.data.httpResponseBody,
      'base64'
    )
    console.log(httpResponseBody.toString())
  })
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$browser_response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com/',
        'browserHtml' => true,
        'responseCookies' => true,
    ],
]);
$browser_data = json_decode($browser_response->getBody());
$http_response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://books.toscrape.com/',
        'httpResponseBody' => true,
        'requestCookies' => $browser_data->responseCookies,
    ],
]);
$http_data = json_decode($http_response->getBody());
$http_response_body = base64_decode($http_data->httpResponseBody);
echo $http_response_body;
```

#### Python

```python
from base64 import b64decode

import requests

browser_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://toscrape.com/",
        "browserHtml": True,
        "responseCookies": True,
    },
)
http_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://books.toscrape.com/",
        "httpResponseBody": True,
        "requestCookies": browser_response.json()["responseCookies"],
    },
)
http_response_body = b64decode(http_response.json()["httpResponseBody"])
print(http_response_body.decode())
```

#### Python client

```python
import asyncio
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    browser_response = await client.get(
        {
            "url": "https://toscrape.com/",
            "browserHtml": True,
            "responseCookies": True,
        }
    )
    http_response = await client.get(
        {
            "url": "https://books.toscrape.com/",
            "httpResponseBody": True,
            "requestCookies": browser_response["responseCookies"],
        }
    )
    http_response_body = b64decode(http_response["httpResponseBody"]).decode()
    print(http_response_body)

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class ToScrapeComSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request(
            "https://toscrape.com/",
            callback=self.parse_browser,
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                    "responseCookies": True,
                },
            },
        )

    def parse_browser(self, response):
        yield response.follow(
            "https://books.toscrape.com/",
            callback=self.parse_http,
            meta={
                "zyte_api_automap": {
                    "requestCookies": response.raw_api_response["responseCookies"],
                },
            },
        )

    def parse_http(self, response):
        print(response.text)
```

### Sessions

In web scraping, a session is a set of request conditions (IP address, cookie jar,
network stack, etc.) that, when shared by two or more requests, make those
requests *seem* part of an organic web browsing session.

For some websites, reusing cookies can be enough to
maintain a session. But on other websites, sessions get invalidated when their
requests do not share the same IP address, network stack, etc.

Zyte API supports two ways to define request sessions:

- Client-managed sessions give you full control
  over session management.
- Server-managed sessions let Zyte API
  handle session management for you.

> ###### NOTE
>
> Sessions do *not* offer browser persistence. When two browser
> requests use the same session, they are not actually using
> the same browser tab, window, process or machine.

> ###### TIP
>
> scrapy-zyte-api also implements an
> alternative session management API,
> similar to that of server-managed sessions, but built on top of client-managed
> sessions.

Zyte API sessions can be especially useful for:

- Crawling stateful parts of websites, like multi-page forms, pagination or
  scrolling, where the time limit of actions
  can be a problem.
  > ###### NOTE
  >
  > Sessions do not maintain browser state; they only make it *seem*
  > so to target websites. In other words, when you send a second request
  > with the same session, it does not use the same browser instance as
  > the first request.
  >
  > Maintaining browser state between requests is a *planned* feature.
- Optimizing scenarios where you need to set initial session conditions
  (language, country, currency, address, etc.) shared by many follow-up
  requests.

  For example:
  - If you have multiple browser requests that
    all share a set of initial actions for basic
    session setup, such as using the `setLocation` action or similar, sessions can get you faster responses
    and give you extra run time for other actions.
  - If you have multiple HTTP requests that need
    cookies from an earlier browser request, and
    you need those follow-up requests to be sent with the same session as
    the browser request, sessions can give you that.

#### Client-managed sessions

To create a client-managed session, when sending a request, set
[session.id](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/session.id) to a [version 4 UUID](https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random)).

When sending follow-up requests with the same session ID, the created session
will be reused, i.e. all requests will share the same IP address, network
stack, cookie jar, etc.
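
As a rough illustration, the hypothetical `session_field` helper below creates
a new version 4 UUID when no ID is given, and rejects IDs that are not version
4 UUIDs before they reach the API. The helper name and the validation step are
assumptions of this sketch, not part of Zyte API or its clients.

```python
from uuid import UUID, uuid4


def session_field(session_id=None):
    """Return a ``session`` request field, creating a new ID when none is given.

    Hypothetical helper; raises ValueError for IDs that are not version 4 UUIDs.
    """
    if session_id is None:
        session_id = str(uuid4())
    elif UUID(session_id).version != 4:
        raise ValueError(f"not a version 4 UUID: {session_id!r}")
    return {"id": session_id}


# The first request creates the session; the follow-up reuses the same ID.
first = {
    "url": "https://httpbin.org/ip",
    "httpResponseBody": True,
    "session": session_field(),
}
follow_up = {**first, "session": session_field(first["session"]["id"])}
print(first["session"] == follow_up["session"])
```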

Compared to server-managed sessions,
client-managed sessions offer a lower-level API that lets you do more but also
requires you to do more. For example:

- You control the number of sessions being used. You decide how many sessions
  you want to use at a given time, you create those sessions, you rotate your
  pool of sessions among your requests, and you create new sessions as old
  sessions expire.
- You can stop using a specific session, e.g. if you can tell from a response
  that the target website invalidated the session.
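
The pool management described above could be sketched as follows. The
`SessionPool` class, its method names, and the round-robin rotation are
assumptions chosen for illustration; how you detect an expired or invalidated
session depends on the target website and is left out.

```python
from collections import deque
from uuid import uuid4


class SessionPool:
    """Round-robin pool of client-managed session IDs (illustrative only)."""

    def __init__(self, size):
        # You decide how many sessions to use at a given time.
        self._ids = deque(str(uuid4()) for _ in range(size))

    def next_id(self):
        """Return the next session ID and move it to the back of the queue."""
        session_id = self._ids[0]
        self._ids.rotate(-1)
        return session_id

    def discard(self, session_id):
        """Drop an expired or invalidated session and create a replacement."""
        if session_id in self._ids:
            self._ids.remove(session_id)
            self._ids.append(str(uuid4()))

    def payload(self, url):
        """Build a request body that uses the next session in the pool."""
        return {
            "url": url,
            "httpResponseBody": True,
            "session": {"id": self.next_id()},
        }


pool = SessionPool(size=3)
first = pool.payload("https://httpbin.org/ip")
# Suppose the response shows that the website invalidated this session.
pool.discard(first["session"]["id"])
```

Keeping the pool size fixed when a session is discarded is a design choice of
this sketch; you may prefer to grow or shrink the pool based on your own
concurrency needs.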

See [session](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/session) for details.

#### Example 1: Same-session requests use the same IP address

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var sessionId = Guid.NewGuid().ToString();

for (int i = 0; i < 2; i++)
{
    var input = new Dictionary<string, object>(){
        {"url", "https://httpbin.org/ip"},
        {"httpResponseBody", true},
        {
            "session",
            new Dictionary<string, string>()
            {
                {"id", sessionId}
            }
        }
    };
    var inputJson = JsonSerializer.Serialize(input);
    var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

    HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
    var body = await response.Content.ReadAsByteArrayAsync();

    var data = JsonDocument.Parse(body);
    var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
    var httpResponseBodyBytes = System.Convert.FromBase64String(base64HttpResponseBody);
    var httpResponseBody = System.Text.Encoding.UTF8.GetString(httpResponseBodyBytes);

    var responseData = JsonDocument.Parse(httpResponseBody);
    var ipAddress = responseData.RootElement.GetProperty("origin").ToString();

    Console.WriteLine(ipAddress);
}
```

#### CLI client

input.jsonl
```json
{"url": "https://httpbin.org/ip", "httpResponseBody": true, "session": {"id": "e07843b4-fd72-4a02-82b4-3376c6ceba92"}}
{"url": "https://httpbin.org/ip", "httpResponseBody": true, "session": {"id": "e07843b4-fd72-4a02-82b4-3376c6ceba92"}}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq --raw-output .origin
```

#### curl

input.json
```json
{
    "url": "https://httpbin.org/ip",
    "httpResponseBody": true,
    "session": {
        "id": "e07843b4-fd72-4a02-82b4-3376c6ceba92"
    }
}
```

```shell
for i in {1..2}
do
    curl \
        --user YOUR_ZYTE_API_KEY: \
        --header 'Content-Type: application/json' \
        --data @input.json \
        --compressed \
        https://api.zyte.com/v1/extract \
        | jq --raw-output .httpResponseBody \
        | base64 --decode \
        | jq --raw-output .origin
done
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import java.util.UUID;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    String sessionId = UUID.randomUUID().toString();
    CloseableHttpClient client = HttpClients.createDefault();

    for (int i = 0; i < 2; i++) {
      Map<String, Object> session = ImmutableMap.of("id", sessionId);
      Map<String, Object> parameters =
          ImmutableMap.of(
              "url", "https://httpbin.org/ip", "httpResponseBody", true, "session", session);
      String requestBody = new Gson().toJson(parameters);

      HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
      request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
      request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
      request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
      request.setEntity(new StringEntity(requestBody));

      client.execute(
          request,
          response -> {
            HttpEntity entity = response.getEntity();
            String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
            JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
            String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
            byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
            String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
            JsonObject data = JsonParser.parseString(httpResponseBody).getAsJsonObject();
            String body = data.get("origin").getAsString();
            System.out.println(body);
            return null;
          });
    }
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')
const crypto = require('crypto')

const sessionId = String(crypto.randomUUID())

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://httpbin.org/ip',
    httpResponseBody: true,
    session: { id: sessionId }
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  const body = JSON.parse(httpResponseBody).origin
  console.log(body)
  axios.post(
    'https://api.zyte.com/v1/extract',
    {
      url: 'https://httpbin.org/ip',
      httpResponseBody: true,
      session: { id: sessionId }
    },
    {
      auth: { username: 'YOUR_ZYTE_API_KEY' }
    }
  ).then((response) => {
    const httpResponseBody = Buffer.from(
      response.data.httpResponseBody,
      'base64'
    )
    const body = JSON.parse(httpResponseBody).origin
    console.log(body)
  })
})
```

#### PHP

```php
<?php

// https://stackoverflow.com/a/15875555
function uuidv4()
{
    $data = random_bytes(16);

    $data[6] = chr(ord($data[6]) & 0x0F | 0x40); // set version to 0100
    $data[8] = chr(ord($data[8]) & 0x3F | 0x80); // set bits 6-7 to 10

    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($data), 4));
}

$client = new GuzzleHttp\Client();
$session_id = uuidv4();

for ($i = 0; $i < 2; ++$i) {
    $response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
        'auth' => ['YOUR_ZYTE_API_KEY', ''],
        'headers' => ['Accept-Encoding' => 'gzip'],
        'json' => [
            'url' => 'https://httpbin.org/ip',
            'httpResponseBody' => true,
            'session' => ['id' => $session_id],
        ],
    ]);
    $data = json_decode($response->getBody());
    $http_response_body = base64_decode($data->httpResponseBody);
    $body = json_decode($http_response_body)->origin;
    echo $body.PHP_EOL;
}
```

#### Proxy mode

In proxy mode, set the session ID through the `Zyte-Session-ID` header.

```shell
for i in {1..2}
do
    curl \
        --proxy api.zyte.com:8011 \
        --proxy-user YOUR_ZYTE_API_KEY: \
        --header 'Content-Type: application/json' \
        --header 'Zyte-Session-ID: e07843b4-fd72-4a02-82b4-3376c6ceba92' \
        --compressed \
        https://httpbin.org/ip \
        | jq --raw-output .origin
done
```

#### Python

```python
import json
from base64 import b64decode
from uuid import uuid4

import requests

session_id = str(uuid4())

for _ in range(2):
    api_response = requests.post(
        "https://api.zyte.com/v1/extract",
        auth=("YOUR_ZYTE_API_KEY", ""),
        json={
            "url": "https://httpbin.org/ip",
            "httpResponseBody": True,
            "session": {"id": session_id},
        },
    )
    http_response_body = b64decode(api_response.json()["httpResponseBody"])
    body: str = json.loads(http_response_body)["origin"]
    print(body)
```

#### Python client

```python
import asyncio
import json
from base64 import b64decode
from uuid import uuid4

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    session_id = str(uuid4())
    for i in range(2):
        api_response = await client.get(
            {
                "url": "https://httpbin.org/ip",
                "httpResponseBody": True,
                "session": {"id": session_id},
            },
        )
        http_response_body = b64decode(api_response["httpResponseBody"]).decode()
        data = json.loads(http_response_body)
        print(data["origin"])

asyncio.run(main())
```

#### Scrapy

> ###### TIP
>
> scrapy-zyte-api also provides its own session management
> API, similar to that of
> server-managed sessions, but
> built on top of client-managed sessions.

```python
import json
from uuid import uuid4

from scrapy import Request, Spider

class HTTPBinOrgSpider(Spider):
    name = "httpbin_org"

    async def start(self):
        session_id = str(uuid4())
        yield Request(
            "https://httpbin.org/ip",
            cb_kwargs={"session_id": session_id},
            meta={"zyte_api_automap": {"session": {"id": session_id}}},
        )

    def parse(self, response, session_id):
        print(json.loads(response.body)["origin"])
        yield Request(
            "https://httpbin.org/ip",
            meta={"zyte_api_automap": {"session": {"id": session_id}}},
            dont_filter=True,
            callback=self.parse2,
        )

    def parse2(self, response):
        print(json.loads(response.body)["origin"])
```

Output:

```none
203.0.113.122
203.0.113.122
```

#### Example 2: Reuse browser cookies in HTTP requests

Start a session with a browser request to the home page of a website, and reuse
that session for an HTTP request to a different URL of that website.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var sessionId = Guid.NewGuid().ToString();

var browserInput = new Dictionary<string, object>(){
    {"url", "https://toscrape.com/"},
    {"browserHtml", true},
    {
        "session",
        new Dictionary<string, string>()
        {
            {"id", sessionId}
        }
    }
};
var browserInputJson = JsonSerializer.Serialize(browserInput);
var browserContent = new StringContent(browserInputJson, Encoding.UTF8, "application/json");
await client.PostAsync("https://api.zyte.com/v1/extract", browserContent);

var httpInput = new Dictionary<string, object>(){
    {"url", "https://books.toscrape.com/"},
    {"httpResponseBody", true},
    {
        "session",
        new Dictionary<string, string>()
        {
            {"id", sessionId}
        }
    }
};
var httpInputJson = JsonSerializer.Serialize(httpInput);
var httpContent = new StringContent(httpInputJson, Encoding.UTF8, "application/json");
HttpResponseMessage httpResponse = await client.PostAsync("https://api.zyte.com/v1/extract", httpContent);
var httpResponseBody = await httpResponse.Content.ReadAsByteArrayAsync();
var httpData = JsonDocument.Parse(httpResponseBody);
var base64HttpResponseBodyField = httpData.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBodyField = System.Convert.FromBase64String(base64HttpResponseBodyField);
var result = System.Text.Encoding.UTF8.GetString(httpResponseBodyField);

Console.WriteLine(result);
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import java.util.UUID;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    String sessionId = UUID.randomUUID().toString();
    Map<String, Object> session = ImmutableMap.of("id", sessionId);

    Map<String, Object> browserParameters =
        ImmutableMap.of("url", "https://toscrape.com/", "browserHtml", true, "session", session);
    String browserRequestBody = new Gson().toJson(browserParameters);

    HttpPost browserRequest = new HttpPost("https://api.zyte.com/v1/extract");
    browserRequest.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    browserRequest.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    browserRequest.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    browserRequest.setEntity(new StringEntity(browserRequestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        browserRequest,
        browserResponse -> {
          Map<String, Object> httpParameters =
              ImmutableMap.of(
                  "url",
                  "https://books.toscrape.com/",
                  "httpResponseBody",
                  true,
                  "session",
                  session);
          String httpRequestBody = new Gson().toJson(httpParameters);

          HttpPost httpRequest = new HttpPost("https://api.zyte.com/v1/extract");
          httpRequest.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
          httpRequest.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
          httpRequest.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
          httpRequest.setEntity(new StringEntity(httpRequestBody));

          client.execute(
              httpRequest,
              httpResponse -> {
                HttpEntity httpEntity = httpResponse.getEntity();
                String httpApiResponse = EntityUtils.toString(httpEntity, StandardCharsets.UTF_8);
                JsonObject httpJsonObject =
                    JsonParser.parseString(httpApiResponse).getAsJsonObject();
                String base64HttpResponseBody =
                    httpJsonObject.get("httpResponseBody").getAsString();
                byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
                String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
                System.out.println(httpResponseBody);
                return null;
              });

          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')
const crypto = require('crypto')

const sessionId = String(crypto.randomUUID())

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com/',
    browserHtml: true,
    session: { id: sessionId }
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((browserResponse) => {
  axios.post(
    'https://api.zyte.com/v1/extract',
    {
      url: 'https://books.toscrape.com/',
      httpResponseBody: true,
      session: { id: sessionId }
    },
    {
      auth: { username: 'YOUR_ZYTE_API_KEY' }
    }
  ).then((httpResponse) => {
    const httpResponseBody = Buffer.from(
      httpResponse.data.httpResponseBody,
      'base64'
    )
    console.log(httpResponseBody.toString())
  })
})
```

#### PHP

```php
<?php

// https://stackoverflow.com/a/15875555
function uuidv4()
{
    $data = random_bytes(16);

    $data[6] = chr(ord($data[6]) & 0x0F | 0x40); // set version to 0100
    $data[8] = chr(ord($data[8]) & 0x3F | 0x80); // set bits 6-7 to 10

    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($data), 4));
}

$client = new GuzzleHttp\Client();
$session_id = uuidv4();

$browser_response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com/',
        'browserHtml' => true,
        'session' => ['id' => $session_id],
    ],
]);
$http_response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://books.toscrape.com/',
        'httpResponseBody' => true,
        'session' => ['id' => $session_id],
    ],
]);
$http_data = json_decode($http_response->getBody());
$http_response_body = base64_decode($http_data->httpResponseBody);
echo $http_response_body;
```

#### Python

```python
from base64 import b64decode
from uuid import uuid4

import requests

session_id = str(uuid4())

browser_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://toscrape.com/",
        "browserHtml": True,
        "session": {"id": session_id},
    },
)
http_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://books.toscrape.com/",
        "httpResponseBody": True,
        "session": {"id": session_id},
    },
)
http_response_body = b64decode(http_response.json()["httpResponseBody"])
print(http_response_body.decode())
```

#### Python client

```python
import asyncio
from base64 import b64decode
from uuid import uuid4

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    session_id = str(uuid4())
    browser_response = await client.get(
        {
            "url": "https://toscrape.com/",
            "browserHtml": True,
            "session": {"id": session_id},
        }
    )
    http_response = await client.get(
        {
            "url": "https://books.toscrape.com/",
            "httpResponseBody": True,
            "session": {"id": session_id},
        }
    )
    http_response_body = b64decode(http_response["httpResponseBody"]).decode()
    print(http_response_body)

asyncio.run(main())
```

#### Scrapy

```python
from uuid import uuid4

from scrapy import Request, Spider

class ToScrapeComSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        session_id = str(uuid4())
        yield Request(
            "https://toscrape.com/",
            callback=self.parse_browser,
            cb_kwargs={"session_id": session_id},
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                    "session": {"id": session_id},
                },
            },
        )

    def parse_browser(self, response, session_id):
        yield response.follow(
            "https://books.toscrape.com/",
            callback=self.parse_http,
            meta={
                "zyte_api_automap": {
                    "session": {"id": session_id},
                },
            },
        )

    def parse_http(self, response):
        print(response.text)
```

#### Server-managed sessions

> ###### WARNING
>
> For pricing purposes, a request that uses
> [sessionContextParameters.actions](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/sessionContextParameters.actions)
> and does not reuse an existing session counts as a browser request,
> including action costs.

> ###### NOTE
>
> The proxy mode does not support
> server-managed sessions.

**Session contexts** let you request a server-managed session and define
prerequisites for it.

To assign a session context to a request:

- Set [sessionContext](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/sessionContext) to an arbitrary array of name-value pair
  objects that uniquely identify your session context.
- Set your session prerequisites in [sessionContextParameters](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/sessionContextParameters).
  > ###### TIP
  >
  > Before using [sessionContextParameters.actions](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/sessionContextParameters.actions),
  > make sure your actions work on the target
  > website, e.g. send a test browser request
  > with those actions, and check their outcome in the
  > [actions](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/actions) response field.
  >
  > `setLocation` is a good example of an action commonly used in
  > [sessionContextParameters.actions](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/sessionContextParameters.actions) that is not available
  > on every website. If you want to use it for a website for which the
  > action is not yet available, please [reach out to us](https://support.zyte.com/support/tickets/new).
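The check suggested in the tip above can be scripted. The following is a minimal sketch, not an official helper: `failed_actions` is a hypothetical function, and it assumes that each entry of the `actions` response field is an object with `action` and `status` keys, where `"success"` marks an action that completed. Check the API reference for the exact fields and status values. The sample response is hand-written for illustration.

```python
def failed_actions(api_response):
    """Return the entries of the ``actions`` response field that did not succeed.

    Assumption: every entry is a dict with a ``status`` key, and ``"success"``
    is the value reported for an action that completed.
    """
    return [
        action
        for action in api_response.get("actions", [])
        if action.get("status") != "success"
    ]


# Hand-written sample shaped like an API response; not real output.
# "returned" stands in here for any status other than "success".
sample_response = {
    "actions": [
        {"action": "goto", "status": "success"},
        {"action": "setLocation", "status": "returned", "error": "not available"},
    ],
}

for action in failed_actions(sample_response):
    print(action["action"], action.get("error", ""))
```

In practice, `api_response` would be the parsed JSON body returned by a test browser request sent with those actions. An empty result suggests the actions are safe to move into `sessionContextParameters.actions`.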

Every request that you send with the same value in
[sessionContext](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/sessionContext) will use a session that was initialized with
[sessionContextParameters](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/sessionContextParameters). All those requests should also always
include the [sessionContextParameters](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/sessionContextParameters) request field with the
same value.
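One way to keep both fields consistent is to build every request from a single place. The following is a minimal sketch, not part of any Zyte library: `SESSION_CONTEXT`, `SESSION_CONTEXT_PARAMETERS`, and `build_request` are hypothetical names, the second URL is only an illustration, and the session values are borrowed from the examples in this section.

```python
from copy import deepcopy

# Hypothetical constants; the values mirror the examples in this section.
SESSION_CONTEXT = [{"name": "id", "value": "cookies"}]
SESSION_CONTEXT_PARAMETERS = {
    "actions": [
        {"action": "goto", "url": "http://httpbin.org/cookies/set/foo/bar"},
    ],
}


def build_request(url):
    """Return request parameters that always carry the same session fields."""
    return {
        "url": url,
        "httpResponseBody": True,
        # Copies keep edits to one request from leaking into the shared constants.
        "sessionContext": deepcopy(SESSION_CONTEXT),
        "sessionContextParameters": deepcopy(SESSION_CONTEXT_PARAMETERS),
    }


requests_to_send = [
    build_request("http://httpbin.org/cookies"),
    build_request("http://httpbin.org/headers"),
]
```

Each dictionary can then be sent as the JSON body of a request to `https://api.zyte.com/v1/extract`, in the same way as the examples in this section.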

Zyte API handles creation, reuse, and deletion of sessions requested through
[sessionContext](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/sessionContext), meaning:

- When you send requests with the same [sessionContext](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/sessionContext), they
  may share a session or use separate sessions, each initialized with
  [sessionContextParameters](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/sessionContextParameters).
- You cannot invalidate a specific session.

  You *can* change the value of [sessionContext](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/sessionContext), even if
  [sessionContextParameters](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/sessionContextParameters) remains the same, so that your
  requests will not reuse sessions created with the previous value of
  [sessionContext](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/sessionContext), but you would be invalidating all sessions,
  not a single one.

  If you need to be able to invalidate specific sessions, e.g. based on
  response content, consider using client-managed sessions or scrapy-zyte-api’s
  session management API instead.
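Invalidating all sessions by changing the session context can be sketched as follows. This is an illustration under assumptions: `make_session_context` and the `generation` pair are hypothetical. Because `sessionContext` is an arbitrary array of name-value pair objects, any scheme that changes its value should have the same effect.

```python
def make_session_context(generation):
    """Build a sessionContext whose value changes with ``generation``.

    The ``generation`` name-value pair is a made-up convention, not an API
    feature: it only exists to make the array differ between generations.
    """
    return [
        {"name": "id", "value": "cookies"},
        {"name": "generation", "value": str(generation)},
    ]


generation = 1
old_context = make_session_context(generation)

# Bump the counter to stop reusing every session created so far.
generation += 1
new_context = make_session_context(generation)

print(old_context != new_context)
```

Requests sent with `new_context` do not reuse sessions created with `old_context`, while `sessionContextParameters` can stay unchanged.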

#### Example 1: Set a cookie on all sessions

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "http://httpbin.org/cookies"},
    {"httpResponseBody", true},
    {
        "sessionContext",
        new List<Dictionary<string, string>>()
        {
            new Dictionary<string, string>()
            {
                {"name", "id"},
                {"value", "cookies"}
            }
        }
    },
    {
        "sessionContextParameters",
        new Dictionary<string, object>()
        {
            {
                "actions",
                new List<Dictionary<string, object>>()
                {
                    new Dictionary<string, object>()
                    {
                        {"action", "goto"},
                        {"url", "http://httpbin.org/cookies/set/foo/bar"},
                    }
                }
            }
        }
    }
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBodyBytes = System.Convert.FromBase64String(base64HttpResponseBody);
var httpResponseBody = System.Text.Encoding.UTF8.GetString(httpResponseBodyBytes);

Console.WriteLine(httpResponseBody);
```

#### CLI client

input.jsonl
```json
{"url": "http://httpbin.org/cookies", "httpResponseBody": true, "sessionContext": [{"name": "id", "value": "cookies"}], "sessionContextParameters": {"actions": [{"action": "goto", "url": "http://httpbin.org/cookies/set/foo/bar"}]}}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode
```

#### curl

input.json
```json
{
    "url": "http://httpbin.org/cookies",
    "httpResponseBody": true,
    "sessionContext": [
        {
            "name": "id",
            "value": "cookies"
        }
    ],
    "sessionContextParameters": {
        "actions": [
            {
                "action": "goto",
                "url": "http://httpbin.org/cookies/set/foo/bar"
            }
        ]
    }
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .httpResponseBody \
    | base64 --decode
```

#### Java

```java
import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {

  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "http://httpbin.org/cookies",
            "httpResponseBody",
            true,
            "sessionContext",
            ImmutableList.of(ImmutableMap.of("name", "id", "value", "cookies")),
            "sessionContextParameters",
            ImmutableMap.of(
                "actions",
                ImmutableList.of(
                    ImmutableMap.of(
                        "action", "goto", "url", "http://httpbin.org/cookies/set/foo/bar"))));

    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          System.out.println(httpResponseBody);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'http://httpbin.org/cookies',
    httpResponseBody: true,
    sessionContext: [
      {
        name: 'id',
        value: 'cookies'
      }
    ],
    sessionContextParameters: {
      actions: [
        {
          action: 'goto',
          url: 'http://httpbin.org/cookies/set/foo/bar'
        }
      ]
    }
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  console.log(httpResponseBody.toString())
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'http://httpbin.org/cookies',
        'httpResponseBody' => true,
        'sessionContext' => [
            [
                'name' => 'id',
                'value' => 'cookies',
            ],
        ],
        'sessionContextParameters' => [
            'actions' => [
                [
                    'action' => 'goto',
                    'url' => 'http://httpbin.org/cookies/set/foo/bar',
                ],
            ],
        ],
    ],
]);
$data = json_decode($response->getBody());
$http_response_body = base64_decode($data->httpResponseBody);
echo $http_response_body.PHP_EOL;
```

#### Python

```python
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "http://httpbin.org/cookies",
        "httpResponseBody": True,
        "sessionContext": [
            {
                "name": "id",
                "value": "cookies",
            },
        ],
        "sessionContextParameters": {
            "actions": [
                {
                    "action": "goto",
                    "url": "http://httpbin.org/cookies/set/foo/bar",
                },
            ],
        },
    },
)
http_response_body_bytes = b64decode(api_response.json()["httpResponseBody"])
http_response_body = http_response_body_bytes.decode()
print(http_response_body)
```

#### Python client

```python
import asyncio
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "http://httpbin.org/cookies",
            "httpResponseBody": True,
            "sessionContext": [
                {
                    "name": "id",
                    "value": "cookies",
                },
            ],
            "sessionContextParameters": {
                "actions": [
                    {
                        "action": "goto",
                        "url": "http://httpbin.org/cookies/set/foo/bar",
                    },
                ],
            },
        },
    )
    http_response_body_bytes = b64decode(api_response["httpResponseBody"])
    http_response_body = http_response_body_bytes.decode()
    print(http_response_body)

asyncio.run(main())
```

#### Scrapy

> ###### TIP
>
> scrapy-zyte-api also provides its own session management
> API, similar to that of
> server-managed sessions, but
> built on top of client-managed sessions.

```python
from scrapy import Request, Spider

class HTTPBinOrgSpider(Spider):
    name = "httpbin_org"

    async def start(self):
        yield Request(
            "http://httpbin.org/cookies",
            meta={
                "zyte_api_automap": {
                    "sessionContext": [
                        {
                            "name": "id",
                            "value": "cookies",
                        },
                    ],
                    "sessionContextParameters": {
                        "actions": [
                            {
                                "action": "goto",
                                "url": "http://httpbin.org/cookies/set/foo/bar",
                            },
                        ],
                    },
                },
            },
        )

    def parse(self, response):
        print(response.text)
```

Output:

```json
{
  "cookies": {
    "foo": "bar"
  }
}
```

#### Example 2: Start sessions on a browser, use them in HTTP requests

Set a no-op action in [sessionContextParameters](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/sessionContextParameters) to force
each session to start with a browser request, even though the requests that
use the session are HTTP requests.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://toscrape.com/"},
    {"httpResponseBody", true},
    {
        "sessionContext",
        new List<Dictionary<string, string>>()
        {
            new Dictionary<string, string>()
            {
                {"name", "id"},
                {"value", "browser"}
            }
        }
    },
    {
        "sessionContextParameters",
        new Dictionary<string, object>()
        {
            {
                "actions",
                new List<Dictionary<string, object>>()
                {
                    new Dictionary<string, object>()
                    {
                        {"action", "waitForTimeout"},
                        {"timeout", 0},
                    }
                }
            }
        }
    }
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBodyBytes = System.Convert.FromBase64String(base64HttpResponseBody);
var httpResponseBody = System.Text.Encoding.UTF8.GetString(httpResponseBodyBytes);

Console.WriteLine(httpResponseBody);
```

#### CLI client

input.jsonl
```json
{"url": "https://toscrape.com/", "httpResponseBody": true, "sessionContext": [{"name": "id", "value": "browser"}], "sessionContextParameters": {"actions": [{"action": "waitForTimeout", "timeout": 0}]}}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode
```

#### curl

input.json
```json
{
    "url": "https://toscrape.com/",
    "httpResponseBody": true,
    "sessionContext": [
        {
            "name": "id",
            "value": "browser"
        }
    ],
    "sessionContextParameters": {
        "actions": [
            {
                "action": "waitForTimeout",
                "timeout": 0
            }
        ]
    }
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .httpResponseBody \
    | base64 --decode
```

#### Java

```java
import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {

  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://toscrape.com/",
            "httpResponseBody",
            true,
            "sessionContext",
            ImmutableList.of(ImmutableMap.of("name", "id", "value", "browser")),
            "sessionContextParameters",
            ImmutableMap.of(
                "actions",
                ImmutableList.of(ImmutableMap.of("action", "waitForTimeout", "timeout", 0))));

    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          System.out.println(httpResponseBody);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com/',
    httpResponseBody: true,
    sessionContext: [
      {
        name: 'id',
        value: 'browser'
      }
    ],
    sessionContextParameters: {
      actions: [
        {
          action: 'waitForTimeout',
          timeout: 0
        }
      ]
    }
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  console.log(httpResponseBody.toString())
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com/',
        'httpResponseBody' => true,
        'sessionContext' => [
            [
                'name' => 'id',
                'value' => 'browser',
            ],
        ],
        'sessionContextParameters' => [
            'actions' => [
                [
                    'action' => 'waitForTimeout',
                    'timeout' => 0,
                ],
            ],
        ],
    ],
]);
$data = json_decode($response->getBody());
$http_response_body = base64_decode($data->httpResponseBody);
echo $http_response_body.PHP_EOL;
```

#### Python

```python
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://toscrape.com/",
        "httpResponseBody": True,
        "sessionContext": [{"name": "id", "value": "browser"}],
        "sessionContextParameters": {
            "actions": [
                {
                    "action": "waitForTimeout",
                    "timeout": 0,
                },
            ],
        },
    },
)
http_response_body_bytes = b64decode(api_response.json()["httpResponseBody"])
http_response_body = http_response_body_bytes.decode()
print(http_response_body)
```

#### Python client

```python
import asyncio
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    http_response = await client.get(
        {
            "url": "https://toscrape.com/",
            "httpResponseBody": True,
            "sessionContext": [{"name": "id", "value": "browser"}],
            "sessionContextParameters": {
                "actions": [
                    {
                        "action": "waitForTimeout",
                        "timeout": 0,
                    },
                ],
            },
        }
    )
    http_response_body = b64decode(http_response["httpResponseBody"]).decode()
    print(http_response_body)

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class ToScrapeComSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request(
            "https://toscrape.com/",
            meta={
                "zyte_api_automap": {
                    "sessionContext": [
                        {
                            "name": "id",
                            "value": "browser",
                        },
                    ],
                    "sessionContextParameters": {
                        "actions": [
                            {
                                "action": "waitForTimeout",
                                "timeout": 0,
                            },
                        ],
                    },
                },
            },
        )

    def parse(self, response):
        print(response.text)
```

#### Session IP addresses

Requests using the same session will normally share the same IP address.

This may not be the case, though, in the following scenarios:

- If Zyte API is using a device residential IP address for a session, and that IP address expires, new
  requests using the same session will get a different IP address.

  The new IP address will be in the same country as the original IP address.
- When using client-managed sessions, if you
  send 2 or more requests in parallel with the same session ID, and the
  session does not exist already, each request may get a different IP
  address.

  You should create sessions with a single request and, once you get a
  response, you can send as many parallel requests as you want with that
  session.

While requests in the same session are almost guaranteed to use the same IP
address, requests from different sessions are not guaranteed to have different
IP addresses, although they often will.
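The recommendation above for client-managed sessions (create the session with a single request, then parallelize) can be sketched as follows. This is a minimal sketch rather than an official client: `send` is a hypothetical callable that posts one query to Zyte API and returns the parsed response, and the `session` request field with an `id` key is assumed to be how a client-managed session is selected.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


def session_query(url, session_id):
    # Assumption: a client-managed session is selected with session.id.
    return {"url": url, "httpResponseBody": True, "session": {"id": session_id}}


def fetch_with_shared_session(urls, send, max_workers=4):
    """Send the first query alone, then the remaining ones in parallel.

    `send` is a hypothetical callable: send(query) -> parsed API response.
    """
    if not urls:
        return []
    session_id = str(uuid.uuid4())
    first, *rest = urls
    # Create the session with a single request and wait for its response,
    # so that parallel requests do not race to create the same session.
    results = [send(session_query(first, session_id))]
    # The session now exists, so the remaining requests can run in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        queries = [session_query(url, session_id) for url in rest]
        results.extend(pool.map(send, queries))
    return results
```

With this ordering, only the first request can create the session, so the parallel requests that follow are expected to reuse its IP address under the conditions described above.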

#### Session cookie jars

Requests using the same session share the same cookie jar.

Cookies from the target websites received by session requests will be stored in
the session cookie jar, and affect follow-up session requests.

> ###### NOTE
>
> While you can use [requestCookies](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/requestCookies) and
> [responseCookies](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/responseCookies) on requests using a session, those
> parameters only affect the specific request where they are set; they do not
> affect the session cookie jar. You cannot manually set cookies on the
> session cookie jar or read the contents of the session cookie jar.
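Because the session cookie jar cannot be read or written, explicit control over cookies has to happen per request. The sketch below shows one possible way to carry cookies from one response into the next query. It assumes that each entry of `responseCookies` is an object with at least `name`, `value`, and `domain` keys, and that `requestCookies` accepts objects with those same keys; the helper names are hypothetical.

```python
def to_request_cookies(response_cookies):
    """Keep only the fields assumed to be needed to send a cookie back."""
    return [
        {"name": c["name"], "value": c["value"], "domain": c["domain"]}
        for c in response_cookies
    ]


def follow_up_query(url, previous_response):
    """Build a query that re-sends the cookies of a previous API response."""
    cookies = to_request_cookies(previous_response.get("responseCookies", []))
    # Ask for response cookies again so that the chain can continue.
    query = {"url": url, "httpResponseBody": True, "responseCookies": True}
    if cookies:
        query["requestCookies"] = cookies
    return query
```

Cookies passed this way apply only to the request that carries them; they do not change the session cookie jar.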

### Response headers

Set the [httpResponseHeaders](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpResponseHeaders) request field to `true` to get
HTTP response headers in the [httpResponseHeaders](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/httpResponseHeaders) response
field.

#### Example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://toscrape.com"},
    {"httpResponseHeaders", true}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var headerEnumerator = data.RootElement.GetProperty("httpResponseHeaders").EnumerateArray();
var headers = new Dictionary<string, string>();
while (headerEnumerator.MoveNext())
{
    headers.Add(
        headerEnumerator.Current.GetProperty("name").ToString(),
        headerEnumerator.Current.GetProperty("value").ToString()
    );
}
```

#### CLI client

input.jsonl
```json
{"url": "https://toscrape.com", "httpResponseHeaders": true}
```

```shell
zyte-api input.jsonl \
    | jq .httpResponseHeaders
```

#### curl

input.json
```json
{
    "url": "https://toscrape.com",
    "httpResponseHeaders": true
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq .httpResponseHeaders
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonArray;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of("url", "https://toscrape.com", "httpResponseHeaders", true);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          JsonArray httpResponseHeaders = jsonObject.get("httpResponseHeaders").getAsJsonArray();
          Gson gson = new GsonBuilder().setPrettyPrinting().create();
          System.out.println(gson.toJson(httpResponseHeaders));
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com',
    httpResponseHeaders: true
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseHeaders = response.data.httpResponseHeaders
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com',
        'httpResponseHeaders' => true,
    ],
]);
$api = json_decode($response->getBody());
$http_response_headers = $api->httpResponseHeaders;
```

#### Proxy mode

In proxy mode, response headers are always included in the HTTP response,
so you do not need to request them explicitly.

#### Python

```python
import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://toscrape.com",
        "httpResponseHeaders": True,
    },
)
http_response_headers = api_response.json()["httpResponseHeaders"]
```

#### Python client

```python
import asyncio
import json

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://toscrape.com",
            "httpResponseHeaders": True,
        }
    )
    http_response_headers = api_response["httpResponseHeaders"]
    print(json.dumps(http_response_headers, indent=2))

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class ToScrapeComSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request(
            "https://toscrape.com",
            meta={
                "zyte_api_automap": {
                    "httpResponseBody": False,
                    "httpResponseHeaders": True,
                },
            },
        )

    def parse(self, response):
        headers = response.headers
```

> ###### NOTE
>
> In transparent mode,
> [httpResponseHeaders](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpResponseHeaders) is sent by default for
> httpResponseBody requests, but sending it
> explicitly is still recommended, as future versions of
> scrapy-zyte-api may stop sending it
> by default.

Output (first 5 lines):

```json
[
  {
    "name": "date",
    "value": "Fri, 25 Aug 2023 07:08:05 GMT"
  },
```
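As the output shows, `httpResponseHeaders` is a list of objects with `name` and `value` keys, not a mapping. The following minimal sketch groups such a list into a dictionary for lookups; the function name is hypothetical, and lowercasing the names is a design choice made here because HTTP header names are case-insensitive.

```python
from collections import defaultdict


def group_headers(http_response_headers):
    """Map lowercased header names to the list of their values."""
    grouped = defaultdict(list)
    for header in http_response_headers:
        # A header name can appear more than once, so keep every value.
        grouped[header["name"].lower()].append(header["value"])
    return dict(grouped)
```

For example, `group_headers(headers).get("date")` returns a list with the value of every `date` header, or `None` if there is none.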

> ###### NOTE
>
> Reading cookies from `Set-Cookie` response headers is not
> recommended, because those headers only contain the cookies set by the final
> response and do not account for cookies set during redirection or during
> browser rendering. Use [responseCookies](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/responseCookies)
> instead, as described in the cookies documentation.

### Metadata

Set the [echoData](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/echoData) request field to an arbitrary value to get
that value back verbatim in the [echoData](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/echoData) response field.
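A common use of this field is to match responses that arrive out of order back to the inputs that produced them. The sketch below illustrates that idea under stated assumptions: the function name is hypothetical, responses are already-parsed JSON objects, and the `echoData` values are hashable (for example, integers or strings) so that they can be used as dictionary keys.

```python
def match_by_echo_data(inputs, responses):
    """Pair each (url, index) input with the response echoing that index.

    Inputs without a matching response are paired with None.
    """
    by_echo = {response["echoData"]: response for response in responses}
    return [(url, by_echo.get(index)) for url, index in inputs]
```

If you set `echoData` to a JSON object or array instead, you would need a different lookup strategy, since those values are not hashable in Python.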

#### Example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

var inputData = new List<List<object>>()
{
    new List<object>(){"https://toscrape.com", 1},
    new List<object>(){"https://books.toscrape.com", 2},
    new List<object>(){"https://quotes.toscrape.com", 3},
};
var output = new List<HttpResponseMessage>();

var handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All,
    MaxConnectionsPerServer = 15
};
var client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var responseTasks = new List<Task<HttpResponseMessage>>();
foreach (var entry in inputData)
{
    var input = new Dictionary<string, object>(){
        {"url", entry[0]},
        {"browserHtml", true},
        {"echoData", entry[1]}
    };
    var inputJson = JsonSerializer.Serialize(input);
    var content = new StringContent(inputJson, Encoding.UTF8, "application/json");
    var responseTask = client.PostAsync("https://api.zyte.com/v1/extract", content);
    responseTasks.Add(responseTask);
}

while (responseTasks.Any())
{
    var responseTask = await Task.WhenAny(responseTasks);
    responseTasks.Remove(responseTask);
    var response = await responseTask;
    output.Add(response);
}
```

#### CLI client

input.jsonl
```json
{"url": "https://toscrape.com", "browserHtml": true, "echoData": 1}
{"url": "https://books.toscrape.com", "browserHtml": true, "echoData": 2}
{"url": "https://quotes.toscrape.com", "browserHtml": true, "echoData": 3}
```

```shell
zyte-api --n-conn 15 input.jsonl -o output.jsonl
```

#### curl

input.jsonl
```json
{"url": "https://toscrape.com", "browserHtml": true, "echoData": 1}
{"url": "https://books.toscrape.com", "browserHtml": true, "echoData": 2}
{"url": "https://quotes.toscrape.com", "browserHtml": true, "echoData": 3}
```

```shell
cat input.jsonl \
    | xargs -P 15 -d\\n -n 1 \
    bash -c "
        curl \
            --user $ZYTE_API_KEY: \
            --header 'Content-Type: application/json' \
            --data \"\$0\" \
            --compressed \
            https://api.zyte.com/v1/extract \
        | jq .echoData \
        | awk '{print \$1}' \
        >> output.jsonl
"
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import org.apache.hc.client5.http.async.methods.SimpleHttpRequest;
import org.apache.hc.client5.http.async.methods.SimpleHttpResponse;
import org.apache.hc.client5.http.impl.async.CloseableHttpAsyncClient;
import org.apache.hc.client5.http.impl.async.HttpAsyncClients;
import org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager;
import org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManagerBuilder;
import org.apache.hc.client5.http.ssl.ClientTlsStrategyBuilder;
import org.apache.hc.core5.concurrent.FutureCallback;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.nio.ssl.TlsStrategy;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws ExecutionException, InterruptedException, IOException, ParseException {

    Object[][] input = {
      {"https://toscrape.com", 1},
      {"https://books.toscrape.com", 2},
      {"https://quotes.toscrape.com", 3}
    };
    List<Future> futures = new ArrayList<Future>();
    List<String> output = new ArrayList<String>();

    int concurrency = 15;

    // https://issues.apache.org/jira/browse/HTTPCLIENT-2219
    final TlsStrategy tlsStrategy = ClientTlsStrategyBuilder.create().useSystemProperties().build();

    PoolingAsyncClientConnectionManager connectionManager =
        PoolingAsyncClientConnectionManagerBuilder.create().setTlsStrategy(tlsStrategy).build();
    connectionManager.setMaxTotal(concurrency);
    connectionManager.setDefaultMaxPerRoute(concurrency);

    CloseableHttpAsyncClient client =
        HttpAsyncClients.custom().setConnectionManager(connectionManager).build();
    try {
      client.start();
      for (int i = 0; i < input.length; i++) {
        Map<String, Object> parameters =
            ImmutableMap.of("url", input[i][0], "browserHtml", true, "echoData", input[i][1]);
        String requestBody = new Gson().toJson(parameters);

        SimpleHttpRequest request =
            new SimpleHttpRequest("POST", "https://api.zyte.com/v1/extract");
        request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
        request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
        request.setBody(requestBody, ContentType.APPLICATION_JSON);

        final Future<SimpleHttpResponse> future =
            client.execute(
                request,
                new FutureCallback<SimpleHttpResponse>() {
                  public void completed(final SimpleHttpResponse response) {
                    String apiResponse = response.getBodyText();
                    output.add(apiResponse);
                  }

                  public void failed(final Exception ex) {}

                  public void cancelled() {}
                });
        futures.add(future);
      }
      for (int i = 0; i < futures.size(); i++) {
        futures.get(i).get();
      }
    } finally {
      client.close();
    }
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const { ConcurrencyManager } = require('axios-concurrency')
const axios = require('axios')

const urls = [
  ['https://toscrape.com', 1],
  ['https://books.toscrape.com', 2],
  ['https://quotes.toscrape.com', 3]
]
const output = []

const client = axios.create()
ConcurrencyManager(client, 15)

Promise.all(
  urls.map((input) =>
    client.post(
      'https://api.zyte.com/v1/extract',
      { url: input[0], browserHtml: true, echoData: input[1] },
      {
        auth: { username: 'YOUR_ZYTE_API_KEY' }
      }
    ).then((response) => output.push(response.data))
  )
)
```

#### PHP

```php
<?php

$input = [
    ['https://toscrape.com', 1],
    ['https://books.toscrape.com', 2],
    ['https://quotes.toscrape.com', 3],
];
$output = [];
$promises = [];

$client = new GuzzleHttp\Client();

foreach ($input as $url_and_index) {
    $options = [
        'auth' => ['YOUR_ZYTE_API_KEY', ''],
        'headers' => ['Accept-Encoding' => 'gzip'],
        'json' => [
            'url' => $url_and_index[0],
            'browserHtml' => true,
            'echoData' => $url_and_index[1],
        ],
    ];
    $request = new \GuzzleHttp\Psr7\Request('POST', 'https://api.zyte.com/v1/extract');
    global $promises;
    $promises[] = $client->sendAsync($request, $options)->then(function ($response) {
        global $output;
        $output[] = json_decode($response->getBody());
    });
}

foreach ($promises as $promise) {
    $promise->wait();
}
```

#### Proxy mode

In proxy mode, you cannot set request metadata.

#### Python

```python
import asyncio

import aiohttp

input_data = [
    ("https://toscrape.com", 1),
    ("https://books.toscrape.com", 2),
    ("https://quotes.toscrape.com", 3),
]
output = []

async def extract(client, url, index):
    response = await client.post(
        "https://api.zyte.com/v1/extract",
        json={"url": url, "browserHtml": True, "echoData": index},
        auth=aiohttp.BasicAuth("YOUR_ZYTE_API_KEY"),
    )
    output.append(await response.json())

async def main():
    connector = aiohttp.TCPConnector(limit_per_host=15)
    async with aiohttp.ClientSession(connector=connector) as client:
        await asyncio.gather(
            *[extract(client, url, index) for url, index in input_data]
        )

asyncio.run(main())
```

#### Python client

```python
import asyncio
import json

from zyte_api import AsyncZyteAPI

input_data = [
    ("https://toscrape.com", 1),
    ("https://books.toscrape.com", 2),
    ("https://quotes.toscrape.com", 3),
]

async def main():
    client = AsyncZyteAPI(n_conn=15)
    queries = [
        {"url": url, "browserHtml": True, "echoData": index}
        for url, index in input_data
    ]
    async with client.session() as session:
        for future in session.iter(queries):
            response = await future
            print(json.dumps(response))

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

input_data = [
    ("https://toscrape.com", 1),
    ("https://books.toscrape.com", 2),
    ("https://quotes.toscrape.com", 3),
]

class ToScrapeSpider(Spider):
    name = "toscrape_com"

    custom_settings = {
        "CONCURRENT_REQUESTS": 15,
        "CONCURRENT_REQUESTS_PER_DOMAIN": 15,
    }

    async def start(self):
        for url, index in input_data:
            yield Request(
                url,
                meta={
                    "zyte_api_automap": {
                        "browserHtml": True,
                        "echoData": index,
                    },
                },
            )

    def parse(self, response):
        yield {
            "index": response.raw_api_response["echoData"],
            "html": response.text,
        }
```

Alternatively, you can use Scrapy’s `Request.cb_kwargs` directly for a
similar purpose:

```python

    async def start(self):
        for url, index in input_data:
            yield Request(
                url,
                cb_kwargs={"index": index},
                meta={
                    "zyte_api_automap": {
                        "browserHtml": True,
                    },
                },
            )

    def parse(self, response, index):
        yield {
            "index": index,
            "html": response.text,
        }

```

Output:

```json
{"url": "https://quotes.toscrape.com/", "statusCode": 200, "browserHtml": "<!DOCTYPE html><html lang=\"en\"><head>\n\t<meta charset=\"UTF-8\">\n\t<title>Quotes to Scrape</title>\n    <link rel=\"stylesheet\" href=\"/static/bootstrap.min.css\">\n    <link rel=\"stylesheet\" href=\"/static/main.css\">\n</head>\n<body>\n    <div class=\"container\">\n        <div class=\"row header-box\">\n            <div class=\"col-md-8\">\n                <h1>\n                    <a href=\"/\" style=\"text-decoration: none\">Quotes to Scrape</a>\n                </h1>\n            </div>\n            <div class=\"col-md-4\">\n                <p>\n                \n                    <a href=\"/login\">Login</a>\n                \n                </p>\n            </div>\n        </div>\n    \n\n<div class=\"row\">\n    <div class=\"col-md-8\">\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“The world as we have created it is a process of our thinking. 
It cannot be changed without changing our thinking.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">Albert Einstein</small>\n        <a href=\"/author/Albert-Einstein\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"change,deep-thoughts,thinking,world\"> \n            \n            <a class=\"tag\" href=\"/tag/change/page/1/\">change</a>\n            \n            <a class=\"tag\" href=\"/tag/deep-thoughts/page/1/\">deep-thoughts</a>\n            \n            <a class=\"tag\" href=\"/tag/thinking/page/1/\">thinking</a>\n            \n            <a class=\"tag\" href=\"/tag/world/page/1/\">world</a>\n            \n        </div>\n    </div>\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“It is our choices, Harry, that show what we truly are, far more than our abilities.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">J.K. Rowling</small>\n        <a href=\"/author/J-K-Rowling\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"abilities,choices\"> \n            \n            <a class=\"tag\" href=\"/tag/abilities/page/1/\">abilities</a>\n            \n            <a class=\"tag\" href=\"/tag/choices/page/1/\">choices</a>\n            \n        </div>\n    </div>\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“There are only two ways to live your life. One is as though nothing is a miracle. 
The other is as though everything is a miracle.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">Albert Einstein</small>\n        <a href=\"/author/Albert-Einstein\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"inspirational,life,live,miracle,miracles\"> \n            \n            <a class=\"tag\" href=\"/tag/inspirational/page/1/\">inspirational</a>\n            \n            <a class=\"tag\" href=\"/tag/life/page/1/\">life</a>\n            \n            <a class=\"tag\" href=\"/tag/live/page/1/\">live</a>\n            \n            <a class=\"tag\" href=\"/tag/miracle/page/1/\">miracle</a>\n            \n            <a class=\"tag\" href=\"/tag/miracles/page/1/\">miracles</a>\n            \n        </div>\n    </div>\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">Jane Austen</small>\n        <a href=\"/author/Jane-Austen\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"aliteracy,books,classic,humor\"> \n            \n            <a class=\"tag\" href=\"/tag/aliteracy/page/1/\">aliteracy</a>\n            \n            <a class=\"tag\" href=\"/tag/books/page/1/\">books</a>\n            \n            <a class=\"tag\" href=\"/tag/classic/page/1/\">classic</a>\n            \n            <a class=\"tag\" href=\"/tag/humor/page/1/\">humor</a>\n            \n        </div>\n    </div>\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“Imperfection is beauty, madness is genius and it's better to be 
absolutely ridiculous than absolutely boring.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">Marilyn Monroe</small>\n        <a href=\"/author/Marilyn-Monroe\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"be-yourself,inspirational\"> \n            \n            <a class=\"tag\" href=\"/tag/be-yourself/page/1/\">be-yourself</a>\n            \n            <a class=\"tag\" href=\"/tag/inspirational/page/1/\">inspirational</a>\n            \n        </div>\n    </div>\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“Try not to become a man of success. Rather become a man of value.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">Albert Einstein</small>\n        <a href=\"/author/Albert-Einstein\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"adulthood,success,value\"> \n            \n            <a class=\"tag\" href=\"/tag/adulthood/page/1/\">adulthood</a>\n            \n            <a class=\"tag\" href=\"/tag/success/page/1/\">success</a>\n            \n            <a class=\"tag\" href=\"/tag/value/page/1/\">value</a>\n            \n        </div>\n    </div>\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“It is better to be hated for what you are than to be loved for what you are not.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">André Gide</small>\n        <a href=\"/author/Andre-Gide\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"life,love\"> \n            \n            <a class=\"tag\" 
href=\"/tag/life/page/1/\">life</a>\n            \n            <a class=\"tag\" href=\"/tag/love/page/1/\">love</a>\n            \n        </div>\n    </div>\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“I have not failed. I've just found 10,000 ways that won't work.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">Thomas A. Edison</small>\n        <a href=\"/author/Thomas-A-Edison\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"edison,failure,inspirational,paraphrased\"> \n            \n            <a class=\"tag\" href=\"/tag/edison/page/1/\">edison</a>\n            \n            <a class=\"tag\" href=\"/tag/failure/page/1/\">failure</a>\n            \n            <a class=\"tag\" href=\"/tag/inspirational/page/1/\">inspirational</a>\n            \n            <a class=\"tag\" href=\"/tag/paraphrased/page/1/\">paraphrased</a>\n            \n        </div>\n    </div>\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“A woman is like a tea bag; you never know how strong it is until it's in hot water.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">Eleanor Roosevelt</small>\n        <a href=\"/author/Eleanor-Roosevelt\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"misattributed-eleanor-roosevelt\"> \n            \n            <a class=\"tag\" href=\"/tag/misattributed-eleanor-roosevelt/page/1/\">misattributed-eleanor-roosevelt</a>\n            \n        </div>\n    </div>\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“A day without sunshine is like, you know, 
night.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">Steve Martin</small>\n        <a href=\"/author/Steve-Martin\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"humor,obvious,simile\"> \n            \n            <a class=\"tag\" href=\"/tag/humor/page/1/\">humor</a>\n            \n            <a class=\"tag\" href=\"/tag/obvious/page/1/\">obvious</a>\n            \n            <a class=\"tag\" href=\"/tag/simile/page/1/\">simile</a>\n            \n        </div>\n    </div>\n\n    <nav>\n        <ul class=\"pager\">\n            \n            \n            <li class=\"next\">\n                <a href=\"/page/2/\">Next <span aria-hidden=\"true\">→</span></a>\n            </li>\n            \n        </ul>\n    </nav>\n    </div>\n    <div class=\"col-md-4 tags-box\">\n        \n            <h2>Top Ten tags</h2>\n            \n            <span class=\"tag-item\">\n            <a class=\"tag\" style=\"font-size: 28px\" href=\"/tag/love/\">love</a>\n            </span>\n            \n            <span class=\"tag-item\">\n            <a class=\"tag\" style=\"font-size: 26px\" href=\"/tag/inspirational/\">inspirational</a>\n            </span>\n            \n            <span class=\"tag-item\">\n            <a class=\"tag\" style=\"font-size: 26px\" href=\"/tag/life/\">life</a>\n            </span>\n            \n            <span class=\"tag-item\">\n            <a class=\"tag\" style=\"font-size: 24px\" href=\"/tag/humor/\">humor</a>\n            </span>\n            \n            <span class=\"tag-item\">\n            <a class=\"tag\" style=\"font-size: 22px\" href=\"/tag/books/\">books</a>\n            </span>\n            \n            <span class=\"tag-item\">\n            <a class=\"tag\" style=\"font-size: 14px\" href=\"/tag/reading/\">reading</a>\n            </span>\n            \n            <span class=\"tag-item\">\n            
<a class=\"tag\" style=\"font-size: 10px\" href=\"/tag/friendship/\">friendship</a>\n            </span>\n            \n            <span class=\"tag-item\">\n            <a class=\"tag\" style=\"font-size: 8px\" href=\"/tag/friends/\">friends</a>\n            </span>\n            \n            <span class=\"tag-item\">\n            <a class=\"tag\" style=\"font-size: 8px\" href=\"/tag/truth/\">truth</a>\n            </span>\n            \n            <span class=\"tag-item\">\n            <a class=\"tag\" style=\"font-size: 6px\" href=\"/tag/simile/\">simile</a>\n            </span>\n            \n        \n    </div>\n</div>\n\n    </div>\n    <footer class=\"footer\">\n        <div class=\"container\">\n            <p class=\"text-muted\">\n                Quotes by: <a href=\"https://www.goodreads.com/quotes\">GoodReads.com</a>\n            </p>\n            <p class=\"copyright\">\n                Made with <span class=\"zyte\">❤</span> by <a class=\"zyte\" href=\"https://www.zyte.com\">Zyte</a>\n            </p>\n        </div>\n    </footer>\n\n</body></html>", "echoData": 3}
{"url": "https://books.toscrape.com/", "statusCode": 200, "browserHtml": "<!DOCTYPE html><!--[if lt IE 7]>      <html lang=\"en-us\" class=\"no-js lt-ie9 lt-ie8 lt-ie7\"> <![endif]--><!--[if IE 7]>         <html lang=\"en-us\" class=\"no-js lt-ie9 lt-ie8\"> <![endif]--><!--[if IE 8]>         <html lang=\"en-us\" class=\"no-js lt-ie9\"> <![endif]--><!--[if gt IE 8]><!--><html lang=\"en-us\" class=\"no-js\"><!--<![endif]--><head>\n        <title>\n    All products | Books to Scrape - Sandbox\n</title>\n\n        <meta http-equiv=\"content-type\" content=\"text/html; charset=UTF-8\">\n        <meta name=\"created\" content=\"24th Jun 2016 09:29\">\n        <meta name=\"description\" content=\"\">\n        <meta name=\"viewport\" content=\"width=device-width\">\n        <meta name=\"robots\" content=\"NOARCHIVE,NOCACHE\">\n\n        <!-- Le HTML5 shim, for IE6-8 support of HTML elements -->\n        <!--[if lt IE 9]>\n        <script src=\"//html5shim.googlecode.com/svn/trunk/html5.js\"></script>\n        <![endif]-->\n\n        \n            <link rel=\"shortcut icon\" href=\"static/oscar/favicon.ico\">\n        \n\n        \n        \n    \n    \n        <link rel=\"stylesheet\" type=\"text/css\" href=\"static/oscar/css/styles.css\">\n    \n    <link rel=\"stylesheet\" href=\"static/oscar/js/bootstrap-datetimepicker/bootstrap-datetimepicker.css\">\n    <link rel=\"stylesheet\" type=\"text/css\" href=\"static/oscar/css/datetimepicker.css\">\n\n\n        \n        \n\n        \n\n        \n            \n            \n\n        \n    </head>\n\n    <body id=\"default\" class=\"default\">\n        \n        \n    \n    \n    <header class=\"header container-fluid\">\n        <div class=\"page_inner\">\n            <div class=\"row\">\n                <div class=\"col-sm-8 h1\"><a href=\"index.html\">Books to Scrape</a><small> We love being scraped!</small>\n</div>\n\n                \n            </div>\n        </div>\n    </header>\n\n    \n    \n<div 
class=\"container-fluid page\">\n    <div class=\"page_inner\">\n        \n    <ul class=\"breadcrumb\">\n        <li>\n            <a href=\"index.html\">Home</a>\n        </li>\n        <li class=\"active\">All products</li>\n    </ul>\n\n        <div class=\"row\">\n\n            <aside class=\"sidebar col-sm-4 col-md-3\">\n                \n                <div id=\"promotions_left\">\n                    \n                </div>\n                \n    \n    \n        \n        <div class=\"side_categories\">\n            <ul class=\"nav nav-list\">\n                \n                    <li>\n                        <a href=\"catalogue/category/books_1/index.html\">\n                            \n                                Books\n                            \n                        </a>\n\n                        <ul>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/travel_2/index.html\">\n                            \n                                Travel\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/mystery_3/index.html\">\n                            \n                                Mystery\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/historical-fiction_4/index.html\">\n                            \n                                Historical Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/sequential-art_5/index.html\">\n                       
     \n                                Sequential Art\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/classics_6/index.html\">\n                            \n                                Classics\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/philosophy_7/index.html\">\n                            \n                                Philosophy\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/romance_8/index.html\">\n                            \n                                Romance\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/womens-fiction_9/index.html\">\n                            \n                                Womens Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/fiction_10/index.html\">\n                            \n                                Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/childrens_11/index.html\">\n                            \n                                Childrens\n         
                   \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/religion_12/index.html\">\n                            \n                                Religion\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/nonfiction_13/index.html\">\n                            \n                                Nonfiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/music_14/index.html\">\n                            \n                                Music\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/default_15/index.html\">\n                            \n                                Default\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/science-fiction_16/index.html\">\n                            \n                                Science Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/sports-and-games_17/index.html\">\n                            \n                                Sports and Games\n                            \n                        
</a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/add-a-comment_18/index.html\">\n                            \n                                Add a comment\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/fantasy_19/index.html\">\n                            \n                                Fantasy\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/new-adult_20/index.html\">\n                            \n                                New Adult\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/young-adult_21/index.html\">\n                            \n                                Young Adult\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/science_22/index.html\">\n                            \n                                Science\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/poetry_23/index.html\">\n                            \n                                Poetry\n                            \n                        </a>\n\n                        </li>\n                        
\n                \n                    <li>\n                        <a href=\"catalogue/category/books/paranormal_24/index.html\">\n                            \n                                Paranormal\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/art_25/index.html\">\n                            \n                                Art\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/psychology_26/index.html\">\n                            \n                                Psychology\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/autobiography_27/index.html\">\n                            \n                                Autobiography\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/parenting_28/index.html\">\n                            \n                                Parenting\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/adult-fiction_29/index.html\">\n                            \n                                Adult Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n        
                <a href=\"catalogue/category/books/humor_30/index.html\">\n                            \n                                Humor\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/horror_31/index.html\">\n                            \n                                Horror\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/history_32/index.html\">\n                            \n                                History\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/food-and-drink_33/index.html\">\n                            \n                                Food and Drink\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/christian-fiction_34/index.html\">\n                            \n                                Christian Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/business_35/index.html\">\n                            \n                                Business\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a 
href=\"catalogue/category/books/biography_36/index.html\">\n                            \n                                Biography\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/thriller_37/index.html\">\n                            \n                                Thriller\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/contemporary_38/index.html\">\n                            \n                                Contemporary\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/spirituality_39/index.html\">\n                            \n                                Spirituality\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/academic_40/index.html\">\n                            \n                                Academic\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/self-help_41/index.html\">\n                            \n                                Self Help\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a 
href=\"catalogue/category/books/historical_42/index.html\">\n                            \n                                Historical\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/christian_43/index.html\">\n                            \n                                Christian\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/suspense_44/index.html\">\n                            \n                                Suspense\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/short-stories_45/index.html\">\n                            \n                                Short Stories\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/novels_46/index.html\">\n                            \n                                Novels\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/health_47/index.html\">\n                            \n                                Health\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/politics_48/index.html\">\n       
                     \n                                Politics\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/cultural_49/index.html\">\n                            \n                                Cultural\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/erotica_50/index.html\">\n                            \n                                Erotica\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/crime_51/index.html\">\n                            \n                                Crime\n                            \n                        </a>\n\n                        </li>\n                        \n                            </ul></li>\n                        \n                \n            </ul>\n        </div>\n    \n    \n\n            </aside>\n\n            <div class=\"col-sm-8 col-md-9\">\n                \n                <div class=\"page-header action\">\n                    <h1>All products</h1>\n                </div>\n                \n\n                \n\n\n\n<div id=\"messages\">\n\n</div>\n\n\n                <div id=\"promotions\">\n                    \n                </div>\n\n                \n    <form method=\"get\" class=\"form-horizontal\">\n        \n        <div style=\"display:none\">\n            \n            \n        </div>\n\n        \n            \n                \n                    <strong>1000</strong> results - showing <strong>1</strong> to <strong>20</strong>.\n                \n            \n       
     \n        \n    </form>\n    \n        <section>\n            <div class=\"alert alert-warning\" role=\"alert\"><strong>Warning!</strong> This is a demo website for web scraping purposes. Prices and ratings here were randomly assigned and have no real meaning.</div>\n\n            <div>\n                <ol class=\"row\">\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/a-light-in-the-attic_1000/index.html\"><img src=\"media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg\" alt=\"A Light in the Attic\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/a-light-in-the-attic_1000/index.html\" title=\"A Light in the Attic\">A Light in the ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£51.77</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 
col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/tipping-the-velvet_999/index.html\"><img src=\"media/cache/26/0c/260c6ae16bce31c8f8c95daddd9f4a1c.jpg\" alt=\"Tipping the Velvet\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/tipping-the-velvet_999/index.html\" title=\"Tipping the Velvet\">Tipping the Velvet</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£53.74</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/soumission_998/index.html\"><img src=\"media/cache/3e/ef/3eef99c9d9adef34639f510662022830.jpg\" alt=\"Soumission\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n        
        <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/soumission_998/index.html\" title=\"Soumission\">Soumission</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£50.10</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/sharp-objects_997/index.html\"><img src=\"media/cache/32/51/3251cf3a3412f53f339e42cac2134093.jpg\" alt=\"Sharp Objects\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/sharp-objects_997/index.html\" title=\"Sharp Objects\">Sharp Objects</a></h3>\n        \n\n      
  \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£47.82</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/sapiens-a-brief-history-of-humankind_996/index.html\"><img src=\"media/cache/be/a5/bea5697f2534a2f86a3ef27b5a8c12a6.jpg\" alt=\"Sapiens: A Brief History of Humankind\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/sapiens-a-brief-history-of-humankind_996/index.html\" title=\"Sapiens: A Brief History of Humankind\">Sapiens: A Brief History ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£54.23</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn 
btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/the-requiem-red_995/index.html\"><img src=\"media/cache/68/33/68339b4c9bc034267e1da611ab3b34f8.jpg\" alt=\"The Requiem Red\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/the-requiem-red_995/index.html\" title=\"The Requiem Red\">The Requiem Red</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£22.65</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a 
href=\"catalogue/the-dirty-little-secrets-of-getting-your-dream-job_994/index.html\"><img src=\"media/cache/92/27/92274a95b7c251fea59a2b8a78275ab4.jpg\" alt=\"The Dirty Little Secrets of Getting Your Dream Job\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/the-dirty-little-secrets-of-getting-your-dream-job_994/index.html\" title=\"The Dirty Little Secrets of Getting Your Dream Job\">The Dirty Little Secrets ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£33.34</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/the-coming-woman-a-novel-based-on-the-life-of-the-infamous-feminist-victoria-woodhull_993/index.html\"><img src=\"media/cache/3d/54/3d54940e57e662c4dd1f3ff00c78cc64.jpg\" alt=\"The Coming Woman: A Novel Based on the Life of the Infamous Feminist, Victoria Woodhull\" 
class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/the-coming-woman-a-novel-based-on-the-life-of-the-infamous-feminist-victoria-woodhull_993/index.html\" title=\"The Coming Woman: A Novel Based on the Life of the Infamous Feminist, Victoria Woodhull\">The Coming Woman: A ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£17.93</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/the-boys-in-the-boat-nine-americans-and-their-epic-quest-for-gold-at-the-1936-berlin-olympics_992/index.html\"><img src=\"media/cache/66/88/66883b91f6804b2323c8369331cb7dd1.jpg\" alt=\"The Boys in the Boat: Nine Americans and Their Epic Quest for Gold at the 1936 Berlin Olympics\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p 
class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/the-boys-in-the-boat-nine-americans-and-their-epic-quest-for-gold-at-the-1936-berlin-olympics_992/index.html\" title=\"The Boys in the Boat: Nine Americans and Their Epic Quest for Gold at the 1936 Berlin Olympics\">The Boys in the ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£22.60</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/the-black-maria_991/index.html\"><img src=\"media/cache/58/46/5846057e28022268153beff6d352b06c.jpg\" alt=\"The Black Maria\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                
</p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/the-black-maria_991/index.html\" title=\"The Black Maria\">The Black Maria</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£52.15</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/starving-hearts-triangular-trade-trilogy-1_990/index.html\"><img src=\"media/cache/be/f4/bef44da28c98f905a3ebec0b87be8530.jpg\" alt=\"Starving Hearts (Triangular Trade Trilogy, #1)\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Two\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/starving-hearts-triangular-trade-trilogy-1_990/index.html\" title=\"Starving Hearts (Triangular Trade Trilogy, #1)\">Starving Hearts (Triangular Trade ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£13.99</p>\n    
\n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/shakespeares-sonnets_989/index.html\"><img src=\"media/cache/10/48/1048f63d3b5061cd2f424d20b3f9b666.jpg\" alt=\"Shakespeare's Sonnets\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/shakespeares-sonnets_989/index.html\" title=\"Shakespeare's Sonnets\">Shakespeare's Sonnets</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£20.66</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n             
           <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/set-me-free_988/index.html\"><img src=\"media/cache/5b/88/5b88c52633f53cacf162c15f4f823153.jpg\" alt=\"Set Me Free\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/set-me-free_988/index.html\" title=\"Set Me Free\">Set Me Free</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£17.46</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/scott-pilgrims-precious-little-life-scott-pilgrim-1_987/index.html\"><img src=\"media/cache/94/b1/94b1b8b244bce9677c2f29ccc890d4d2.jpg\" alt=\"Scott Pilgrim's Precious Little Life (Scott Pilgrim #1)\" class=\"thumbnail\"></a>\n  
                  \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/scott-pilgrims-precious-little-life-scott-pilgrim-1_987/index.html\" title=\"Scott Pilgrim's Precious Little Life (Scott Pilgrim #1)\">Scott Pilgrim's Precious Little ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£52.29</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/rip-it-up-and-start-again_986/index.html\"><img src=\"media/cache/81/c4/81c4a973364e17d01f217e1188253d5e.jpg\" alt=\"Rip it Up and Start Again\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n  
                  <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/rip-it-up-and-start-again_986/index.html\" title=\"Rip it Up and Start Again\">Rip it Up and ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£35.02</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/our-band-could-be-your-life-scenes-from-the-american-indie-underground-1981-1991_985/index.html\"><img src=\"media/cache/54/60/54607fe8945897cdcced0044103b10b6.jpg\" alt=\"Our Band Could Be Your Life: Scenes from the American Indie Underground, 1981-1991\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/our-band-could-be-your-life-scenes-from-the-american-indie-underground-1981-1991_985/index.html\" title=\"Our Band Could Be Your Life: Scenes from the American 
Indie Underground, 1981-1991\">Our Band Could Be ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£57.25</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/olio_984/index.html\"><img src=\"media/cache/55/33/553310a7162dfbc2c6d19a84da0df9e1.jpg\" alt=\"Olio\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/olio_984/index.html\" title=\"Olio\">Olio</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£23.88</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to 
basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/mesaerion-the-best-science-fiction-stories-1800-1849_983/index.html\"><img src=\"media/cache/09/a3/09a3aef48557576e1a85ba7efea8ecb7.jpg\" alt=\"Mesaerion: The Best Science Fiction Stories 1800-1849\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/mesaerion-the-best-science-fiction-stories-1800-1849_983/index.html\" title=\"Mesaerion: The Best Science Fiction Stories 1800-1849\">Mesaerion: The Best Science ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£37.59</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div 
class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/libertarianism-for-beginners_982/index.html\"><img src=\"media/cache/0b/bc/0bbcd0a6f4bcd81ccb1049a52736406e.jpg\" alt=\"Libertarianism for Beginners\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Two\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/libertarianism-for-beginners_982/index.html\" title=\"Libertarianism for Beginners\">Libertarianism for Beginners</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£51.33</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/its-only-the-himalayas_981/index.html\"><img src=\"media/cache/27/a5/27a53d0bb95bdd88288eaf66c9230d7e.jpg\" alt=\"It's Only the Himalayas\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p 
class=\"star-rating Two\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/its-only-the-himalayas_981/index.html\" title=\"It's Only the Himalayas\">It's Only the Himalayas</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£45.17</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                </ol>\n                \n\n\n\n    <div>\n        <ul class=\"pager\">\n            \n            <li class=\"current\">\n            \n                Page 1 of 50\n            \n            </li>\n            \n                <li class=\"next\"><a href=\"catalogue/page-2.html\">next</a></li>\n            \n        </ul>\n    </div>\n\n\n            </div>\n        </section>\n    \n\n\n            </div>\n\n        </div><!-- /row -->\n    </div><!-- /page_inner -->\n</div><!-- /container-fluid -->\n\n\n    \n<footer class=\"footer container-fluid\">\n    \n        \n    \n</footer>\n\n\n        \n        \n  \n            <!-- jQuery -->\n            <script src=\"http://ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js\"></script>\n            <script>window.jQuery || document.write('<script src=\"static/oscar/js/jquery/jquery-1.9.1.min.js\"><\\/script>')</script><script 
src=\"static/oscar/js/jquery/jquery-1.9.1.min.js\"></script>\n        \n  \n\n\n        \n        \n    \n        \n    <script type=\"text/javascript\" src=\"static/oscar/js/bootstrap3/bootstrap.min.js\"></script>\n    <!-- Oscar -->\n    <script src=\"static/oscar/js/oscar/ui.js\" type=\"text/javascript\" charset=\"utf-8\"></script>\n\n    <script src=\"static/oscar/js/bootstrap-datetimepicker/bootstrap-datetimepicker.js\" type=\"text/javascript\" charset=\"utf-8\"></script>\n    <script src=\"static/oscar/js/bootstrap-datetimepicker/locales/bootstrap-datetimepicker.all.js\" type=\"text/javascript\" charset=\"utf-8\"></script>\n\n\n        \n        \n    \n\n    \n\n\n        \n        <script type=\"text/javascript\">\n            $(function() {\n                \n    \n    \n    oscar.init();\n\n    oscar.search.init();\n\n            });\n        </script>\n\n        \n        <!-- Version: N/A -->\n        \n    \n\n</body></html>", "echoData": 2}
{"url": "https://toscrape.com/", "statusCode": 200, "browserHtml": "<!DOCTYPE html><html lang=\"en\"><head>\n        <meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">\n        <title>Scraping Sandbox</title>\n        <link href=\"./css/bootstrap.min.css\" rel=\"stylesheet\">\n        <link href=\"./css/main.css\" rel=\"stylesheet\">\n    </head>\n    <body>\n        <div class=\"container\">\n            <div class=\"row\">\n                <div class=\"col-md-1\"></div>\n                <div class=\"col-md-10 well\">\n                    <img class=\"logo\" src=\"img/zyte.png\" width=\"200px\">\n                    <h1 class=\"text-right\">Web Scraping Sandbox</h1>\n                </div>\n            </div>\n\n            <div class=\"row\">\n                <div class=\"col-md-1\"></div>\n                <div class=\"col-md-10\">\n                    <h2>Books</h2>\n                    <p>A <a href=\"http://books.toscrape.com\">fictional bookstore</a> that desperately wants to be scraped. It's a safe place for beginners learning web scraping and for developers validating their scraping technologies as well. 
Available at: <a href=\"http://books.toscrape.com\">books.toscrape.com</a></p>\n                    <div class=\"col-md-6\">\n                        <a href=\"http://books.toscrape.com\"><img src=\"./img/books.png\" class=\"img-thumbnail\"></a>\n                    </div>\n                    <div class=\"col-md-6\">\n                        <table class=\"table table-hover\">\n                            <tbody><tr><th colspan=\"2\">Details</th></tr>\n                            <tr><td>Amount of items </td><td>1000</td></tr>\n                            <tr><td>Pagination </td><td>✔</td></tr>\n                            <tr><td>Items per page </td><td>max 20</td></tr>\n                            <tr><td>Requires JavaScript </td><td>✘</td></tr>\n                        </tbody></table>\n                    </div>\n                </div>\n            </div>\n\n            <div class=\"row\">\n                <div class=\"col-md-1\"></div>\n                <div class=\"col-md-10\">\n                    <h2>Quotes</h2>\n                    <p><a href=\"http://quotes.toscrape.com/\">A website</a> that lists quotes from famous people. 
It has many endpoints showing the quotes in many different ways, each of them including new scraping challenges for you, as described below.</p>\n                    <div class=\"col-md-6\">\n                        <a href=\"http://quotes.toscrape.com\"><img src=\"./img/quotes.png\" class=\"img-thumbnail\"></a>\n                    </div>\n                    <div class=\"col-md-6\">\n                        <table class=\"table table-hover\">\n                            <tbody><tr><th colspan=\"2\">Endpoints</th></tr>\n                            <tr><td><a href=\"http://quotes.toscrape.com/\">Default</a></td><td>Microdata and pagination</td></tr>\n                            <tr><td><a href=\"http://quotes.toscrape.com/scroll\">Scroll</a> </td><td>infinite scrolling pagination</td></tr>\n                            <tr><td><a href=\"http://quotes.toscrape.com/js\">JavaScript</a> </td><td>JavaScript generated content</td></tr>\n                            <tr><td><a href=\"http://quotes.toscrape.com/js-delayed\">Delayed</a> </td><td>Same as JavaScript but with a delay (?delay=10000)</td></tr>\n                            <tr><td><a href=\"http://quotes.toscrape.com/tableful\">Tableful</a> </td><td>a table based messed-up layout</td></tr>\n                            <tr><td><a href=\"http://quotes.toscrape.com/login\">Login</a> </td><td>login with CSRF token (any user/passwd works)</td></tr>\n                            <tr><td><a href=\"http://quotes.toscrape.com/search.aspx\">ViewState</a> </td><td>an AJAX based filter form with ViewStates</td></tr>\n                            <tr><td><a href=\"http://quotes.toscrape.com/random\">Random</a> </td><td>a single random quote</td></tr>\n                        </tbody></table>\n                    </div>\n                </div>\n            </div>\n        </div>\n    \n\n</body></html>", "echoData": 1}
```

### Permissions control

By default, Zyte API may use a range of techniques to avoid bans.

You can [open a support request](https://support.zyte.com/support/tickets/new) to
change which of the following features Zyte API may use for your account:

- **CAPTCHA management** (default: enabled)
- **Device residential IPs** (default: enabled)

  Disabling this limits your requests to data center IPs, ensuring transparency to website owners. However, it
  also disables the use of extended geolocations.

  If you only want to disable device residential IPs for *some* requests, set
  [ipType](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/ipType) to `datacenter` on those requests instead.

> ###### NOTE
>
> Disabling either option, or both, may increase the rate of ban
> responses for some websites.
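
As a minimal sketch of the per-request alternative mentioned above, the following builds an HTTP API request that sets `ipType` to `datacenter`. It assumes the `https://api.zyte.com/v1/extract` endpoint and Basic authentication with the API key as the user name and an empty password; the helper name `build_datacenter_request` is hypothetical, and the request is only built, not sent.

```python
import base64
import json
import urllib.request

# Assumed Zyte API extraction endpoint; check the API reference.
API_URL = "https://api.zyte.com/v1/extract"


def build_datacenter_request(api_key: str, url: str) -> urllib.request.Request:
    """Build (but do not send) a request limited to data center IPs."""
    payload = {
        "url": url,
        "httpResponseBody": True,
        # Per-request setting: use only data center IPs for this request.
        "ipType": "datacenter",
    }
    # Basic auth: API key as the user name, empty password.
    token = base64.b64encode(f"{api_key}:".encode()).decode()
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen()` would send the request. Other requests on the same account are unaffected, because the setting lives in the body of this one request.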

## Zyte API proxy mode

To use Zyte API as a proxy, use the `api.zyte.com:8011` endpoint, with your
[Zyte API key](https://app.zyte.com/o/zyte-api/api-access) and proxy
headers:

#### C#

```cs
using System;
using System.Net;
using System.Net.Http;

var proxy = new WebProxy("http://api.zyte.com:8011", true);
proxy.Credentials = new NetworkCredential("YOUR_ZYTE_API_KEY", "");

var httpClientHandler = new HttpClientHandler
{
    Proxy = proxy,
};

var client = new HttpClient(handler: httpClientHandler, disposeHandler: true);
var message = new HttpRequestMessage(HttpMethod.Get, "https://toscrape.com");
var response = await client.SendAsync(message);
var body = await response.Content.ReadAsStringAsync();

Console.WriteLine(body);
```

#### curl

```bash
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    https://toscrape.com
```

#### Java

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.hc.client5.http.auth.AuthCache;
import org.apache.hc.client5.http.auth.AuthScope;
import org.apache.hc.client5.http.auth.CredentialsProvider;
import org.apache.hc.client5.http.classic.methods.HttpGet;
import org.apache.hc.client5.http.impl.auth.BasicAuthCache;
import org.apache.hc.client5.http.impl.auth.BasicScheme;
import org.apache.hc.client5.http.impl.auth.CredentialsProviderBuilder;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.client5.http.impl.routing.DefaultProxyRoutePlanner;
import org.apache.hc.client5.http.protocol.HttpClientContext;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHost;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;

class Example {
  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {

    HttpHost proxy = new HttpHost("api.zyte.com", 8011);
    DefaultProxyRoutePlanner routePlanner = new DefaultProxyRoutePlanner(proxy);
    CredentialsProvider credentialsProvider =
        CredentialsProviderBuilder.create()
            .add(new AuthScope(proxy), "YOUR_ZYTE_API_KEY", "".toCharArray())
            .build();

    AuthCache authCache = new BasicAuthCache();
    BasicScheme basicAuth = new BasicScheme();
    authCache.put(proxy, basicAuth);
    HttpClientContext context = HttpClientContext.create();
    context.setCredentialsProvider(credentialsProvider);
    context.setAuthCache(authCache);

    CloseableHttpClient client =
        HttpClients.custom()
            .setRoutePlanner(routePlanner)
            .setDefaultCredentialsProvider(credentialsProvider)
            .build();

    HttpGet request = new HttpGet("https://toscrape.com");
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String httpResponseBody = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          System.out.println(httpResponseBody);
          return null;
        });
  }
}
```

#### JS

```js
const axios = require('axios')

axios
  .get(
    'https://toscrape.com',
    {
      proxy: {
        protocol: 'http',
        host: 'api.zyte.com',
        port: 8011,
        auth: {
          username: 'YOUR_ZYTE_API_KEY',
          password: ''
        }
      }
    }
  )
  .then((response) => {
    const httpResponseBody = response.data
    console.log(httpResponseBody)
  })
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('GET', 'https://toscrape.com', [
    'proxy' => 'http://YOUR_ZYTE_API_KEY:@api.zyte.com:8011',
]);
$http_response_body = (string) $response->getBody();
fwrite(STDOUT, $http_response_body);
```

#### Python

> ###### NOTE
>
> You need to install and configure our CA certificate for
> the requests library.

```python
import requests

response = requests.get(
    "https://toscrape.com",
    proxies={
        scheme: "http://YOUR_ZYTE_API_KEY:@api.zyte.com:8011" for scheme in ("http", "https")
    },
)
http_response_body: bytes = response.content
print(http_response_body.decode())
```

#### Ruby

```ruby
# frozen_string_literal: true

require 'net/http'

url = URI('https://toscrape.com/')
proxy_host = 'api.zyte.com'
proxy_port = '8011'

http = Net::HTTP.new(url.host, url.port, proxy_host, proxy_port, 'YOUR_ZYTE_API_KEY', '')
http.use_ssl = true

r = http.start do |h|
  h.request(Net::HTTP::Get.new(url))
end

puts r.body
```

#### Scrapy

When using [scrapy-zyte-smartproxy](https://github.com/scrapy-plugins/scrapy-zyte-smartproxy), set the `ZYTE_SMARTPROXY_URL`
setting to `"http://api.zyte.com:8011"` and the
`ZYTE_SMARTPROXY_APIKEY` setting to [your Zyte API key](https://app.zyte.com/o/zyte-api/api-access).

> ###### NOTE
>
> **Important**: Use your **Zyte API key** here, not a Scrapy Cloud API key. Make sure you get this from the Zyte API access page.
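
As a rough sketch, the corresponding `settings.py` entries could look as follows. The two `ZYTE_SMARTPROXY_URL` and `ZYTE_SMARTPROXY_APIKEY` values come from the text above; the middleware entry and the `ZYTE_SMARTPROXY_ENABLED` flag are assumptions based on the usual setup of the plugin, so check the scrapy-zyte-smartproxy documentation for your version.

```python
# settings.py (sketch)

# Assumption: the usual way to enable the scrapy-zyte-smartproxy middleware.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_zyte_smartproxy.ZyteSmartProxyMiddleware": 610,
}
ZYTE_SMARTPROXY_ENABLED = True

# From the text above: point the plugin at the proxy mode endpoint and
# authenticate with your Zyte API key (not a Scrapy Cloud API key).
ZYTE_SMARTPROXY_URL = "http://api.zyte.com:8011"
ZYTE_SMARTPROXY_APIKEY = "YOUR_ZYTE_API_KEY"
```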

Then you can continue using Scrapy as usual and all requests will be
proxied through Zyte API automatically.

```python
from scrapy import Spider

class ToScrapeSpider(Spider):
    name = "toscrape_com"
    start_urls = ["https://toscrape.com"]

    def parse(self, response):
        print(response.text)
```

### Key differences

The proxy mode makes it easier to migrate existing code
that uses a proxy service. However, the proxy mode and the HTTP API have some key differences:

| Feature                 | HTTP API     | Proxy mode         |
|-------------------------|--------------|--------------------|
| Parameter definition    | Request body | Request headers    |
| Browser HTML            | Yes          | Yes (new!)         |
| Screenshots             | Yes          | No                 |
| Browser actions         | Yes          | No                 |
| Network capture         | Yes          | No                 |
| Disable JS on browser   | Yes          | No                 |
| Automatic extraction    | Yes          | No                 |
| Server-managed sessions | Yes          | No                 |
| Echo data               | Yes          | No                 |
| Overhead                | Some         | Minimum            |
| Cookie definition       | Multi-domain | Target domain only |

#### Overhead

When using HTTP requests, the HTTP API introduces some
overhead in responses due mainly to the base64-encoding of
[httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/httpResponseBody), increasing network traffic and latency, and
requiring base64-decoding on your end.

In contrast, with proxy mode the only overhead you get is some additional
response headers.
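
To get a feel for the base64 overhead, the following minimal sketch compares the size of a raw body with its base64-encoded form. The sample body is made up for illustration, and actual sizes on the wire also depend on response compression.

```python
import base64

# Hypothetical response body, used only to illustrate the size difference.
raw_body = b"<html><body>" + b"x" * 3000 + b"</body></html>"

# The HTTP API returns httpResponseBody base64-encoded inside a JSON response.
encoded_body = base64.b64encode(raw_body)

# Base64 turns every 3 bytes into 4 characters, so the encoded form is
# about one third larger before compression.
overhead = len(encoded_body) / len(raw_body) - 1

# Decoding on your end restores the original bytes.
decoded_body = base64.b64decode(encoded_body)

print(f"raw: {len(raw_body)} bytes, base64: {len(encoded_body)} bytes")
print(f"overhead: {overhead:.0%}")
```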

> ###### SEE ALSO
>
> Optimizing Zyte API usage

#### Cookie definition

With proxy mode, you can only set cookies for the domain
of the target URL; you cannot manually set cookies for additional domains that
may be reached through redirection.

### Request headers

The following headers allow changing how a request is sent through Zyte API in
proxy mode.

#### Zyte-Browser-Html

Sets browserHtml.

This header is not compatible with `Zyte-Disable-Follow-Redirect`.

#### Example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### curl

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    -H "Zyte-Browser-Html: true" \
    https://toscrape.com
```

#### C#

```cs
using System;
using System.Net;
using System.Net.Http;

var proxy = new WebProxy("http://api.zyte.com:8011", true);
proxy.Credentials = new NetworkCredential("YOUR_ZYTE_API_KEY", "");

var httpClientHandler = new HttpClientHandler
{
    Proxy = proxy,
};

var client = new HttpClient(handler: httpClientHandler, disposeHandler: true);
client.DefaultRequestHeaders.Add("Zyte-Browser-Html", "true");
var message = new HttpRequestMessage(HttpMethod.Get, "https://toscrape.com");
var response = client.Send(message);
var body = await response.Content.ReadAsStringAsync();

Console.WriteLine(body);
```

#### Java

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.hc.client5.http.auth.AuthCache;
import org.apache.hc.client5.http.auth.AuthScope;
import org.apache.hc.client5.http.auth.CredentialsProvider;
import org.apache.hc.client5.http.classic.methods.HttpGet;
import org.apache.hc.client5.http.impl.auth.BasicAuthCache;
import org.apache.hc.client5.http.impl.auth.BasicScheme;
import org.apache.hc.client5.http.impl.auth.CredentialsProviderBuilder;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.client5.http.impl.routing.DefaultProxyRoutePlanner;
import org.apache.hc.client5.http.protocol.HttpClientContext;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHost;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;

class Example {
  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {

    HttpHost proxy = new HttpHost("api.zyte.com", 8011);
    DefaultProxyRoutePlanner routePlanner = new DefaultProxyRoutePlanner(proxy);
    CredentialsProvider credentialsProvider =
        CredentialsProviderBuilder.create()
            .add(new AuthScope(proxy), "YOUR_ZYTE_API_KEY", "".toCharArray())
            .build();

    AuthCache authCache = new BasicAuthCache();
    BasicScheme basicAuth = new BasicScheme();
    authCache.put(proxy, basicAuth);
    HttpClientContext context = HttpClientContext.create();
    context.setCredentialsProvider(credentialsProvider);
    context.setAuthCache(authCache);

    CloseableHttpClient client =
        HttpClients.custom()
            .setRoutePlanner(routePlanner)
            .setDefaultCredentialsProvider(credentialsProvider)
            .build();

    HttpGet request = new HttpGet("https://toscrape.com");
    request.setHeader("Zyte-Browser-Html", "true");
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String httpResponseBody = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          System.out.println(httpResponseBody);
          return null;
        });
  }
}
```

#### JS

```js
const axios = require('axios')

axios
  .get(
    'https://toscrape.com',
    {
      headers: {
        'Zyte-Browser-Html': 'true'
      },
      proxy: {
        protocol: 'http',
        host: 'api.zyte.com',
        port: 8011,
        auth: {
          username: 'YOUR_ZYTE_API_KEY',
          password: ''
        }
      }
    }
  )
  .then((response) => {
    const httpResponseBody = response.data
    console.log(httpResponseBody)
  })
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('GET', 'https://toscrape.com', [
    'headers' => [
        'Zyte-Browser-Html' => 'true',
    ],
    'proxy' => 'http://YOUR_ZYTE_API_KEY:@api.zyte.com:8011',
]);
$http_response_body = (string) $response->getBody();
fwrite(STDOUT, $http_response_body);
```

#### Python

```python
import requests

response = requests.get(
    "https://toscrape.com",
    headers={
        "Zyte-Browser-Html": "true",
    },
    proxies={
        scheme: "http://YOUR_ZYTE_API_KEY:@api.zyte.com:8011" for scheme in ("http", "https")
    },
)
http_response_body: bytes = response.content
print(http_response_body.decode())
```

#### Ruby

```ruby
# frozen_string_literal: true

require 'net/http'

url = URI('https://toscrape.com/')
proxy_host = 'api.zyte.com'
proxy_port = '8011'

http = Net::HTTP.new(url.host, url.port, proxy_host, proxy_port, 'YOUR_ZYTE_API_KEY', '')
http.use_ssl = true

request = Net::HTTP::Get.new(url)
request['Zyte-Browser-Html'] = 'true'

r = http.start do |h|
  h.request(request)
end

puts r.body
```

#### Scrapy

```python
from scrapy import Request, Spider

class ToScrapeSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request("https://toscrape.com", headers={"Zyte-Browser-Html": "true"})

    def parse(self, response):
        print(response.text)
```

Output (first 5 lines):

```html
<!DOCTYPE html><html lang="en"><head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
        <title>Scraping Sandbox</title>
        <link href="./css/bootstrap.min.css" rel="stylesheet">
        <link href="./css/main.css" rel="stylesheet">
```

#### Zyte-Client

Use this header to report to Zyte the software being used to access Zyte API.

It should be formatted with the syntax of the [User-Agent](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent) header, e.g.
`curl/1.2.3`.

#### Zyte-Cookie-Management

Sets cookieManagement.

#### Zyte-Device

Sets device emulation.

#### Zyte-Disable-Follow-Redirect

When set to `true`, disables redirect following, which is enabled by default.

#### Zyte-Geolocation

Sets a geolocation.

#### Zyte-IPType

Sets ipType.

#### Zyte-JobId

Sets the ID of the Scrapy Cloud job that is sending the
request.

[scrapy-zyte-smartproxy](https://github.com/scrapy-plugins/scrapy-zyte-smartproxy) sets this header automatically when used from a Scrapy
Cloud job.

#### Zyte-Override-Headers

Zyte API automatically sends some request headers for ban avoidance.

Custom headers from your request will override most automatic headers, but not
these:

- `Accept`
- `Accept-Encoding`
- `User-Agent`

To override any of these 3 headers, set `Zyte-Override-Headers` to a
comma-separated list of names of headers to override, e.g.
`Zyte-Override-Headers: Accept,Accept-Encoding`.

> ###### WARNING
>
> Overriding headers can break Zyte API ban avoidance.
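
As an illustration, the helper below builds the `Zyte-Override-Headers` value from the headers you intend to send. It is a minimal sketch: the function name and the sample header values are hypothetical, and it only covers the 3 headers listed above.

```python
# Headers that custom request headers do not override by default,
# according to the list above.
PROTECTED_HEADERS = ("Accept", "Accept-Encoding", "User-Agent")


def with_override_headers(headers):
    """Return a copy of headers with Zyte-Override-Headers added if needed."""
    result = dict(headers)
    # HTTP header names are case-insensitive, so compare in lowercase.
    sent = {name.lower() for name in headers}
    to_override = [name for name in PROTECTED_HEADERS if name.lower() in sent]
    if to_override:
        result["Zyte-Override-Headers"] = ",".join(to_override)
    return result


# Hypothetical custom headers, for illustration only.
headers = with_override_headers(
    {"Accept": "application/json", "User-Agent": "my-crawler/1.0"}
)
print(headers["Zyte-Override-Headers"])
```

Given the warning above, keeping the list limited to the headers you really need to change seems the safer choice.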

#### Zyte-Session-ID

Sets [session.id](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/session.id) for a client-managed session.

#### Zyte-Tags

Sets the [tags](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/tags) dictionary in the request.

The header value is a JSON object, such as `Zyte-Tags: {"foo": "bar", "baz": null, "435":"true"}`.
The value must be valid ASCII and under 512 bytes long.
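
One way to build such a value is sketched below. The `build_zyte_tags` helper is hypothetical; it relies on `json.dumps` escaping non-ASCII characters by default and rejects values that reach the 512-byte limit.

```python
import json


def build_zyte_tags(tags):
    """Serialize tags for the Zyte-Tags header and check the stated limits."""
    # json.dumps escapes non-ASCII characters by default (ensure_ascii=True),
    # so the result is always plain ASCII.
    value = json.dumps(tags, separators=(",", ":"))
    if len(value.encode("ascii")) >= 512:
        raise ValueError("Zyte-Tags value must be under 512 bytes long")
    return value


headers = {"Zyte-Tags": build_zyte_tags({"foo": "bar", "baz": None, "435": "true"})}
print(headers["Zyte-Tags"])
```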

#### Invalid request headers

The following headers are not allowed, and any request with one or more of them
will result in an HTTP 400 response:

- `Client-IP`
- `Cluster-Client-IP`
- `Forwarded-For`
- `True-Client-IP`
- `Via`
- `X-Client-IP`
- `X-Forwarded`
- `X-Forwarded-For`
- `X-Forwarded-Host`
- `X-Host`
- `X-Original-URL`
- `X-Originating-IP`
- `X-ProxyUser-IO`
- `X-ProxyUser-IP`
- `X-Remote-Addr`
- `X-Remote-IP`
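
If your existing code forwards headers from another source, a client-side check can catch these headers before a request is sent. The sketch below is a hypothetical helper, not part of any Zyte library; it compares names case-insensitively against the list above.

```python
# Header names from the list above, lowercased for case-insensitive matching.
FORBIDDEN_HEADERS = frozenset(
    name.lower()
    for name in (
        "Client-IP", "Cluster-Client-IP", "Forwarded-For", "True-Client-IP",
        "Via", "X-Client-IP", "X-Forwarded", "X-Forwarded-For",
        "X-Forwarded-Host", "X-Host", "X-Original-URL", "X-Originating-IP",
        "X-ProxyUser-IO", "X-ProxyUser-IP", "X-Remote-Addr", "X-Remote-IP",
    )
)


def find_forbidden_headers(headers):
    """Return the names in headers that would trigger an HTTP 400 response."""
    return sorted(name for name in headers if name.lower() in FORBIDDEN_HEADERS)


# Hypothetical outgoing headers, for illustration only.
outgoing = {"User-Agent": "my-crawler/1.0", "x-forwarded-for": "203.0.113.7", "Via": "1.1 cache"}
print(find_forbidden_headers(outgoing))
```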

### Response headers

Responses include some headers injected by Zyte API.

The body of an unsuccessful response is always the JSON error response from the
HTTP API, which provides error details.

#### Zyte-Error-Title

A short summary of the problem type. Written in English and readable for
engineers, usually not suited for non-technical stakeholders, and not
localized.

It matches the `title` JSON field of the error response.

#### Zyte-Error-Type

A URI reference that uniquely identifies the problem type, only in the context
of the provided API.

Contrary to the recommendation in RFC 7807, this URI is neither meant to be
dereferenceable (pointing to human-readable documentation) nor globally unique
for the problem type.

It matches the `type` JSON field of the error response.

#### Zyte-Request-ID

A unique identifier of the request.

When reporting an issue about the outcome of a request to us, please include
the value of this response header when possible.
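
For logging or issue reports, the response headers described above can be collected into one structure. The sketch below assumes a mapping of header names to values (for example, `response.headers` from the requests library); the helper name and the sample values are made up.

```python
def summarize_zyte_headers(headers):
    """Collect the Zyte API response headers described above, if present."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "request_id": lowered.get("zyte-request-id"),
        "error_type": lowered.get("zyte-error-type"),
        "error_title": lowered.get("zyte-error-title"),
    }


# Made-up header values, for illustration only.
sample_headers = {
    "Content-Type": "application/json",
    "Zyte-Request-ID": "example-request-id",
    "Zyte-Error-Type": "/example/error-type",
    "Zyte-Error-Title": "Example error title",
}
summary = summarize_zyte_headers(sample_headers)
print(summary)
```

On successful responses, the two error entries would simply be `None` in this sketch.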

### HTTPS proxy

> ###### TIP
>
> The main endpoint works for both HTTP and HTTPS URLs, so you do not need
> an HTTPS proxy interface to access HTTPS URLs.

You can use the `api.zyte.com:8014` endpoint for an HTTPS proxy interface,
provided your tech stack supports HTTPS proxies and you have installed
our CA certificate:

#### curl

```bash
curl \
    --proxy https://api.zyte.com:8014 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    https://toscrape.com
```

#### JS

```js
const HttpsProxyAgent = require('https-proxy-agent')
const httpsAgent = new HttpsProxyAgent.HttpsProxyAgent('https://YOUR_ZYTE_API_KEY:@api.zyte.com:8014')
const axiosDefaultConfig = { httpsAgent }
const axios = require('axios').create(axiosDefaultConfig)

axios
  .get('https://toscrape.com')
  .then((response) => {
    const httpResponseBody = response.data
    console.log(httpResponseBody)
  })
```

#### Python

```python
import requests

response = requests.get(
    "https://toscrape.com",
    proxies={
        scheme: "https://YOUR_ZYTE_API_KEY:@api.zyte.com:8014"
        for scheme in ("http", "https")
    },
)
http_response_body: bytes = response.content
print(http_response_body.decode())
```

### Use with browser automation tools

The proxy mode is not optimized for use in combination with browser automation
tools. Consider using the browser automation features of Zyte API instead.

## Zyte API rate limits

Zyte API limits the number of requests per minute (RPM). Each API key has
a rate limit that can be increased. Additional rate limits may
apply.

Rate-limited requests receive a rate-limiting response at no cost and can be retried.
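
A retry loop with exponential backoff is one common way to handle such responses. The sketch below is not tied to any HTTP library: `send` is any callable that returns an object with a `status_code` attribute. It assumes that HTTP 429 signals rate limiting; adjust `rate_limited_statuses` to the status codes documented for rate-limiting responses.

```python
import time
from types import SimpleNamespace


def send_with_retries(send, max_attempts=5, base_delay=1.0,
                      rate_limited_statuses=(429,), sleep=time.sleep):
    """Call send() until it returns a response that is not rate-limited."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        # Assumption: these status codes identify rate-limiting responses.
        if response.status_code not in rate_limited_statuses:
            return response
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, then 2x, 4x, ...
            sleep(base_delay * 2**attempt)
    # Still rate-limited after the last attempt; let the caller decide.
    return response


# Stand-in for a real HTTP call: two rate-limited attempts, then success.
statuses = iter([429, 429, 200])
delays = []
response = send_with_retries(
    lambda: SimpleNamespace(status_code=next(statuses)),
    sleep=delays.append,  # record delays instead of actually sleeping
)
print(response.status_code, delays)
```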

### API key rate limit

- Enterprise plans: Custom
- Standard plans: 3000 RPM

To increase your rate limit, see Increasing rate limits below.

### Other rate limits

Additional rate limits may apply:

- **Website limits**: Each website has its own rate limit to prevent issues on target sites.
- **Account-website limits**: Each account has per-website limits to ensure fair access.
- **Temporary limits**: Applied during high demand to ensure platform stability.

> ###### SEE ALSO
>
> stats-rate-limiting at stats-api

### Increasing rate limits

**Temporary increases**

For short-term needs (single job, specific dates), [open a support ticket](https://support.zyte.com/support/tickets/new) at least 24 hours in advance with:

- Target [API key](https://app.zyte.com/o/zyte-api/api-access)
- Desired RPM
- Target websites
- Start and end dates

**Permanent increases**

- Standard plans: [Contact sales](https://www.zyte.com/zyte-web-scraping-api/#form) to upgrade to an Enterprise plan
- Enterprise plans: Contact your account manager

### Concurrency

Rate limits are based on requests per minute, not concurrent requests.

Your maximum concurrency depends on:

- Your RPM limit
- Average response time of target sites
- Request parameters (e.g. browser rendering, extraction)

**Estimation formula:**

> max concurrency ≈ RPM limit ÷ 60 × avg response time (seconds)

**Examples:**

|   RPM limit | Avg. response time   |   Max. concurrency |
|-------------|----------------------|--------------------|
|        3000 | 0.2 s                |                 10 |
|        3000 | 2 s                  |                100 |
|        3000 | 20 s                 |               1000 |
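
The estimation formula can be written as a small function. This is a minimal sketch (the function name is hypothetical) that rounds the result up to a whole number of connections:

```python
import math


def estimate_max_concurrency(rpm_limit, avg_response_seconds):
    """Estimate the concurrency needed to reach rpm_limit at a given response time."""
    requests_per_second = rpm_limit / 60
    # Round before ceil so floating-point noise cannot push the result up by one.
    return math.ceil(round(requests_per_second * avg_response_seconds, 9))


# Same inputs as the rows of the table above.
for seconds in (0.2, 2, 20):
    print(seconds, estimate_max_concurrency(3000, seconds))
```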

## Optimizing Zyte API usage

Here are some tips to optimize your use of Zyte API:

- Send multiple requests in parallel.
- For real-time scenarios, where a single request is sent at a time, there
  are a few tips you can follow to improve response times.
- When targeting multiple websites at once, sort requests to spread the load.
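
One possible reading of the last tip, sorting requests to spread the load, is to interleave URLs so that consecutive requests target different websites. The helper below is a hypothetical sketch of that idea, not an official algorithm, and the URL list is only illustrative.

```python
from collections import defaultdict
from itertools import zip_longest
from urllib.parse import urlsplit


def interleave_by_website(urls):
    """Reorder urls round-robin by hostname, keeping the order within each site."""
    by_host = defaultdict(list)
    for url in urls:
        by_host[urlsplit(url).hostname].append(url)
    ordered = []
    # Take one URL per website in turn until every group is exhausted.
    for group in zip_longest(*by_host.values()):
        ordered.extend(url for url in group if url is not None)
    return ordered


# Illustrative input: requests grouped by website.
urls = [
    "https://books.toscrape.com/catalogue/page-1.html",
    "https://books.toscrape.com/catalogue/page-2.html",
    "https://quotes.toscrape.com/page/1/",
    "https://quotes.toscrape.com/page/2/",
]
for url in interleave_by_website(urls):
    print(url)
```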

### Sending multiple requests in parallel

A Zyte API request can take tens of seconds to process. The response time
depends on the target website and features used. For example, if you use a
browser request, it is common to get a response in
10-30 seconds.

As a result, if you send requests sequentially, throughput can be quite
low: only a few responses per minute.

To increase the throughput, send many requests in parallel, as shown in the
example below.

The number of parallel requests that is optimum for you depends on your
API key rate limit and on your target
websites. For example, if your rate limit is 3000 requests per minute, and the
average response time you observe for your websites is 2 seconds, then to
reach your rate limit you may set the number of parallel requests to 100
(`ceil(3000/60*2)`).

If too many requests are being processed in parallel, you will get many
rate-limiting responses, which you can
retry. To maximize efficiency, use a number of
parallel requests that minimizes the number of rate-limiting responses.
However, a small percentage of rate-limiting responses is normal and expected
if you want to get close to your API key rate limit.

For some websites, increasing parallel requests will slow down their
responses and/or increase the ratio of unsuccessful responses. Zyte API does its best to prevent these
issues, but if you notice this happening to you, please consider decreasing
your parallel requests.

#### Example

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

var urls = new string[2];
urls[0] = "https://books.toscrape.com/catalogue/page-1.html";
urls[1] = "https://books.toscrape.com/catalogue/page-2.html";
var output = new List<HttpResponseMessage>();

var handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All,
    MaxConnectionsPerServer = 15
};
var client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var responseTasks = new List<Task<HttpResponseMessage>>();
foreach (var url in urls)
{
    var input = new Dictionary<string, object>(){
        {"url", url},
        {"browserHtml", true}
    };
    var inputJson = JsonSerializer.Serialize(input);
    var content = new StringContent(inputJson, Encoding.UTF8, "application/json");
    var responseTask = client.PostAsync("https://api.zyte.com/v1/extract", content);
    responseTasks.Add(responseTask);
}

while (responseTasks.Any())
{
    var responseTask = await Task.WhenAny(responseTasks);
    responseTasks.Remove(responseTask);
    var response = await responseTask;
    output.Add(response);
}
```

#### CLI client

input.jsonl
```json
{"url": "https://books.toscrape.com/catalogue/page-1.html", "browserHtml": true}
{"url": "https://books.toscrape.com/catalogue/page-2.html", "browserHtml": true}
```

```shell
zyte-api --n-conn 15 input.jsonl -o output.jsonl
```

#### curl

input.jsonl
```json
{"url": "https://books.toscrape.com/catalogue/page-1.html", "browserHtml": true}
{"url": "https://books.toscrape.com/catalogue/page-2.html", "browserHtml": true}
```

```shell
cat input.jsonl \
    | xargs -P 15 -d\\n -n 1 \
    bash -c "
        curl \
            --user YOUR_ZYTE_API_KEY: \
            --header 'Content-Type: application/json' \
            --data \"\$0\" \
            --compressed \
            https://api.zyte.com/v1/extract \
        | awk '{print \$1}' \
        >> output.jsonl
"
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import org.apache.hc.client5.http.async.methods.SimpleHttpRequest;
import org.apache.hc.client5.http.async.methods.SimpleHttpResponse;
import org.apache.hc.client5.http.impl.async.CloseableHttpAsyncClient;
import org.apache.hc.client5.http.impl.async.HttpAsyncClients;
import org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager;
import org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManagerBuilder;
import org.apache.hc.client5.http.ssl.ClientTlsStrategyBuilder;
import org.apache.hc.core5.concurrent.FutureCallback;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.nio.ssl.TlsStrategy;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws ExecutionException, InterruptedException, IOException, ParseException {

    String[] urls = {
      "https://books.toscrape.com/catalogue/page-1.html",
      "https://books.toscrape.com/catalogue/page-2.html"
    };
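    List<Future<SimpleHttpResponse>> futures = new ArrayList<>();
    // Callbacks may run on different I/O threads, so use a thread-safe list.
    List<String> output = java.util.Collections.synchronizedList(new ArrayList<String>());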

    int concurrency = 15;

    // https://issues.apache.org/jira/browse/HTTPCLIENT-2219
    final TlsStrategy tlsStrategy = ClientTlsStrategyBuilder.create().useSystemProperties().build();

    PoolingAsyncClientConnectionManager connectionManager =
        PoolingAsyncClientConnectionManagerBuilder.create().setTlsStrategy(tlsStrategy).build();
    connectionManager.setMaxTotal(concurrency);
    connectionManager.setDefaultMaxPerRoute(concurrency);

    CloseableHttpAsyncClient client =
        HttpAsyncClients.custom().setConnectionManager(connectionManager).build();
    try {
      client.start();
      for (int i = 0; i < urls.length; i++) {
        Map<String, Object> parameters = ImmutableMap.of("url", urls[i], "browserHtml", true);
        String requestBody = new Gson().toJson(parameters);

        SimpleHttpRequest request =
            new SimpleHttpRequest("POST", "https://api.zyte.com/v1/extract");
        request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
        request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
        request.setBody(requestBody, ContentType.APPLICATION_JSON);

        final Future<SimpleHttpResponse> future =
            client.execute(
                request,
                new FutureCallback<SimpleHttpResponse>() {
                  public void completed(final SimpleHttpResponse response) {
                    String apiResponse = response.getBodyText();
                    JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
                    String browserHtml = jsonObject.get("browserHtml").getAsString();
                    output.add(browserHtml);
                  }

                  public void failed(final Exception ex) {}

                  public void cancelled() {}
                });
        futures.add(future);
      }
      for (int i = 0; i < futures.size(); i++) {
        futures.get(i).get();
      }
    } finally {
      client.close();
    }
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const { ConcurrencyManager } = require('axios-concurrency')
const axios = require('axios')

const urls = [
  'https://books.toscrape.com/catalogue/page-1.html',
  'https://books.toscrape.com/catalogue/page-2.html'
]
const output = []

const client = axios.create()
ConcurrencyManager(client, 15)

Promise.all(
  urls.map((url) =>
    client.post(
      'https://api.zyte.com/v1/extract',
      { url, browserHtml: true },
      {
        auth: { username: 'YOUR_ZYTE_API_KEY' }
      }
    ).then((response) => output.push(response.data))
  )
)
```

#### PHP

```php
<?php

$urls = [
  'https://books.toscrape.com/catalogue/page-1.html',
  'https://books.toscrape.com/catalogue/page-2.html',
];
$output = [];
$promises = [];

$client = new GuzzleHttp\Client();

foreach ($urls as $url) {
    $options = [
        'auth' => ['YOUR_ZYTE_API_KEY', ''],
        'headers' => ['Accept-Encoding' => 'gzip'],
        'json' => [
            'url' => $url,
            'browserHtml' => true,
        ],
    ];
    $request = new \GuzzleHttp\Psr7\Request('POST', 'https://api.zyte.com/v1/extract');
    global $promises;
    $promises[] = $client->sendAsync($request, $options)->then(function ($response) {
        global $output;
        $output[] = json_decode($response->getBody());
    });
}

foreach ($promises as $promise) {
    $promise->wait();
}
```

#### Python

```python
import asyncio

import aiohttp

urls = [
    "https://books.toscrape.com/catalogue/page-1.html",
    "https://books.toscrape.com/catalogue/page-2.html",
]
output = []

async def extract(client, url):
    response = await client.post(
        "https://api.zyte.com/v1/extract",
        json={"url": url, "browserHtml": True},
        auth=aiohttp.BasicAuth("YOUR_ZYTE_API_KEY"),
    )
    output.append(await response.json())

async def main():
    connector = aiohttp.TCPConnector(limit_per_host=15)
    async with aiohttp.ClientSession(connector=connector) as client:
        await asyncio.gather(*[extract(client, url) for url in urls])

asyncio.run(main())
```

#### Python client

```python
import asyncio

from zyte_api import AsyncZyteAPI

urls = [
    "https://books.toscrape.com/catalogue/page-1.html",
    "https://books.toscrape.com/catalogue/page-2.html",
]

async def main():
    client = AsyncZyteAPI(n_conn=15)
    queries = [{"url": url, "browserHtml": True} for url in urls]
    async with client.session() as session:
        for future in session.iter(queries):
            response = await future
            print(response)

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

urls = [
    "https://books.toscrape.com/catalogue/page-1.html",
    "https://books.toscrape.com/catalogue/page-2.html",
]

class ToScrapeSpider(Spider):
    name = "toscrape_com"

    custom_settings = {
        "CONCURRENT_REQUESTS": 15,
        "CONCURRENT_REQUESTS_PER_DOMAIN": 15,
    }

    async def start(self):
        for url in urls:
            yield Request(
                url,
                meta={
                    "zyte_api_automap": {
                        "browserHtml": True,
                    },
                },
            )

    def parse(self, response):
        yield {
            "url": response.url,
            "browserHtml": response.text,
        }
```

Output:

```json
{"url": "https://books.toscrape.com/catalogue/page-1.html", "statusCode": 200, "browserHtml": "<!DOCTYPE html><!--[if lt IE 7]>      <html lang=\"en-us\" class=\"no-js lt-ie9 lt-ie8 lt-ie7\"> <![endif]--><!--[if IE 7]>         <html lang=\"en-us\" class=\"no-js lt-ie9 lt-ie8\"> <![endif]--><!--[if IE 8]>         <html lang=\"en-us\" class=\"no-js lt-ie9\"> <![endif]--><!--[if gt IE 8]><!--><html lang=\"en-us\" class=\"no-js\"><!--<![endif]--><head>\n        <title>\n    All products | Books to Scrape - Sandbox\n</title>\n\n        <meta http-equiv=\"content-type\" content=\"text/html; charset=UTF-8\">\n        <meta name=\"created\" content=\"24th Jun 2016 09:30\">\n        <meta name=\"description\" content=\"\">\n        <meta name=\"viewport\" content=\"width=device-width\">\n        <meta name=\"robots\" content=\"NOARCHIVE,NOCACHE\">\n\n        <!-- Le HTML5 shim, for IE6-8 support of HTML elements -->\n        <!--[if lt IE 9]>\n        <script src=\"//html5shim.googlecode.com/svn/trunk/html5.js\"></script>\n        <![endif]-->\n\n        \n            <link rel=\"shortcut icon\" href=\"../static/oscar/favicon.ico\">\n        \n\n        \n        \n    \n    \n        <link rel=\"stylesheet\" type=\"text/css\" href=\"../static/oscar/css/styles.css\">\n    \n    <link rel=\"stylesheet\" href=\"../static/oscar/js/bootstrap-datetimepicker/bootstrap-datetimepicker.css\">\n    <link rel=\"stylesheet\" type=\"text/css\" href=\"../static/oscar/css/datetimepicker.css\">\n\n\n        \n        \n\n        \n\n        \n            \n            \n\n        \n    </head>\n\n    <body id=\"default\" class=\"default\">\n        \n        \n    \n    \n    <header class=\"header container-fluid\">\n        <div class=\"page_inner\">\n            <div class=\"row\">\n                <div class=\"col-sm-8 h1\"><a href=\"../index.html\">Books to Scrape</a><small> We love being scraped!</small>\n</div>\n\n                \n            </div>\n        </div>\n    
</header>\n\n    \n    \n<div class=\"container-fluid page\">\n    <div class=\"page_inner\">\n        \n    <ul class=\"breadcrumb\">\n        <li>\n            <a href=\"../index.html\">Home</a>\n        </li>\n        <li class=\"active\">All products</li>\n    </ul>\n\n        <div class=\"row\">\n\n            <aside class=\"sidebar col-sm-4 col-md-3\">\n                \n                <div id=\"promotions_left\">\n                    \n                </div>\n                \n    \n    \n        \n        <div class=\"side_categories\">\n            <ul class=\"nav nav-list\">\n                \n                    <li>\n                        <a href=\"category/books_1/index.html\">\n                            \n                                Books\n                            \n                        </a>\n\n                        <ul>\n                        \n                \n                    <li>\n                        <a href=\"category/books/travel_2/index.html\">\n                            \n                                Travel\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/mystery_3/index.html\">\n                            \n                                Mystery\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/historical-fiction_4/index.html\">\n                            \n                                Historical Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/sequential-art_5/index.html\">\n                            \n          
                      Sequential Art\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/classics_6/index.html\">\n                            \n                                Classics\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/philosophy_7/index.html\">\n                            \n                                Philosophy\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/romance_8/index.html\">\n                            \n                                Romance\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/womens-fiction_9/index.html\">\n                            \n                                Womens Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/fiction_10/index.html\">\n                            \n                                Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/childrens_11/index.html\">\n                            \n                                Childrens\n                            \n                        </a>\n\n                        
</li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/religion_12/index.html\">\n                            \n                                Religion\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/nonfiction_13/index.html\">\n                            \n                                Nonfiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/music_14/index.html\">\n                            \n                                Music\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/default_15/index.html\">\n                            \n                                Default\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/science-fiction_16/index.html\">\n                            \n                                Science Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/sports-and-games_17/index.html\">\n                            \n                                Sports and Games\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a 
href=\"category/books/add-a-comment_18/index.html\">\n                            \n                                Add a comment\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/fantasy_19/index.html\">\n                            \n                                Fantasy\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/new-adult_20/index.html\">\n                            \n                                New Adult\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/young-adult_21/index.html\">\n                            \n                                Young Adult\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/science_22/index.html\">\n                            \n                                Science\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/poetry_23/index.html\">\n                            \n                                Poetry\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/paranormal_24/index.html\">\n                            \n                                
Paranormal\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/art_25/index.html\">\n                            \n                                Art\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/psychology_26/index.html\">\n                            \n                                Psychology\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/autobiography_27/index.html\">\n                            \n                                Autobiography\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/parenting_28/index.html\">\n                            \n                                Parenting\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/adult-fiction_29/index.html\">\n                            \n                                Adult Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/humor_30/index.html\">\n                            \n                                Humor\n                            \n                        </a>\n\n                        </li>\n                   
     \n                \n                    <li>\n                        <a href=\"category/books/horror_31/index.html\">\n                            \n                                Horror\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/history_32/index.html\">\n                            \n                                History\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/food-and-drink_33/index.html\">\n                            \n                                Food and Drink\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/christian-fiction_34/index.html\">\n                            \n                                Christian Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/business_35/index.html\">\n                            \n                                Business\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/biography_36/index.html\">\n                            \n                                Biography\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a 
href=\"category/books/thriller_37/index.html\">\n                            \n                                Thriller\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/contemporary_38/index.html\">\n                            \n                                Contemporary\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/spirituality_39/index.html\">\n                            \n                                Spirituality\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/academic_40/index.html\">\n                            \n                                Academic\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/self-help_41/index.html\">\n                            \n                                Self Help\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/historical_42/index.html\">\n                            \n                                Historical\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/christian_43/index.html\">\n                            \n                                
Christian\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/suspense_44/index.html\">\n                            \n                                Suspense\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/short-stories_45/index.html\">\n                            \n                                Short Stories\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/novels_46/index.html\">\n                            \n                                Novels\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/health_47/index.html\">\n                            \n                                Health\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/politics_48/index.html\">\n                            \n                                Politics\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/cultural_49/index.html\">\n                            \n                                Cultural\n                            \n                        </a>\n\n                        </li>\n                        \n  
              \n                    <li>\n                        <a href=\"category/books/erotica_50/index.html\">\n                            \n                                Erotica\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/crime_51/index.html\">\n                            \n                                Crime\n                            \n                        </a>\n\n                        </li>\n                        \n                            </ul></li>\n                        \n                \n            </ul>\n        </div>\n    \n    \n\n            </aside>\n\n            <div class=\"col-sm-8 col-md-9\">\n                \n                <div class=\"page-header action\">\n                    <h1>All products</h1>\n                </div>\n                \n\n                \n\n\n\n<div id=\"messages\">\n\n</div>\n\n\n                <div id=\"promotions\">\n                    \n                </div>\n\n                \n    <form method=\"get\" class=\"form-horizontal\">\n        \n        <div style=\"display:none\">\n            \n            \n        </div>\n\n        \n            \n                \n                    <strong>1000</strong> results - showing <strong>1</strong> to <strong>20</strong>.\n                \n            \n            \n        \n    </form>\n    \n        <section>\n            <div class=\"alert alert-warning\" role=\"alert\"><strong>Warning!</strong> This is a demo website for web scraping purposes. 
Prices and ratings here were randomly assigned and have no real meaning.</div>\n\n            <div>\n                <ol class=\"row\">\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"a-light-in-the-attic_1000/index.html\"><img src=\"../media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg\" alt=\"A Light in the Attic\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"a-light-in-the-attic_1000/index.html\" title=\"A Light in the Attic\">A Light in the ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£51.77</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a 
href=\"tipping-the-velvet_999/index.html\"><img src=\"../media/cache/26/0c/260c6ae16bce31c8f8c95daddd9f4a1c.jpg\" alt=\"Tipping the Velvet\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"tipping-the-velvet_999/index.html\" title=\"Tipping the Velvet\">Tipping the Velvet</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£53.74</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"soumission_998/index.html\"><img src=\"../media/cache/3e/ef/3eef99c9d9adef34639f510662022830.jpg\" alt=\"Soumission\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                  
  <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"soumission_998/index.html\" title=\"Soumission\">Soumission</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£50.10</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"sharp-objects_997/index.html\"><img src=\"../media/cache/32/51/3251cf3a3412f53f339e42cac2134093.jpg\" alt=\"Sharp Objects\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"sharp-objects_997/index.html\" title=\"Sharp Objects\">Sharp Objects</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£47.82</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    
\n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"sapiens-a-brief-history-of-humankind_996/index.html\"><img src=\"../media/cache/be/a5/bea5697f2534a2f86a3ef27b5a8c12a6.jpg\" alt=\"Sapiens: A Brief History of Humankind\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"sapiens-a-brief-history-of-humankind_996/index.html\" title=\"Sapiens: A Brief History of Humankind\">Sapiens: A Brief History ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£54.23</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 
col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"the-requiem-red_995/index.html\"><img src=\"../media/cache/68/33/68339b4c9bc034267e1da611ab3b34f8.jpg\" alt=\"The Requiem Red\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"the-requiem-red_995/index.html\" title=\"The Requiem Red\">The Requiem Red</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£22.65</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"the-dirty-little-secrets-of-getting-your-dream-job_994/index.html\"><img src=\"../media/cache/92/27/92274a95b7c251fea59a2b8a78275ab4.jpg\" alt=\"The Dirty Little Secrets of Getting Your Dream Job\" class=\"thumbnail\"></a>\n                    \n                \n            
</div>\n        \n\n        \n            \n                <p class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"the-dirty-little-secrets-of-getting-your-dream-job_994/index.html\" title=\"The Dirty Little Secrets of Getting Your Dream Job\">The Dirty Little Secrets ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£33.34</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"the-coming-woman-a-novel-based-on-the-life-of-the-infamous-feminist-victoria-woodhull_993/index.html\"><img src=\"../media/cache/3d/54/3d54940e57e662c4dd1f3ff00c78cc64.jpg\" alt=\"The Coming Woman: A Novel Based on the Life of the Infamous Feminist, Victoria Woodhull\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n        
            <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"the-coming-woman-a-novel-based-on-the-life-of-the-infamous-feminist-victoria-woodhull_993/index.html\" title=\"The Coming Woman: A Novel Based on the Life of the Infamous Feminist, Victoria Woodhull\">The Coming Woman: A ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£17.93</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"the-boys-in-the-boat-nine-americans-and-their-epic-quest-for-gold-at-the-1936-berlin-olympics_992/index.html\"><img src=\"../media/cache/66/88/66883b91f6804b2323c8369331cb7dd1.jpg\" alt=\"The Boys in the Boat: Nine Americans and Their Epic Quest for Gold at the 1936 Berlin Olympics\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            
<h3><a href=\"the-boys-in-the-boat-nine-americans-and-their-epic-quest-for-gold-at-the-1936-berlin-olympics_992/index.html\" title=\"The Boys in the Boat: Nine Americans and Their Epic Quest for Gold at the 1936 Berlin Olympics\">The Boys in the ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£22.60</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"the-black-maria_991/index.html\"><img src=\"../media/cache/58/46/5846057e28022268153beff6d352b06c.jpg\" alt=\"The Black Maria\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"the-black-maria_991/index.html\" title=\"The Black Maria\">The Black Maria</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£52.15</p>\n    \n\n<p class=\"instock availability\">\n    <i 
class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"starving-hearts-triangular-trade-trilogy-1_990/index.html\"><img src=\"../media/cache/be/f4/bef44da28c98f905a3ebec0b87be8530.jpg\" alt=\"Starving Hearts (Triangular Trade Trilogy, #1)\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Two\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"starving-hearts-triangular-trade-trilogy-1_990/index.html\" title=\"Starving Hearts (Triangular Trade Trilogy, #1)\">Starving Hearts (Triangular Trade ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£13.99</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    
</article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"shakespeares-sonnets_989/index.html\"><img src=\"../media/cache/10/48/1048f63d3b5061cd2f424d20b3f9b666.jpg\" alt=\"Shakespeare's Sonnets\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"shakespeares-sonnets_989/index.html\" title=\"Shakespeare's Sonnets\">Shakespeare's Sonnets</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£20.66</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"set-me-free_988/index.html\"><img src=\"../media/cache/5b/88/5b88c52633f53cacf162c15f4f823153.jpg\" alt=\"Set Me Free\" class=\"thumbnail\"></a>\n      
              \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"set-me-free_988/index.html\" title=\"Set Me Free\">Set Me Free</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£17.46</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"scott-pilgrims-precious-little-life-scott-pilgrim-1_987/index.html\"><img src=\"../media/cache/94/b1/94b1b8b244bce9677c2f29ccc890d4d2.jpg\" alt=\"Scott Pilgrim's Precious Little Life (Scott Pilgrim #1)\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n    
        \n        \n\n        \n            <h3><a href=\"scott-pilgrims-precious-little-life-scott-pilgrim-1_987/index.html\" title=\"Scott Pilgrim's Precious Little Life (Scott Pilgrim #1)\">Scott Pilgrim's Precious Little ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£52.29</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"rip-it-up-and-start-again_986/index.html\"><img src=\"../media/cache/81/c4/81c4a973364e17d01f217e1188253d5e.jpg\" alt=\"Rip it Up and Start Again\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"rip-it-up-and-start-again_986/index.html\" title=\"Rip it Up and Start Again\">Rip it Up and ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£35.02</p>\n    \n\n<p class=\"instock availability\">\n  
  <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"our-band-could-be-your-life-scenes-from-the-american-indie-underground-1981-1991_985/index.html\"><img src=\"../media/cache/54/60/54607fe8945897cdcced0044103b10b6.jpg\" alt=\"Our Band Could Be Your Life: Scenes from the American Indie Underground, 1981-1991\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"our-band-could-be-your-life-scenes-from-the-american-indie-underground-1981-1991_985/index.html\" title=\"Our Band Could Be Your Life: Scenes from the American Indie Underground, 1981-1991\">Our Band Could Be ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£57.25</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary 
btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"olio_984/index.html\"><img src=\"../media/cache/55/33/553310a7162dfbc2c6d19a84da0df9e1.jpg\" alt=\"Olio\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"olio_984/index.html\" title=\"Olio\">Olio</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£23.88</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"mesaerion-the-best-science-fiction-stories-1800-1849_983/index.html\"><img 
src=\"../media/cache/09/a3/09a3aef48557576e1a85ba7efea8ecb7.jpg\" alt=\"Mesaerion: The Best Science Fiction Stories 1800-1849\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"mesaerion-the-best-science-fiction-stories-1800-1849_983/index.html\" title=\"Mesaerion: The Best Science Fiction Stories 1800-1849\">Mesaerion: The Best Science ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£37.59</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"libertarianism-for-beginners_982/index.html\"><img src=\"../media/cache/0b/bc/0bbcd0a6f4bcd81ccb1049a52736406e.jpg\" alt=\"Libertarianism for Beginners\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Two\">\n                    <i class=\"icon-star\"></i>\n           
         <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"libertarianism-for-beginners_982/index.html\" title=\"Libertarianism for Beginners\">Libertarianism for Beginners</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£51.33</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"its-only-the-himalayas_981/index.html\"><img src=\"../media/cache/27/a5/27a53d0bb95bdd88288eaf66c9230d7e.jpg\" alt=\"It's Only the Himalayas\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Two\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"its-only-the-himalayas_981/index.html\" title=\"It's Only the Himalayas\">It's Only the Himalayas</a></h3>\n        \n\n        \n            <div 
class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£45.17</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                </ol>\n                \n\n\n\n    <div>\n        <ul class=\"pager\">\n            \n            <li class=\"current\">\n            \n                Page 1 of 50\n            \n            </li>\n            \n                <li class=\"next\"><a href=\"page-2.html\">next</a></li>\n            \n        </ul>\n    </div>\n\n\n            </div>\n        </section>\n    \n\n\n            </div>\n\n        </div><!-- /row -->\n    </div><!-- /page_inner -->\n</div><!-- /container-fluid -->\n\n\n    \n<footer class=\"footer container-fluid\">\n    \n        \n    \n</footer>\n\n\n        \n        \n  \n            <!-- jQuery -->\n            <script src=\"http://ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js\"></script>\n            <script>window.jQuery || document.write('<script src=\"../static/oscar/js/jquery/jquery-1.9.1.min.js\"><\\/script>')</script><script src=\"../static/oscar/js/jquery/jquery-1.9.1.min.js\"></script>\n        \n  \n\n\n        \n        \n    \n        \n    <script type=\"text/javascript\" src=\"../static/oscar/js/bootstrap3/bootstrap.min.js\"></script>\n    <!-- Oscar -->\n    <script src=\"../static/oscar/js/oscar/ui.js\" type=\"text/javascript\" charset=\"utf-8\"></script>\n\n    <script src=\"../static/oscar/js/bootstrap-datetimepicker/bootstrap-datetimepicker.js\" type=\"text/javascript\" charset=\"utf-8\"></script>\n    <script 
src=\"../static/oscar/js/bootstrap-datetimepicker/locales/bootstrap-datetimepicker.all.js\" type=\"text/javascript\" charset=\"utf-8\"></script>\n\n\n        \n        \n    \n\n    \n\n\n        \n        <script type=\"text/javascript\">\n            $(function() {\n                \n    \n    \n    oscar.init();\n\n    oscar.search.init();\n\n            });\n        </script>\n\n        \n        <!-- Version: N/A -->\n        \n    \n\n</body></html>", "echoData": "https://books.toscrape.com/catalogue/page-1.html"}
{"url": "https://books.toscrape.com/catalogue/page-2.html", "statusCode": 200, "browserHtml": "<!DOCTYPE html><!--[if lt IE 7]>      <html lang=\"en-us\" class=\"no-js lt-ie9 lt-ie8 lt-ie7\"> <![endif]--><!--[if IE 7]>         <html lang=\"en-us\" class=\"no-js lt-ie9 lt-ie8\"> <![endif]--><!--[if IE 8]>         <html lang=\"en-us\" class=\"no-js lt-ie9\"> <![endif]--><!--[if gt IE 8]><!--><html lang=\"en-us\" class=\"no-js\"><!--<![endif]--><head>\n        <title>\n    All products | Books to Scrape - Sandbox\n</title>\n\n        <meta http-equiv=\"content-type\" content=\"text/html; charset=UTF-8\">\n        <meta name=\"created\" content=\"24th Jun 2016 09:29\">\n        <meta name=\"description\" content=\"\">\n        <meta name=\"viewport\" content=\"width=device-width\">\n        <meta name=\"robots\" content=\"NOARCHIVE,NOCACHE\">\n\n        <!-- Le HTML5 shim, for IE6-8 support of HTML elements -->\n        <!--[if lt IE 9]>\n        <script src=\"//html5shim.googlecode.com/svn/trunk/html5.js\"></script>\n        <![endif]-->\n\n        \n            <link rel=\"shortcut icon\" href=\"../static/oscar/favicon.ico\">\n        \n\n        \n        \n    \n    \n        <link rel=\"stylesheet\" type=\"text/css\" href=\"../static/oscar/css/styles.css\">\n    \n    <link rel=\"stylesheet\" href=\"../static/oscar/js/bootstrap-datetimepicker/bootstrap-datetimepicker.css\">\n    <link rel=\"stylesheet\" type=\"text/css\" href=\"../static/oscar/css/datetimepicker.css\">\n\n\n        \n        \n\n        \n\n        \n            \n            \n\n        \n    </head>\n\n    <body id=\"default\" class=\"default\">\n        \n        \n    \n    \n    <header class=\"header container-fluid\">\n        <div class=\"page_inner\">\n            <div class=\"row\">\n                <div class=\"col-sm-8 h1\"><a href=\"../index.html\">Books to Scrape</a><small> We love being scraped!</small>\n</div>\n\n                \n            </div>\n        </div>\n    
</header>\n\n    \n    \n<div class=\"container-fluid page\">\n    <div class=\"page_inner\">\n        \n    <ul class=\"breadcrumb\">\n        <li>\n            <a href=\"../index.html\">Home</a>\n        </li>\n        <li class=\"active\">All products</li>\n    </ul>\n\n        <div class=\"row\">\n\n            <aside class=\"sidebar col-sm-4 col-md-3\">\n                \n                <div id=\"promotions_left\">\n                    \n                </div>\n                \n    \n    \n        \n        <div class=\"side_categories\">\n            <ul class=\"nav nav-list\">\n                \n                    <li>\n                        <a href=\"category/books_1/index.html\">\n                            \n                                Books\n                            \n                        </a>\n\n                        <ul>\n                        \n                \n                    <li>\n                        <a href=\"category/books/travel_2/index.html\">\n                            \n                                Travel\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/mystery_3/index.html\">\n                            \n                                Mystery\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/historical-fiction_4/index.html\">\n                            \n                                Historical Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/sequential-art_5/index.html\">\n                            \n          
                      Sequential Art\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/classics_6/index.html\">\n                            \n                                Classics\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/philosophy_7/index.html\">\n                            \n                                Philosophy\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/romance_8/index.html\">\n                            \n                                Romance\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/womens-fiction_9/index.html\">\n                            \n                                Womens Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/fiction_10/index.html\">\n                            \n                                Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/childrens_11/index.html\">\n                            \n                                Childrens\n                            \n                        </a>\n\n                        
</li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/religion_12/index.html\">\n                            \n                                Religion\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/nonfiction_13/index.html\">\n                            \n                                Nonfiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/music_14/index.html\">\n                            \n                                Music\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/default_15/index.html\">\n                            \n                                Default\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/science-fiction_16/index.html\">\n                            \n                                Science Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/sports-and-games_17/index.html\">\n                            \n                                Sports and Games\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a 
href=\"category/books/add-a-comment_18/index.html\">\n                            \n                                Add a comment\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/fantasy_19/index.html\">\n                            \n                                Fantasy\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/new-adult_20/index.html\">\n                            \n                                New Adult\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/young-adult_21/index.html\">\n                            \n                                Young Adult\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/science_22/index.html\">\n                            \n                                Science\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/poetry_23/index.html\">\n                            \n                                Poetry\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/paranormal_24/index.html\">\n                            \n                                
Paranormal\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/art_25/index.html\">\n                            \n                                Art\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/psychology_26/index.html\">\n                            \n                                Psychology\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/autobiography_27/index.html\">\n                            \n                                Autobiography\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/parenting_28/index.html\">\n                            \n                                Parenting\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/adult-fiction_29/index.html\">\n                            \n                                Adult Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/humor_30/index.html\">\n                            \n                                Humor\n                            \n                        </a>\n\n                        </li>\n                   
     \n                \n                    <li>\n                        <a href=\"category/books/horror_31/index.html\">\n                            \n                                Horror\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/history_32/index.html\">\n                            \n                                History\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/food-and-drink_33/index.html\">\n                            \n                                Food and Drink\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/christian-fiction_34/index.html\">\n                            \n                                Christian Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/business_35/index.html\">\n                            \n                                Business\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/biography_36/index.html\">\n                            \n                                Biography\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a 
href=\"category/books/thriller_37/index.html\">\n                            \n                                Thriller\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/contemporary_38/index.html\">\n                            \n                                Contemporary\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/spirituality_39/index.html\">\n                            \n                                Spirituality\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/academic_40/index.html\">\n                            \n                                Academic\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/self-help_41/index.html\">\n                            \n                                Self Help\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/historical_42/index.html\">\n                            \n                                Historical\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/christian_43/index.html\">\n                            \n                                
Christian\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/suspense_44/index.html\">\n                            \n                                Suspense\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/short-stories_45/index.html\">\n                            \n                                Short Stories\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/novels_46/index.html\">\n                            \n                                Novels\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/health_47/index.html\">\n                            \n                                Health\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/politics_48/index.html\">\n                            \n                                Politics\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/cultural_49/index.html\">\n                            \n                                Cultural\n                            \n                        </a>\n\n                        </li>\n                        \n  
              \n                    <li>\n                        <a href=\"category/books/erotica_50/index.html\">\n                            \n                                Erotica\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/crime_51/index.html\">\n                            \n                                Crime\n                            \n                        </a>\n\n                        </li>\n                        \n                            </ul></li>\n                        \n                \n            </ul>\n        </div>\n    \n    \n\n            </aside>\n\n            <div class=\"col-sm-8 col-md-9\">\n                \n                <div class=\"page-header action\">\n                    <h1>All products</h1>\n                </div>\n                \n\n                \n\n\n\n<div id=\"messages\">\n\n</div>\n\n\n                <div id=\"promotions\">\n                    \n                </div>\n\n                \n    <form method=\"get\" class=\"form-horizontal\">\n        \n        <div style=\"display:none\">\n            \n            \n        </div>\n\n        \n            \n                \n                    <strong>1000</strong> results - showing <strong>21</strong> to <strong>40</strong>.\n                \n            \n            \n        \n    </form>\n    \n        <section>\n            <div class=\"alert alert-warning\" role=\"alert\"><strong>Warning!</strong> This is a demo website for web scraping purposes. 
Prices and ratings here were randomly assigned and have no real meaning.</div>\n\n            <div>\n                <ol class=\"row\">\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"in-her-wake_980/index.html\"><img src=\"../media/cache/5d/72/5d72709c6a7a9584a4d1cf07648bfce1.jpg\" alt=\"In Her Wake\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"in-her-wake_980/index.html\" title=\"In Her Wake\">In Her Wake</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£12.84</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"how-music-works_979/index.html\"><img 
src=\"../media/cache/5c/c8/5cc8e107246cb478960d4f0aba1e1c8e.jpg\" alt=\"How Music Works\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Two\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"how-music-works_979/index.html\" title=\"How Music Works\">How Music Works</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£37.32</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"foolproof-preserving-a-guide-to-small-batch-jams-jellies-pickles-condiments-and-more-a-foolproof-guide-to-making-small-batch-jams-jellies-pickles-condiments-and-more_978/index.html\"><img src=\"../media/cache/9f/59/9f59f01fa916a7bb8f0b28a4012179a4.jpg\" alt=\"Foolproof Preserving: A Guide to Small Batch Jams, Jellies, Pickles, Condiments, and More: A Foolproof Guide to Making Small Batch Jams, Jellies, Pickles, Condiments, and More\" class=\"thumbnail\"></a>\n                    \n                \n        
    </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"foolproof-preserving-a-guide-to-small-batch-jams-jellies-pickles-condiments-and-more-a-foolproof-guide-to-making-small-batch-jams-jellies-pickles-condiments-and-more_978/index.html\" title=\"Foolproof Preserving: A Guide to Small Batch Jams, Jellies, Pickles, Condiments, and More: A Foolproof Guide to Making Small Batch Jams, Jellies, Pickles, Condiments, and More\">Foolproof Preserving: A Guide ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£30.52</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"chase-me-paris-nights-2_977/index.html\"><img src=\"../media/cache/9c/2e/9c2e0eb8866b8e3f3b768994fd3d1c1a.jpg\" alt=\"Chase Me (Paris Nights #2)\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i 
class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"chase-me-paris-nights-2_977/index.html\" title=\"Chase Me (Paris Nights #2)\">Chase Me (Paris Nights ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£25.27</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"black-dust_976/index.html\"><img src=\"../media/cache/44/cc/44ccc99c8f82c33d4f9d2afa4ef25787.jpg\" alt=\"Black Dust\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"black-dust_976/index.html\" title=\"Black Dust\">Black Dust</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n            
    \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£34.53</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"birdsong-a-story-in-pictures_975/index.html\"><img src=\"../media/cache/af/6e/af6e796160fe63e0cf19d44395c7ddf2.jpg\" alt=\"Birdsong: A Story in Pictures\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"birdsong-a-story-in-pictures_975/index.html\" title=\"Birdsong: A Story in Pictures\">Birdsong: A Story in ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£54.64</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                
\n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"americas-cradle-of-quarterbacks-western-pennsylvanias-football-factory-from-johnny-unitas-to-joe-montana_974/index.html\"><img src=\"../media/cache/ef/0b/ef0bed08de4e083dba5e20fdb98d9c36.jpg\" alt=\"America's Cradle of Quarterbacks: Western Pennsylvania's Football Factory from Johnny Unitas to Joe Montana\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"americas-cradle-of-quarterbacks-western-pennsylvanias-football-factory-from-johnny-unitas-to-joe-montana_974/index.html\" title=\"America's Cradle of Quarterbacks: Western Pennsylvania's Football Factory from Johnny Unitas to Joe Montana\">America's Cradle of Quarterbacks: ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£22.50</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n         
               <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"aladdin-and-his-wonderful-lamp_973/index.html\"><img src=\"../media/cache/d6/da/d6da0371958068bbaf39ea9c174275cd.jpg\" alt=\"Aladdin and His Wonderful Lamp\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"aladdin-and-his-wonderful-lamp_973/index.html\" title=\"Aladdin and His Wonderful Lamp\">Aladdin and His Wonderful ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£53.13</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"worlds-elsewhere-journeys-around-shakespeares-globe_972/index.html\"><img src=\"../media/cache/2e/98/2e98c332bf8563b584784971541c4445.jpg\" alt=\"Worlds 
Elsewhere: Journeys Around Shakespeare’s Globe\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"worlds-elsewhere-journeys-around-shakespeares-globe_972/index.html\" title=\"Worlds Elsewhere: Journeys Around Shakespeare’s Globe\">Worlds Elsewhere: Journeys Around ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£40.30</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"wall-and-piece_971/index.html\"><img src=\"../media/cache/a5/41/a5416b9646aaa7287baa287ec2590270.jpg\" alt=\"Wall and Piece\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n             
       <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"wall-and-piece_971/index.html\" title=\"Wall and Piece\">Wall and Piece</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£44.18</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"the-four-agreements-a-practical-guide-to-personal-freedom_970/index.html\"><img src=\"../media/cache/0f/7e/0f7ee69495c0df1d35723f012624a9f8.jpg\" alt=\"The Four Agreements: A Practical Guide to Personal Freedom\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"the-four-agreements-a-practical-guide-to-personal-freedom_970/index.html\" title=\"The Four Agreements: A Practical Guide to Personal Freedom\">The Four Agreements: A ...</a></h3>\n        \n\n        \n            <div 
class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£17.66</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"the-five-love-languages-how-to-express-heartfelt-commitment-to-your-mate_969/index.html\"><img src=\"../media/cache/38/c5/38c56fba316c07305643a8065269594e.jpg\" alt=\"The Five Love Languages: How to Express Heartfelt Commitment to Your Mate\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"the-five-love-languages-how-to-express-heartfelt-commitment-to-your-mate_969/index.html\" title=\"The Five Love Languages: How to Express Heartfelt Commitment to Your Mate\">The Five Love Languages: ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£31.05</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n         
       \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"the-elephant-tree_968/index.html\"><img src=\"../media/cache/5d/7e/5d7ecde8e81513eba8a64c9fe000744b.jpg\" alt=\"The Elephant Tree\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"the-elephant-tree_968/index.html\" title=\"The Elephant Tree\">The Elephant Tree</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£23.82</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div 
class=\"image_container\">\n                \n                    \n                    <a href=\"the-bear-and-the-piano_967/index.html\"><img src=\"../media/cache/cf/bb/cfbb5e62715c6d888fd07794c9bab5d6.jpg\" alt=\"The Bear and the Piano\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"the-bear-and-the-piano_967/index.html\" title=\"The Bear and the Piano\">The Bear and the ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£36.89</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"sophies-world_966/index.html\"><img src=\"../media/cache/65/71/6571919836ec51ed54f0050c31d8a0cd.jpg\" alt=\"Sophie's World\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i 
class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"sophies-world_966/index.html\" title=\"Sophie's World\">Sophie's World</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£15.94</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"penny-maybe_965/index.html\"><img src=\"../media/cache/12/53/1253c21c5ef3c6d075c5fa3f5fecee6a.jpg\" alt=\"Penny Maybe\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"penny-maybe_965/index.html\" title=\"Penny Maybe\">Penny Maybe</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n    
    <p class=\"price_color\">£33.29</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"maude-1883-1993she-grew-up-with-the-country_964/index.html\"><img src=\"../media/cache/f5/88/f5889d038f5d8e949b494d147c2dcf54.jpg\" alt=\"Maude (1883-1993):She Grew Up with the country\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Two\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"maude-1883-1993she-grew-up-with-the-country_964/index.html\" title=\"Maude (1883-1993):She Grew Up with the country\">Maude (1883-1993):She Grew Up ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£18.02</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to 
basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"in-a-dark-dark-wood_963/index.html\"><img src=\"../media/cache/23/85/238570a1c284e730dbc737a7e631ae2b.jpg\" alt=\"In a Dark, Dark Wood\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"in-a-dark-dark-wood_963/index.html\" title=\"In a Dark, Dark Wood\">In a Dark, Dark ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£19.63</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"behind-closed-doors_962/index.html\"><img 
src=\"../media/cache/e1/5c/e15c289ba58cea38519e1281e859f0c1.jpg\" alt=\"Behind Closed Doors\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"behind-closed-doors_962/index.html\" title=\"Behind Closed Doors\">Behind Closed Doors</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£52.22</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"you-cant-bury-them-all-poems_961/index.html\"><img src=\"../media/cache/e9/20/e9203b733126c4a0832a1c7885dc27cf.jpg\" alt=\"You can't bury them all: Poems\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Two\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i 
class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"you-cant-bury-them-all-poems_961/index.html\" title=\"You can't bury them all: Poems\">You can't bury them ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£33.63</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                </ol>\n                \n\n\n\n    <div>\n        <ul class=\"pager\">\n            \n                <li class=\"previous\"><a href=\"page-1.html\">previous</a></li>\n            \n            <li class=\"current\">\n            \n                Page 2 of 50\n            \n            </li>\n            \n                <li class=\"next\"><a href=\"page-3.html\">next</a></li>\n            \n        </ul>\n    </div>\n\n\n            </div>\n        </section>\n    \n\n\n            </div>\n\n        </div><!-- /row -->\n    </div><!-- /page_inner -->\n</div><!-- /container-fluid -->\n\n\n    \n<footer class=\"footer container-fluid\">\n    \n        \n    \n</footer>\n\n\n        \n        \n  \n            <!-- jQuery -->\n            <script src=\"http://ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js\"></script>\n            <script>window.jQuery || document.write('<script src=\"../static/oscar/js/jquery/jquery-1.9.1.min.js\"><\\/script>')</script><script src=\"../static/oscar/js/jquery/jquery-1.9.1.min.js\"></script>\n        \n  \n\n\n        \n        \n    \n        \n    <script 
type=\"text/javascript\" src=\"../static/oscar/js/bootstrap3/bootstrap.min.js\"></script>\n    <!-- Oscar -->\n    <script src=\"../static/oscar/js/oscar/ui.js\" type=\"text/javascript\" charset=\"utf-8\"></script>\n\n    <script src=\"../static/oscar/js/bootstrap-datetimepicker/bootstrap-datetimepicker.js\" type=\"text/javascript\" charset=\"utf-8\"></script>\n    <script src=\"../static/oscar/js/bootstrap-datetimepicker/locales/bootstrap-datetimepicker.all.js\" type=\"text/javascript\" charset=\"utf-8\"></script>\n\n\n        \n        \n    \n\n    \n\n\n        \n        <script type=\"text/javascript\">\n            $(function() {\n                \n    \n    \n    oscar.init();\n\n    oscar.search.init();\n\n            });\n        </script>\n\n        \n        <!-- Version: N/A -->\n        \n    \n\n</body></html>", "echoData": "https://books.toscrape.com/catalogue/page-2.html"}
```

### Improving response times

There are a few things you can try to improve the response time of your Zyte
API requests:

- Consider using HTTP requests instead of
  browser requests where possible. If you are only
  using browser requests to avoid bans, try using HTTP requests with a
  session context instead.
- When using [browserHtml](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/browserHtml), consider disabling
  [javascript](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/javascript) where possible.
- When using wait actions, avoid using the
  `waitForTimeout` action where possible. `waitForSelector` is best where
  feasible. Otherwise, `waitForResponse` or `waitForRequest` may work.
- The proxy mode offers lower overhead than the HTTP API, so it might be
  worth considering if its feature differences are
  not a problem for your use case.
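As a rough illustration of the first three tips, the sketch below builds request bodies as plain dictionaries. The helper names, the target URL and the `#results` selector are placeholders, and the shape of the `waitForSelector` action is assumed to mirror the selector objects used by other actions in the reference; check the reference before relying on it.

```python
def http_request(url):
    """Request body for a plain HTTP request (usually the fastest option)."""
    return {"url": url, "httpResponseBody": True}


def browser_request(url, wait_selector=None, javascript=None):
    """Request body for a browser request that avoids fixed-time waits."""
    body = {"url": url, "browserHtml": True}
    if javascript is not None:
        # Disabling JavaScript can shorten rendering when the page allows it.
        body["javascript"] = javascript
    if wait_selector is not None:
        # Assumed action shape: wait for an element instead of a fixed timeout.
        body["actions"] = [
            {
                "action": "waitForSelector",
                "selector": {"type": "css", "value": wait_selector},
            }
        ]
    return body


fast = http_request("https://example.com")
no_js = browser_request("https://example.com", javascript=False)
waiting = browser_request("https://example.com/search", wait_selector="#results")
```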

### Sorting requests

If you target multiple websites, consider sorting your requests to spread the
load. That is, if you target websites A, B, and C, do not send requests in
AAABBBCCC order, send them in ABCABCABC order instead.
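One possible way to produce the ABCABCABC order is a round-robin interleave of URLs grouped by host. The function name and the example hosts below are made up for illustration.

```python
from collections import defaultdict
from itertools import zip_longest
from urllib.parse import urlsplit


def interleave_by_site(urls):
    """Reorder URLs so that consecutive requests target different websites."""
    by_site = defaultdict(list)
    for url in urls:
        by_site[urlsplit(url).netloc].append(url)
    missing = object()  # Padding for websites with fewer URLs than others.
    return [
        url
        for group in zip_longest(*by_site.values(), fillvalue=missing)
        for url in group
        if url is not missing
    ]


urls = [
    "https://a.example/1", "https://a.example/2", "https://a.example/3",
    "https://b.example/1", "https://b.example/2", "https://b.example/3",
    "https://c.example/1", "https://c.example/2", "https://c.example/3",
]
ordered = interleave_by_site(urls)
```

Websites with fewer URLs simply drop out of the rotation once their URLs are exhausted.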

If you use Scrapy, on top of sorting your start requests
as described, you can change your `SCHEDULER_PRIORITY_QUEUE` to
`"scrapy.pqueues.DownloaderAwarePriorityQueue"`.

## Zyte API error handling

While using Zyte API, you may get the following types of responses:

- Successful responses
- Rate-limiting responses
- Unsuccessful responses

### Successful responses

Zyte API sends a successful response, i.e. a response with an HTTP status code
of 200, when that response provides the requested data, ban-free.

A Zyte API response is considered successful even in the following scenarios:

- The response from the target website is a bad response for a reason
  *other* than a ban.
- Some browser actions have failed.
- The webpage content does not match the
  specified automatic extraction property.
- The webpage content does not match what you
  get with an HTTP client program or library like `curl`.

#### Bad website responses

When a website sends a response with an HTTP status code other than 200, and
that response is not the result of a ban, Zyte API sends
that response to you.

For example, if you send a request to [https://toscrape.com/not-found](https://toscrape.com/not-found), you get a
successful response from Zyte API, where the value of the
[statusCode](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/statusCode) response field is `404`.
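Because such responses arrive with an HTTP 200 from Zyte API, your code needs to inspect the `statusCode` field itself. A minimal sketch, assuming the Zyte API response body has already been parsed into a dictionary (the function name is hypothetical):

```python
def is_bad_website_response(zyte_response):
    """Tell whether the target website answered with a non-200 status code."""
    status = zyte_response.get("statusCode")
    return status is not None and status != 200


not_found = {"url": "https://toscrape.com/not-found", "statusCode": 404}
ok = {"url": "https://toscrape.com/", "statusCode": 200}
```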

#### Browser action failures

Browser action failures, e.g. timeouts, or bad responses
received during action execution, e.g. after clicking a button, do not cause
Zyte API to send an unsuccessful response.

Zyte API returns your requested output (e.g. browser HTML, screenshot) the way
it was after all actions were executed or the time to run actions ran out, and
the Zyte API response includes an [actions](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/actions) field with details
about the outcome of each action.
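To detect such failures you could scan the `actions` response field. The sketch below assumes that each entry names its `action` and carries an `error` message only when it failed; that entry shape is an assumption here, so check the reference for the exact fields.

```python
def failed_actions(zyte_response):
    """Return the entries of the actions field that report an error."""
    return [
        entry
        for entry in zyte_response.get("actions", [])
        if entry.get("error")  # Assumed: only failed actions have an error.
    ]


# Hypothetical response fragment used for illustration only.
response = {
    "browserHtml": "<html></html>",
    "actions": [
        {"action": "type"},
        {"action": "click", "error": "Element not found"},
    ],
}
failures = failed_actions(response)
```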

#### Automatic extraction mismatches

Mismatches between the webpage content and the specified automatic
extraction request field do not cause Zyte API to
send an unsuccessful response.

Zyte API returns your requested output, including the `metadata.probability`
field that indicates the probability that the specified automatic extraction
property matches the webpage content.
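A small sketch of how you might act on that field, assuming the extracted item sits under a key named after the requested property (e.g. `product`) and that the `0.5` threshold, chosen arbitrarily here, suits your use case:

```python
def extraction_probability(zyte_response, data_type):
    """Return metadata.probability for the given data type, or None."""
    item = zyte_response.get(data_type) or {}
    return (item.get("metadata") or {}).get("probability")


def matches_data_type(zyte_response, data_type, threshold=0.5):
    """Tell whether the page likely matches the requested data type."""
    probability = extraction_probability(zyte_response, data_type)
    return probability is not None and probability >= threshold


# Hypothetical response fragment used for illustration only.
response = {"product": {"name": "Example", "metadata": {"probability": 0.12}}}
```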

#### HTTP client mismatches

Some websites might return a different response to a browser than they send to
a different type of HTTP client, like `curl`.

Zyte API aims to provide the same responses that a browser would get, i.e.
those responses you can see in the browser developer tools for network
monitoring.

If you specifically want the same response that a specific non-browser HTTP
client gets, you can try setting the User-Agent header accordingly. However, some websites can tell
browsers from non-browser HTTP clients, and since Zyte API aims to behave like
a browser, getting the same response as a non-browser HTTP client might not be
possible.

### Rate-limiting responses

> ###### NOTE
>
> You are not charged for
> rate-limiting responses.

> ###### TIP
>
> scrapy-zyte-api and
> python-zyte-api handle rate limiting
> automatically.

Zyte API may send a response with an HTTP status code of 429 or 503 for
rate-limiting purposes.

The right way to handle any rate-limiting response is to retry its
request as many times as needed until you get a
non-rate-limiting response.

Rate-limiting responses are sent in the following scenarios:

- You have exceeded your API key rate limit.
  ```json
  {"status": 429, "type": "/limits/over-user-limit"}
  ```

  When making efficient use of Zyte API, getting a
  small percentage of rate-limiting responses due to exceeding your API key
  rate limit is expected and normal.
- The global rate limit for the target website has been exceeded.
  ```json
  {"status": 429, "type": "/limits/over-domain-limit"}
  ```
- You have exceeded your account rate limit for the target website.
  ```json
  {"status": 429, "type": "/limits/over-org-domain-limit"}
  ```
- Zyte API automatic extraction is overloaded.
  ```json
  {"status": 503, "type": "/extractor/over-global-limit"}
  ```
- Zyte API is overloaded.
  ```json
  {"status": 503, "type": "/limits/over-global-limit"}
  ```
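For logging or metrics, the `type` values listed above can be mapped to a short description. This is only a sketch: the function name and the description strings are made up, and the input is assumed to be the parsed JSON body of the response.

```python
RATE_LIMIT_TYPES = {
    "/limits/over-user-limit": "API key rate limit exceeded",
    "/limits/over-domain-limit": "global rate limit for the website exceeded",
    "/limits/over-org-domain-limit": "account rate limit for the website exceeded",
    "/extractor/over-global-limit": "automatic extraction overloaded",
    "/limits/over-global-limit": "Zyte API overloaded",
}


def describe_rate_limit(problem):
    """Return a description for a rate-limiting response, or None otherwise."""
    if problem.get("status") not in (429, 503):
        return None
    return RATE_LIMIT_TYPES.get(problem.get("type"), "unrecognized rate limit")


reason = describe_rate_limit({"status": 429, "type": "/limits/over-user-limit"})
```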

> ###### SEE ALSO
>
> stats-rate-limiting at stats-api

### Unsuccessful responses

> ###### NOTE
>
> You are not charged for
> unsuccessful responses.

Zyte API sends an unsuccessful response, i.e. a response with an HTTP status
code of 400 or higher that is not a rate-limiting response, when Zyte API cannot provide the requested data.

Zyte API sends unsuccessful responses in the following scenarios:

- There has been a download error: a ban response,
  a permanent download error or a
  service error.
- Your request is invalid.
- Your account has been suspended.

#### Ban responses

> ###### TIP
>
> By default, scrapy-zyte-api and
> python-zyte-api automatically retry
> ban responses up to 3 times before giving up.

Zyte API sends an HTTP 520 response when a temporary error, usually a ban that could not be avoided in a timely fashion, prevents
downloading the requested URL.

```json
{"status": 520, "type": "/download/temporary-error"}
```

On certain websites, it is normal to get these responses sometimes. When you
do, retry your request until you get a successful response.

We closely monitor the success rate for the most popular websites, but less
popular websites may slip under our radar. If you get this response too often,
follow the advice in *Maximizing your success rate* to rule out issues in your request
parameters. If that does not help, [reach out to our expert anti-ban team](https://support.zyte.com/support/tickets/new).

#### Permanent download errors

Zyte API sends an HTTP 521 response when a permanent error prevents downloading
the requested URL.

```json
{"status": 521, "type": "/download/internal-error"}
```

You can wait for us to address the issue, or [ask to be notified when the issue
is resolved](https://support.zyte.com/support/tickets/new).

> ###### TIP
>
> For some websites, Zyte API may sometimes accidentally flag some ban
> responses as permanent download errors. If sending the same Zyte API
> request multiple times returns an HTTP 521 error only *sometimes*, you
> might want to treat HTTP 521 errors as HTTP 520 errors for the target
> website, i.e. retry them automatically, until we
> resolve your issue report.
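The tip above could be implemented with a per-website allowlist, as in this sketch. The function name and the `flaky.example` host are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical hosts where HTTP 521 has been seen only intermittently.
RETRY_521_HOSTS = {"flaky.example"}


def should_retry_download_error(status, url):
    """Decide whether a download error is worth retrying automatically."""
    if status == 520:
        return True
    if status == 521:
        # Treat 521 like 520 only for websites where it looks temporary.
        return urlsplit(url).hostname in RETRY_521_HOSTS
    return False
```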

#### Service errors

If Zyte API sends an HTTP 500 response, it means that the request took too long
or that there was an unexpected issue in Zyte API.

```json
{"status": 500, "type": "/server/timed-out"}
```

```json
{"status": 500, "type": "/server/internal"}
```

If the issue persists, feel free to [ask to be notified when the issue is
resolved](https://support.zyte.com/support/tickets/new).

#### Invalid requests

Zyte API may send a response with an HTTP status code of 400, 401, 421, 422 or 451
if there is an error in your request, including:

- You are using invalid parameters or parameter values.
  ```json
  {"status": 400, "type": "/request/invalid"}
  ```

- Your request body is invalid JSON.
  ```json
  {"status": 400, "type": "/request/invalid-json"}
  ```

- Your API key is not properly specified, e.g. missing or malformed.
  ```json
  {"status": 401, "type": "/auth/not-valid"}
  ```

- Your API key is unknown, e.g. it might be the wrong API key.
  ```json
  {"status": 401, "type": "/auth/key-not-found"}
  ```

- The domain you are trying to download is unreachable. Check the domain name and verify that the domain is valid before retrying.
  ```json
  {"status": 421, "type": "/website/domain-unreachable"}
  ```

- You are using incompatible parameters, such as mixing
  [browserHtml](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/browserHtml) and [httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpResponseBody).
  ```json
  {"status": 422, "type": "/request/unprocessable"}
  ```

- You are targeting a domain that Zyte API does not allow.
  ```json
  {"status": 451, "type": "/download/domain-forbidden"}
  ```

#### Account suspension

Zyte API sends an HTTP 403 response if your Zyte API account is suspended.

```json
{"status": 403, "type": "/auth/account-suspended"}
```

Causes of account suspension include:

- Reaching the end of your trial.

  Setting a spending limit lifts your account suspension immediately.
- Reaching your spending limit.

  Increasing your spending limit lifts your account suspension immediately.
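The status codes described in this section can be summarized in a single lookup. The sketch below is not part of any Zyte library; the function name and category labels are made up.

```python
RESPONSE_CATEGORIES = {
    200: "successful",
    429: "rate-limiting",
    503: "rate-limiting",
    520: "ban",
    521: "permanent download error",
    500: "service error",
    400: "invalid request",
    401: "invalid request",
    421: "invalid request",
    422: "invalid request",
    451: "invalid request",
    403: "account suspended",
}


def classify_response(status):
    """Map the HTTP status code of a Zyte API response to a category."""
    return RESPONSE_CATEGORIES.get(status, "unknown")
```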

### Retrying requests

> ###### TIP
>
> scrapy-zyte-api and
> python-zyte-api handle retries for
> rate limiting and ban responses automatically.

You should automatically retry requests that get a rate-limiting or ban response.

When retrying requests automatically, please use an exponential backoff
algorithm: wait for some random time before every retry, and use an
exponentially longer time the more retries you have used for any given request.

For rate-limiting responses, you should retry forever, but use generous retry
times. For unsuccessful responses, you can use lower retry times, but you
should cap the number of retries per request, to prevent an infinite loop
from causing your code to hang.

These are some example ranges of random wait times for different scenarios:

| Retry   | Rate-limiting responses   | Unsuccessful responses   |
|---------|---------------------------|--------------------------|
| 1st     | 20-40 seconds             | 3-9 seconds              |
| 2nd     | 20-40 seconds             | 3-11 seconds             |
| 3rd     | 30-38 seconds             | 3-15 seconds             |
| 4th     | 30-46 seconds             | 3-23 seconds             |
| 5th     | 30-62 seconds             | 3-39 seconds             |
| 6th     | 30-94 seconds             | 3-62 seconds             |
| 7th     | 30-158 seconds            | 3-62 seconds             |
| 8th     | 30-286 seconds            | 3-62 seconds             |
| 9th     | 30-542 seconds            | 3-62 seconds             |
| 10th+   | 30-630 seconds            | 3-62 seconds             |
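A minimal sketch of this retry policy follows. It assumes a caller-supplied `send` callable (hypothetical) that performs one request and returns the HTTP status code of the Zyte API response. The wait-time parameters are illustrative and do not reproduce the table exactly.

```python
import random
import time


def wait_time(retry, minimum, first_spread, maximum):
    """Random wait in seconds whose upper bound doubles with each retry."""
    spread = first_spread * 2 ** min(retry - 1, 10)
    return random.uniform(minimum, min(maximum, minimum + spread))


def send_with_retries(send, max_ban_retries=3, sleep=time.sleep):
    """Call send() until it returns a status that should not be retried."""
    rate_limit_retries = 0
    ban_retries = 0
    while True:
        status = send()
        if status in (429, 503):
            # Rate limiting: retry forever, with generous wait times.
            rate_limit_retries += 1
            sleep(wait_time(rate_limit_retries, 20, 20, 630))
        elif status == 520 and ban_retries < max_ban_retries:
            # Ban response: shorter waits, but a capped number of retries.
            ban_retries += 1
            sleep(wait_time(ban_retries, 3, 6, 62))
        else:
            return status
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.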

### Ban handling

A banned response is a response from a website that is different from the
response anyone would get in a browser.

Zyte API handles banned responses automatically and transparently where
possible, so that you never get a banned response.

For a given request, if Zyte API cannot avoid a ban in a reasonable time, Zyte
API sends you a ban response, for which you
are not charged. You can then retry your
request as many times as needed until Zyte API succeeds.

We monitor and proactively work on improving the success rate and response
times of Zyte API for the most popular websites, but less popular websites may
slip under our radar. If you encounter too many bans, please [reach out to our
expert anti-ban team](https://support.zyte.com/support/tickets/new).

If you ever get a successful Zyte API response that you believe is the result of a ban,
please [report it to our expert anti-ban team](https://support.zyte.com/support/tickets/new).

Zyte API uses many different techniques to avoid bans. However, Zyte API does
*not* log into websites automatically. Zyte API cannot automatically get you
data that is *always* locked behind a user login.

> ###### SEE ALSO
>
> zapi-permissions-control

#### Maximizing your success rate

Some request parameters can lower the success rate of Zyte API on some
websites. To maximize your success rate, i.e. minimize the rate of
ban responses that you get:

- **Ensure your URL is valid**

  Make sure your URL works when you open it in a browser in incognito
  mode. Mind that some URLs may stop working after their website changes.

  Also ensure that [query string](https://en.wikipedia.org/wiki/Query_string) parameters, parameter order and values
  match what you get when you access that webpage manually from a browser.

  For complex URLs, you can alternatively use a browser request with actions to get to the target URL
  from a simpler URL.

- **Do not set a Referer header**

  It is often best not to set any value for the `Referer` request
  header, unless you are building an API
  request that expects it.

  If you set the header in the past because it improved your success
  rate, but your success rate has since dropped, see if removing the header
  makes a difference.

- **Set headers for API requests**

  When targeting API endpoints, set the right request headers and cookies, i.e. those
  your browser sets when sending the same request.

  Mind that some of those values, such as session cookies or [CSRF](https://en.wikipedia.org/wiki/Cross-site_request_forgery) tokens,
  might expire with time or need to be read from responses to earlier
  requests. You may also need sessions to maximize
  your success rate in request chains.

  Alternatively, consider using a browser request
  instead. You can use actions if needed to trigger
  specific API requests, and either read the result from the browser
  HTML, if the API response data is loaded onto the
  webpage, or read the actual API response with network capture.

## Zyte API reference documentation

This is the complete reference documentation of the HTTP API of Zyte API.
For topic-based usage documentation, see zapi-usage.
All requests require [basic authentication](https://datatracker.ietf.org/doc/html/rfc7617#section-2).
Use your [Zyte API key](https://app.zyte.com/o/zyte-api/api-access) as username, and no password.
For example, if your Zyte API key is `foo`, base64-encode `foo:` as `Zm9vOg==`
and send the `Authorization` header with value `Basic Zm9vOg==`:
```none
Authorization: Basic Zm9vOg==
```
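If your HTTP client does not build this header for you, it can be computed with the standard library, as in this short sketch (the function name is made up):

```python
import base64


def basic_auth_header(api_key):
    """Build the Authorization header value for an API key and no password."""
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


header = basic_auth_header("foo")
```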

```yaml
openapi: 3.0.3
info:
  title: Web Data Extraction API
  version: 1.0.0
  description: A single API for web scraping
  contact:
    name: Zyte (Formerly Scrapinghub)
    url: https://www.zyte.com
servers:
- url: https://api.zyte.com/v1
  description: Zyte Extraction API Production server
security:
- BasicAuth: []

paths:
  /extract:
    post:
      operationId: extract
      summary: Process a single URL, return the result
      description: |
        Process a single URL, return the result.

        This endpoint blocks until the result is ready.
        It is intended for short-running operations.

        At least one of the following request fields must be set to true:
          - browserHtml
          - httpResponseBody
          - httpResponseHeaders
          - screenshot
          - An automatic extraction request field:
              - article
              - articleList
              - articleNavigation
              - forumThread
              - jobPosting
              - jobPostingNavigation
              - pageContent
              - product
              - productList
              - productNavigation
              - serp

        All automatic extraction data types support performing extraction using
        either a browser request or an HTTP request. Choose which using the
        corresponding `extractFrom` option, e.g.
        productOptions.extractFrom
        when extracting a product.

        When no option is specified, currently automatic extraction defaults to
        using a browser request, except for
        serp,
        where an HTTP request is used by default instead. In the future,
        however, the default value may depend on the target website.

        When automatic extraction uses a browser request, it can be combined
        with any fields compatible with
        browserHtml,
        e.g. screenshot.
        When automatic extraction uses an HTTP request, it can be combined with
        any fields compatible with
        httpResponseBody.
        serp
        cannot be combined with any other fields besides
        serpOptions and
        url.

        You cannot combine multiple automatic extraction request fields (e.g.
        product
        and productList)
        on the same request.

        You cannot combine
        httpResponseBody
        with a request field that is exclusive of browser requests (e.g.
        httpResponseBody
        and
        browserHtml).

        httpResponseHeaders
        can be requested alone or with any other valid combination of request
        fields except for
        serp.

        The request body size limit is 5MiB.
      requestBody:
        required: true
        description: An extraction request body
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ExtractRequest'
            examples:
              DownloadHttp:
                summary: Retrieve raw HTTP content from a page
                value:
                  url: https://example.com
                  httpResponseBody: true
              DownloadCustomHttpRequestHeaders:
                summary: Retrieve raw HTTP content from a page, using custom HTTP headers
                value:
                  url: https://example.com
                  httpResponseBody: true
                  customHttpRequestHeaders:
                  - name: X-APOLLO-OPERATION-NAME
                    value: nearByNodes
              DownloadHttpPostWithHttpRequestBody:
                summary: Retrieve raw HTTP content from a page using a POST request
                value:
                  url: https://example.com
                  httpResponseBody: true
                  httpRequestMethod: POST
                  httpRequestBody: WyJCV0kiLCAiRkxMIl0=
                  customHttpRequestHeaders:
                  - name: Content-Type
                    value: application/json
              DownloadHttpPostWithHttpRequestText:
                summary: Retrieve raw HTTP content from a page using a POST request
                value:
                  url: https://example.com
                  httpResponseBody: true
                  httpRequestMethod: POST
                  httpRequestText: '{"name":"John Doe","email":"johndoe@example.com","age":32,"address":{"street":"123 Main St","city":"Anytown","state":"CA","zip":"12345"},"phone_numbers":["+1 555 555 1212","+1 555 555 1313"]}'
                  customHttpRequestHeaders:
                  - name: Content-Type
                    value: application/json
              DownloadHttpHeaders:
                summary: Retrieve HTTP headers from a page
                value:
                  url: https://example.com
                  httpResponseHeaders: true
              DownloadHtml:
                summary: Open a page in a browser, return HTML
                value:
                  url: https://example.com
                  browserHtml: true
              DownloadHtmlWithRequestHeaders:
                summary: Open a page in a browser and return HTML, setting a proper Referer header
                value:
                  url: https://example.com
                  browserHtml: true
                  requestHeaders:
                    referer: https://www.google.com
              DownloadScreenshot:
                summary: |
                  Open a page in a browser and return a JPEG screenshot
                  of the content visible on the browser window
                value:
                  url: https://example.com
                  screenshot: true
              DownloadScreenshotFullPagePng:
                summary: |
                  Open a page in a browser and return a full-page PNG
                  screenshot
                value:
                  url: https://example.com
                  screenshot: true
                  screenshotOptions:
                    fullPage: true
                    format: png
              DownloadHtmlEchoData:
                summary: echoData and jobId example
                description: |
                  Open a page in a browser, return HTML.

                  Pass the echoData and jobId fields through - they'll be
                  returned unchanged in the output.
                value:
                  url: https://example.com/foo
                  browserHtml: true
                  echoData:
                    seedUrl: https://example.com
                    foo: bar
                  jobId: 123/234/12
              DownloadHtmlActions:
                summary: actions example
                description: |
                  1. Open the target page in a browser
                  2. Type "Zyte" in the search box
                  3. Click the Search button
                  4. Wait for the results page to load
                value:
                  url: https://example.com/search
                  browserHtml: true
                  actions:
                  - action: type
                    selector:
                      value: '#searchbox'
                      type: css
                    text: Zyte
                  - action: click
                    selector:
                      value: '#searchbtn'
                      type: css
              ExtractProduct:
                summary: Extract Product information
                description: |
                  Extract Product information from a page:
                  price, name, etc.
                value:
                  url: https://example.com/foo
                  product: true
              ExtractProductWithHtml:
                summary: Extract Product information, as well as browser HTML
                description: |
                  Extract Product information, as well as browser HTML.
                  Make a request from Spanish geolocation.
                value:
                  url: https://example.com/foo
                  product: true
                  browserHtml: true
                  geolocation: ES
              ExtractProductRaw:
                summary: Extract Product information using an HTTP request
                description: |
                  Extract Product information using an HTTP request.
                value:
                  url: https://example.com/foo
                  product: true
                  productOptions:
                    extractFrom: httpResponseBody
              ExtractProductRawWithBody:
                summary: Extract Product information, as well as httpResponseBody
                description: |
                  Extract Product information using an HTTP request,
                  as well as httpResponseBody and httpResponseHeaders.
                  Make a request from Spain.
                value:
                  url: https://example.com/foo
                  product: true
                  productOptions:
                    extractFrom: httpResponseBody
                  httpResponseBody: true
                  httpResponseHeaders: true
                  geolocation: ES
              ExtractArticleCustomAttributes:
                summary: Extract Custom Attributes along with Article information
                description: |
                  Extract Custom Attributes along with Article information.
                value:
                  url: https://example.com/foo
                  article: true
                  customAttributes:
                    summary:
                      type: string
                      description: A two sentence article summary
                    article_sentiment:
                      type: string
                      enum:
                      - positive
                      - negative
                      - neutral
      responses:
        '200':
          description: Successful response. Contains the output requested.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Response200'
        '400':
          description: |
            Malformed request. See the error details to identify the exact problem in the request.
          content:
            application/problem+json:
              schema:
                $ref: '#/components/schemas/Problem'
              examples:
                Unrecognized field:
                  value:
                    type: /request/invalid
                    title: Bad Request
                    status: 400
                    detail: >-
                      Unrecognized field "foo"
                Invalid JSON value:
                  value:
                    type: /request/invalid-json
                    title: Invalid JSON
                    status: 400
                    detail: >-
                      The submitted request body is not a valid JSON. Location: line 2, column 26. Details: Unrecognized token 'False': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')
        '401':
          headers:
            WWW-Authenticate:
              schema:
                type: string
          description: |
            Authentication problem. See the error details.
          content:
            application/problem+json:
              schema:
                $ref: '#/components/schemas/Problem'
              examples:
                Invalid authentication data:
                  value:
                    type: /auth/not-valid
                    title: Authentication Info Invalid
                    status: 401
                    detail: >-
                      No valid authentication info found in the request. Check the documentation for the correct authentication schema.
                Invalid API key:
                  value:
                    type: /auth/key-not-found
                    title: Authentication Key Not Found
                    status: 401
                    detail: >-
                      The authentication key is not valid or can't be matched.
        '403':
          description: |
            Your account is suspended or not allowed to make the request.
          content:
            application/problem+json:
              schema:
                $ref: '#/components/schemas/Problem'
              examples:
                Account suspended:
                  value:
                    type: /auth/account-suspended
                    title: Account Suspended
                    status: 403
                    detail: >-
                      Account is suspended, check billing details.
        '421':
          description: |
            The request failed and should not be retried
          content:
            application/problem+json:
              schema:
                $ref: '#/components/schemas/Problem'
              examples:
                Domain unreachable:
                  value:
                    type: /website/domain-unreachable
                    title: Domain Unreachable
                    status: 421
                    detail: >-
                      The domain is invalid or unreachable. Please check the domain name and try again. Verify the domain name and ensure it is registered and valid before restarting the crawl.
        '422':
          description: |
            The request couldn't be processed. Check the details.
          content:
            application/problem+json:
              schema:
                $ref: '#/components/schemas/Problem'
              examples:
                Incompatible parameters:
                  value:
                    type: /request/unprocessable
                    title: Unprocessable Request
                    status: 422
                    detail: >-
                      Incompatible parameters were found in the request. Check details
        '429':
          headers:
            Retry-After:
              schema:
                type: integer
                format: int32
                minimum: 0
          description: |
            Too many requests, see the details.
          content:
            application/problem+json:
              schema:
                $ref: '#/components/schemas/Problem'
              examples:
                Domain limit:
                  value:
                    type: /limits/over-domain-limit
                    title: Over Domain Requests Limit
                    status: 429
                    detail: >-
                      Too many requests. Retry in N seconds from 'Retry-After' header.
                User limit:
                  value:
                    type: /limits/over-user-limit
                    title: Over User Requests Limit
                    status: 429
                    detail: >-
                      Too many requests to a specific domain. Retry in N seconds from 'Retry-After' header.
                Organisation limit:
                  value:
                    type: /limits/over-org-domain-limit
                    title: Over Organisation Requests limit for the requested domain
                    status: 429
                    detail: >-
                      Too many requests to a specific domain. Retry in N seconds from 'Retry-After' header.
        '451':
          description: |
            Extraction for the domain is forbidden.
          content:
            application/problem+json:
              schema:
                $ref: '#/components/schemas/ForbiddenDomainProblem'
              examples:
                Domain forbidden:
                  value:
                    type: /download/domain-forbidden
                    title: Domain Forbidden
                    status: 451
                    detail: >-
                      Extraction for the domain is forbidden.
                    blockedDomain: blocked-domain.example
        '500':
          description: |
            Request timeout or internal server error.
          content:
            application/problem+json:
              schema:
                $ref: '#/components/schemas/Problem'
              examples:
                Internal error:
                  value:
                    type: /server/internal
                    title: Internal Server Error
                    status: 500
                    detail: >-
                      The server encountered an internal error. Please contact support or wait for us to resolve the issue.
                Timeout:
                  value:
                    type: /server/timed-out
                    title: Request Timed Out
                    status: 500
                    detail: >-
                      The request took too long and timed out. Try it again. Contact support if it fails consistently.
        '503':
          description: |
            System overload. See the details.
          headers:
            Retry-After:
              schema:
                type: integer
                format: int32
                minimum: 0
          content:
            application/problem+json:
              schema:
                $ref: '#/components/schemas/Problem'
              examples:
                Global request limit:
                  value:
                    type: /limits/over-global-limit
                    title: Global Requests Limit Reached
                    status: 503
                    detail: >-
                      Too many requests to the service. Retry in N seconds from 'Retry-After' header.
        '520':
          description: |
            A downloading error, possibly requiring user action.
          headers:
            Retry-After:
              schema:
                type: integer
                format: int32
                minimum: 0
          content:
            application/problem+json:
              schema:
                $ref: '#/components/schemas/Problem'
              examples:
                Website ban:
                  value:
                    type: /download/temporary-error
                    title: Website Ban
                    status: 520
                    detail: >-
                      Zyte API could not get a ban-free response in a reasonable time. See https://docs.zyte.com/zyte-api/usage/errors.html#ban-responses
        '521':
          description: |
            Permanent downloading error.
          content:
            application/problem+json:
              schema:
                $ref: '#/components/schemas/Problem'
              examples:
                Internal download error:
                  value:
                    type: /download/internal-error
                    title: Internal Downloading Error
                    status: 521
                    detail: >-
                      Server encountered a problem while downloading. Check request and contact support.
        default:
          description: |
            Error. Check the status code and the problem object for additional
            information.
            Note: Clients should be prepared for responses without a problem
            object. In that case, rely on the HTTP status code.
          content:
            application/problem+json:
              schema:
                $ref: '#/components/schemas/Problem'

components:
  securitySchemes:
    BasicAuth:
      type: http
      scheme: basic

  schemas:

    HTTPHeader:
      type: object
      description: A header name and value.
      required:
      - name
      - value
      properties:
        name:
          type: string
          description: The name of the header
          minLength: 1
        value:
          type: string
          description: The value of the header
      example:
        name: Content-Type
        value: text/html; charset=utf-8

    ExtractRequest:
      type: object
      required:
      - url
      properties:
        url:
          description: |
            An absolute URL to extract data from.

            The host name must be a domain name; it cannot be an IP address.
          example: https://example.com/item-page
          type: string
          maxLength: 8192
        requestHeaders:
          $ref: '#/components/schemas/RequestHeaders'
        tags:
          type: object
          nullable: true
          description: |
            Assign arbitrary key-value pairs to the request that you can use
            for filtering in the
            [Stats API](/zyte-api/usage/stats.md).

            Keys must be strings. Values must be strings or `null`.

            For example: `{"tags": {"foo": "bar", "baz": null}}`.
          additionalProperties:
            type: string
            nullable: true
        ipType:
          description: |
            [Type of IP address](/zyte-api/usage/features.md)
            from which the request should be sent.

            If not specified, Zyte API will use an IP type that, for the target
            website, does not cause bans or unexpected response data.

            If you believe Zyte API is using the wrong default IP type for a
            website, please
            [reach out to our expert anti-ban team](https://support.zyte.com/support/tickets/new).
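
            A minimal sketch of a request that sets this field explicitly,
            assuming a hypothetical target URL and an HTTP response body
            output:

            ```json
            {
              "url": "https://example.com/item-page",
              "httpResponseBody": true,
              "ipType": "residential"
            }
            ```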

            [See an example](/zyte-api/usage/features.md).
          type: string
          enum:
          - datacenter
          - residential
        httpRequestMethod:
          description: |
            Request [HTTP method](https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods).

            Can only be used in combination with
            httpResponseBody.
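
            As an illustration, a request body that overrides the HTTP method
            could look like the following sketch. The URL is a hypothetical
            placeholder.

            ```json
            {
              "url": "https://example.com/item-page",
              "httpResponseBody": true,
              "httpRequestMethod": "POST"
            }
            ```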

            [See an example](/zyte-api/usage/http.md).
            See also:
            httpRequestText,
            httpRequestBody,
            customHttpRequestHeaders,
            httpResponseHeaders.
          type: string
          enum:
          - GET
          - POST
          - PUT
          - DELETE
          - OPTIONS
          - TRACE
          - PATCH
          - HEAD
        httpRequestBody:
          description: |
            [Base64](https://en.wikipedia.org/wiki/Base64)-encoded data to send
            as request body.

            Can only be used in combination with
            httpResponseBody.

            It usually needs to be used in combination with
            httpRequestMethod.

            If you only need to send UTF-8-encoded text, use
            httpRequestText
            instead to skip Base64-encoding. Note that you cannot combine both
            fields on the same request.
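
            A minimal sketch, assuming a hypothetical URL and a request body
            made of the three bytes `foo`, whose Base64 encoding is `Zm9v`:

            ```json
            {
              "url": "https://example.com/item-page",
              "httpResponseBody": true,
              "httpRequestMethod": "POST",
              "httpRequestBody": "Zm9v"
            }
            ```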

            [See an example](/zyte-api/usage/http.md).
            See also:
            customHttpRequestHeaders.
          type: string
          format: byte
          maxLength: 400000
        httpRequestText:
          description: |
            UTF-8 text to send as request body.

            Can only be used in combination with
            httpResponseBody.

            It usually needs to be used in combination with
            httpRequestMethod.

            If you need to send a binary or non-UTF-8 request body,
            use
            httpRequestBody
            instead. Note that you cannot combine both fields on the same
            request.
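
            A minimal sketch of sending a JSON payload as text, assuming a
            hypothetical URL and payload. The `Content-Type` header is set
            here on the assumption that the target website expects it.

            ```json
            {
              "url": "https://example.com/item-page",
              "httpResponseBody": true,
              "httpRequestMethod": "POST",
              "httpRequestText": "{\"foo\": \"bar\"}",
              "customHttpRequestHeaders": [
                {"name": "Content-Type", "value": "application/json"}
              ]
            }
            ```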

            [See an example](/zyte-api/usage/http.md).
            See also:
            customHttpRequestHeaders.
          type: string
          minLength: 1
          maxLength: 400000
          example: '{"name":"John Doe","email":"johndoe@example.com","age":32,"address":{"street":"123 Main St","city":"Anytown","state":"CA","zip":"12345"},"phone_numbers":["+1 555 555 1212","+1 555 555 1313"]}'
        customHttpRequestHeaders:
          description: |
            HTTP request headers.

            Can only be used in combination with
            httpResponseBody.
            To set headers with other outputs, see
            requestHeaders.
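
            For instance, a request that sends a single custom header could be
            sketched as follows. The URL and the header are illustrative
            choices.

            ```json
            {
              "url": "https://example.com/item-page",
              "httpResponseBody": true,
              "customHttpRequestHeaders": [
                {"name": "Accept-Language", "value": "fr"}
              ]
            }
            ```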

            Setting HTTP request headers has some caveats:

            -   Zyte API sends some headers automatically for
                [ban avoidance](/zyte-api/usage/errors.md),
                and may silently override or drop some of your custom headers
                for that purpose.

                However, your custom headers may override those automatic
                headers, and in doing so they can break the ban avoidance
                capabilities of Zyte API, as some websites may ban based on the
                presence, values, or order of certain headers.

            -   You cannot set the `Cookie` header. Use
                requestCookies
                instead.

            -   If you set multiple headers with the same name, only the last
                header value will be sent. To overcome this limitation, [join
                the header values with a comma into a single header value](https://stackoverflow.com/a/4371395).
                For example, replace `"customHttpRequestHeaders": [{"name":
                "foo", "value": "bar"}, {"name": "foo", "value": "baz"}]` with
                `"customHttpRequestHeaders": [{"name": "foo", "value":
                "bar,baz"}]`.

            [See an example](/zyte-api/usage/http.md).
            See also:
            httpRequestMethod,
            httpRequestText,
            httpRequestBody,
            httpResponseHeaders.
          type: array
          maxItems: 200
          items:
            $ref: '#/components/schemas/CustomHttpRequestHeader'
        httpResponseBody:
          description: |
            Set to `true` to get the HTTP response body in the
            httpResponseBody
            response field.

            This field is not compatible with
            [browser automation](/zyte-api/usage/browser.md).

            [See an example](/zyte-api/usage/http.md).
            See also:
            httpRequestMethod,
            httpRequestText,
            httpRequestBody,
            customHttpRequestHeaders.
          type: boolean
          default: false
        httpResponseHeaders:
          description: |
            Set to `true` to get the HTTP response headers in the
            httpResponseHeaders
            response field.

            [See an example](/zyte-api/usage/features.md).
            See also:
            customHttpRequestHeaders,
            requestHeaders.
          type: boolean
          default: false
        browserHtml:
          description: |
            Set to `true` to get the
            [browser HTML](/zyte-api/usage/browser.md)
            in the
            browserHtml
            response field.

            This field is not compatible with
            [HTTP requests](/zyte-api/usage/http.md).

            If you use
            actions,
            the browser HTML is generated *after* action execution has finished
            or timed out.

            By default,
            [iframes](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe)
            are empty. See
            includeIframes.

            To access content from the
            [shadow DOM](https://developer.mozilla.org/en-US/docs/Web/Web_Components/Using_shadow_DOM),
            check out the corresponding example in the
            [actions documentation](/zyte-api/usage/browser.md).

            [See an example](/zyte-api/usage/browser.md).
            See also:
            screenshot,
            requestHeaders.
          type: boolean
          default: false

        screenshot:
          description: |
            Set to `true` to get a page screenshot in the
            screenshot
            response field.

            This field is not compatible with
            [HTTP requests](/zyte-api/usage/http.md).

            To adjust the screenshot contents you can use
            screenshotOptions
            and
            viewport.

            If you use
            actions,
            the screenshot is generated *after* action execution has finished
            or timed out.

            [See an example](/zyte-api/usage/browser.md).
            See also:
            browserHtml,
            requestHeaders.
          type: boolean
          default: false
        screenshotOptions:
          $ref: '#/components/schemas/ScreenshotOptions'

        article:
          description: |
            Set to `true` to get article data in the
            article
            response field.

            The target page should only contain a single article, such as a
            blog post or a news article. For pages with multiple articles
            consider using
            articleList
            instead.

            To combine this field with
            [HTTP requests](/zyte-api/usage/http.md),
            set
            articleOptions.extractFrom
            to `"httpResponseBody"`.
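
            The dotted name above refers to a nested key. A minimal sketch,
            assuming a hypothetical article URL:

            ```json
            {
              "url": "https://example.com/blog/some-post",
              "article": true,
              "articleOptions": {
                "extractFrom": "httpResponseBody"
              }
            }
            ```

            The other automatic extraction fields follow the same pattern with
            their own options field.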

            If you use
            actions,
            data extraction happens *after* action execution has finished or
            timed out.

            See also:
            [List of all automatic extraction request fields](/zyte-api/usage/extract.md),
            articleNavigation,
            browserHtml,
            screenshot,
            requestHeaders.
          default: false
          type: boolean

        articleOptions:
          description: |
            Additional options for article extraction.
          $ref: '#/components/schemas/ExtractionOptions'

        articleList:
          description: |
            Set to `true` to get article list data in the
            articleList
            response field.

            The target page should contain multiple articles, usually as
            links or short snippets. Examples of such pages are main or
            category pages of news sites, main pages of blogs showing multiple
            posts, and other pages with multiple articles.

            Article list data is especially useful to get basic information
            about articles on a website, like a headline and a link to the
            article details, using a smaller number of requests, when article
            attributes are extracted directly from an article list page, without
            making individual
            article
            requests.

            To implement article crawling from article list pages, use
            articleNavigation,
            which also enables navigation through pagination links.

            To combine this field with
            [HTTP requests](/zyte-api/usage/http.md),
            set
            articleListOptions.extractFrom
            to `"httpResponseBody"`.

            If you use
            actions,
            data extraction happens *after* action execution has finished or
            timed out.

            See also:
            [List of all automatic extraction request fields](/zyte-api/usage/extract.md),
            browserHtml,
            screenshot,
            requestHeaders.
          type: boolean
          default: false

        articleListOptions:
          description: |
            Additional options for articleList extraction.
          $ref: '#/components/schemas/ExtractionOptions'

        articleNavigation:
          description: |
            Set to `true` to get article navigation data in the
            articleNavigation
            response field.

            The target page should contain multiple articles and/or
            subcategories that can be followed.

            Article navigation data is especially useful for implementing
            article crawling, i.e. following links to article pages, as well as
            to subcategories and pagination that can in turn link to more
            article pages.

            Article navigation data can also be used to get basic information
            about articles and subcategories on a website, obtaining the URLs and
            link names of the articles and subcategories, without making
            individual requests for those articles.

            To combine this field with
            [HTTP requests](/zyte-api/usage/http.md),
            set
            articleNavigationOptions.extractFrom
            to `"httpResponseBody"`.

            If you use
            actions,
            data extraction happens *after* action execution has finished or
            timed out.

            See also:
            [List of all automatic extraction request fields](/zyte-api/usage/extract.md),
            article,
            articleList,
            browserHtml,
            screenshot,
            requestHeaders.
          type: boolean
          default: false

        articleNavigationOptions:
          description: |
            Additional options for articleNavigation extraction.
          $ref: '#/components/schemas/ExtractionOptions'

        forumThread:
          description: |
            Set to `true` to get forum thread data in the
            forumThread
            response field.

            The target page should be an individual forum thread page on a forum website.

            To combine this field with
            [HTTP requests](/zyte-api/usage/http.md),
            set
            forumThreadOptions.extractFrom
            to `"httpResponseBody"`.

            If you use
            actions,
            data extraction happens *after* action execution has finished or
            timed out.

            See also:
            [List of all automatic extraction request fields](/zyte-api/usage/extract.md),
            article,
            browserHtml,
            screenshot,
            requestHeaders.
          type: boolean
          default: false

        forumThreadOptions:
          description: |
            Additional options for forumThread extraction.
          $ref: '#/components/schemas/ExtractionOptions'

        jobPosting:
          description: |
            Set to `true` to get job posting data in the
            jobPosting
            response field.

            The target page should be an individual job posting page on a company website or a job website.

            To combine this field with
            [HTTP requests](/zyte-api/usage/http.md),
            set
            jobPostingOptions.extractFrom
            to `"httpResponseBody"`.

            If you use
            actions,
            data extraction happens *after* action execution has finished or
            timed out.

            See also:
            [List of all automatic extraction request fields](/zyte-api/usage/extract.md),
            browserHtml,
            screenshot,
            requestHeaders.
          type: boolean
          default: false

        jobPostingOptions:
          description: |
            Additional options for jobPosting extraction.
          $ref: '#/components/schemas/ExtractionOptions'

        jobPostingNavigation:
          description: |
            Set to `true` to get job posting navigation data in the
            jobPostingNavigation
            response field.

            The target page should contain multiple job postings and/or
            subcategories that can be followed.

            Job posting navigation data is especially useful for implementing
            job posting crawling, i.e. following links to job posting pages, as well as
            pagination that can in turn link to more
            job posting pages.

            Job posting navigation data can also be used to get basic information
            about job postings on a website, obtaining the URLs and
            link names of the job postings, without making
            individual requests for them.

            To combine this field with
            [HTTP requests](/zyte-api/usage/http.md),
            set
            jobPostingNavigationOptions.extractFrom
            to `"httpResponseBody"`.

            If you use
            actions,
            data extraction happens *after* action execution has finished or
            timed out.

            See also:
            [List of all automatic extraction request fields](/zyte-api/usage/extract.md),
            jobPosting,
            browserHtml,
            screenshot,
            requestHeaders.
          type: boolean
          default: false

        jobPostingNavigationOptions:
          description: |
            Additional options for jobPostingNavigation extraction.
          $ref: '#/components/schemas/ExtractionOptions'

        pageContent:
          description: |
            Set to `true` to get page content data in the
            pageContent
            response field.

            The target page can contain any type of data.

            Page content data is especially useful for understanding the layout and
            hierarchy of information on a page, enabling advanced processing such as
            content extraction, user experience analysis, and automated page summarization.

            Page content data can also be used to capture the main content intended
            for users, along with auxiliary navigation components such as headers, footers,
            sidebars, and pagination controls. This makes it possible to distinguish core
            content from supporting links used for site-wide navigation.

            To combine this field with
            [HTTP requests](/zyte-api/usage/http.md),
            set
            pageContentOptions.extractFrom
            to `"httpResponseBody"`.

            If you use
            actions,
            data extraction happens *after* action execution has finished or
            timed out.

            See also:
            [List of all automatic extraction request fields](/zyte-api/usage/extract.md),
            browserHtml,
            screenshot,
            requestHeaders.
          type: boolean
          default: false

        pageContentOptions:
          description: |
            Additional options for pageContent extraction.
          $ref: '#/components/schemas/ExtractionOptions'

        product:
          description: |
            Set to `true` to get product data in the
            product
            response field.

            The target page should only contain a single product. For
            pages with multiple products consider using
            productList
            instead.

            To combine this field with
            [HTTP requests](/zyte-api/usage/http.md),
            set
            productOptions.extractFrom
            to `"httpResponseBody"`.

            If you use
            actions,
            data extraction happens *after* action execution has finished or
            timed out.

            [See an example](/zyte-api/usage/extract.md).
            See also:
            [List of all automatic extraction request fields](/zyte-api/usage/extract.md),
            productNavigation,
            browserHtml,
            screenshot,
            requestHeaders.
          type: boolean
          default: false

        productOptions:
          description: |
            Additional options for product extraction.
          type: object
          properties:
            extractFrom:
              $ref: '#/components/schemas/ExtractionOptions/properties/extractFrom'
            model:
              type: string
              enum:
              - '2024-02-01'
              - '2024-09-16'
              description: |
                Model version to use for product extraction. If not specified,
                the "2024-09-16" version is used.

                Available product models:

                - "2024-02-01"

                - "2024-09-16"

                See [Model pinning](/zyte-api/usage/extract/index.md).
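
                A minimal sketch of pinning the older model version, assuming
                a hypothetical product URL:

                ```json
                {
                  "url": "https://example.com/item-page",
                  "product": true,
                  "productOptions": {
                    "model": "2024-02-01"
                  }
                }
                ```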
        productList:
          description: |
            Set to `true` to get product list data in the
            productList
            response field.

            The target page should contain a list or a grid of products.

            Product list data is especially useful to get basic information
            about products on a website using a smaller number of requests,
            when product attributes are extracted directly from a product list
            page, without making individual
            product
            requests.

            To implement product crawling from product list pages, use
            productNavigation,
            which also enables navigation through pagination links.

            To combine this field with
            [HTTP requests](/zyte-api/usage/http.md),
            set
            productListOptions.extractFrom
            to `"httpResponseBody"`.

            If you use
            actions,
            data extraction happens *after* action execution has finished or
            timed out.

            See also:
            [List of all automatic extraction request fields](/zyte-api/usage/extract.md),
            browserHtml,
            screenshot,
            requestHeaders.
          type: boolean
          default: false

        productListOptions:
          description: |
            Additional options for productList extraction.
          $ref: '#/components/schemas/ExtractionOptions'

        productNavigation:
          description: |
            Set to `true` to get product navigation data in the
            productNavigation
            response field.

            The target page should contain multiple products and/or
            subcategories that can be followed.

            Product navigation data is especially useful for implementing
            product crawling, i.e. following links to product pages, as well as
            to subcategories and pagination that can in turn link to more
            product pages.

            Product navigation data can also be used to get basic information
            about products and subcategories on a website, obtaining the URLs and
            link names of the products and subcategories, without making
            individual requests for those products.

            To combine this field with
            [HTTP requests](/zyte-api/usage/http.md),
            set
            productNavigationOptions.extractFrom
            to `"httpResponseBody"`.

            If you use
            actions,
            data extraction happens *after* action execution has finished or
            timed out.

            See also:
            [List of all automatic extraction request fields](/zyte-api/usage/extract.md),
            product,
            productList,
            browserHtml,
            screenshot,
            requestHeaders.
          type: boolean
          default: false

        productNavigationOptions:
          description: |
            Additional options for productNavigation extraction.
          $ref: '#/components/schemas/ExtractionOptions'

        customAttributes:
          type: object
          description: |
            Schema of the custom attributes to extract. This is a subset of the OpenAPI specification, using JSON syntax.

            Zyte custom attributes extraction uses a Large Language Model (LLM) operated by Zyte
            to obtain any structured data specified by this schema from any unstructured web page.
            This enables extraction similar to that of standard schemas, such as article or product,
            but with much more flexibility.

            When this field is specified, the
            customAttributes.values
            field in the response contains the extracted data.

            When custom attributes extraction is requested, a standard extraction field must also be
            specified (e.g. product).
            That field determines which part of the web page is passed to the LLM for custom attributes extraction.
            For example, on a product page only the product information is passed,
            ignoring other parts of the page, such as the menu or footer, which makes extraction cheaper and more accurate.
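
            A minimal sketch of such a request. The URL, the attribute names
            (`material`, `weightKg`) and their descriptions are hypothetical,
            and the use of the OpenAPI keywords `type` and `description` for
            each attribute is an assumption based on the schema being a subset
            of OpenAPI:

            ```json
            {
              "url": "https://example.com/item-page",
              "product": true,
              "customAttributes": {
                "material": {
                  "type": "string",
                  "description": "Main material of the product"
                },
                "weightKg": {
                  "type": "number",
                  "description": "Weight of the product in kilograms"
                }
              }
            }
            ```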

            [See detailed documentation](/zyte-api/usage/custom-attributes.md).
            Additionally, to see a request example, scroll up to the right-hand sidebar **Request samples**,
            and select “Extract Custom Attributes along with Article information” under **Example**.

          nullable: true
          additionalProperties:
            $ref: '#/components/schemas/CustomAttribute'
          maxProperties: 20

        customAttributesOptions:
          type: object
          description: Additional options for custom attributes extraction.
          properties:
            method:
              type: string
              description: |
                Method to use for custom attributes extraction:
                * "generate" (default) generates extracted data with the help of a generative Large Language Model (LLM).
                  It is the most powerful and versatile extraction method, but also the most expensive one,
                  with [variable per-request cost](/zyte-api/pricing.md).

                * "extract" locates extracted data in the requested web page with the help of a non-generative LLM.
                  It only supports a subset of the schema (only string, integer and number types),
                  and can't perform generative tasks such as summarization or data transformation.
                  It is, however, much cheaper than the generative method and has a
                  [fixed per-request cost](/zyte-api/pricing.md).
              enum:
              - generate
              - extract
              default: generate
            maxInputTokens:
              type: integer
              minimum: 1
              description: |
                Limit on the number of input tokens for custom attribute extraction with the "generate" method.

                This includes the schema as well, but not our internal fixed prompt with the LLM instruction.

                When the number of tokens for the schema and page text exceeds the specified maxInputTokens,
                the page text is truncated to fit within maxInputTokens.
                This may degrade quality or cause data to be missed because it was in the truncated part of the page.

                Tokens are words or word pieces, for example
                ``{"price": "2.00 $"}`` is 9 tokens:
                ``{"``, ``price``, ``":``, `` "``, ``2``, ``.``, ``00``, `` $``, ``"}``.
            maxOutputTokens:
              type: integer
              minimum: 1
              description: |
                Limit on the number of output tokens for extracted custom attributes with the "generate" method.
                This field can be set to limit the extraction cost, but may result in quality degradation.

                See an example of token counting in the
                maxInputTokens
                field above.
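          # Illustrative sketch, not part of the original spec: one possible
          # combination of the options described above. The token limits are
          # arbitrary values chosen for the example, not recommendations.
          example:
            method: generate
            maxInputTokens: 2000
            maxOutputTokens: 200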

        geolocation:
          $ref: '#/components/schemas/CountryCode'

        javascript:
          description: |
            Forces JavaScript execution on a
            [browser request](/zyte-api/usage/browser.md)
            to be enabled (`true`) or disabled (`false`).

            By default Zyte API enables or disables JavaScript execution for a
            request depending on which option makes it easier to avoid bans.
            Use this request field to override that choice.

            Passing this request field when requesting automatic extraction
            (product, article, etc.) may impact the quality of the returned
            data, as it might override the optimal value for automatic
            extraction.

            This field is not compatible with
            [HTTP requests](/zyte-api/usage/http.md).

            [See an example](/zyte-api/usage/browser.md).
          type: boolean

        actions:
          $ref: '#/components/schemas/ActionSequence'

        jobId:
          description: |
            ID of the
            [Scrapy Cloud](/scrapy-cloud/get-started.md)
            job from which this request has been sent, to be returned in the
            jobId
            response field.

            This field is meant to help with request tracking.

            [scrapy-zyte-api](https://scrapy-zyte-api.readthedocs.io/en/latest/index.html)
            fills this request field automatically.

            [See an example](/zyte-api/usage/features.md).
            See also:
            echoData.
          type: string
          maxLength: 100
          example: example-job-1

        echoData:
          description: |
            This field is returned in the
            echoData
            response field, verbatim.

            This field can be useful, for example, to keep track of the
            original request order when
            [sending multiple requests in parallel](/zyte-api/usage/optimize.md).

            The request can be rejected if the data is too big.

            [See an example](/zyte-api/usage/features.md).
            See also:
            jobId.
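          # Illustrative sketch, not part of the original spec: a small value
          # that could be used to track request order, as described above.
          # The key name "requestIndex" is hypothetical.
          example:
            requestIndex: 3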
        viewport:
          $ref: '#/components/schemas/Viewport'

        followRedirect:
          description: |
            Whether to follow
            [HTTP redirection](https://developer.mozilla.org/en-US/docs/Web/HTTP/Redirections)
            or not.

            Only supported in
            [HTTP requests](/zyte-api/usage/http.md);
            [browser requests always follow redirection](/zyte-api/usage/browser.md).
          type: boolean

        sessionContext:
          $ref: '#/components/schemas/SessionContext'

        sessionContextParameters:
          $ref: '#/components/schemas/SessionContextParameters'

        session:
          $ref: '#/components/schemas/Session'

        networkCapture:
          $ref: '#/components/schemas/NetworkCaptureFilterSequence'

        device:
          description: |
            Type of device to emulate during your request.

            A desktop device is emulated by default.

            Can only be used in combination with
            httpResponseBody.
          type: string
          enum:
          - desktop
          - mobile

        cookieManagement:
          description: |
            Cookie management method.

            It determines how to handle user cookies, defined through
            requestCookies,
            and automatic cookies, i.e. cookies automatically generated by
            Zyte API.

            `auto` (default) uses user cookies if defined, or automatic
            cookies otherwise.

            `discard` uses user cookies if defined, or no cookies
            otherwise.
          enum:
          - auto
          - discard
          default: auto
        requestCookies:
          type: array
          description: |
            A list of cookies to be sent with a request.

            You can use the contents of the
            responseCookies
            response field as a value for this request field.

            [See an example](/zyte-api/usage/features.md).
          items:
            $ref: '#/components/schemas/Cookie'
          maxItems: 100
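          # Illustrative sketch, not part of the original spec: a single
          # cookie with the three fields that the Cookie schema requires.
          # The cookie name, value, and domain are hypothetical.
          example:
          - name: session_id
            value: abc123
            domain: example.com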
        responseCookies:
          description: |
            Set to `true` to get the list of cookies set during a request
            in the
            responseCookies
            response field.

            [See an example](/zyte-api/usage/features.md).
            See also:
            requestCookies.
          type: boolean
          default: false
        serp:
          type: boolean
          description: |
            Set to `true` to get the data of a search engine results page
            (SERP) in the
            serp
            response field.

            The
            target URL
            should be a search URL that belongs to a
            [Google domain](https://www.google.com/supported_domains).

            Currently, you cannot combine this field with any other request
            fields besides
            serpOptions
            and
            url.

            See also:
            [List of all automatic extraction request fields](/zyte-api/usage/extract.md).
        serpOptions:
          $ref: '#/components/schemas/SerpOptions'

        includeIframes:
          type: boolean
          description: |
            Whether to add the content of
            [iframes](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe)
            into
            browserHtml.

            Note that iframes are visible in screenshots even if this is set to
            `false`.

            See also:
            browserHtml.
          default: false
      additionalProperties: false
    Response200:
      required:
      - url
      properties:
        url:
          type: string
          description: |
            URL the data was extracted from.

            Could be different from the input URL in case of
            [redirection](/zyte-api/usage/http.md).

            See also:
            statusCode.
          example: https://example.com/item-page/
        statusCode:
          type: integer
          description: |
            The HTTP status code retrieved from the target page.

            If
            [redirection is followed](/zyte-api/usage/http.md),
            this is the status code of the response *after* redirection.

            See also:
            url.
          example: 200
        httpResponseBody:
          description: |
            [Base64-encoded](https://en.wikipedia.org/wiki/Base64)
            HTTP response body.

            To get this response field, set the
            httpResponseBody
            request field to `true`.

            Unlike
            browserHtml,
            this field supports binary response bodies, such as image files or
            PDF files. This is why this field is Base64-encoded: JSON does
            not support binary data.

            [See an example](/zyte-api/usage/http.md).
          type: string
          format: byte
        httpResponseHeaders:
          description: |
            HTTP response headers.

            To get this response field, set the
            httpResponseHeaders
            request field to `true`.

            Do not use the `Content-Encoding` header value (e.g. `gzip`, `br`,
            etc.) to decompress
            httpResponseBody:
            Zyte API already decompresses the body of compressed responses.

            The `Set-Cookie` header value, when present, contains the header
            value received from the main HTTP response. These cookies could
            have changed later on, e.g. during browser rendering. Usually you
            will want to ignore this header in favor of
            responseCookies,
            which provides the *final* cookies.

            [See an example](/zyte-api/usage/features.md).
          type: array
          items:
            $ref: '#/components/schemas/HTTPHeader'
        browserHtml:
          description: |
            [Browser HTML](/zyte-api/usage/browser.md).

            To get this response field, set the
            browserHtml
            request field to `true`.

            Browser HTML does not include the contents of
            [iframes](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe)
            or the
            [shadow DOM](https://developer.mozilla.org/en-US/docs/Web/Web_Components/Using_shadow_DOM).

            [See an example](/zyte-api/usage/browser.md).
          type: string
          example: <html>Downloaded data.</html>
        session:
          $ref: '#/components/schemas/Session'
        screenshot:
          description: |
            [Base64-encoded](https://en.wikipedia.org/wiki/Base64)
            page screenshot file data.

            To get this response field, set the
            screenshot
            request field to `true`.

            screenshotOptions.format
            determines the file format of the screenshot data.

            [See an example](/zyte-api/usage/browser.md).
          type: string
          format: byte

        article:
          allOf:
          - $ref: '#/components/schemas/Article'
          - description: |
              Article data.

              To get this response field, set the
              article
              request field to `true`.

        articleList:
          allOf:
          - $ref: '#/components/schemas/ArticleList'
          - description: |
              Article list data.

              To get this response field, set the
              articleList
              request field to `true`.

        articleNavigation:
          allOf:
          - $ref: '#/components/schemas/ArticleNavigation'
          - description: |
              Article navigation data.

              To get this response field, set the
              articleNavigation
              request field to `true`.

        forumThread:
          allOf:
          - $ref: '#/components/schemas/ForumThread'
          - description: |
              Forum thread data.

              To get this response field, set the
              forumThread
              request field to `true`.

        jobPosting:
          allOf:
          - $ref: '#/components/schemas/JobPosting'
          - description: |
              Job posting data.

              To get this response field, set the
              jobPosting
              request field to `true`.

        jobPostingNavigation:
          allOf:
          - $ref: '#/components/schemas/JobPostingNavigation'
          - description: |
              Job posting navigation data.

              To get this response field, set the
              jobPostingNavigation
              request field to `true`.
        pageContent:
          allOf:
          - $ref: '#/components/schemas/PageContent'
          - description: |
              Page content data.

              To get this response field, set the
              pageContent
              request field to `true`.
        product:
          allOf:
          - $ref: '#/components/schemas/Product'
          - description: |
              Product data.

              To get this response field, set the
              product
              request field to `true`.

        productList:
          allOf:
          - $ref: '#/components/schemas/ProductList'
          - description: |
              Product list data.

              To get this response field, set the
              productList
              request field to `true`.

        productNavigation:
          allOf:
          - $ref: '#/components/schemas/ProductNavigation'
          - description: |
              Product navigation data.

              To get this response field, set the
              productNavigation
              request field to `true`.

        customAttributes:
          type: object
          properties:
            values:
              type: object
              additionalProperties: true
              description: |
                Values of extracted custom attributes, extracted according to the requested
                customAttributes
                schema.
            metadata:
              type: object
              properties:
                inputTokens:
                  type: integer
                  description: |
                    Total number of used input tokens, excluding our internal fixed prompt with the LLM instruction,
                    when using the "generate" method.
                outputTokens:
                  type: integer
                  description: |
                    Total number of used output tokens, when using the "generate" method.
                textInputTokens:
                  type: integer
                  description: |
                    Total number of input tokens used for the text of the web page, excluding
                    the schema and our internal fixed prompt with the LLM instruction,
                    when using the "generate" method.
                    Already included in the customAttributes.metadata.inputTokens
                    field.
                textInputTokensBeforeTruncation:
                  type: integer
                  description: |
                    textInputTokens before the text was truncated to fit into the input limits,
                    either set via
                    customAttributesOptions.maxInputTokens
                    or due to the model limitation returned in
                    customAttributes.metadata.maxInputTokens,
                    when using the "generate" method.
                maxInputTokens:
                  type: integer
                  description: |
                    Maximum number of allowed input tokens for the model, when using the "generate" method.
                excludedPIIAttributes:
                  type: array
                  items:
                    type: string
                  description: |
                    A list of all attributes dropped from the output due to a risk of PII
                    (Personally Identifiable Information) extraction.
                error:
                  type: string
                  description: |
                    * The ``extraction/unparsable-response`` error is given when the LLM response could not be parsed or recovered.
                      If this error happens, we suggest simplifying the task or reducing the number of attributes.
                    * The ``extraction/schema-size-exceeded`` error is given when the schema did not fit into the input limits,
                      leaving no space for the input text, and therefore the LLM could not be used. If this error happens, we suggest
                      either making the schema smaller (fewer attributes and/or shorter descriptions),
                      or increasing
                      customAttributesOptions.maxInputTokens.

        echoData:
          description: |
            Arbitrary data set on the
            echoData
            request field.

            [See an example](/zyte-api/usage/features.md).
          type: object

        jobId:
          description: |
            [Scrapy Cloud](/scrapy-cloud/get-started.md)
            job ID set on the
            jobId
            request field.

            [See an example](/zyte-api/usage/features.md).
          type: string
          maxLength: 100
          example: example-job-1

        actions:
          description: |
            Debug information about the execution of the action sequence set in
            the
            actions
            request field.

            Action order in the response always matches that of the request.
          type: array
          items:
            $ref: '#/components/schemas/ActionResult'

        responseCookies:
          description: |
            List of cookies set during the request.

            To get this response field, set the
            responseCookies
            request field to `true`.

            [See an example](/zyte-api/usage/features.md).
            See also:
            requestCookies.
          type: array
          items:
            $ref: '#/components/schemas/Cookie'
        networkCapture:
          type: array
          description: |
            Responses captured by filters specified in the
            networkCapture
            request field.
          items:
            $ref: '#/components/schemas/CapturedResponse'
        serp:
          $ref: '#/components/schemas/SearchResultsPage'

    Problem:
      type: object
      properties:
        type:
          type: string
          format: uri-reference
          description: |
            A URI reference that uniquely identifies the problem type, only in
            the context of the provided API.

            Contrary to RFC 7807, it is not recommended that this URI be
            dereferenceable and point to human-readable documentation, nor
            does it need to be globally unique for the problem type.
          default: about:blank
          example: /problem/connection-error
        title:
          type: string
          description: >
            A short summary of the problem type. Written in English and readable for engineers, usually not suited for non-technical stakeholders, and not localized.
          example: Service Unavailable
        status:
          type: integer
          format: int32
          description: >
            The HTTP status code generated by Zyte API for this occurrence of the problem.
          minimum: 100
          maximum: 600
          exclusiveMaximum: true
          example: 503
        detail:
          type: string
          description: >
            A human-readable explanation specific to this occurrence of the problem that is helpful to locate the source of the problem and gives advice on how to proceed.

            Written in English and readable for engineers, usually not suited for non-technical stakeholders, and not localized.
          example: Connection to database timed out

    ForbiddenDomainProblem:
      allOf:
      - $ref: '#/components/schemas/Problem'
      properties:
        blockedDomain:
          type: string
          description: >
            The domain for which extraction cannot be performed.
          example: forbiddendomain.com
    SessionContext:
      description: |
        User-defined name-value pairs to
        [request a server-managed session](/zyte-api/usage/features.md)
        initialized with
        sessionContextParameters.

        For every subsequent request with the same session context, Zyte API
        will either reuse an available session created for the same session
        context or create a new session using
        sessionContextParameters.

        Server-managed sessions expire after 4 hours or 3 ban responses. If you
        are targeting websites that silently expire their sessions before the
        4-hour mark, i.e. they revert the effects of your
        sessionContextParameters
        but requests continue working as expected otherwise, consider using
        [client-managed sessions](/zyte-api/usage/features.md)
        for higher session control.

        [See an example](/zyte-api/usage/features.md).
        See also:
        requestCookies,
        responseCookies.
      type: array
      items:
        type: object
        maxItems: 10
        properties:
          name:
            type: string
            description: Name of the context identifier.
            minLength: 1
            maxLength: 30
            nullable: false
          value:
            type: string
            description: Value of the context identifier.
            minLength: 1
            maxLength: 100
            nullable: false
        required:
        - name
        - value
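      # Illustrative sketch, not part of the original spec: a session context
      # made of a single name-value pair. The name and value are hypothetical;
      # any pair within the length limits above would do.
      example:
      - name: country
        value: US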
    SessionContextParameters:
      description: |
        Parameters to create a server-managed session for a given
        sessionContext.

        [See an example](/zyte-api/usage/features.md).
        See also:
        actions.
      type: object
      properties:
        actions:
          $ref: '#/components/schemas/SessionContextActionSequence'
    ActionResult:
      description: |
        Returns detailed information about the elapsed time and errors for a particular action.
      type: object
      properties:
        action:
          description: The type of action submitted
          type: string
          example: waitForSelector
        elapsedTime:
          description: Elapsed time in seconds
          type: number
        status:
          description: |
            Status of execution of a particular action
            * success - When the action finishes execution successfully without any errors
            * continued - When the action fails, but the execution of the action sequence is continued
            * returned - When the action fails and stops execution
            * notExecuted - When a prior action has failed, thereby not executing the current action
          type: string
          enum:
          - success
          - continued
          - returned
          - notExecuted
          example: success
        error:
          description: Detailed information about the underlying error.
          type: string
          example: Request timeout while waiting for selector '#form-input'
        interactionLogs:
          description: |
            Messages logged with `console.log()` from
            [browser scripts](/zyte-api/ide/index.md).
          type: array
          items:
            $ref: '#/components/schemas/InteractionLogEntry'
      required:
      - action
      - elapsedTime
      - status
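      # Illustrative sketch, not part of the original spec: what the result
      # of a failed action that stopped the sequence might look like. The
      # elapsed time is a hypothetical value.
      example:
        action: waitForSelector
        elapsedTime: 5.0
        status: returned
        error: Request timeout while waiting for selector '#form-input'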

    InteractionLogEntry:
      description: Interaction log entry
      type: object
      properties:
        time:
          description: Time of the log entry, in ISO 8601 format
          type: string
        level:
          description: The log level
          type: string
          enum:
          - debug
          - info
          - warning
          - error
          - warn
        message:
          description: The log message
          type: string
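      # Illustrative sketch, not part of the original spec: a single log
      # entry. The timestamp and message are hypothetical.
      example:
        time: '2024-01-01T12:00:00Z'
        level: info
        message: Clicked the submit button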

    ActionTimeout:
      description: Maximum wait time in seconds.
      type: number
      minimum: 0.0
      default: 5.0
      maximum: 15.0

    UrlPattern:
      description: |
        A string to compare with a URL according to `urlMatchingOptions`.
      type: string
      example:
      - https://example.com/api
      - /api/store/fulfilment

    PatternMatchingOptions:
      description: |
        How to compare a user-defined string with a target string:

        - `contains` matches if the user-defined string is a substring of the
          target string.

        - `exact` matches if the user-defined string is an exact match of the
          target string.

        - `startsWith` matches if the target string starts with the
          user-defined string.

        - `endsWith` matches if the target string ends with the user-defined
          string.

        Comparisons are case-sensitive. Regular expressions or wildcard
        characters are not supported.
      type: string
      enum:
      - startsWith
      - endsWith
      - contains
      - exact
      default: contains

    ActionSelector:
      description: |
        A CSS or XPath selector to search for an element.
      properties:
        type:
          description: The type of selector - CSS or XPath
          type: string
          enum:
          - css
          - xpath
        value:
          type: string
          minLength: 1
          maxLength: 500
        state:
          description: |
            State can be one of the following values; it defaults to visible:
            * 'visible' - The element has a non-empty bounding box and no visibility:hidden. Note that an element without content or with display:none has an empty bounding box, and is not considered visible.
            * 'hidden' - The element is either detached from the DOM, or has an empty bounding box or visibility:hidden.
            This is the opposite of the 'visible' option.
            * 'attached' - The element is present in the DOM; it can be visible or hidden
          type: string
          enum:
          - attached
          - visible
          - hidden
          default: visible
      required:
      - type
      - value
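      # Illustrative sketch, not part of the original spec: a CSS selector
      # that matches an element as soon as it is present in the DOM. The
      # selector value is hypothetical.
      example:
        type: css
        value: '#main'
        state: attached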

    onError:
      description: |
        Handle errors encountered while executing a particular action.
        * continue - When a particular action fails, the action sequence continues, executing the next actions
        * return - When a particular action fails, the action sequence stops, not executing any more actions

        When an action sequence finishes prematurely the service will return the entire response body up until the
        point of execution.
      type: string
      enum:
      - continue
      - return
      default: return

    ActionSequence:
      description: |
        Sequence of browser actions to execute.

        Select an action below to see its API reference.

        When using actions, you get the
        actions
        response field with debug information about action execution.

        [See an example](/zyte-api/usage/browser.md).
      type: array
      items:
        oneOf:
        - $ref: '#/components/schemas/click'
        - $ref: '#/components/schemas/doubleClick'
        - $ref: '#/components/schemas/evaluate'
        - $ref: '#/components/schemas/goto'
        - $ref: '#/components/schemas/hide'
        - $ref: '#/components/schemas/hover'
        - $ref: '#/components/schemas/interaction'
        - $ref: '#/components/schemas/keyPress'
        - $ref: '#/components/schemas/reload'
        - $ref: '#/components/schemas/scrollBottom'
        - $ref: '#/components/schemas/scrollTo'
        - $ref: '#/components/schemas/searchKeyword'
        - $ref: '#/components/schemas/select'
        - $ref: '#/components/schemas/setLocation'
        - $ref: '#/components/schemas/type'
        - $ref: '#/components/schemas/waitForNavigation'
        - $ref: '#/components/schemas/waitForRequest'
        - $ref: '#/components/schemas/waitForResponse'
        - $ref: '#/components/schemas/waitForSelector'
        - $ref: '#/components/schemas/waitForTimeout'
    Action:
      description: Action to perform.
      type: object
      properties:
        onError:
          $ref: '#/components/schemas/onError'
      example:
        action: click
        selector:
          type: css
          value: '#main'

    GoToOptions:
      description: Used to customise navigation options
      type: object
      properties:
        waitUntil:
          description: |
            When to consider navigation succeeded. Defaults to load. It can be one of the following events:
            * load - consider navigation to be finished when the load event is fired.
            * networkidle0 - consider navigation to be finished when there have been no network connections for at least 500 ms.
            * domcontentloaded - consider navigation to be finished when the DOMContentLoaded event is fired.
          type: string
          enum:
          - load
          - networkidle0
          - domcontentloaded
          default: load
        timeout:
          description: Maximum navigation time in seconds, defaults to 30 seconds. Pass 0 to disable timeout.
          type: integer
          default: 30
          minimum: 0
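      # Illustrative sketch, not part of the original spec: finish
      # navigation at DOMContentLoaded with a shorter timeout. The timeout
      # value is an arbitrary choice for the example.
      example:
        waitUntil: domcontentloaded
        timeout: 10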

    ScreenshotOptions:
      description: |
        Options for the screenshot taken when the
        screenshot
        request field is `true`.
      type: object
      properties:
        format:
          description: |
            File format.

            JPEG screenshots are taken with a quality of 75%.
          type: string
          enum:
          - png
          - jpeg
          default: jpeg
        fullPage:
          description: |
            When `true`, the screenshot features the full page. When
            `false`, it features only what is visible on the browser window
            (viewport).

            Full page screenshots:

            -   Are only available in JPEG format.

            -   Have a minimum resolution of 1920x1080, i.e. for pages
                smaller than 1920x1080, the screenshot looks the same
                regardless of the value of `fullPage`.

            -   Are clipped to 5000 (width) x 10000 (height) pixels if they
                exceed those dimensions.
          type: boolean
          default: false
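      # Illustrative sketch, not part of the original spec: a full page
      # screenshot, which per the description above must use JPEG.
      example:
        format: jpeg
        fullPage: true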

    ExtractionOptions:
      description: |
        Options for automatic extraction.
      type: object
      properties:
        extractFrom:
          type: string
          enum:
          - httpResponseBody
          - browserHtml
          - browserHtmlOnly
          description: |
            [Extraction source](/zyte-api/usage/extract/index.md).

            `httpResponseBody` extracts from
            httpResponseBody.
            It is usually faster and cheaper.

            `browserHtmlOnly` extracts from
            browserHtml.
            It typically improves quality over `httpResponseBody` on
            JavaScript-heavy web pages.

            `browserHtml` extracts from both
            browserHtml
            and visual features of the rendered web page. It typically improves
            quality over `browserHtmlOnly`, but is not as robust in case of
            rendering issues.

            If not specified, `browserHtml` is currently used by default for
            [AI extraction](/zyte-api/usage/extract/index.md),
            while `httpResponseBody` is used by default for
            [non-AI extraction](/zyte-api/usage/extract/index.md).
            In the future, the default value may depend on the target website.
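      # Illustrative sketch, not part of the original spec: explicitly
      # choosing the extraction source that is described above as usually
      # faster and cheaper.
      example:
        extractFrom: httpResponseBody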
    RequestHeaders:
      description: |
        HTTP request headers.

        Can only be used in a
        [browser request](/zyte-api/usage/browser.md).
        For
        [HTTP requests](/zyte-api/usage/http.md), see
        customHttpRequestHeaders.

        At the moment it only supports the
        [Referer header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referer).

        [See an example](/zyte-api/usage/browser.md).
      properties:
        referer:
          description: |
            [Referer header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referer).
          type: string
          example: https://www.google.com/

    CustomHttpRequestHeader:
      properties:
        name:
          type: string
          maxLength: 200
          example: X-APOLLO-OPERATION-NAME
        value:
          type: string
          maxLength: 7000
          example: nearByNodes
    Session:
      description: |
        Parameters to create or reuse a
        [client-managed session](/zyte-api/usage/features.md).

        If `id` does not match one of your running sessions, a new session is
        created with that session ID. Otherwise, the matching running session
        is reused.

        Client-managed sessions may expire due to any of the following:

        -   15 minutes (900 seconds) have passed since the session was created.

        -   2 minutes (120 seconds) have passed since the session was last
            used.

        -   Requests using this session got banned 3 times in a row.

        For 5-10 minutes after a session expires, Zyte API keeps track of the
        expired session and does not allow re-using it. After that time,
        attempts to reuse the session will instead create a new session.

        [See an example](/zyte-api/usage/features.md).
      example:
        id: ab837d21-f848-42b2-8e88-47ea9d84bad0
      properties:
        id:
          description: |
            User-defined session ID.

            It must be a
            [version 4 UUID](https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random)),
            i.e. a randomly-generated UUID.
          type: string
      type: object

    Cookie:
      type: object
      properties:
        name:
          type: string
          maxLength: 4085
          description: Cookie name
        value:
          type: string
          maxLength: 4085
          description: Cookie value
        domain:
          type: string
          maxLength: 253
          description: Domain the cookie belongs to
        path:
          type: string
          description: Path the cookie belongs to
        expires:
          type: integer
          format: int64
          description: Unix time in seconds.
        httpOnly:
          type: boolean
        secure:
          type: boolean
        sameSite:
          type: string
          enum:
          - Strict
          - Lax
          - Extended
          - None
      required:
      - name
      - value
      - domain
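      # Illustrative example of a cookie object that satisfies the schema above.
      # The name, value, and domain are placeholders chosen for this sketch, not
      # taken from a real website.
      example:
        name: sessionid
        value: abc123
        domain: example.com
        path: /
        httpOnly: true
        secure: true
        sameSite: Lax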

    PostalAddress:
      description: Postal address to be set
      type: object
      properties:
        addressCountry:
          description: The country code in [ISO 3166-1 alpha-2](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2)
          type: string
          example: US
        addressRegion:
          description: The region in which the address is. This value is specific to the website.
          type: string
          example: California
        streetAddress:
          description: The street address.
          type: string
        postalCode:
          description: The postal code.
          type: string

    NetworkCaptureFilterSequence:
      type: array
      maxItems: 10
      description: |
        Filters to capture browser network responses.

        HTTP responses received during browser rendering (including action
        execution) will be returned in the `networkCapture` response field if
        they match any of the filters defined here.

        You can capture up to 10 responses, provided the sum of their bodies
        does not exceed 5 MiB. If the sum exceeds that limit, only the first
        captured responses within the limit are returned.

        [See an example](/zyte-api/usage/browser.md).
      items:
        $ref: '#/components/schemas/NetworkCaptureFilter'
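      # Illustrative example combining both filter types. It assumes that
      # `contains` is a valid `matchType` value, as it is used as a URL matching
      # option elsewhere in this spec. The URL fragment is a placeholder.
      example:
      - filterType: url
        value: /api/
        matchType: contains
        httpResponseBody: true
      - filterType: resourceType
        resourceType: xhr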

    NetworkCaptureFilter:
      type: object
      # Note: In the rendered docs, this description only appears in the
      # networkCapture.filter response field, so the wording is tailored for
      # that.
      description: |
        Filter defined in the `networkCapture` request field that matched the
        captured response.
      properties:
        filterType:
          type: string
          enum:
          - url
          - resourceType
        httpResponseBody:
          type: boolean
          default: false
          description: |
            Set to `true` to get the body of the captured response in the
            [networkCapture[].httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/networkCapture.httpResponseBody)
            response field.
      required:
      - filterType
      discriminator:
        propertyName: filterType
        mapping:
          url: '#/components/schemas/UrlFilter'
          resourceType: '#/components/schemas/ResourceTypeFilter'

    UrlFilter:
      description: An object specifying how to capture responses by matching URL
      allOf:
      - $ref: '#/components/schemas/NetworkCaptureFilter'
      - properties:
          value:
            type: string
            minLength: 3
            maxLength: 8192
            description: |
              A string to compare with the URL of network responses according
              to `matchType`.
          matchType:
            $ref: '#/components/schemas/PatternMatchingOptions'
      - required:
        - value
        - matchType
    ResourceTypeFilter:
      description: An object specifying how to capture responses by resource type
      allOf:
      - $ref: '#/components/schemas/NetworkCaptureFilter'
      - properties:
          resourceType:
            type: string
            enum:
            - document
            - xhr
            description: |
              A resource type for a network response to match:

              - `document` is the source HTML document, which might change
                during browser rendering or through actions.

              - `xhr` is a response obtained using
                [XMLHttpRequest](http://devdoc.net/web/developer.mozilla.org/en-US/docs/XMLHttpRequest.1.html).
      - required:
        - resourceType

    CapturedResponse:
      type: object
      properties:
        interceptionStatus:
          type: object
          description: |
            Exit status of the network capture.

            If `interceptionStatus.status` is `error`, `httpResponseBody` is
            not delivered.

            Possible causes of error include the total body size of all
            matching responses exceeding the 5 MiB maximum.
          properties:
            status:
              type: string
              enum:
              - success
              - error
            error:
              type: string
              description: |
                Error message.

                This field is only present if `interceptionStatus.status` is
                `error`.
        statusCode:
          type: integer
          description: HTTP status code of the captured response.
        httpResponseBody:
          type: string
          format: byte
          description: |
            [Base64](https://en.wikipedia.org/wiki/Base64)-encoded body of the
            captured response.

            To get this response field, set the
            [networkCapture[].httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/networkCapture.httpResponseBody)
            request field to `true`.
        url:
          type: string
          format: uri
          description: Captured response URL.
        headers:
          type: object
          description: Captured response headers.
        filter:
          $ref: '#/components/schemas/NetworkCaptureFilter'
        request:
          type: object
          description: Request that produced the captured response.
          properties:
            url:
              type: string
              description: URL of the captured request.
            headers:
              type: object
              description: Headers of the captured request.
            method:
              type: string
              description: HTTP method of the captured request.
            body:
              type: string
              description: Body of the captured request, if any.
    Viewport:
      type: object
      description: |
        [Browser viewport](https://developer.mozilla.org/en-US/docs/Glossary/Viewport).
      properties:
        width:
          type: integer
          description: Viewport width, in pixels.
          default: 1920
          minimum: 320
          maximum: 5120
        height:
          type: integer
          description: Viewport height, in pixels.
          default: 1080
          minimum: 360
          maximum: 4096
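      # Illustrative example only. Any width from 320 to 5120 and any height
      # from 360 to 4096 satisfies the constraints above; 1280x720 is an
      # arbitrary choice for this sketch.
      example:
        width: 1280
        height: 720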

    CountryCode:
      description: |
        [ISO 3166-1 alpha-2](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2)
        code of a country from which the request should be sent, i.e. the
        request [geolocation](/zyte-api/usage/features.md).

        If not specified, Zyte API will use a geolocation that, for the target
        website, does not cause bans or unexpected locale changes in the
        response data, such as the wrong language, currency, date format, time
        zone, etc.

        If you believe Zyte API is using the wrong default geolocation for a
        website, please
        [reach out to our expert anti-ban team](https://support.zyte.com/support/tickets/new).

        For some websites, however, you might want to set a custom geolocation.
        For example, you may be interested in visiting the same URL from
        different locations.

        Zyte API provides 2 sets of geolocations. Standard geolocations are
        `AU`, `BE`, `BR`, `CA`, `CN`, `DE`, `ES`, `FR`, `GB`, `IN`, `IT`, `JP`,
        `KR`, `MX`, `NL`, `PL`, `RU`, `TR`, `US`, and `ZA`. All other
        geolocations are
        [extended geolocations](/zyte-api/usage/features.md).

        [See an example](/zyte-api/usage/features.md).
      example: US
      type: string
      enum:
      - AW
      - AF
      - AO
      - AI
      - AX
      - AL
      - AD
      - AE
      - AR
      - AM
      - AS
      - AQ
      - TF
      - AG
      - AU
      - AT
      - AZ
      - BI
      - BE
      - BJ
      - BQ
      - BF
      - BD
      - BG
      - BH
      - BS
      - BA
      - BL
      - BY
      - BZ
      - BM
      - BO
      - BR
      - BB
      - BN
      - BT
      - BV
      - BW
      - CF
      - CA
      - CC
      - CH
      - CL
      - CN
      - CI
      - CM
      - CD
      - CG
      - CK
      - CO
      - KM
      - CV
      - CR
      - CU
      - CW
      - CX
      - KY
      - CY
      - CZ
      - DE
      - DJ
      - DM
      - DK
      - DO
      - DZ
      - EC
      - EG
      - ER
      - EH
      - ES
      - EE
      - ET
      - FI
      - FJ
      - FK
      - FR
      - FO
      - FM
      - GA
      - GB
      - GE
      - GG
      - GH
      - GI
      - GN
      - GP
      - GM
      - GW
      - GQ
      - GR
      - GD
      - GL
      - GT
      - GF
      - GU
      - GY
      - HK
      - HM
      - HN
      - HR
      - HT
      - HU
      - ID
      - IM
      - IN
      - IO
      - IE
      - IR
      - IQ
      - IS
      - IL
      - IT
      - JM
      - JE
      - JO
      - JP
      - KZ
      - KE
      - KG
      - KH
      - KI
      - KN
      - KR
      - KW
      - LA
      - LB
      - LR
      - LY
      - LC
      - LI
      - LK
      - LS
      - LT
      - LU
      - LV
      - MO
      - MF
      - MA
      - MC
      - MD
      - MG
      - MV
      - MX
      - MH
      - MK
      - ML
      - MT
      - MM
      - ME
      - MN
      - MP
      - MZ
      - MR
      - MS
      - MQ
      - MU
      - MW
      - MY
      - YT
      - NA
      - NC
      - NE
      - NF
      - NG
      - NI
      - NU
      - NL
      - NO
      - NP
      - NR
      - NZ
      - OM
      - PK
      - PA
      - PN
      - PE
      - PH
      - PW
      - PG
      - PL
      - PR
      - KP
      - PT
      - PY
      - PS
      - PF
      - QA
      - RE
      - RO
      - RU
      - RW
      - SA
      - SD
      - SN
      - SG
      - GS
      - SH
      - SJ
      - SB
      - SL
      - SV
      - SM
      - SO
      - PM
      - RS
      - SS
      - ST
      - SR
      - SK
      - SI
      - SE
      - SZ
      - SX
      - SC
      - SY
      - TC
      - TD
      - TG
      - TH
      - TJ
      - TK
      - TM
      - TL
      - TO
      - TT
      - TN
      - TR
      - TV
      - TW
      - TZ
      - UG
      - UA
      - UM
      - UY
      - US
      - UZ
      - VA
      - VC
      - VE
      - VG
      - VI
      - VN
      - VU
      - WF
      - WS
      - YE
      - ZA
      - ZM
      - ZW

    OrganicResult:
      type: object
      properties:
        description:
          type: string
          description: Result excerpt.
          example: >
            Squid is a caching proxy for the Web supporting HTTP, HTTPS, FTP, and more. It reduces bandwidth and improves response times by caching and reusing frequently- ...
        name:
          type: string
          description: Result title.
          example: squid-cache.org
        url:
          $ref: '#/components/schemas/OrganicResultURL'
        rank:
          type: integer
          example: 1
          description: |
            Result position among organic results in the search page.

            The first result of a search page is always 1, regardless of the
            value of `serp.pageNumber`.

    Metadata:
      type: object
      description: Metadata.
      properties:
        displayedQuery:
          type: string
          description: Search query as seen in the web page.
          example: squid proxy
        searchedQuery:
          type: string
          description: Search query as specified in the input URL.
          example: squid proxy
        totalOrganicResults:
          type: integer
          format: int64
          description: |
            Total number of organic results reported by the search engine.
          minimum: 0
          example: 10000
        dateDownloaded:
          type: string
          description: |
            Timestamp at which the data was downloaded, in UTC, in
            ISO 8601 format: `YYYY-MM-DDThh:mm:ssZ`.
          example: '2024-02-29T13:01:54Z'

    SearchResultsPage:
      type: object
      description: |
        Search engine results page data.

        To get this response field, set the `serp` request field to `true`.
      properties:
        organicResults:
          type: array
          description: |
            List of search results excluding paid results.
          items:
            $ref: '#/components/schemas/OrganicResult'
        url:
          $ref: '#/components/schemas/SearchURL'
        pageNumber:
          type: integer
          description: Page number.
          minimum: 1
        metadata:
          $ref: '#/components/schemas/Metadata'
        product:
          $ref: '#/components/schemas/Product'

    OrganicResultURL:
      type: string
      pattern: ^https?://[\S]+$
      description: Result URL.
      example: https://www.squid-cache.org/
      additionalProperties: false

    SearchURL:
      type: string
      pattern: ^https?://[\S]+$
      description: |
        Search URL.

        Should match `url`.
      example: https://www.google.pl/search?q=squid+proxy
      additionalProperties: false

    URL:
      type: string
      pattern: ^https?://[\S]+$
      additionalProperties: false

    SerpOptions:
      type: object
      description: Options for SERP extraction.
      properties:
        extractFrom:
          type: string
          enum:
          - browserHtml
          - httpResponseBody
          description: |
            Input to use for extraction, either `httpResponseBody` or
            `browserHtml`.

            If not specified, `httpResponseBody` is currently used by default.
            In the future, the default value may depend on the target website.
    click:
      allOf:
      - properties:
          action:
            enum:
            - click
            description: Click on an element.
          selector:
            $ref: '#/components/schemas/ActionSelector'
          button:
            description: Mouse button to click
            type: string
            enum:
            - left
            - right
            - middle
            default: left
          delay:
            description: Time to wait between mousedown and mouseup, in seconds.
            type: number
            minimum: 0
            maximum: 3
            default: 0
          waitForNavigationTimeout:
            description: |
              Maximum waiting time in seconds for the navigation event during the click action.

              If navigation happens within the defined duration, waiting is
              halted and the next action is executed after the new page is
              loaded. If the page loading does not finish, the next action
              ends with an error, and following actions may not be executed,
              depending on the `onError` property. If no navigation happens
              within the defined duration, the next action is executed.
            type: number
            minimum: 0
            maximum: 20
            default: 0
        required:
        - selector

        - action
      - $ref: '#/components/schemas/Action'
    doubleClick:
      allOf:
      - properties:
          action:
            enum:
            - doubleClick
            description: Double click on an element.
          selector:
            $ref: '#/components/schemas/ActionSelector'
        required:
        - selector

        - action
      - $ref: '#/components/schemas/Action'
    evaluate:
      allOf:
      - properties:
          action:
            enum:
            - evaluate
            description: |
              Run JavaScript code in the page context.

              This is a very powerful action. Use cases include:

              -   Sending an API request from the page context, and writing
                  the response somewhere in the DOM, so that the
                  [browser HTML](/zyte-api/usage/browser.md)
                  output includes it.
          source:
            description: JavaScript code to run.
            type: string
            maxLength: 2000
        required:
        - source

        - action
      - $ref: '#/components/schemas/Action'
    goto:
      allOf:
      - properties:
          action:
            enum:
            - goto
            description: |
              Navigate to a new page.

              This action waits until the page load event is fired, with a
              default timeout of 30 seconds.
          url:
            description: URL to navigate the page to. The URL should include the scheme.
            type: string
          options:
            $ref: '#/components/schemas/GoToOptions'
        required:
        - url

        - action
      - $ref: '#/components/schemas/Action'
    hide:
      allOf:
      - properties:
          action:
            enum:
            - hide
            description: Hide an element.
          selector:
            $ref: '#/components/schemas/ActionSelector'
        required:
        - action
        - selector
      - $ref: '#/components/schemas/Action'
    hover:
      allOf:
      - properties:
          action:
            enum:
            - hover
            description: |
              Hover over a visible element.

              Elements that are either hidden or not present will cause the action to
              exit with an error.
          selector:
            $ref: '#/components/schemas/ActionSelector'
        required:
        - selector

        - action
      - $ref: '#/components/schemas/Action'
    interaction:
      allOf:
      - properties:
          action:
            enum:
            - interaction
            description: |
              Execute a
              [browser script](/zyte-api/ide/index.md).
          id:
            description: Script identifier
            type: string
          args:
            description: Input arguments
            type: object
        required:
        - id

        - action
      - $ref: '#/components/schemas/Action'
    keyPress:
      allOf:
      - properties:
          action:
            enum:
            - keyPress
            description: Press a key on the keyboard.
          key:
            type: string
            maxLength: 14
            description: |
              Key to press.

              A single character or special key from the [list of supported
              keys](/zyte-api/ide/api/index.md).

              Key names are case-sensitive.

              Only one key can be pressed at a time. Key combinations are
              not supported.
        required:
        - key

        - action
      - $ref: '#/components/schemas/Action'
    reload:
      allOf:
      - properties:
          action:
            enum:
            - reload
            description: |
              Reload the page.

              This action waits until the page load event is fired, with a
              default timeout of 30 seconds.
          options:
            $ref: '#/components/schemas/GoToOptions'

        required:
        - action
      - $ref: '#/components/schemas/Action'
    scrollBottom:
      allOf:
      - properties:
          action:
            enum:
            - scrollBottom
            description: |
              Continuously scroll down the page while it keeps loading more content.

              The action halts if any of the following conditions are met:

                - the timeout or the total browser execution time is reached
                - the page does not load any new content for the duration of
                  maxScrollDelay
                - maxPageHeight or maxScrollCount have been reached
          timeout:
            description: Maximum wait time, in seconds.
            type: number
            minimum: 0.0
            default: 15.0
            maximum: 30.0
          maxScrollDelay:
            description: |
              The maximum amount of time to wait for each scroll to complete,
              in seconds.

              If the page does not load any content during this time, the
              action is deemed to have been completed.
            type: number
            default: 1.5
            minimum: 0.5
            maximum: 10
          maxPageHeight:
            description: Maximum page height, in pixels, up to which the browser keeps scrolling down.
            type: integer
          maxScrollCount:
            description: |
              The maximum number of scrolls to perform.

              If the page does not yield any fresh content, then the action
              will finish execution before maxScrollCount is reached.
            type: integer
          scrollStep:
            description: |
              The number of pixels for each scroll, which can be used for
              gradual scrolling.

              If specified, `maxScrollDelay` is used as a fixed wait time
              instead of waiting for new content.
            type: integer
            default: 0
            minimum: 0

        required:
        - action
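        # Illustrative example only. The values are arbitrary picks within the
        # ranges documented above (timeout up to 30, maxScrollDelay from 0.5 to
        # 10), shown here as a sketch of how the stop conditions combine.
        example:
          action: scrollBottom
          timeout: 20
          maxScrollDelay: 2
          maxScrollCount: 10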
      - $ref: '#/components/schemas/Action'
    scrollTo:
      allOf:
      - properties:
          action:
            enum:
            - scrollTo
            description: |
              Scroll the window to a particular place in the document.

              To set the target location, use one (and only one) of the following:

              - `top` and `left`, to set the target coordinates in pixels.

              - `selector`, to target the center of an HTML element.
          top:
            description: Specifies the number of pixels along the Y axis to scroll the window.
            type: integer
          left:
            description: Specifies the number of pixels along the X axis to scroll the window.
            type: integer
            default: 0
          selector:
            allOf:
            - $ref: '#/components/schemas/ActionSelector'
            - description: If passed, scrolls to the element matching the selector instead of scrolling to the specified coordinates within the page. If no element matches the selector, no scroll is performed. If more than one element matches the selector, it scrolls to the first one.
        required:
        - action
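        # Illustrative example only, using the coordinate form of the target
        # location. The pixel values are arbitrary placeholders for this sketch.
        example:
          action: scrollTo
          top: 1200
          left: 0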
      - $ref: '#/components/schemas/Action'
    searchKeyword:
      allOf:
      - properties:
          action:
            enum:
            - searchKeyword
            description: |
              Perform keyword search on the page.

              This action uses website-specific knowledge to find and use a search
              box.

              It may not work on some websites. If that’s the case, please
              [reach out to us](https://support.zyte.com/support/tickets/new).

              If there is no search box on a page, an error is returned.
          keyword:
            description: The keyword to be searched for
            type: string
        required:
        - keyword

        - action
      - $ref: '#/components/schemas/Action'
    select:
      allOf:
      - properties:
          action:
            enum:
            - select
            description: |
              Select one or more values from a `<select>` element.
          selector:
            $ref: '#/components/schemas/ActionSelector'
          values:
            description: |
              Values of options to select.

              If the `<select>` element has the `multiple` attribute, all values
              are considered; otherwise, only the first one is taken into account.
            type: array
            items:
              type: string
        required:
        - selector
        - values

        - action
      - $ref: '#/components/schemas/Action'
    setLocation:
      allOf:
      - properties:
          action:
            enum:
            - setLocation
            description: |
              Configure a physical address on the website.

              This action uses website-specific knowledge to find and fill a location
              form.

              It may not work on some websites. If that’s the case, please
              [reach out to us](https://support.zyte.com/support/tickets/new).
          address:
            $ref: '#/components/schemas/PostalAddress'

        required:
        - action
      - $ref: '#/components/schemas/Action'
    type:
      allOf:
      - properties:
          action:
            enum:
            - type
            description: Type text into an element.
          selector:
            $ref: '#/components/schemas/ActionSelector'
          text:
            description: |
              Text to type into a focused element.

              To press a special key, use the `keyPress` action instead.
            type: string
          delay:
            description: Time to wait between key presses, in seconds.
            type: number
            minimum: 0
            default: 0
        required:
        - selector
        - text
        - action
      - $ref: '#/components/schemas/Action'
    waitForNavigation:
      allOf:
      - properties:
          action:
            enum:
            - waitForNavigation
            description: |
              Wait until the page navigates to a new URL or reloads.

              If `waitForNavigation` is the first action, the specified options will
              be applied to the initial navigation. Use it if the default timeout of
              30 seconds or the default `waitUntil` value (`load`) is not sufficient
              for the initial navigation.

              Note, however, that using `waitForNavigation` as the first action
              has an important drawback: any error with the initial navigation
              will nonetheless result in a successful API response, as with any
              other [browser action failure](/zyte-api/usage/errors.md).
          timeout:
            type: number
            maximum: 45.0
            minimum: 31.0
            default: 31.0
            description: |
              Maximum wait time, in seconds.
          waitUntil:
            default: load
            description: |
              When to consider that navigation succeeded:

              - `load` - [load event](https://developer.mozilla.org/en-US/docs/Web/API/Window/load_event), default.
              - `domcontentloaded` - [DOMContentLoaded event](https://developer.mozilla.org/en-US/docs/Web/API/Window/DOMContentLoaded_event).
              - `networkidle0` - no ongoing network connections for at least
                0.5 seconds.
            type: string
            enum:
            - load
            - domcontentloaded
            - networkidle0
        required:
        - timeout
        - action
        nullable: false
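        # Illustrative example only. The timeout is an arbitrary value within
        # the documented range (31 to 45), and `domcontentloaded` is one of the
        # `waitUntil` options listed above.
        example:
          action: waitForNavigation
          timeout: 40
          waitUntil: domcontentloaded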

      - $ref: '#/components/schemas/Action'
    waitForRequest:
      allOf:
      - properties:
          action:
            enum:
            - waitForRequest
            description: Wait until a request to a specific URL has been sent.
          urlPattern:
            $ref: '#/components/schemas/UrlPattern'
          urlMatchingOptions:
            $ref: '#/components/schemas/PatternMatchingOptions'
          timeout:
            $ref: '#/components/schemas/ActionTimeout'
        example:
        # To wait for a request to https://example.org/store/api/ref=sspa_dk_left_sx_aax_0
        - urlPattern: https://example.org/store/api
        # To wait for a request to https://cdn123.example.org/api/store?q=1234
        - urlPattern: api/store
          urlMatchingOptions: contains
        # To wait for a request to https://example.org/afsk123/ref=sspa_dk_left_sx_aax_0
        - urlPattern: https://example.org/
          urlMatchingOptions: startsWith
        required:
        - urlPattern

        - action
      - $ref: '#/components/schemas/Action'
    waitForResponse:
      allOf:
      - properties:
          action:
            enum:
            - waitForResponse
            description: Wait until a response from a specific URL has been received.
          urlPattern:
            $ref: '#/components/schemas/UrlPattern'
          urlMatchingOptions:
            $ref: '#/components/schemas/PatternMatchingOptions'
          timeout:
            $ref: '#/components/schemas/ActionTimeout'
        example:
        # To wait for a response from https://cdn123.example.org/store/api?q=1234
        - urlPattern: /store/api
          urlMatchingOptions: contains
        # To wait for a response from https://example.org/store/ref=sspa_dk_left_sx_aax_0
        - urlPattern: https://example.org/store/
          urlMatchingOptions: startsWith
        required:
        - urlPattern

        - action
      - $ref: '#/components/schemas/Action'
    waitForSelector:
      allOf:
      - properties:
          action:
            enum:
            - waitForSelector
            description: |
              Wait for the selector to appear.

              If an element matching the selector already exists when the
              action starts, the action returns immediately.

              Otherwise, the action returns as soon as the first matching
              element appears.

              For a usage example, see the
              [web scraping tutorial](/web-scraping/tutorial/js.md).
          selector:
            $ref: '#/components/schemas/ActionSelector'
          timeout:
            $ref: '#/components/schemas/ActionTimeout'
        required:
        - selector

        - action
      - $ref: '#/components/schemas/Action'
    waitForTimeout:
      allOf:
      - properties:
          action:
            enum:
            - waitForTimeout
            description: |
              Pause script execution for the given number of seconds before
              continuing.

              If the value of timeout is greater than the remaining browser
              execution time, then this action ends with an error.
          timeout:
            $ref: '#/components/schemas/ActionTimeout'

        required:
        - action
      - $ref: '#/components/schemas/Action'
    SessionContextActionSequence:
      description: |
        Actions to run to initialize a server-managed session for a given
        `sessionContext`.
      type: array
      items:
        oneOf:
        - $ref: '#/components/schemas/click'
        - $ref: '#/components/schemas/doubleClick'
        - $ref: '#/components/schemas/evaluate'
        - $ref: '#/components/schemas/goto'
        - $ref: '#/components/schemas/hide'
        - $ref: '#/components/schemas/hover'
        - $ref: '#/components/schemas/interaction'
        - $ref: '#/components/schemas/keyPress'
        - $ref: '#/components/schemas/reload'
        - $ref: '#/components/schemas/scrollBottom'
        - $ref: '#/components/schemas/scrollTo'
        - $ref: '#/components/schemas/searchKeyword'
        - $ref: '#/components/schemas/select'
        - $ref: '#/components/schemas/setLocation'
        - $ref: '#/components/schemas/type'
        - $ref: '#/components/schemas/waitForNavigation'
        - $ref: '#/components/schemas/waitForRequest'
        - $ref: '#/components/schemas/waitForResponse'
        - $ref: '#/components/schemas/waitForSelector'
        - $ref: '#/components/schemas/waitForTimeout'
    CustomAttribute:
      type: object
      properties:
        description:
          type: string
          maxLength: 300
        type:
          type: string
          enum:
          - boolean
          - string
          - number
          - integer
          - array
          - object
      discriminator:
        propertyName: type
        mapping:
          boolean: '#/components/schemas/CustomAttributeBoolean'
          string: '#/components/schemas/CustomAttributeString'
          number: '#/components/schemas/CustomAttributeNumber'
          integer: '#/components/schemas/CustomAttributeInteger'
          array: '#/components/schemas/CustomAttributeArray'
          object: '#/components/schemas/CustomAttributeObject'
    CustomAttributeBoolean:
      allOf:
      - $ref: '#/components/schemas/CustomAttribute'
      type: object
      required:
      - type
    CustomAttributeString:
      allOf:
      - $ref: '#/components/schemas/CustomAttribute'
      - properties:
          enum:
            type: array
            minItems: 2
            maxItems: 100
            items:
              type: string
              maxLength: 50
              minLength: 1
          format:
            type: string
            enum:
            - html
            - uri
            - html-text
            - xpath
      required:
      - type
    CustomAttributeNumber:
      allOf:
      - $ref: '#/components/schemas/CustomAttribute'
      - properties:
          enum:
            type: array
            minItems: 2
            maxItems: 10
            items:
              type: number
      type: object
      required:
      - type
    CustomAttributeInteger:
      allOf:
      - $ref: '#/components/schemas/CustomAttribute'
      - properties:
          enum:
            type: array
            minItems: 2
            maxItems: 10
            items:
              type: integer
      type: object
      required:
      - type
    CustomAttributeArray:
      allOf:
      - $ref: '#/components/schemas/CustomAttribute'
      properties:
        items:
          $ref: '#/components/schemas/CustomAttribute'
      type: object
      required:
      - type
      - items
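      # Illustrative sketch, not taken from the upstream specification: an
      # array attribute whose items are strings. The description text is
      # hypothetical; only "type" and "items" are required by this schema.
      example:
        type: array
        description: Ingredients listed on the page.
        items:
          type: string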
    CustomAttributeObject:
      allOf:
      - $ref: '#/components/schemas/CustomAttribute'
      properties:
        properties:
          type: object
          additionalProperties:
            $ref: '#/components/schemas/CustomAttribute'
      type: object
      required:
      - type
      - properties
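      # Illustrative sketch, not taken from the upstream specification: an
      # object attribute with two nested attributes. The attribute names
      # ("rating", "verdict") and descriptions are hypothetical; the string
      # enum has 2 items to satisfy the minItems constraint defined in
      # CustomAttributeString.
      example:
        type: object
        description: Summary of a customer review.
        properties:
          rating:
            type: integer
            description: Star rating given by the reviewer.
          verdict:
            type: string
            enum:
            - positive
            - negative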
    Article:
      type: object
      properties:

        headline:
          description: Article headline or title.
          type: string
          example: Article headline

        articleBody:
          description: |
            Clean text of the article, including sub-headings, with newline separators.
          type: string
          example: Article body ...

        articleBodyHtml:
          description: |
            Simplified and standardized HTML of the article body, including sub-headings,
            image captions and embedded content (videos, tweets, etc.).
          type: string
          example: <article><p>Article body ... </p> ... </article>

        description:
          description: |
            A short summary of the article. It can be either human-provided
            (if available), or auto-generated.
          type: string
          example: Article summary

        datePublished:
          description: |
            Publication date. ISO-formatted with 'T' separator, may contain a timezone.
            If the actual publication date is not found, the "dateModified" value is used.
          type: string
          example: '2019-06-19T00:00:00'

        datePublishedRaw:
          description: |
            Same date as "datePublished", but before parsing/normalization, i.e. as
            it appears on the website.
          type: string
          example: June 19, 2019

        dateModified:
          description: |
            The date when the article was most recently modified.
            ISO-formatted with 'T' separator, may contain a timezone.
          type: string
          example: '2019-06-21T00:00:00'

        dateModifiedRaw:
          description: |
            Same date as "dateModified", but before parsing/normalization, i.e. as
            it appears on the website.
          type: string
          example: June 21, 2019

        authors:
          description: Authors of the article.
          type: array
          items:
            $ref: '#/components/schemas/Author'
          example:
          - name: Alice
            nameRaw: Alice and Bob
          - name: Bob
            nameRaw: Alice and Bob

        inLanguage:
          description: |
            Language of the article, as an ISO 639-1 language code. Example: "en".
            Sometimes article language is not the same as the web page overall
            language; to get the detected web page languages,
            see "webPageInfo".
          type: string
          example: en

        breadcrumbs:
          description: |
            A list of breadcrumbs (a specific navigation element)
            with optional `name` and `url`.
          example:
          - name: Home
            url: https://example.com/
          - name: Cell Phones
            url: https://example.com/cell-phones
          - name: Cell Phones & Accessories
          type: array
          items:
            $ref: '#/components/schemas/Breadcrumb'

        mainImage:
          $ref: '#/components/schemas/Image'
          description: The main image of the item.

        images:
          description: All images of the item (may include the main image).
          type: array
          items:
            $ref: '#/components/schemas/Image'

        videos:
          description: A list of all videos inside the article body.
          type: array
          items:
            type: object
            properties:
              url:
                description: Absolute URL of the video.
                type: string
                example: https://example.com/video.mp4
            required:
            - url

        audios:
          description: A list of all audios inside the article body.
          type: array
          items:
            type: object
            properties:
              url:
                description: Absolute URL of the audio.
                type: string
                example: https://example.com/audio.mp3
            required:
            - url

        url:
          description: URL of the page from which this article was extracted.
          type: string
          example: https://example.com/article/

        canonicalUrl:
          description: Canonical URL of the article, if available.
          type: string
          example: https://example.com/article

        metadata:
          $ref: '#/components/schemas/Metadata__metadata'

      required:
      - url
      - metadata
    Author:
      description: Author of the article.
      type: object
      properties:
        name:
          description: Full name of the author, e.g. "Alice".
          type: string
        nameRaw:
          description: Text from which this author name was extracted, e.g. "Alice and Bob".
          type: string
      example:
        name: Alice
        nameRaw: Alice and Bob
      required:
      - name
    Breadcrumb:
      description: |
        Breadcrumb item (a specific navigation element) with optional `name` and `url`.
      example:
        name: Home
        url: https://example.com/
      type: object
      properties:
        name:
          description: Text of the breadcrumb, as it appears on the website.
          type: string
        url:
          description: Absolute URL of the breadcrumb.
          type: string
    Image:
      description: Image.
      type: object
      properties:
        url:
          description: URL of an image.
          type: string
          example: http://example.com/item-1/image1.jpeg
      required:
      - url
    Metadata__metadata:
      description: Extracted item metadata for single-item data types.
      type: object
      properties:
        probability:
          description: |
            Probability that the extracted item is of the requested data type.
            It is close to 0 when the page does not contain the requested data type.
            For example, when single product extraction is requested with
            "product: true", but the page does not contain a product,
            the probability is close to 0.
            If an item of the requested type can be extracted from the page,
            the probability is closer to 1.
            The recommended probability threshold is 0.5,
            but extracted data is returned even if the probability is very low.
          type: number
          minimum: 0.0
          maximum: 1.0
          example: 0.87
        dateDownloaded:
          $ref: '#/components/schemas/DateDownloaded'
      required:
      - probability
      - dateDownloaded
    DateDownloaded:
      description: |
        The timestamp at which the data was downloaded.
        Timezone: UTC. Format: ISO 8601, "YYYY-MM-DDThh:mm:ssZ".
      type: string
      example: '2019-06-19T08:27:43Z'
    ArticleList:
      type: object
      properties:

        articles:
          description: List of articles available on this page.
          type: array
          items:
            type: object
            properties:

              url:
                description: |
                  URL of a detailed article page.
                  Pass this URL with "article: true" in the request to
                  extract detailed information about the article.
                type: string
                example: https://example.com/articles/1/

              headline:
                description: Article headline or title.
                type: string
                example: Article headline

              articleBody:
                description: |
                  Text of the article as it appears on the list page,
                  including sub-headings, with newline separators.
                type: string
                example: Article body ...

              datePublished:
                description: |
                  Publication date. ISO-formatted with 'T' separator, may contain a timezone.
                type: string
                example: '2019-06-19T00:00:00'

              datePublishedRaw:
                description: |
                  Same date as "datePublished", but before parsing/normalization, i.e. as
                  it appears on the website.
                type: string
                example: June 19, 2019

              authors:
                description: Authors of the article.
                type: array
                items:
                  $ref: '#/components/schemas/Author'
                example:
                - name: Alice
                  nameRaw: Alice and Bob
                - name: Bob
                  nameRaw: Alice and Bob

              inLanguage:
                description: |
                  Language of the article, as an ISO 639-1 language code. Example: "en".
                  Sometimes article language is not the same as the web page overall
                  language; to get the detected web page languages,
                  see "webPageInfo".
                type: string
                example: en

              mainImage:
                $ref: '#/components/schemas/Image'
                description: The main image of the item.

              images:
                description: All images of the item (may include the main image).
                type: array
                items:
                  $ref: '#/components/schemas/Image'

              metadata:
                $ref: '#/components/schemas/MetadataListItem'

            required:
            - metadata

        url:
          description: URL of the page from which this article list was extracted.
          type: string
          example: https://example.com/articles/
        metadata:
          $ref: '#/components/schemas/MetadataList'

      required:
      - url
      - metadata
    MetadataListItem:
      description: Item-level metadata for list data types.
      properties:
        probability:
          description: |
            Probability that an extracted list item is a valid item.
            Items that are unlikely to be valid are not returned,
            so normally no extra thresholding is needed for list items.
            This probability is not calibrated.
          type: number
          minimum: 0.0
          maximum: 1.0
          example: 0.34
      required:
      - probability
    MetadataList:
      description: Top-level metadata for list data types.
      properties:
        dateDownloaded:
          $ref: '#/components/schemas/DateDownloaded'
      required:
      - dateDownloaded
    ArticleNavigation:
      type: object
      properties:

        nextPage:
          $ref: '#/components/schemas/PaginationNext'
        pageNumber:
          $ref: '#/components/schemas/PageNumber'

        items:
          description: List of articles available on this page.
          type: array
          items:
            type: object
            properties:

              url:
                description: |
                  URL of a detailed article page.
                  Pass this URL with "article: true" in the request to
                  extract detailed information about the article.
                type: string
                example: https://example.com/articles/1/

              name:
                description: The name of the article or article link text.
                type: string
                example: Article name

              datePublished:
                description: |
                  Publication date. ISO-formatted with 'T' separator, may contain a timezone.
                type: string
                example: '2019-06-19T00:00:00'

              datePublishedRaw:
                description: |
                  Same date as "datePublished", but before parsing/normalization, i.e. as
                  it appears on the website.
                type: string
                example: June 19, 2019

              metadata:
                $ref: '#/components/schemas/MetadataListItem'

            required:
            - url
            - metadata

        url:
          description: URL of a page containing the list of articles.
          type: string
          example: https://example.com/articles/
        metadata:
          $ref: '#/components/schemas/MetadataList'

      required:
      - url
      - metadata
    PaginationNext:
      description: A link to the next page in the list.
      type: object
      properties:

        url:
          description: URL of the next page in the list.
          type: string
          example: http://example.com/foo?p=3

        name:
          description: Text of the link to the next page, if available.
          type: string
          example: '3'

      required:
      - url

    PageNumber:
      description: Integer describing the current page number. Starts at 1.
      type: integer
      example: 2
    ForumThread:
      type: object
      properties:

        topic:
          description: Topic that is discussed on the page.
          type: object
          properties:
            name:
              description: Name of the topic.
              type: string
              example: How do you cook rice?
          required:
          - name

        posts:
          description: List of posts available on this page, including the first or top post.
          type: array
          items:
            type: object
            properties:

              text:
                description: |
                  Text of the post.
                type: string
                example: Cooking rice is a hobby of mine. Here is how I cook it.

              datePublished:
                description: |
                  Publication date. ISO-formatted with 'T' separator, may contain a timezone.
                type: string
                example: '2019-06-19T00:00:00'

              datePublishedRaw:
                description: |
                  Same date as "datePublished", but before parsing/normalization, i.e. as
                  it appears on the website.
                type: string
                example: June 19, 2019

              reactions:
                description: Details of reactions to this post.
                type: object
                properties:

                  likes:
                    description: |
                      Number of up-votes or likes/stars received by the post.
                    type: integer
                    minimum: 0
                    example: 3

                  replies:
                    description: |
                      Number of replies received by the post.
                    type: integer
                    minimum: 0
                    example: 2

              metadata:
                $ref: '#/components/schemas/MetadataListItem'

            required:
            - metadata

        url:
          description: URL of the page from which this forum post list was extracted.
          type: string
          example: https://example.com/forum/thread/1/
        metadata:
          $ref: '#/components/schemas/MetadataList'

      required:
      - url
      - metadata
    JobPosting:
      type: object
      properties:

        jobTitle:
          description: The title of the job.
          type: string
          example: Regional Manager

        datePublished:
          description: |
            Publication date of the job posting.
            ISO-formatted with 'T' separator, may contain a timezone.
          type: string
          example: '2019-06-19T00:00:00'

        datePublishedRaw:
          description: |
            Same date as 'datePublished', but before parsing/normalization,
            i.e. as it appears on the website.
          type: string
          example: 19 June 2019

        validThrough:
          description: |
            The date after which the job posting is not valid,
            e.g. the end of an offer.
            ISO-formatted with 'T' separator, may contain a timezone.
          type: string
          example: '2019-08-20T00:00:00'

        description:
          description: |
            A description of the job posting including sub-headings,
            with newline separators.
          type: string
          example: Job Description ...

        descriptionHtml:
          description: |
            Simplified HTML of the description, including sub-headings,
            image captions and embedded content.
          type: string
          example: <article>HTML for Job Description ...

        employmentType:
          description: |
            Type of employment
            (e.g. full-time, part-time, contract, temporary, seasonal, internship).
          type: string
          example: Full-time

        hiringOrganization:
          description: Information about the organization offering the job position.
          type: object
          properties:
            name:
              description: Name of the organization.
              type: string
              example: ACME Corp.
          required:
          - name

        baseSalary:
          description: |
            The base salary of the job or of an employee in the proposed role.
          type: object
          properties:
            raw:
              description: Salary amount as it appears on the website.
              example: $53,251 a year
              type: string
            valueMax:
              description: |
                The maximum value of the base salary as a number string.
                If only one value is given for the salary instead of a range, valueMax represents it.
              example: '53251.0'
              type: string
            currency:
              description: |
                Currency associated with the salary amount.
                ISO 4217 standard.
              type: string
              example: USD
            currencyRaw:
              description: Currency associated with the salary amount, without normalization.
              type: string
              example: $

        jobLocation:
          description: |
            A (typically single) geographic location associated with the job position.
          type: object
          properties:
            raw:
              description: Job location as it appears on the website.
              type: string
              example: West New York, NJ 07093
          required:
          - raw

        url:
          description: URL of the page from which this job posting was extracted.
          type: string
          example: https://example.com/job

        metadata:
          $ref: '#/components/schemas/Metadata__metadata'

      required:
      - url
      - metadata
    JobPostingNavigation:
      type: object
      properties:

        nextPage:
          $ref: '#/components/schemas/PaginationNext'

        pageNumber:
          $ref: '#/components/schemas/PageNumber'

        items:
          description: List of job postings available on this page.
          type: array
          items:
            type: object
            properties:

              url:
                description: |
                  URL of a detailed job posting page.
                  Pass this URL with "jobPosting: true" in the request to
                  extract detailed information about the job posting.
                type: string
                example: https://example.com/jobs/1/

              name:
                description: The name of the job posting or job posting link text.
                type: string
                example: Job posting name

              metadata:
                $ref: '#/components/schemas/MetadataListItem'

            required:
            - metadata
            - url

        url:
          description: URL of the page containing the list of job postings.
          type: string
          example: https://example.com/jobs/

        metadata:
          $ref: '#/components/schemas/MetadataList'

      required:
      - url
      - metadata
    PageContent:
      type: object
      properties:

        breadcrumbs:
          description: |
            A list of breadcrumbs (a specific navigation element).
          example:
          - name: Home
            url: https://example.com/
          - name: Category
            url: https://example.com/category
          - name: Subcategory
          type: array
          items:
            $ref: '#/components/schemas/Breadcrumb'

        headline:
          description: A page headline.
          type: string
          example: Example page headline

        title:
          description: A page title extracted from the `<title>` tag of the page.
          type: string
          example: Example page title

        itemMain:
          description: |
            Text of the primary content of the page.

            It does not include navigation elements (headers, footers,
            sidebars or pagination links).
          type: string
          example: Example content snippet showing part of the page’s main text…

        itemMainXPath:
          description: |
            XPath for `itemMain`.

            It is an XPath 1.0 expression that points to the smallest HTML
            element that contains all of `itemMain`.

            The expression may only work with an HTML5-compliant parser.
          type: string
          example: //*[@id='homepage-container']/*[1]

        navigationHeader:
          description: |
            Navigation items from the header.

            They are typically for site-wide navigation, not page-specific.
          type: array
          items:
            type: object
            properties:

              url:
                description: URL.
                type: string
                example: https://example.com/category/

              name:
                description: Name.
                type: string
                example: Category name

            required:
            - url

        navigationFooter:
          description: |
            Navigation items from the footer.

            They are typically for site-wide navigation, not page-specific.
          type: array
          items:
            type: object
            properties:

              url:
                description: URL.
                type: string
                example: https://example.com/policy/

              name:
                description: Name.
                type: string
                example: Privacy Policy

            required:
            - url

        navigationSidebar:
          description: |
            Navigation items from the sidebars.

            They are typically for site-wide navigation, not page-specific.
          type: array
          items:
            type: object
            properties:

              url:
                description: URL.
                type: string
                example: https://example.com/sidebar-link/

              name:
                description: Name.
                type: string
                example: Sidebar link

            required:
            - url

        pagination:
          description: |
            Pagination items.

            Items to navigate content pages, either relative to the current
            page (e.g. current, next, previous) or absolute (e.g. first, last,
            specific page number).
          type: array
          items:
            type: object
            properties:

              url:
                description: URL.
                type: string
                example: https://example.com/?page=2

              name:
                description: Name.
                type: string
                example: Next

            required:
            - url

        nextPage:
          $ref: '#/components/schemas/PaginationNext'

        url:
          description: URL of the page.
          type: string
          example: https://example.com/example-page/

        canonicalUrl:
          description: Canonical URL of the page, if available.
          type: string
          example: https://example.com/canonical-url-page/

        metadata:
          $ref: '#/components/schemas/Metadata__metadata'

      required:
      - url
      - metadata
    Product:
      type: object
      required:
      - url
      - metadata
      properties:
        name:
          $ref: '#/components/schemas/Name'
        price:
          $ref: '#/components/schemas/Price'
        currency:
          $ref: '#/components/schemas/Currency'
        currencyRaw:
          $ref: '#/components/schemas/CurrencyRaw'
        regularPrice:
          $ref: '#/components/schemas/RegularPrice'
        availability:
          $ref: '#/components/schemas/Availability'
        sku:
          $ref: '#/components/schemas/Sku'
        mpn:
          $ref: '#/components/schemas/Mpn'
        gtin:
          description: |
            Standardized GTIN product identifier which is unique for
            a product across different sellers.
          type: array
          items:
            $ref: '#/components/schemas/Gtin'
        brand:
          description: |
            Brand or manufacturer of the product.
          type: object
          properties:
            name:
              description: Name of the brand.
              type: string
              example: Product brand
          required:
          - name
        breadcrumbs:
          description: |
            A list of breadcrumbs (a specific navigation element)
            with optional `name` and `url`.
          example:
          - name: Home
            url: https://example.com/
          - name: Cell Phones
            url: https://example.com/cell-phones
          - name: Cell Phones & Accessories
          type: array
          items:
            $ref: '#/components/schemas/Breadcrumb'
        mainImage:
          $ref: '#/components/schemas/Image'
          description: The main image of the item.
        images:
          description: All images of the item (may include the main image).
          type: array
          items:
            $ref: '#/components/schemas/Image'
        description:
          description: Description of the product.
          type: string
          example: product description
        descriptionHtml:
          description: >
            Simplified HTML of the description, including sub-headings, image captions and embedded content.
          type: string
          example: <article>HTML description for Product ...
        aggregateRating:
          description: |
            The overall rating, based on a collection of reviews or ratings.

            ![](https://docs.zyte.com/_static/images/schemas/rating.png)
          type: object
          properties:
            ratingValue:
              description: The average rating value.
              type: number
              example: 4.0
            bestRating:
              description: The highest value allowed in this rating system.
              type: number
              example: 5.0
            reviewCount:
              description: The total number of reviews or ratings for the product.
              type: integer
              minimum: 0
              example: 24
        color:
          $ref: '#/components/schemas/Color'
        size:
          $ref: '#/components/schemas/Size'
        weight:
          $ref: '#/components/schemas/Weight'
        material:
          description: |
            The materials from which the product is made. Contains all product materials on the page.
          type: string
          example: Metal, Plastic
        style:
          $ref: '#/components/schemas/Style'
        additionalProperties:
          description: |
            A list of properties or characteristics.

            * name field contains the property name,
            * value field contains the property value.

            ![](https://docs.zyte.com/_static/images/schemas/product_info.png)
          type: array
          items:
            $ref: '#/components/schemas/AdditionalProperty'
        features:
          description: |
            A list of features of the Product.

            Product features are generally found on the product page as a
            list, usually a bulleted one.
          type: array
          items:
            type: string
          example:
          - Multi-System Compatible
          - HD Ready 1366 x 768 LED Panel
          - REFRESH RATE 100Hz PQI
        url:
          $ref: '#/components/schemas/Url'
        canonicalUrl:
          $ref: '#/components/schemas/CanonicalUrl'
        metadata:
          $ref: '#/components/schemas/Metadata__metadata'
        variants:
          description: |
            Array of product variants, using the same Product schema.
            Represents extra information available about the variants of a product.
            All variants are included in this array, including the variant
            shown on the page. If a field of a variant is empty,
            either its value is the same as in the top-level product,
            or the extraction API could not extract it.
          type: array
          items:
            type: object
            properties:
              name:
                $ref: '#/components/schemas/Name'
              price:
                $ref: '#/components/schemas/Price'
              currency:
                $ref: '#/components/schemas/Currency'
              currencyRaw:
                $ref: '#/components/schemas/CurrencyRaw'
              regularPrice:
                $ref: '#/components/schemas/RegularPrice'
              availability:
                $ref: '#/components/schemas/Availability'
              sku:
                $ref: '#/components/schemas/Sku'
              mpn:
                $ref: '#/components/schemas/Mpn'
              gtin:
                description: |
                  Standardized GTIN product identifier which is unique for
                  a product across different sellers.
                type: array
                items:
                  $ref: '#/components/schemas/Gtin'
              mainImage:
                $ref: '#/components/schemas/Image'
                description: The main image of the item.
              images:
                description: All images of the item (may include the main image).
                type: array
                items:
                  $ref: '#/components/schemas/Image'
              color:
                $ref: '#/components/schemas/Color'
              size:
                $ref: '#/components/schemas/Size'
              style:
                $ref: '#/components/schemas/Style'
              additionalProperties:
                description: |
                  A list of properties or characteristics.

                  * name field contains the property name,
                  * value field contains the property value.

                  ![](https://docs.zyte.com/_static/images/schemas/product_info.png)
                type: array
                items:
                  $ref: '#/components/schemas/AdditionalProperty'
              url:
                $ref: '#/components/schemas/Url'
              canonicalUrl:
                $ref: '#/components/schemas/CanonicalUrl'
    Name:
      description: The name of the product.
      type: string
      example: Product name

    Price:
      description: >
        The price at which the product is being offered. If there is only one price associated with the offer, it is returned in this field.
      type: string
      pattern: ^[0-9]+(\.[0-9]+)?$
      example: '149'

    Currency:
      description: >
        The ISO 4217 code of the currency in which the price is expressed.
      type: string
      pattern: ^[A-Z]{3}$
      example: USD

    CurrencyRaw:
      description: >
        The currency as given on the website, without extra normalization (for example, both "$" and "USD" are possible currencies).
      type: string
      example: $

    RegularPrice:
      description: >
        The price before any discount or special offer.
      type: string
      pattern: ^[0-9]+(\.[0-9]+)?$
      example: '199.00'

    Availability:
      description: >
        Availability, as a string. Allowed values:

          * `"InStock"` - includes limited availability, presale,
            preorder, and in-store only.
          * `"OutOfStock"` - includes discontinued and sold out.

      example: InStock
      type: string
      enum:
      - InStock
      - OutOfStock
    Sku:
      description: |
        The Stock Keeping Unit (SKU), i.e. a merchant-specific identifier
        for the product, assigned by the seller.

        ![](https://docs.zyte.com/_static/images/schemas/sku.png)
      example: A123DK9823
      type: string

    Mpn:
      description: The Manufacturer Part Number (MPN) of the product. It is issued by the manufacturer, and is the same across different e-commerce websites.
      type: string
      example: code-123

    Gtin:
      type: object
      description: >
        Standardized GTIN product identifier which is unique for a product across different sellers.
      example:
        type: isbn13
        value: 9781933624341
      properties:
        type:
          description: |
            `gtin14` corresponds to former names
            *EAN/UCC-14*, *SCC-14*, *DUN-14*, *UPC Case Code*,
            *UPC Shipping Container Code*.

            `gtin13` also includes the *JAN* (Japanese Article Number).
          enum:
          - gtin8
          - gtin13
          - gtin14
          - isbn10
          - isbn13
          - ismn
          - issn
          - upc
          type: string
        value:
          description: The GTIN value as a string.
          type: string
      required:
      - type
      - value

    Color:
      description: Color of the product.
      type: string
      example: Red

    Size:
      description: |
        A standardized size of a product,
        specified through a simple textual string (for example "XL", "32Wx34L").
        A single product dimension (height, width) is not considered as the size.
      type: string
      example: XL

    Weight:
      type: object
      properties:

        value:
          description: |
            A weight value expressed as a floating point number.
          type: number
          example: 120.0

        unit:
          description: |
            A normalized unit of weight, like kilogram / ounce / pound and others.
          type: string
          example: kilogram

        rawUnit:
          description: |
            The unit of weight without normalization, as it was extracted from the page.
            The normalized version of rawUnit is in the 'unit' attribute.
          type: string
          example: kg

    Style:
      description: |
        Style of the product.
        It can also be referred to as pattern/finish on the product page.
        Example values: "Polka dots", "Striped",
        "Nickel finish with Translucent glass", etc.
      type: string
      example: Striped

    AdditionalProperty:
      description: |
        Additional property or characteristic.

        * name field contains the property name,
        * value field contains the property value.

        ![](https://docs.zyte.com/_static/images/schemas/product_info.png)
      example:
        name: batteries
        value: 1 Lithium ion batteries required. (included)
      type: object
      properties:
        name:
          description: Property name.
          type: string
        value:
          description: Property value.
          type: string
      required:
      - name
    Url:
      description: URL of a page where this product was extracted.
      type: string
      example: https://example.com/product/

    CanonicalUrl:
      description: Canonical URL of the product, if available.
      type: string
      example: https://example.com/product/

    ProductList:
      type: object
      properties:

        breadcrumbs:
          description: |
            A list of breadcrumbs (a specific navigation element)
            with optional `name` and `url`.
          example:
          - name: Home
            url: https://example.com/
          - name: Cell Phones
            url: https://example.com/cell-phones
          - name: Cell Phones & Accessories
          type: array
          items:
            $ref: '#/components/schemas/Breadcrumb'

        products:
          description: List of products available on this page.
          type: array
          items:
            type: object
            properties:

              url:
                description: |
                  URL of a detailed product page.
                  Pass this URL with "product: true" in the request to
                  extract detailed information about the product.
                type: string
                example: https://example.com/products/1/

              name:
                description: The name of the product.
                type: string
                example: Product name

              price:
                $ref: '#/components/schemas/Price'
              currencyRaw:
                $ref: '#/components/schemas/CurrencyRaw'
              currency:
                $ref: '#/components/schemas/Currency'
              regularPrice:
                $ref: '#/components/schemas/RegularPrice'

              mainImage:
                $ref: '#/components/schemas/Image'
                description: The main image of the item.

              metadata:
                $ref: '#/components/schemas/MetadataListItem'

            required:
            - metadata

        url:
          description: URL of a page where this product list was extracted.
          type: string
          example: https://example.com/products/

        metadata:
          $ref: '#/components/schemas/MetadataList'

        categoryName:
          description: Name of the category in which the listed products are.
          type: string
          example: Sports & Outdoors

      required:
      - url
      - metadata
    ProductNavigation:
      type: object
      properties:

        categoryName:
          description: Name of the category in which the listed products are found.
          type: string
          example: Sports & Outdoors

        nextPage:
          $ref: '#/components/schemas/PaginationNext'

        pageNumber:
          $ref: '#/components/schemas/PageNumber'

        items:
          description: List of products available on this page.
          type: array
          items:
            type: object
            properties:

              url:
                description: |
                  URL of a detailed product page.
                  Pass this URL with "product: true" in the request to
                  extract detailed information about the product.
                type: string
                example: https://example.com/products/1/

              name:
                description: The name of the product or product link text.
                type: string
                example: Product name

              metadata:
                $ref: '#/components/schemas/MetadataListItem'

            required:
            - metadata
            - url

        subCategories:
          description: List of subcategory links found on this page.
          type: array
          items:
            type: object
            properties:

              url:
                description: |
                  URL of the subcategory.
                type: string
                example: https://example.com/category/1/

              name:
                description: The name of the subcategory or subcategory link text.
                type: string
                example: Category name

              metadata:
                $ref: '#/components/schemas/MetadataListItem'

            required:
            - metadata
            - url

        url:
          description: URL of a page.
          type: string
          example: https://example.com/products/

        metadata:
          $ref: '#/components/schemas/MetadataList'

      required:
      - url
      - metadata
```
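
The `variants` description in the schema above says that an empty field in a variant means either that the value matches the top-level product or that it could not be extracted. The following Python sketch shows one way to apply that fallback on the client side. The `resolve_variant` and `parse_price` helpers and the sample data are hypothetical, and copying the top-level value is only a heuristic: it cannot tell the two cases apart.

```python
from decimal import Decimal

# Fields that exist both on the top-level product and on each variant.
SHARED_FIELDS = (
    "name", "price", "currency", "currencyRaw", "regularPrice",
    "availability", "sku", "mpn", "color", "size", "style",
)


def resolve_variant(product: dict, variant: dict) -> dict:
    """Return a copy of the variant with empty fields taken from the product."""
    resolved = dict(variant)
    for field in SHARED_FIELDS:
        if resolved.get(field) in (None, "") and field in product:
            resolved[field] = product[field]
    return resolved


def parse_price(price: str) -> Decimal:
    """Prices are strings such as '149' or '199.00'; Decimal keeps them exact."""
    return Decimal(price)


# Hypothetical extraction output, shaped after the Product schema.
product = {
    "name": "Product name",
    "price": "149",
    "currency": "USD",
    "regularPrice": "199.00",
    "color": "Red",
    "variants": [
        {"size": "XL"},
        {"size": "L", "price": "139.50"},
    ],
}

resolved = [resolve_variant(product, v) for v in product["variants"]]
for item in resolved:
    print(item["size"], item["color"], parse_price(item["price"]))
```

Parsing prices with `Decimal` rather than `float` avoids rounding surprises when prices are later summed or compared.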

## Zyte API stats API

> ###### TIP
>
> For the reference documentation of the HTTP API of Zyte API itself,
> see zapi-reference.

The [Zyte dashboard](https://app.zyte.com) has a [Stats](https://app.zyte.com/o/stats/usage) page that lets you monitor different
aspects of your Zyte API requests, including cost, response time, and features
used.

Zyte API also offers an HTTP API to query your Zyte API requests.

### Authentication

All requests require [basic authentication](https://datatracker.ietf.org/doc/html/rfc7617#section-2), with your [Zyte dashboard API
key](https://app.zyte.com/o/settings) (not your Zyte API key) as username, and no password. For example, if
your API key is `foo`, you base64-encode `foo:` as `Zm9vOg==` and send
the `Authorization` header with value `Basic Zm9vOg==`.

```none
Authorization: Basic Zm9vOg==
```
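
As a minimal sketch, the header value can be built in Python with the standard library alone. The `basic_auth_header` helper name is an assumption made for illustration, and most HTTP clients can build this header for you when given a username and an empty password.

```python
import base64


def basic_auth_header(api_key: str) -> str:
    """Build the Authorization header value: API key as username, no password."""
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


print(basic_auth_header("foo"))  # Basic Zm9vOg==
```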

![](zyte-api/usage/images/account_settings.png)

### Basic usage

The most basic request only requires an organization ID.

To find your organization ID, open the [Zyte dashboard](https://app.zyte.com) and
copy your organization ID from the browser address bar.
For example, if the URL is `https://app.zyte.com/o/000000`,
`000000` is your organization ID.

```bash
curl \
    --user YOUR_STATS_API_KEY: \
    --compressed \
    'https://zyte-api-stats.zyte.com/api/stats?organization_id=000000'
```

```json
{
    "page": 1,
    "page_size": 500,
    "results": [
        {
            "cost_microusd_avg": "1335.10",
            "cost_microusd_p80": "2040.00",
            "cost_microusd_total": "584773.00",
            "organization_id": 000000,
            "request_count": 438,
            "response_time_sec_avg": "5.49",
            "response_time_sec_p80": "6.40",
            "status_codes": [
                {
                    "code": null,
                    "count": 3
                },
                {
                    "code": 200,
                    "count": 432
                },
                {
                    "code": 404,
                    "count": 3
                }
            ]
        }
    ],
    "total_result_count": 1
}
```
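
Note that the cost and response time values arrive as strings. The following Python sketch summarizes one entry of `results`. It assumes that `microusd` means millionths of a US dollar, as the field names suggest, and that status code 200 counts as a success; the `summarize` helper is hypothetical.

```python
from decimal import Decimal

MICROUSD_PER_USD = Decimal(1_000_000)  # assumption: 1 USD = 1,000,000 micro-USD


def summarize(result: dict) -> dict:
    """Compute the total cost in USD and the share of 200 responses."""
    counts = result["status_codes"]
    total = sum(entry["count"] for entry in counts)
    ok = sum(entry["count"] for entry in counts if entry["code"] == 200)
    return {
        "total_cost_usd": Decimal(result["cost_microusd_total"]) / MICROUSD_PER_USD,
        "success_rate": ok / total if total else 0.0,
    }


# Values copied from the sample response above.
sample_result = {
    "cost_microusd_total": "584773.00",
    "request_count": 438,
    "status_codes": [
        {"code": None, "count": 3},
        {"code": 200, "count": 432},
        {"code": 404, "count": 3},
    ],
}

summary = summarize(sample_result)
print(summary["total_cost_usd"], f"{summary['success_rate']:.1%}")
```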

### Rate limiting

The stats API has a rate limit of 20 requests per minute. Anything above that
will trigger a 429 response.
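
One way to stay under the limit is to space requests evenly on the client side: 20 requests per minute works out to one request every 3 seconds. The `Throttle` class below is a hypothetical sketch of that idea, not part of any Zyte library; the clock and sleep functions are injectable so that the logic can be tested without waiting.

```python
import time


class Throttle:
    """Space calls so that at most `max_per_minute` happen per minute."""

    def __init__(self, max_per_minute=20, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / max_per_minute  # seconds between requests
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may proceed

    def wait(self):
        """Block until the next request is allowed, then reserve a slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval


throttle = Throttle(max_per_minute=20)
print(throttle.interval)  # 3.0
```

Call `throttle.wait()` before each stats API request. If a 429 response still occurs, for example because several processes share the same key, waiting before retrying is a reasonable fallback.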

### Grafana dashboard

Follow the steps below to replicate the [Grafana](https://grafana.com/grafana/) dashboard shown in the screenshot, which visualizes data from the stats API.

1. First, install the [Infinity](https://grafana.com/grafana/plugins/yesoreyeram-infinity-datasource/) plugin on your Grafana instance.
2. Add the newly installed data source into the [Data Sources](https://grafana.com/docs/plugins/yesoreyeram-infinity-datasource/latest/setup/configuration/) section and configure it to fetch data from [https://zyte-api-stats.zyte.com](https://zyte-api-stats.zyte.com) with your Stats dashboard API key.
   ![](zyte-api/usage/images/infinity_auth.png)![](zyte-api/usage/images/infinity_url.png)
3. [Import the dashboard](https://grafana.com/docs/grafana/latest/dashboards/build-dashboards/import-dashboards/) from the file [stats_api_demo.json](https://docs.zyte.com/_static/stats_api_demo.json).
4. Paste your organization ID into the “organization_id” field as shown in the screenshot below.
   ![](zyte-api/usage/images/grafana_screenshot.png)

### Google Looker Studio dashboard

Follow the steps below to replicate the [Google Looker Studio](https://lookerstudio.google.com/) dashboard shown in the screenshot, which visualizes data from the stats API.

1. First, connect to the [Zyte API Stats Connector](https://datastudio.google.com/datasources/create?connectorId=AKfycbyh_6V56h157XDWT97HQZEyc_cQlE7bdL633-9AfUWaOWwTkHJ-UyRTMvmUVzunKm_JYQ&authuser=0).
2. When asked for the API key, provide your Stats dashboard API key (not your Zyte API key).
3. Check all of the “Allow … to be modified in reports.” checkboxes.
4. Paste your organization ID into the “organization_id” parameter.
5. Click the “Connect”, “Allow”, “Create report” and “Create report” buttons.

![](zyte-api/usage/images/lookerstudio_screenshot.png)

### Reference

```yaml
components:
  schemas:
    HTTPError:
      properties:
        detail:
          type: object
        message:
          type: string
      type: object
    StatsResponse:
      properties:
        page:
          minimum: 1
          type: integer
        page_size:
          maximum: 500
          minimum: 1
          type: integer
        results:
          items:
            $ref: '#/components/schemas/StatsResult'
          type: array
        total_result_count:
          minimum: 1
          type: integer
      required:
      - page
      - page_size
      - total_result_count
      type: object
    StatsResult:
      properties:
        cost_microusd_avg:
          minimum: '0.00'
          type: number
        cost_microusd_p80:
          minimum: '0.00'
          type: number
        cost_microusd_total:
          minimum: '0.00'
          type: number
        day:
          format: date-time
          type: string
        domain:
          maxLength: 256
          minLength: 0
          type: string
        domain_health:
          description: "Domain health information. Returned only when `include_domain_health=true` and `groupby_domain=true`.\n\nIt aims to show detailed stats from your Top 100 most requested domains in the last 7 days. Domains not recently used or not within the Top 100 domains will have a `null` value. These stats are not real-time; they are calculated once every 3 hours."
          type: object
          properties:
            global_avg_success_rate_24h:
              type: string
            global_avg_success_rate_7d:
              type: string
            my_avg_price_microusd_24h:
              type: string
            my_avg_price_microusd_7d:
              type: string
            my_avg_response_time_24h:
              type: string
            my_avg_response_time_7d:
              type: string
            my_requests_24h:
              type: integer
            my_requests_7d:
              type: integer
            my_success_rate_24h:
              type: string
            my_success_rate_7d:
              type: string
            status:
              type: string
              enum:
              - healthy
              - possible_misconfiguration
              - issue_under_investigation
              - possible_performance_issue
            total_spent_microusd_24h:
              type: string
            total_spent_microusd_7d:
              type: string
            total_successful_requests_24h:
              type: integer
            total_successful_requests_7d:
              type: integer
        hour:
          format: date-time
          type: string
        month:
          format: date-time
          type: string
        organization_id:
          type: integer
        request_count:
          minimum: 1
          type: integer
        response_time_sec_avg:
          minimum: '0.00'
          type: number
        response_time_sec_p80:
          minimum: '0.00'
          type: number
        status_codes:
          items:
            additionalProperties:
              minimum: 0
              nullable: true
              type: integer
            type: object
          type: array
        year:
          format: date-time
          type: string
      required:
      - cost_microusd_avg
      - cost_microusd_p80
      - cost_microusd_total
      - organization_id
      - request_count
      - response_time_sec_avg
      - response_time_sec_p80
      type: object
    ValidationError:
      properties:
        detail:
          properties:
            <location>:
              properties:
                <field_name>:
                  items:
                    type: string
                  type: array
              type: object
          type: object
        message:
          type: string
      type: object
  securitySchemes:
    BasicAuth:
      scheme: basic
      type: http
info:
  title: APIFlask
  version: 0.1.0
openapi: 3.0.3
paths:
  /api/stats:
    get:
      parameters:
      - in: query
        name: organization_id
        required: true
        schema:
          type: integer
      - in: query
        name: page
        required: false
        schema:
          default: 1
          minimum: 1
          type: integer
      - in: query
        name: page_size
        required: false
        schema:
          default: 500
          maximum: 500
          minimum: 1
          type: integer
      - description: "The start date and time in\n[ISO 8601-1](https://en.wikipedia.org/wiki/ISO_8601) format (e.g. `2024-09-10T00:00:00Z`).\n\nIt defaults to 7 days in the past."
        in: query
        name: start_time
        required: false
        schema:
          format: date-time
          type: string
      - description: "The end date and time in\n[ISO 8601-1](https://en.wikipedia.org/wiki/ISO_8601) format (e.g. `2024-09-17T00:00:00Z`).\n\nIt defaults to the current date and time."
        in: query
        name: end_time
        required: false
        schema:
          format: date-time
          type: string
      - in: query
        name: domains
        required: false
        schema:
          maxLength: 64
          minLength: 0
      - in: query
        name: apikey_labels
        required: false
        schema:
          maxLength: 64
          minLength: 0
      - in: query
        name: response_codes
        required: false
        schema:
          maxLength: 64
          minLength: 0
      - in: query
        name: requested_features
        required: false
        schema:
          enum:
          - actions
          - browserHtml
          - fileDownload
          - httpResponseBody
          - networkCapture
          - screenshot
          - sessionContext
          - extendedGeolocation
      - in: query
        name: extraction_type
        required: false
        schema:
          enum:
          - article
          - articleList
          - articleNavigation
          - forumThread
          - jobPosting
          - jobPostingNavigation
          - pageContent
          - product
          - productList
          - productNavigation
          - serp
      - in: query
        name: extraction_from
        required: false
        schema:
          enum:
          - httpResponseBody
          - browserHtml
      - description: "Filter requests by\n[tags](/zyte-api/usage/reference.md).\n \nIt must be a comma-separated list of values, where each value can be:\n \n- A key-value pair separated by a colon, i.e. ``<tag>:<value>``, to include only\nrequests where the specified tag has the specified value.\n- A tag, to include only requests where the specified tag exists.\n\nOnly requests that match *all* the specified tag filters will be\nincluded in the results."
        in: query
        name: tags
        required: false
        schema:
          maxLength: 64
          minLength: 0
      - in: query
        name: groupby_time
        required: false
        schema:
          default:
          enum:
          - hour
          - day
          - month
          - year
          -
          nullable: true
          type: string
      - description: "Group results by domain.\n\nWhen set to `true`, the response will include a `domain` field for each result."
        in: query
        name: groupby_domain
        required: false
        schema:
          default: false
          type: boolean
      - description: "Include domain health information in the response.\n\nRequires `groupby_domain=true`. If `include_domain_health=true` is specified without `groupby_domain=true`, a validation error will be returned."
        in: query
        name: include_domain_health
        required: false
        schema:
          default: false
          type: boolean
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/StatsResponse'
          description: Successful response
        '401':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HTTPError'
          description: Authentication error
        '422':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ValidationError'
          description: Validation error
      security:
      - BasicAuth: []
      summary: Stats
servers:
- name: Production Server
  url: https://zyte-api-stats.zyte.com
```
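
The query parameters in the reference above can be combined into a single request URL. The following Python sketch builds such a URL and enforces two constraints stated in the reference: `include_domain_health` requires `groupby_domain`, and `page_size` must be between 1 and 500. The `build_stats_url` and `page_count` helpers, the organization ID, and the tag names `env` and `team` are all hypothetical.

```python
from math import ceil
from urllib.parse import urlencode

BASE_URL = "https://zyte-api-stats.zyte.com/api/stats"


def build_stats_url(organization_id, *, groupby_time=None, groupby_domain=False,
                    include_domain_health=False, tags=None, page=1, page_size=500):
    """Build a stats API URL from a subset of the documented query parameters."""
    if include_domain_health and not groupby_domain:
        raise ValueError("include_domain_health requires groupby_domain=true")
    if not 1 <= page_size <= 500:
        raise ValueError("page_size must be between 1 and 500")
    params = {"organization_id": organization_id, "page": page, "page_size": page_size}
    if groupby_time:
        params["groupby_time"] = groupby_time  # hour, day, month, or year
    if groupby_domain:
        params["groupby_domain"] = "true"
    if include_domain_health:
        params["include_domain_health"] = "true"
    if tags:
        # Comma-separated; each item is either "<tag>" or "<tag>:<value>".
        params["tags"] = ",".join(tags)
    return f"{BASE_URL}?{urlencode(params)}"


def page_count(total_result_count, page_size=500):
    """Number of pages needed to read every result."""
    return ceil(total_result_count / page_size)


url = build_stats_url(
    123456,
    groupby_time="day",
    groupby_domain=True,
    include_domain_health=True,
    tags=["env:prod", "team"],
)
print(url)
print(page_count(1200))  # 3
```

The resulting URL can be passed to any HTTP client together with the basic authentication described earlier; increment `page` until `page_count(total_result_count, page_size)` pages have been read.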

## Zyte Search API

The **Search API** provides a typed interface for search engine queries. Send a keyword
and a domain, get back structured organic results — no URL construction or HTML
parsing needed on the client side.

`POST` to `https://api.zyte.com/v1/search` with your [Zyte API key](https://app.zyte.com/o/zyte-api/api-access):

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data '{
        "domain": "search.engine.com",
        "query": "web scraping tools",
        "include": ["organic"]
    }' \
    https://api.zyte.com/v1/search
```

> ##### Quickstart
>
> Get your first search results in minutes.
>
> Get started

> ##### Request parameters
>
> Full reference for `domain`, `query`, `include`,
> `maxResults`, and `queryParameters`.
>
> Parameters

> ##### Response schema
>
> Response fields including `organicResults`, `html`,
> `fetchedAt`, and `meta`.
>
> Response

> ##### Geo-targeting
>
> Target specific countries, languages, and search domains.
>
> Geo-targeting

## Quickstart

This guide shows how to make your first Search API request and get structured
organic results back.

### Prerequisites

You need a [Zyte API key](https://app.zyte.com/o/zyte-api/api-access).

### Basic request

Send a `POST` request to `https://api.zyte.com/v1/search` with `domain`
and `query`. Use `include` to control what you get back.

#### curl

input.json
```json
{
    "domain": "search.engine.com",
    "query": "web scraping tools",
    "include": ["organic"]
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/search \
    | jq .organicResults
```

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);
client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"domain", "search.engine.com"},
    {"query", "web scraping tools"},
    {"include", new[] {"organic"}}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/search", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var organicResults = data.RootElement.GetProperty("organicResults").ToString();

Console.WriteLine(organicResults);
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonArray;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.List;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "domain", "search.engine.com",
            "query", "web scraping tools",
            "include", List.of("organic"));
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/search");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          JsonArray organicResults = jsonObject.get("organicResults").getAsJsonArray();
          Gson gson = new GsonBuilder().setPrettyPrinting().create();
          System.out.println(gson.toJson(organicResults));
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/search',
  {
    domain: 'search.engine.com',
    query: 'web scraping tools',
    include: ['organic']
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const organicResults = response.data.organicResults
  console.log(organicResults)
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/search', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'domain' => 'search.engine.com',
        'query' => 'web scraping tools',
        'include' => ['organic'],
    ],
]);
$data = json_decode($response->getBody());
$organicResults = json_encode($data->organicResults);
echo $organicResults.PHP_EOL;
```

#### Python

```python
import requests

api_response = requests.post(
    "https://api.zyte.com/v1/search",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "domain": "search.engine.com",
        "query": "web scraping tools",
        "include": ["organic"],
    },
)
organic_results = api_response.json()["organicResults"]
print(organic_results)
```

The response contains a structured `organicResults` array:

```json
{
    "status": "success",
    "url": "https://www.example-engine.com/search?q=web+scraping+tools",
    "fetchedAt": "2026-05-11T09:36:57Z",
    "meta": {
        "requestedAt": "2026-05-11T09:36:39Z"
    },
    "organicResults": [
        {
            "rank": 1,
            "title": "Zyte - Web Scraping API",
            "url": "https://www.zyte.com/",
            "snippet": "The leading web scraping platform...",
            "displayedUrl": "zyte.com"
        }
    ]
}
```
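
As a minimal sketch of consuming that response in Python, the snippet below sorts the entries by `rank` and keeps a few fields from each one. The `top_results` helper is hypothetical, and the sample data is copied from the response above.

```python
import json

# Sample response copied from the documentation above.
sample_response = json.loads("""
{
    "status": "success",
    "url": "https://www.example-engine.com/search?q=web+scraping+tools",
    "fetchedAt": "2026-05-11T09:36:57Z",
    "meta": {"requestedAt": "2026-05-11T09:36:39Z"},
    "organicResults": [
        {
            "rank": 1,
            "title": "Zyte - Web Scraping API",
            "url": "https://www.zyte.com/",
            "snippet": "The leading web scraping platform...",
            "displayedUrl": "zyte.com"
        }
    ]
}
""")


def top_results(response: dict, limit: int = 10) -> list:
    """Return (rank, title, url) tuples for the first `limit` organic results."""
    results = sorted(response.get("organicResults", []), key=lambda r: r["rank"])
    return [(r["rank"], r["title"], r["url"]) for r in results[:limit]]


for rank, title, url in top_results(sample_response):
    print(rank, title, url)
```

Using `response.get("organicResults", [])` keeps the helper from failing when a request did not include `organic` in `include`.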

### Getting raw HTML

Use `include: ["html"]` to get the raw rendered HTML instead of parsed
results. You can also request both at once:

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data '{
        "domain": "search.engine.com",
        "query": "web scraping tools",
        "include": ["html", "organic"]
    }' \
    https://api.zyte.com/v1/search
```

### More results

Set `maxResults` to get up to 100 results in a single call. The platform
fetches multiple pages automatically and returns them in one
`organicResults` array:

#### curl

input.json
```json
{
    "domain": "search.engine.com",
    "query": "web scraping tools",
    "include": ["organic"],
    "maxResults": 100
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/search \
    | jq .organicResults
```

#### C#

```cs
var input = new Dictionary<string, object>(){
    {"domain", "search.engine.com"},
    {"query", "web scraping tools"},
    {"include", new[] {"organic"}},
    {"maxResults", 100}
};
```

#### Java

```java
Map<String, Object> parameters =
    ImmutableMap.of(
        "domain", "search.engine.com",
        "query", "web scraping tools",
        "include", List.of("organic"),
        "maxResults", 100);
```

#### JS

```js
axios.post(
  'https://api.zyte.com/v1/search',
  {
    domain: 'search.engine.com',
    query: 'web scraping tools',
    include: ['organic'],
    maxResults: 100
  },
  { auth: { username: 'YOUR_ZYTE_API_KEY' } }
).then((response) => {
  console.log(response.data.organicResults)
})
```

#### PHP

```php
$response = $client->request('POST', 'https://api.zyte.com/v1/search', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'json' => [
        'domain' => 'search.engine.com',
        'query' => 'web scraping tools',
        'include' => ['organic'],
        'maxResults' => 100,
    ],
]);
```

#### Python

```python
api_response = requests.post(
    "https://api.zyte.com/v1/search",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "domain": "search.engine.com",
        "query": "web scraping tools",
        "include": ["organic"],
        "maxResults": 100,
    },
)
organic_results = api_response.json()["organicResults"]
```

#### Python client

```python
api_response = requests.post(
    "https://api.zyte.com/v1/search",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "domain": "search.engine.com",
        "query": "web scraping tools",
        "include": ["organic"],
        "maxResults": 100,
    },
)
organic_results = api_response.json()["organicResults"]
```

### Geo-targeting

Pass `queryParameters` to target a specific country and language:

#### curl

input.json
```json
{
    "domain": "search.engine.com",
    "query": "web scraping tools",
    "include": ["organic"],
    "queryParameters": {
        "style": "engineSpecific",
        "gl": "us",
        "hl": "en"
    }
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/search \
    | jq .organicResults
```

#### C#

```cs
var input = new Dictionary<string, object>(){
    {"domain", "search.engine.com"},
    {"query", "web scraping tools"},
    {"include", new[] {"organic"}},
    {"queryParameters", new Dictionary<string, object>(){
        {"style", "engineSpecific"},
        {"gl", "us"},
        {"hl", "en"}
    }}
};
```

#### Java

```java
Map<String, Object> parameters =
    ImmutableMap.of(
        "domain", "search.engine.com",
        "query", "web scraping tools",
        "include", List.of("organic"),
        "queryParameters", ImmutableMap.of(
            "style", "engineSpecific",
            "gl", "us",
            "hl", "en"));
```

#### JS

```js
axios.post(
  'https://api.zyte.com/v1/search',
  {
    domain: 'search.engine.com',
    query: 'web scraping tools',
    include: ['organic'],
    queryParameters: {
      style: 'engineSpecific',
      gl: 'us',
      hl: 'en'
    }
  },
  { auth: { username: 'YOUR_ZYTE_API_KEY' } }
).then((response) => {
  console.log(response.data.organicResults)
})
```

#### PHP

```php
$response = $client->request('POST', 'https://api.zyte.com/v1/search', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'json' => [
        'domain' => 'search.engine.com',
        'query' => 'web scraping tools',
        'include' => ['organic'],
        'queryParameters' => [
            'style' => 'engineSpecific',
            'gl' => 'us',
            'hl' => 'en',
        ],
    ],
]);
```

#### Python

```python
api_response = requests.post(
    "https://api.zyte.com/v1/search",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "domain": "search.engine.com",
        "query": "web scraping tools",
        "include": ["organic"],
        "queryParameters": {
            "style": "engineSpecific",
            "gl": "us",
            "hl": "en",
        },
    },
)
organic_results = api_response.json()["organicResults"]
```

#### Python client

```python
api_response = requests.post(
    "https://api.zyte.com/v1/search",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "domain": "search.engine.com",
        "query": "web scraping tools",
        "include": ["organic"],
        "queryParameters": {
            "style": "engineSpecific",
            "gl": "us",
            "hl": "en",
        },
    },
)
organic_results = api_response.json()["organicResults"]
```

For city-level targeting, add a `uule` value. Generate one from a city name
using the `uule_grabber` Python library:

```python
import uule_grabber
uule_grabber.uule("Chicago, USA")  # w+CAIQ...
```

### AI Overview

Add `"aiOverview"` to `include` to trigger full browser rendering. The
AI Overview block will be present in the raw `html` field:

#### curl

input.json
```json
{
    "domain": "search.engine.com",
    "query": "web scraping tools",
    "include": ["aiOverview", "organic", "html"]
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/search
```

#### C#

```cs
var input = new Dictionary<string, object>(){
    {"domain", "search.engine.com"},
    {"query", "web scraping tools"},
    {"include", new[] {"aiOverview", "organic", "html"}}
};
```

#### Java

```java
Map<String, Object> parameters =
    ImmutableMap.of(
        "domain", "search.engine.com",
        "query", "web scraping tools",
        "include", List.of("aiOverview", "organic", "html"));
```

#### JS

```js
axios.post(
  'https://api.zyte.com/v1/search',
  {
    domain: 'search.engine.com',
    query: 'web scraping tools',
    include: ['aiOverview', 'organic', 'html']
  },
  { auth: { username: 'YOUR_ZYTE_API_KEY' } }
).then((response) => {
  const { organicResults, html } = response.data
  console.log(organicResults)
})
```

#### PHP

```php
$response = $client->request('POST', 'https://api.zyte.com/v1/search', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'json' => [
        'domain' => 'search.engine.com',
        'query' => 'web scraping tools',
        'include' => ['aiOverview', 'organic', 'html'],
    ],
]);
```

#### Python

```python
api_response = requests.post(
    "https://api.zyte.com/v1/search",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "domain": "search.engine.com",
        "query": "web scraping tools",
        "include": ["aiOverview", "organic", "html"],
    },
)
data = api_response.json()
organic_results = data["organicResults"]
html = data["html"]  # parse AI Overview from here
```

#### Python client

```python
api_response = requests.post(
    "https://api.zyte.com/v1/search",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "domain": "search.engine.com",
        "query": "web scraping tools",
        "include": ["aiOverview", "organic", "html"],
    },
)
data = api_response.json()
organic_results = data["organicResults"]
html = data["html"]  # parse AI Overview from here
```

> ###### NOTE
>
> Parsed `aiOverview` extraction is coming in a future release. For now,
> the AI Overview block is available in the raw `html` field.

### Next steps

- request — full parameter reference
- response — response field details
- geo — geo-targeting by country, language, and domain

## Request parameters

Send a `POST` request to `https://api.zyte.com/v1/search`.

| Field             | Type     | Required   | Description                                                                                                              |
|-------------------|----------|------------|--------------------------------------------------------------------------------------------------------------------------|
| `domain`          | string   | Yes        | A supported search domain, e.g. `google.com`, `google.co.uk`, `google.de`. Unsupported domains return a 400 error.       |
| `query`           | string   | Yes        | The search keywords (1-2048 characters). URL-encoded automatically.                                                      |
| `include`         | string[] | No         | What to return: `"html"` (raw HTML), `"organic"` (parsed results), `"aiOverview"` (coming soon). Defaults to `["html"]`. |
| `maxResults`      | integer  | No         | Number of organic results: 10, 20, 30 … 100. Default: `10`. Request weight = `max(1, maxResults / 10)`.                  |
| `queryParameters` | object   | No         | Additional engine-specific query parameters. See below.                                                                  |
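
The request weight formula in the table can be checked locally before sending a request. The following is a minimal sketch, not part of any Zyte client: `request_weight` and `VALID_MAX_RESULTS` are hypothetical names, and integer division is assumed to be equivalent to `maxResults / 10` because only multiples of 10 are accepted.

```python
# Hypothetical helper, not part of any Zyte library.
# Accepted values per the table above: 10, 20, 30 ... 100.
VALID_MAX_RESULTS = range(10, 101, 10)


def request_weight(max_results=10):
    """Return max(1, maxResults / 10) for an accepted maxResults value."""
    if max_results not in VALID_MAX_RESULTS:
        raise ValueError(
            f"maxResults must be one of {list(VALID_MAX_RESULTS)}, got {max_results}"
        )
    # Integer division is exact here because max_results is a multiple of 10.
    return max(1, max_results // 10)


print(request_weight())     # 1
print(request_weight(100))  # 10
```

Validating `maxResults` on the client side avoids spending a round trip on a request that would be rejected with a 400 error.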

### queryParameters

Controls geo-targeting and other search modifiers. Two styles are
supported via the `style` field.

#### Engine-specific style

Pass engine-native parameters directly. Set `style` to `"engineSpecific"`.

```json
{
    "domain": "search.engine.com",
    "query": "web scraping tools",
    "queryParameters": {
        "style": "engineSpecific",
        "gl": "us",
        "hl": "en"
    }
}
```

Supported engine-specific fields:

| Field   | Example       | Purpose                                   |
|---------|---------------|-------------------------------------------|
| `gl`    | `"us"`        | Geographic limit (country)                |
| `hl`    | `"en"`        | Interface language                        |
| `cr`    | `"countryUS"` | Country restrict                          |
| `lr`    | `"lang_en"`   | Language restrict                         |
| `safe`  | `"active"`    | SafeSearch (`"active"` or `"off"`)        |
| `nfpr`  | `1`           | Disable autocorrect                       |
| `uule`  | `"w+CAI..."`  | Encoded location for city-level targeting |

#### Generic style

A portable, engine-agnostic interface. Set `style` to `"generic"`.

```json
{
    "domain": "search.engine.com",
    "query": "web scraping tools",
    "queryParameters": {
        "style": "generic",
        "geolocation": "US",
        "locale": "en-US"
    }
}
```

| Field         | Example   | Maps to    |
|---------------|-----------|------------|
| `geolocation` | `"US"`    | `gl=US`    |
| `locale`      | `"en-US"` | `hl=en-US` |
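
To make the mapping in the table concrete, the sketch below rewrites a generic `queryParameters` object into its engine-specific equivalent. The API is described as performing this mapping itself, so the helper is purely illustrative; `generic_to_engine_specific` is a hypothetical name, and the code only mirrors the two rows of the table.

```python
# Hypothetical helper that mirrors the "Maps to" column of the table above.
GENERIC_TO_ENGINE = {"geolocation": "gl", "locale": "hl"}


def generic_to_engine_specific(query_parameters):
    """Return the engine-specific form of a generic queryParameters dict."""
    if query_parameters.get("style") != "generic":
        raise ValueError('expected queryParameters with style "generic"')
    converted = {"style": "engineSpecific"}
    for key, value in query_parameters.items():
        if key == "style":
            continue
        if key not in GENERIC_TO_ENGINE:
            raise ValueError(f"unknown generic field: {key}")
        converted[GENERIC_TO_ENGINE[key]] = value
    return converted


print(
    generic_to_engine_specific(
        {"style": "generic", "geolocation": "US", "locale": "en-US"}
    )
)
# {'style': 'engineSpecific', 'gl': 'US', 'hl': 'en-US'}
```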

### Error responses

|   HTTP | Type                            | When                              |
|--------|---------------------------------|-----------------------------------|
|    400 | `/request/domain-not-supported` | Domain is not supported           |
|    400 | `/request/max-results-invalid`  | `maxResults` is not a valid value |
|    401 | `/auth/key-not-found`           | Missing or invalid API key        |
|    429 | `/limits/over-search-limit`     | Rate limit exceeded               |
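
One way to act on these errors is to map each type to a short hint. The sketch below makes an assumption that the table does not confirm: that the error body is a JSON object with a `type` field whose value ends with the type shown above. `ERROR_HINTS` and `explain_error` are hypothetical names.

```python
# Hypothetical mapping from the error types in the table above to hints.
ERROR_HINTS = {
    "/request/domain-not-supported": "use a supported search domain",
    "/request/max-results-invalid": "use a maxResults value of 10, 20 ... 100",
    "/auth/key-not-found": "check that the API key is present and valid",
    "/limits/over-search-limit": "slow down and retry later",
}


def explain_error(status_code, body):
    """Describe an error response.

    Assumes (not confirmed by the table) that `body` is the parsed JSON error
    body and that its "type" field ends with one of the types listed above.
    """
    error_type = body.get("type", "")
    for suffix, hint in ERROR_HINTS.items():
        if error_type.endswith(suffix):
            return f"{status_code} {suffix}: {hint}"
    return f"{status_code}: unrecognized error type {error_type!r}"


print(explain_error(429, {"type": "/limits/over-search-limit"}))
# 429 /limits/over-search-limit: slow down and retry later
```

With Requests, this could be called as `explain_error(api_response.status_code, api_response.json())` whenever `api_response.ok` is false.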

## Response schema

All responses are JSON. The fields returned depend on what you request via
`include`.

### Top-level fields

| Field            | Always present   | Description                                                                                                          |
|------------------|------------------|----------------------------------------------------------------------------------------------------------------------|
| `status`         | Yes              | `"success"` on a successful response.                                                                                |
| `url`            | Yes              | The search URL that was fetched.                                                                                     |
| `fetchedAt`      | Yes              | ISO-8601 timestamp of when the page was fetched.                                                                     |
| `meta`           | Yes              | Request metadata. Includes `requestedAt`, and geo fields if `queryParameters` were passed (`geolocation`, `locale`). |
| `html`           | When requested   | Raw rendered HTML of the SERP page. Returned when `"html"` is in `include`.                                          |
| `organicResults` | When requested   | Array of parsed organic results. Returned when `"organic"` is in `include`.                                          |

### organicResults

Each item in the `organicResults` array has the following fields:

| Field          | Type    | Always present   | Description                                           |
|----------------|---------|------------------|-------------------------------------------------------|
| `rank`         | integer | Yes              | Position in results, starting at 1.                   |
| `title`        | string  | Yes              | Result title as shown on the SERP.                    |
| `url`          | string  | Yes              | Link to the result page.                              |
| `snippet`      | string  | No               | Description text shown on the SERP.                   |
| `displayedUrl` | string  | No               | URL as displayed on the SERP (may differ from `url`). |
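
Because `snippet` and `displayedUrl` are not always present, code that reads results should not index them directly. The following sketch shows one way to handle that; `summarize_results` is a hypothetical helper, and the second item in the sample data is made up to illustrate a result without optional fields.

```python
def summarize_results(response):
    """Return one text line per organic result, tolerating missing fields."""
    lines = []
    for item in response.get("organicResults", []):
        # rank, title and url are always present; the other fields are optional.
        shown_url = item.get("displayedUrl", item["url"])
        snippet = item.get("snippet", "")
        lines.append(f'{item["rank"]}. {item["title"]} ({shown_url}) {snippet}'.strip())
    return lines


sample = {
    "organicResults": [
        {
            "rank": 1,
            "title": "Zyte - Web Scraping API",
            "url": "https://www.zyte.com/",
            "snippet": "The leading web scraping platform...",
            "displayedUrl": "zyte.com",
        },
        # Made-up item without the optional fields.
        {"rank": 2, "title": "Example", "url": "https://example.com/"},
    ]
}

for line in summarize_results(sample):
    print(line)
# 1. Zyte - Web Scraping API (zyte.com) The leading web scraping platform...
# 2. Example (https://example.com/)
```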

### Example response

```json
{
    "status": "success",
    "url": "https://www.example-engine.com/search?q=web+scraping+tools",
    "fetchedAt": "2026-05-11T09:36:57Z",
    "meta": {
        "requestedAt": "2026-05-11T09:36:39Z",
        "geolocation": "US",
        "locale": "en-US"
    },
    "organicResults": [
        {
            "rank": 1,
            "title": "Zyte - Web Scraping API",
            "url": "https://www.zyte.com/",
            "snippet": "The leading web scraping platform...",
            "displayedUrl": "zyte.com"
        },
        {
            "rank": 2,
            "title": "ScraperAPI",
            "url": "https://www.scraperapi.com/",
            "snippet": "Scale data collection with a simple API.",
            "displayedUrl": "scraperapi.com"
        }
    ]
}
```
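
The `requestedAt` and `fetchedAt` timestamps can be compared to see how long a fetch took. This is a minimal sketch that assumes both timestamps always use the exact format shown in the example (UTC with a trailing `Z` and second precision); `parse_timestamp` and `fetch_delay_seconds` are hypothetical names.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse a timestamp such as 2026-05-11T09:36:57Z (assumed format)."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def fetch_delay_seconds(response):
    """Return the seconds elapsed between meta.requestedAt and fetchedAt."""
    requested = parse_timestamp(response["meta"]["requestedAt"])
    fetched = parse_timestamp(response["fetchedAt"])
    return (fetched - requested).total_seconds()


example = {
    "fetchedAt": "2026-05-11T09:36:57Z",
    "meta": {"requestedAt": "2026-05-11T09:36:39Z"},
}
print(fetch_delay_seconds(example))  # 18.0
```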

## Geo-targeting

The Search API supports two levels of geo-targeting: country/language via
`queryParameters`, and domain selection via `domain`.

### Country and language

Pass `queryParameters` to control the country and language of results.

#### Engine-specific (recommended)

Pass engine-native parameters using `style: "engineSpecific"`:

```json
{
    "domain": "search.engine.com",
    "query": "pizza restaurants",
    "queryParameters": {
        "style": "engineSpecific",
        "gl": "us",
        "hl": "en"
    }
}
```

#### Generic

Use the portable `geolocation` and `locale` fields:

```json
{
    "domain": "search.engine.com",
    "query": "pizza restaurants",
    "queryParameters": {
        "style": "generic",
        "geolocation": "US",
        "locale": "en-US"
    }
}
```

### Targeting a regional domain

To get results from a country-specific domain, change the `domain` field.
The platform fetches from that domain directly:

```json
{
    "domain": "search.engine.com",
    "query": "pizza restaurants",
    "queryParameters": {
        "style": "engineSpecific",
        "gl": "de",
        "hl": "de"
    }
}
```

Supported domains include regional variants such as `google.com`,
`google.co.uk`, `google.de`, `google.fr`, `google.co.jp`,
`google.com.br`. Unsupported domains return a 400 error.

### City-level targeting

For city-level precision, pass a `uule` value in `queryParameters`:

```json
{
    "domain": "search.engine.com",
    "query": "pizza restaurants",
    "queryParameters": {
        "style": "engineSpecific",
        "uule": "w+CAIQICINY2hpY2Fnbywgsuited"
    }
}
```

> ###### NOTE
>
> The `uule` parameter uses the canonical `w+` format. Generate a
> value from a city name using the `uule_grabber` Python library:
>
> ```python
> import uule_grabber
> uule_grabber.uule("Chicago, USA")  # w+CAIQ...
> ```

## Code examples

The Zyte API documentation features code examples for many different
technologies.

You can find those examples at the end of relevant topics in pages like
zapi-http, zapi-browser, zapi-extract or
zapi-shared-features, or find them all below.

> ###### TIP
>
> The right-hand sidebar of the Zyte API reference contains additional examples of Zyte API parameters.

### Requirements

Select a technology tab below to learn how to install and configure the
requirements to run code examples for that technology:

#### C#

**C#** code examples use C# 9.0.

To run **C#** code examples, install:

- [.NET SDK](https://dotnet.microsoft.com/en-us/download) 5.x or later
- [Html Agility Pack](https://www.nuget.org/packages/HtmlAgilityPack/), for HTML parsing

#### CLI client

**CLI client** code examples feature the command-line interface of
[python-zyte-api](http://python-zyte-api.readthedocs.io/), the official Python client of Zyte API, along with
other command-line tools.

To run **CLI client** code examples, install:

- [python-zyte-api](http://python-zyte-api.readthedocs.io/), for requests.

  Requires [installing Python](https://wiki.python.org/moin/BeginnersGuide/Download) first.

- [jq](https://stedolan.github.io/jq/download/), for JSON parsing.
- [base64](https://www.gnu.org/software/coreutils/manual/html_node/base64-invocation.html#base64-invocation), for base64 encoding and decoding.
  - On **Windows**, you can [use chocolatey to install GNU Core Utilities](https://community.chocolatey.org/packages/gnuwin32-coreutils.install),
    which includes a `base64` command-line application.
  - **macOS** comes with a `base64` command-line application pre-installed.
  - Most **Linux** distributions come with [GNU Core Utilities](https://www.gnu.org/software/coreutils/)
    pre-installed, or make it easy to install it. GNU Core Utilities includes
    a `base64` command-line application.
- [xmllint](https://gitlab.gnome.org/GNOME/libxml2/-/wikis/home), for HTML parsing.
  - On **Windows**, [install libxml2](https://www.zlatkovic.com/projects/libxml/index.html), which provides
    `xmllint`.
  - **macOS** comes with `xmllint` pre-installed.
  - Most **Linux** distributions make it easy to install [libxml2](https://gitlab.gnome.org/GNOME/libxml2/-/blob/master/README.md), which
    provides `xmllint`.
- [xargs](https://www.gnu.org/software/findutils/manual/html_node/find_html/Invoking-xargs.html), for parallelization.
  - On **Windows**, you can [use chocolatey to install GNU findutils](https://community.chocolatey.org/packages/findutils), which includes
    a `xargs` command-line application.
  - **macOS** comes with a `xargs` command-line application pre-installed.
  - Most **Linux** distributions come with [GNU findutils](https://www.gnu.org/software/findutils/) pre-installed, or
    make it easy to install it. GNU findutils includes a `xargs`
    command-line application.

#### curl

**curl** code examples feature [curl](https://everything.curl.dev/) and other command-line tools.

To run **curl** code examples, install:

- [curl](https://everything.curl.dev/), for requests.
  > ###### NOTE
  >
  > curl comes pre-installed in many operating systems.

- [jq](https://stedolan.github.io/jq/download/), for JSON parsing.
- [base64](https://www.gnu.org/software/coreutils/manual/html_node/base64-invocation.html#base64-invocation), for base64 encoding and decoding.
  - On **Windows**, you can [use chocolatey to install GNU Core Utilities](https://community.chocolatey.org/packages/gnuwin32-coreutils.install),
    which includes a `base64` command-line application.
  - **macOS** comes with a `base64` command-line application pre-installed.
  - Most **Linux** distributions come with [GNU Core Utilities](https://www.gnu.org/software/coreutils/)
    pre-installed, or make it easy to install it. GNU Core Utilities includes
    a `base64` command-line application.
- [xmllint](https://gitlab.gnome.org/GNOME/libxml2/-/wikis/home), for HTML parsing.
  - On **Windows**, [install libxml2](https://www.zlatkovic.com/projects/libxml/index.html), which provides
    `xmllint`.
  - **macOS** comes with `xmllint` pre-installed.
  - Most **Linux** distributions make it easy to install [libxml2](https://gitlab.gnome.org/GNOME/libxml2/-/blob/master/README.md), which
    provides `xmllint`.
- [xargs](https://www.gnu.org/software/findutils/manual/html_node/find_html/Invoking-xargs.html), for parallelization.
  - On **Windows**, you can [use chocolatey to install GNU findutils](https://community.chocolatey.org/packages/findutils), which includes
    a `xargs` command-line application.
  - **macOS** comes with a `xargs` command-line application pre-installed.
  - Most **Linux** distributions come with [GNU findutils](https://www.gnu.org/software/findutils/) pre-installed, or
    make it easy to install it. GNU findutils includes a `xargs`
    command-line application.

#### Java

**Java** code examples use Java SE 8.

To run **Java** code examples, install:

- [JDK 8u202](https://www.oracle.com/en_us/java/technologies/javase/javase8-archive-downloads.html)
  or [later](https://www.oracle.com/en_us/java/technologies/downloads/).
- [HttpClient 5.1 from Apache HttpComponents](https://hc.apache.org/httpcomponents-client-5.1.x/).
- [Gson 2.9.1](https://github.com/google/gson).

#### JS

**JS** code examples use JavaScript.

To run **JS** code examples, install:

- [Node.js](https://nodejs.org/en/).
- [axios](https://github.com/axios/axios), for requests.
- [cheerio](https://github.com/cheeriojs/cheerio), for HTML parsing.
- [https-proxy-agent](https://www.npmjs.com/package/https-proxy-agent), for proxy mode.

#### PHP

**PHP** code examples use PHP 7.4.

To run **PHP** code examples, install:

- [PHP](https://www.php.net/manual/en/install.php) 7.4.
- [Guzzle](https://docs.guzzlephp.org/en/stable/index.html), for requests.
- The `dom` [extension](https://www.php.net/manual/en/install.pecl.php), for HTML parsing.

#### Proxy mode

**Proxy mode** code examples use curl with Zyte API as a proxy. See the **curl** tab for code example requirement
details.

See zapi-proxy to learn how to use Zyte API as a proxy with
other technologies.

#### Python

**Python** code examples use Python 3.

To run **Python** code examples, install:

- [Python](https://wiki.python.org/moin/BeginnersGuide/Download).
- [Requests](https://docs.python-requests.org/), for single requests.
- [aiohttp](https://docs.aiohttp.org/en/stable/index.html), for concurrent requests.
- [Parsel](https://parsel.readthedocs.io/en/latest/), for HTML parsing.

#### Python client

**Python client** code examples feature the asyncio API of
[python-zyte-api](http://python-zyte-api.readthedocs.io/), the official Python client of Zyte API.

To run **Python client** code examples, install:

- [Python](https://wiki.python.org/moin/BeginnersGuide/Download).
- [python-zyte-api](http://python-zyte-api.readthedocs.io/), for requests.
- [Parsel](https://parsel.readthedocs.io/en/latest/), for HTML parsing.

#### Ruby

**Ruby** code examples use [Ruby 3.x](https://www.ruby-lang.org/en/).

#### Scrapy

**Scrapy** code examples feature [Scrapy](https://docs.scrapy.org/en/latest/) with the scrapy-zyte-api plugin configured in transparent mode.

To run **Scrapy** code examples, install:

- [Scrapy](https://docs.scrapy.org/en/latest/).
- scrapy-zyte-api.

After installing scrapy-zyte-api, you must also configure it in
your Scrapy project. If you configure it by
enabling its components separately instead of enabling the add-on, you
also need to set `ZYTE_API_TRANSPARENT_MODE` to `True`.

> ###### TIP
>
> The web scraping tutorial covers installing
> and configuring the requirements for **Scrapy** code examples.

### All examples

#### Running the `scrollBottom` action

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using HtmlAgilityPack;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://quotes.toscrape.com/scroll"},
    {"browserHtml", true},
    {
        "actions",
        new List<Dictionary<string, object>>()
        {
            new Dictionary<string, object>()
            {
                {"action", "scrollBottom"}
            }
        }
    }
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var browserHtml = data.RootElement.GetProperty("browserHtml").ToString();
var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(browserHtml);
var navigator = htmlDocument.CreateNavigator();
var quoteCount = (double)navigator.Evaluate("count(//*[@class=\"quote\"])");
```

#### CLI client

input.jsonl
```json
{"url": "https://quotes.toscrape.com/scroll", "browserHtml": true, "actions": [{"action": "scrollBottom"}]}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .browserHtml \
    | xmllint --html --xpath 'count(//*[@class="quote"])' - 2> /dev/null
```

#### curl

input.json
```json
{
    "url": "https://quotes.toscrape.com/scroll",
    "browserHtml": true,
    "actions": [
        {
            "action": "scrollBottom"
        }
    ]
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .browserHtml \
    | xmllint --html --xpath 'count(//*[@class="quote"])' - 2> /dev/null
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Collections;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class Example {

  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> action = ImmutableMap.of("action", "scrollBottom");
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://quotes.toscrape.com/scroll",
            "browserHtml",
            true,
            "actions",
            Collections.singletonList(action));
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String browserHtml = jsonObject.get("browserHtml").getAsString();
          Document document = Jsoup.parse(browserHtml);
          int quoteCount = document.select(".quote").size();
          System.out.println(quoteCount);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')
const cheerio = require('cheerio')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://quotes.toscrape.com/scroll',
    browserHtml: true,
    actions: [
      {
        action: 'scrollBottom'
      }
    ]
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const browserHtml = response.data.browserHtml
  const $ = cheerio.load(browserHtml)
  const quoteCount = $('.quote').length
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://quotes.toscrape.com/scroll',
        'browserHtml' => true,
        'actions' => [
            ['action' => 'scrollBottom'],
        ],
    ],
]);
$data = json_decode($response->getBody());
$doc = new DOMDocument();
$doc->loadHTML($data->browserHtml);
$xpath = new DOMXPath($doc);
$quote_count = $xpath->query("//*[@class='quote']")->count();
```

#### Python

```python
import requests
from parsel import Selector

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://quotes.toscrape.com/scroll",
        "browserHtml": True,
        "actions": [
            {
                "action": "scrollBottom",
            },
        ],
    },
)
browser_html = api_response.json()["browserHtml"]
quote_count = len(Selector(browser_html).css(".quote"))
```

#### Python client

```python
import asyncio

from parsel import Selector
from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://quotes.toscrape.com/scroll",
            "browserHtml": True,
            "actions": [
                {
                    "action": "scrollBottom",
                },
            ],
        },
    )
    browser_html = api_response["browserHtml"]
    quote_count = len(Selector(browser_html).css(".quote"))
    print(quote_count)

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class QuotesToScrapeComSpider(Spider):
    name = "quotes_toscrape_com"

    async def start(self):
        yield Request(
            "https://quotes.toscrape.com/scroll",
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                    "actions": [
                        {
                            "action": "scrollBottom",
                        },
                    ],
                },
            },
        )

    def parse(self, response):
        quote_count = len(response.css(".quote"))
```

Output:

```none
100
```

#### Getting an HTTP response body

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://toscrape.com"},
    {"httpResponseBody", true}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBody = System.Convert.FromBase64String(base64HttpResponseBody);
```

#### CLI client

input.jsonl
```json
{"url": "https://toscrape.com", "httpResponseBody": true}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    > output.html
```

#### curl

input.json
```json
{
    "url": "https://toscrape.com",
    "httpResponseBody": true
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    > output.html
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of("url", "https://toscrape.com", "httpResponseBody", true);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          System.out.println(httpResponseBody);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com',
    httpResponseBody: true
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com',
        'httpResponseBody' => true,
    ],
]);
$data = json_decode($response->getBody());
$http_response_body = base64_decode($data->httpResponseBody);
```

#### Proxy mode

In proxy mode, you always get a response body.

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    https://toscrape.com \
    > output.html
```

#### Python

```python
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://toscrape.com",
        "httpResponseBody": True,
    },
)
http_response_body: bytes = b64decode(api_response.json()["httpResponseBody"])
```

#### Python client

```python
import asyncio
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://toscrape.com",
            "httpResponseBody": True,
        }
    )
    http_response_body = b64decode(api_response["httpResponseBody"]).decode()
    print(http_response_body)

asyncio.run(main())
```

#### Scrapy

In transparent mode, when you target a text
resource (e.g. HTML, JSON), regular Scrapy requests work out of the
box:

```python
from scrapy import Spider

class ToScrapeSpider(Spider):
    name = "toscrape_com"
    start_urls = ["https://toscrape.com"]

    def parse(self, response):
        http_response_text: str = response.text
```

Regular Scrapy requests currently also work for binary responses, but they may
stop working in future versions of scrapy-zyte-api, so passing
[httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpResponseBody) is recommended when targeting binary
resources:

```python
from scrapy import Request, Spider

class ToScrapeSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request(
            "https://toscrape.com",
            meta={
                "zyte_api_automap": {
                    "httpResponseBody": True,
                },
            },
        )

    def parse(self, response):
        http_response_body: bytes = response.body
```

Output (first 5 lines):

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
        <title>Scraping Sandbox</title>
```
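
Outside proxy mode and Scrapy, every example above ends with the same step: the
`httpResponseBody` field arrives as a Base64 string and must be decoded into
bytes. The following is a minimal sketch of that step alone, run against a
hand-made stand-in for the API response so that it works offline. The
`decode_http_response_body` helper and the `fake_api_response` dictionary are
illustrative names, not part of any Zyte library.

```python
from base64 import b64decode, b64encode


def decode_http_response_body(api_response: dict) -> bytes:
    """Return the raw body bytes from a parsed Zyte API JSON response."""
    # httpResponseBody is Base64-encoded because the body may be binary.
    return b64decode(api_response["httpResponseBody"])


# Hypothetical stand-in for the JSON that the API would return.
fake_api_response = {
    "url": "https://toscrape.com",
    "httpResponseBody": b64encode(b'<!DOCTYPE html>\n<html lang="en">').decode(),
}

body = decode_http_response_body(fake_api_response)
# Decode to text only when the resource is known to be text; UTF-8 is an
# assumption made for this sample.
first_line = body.decode("utf-8").splitlines()[0]
print(first_line)
```

Keeping the result as `bytes` until the encoding is known avoids corrupting
binary resources such as images or PDF files.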

#### Setting a `Referer` header in a browser request

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using System.Xml.XPath;
using HtmlAgilityPack;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://httpbin.org/anything"},
    {"browserHtml", true},
    {
        "requestHeaders",
        new Dictionary<string, object>()
        {
            {"referer", "https://example.org/"}
        }
    }
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var browserHtml = data.RootElement.GetProperty("browserHtml").ToString();
var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(browserHtml);
var navigator = htmlDocument.CreateNavigator();
var nodeIterator = (XPathNodeIterator)navigator.Evaluate("//text()");
nodeIterator.MoveNext();
var responseJson = nodeIterator.Current.ToString();
var responseData = JsonDocument.Parse(responseJson);
var headerEnumerator = responseData.RootElement.GetProperty("headers").EnumerateObject();
var headers = new Dictionary<string, string>();
while (headerEnumerator.MoveNext())
{
    headers.Add(
        headerEnumerator.Current.Name.ToString(),
        headerEnumerator.Current.Value.ToString()
    );
}
```

#### CLI client

input.jsonl
```json
{"url": "https://httpbin.org/anything", "browserHtml": true, "requestHeaders": {"referer": "https://example.org/"}}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .browserHtml \
    | xmllint --html --xpath '//text()' - 2> /dev/null \
    | jq .headers
```

#### curl

input.json
```json
{
    "url": "https://httpbin.org/anything",
    "browserHtml": true,
    "requestHeaders": {
        "referer": "https://example.org/"
    }
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .browserHtml \
    | xmllint --html --xpath '//text()' - 2> /dev/null \
    | jq .headers
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> requestHeaders = ImmutableMap.of("referer", "https://example.org/");
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://httpbin.org/anything",
            "browserHtml",
            true,
            "requestHeaders",
            requestHeaders);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String browserHtml = jsonObject.get("browserHtml").getAsString();
          Document document = Jsoup.parse(browserHtml);
          JsonObject data = JsonParser.parseString(document.text()).getAsJsonObject();
          JsonObject headers = data.get("headers").getAsJsonObject();
          Gson gson = new GsonBuilder().setPrettyPrinting().create();
          System.out.println(gson.toJson(headers));
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')
const cheerio = require('cheerio')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://httpbin.org/anything',
    browserHtml: true,
    requestHeaders: {
      referer: 'https://example.org/'
    }
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const $ = cheerio.load(response.data.browserHtml)
  const data = JSON.parse($.text())
  const headers = data.headers
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://httpbin.org/anything',
        'browserHtml' => true,
        'requestHeaders' => [
            'referer' => 'https://example.org/',
        ],
    ],
]);
$api = json_decode($response->getBody());
$doc = new DOMDocument();
$doc->loadHTML($api->browserHtml);
$data = json_decode($doc->textContent);
$headers = $data->headers;
```

#### Python

```python
import json

import requests
from parsel import Selector

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://httpbin.org/anything",
        "browserHtml": True,
        "requestHeaders": {
            "referer": "https://example.org/",
        },
    },
)
browser_html = api_response.json()["browserHtml"]
selector = Selector(browser_html)
response_json = selector.xpath("//text()").get()
response_data = json.loads(response_json)
headers = response_data["headers"]
```

#### Python client

```python
import asyncio
import json

from parsel import Selector
from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://httpbin.org/anything",
            "browserHtml": True,
            "requestHeaders": {
                "referer": "https://example.org/",
            },
        }
    )
    browser_html = api_response["browserHtml"]
    selector = Selector(browser_html)
    response_json = selector.xpath("//text()").get()
    response_data = json.loads(response_json)
    print(json.dumps(response_data["headers"], indent=2))

asyncio.run(main())
```

#### Scrapy

```python
import json

from scrapy import Request, Spider

class HTTPBinOrgSpider(Spider):
    name = "httpbin_org"

    async def start(self):
        yield Request(
            "https://httpbin.org/anything",
            headers={"Referer": "https://example.org/"},
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                },
            },
        )

    def parse(self, response):
        response_json = response.xpath("//text()").get()
        response_data = json.loads(response_json)
        headers = response_data["headers"]
```

Output (`"Referer"` line):

```json
  "Referer": "https://example.org/",
```
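
The examples above all rely on the same idea: the browser renders the JSON
returned by `https://httpbin.org/anything` as text inside an HTML page, so the
headers are recovered by taking the text of `browserHtml` and parsing it as
JSON. A minimal sketch of that parsing step using only the Python standard
library follows. The `sample_browser_html` string is a simplified, made-up
approximation of what a browser produces, and the sketch assumes that the JSON
is the only text in the page.

```python
import json
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect every text node of an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def headers_from_browser_html(browser_html: str) -> dict:
    """Parse the JSON shown in the page and return its "headers" object."""
    collector = TextCollector()
    collector.feed(browser_html)
    collector.close()  # Flush any text still buffered by the parser.
    return json.loads("".join(collector.chunks))["headers"]


# Hypothetical, simplified browser rendering of a JSON response.
sample_browser_html = (
    "<html><head></head><body><pre>"
    '{"headers": {"Referer": "https://example.org/"}, "method": "GET"}'
    "</pre></body></html>"
)

headers = headers_from_browser_html(sample_browser_html)
print(headers["Referer"])
```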

#### Getting browser HTML

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://toscrape.com"},
    {"browserHtml", true}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var browserHtml = data.RootElement.GetProperty("browserHtml").ToString();
```

#### CLI client

input.jsonl
```json
{"url": "https://toscrape.com", "browserHtml": true}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .browserHtml
```

#### curl

input.json
```json
{
    "url": "https://toscrape.com",
    "browserHtml": true
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .browserHtml
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of("url", "https://toscrape.com", "browserHtml", true);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String browserHtml = jsonObject.get("browserHtml").getAsString();
          System.out.println(browserHtml);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com',
    browserHtml: true
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const browserHtml = response.data.browserHtml
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com',
        'browserHtml' => true,
    ],
]);
$api = json_decode($response->getBody());
$browser_html = $api->browserHtml;
```

#### Proxy mode

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    -H "Zyte-Browser-Html: true" \
    https://toscrape.com
```

#### Python

```python
import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://toscrape.com",
        "browserHtml": True,
    },
)
browser_html: str = api_response.json()["browserHtml"]
```

#### Python client

```python
import asyncio

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://toscrape.com",
            "browserHtml": True,
        }
    )
    print(api_response["browserHtml"])

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class ToScrapeSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request(
            "https://toscrape.com",
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                },
            },
        )

    def parse(self, response):
        browser_html: str = response.text
```

Output (first 5 lines):

```html
<!DOCTYPE html><html lang="en"><head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
        <title>Scraping Sandbox</title>
        <link href="./css/bootstrap.min.css" rel="stylesheet">
        <link href="./css/main.css" rel="stylesheet">
```

#### Reusing browser cookies on HTTP requests

Send a browser request to the home page of a website, and use its response
cookies as request cookies in an HTTP request to a different URL of that
website.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var browserInput = new Dictionary<string, object>(){
    {"url", "https://toscrape.com/"},
    {"browserHtml", true},
    {"responseCookies", true}
};
var browserInputJson = JsonSerializer.Serialize(browserInput);
var browserContent = new StringContent(browserInputJson, Encoding.UTF8, "application/json");
HttpResponseMessage browserResponse = await client.PostAsync("https://api.zyte.com/v1/extract", browserContent);
var browserResponseBody = await browserResponse.Content.ReadAsByteArrayAsync();
var browserData = JsonDocument.Parse(browserResponseBody);

var httpInput = new Dictionary<string, object>(){
    {"url", "https://books.toscrape.com/"},
    {"httpResponseBody", true},
    {"requestCookies", browserData.RootElement.GetProperty("responseCookies")}
};
var httpInputJson = JsonSerializer.Serialize(httpInput);
var httpContent = new StringContent(httpInputJson, Encoding.UTF8, "application/json");
HttpResponseMessage httpResponse = await client.PostAsync("https://api.zyte.com/v1/extract", httpContent);
var httpResponseBody = await httpResponse.Content.ReadAsByteArrayAsync();
var httpData = JsonDocument.Parse(httpResponseBody);
var base64HttpResponseBodyField = httpData.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBodyField = System.Convert.FromBase64String(base64HttpResponseBodyField);
var result = System.Text.Encoding.UTF8.GetString(httpResponseBodyField);

Console.WriteLine(result);
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> browserParameters =
        ImmutableMap.of(
            "url", "https://toscrape.com/", "browserHtml", true, "responseCookies", true);
    String browserRequestBody = new Gson().toJson(browserParameters);

    HttpPost browserRequest = new HttpPost("https://api.zyte.com/v1/extract");
    browserRequest.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    browserRequest.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    browserRequest.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    browserRequest.setEntity(new StringEntity(browserRequestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        browserRequest,
        browserResponse -> {
          HttpEntity browserEntity = browserResponse.getEntity();
          String browserApiResponse = EntityUtils.toString(browserEntity, StandardCharsets.UTF_8);
          JsonObject browserJsonObject =
              JsonParser.parseString(browserApiResponse).getAsJsonObject();

          Map<String, Object> httpParameters =
              ImmutableMap.of(
                  "url",
                  "https://books.toscrape.com/",
                  "httpResponseBody",
                  true,
                  "requestCookies",
                  browserJsonObject.get("responseCookies"));
          String httpRequestBody = new Gson().toJson(httpParameters);

          HttpPost httpRequest = new HttpPost("https://api.zyte.com/v1/extract");
          httpRequest.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
          httpRequest.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
          httpRequest.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
          httpRequest.setEntity(new StringEntity(httpRequestBody));

          client.execute(
              httpRequest,
              httpResponse -> {
                HttpEntity httpEntity = httpResponse.getEntity();
                String httpApiResponse = EntityUtils.toString(httpEntity, StandardCharsets.UTF_8);
                JsonObject httpJsonObject =
                    JsonParser.parseString(httpApiResponse).getAsJsonObject();
                String base64HttpResponseBody =
                    httpJsonObject.get("httpResponseBody").getAsString();
                byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
                String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
                System.out.println(httpResponseBody);
                return null;
              });
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com/',
    browserHtml: true,
    responseCookies: true
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((browserResponse) => {
  axios.post(
    'https://api.zyte.com/v1/extract',
    {
      url: 'https://books.toscrape.com/',
      httpResponseBody: true,
      requestCookies: browserResponse.data.responseCookies
    },
    {
      auth: { username: 'YOUR_ZYTE_API_KEY' }
    }
  ).then((httpResponse) => {
    const httpResponseBody = Buffer.from(
      httpResponse.data.httpResponseBody,
      'base64'
    )
    console.log(httpResponseBody.toString())
  })
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$browser_response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com/',
        'browserHtml' => true,
        'responseCookies' => true,
    ],
]);
$browser_data = json_decode($browser_response->getBody());
$http_response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://books.toscrape.com/',
        'httpResponseBody' => true,
        'requestCookies' => $browser_data->responseCookies,
    ],
]);
$http_data = json_decode($http_response->getBody());
$http_response_body = base64_decode($http_data->httpResponseBody);
echo $http_response_body;
```

#### Python

```python
from base64 import b64decode

import requests

browser_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://toscrape.com/",
        "browserHtml": True,
        "responseCookies": True,
    },
)
http_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://books.toscrape.com/",
        "httpResponseBody": True,
        "requestCookies": browser_response.json()["responseCookies"],
    },
)
http_response_body = b64decode(http_response.json()["httpResponseBody"])
print(http_response_body.decode())
```

#### Python client

```python
import asyncio
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    browser_response = await client.get(
        {
            "url": "https://toscrape.com/",
            "browserHtml": True,
            "responseCookies": True,
        }
    )
    http_response = await client.get(
        {
            "url": "https://books.toscrape.com/",
            "httpResponseBody": True,
            "requestCookies": browser_response["responseCookies"],
        }
    )
    http_response_body = b64decode(http_response["httpResponseBody"]).decode()
    print(http_response_body)

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class ToScrapeComSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request(
            "https://toscrape.com/",
            callback=self.parse_browser,
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                    "responseCookies": True,
                },
            },
        )

    def parse_browser(self, response):
        yield response.follow(
            "https://books.toscrape.com/",
            callback=self.parse_http,
            meta={
                "zyte_api_automap": {
                    "requestCookies": response.raw_api_response["responseCookies"],
                },
            },
        )

    def parse_http(self, response):
        print(response.text)
```
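
The two-step flow used by the examples above can be sketched as a single
function that is independent of the HTTP library. This is an illustration
under stated assumptions, not an official helper: `fetch_with_browser_cookies`
and `fake_post` are hypothetical names, and `post` stands for any callable that
sends a JSON payload to `https://api.zyte.com/v1/extract` and returns the
parsed JSON response. A fake `post` is used here so that the sketch runs
without network access or an API key.

```python
from base64 import b64decode, b64encode
from typing import Callable


def fetch_with_browser_cookies(
    post: Callable[[dict], dict], home_url: str, target_url: str
) -> bytes:
    """Fetch target_url over HTTP using cookies from a browser visit to home_url."""
    # Step 1: browser request that also asks for the resulting cookies.
    browser_response = post(
        {"url": home_url, "browserHtml": True, "responseCookies": True}
    )
    # Step 2: HTTP request that sends those cookies back as request cookies.
    http_response = post(
        {
            "url": target_url,
            "httpResponseBody": True,
            "requestCookies": browser_response["responseCookies"],
        }
    )
    return b64decode(http_response["httpResponseBody"])


# Hypothetical stand-in for the API; it records what it was asked to send.
sent_payloads = []


def fake_post(payload: dict) -> dict:
    sent_payloads.append(payload)
    if payload.get("browserHtml"):
        return {
            "browserHtml": "<html></html>",
            "responseCookies": [
                {"name": "session", "value": "abc", "domain": "toscrape.com"}
            ],
        }
    return {"httpResponseBody": b64encode(b"<html>books</html>").decode()}


body = fetch_with_browser_cookies(
    fake_post, "https://toscrape.com/", "https://books.toscrape.com/"
)
print(body.decode())
```

Passing the request function in as a parameter keeps the cookie handling
separate from authentication and transport, so the same logic can be reused
with `requests`, the Python client, or a test double.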

#### Setting a geolocation

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "http://ip-api.com/json"},
    {"httpResponseBody", true},
    {"geolocation", "AU"}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBody = System.Convert.FromBase64String(base64HttpResponseBody);

var responseData = JsonDocument.Parse(httpResponseBody);
var countryCode = responseData.RootElement.GetProperty("countryCode").ToString();
System.Console.WriteLine(countryCode);
```

#### CLI client

input.jsonl
```json
{"url": "http://ip-api.com/json", "httpResponseBody": true, "geolocation": "AU"}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq .countryCode
```

#### curl

input.json
```json
{
    "url": "http://ip-api.com/json",
    "httpResponseBody": true,
    "geolocation": "AU"
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq .countryCode
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url", "http://ip-api.com/json", "httpResponseBody", true, "geolocation", "AU");
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          JsonObject data = JsonParser.parseString(httpResponseBody).getAsJsonObject();
          String countryCode = data.get("countryCode").getAsString();
          System.out.println(countryCode);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'http://ip-api.com/json',
    httpResponseBody: true,
    geolocation: 'AU'
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  const data = JSON.parse(httpResponseBody)
  const countryCode = data.countryCode
  console.log(countryCode)
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'http://ip-api.com/json',
        'httpResponseBody' => true,
        'geolocation' => 'AU',
    ],
]);
$api = json_decode($response->getBody());
$http_response_body = base64_decode($api->httpResponseBody);
$data = json_decode($http_response_body);
$country_code = $data->countryCode;
echo $country_code.PHP_EOL;
```

#### Proxy mode

In proxy mode, set the `Zyte-Geolocation` header instead of the
`geolocation` request field.

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    --header "Zyte-Geolocation: AU" \
    http://ip-api.com/json \
    | jq .countryCode
```

#### Python

```python
import json
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "http://ip-api.com/json",
        "httpResponseBody": True,
        "geolocation": "AU",
    },
)
http_response_body: bytes = b64decode(api_response.json()["httpResponseBody"])
response_data = json.loads(http_response_body)
country_code = response_data["countryCode"]
print(country_code)
```

#### Python client

```python
import asyncio
import json
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "http://ip-api.com/json",
            "httpResponseBody": True,
            "geolocation": "AU",
        }
    )
    http_response_body: bytes = b64decode(api_response["httpResponseBody"])
    response_data = json.loads(http_response_body)
    print(response_data["countryCode"])

asyncio.run(main())
```

#### Scrapy

```python
import json

from scrapy import Request, Spider

class IPAPIComSpider(Spider):
    name = "ip_api_com"

    async def start(self):
        yield Request(
            "http://ip-api.com/json",
            meta={
                "zyte_api_automap": {
                    "geolocation": "AU",
                },
            },
        )

    def parse(self, response):
        response_data = json.loads(response.body)
        country_code = response_data["countryCode"]
        print(country_code)
```

Output:

```none
AU
```
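
All the examples above that request `httpResponseBody` follow the same decoding steps: the API response is JSON, its `httpResponseBody` field is Base64-encoded, and the decoded bytes are in turn the JSON document served by ip-api.com. The following is a minimal offline sketch of those steps in Python. It uses a made-up API response in place of a real one, and `extract_country_code` is a hypothetical helper name chosen for illustration.

```python
import json
from base64 import b64decode, b64encode


def extract_country_code(api_response: dict) -> str:
    """Decode httpResponseBody and read countryCode from the JSON inside."""
    body: bytes = b64decode(api_response["httpResponseBody"])
    return json.loads(body)["countryCode"]


# Made-up stand-in for what api_response.json() could return; a real
# response comes from https://api.zyte.com/v1/extract.
sample_api_response = {
    "url": "http://ip-api.com/json",
    "httpResponseBody": b64encode(b'{"countryCode": "AU"}').decode(),
}

print(extract_country_code(sample_api_response))
```

Keeping the decoding in one small function makes it easy to check against canned data before spending real requests.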

#### Making an HTTP request seem like it comes from a mobile device

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://httpbin.org/user-agent"},
    {"httpResponseBody", true},
    {"device", "mobile"}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBody = System.Convert.FromBase64String(base64HttpResponseBody);

var responseData = JsonDocument.Parse(httpResponseBody);
var headerEnumerator = responseData.RootElement.EnumerateObject();
while (headerEnumerator.MoveNext())
{
    if (headerEnumerator.Current.Name.ToString() == "user-agent")
    {
        Console.WriteLine(headerEnumerator.Current.Value.ToString());
    }
}
```

#### CLI client

input.jsonl
```json
{"url": "https://httpbin.org/user-agent", "httpResponseBody": true, "device": "mobile"}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq --raw-output '.["user-agent"]'
```

#### curl

input.json
```json
{
    "url": "https://httpbin.org/user-agent",
    "httpResponseBody": true,
    "device": "mobile"
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq --raw-output '.["user-agent"]'
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url", "https://httpbin.org/user-agent", "httpResponseBody", true, "device", "mobile");
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          JsonObject data = JsonParser.parseString(httpResponseBody).getAsJsonObject();
          String userAgent = data.get("user-agent").getAsString();
          System.out.println(userAgent);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://httpbin.org/user-agent',
    httpResponseBody: true,
    device: 'mobile'
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  console.log(JSON.parse(httpResponseBody)['user-agent'])
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://httpbin.org/user-agent',
        'httpResponseBody' => true,
        'device' => 'mobile',
    ],
]);
$api = json_decode($response->getBody());
$http_response_body = base64_decode($api->httpResponseBody);
$data = json_decode($http_response_body);
echo $data->{'user-agent'}.PHP_EOL;
```

#### Proxy mode

In proxy mode, set the `Zyte-Device` header instead of the
`device` request field.

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    --header "Zyte-Device: mobile" \
    https://httpbin.org/user-agent \
    | jq --raw-output '.["user-agent"]'
```

#### Python

```python
import json
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://httpbin.org/user-agent",
        "httpResponseBody": True,
        "device": "mobile",
    },
)
http_response_body = b64decode(api_response.json()["httpResponseBody"])
user_agent = json.loads(http_response_body)["user-agent"]
print(user_agent)
```

#### Python client

```python
import asyncio
import json
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://httpbin.org/user-agent",
            "httpResponseBody": True,
            "device": "mobile",
        }
    )
    http_response_body: bytes = b64decode(api_response["httpResponseBody"])
    user_agent = json.loads(http_response_body)["user-agent"]
    print(user_agent)

asyncio.run(main())
```

#### Scrapy

```python
import json

from scrapy import Request, Spider

class HTTPBinOrgSpider(Spider):
    name = "httpbin_org"

    async def start(self):
        yield Request(
            "https://httpbin.org/user-agent",
            meta={
                "zyte_api_automap": {
                    "device": "mobile",
                }
            },
        )

    def parse(self, response):
        user_agent = json.loads(response.text)["user-agent"]
        print(user_agent)
```

Example output (may vary):

```none
Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Mobile Safari/537.36
```

#### Getting structured data from a product details page of an e-commerce website

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html"},
    {"product", true}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var product = data.RootElement.GetProperty("product").ToString();

Console.WriteLine(product);
```

#### CLI client

input.jsonl
```json
{"url": "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html", "product": true}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .product
```

#### curl

input.json
```json
{
    "url": "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html",
    "product": true
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .product
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html",
            "product",
            true);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          JsonObject product = jsonObject.get("product").getAsJsonObject();
          Gson gson = new GsonBuilder().setPrettyPrinting().create();
          System.out.println(gson.toJson(product));
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html',
    product: true
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const product = response.data.product
  console.log(product)
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html',
        'product' => true,
    ],
]);
$data = json_decode($response->getBody());
$product = json_encode($data->product);
echo $product.PHP_EOL;
```

#### Python

```python
import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": (
            "https://books.toscrape.com/catalogue"
            "/a-light-in-the-attic_1000/index.html"
        ),
        "product": True,
    },
)
product = api_response.json()["product"]
print(product)
```

#### Python client

```python
import asyncio
import json

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": (
                "https://books.toscrape.com/catalogue"
                "/a-light-in-the-attic_1000/index.html"
            ),
            "product": True,
        }
    )
    product = api_response["product"]
    print(json.dumps(product, indent=2, ensure_ascii=False))

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class BooksToScrapeComSpider(Spider):
    name = "books_toscrape_com"

    async def start(self):
        yield Request(
            (
                "https://books.toscrape.com/catalogue"
                "/a-light-in-the-attic_1000/index.html"
            ),
            meta={
                "zyte_api_automap": {
                    "product": True,
                },
            },
        )

    def parse(self, response):
        product = response.raw_api_response["product"]
        print(product)
```

Output (first 5 lines):

```json
{
  "name": "A Light in the Attic",
  "price": "51.77",
  "currency": "GBP",
  "currencyRaw": "£",
```
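
Note that, in the output above, `price` is a string rather than a number. As a hedged sketch of how you might work with such a record in Python, the snippet below converts the price to `Decimal` to avoid floating-point rounding. The `sample_product` dict only copies the fields shown above, and `format_price` is a hypothetical helper, not part of any Zyte library.

```python
from decimal import Decimal

# Only the fields shown in the output above; a real record has more.
sample_product = {
    "name": "A Light in the Attic",
    "price": "51.77",
    "currency": "GBP",
    "currencyRaw": "£",
}


def format_price(product: dict) -> str:
    """Build a display string from the price-related fields of a product."""
    price = Decimal(product["price"])  # exact decimal, no float rounding
    return f"{product['name']}: {price:.2f} {product['currency']}"


print(format_price(sample_product))
```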

#### Submitting an HTML form with an HTTP request

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

The page at [https://quotes.toscrape.com/search.aspx](https://quotes.toscrape.com/search.aspx) contains an HTML form that,
stripped down to its essentials, looks like this:

```html
<form action="/filter.aspx" method="post" >
    <select name="author">
        <option>----------</option>
        <option value="Albert Einstein">
            Albert Einstein
        </option>
        <!-- [more options] -->
    </select>
    <select name="tag">
        <option>----------</option>
    </select>
    <input type="hidden" name="__VIEWSTATE" value="ZTYzZDZ…">
</form>
```

When you select an **Author** (e.g. Albert Einstein), a form request is sent,
and the **Tag** options fill up.

To reproduce that:
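
The examples below send two requests: the first fetches the search page to read the hidden `__VIEWSTATE` value, and the second sends the form fields, URL-encoded, in a POST request to `/filter.aspx`. As a rough sketch of how the parameters of that second request can be put together, here is a version that uses only the Python standard library. The `build_form_request` helper and the `PLACEHOLDER` value are assumptions for illustration; a real `__VIEWSTATE` value must be read from the search page first.

```python
from urllib.parse import urlencode


def build_form_request(view_state: str) -> dict:
    """Return Zyte API parameters that submit the author form via POST."""
    form_fields = {
        "author": "Albert Einstein",
        "tag": "----------",
        "__VIEWSTATE": view_state,
    }
    return {
        "url": "https://quotes.toscrape.com/filter.aspx",
        "httpResponseBody": True,
        "httpRequestMethod": "POST",
        "customHttpRequestHeaders": [
            {
                "name": "Content-Type",
                "value": "application/x-www-form-urlencoded",
            },
        ],
        # urlencode() escapes spaces and other reserved characters.
        "httpRequestText": urlencode(form_fields),
    }


# "PLACEHOLDER" stands in for the hidden field value from the search page.
parameters = build_form_request("PLACEHOLDER")
print(parameters["httpRequestText"])
```

The resulting dict has the same fields as the second request in each of the examples that follow, so it could be sent as the JSON body of a POST request to `https://api.zyte.com/v1/extract`.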

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using System.Xml.XPath;
using System.Web;
using HtmlAgilityPack;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input1 = new Dictionary<string, object>(){
    {"url", "https://quotes.toscrape.com/search.aspx"},
    {"httpResponseBody", true}
};
var inputJson1 = JsonSerializer.Serialize(input1);
var content1 = new StringContent(inputJson1, Encoding.UTF8, "application/json");

HttpResponseMessage response1 = await client.PostAsync("https://api.zyte.com/v1/extract", content1);
var body1 = await response1.Content.ReadAsByteArrayAsync();
var data1 = JsonDocument.Parse(body1);
var base64HttpResponseBody1 = data1.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBodyBytes1 = System.Convert.FromBase64String(base64HttpResponseBody1);
var httpResponseBody1 = System.Text.Encoding.UTF8.GetString(httpResponseBodyBytes1);
var htmlDocument1 = new HtmlDocument();
htmlDocument1.LoadHtml(httpResponseBody1);
var navigator1 = htmlDocument1.CreateNavigator();
var nodeIterator = (XPathNodeIterator)navigator1.Evaluate("//*[@name='__VIEWSTATE']/@value");
nodeIterator.MoveNext();
var viewState = nodeIterator.Current.ToString();

var httpRequestTextParameters = new Dictionary<string, string>
{
    { "author", "Albert Einstein" },
    { "tag", "----------" },
    { "__VIEWSTATE", viewState}
};
var httpRequestText = string.Join("&",
    httpRequestTextParameters.Select(kvp => $"{HttpUtility.UrlEncode(kvp.Key)}={HttpUtility.UrlEncode(kvp.Value)}"));

var input2 = new Dictionary<string, object>(){
    {"url", "https://quotes.toscrape.com/filter.aspx"},
    {"httpResponseBody", true},
    {"httpRequestMethod", "POST"},
    {
        "customHttpRequestHeaders",
        new List<Dictionary<string, object>>()
        {
            new Dictionary<string, object>()
            {
                {"name", "Content-Type"},
                {"value", "application/x-www-form-urlencoded"}
            }
        }
    },
    {"httpRequestText", httpRequestText}
};
var inputJson2 = JsonSerializer.Serialize(input2);
var content2 = new StringContent(inputJson2, Encoding.UTF8, "application/json");

HttpResponseMessage response2 = await client.PostAsync("https://api.zyte.com/v1/extract", content2);
var body2 = await response2.Content.ReadAsByteArrayAsync();
var data2 = JsonDocument.Parse(body2);
var base64HttpResponseBody2 = data2.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBodyBytes2 = System.Convert.FromBase64String(base64HttpResponseBody2);
var httpResponseBody2 = System.Text.Encoding.UTF8.GetString(httpResponseBodyBytes2);
var htmlDocument2 = new HtmlDocument();
htmlDocument2.LoadHtml(httpResponseBody2);
var navigator2 = htmlDocument2.CreateNavigator();
var nodeIterator2 = (XPathNodeIterator)navigator2.Evaluate("//*[@name='tag']//option");
int tagCount = 0;
while (nodeIterator2.MoveNext())
{
    tagCount++;
}
Console.WriteLine($"{tagCount}");
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.entity.UrlEncodedFormEntity;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.NameValuePair;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;
import org.apache.hc.core5.http.message.BasicNameValuePair;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters1 =
        ImmutableMap.of("url", "https://quotes.toscrape.com/search.aspx", "httpResponseBody", true);
    String requestBody1 = new Gson().toJson(parameters1);

    HttpPost request1 = new HttpPost("https://api.zyte.com/v1/extract");
    request1.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request1.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request1.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request1.setEntity(new StringEntity(requestBody1));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request1,
        (response1) -> {
          HttpEntity httpEntity1 = response1.getEntity();
          String httpApiResponse1 = EntityUtils.toString(httpEntity1, StandardCharsets.UTF_8);
          JsonObject httpJsonObject1 = JsonParser.parseString(httpApiResponse1).getAsJsonObject();
          String base64HttpResponseBody1 = httpJsonObject1.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes1 = Base64.getDecoder().decode(base64HttpResponseBody1);
          String httpResponseBody1 = new String(httpResponseBodyBytes1, StandardCharsets.UTF_8);
          Document document1 = Jsoup.parse(httpResponseBody1);
          String viewState = document1.select("[name='__VIEWSTATE']").attr("value");
          Map<String, String> params =
              ImmutableMap.of(
                  "author", "Albert Einstein",
                  "tag", "----------",
                  "__VIEWSTATE", viewState);
          List<NameValuePair> formParams = new ArrayList<>();
          for (Map.Entry<String, String> entry : params.entrySet()) {
            formParams.add(new BasicNameValuePair(entry.getKey(), entry.getValue()));
          }
          UrlEncodedFormEntity entity =
              new UrlEncodedFormEntity(formParams, StandardCharsets.UTF_8);
          String httpRequestText = EntityUtils.toString(entity);
          Map<String, Object> customHttpRequestHeader =
              ImmutableMap.of("name", "Content-Type", "value", "application/x-www-form-urlencoded");
          Map<String, Object> parameters2 =
              ImmutableMap.of(
                  "url",
                  "https://quotes.toscrape.com/filter.aspx",
                  "httpResponseBody",
                  true,
                  "httpRequestMethod",
                  "POST",
                  "customHttpRequestHeaders",
                  Collections.singletonList(customHttpRequestHeader),
                  "httpRequestText",
                  httpRequestText);
          String requestBody2 = new Gson().toJson(parameters2);

          HttpPost request2 = new HttpPost("https://api.zyte.com/v1/extract");
          request2.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
          request2.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
          request2.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
          request2.setEntity(new StringEntity(requestBody2));

          client.execute(
              request2,
              (response2) -> {
                HttpEntity httpEntity2 = response2.getEntity();
                String httpApiResponse2 = EntityUtils.toString(httpEntity2, StandardCharsets.UTF_8);
                JsonObject httpJsonObject2 =
                    JsonParser.parseString(httpApiResponse2).getAsJsonObject();
                String base64HttpResponseBody2 =
                    httpJsonObject2.get("httpResponseBody").getAsString();
                byte[] httpResponseBodyBytes2 = Base64.getDecoder().decode(base64HttpResponseBody2);
                String httpResponseBody2 =
                    new String(httpResponseBodyBytes2, StandardCharsets.UTF_8);
                Document document2 = Jsoup.parse(httpResponseBody2);
                Elements tags = document2.select("select[name='tag'] option");
                System.out.println(tags.size());
                return null;
              });

          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')
const cheerio = require('cheerio')
const querystring = require('querystring')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://quotes.toscrape.com/search.aspx',
    httpResponseBody: true
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  const $ = cheerio.load(httpResponseBody)
  const viewState = $('[name="__VIEWSTATE"]').get(0).attribs.value
  const httpRequestText = querystring.stringify(
    {
      author: 'Albert Einstein',
      tag: '----------',
      __VIEWSTATE: viewState
    }
  )
  axios.post(
    'https://api.zyte.com/v1/extract',
    {
      url: 'https://quotes.toscrape.com/filter.aspx',
      httpResponseBody: true,
      httpRequestMethod: 'POST',
      customHttpRequestHeaders: [
        {
          name: 'Content-Type',
          value: 'application/x-www-form-urlencoded'
        }
      ],
      httpRequestText
    },
    {
      auth: { username: 'YOUR_ZYTE_API_KEY' }
    }
  ).then((response) => {
    const httpResponseBody = Buffer.from(
      response.data.httpResponseBody,
      'base64'
    )
    const $ = cheerio.load(httpResponseBody)
    console.log($('select[name="tag"] option').length)
  })
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response_1 = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://quotes.toscrape.com/search.aspx',
        'httpResponseBody' => true,
    ],
]);
$data = json_decode($response_1->getBody());
$http_response_body = base64_decode($data->httpResponseBody);
$doc = new DOMDocument();
$doc->loadHTML($http_response_body);
$xpath_1 = new DOMXPath($doc);
$view_state = $xpath_1->query('//*[@name="__VIEWSTATE"]/@value')->item(0)->nodeValue;
$http_request_text = http_build_query(
    [
        'author' => 'Albert Einstein',
        'tag' => '----------',
        '__VIEWSTATE' => $view_state,
    ]
);
$response_2 = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://quotes.toscrape.com/filter.aspx',
        'httpResponseBody' => true,
        'httpRequestMethod' => 'POST',
        'customHttpRequestHeaders' => [
            [
                'name' => 'Content-Type',
                'value' => 'application/x-www-form-urlencoded',
            ],
        ],
        'httpRequestText' => $http_request_text,
    ],
]);
$data = json_decode($response_2->getBody());
$http_response_body = base64_decode($data->httpResponseBody);
$doc->loadHTML($http_response_body);
$xpath_2 = new DOMXPath($doc);
$tags = $xpath_2->query('//*[@name="tag"]/option');
echo count($tags).PHP_EOL;
```

#### Python

Install form2request, which makes it easier
to handle HTML forms in Python.

Then:

```python
from base64 import b64decode

from form2request import form2request
from parsel import Selector
import requests

api_response_1 = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://quotes.toscrape.com/search.aspx",
        "httpResponseBody": True,
    },
)
api_response_1_data = api_response_1.json()
http_response_body_1 = b64decode(api_response_1_data["httpResponseBody"])
selector_1 = Selector(body=http_response_body_1, base_url=api_response_1_data["url"])
form = selector_1.css("form")
request = form2request(form, {"author": "Albert Einstein"}, click=False)
api_response_2 = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": request.url,
        "httpRequestMethod": request.method,
        "customHttpRequestHeaders": [
            {"name": k, "value": v} for k, v in request.headers
        ],
        "httpRequestText": request.body.decode(),
        "httpResponseBody": True,
    },
)
http_response_body_2 = b64decode(api_response_2.json()["httpResponseBody"])
selector_2 = Selector(body=http_response_body_2)
print(len(selector_2.css("select[name='tag'] option")))
```

#### Python client

Install form2request, which makes it easier
to handle HTML forms in Python.

Then:

```python
import asyncio
from base64 import b64decode

from form2request import form2request
from parsel import Selector
from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response_1 = await client.get(
        {
            "url": "https://quotes.toscrape.com/search.aspx",
            "httpResponseBody": True,
        }
    )
    http_response_body_1 = b64decode(api_response_1["httpResponseBody"])
    selector_1 = Selector(body=http_response_body_1, base_url=api_response_1["url"])
    form = selector_1.css("form")
    request = form2request(form, {"author": "Albert Einstein"}, click=False)
    api_response_2 = await client.get(
        {
            "url": request.url,
            "httpRequestMethod": request.method,
            "customHttpRequestHeaders": [
                {"name": k, "value": v} for k, v in request.headers
            ],
            "httpRequestText": request.body.decode(),
            "httpResponseBody": True,
        }
    )
    http_response_body_2 = b64decode(api_response_2["httpResponseBody"])
    selector_2 = Selector(body=http_response_body_2)
    print(len(selector_2.css("select[name='tag'] option")))

asyncio.run(main())
```

#### Scrapy

Install form2request, which makes it easier
to handle HTML forms in Scrapy.

Then, use it and let transparent mode take care of
the rest:

```python
from form2request import form2request
from scrapy import Spider

class QuotesToScrapeComSpider(Spider):
    name = "quotes_toscrape_com"
    start_urls = ["https://quotes.toscrape.com/search.aspx"]

    def parse(self, response):
        form = response.css("form")
        request = form2request(form, {"author": "Albert Einstein"}, click=False)
        yield request.to_scrapy(callback=self.parse_tags)

    def parse_tags(self, response):
        print(len(response.css("select[name='tag'] option")))
```

Output (number of **Tag** options):

```json
25
```
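
The JS and PHP examples above assemble the form body by hand, while form2request does it for you in Python. As a rough standard-library sketch of what that manual step amounts to, the helper below URL-encodes the form fields and wraps them in the same Zyte API parameters used above. `build_form_payload` is a hypothetical name, not part of any Zyte library, and the `__VIEWSTATE` value is a placeholder for the value read from the first response.

```python
from urllib.parse import urlencode


def build_form_payload(url, fields):
    """Return Zyte API parameters for a URL-encoded POST form submission.

    Hypothetical helper for illustration only: it mirrors the parameters
    used in the examples above and sends nothing over the network.
    """
    return {
        "url": url,
        "httpResponseBody": True,
        "httpRequestMethod": "POST",
        "customHttpRequestHeaders": [
            {
                "name": "Content-Type",
                "value": "application/x-www-form-urlencoded",
            },
        ],
        # urlencode() escapes spaces as "+" and joins the pairs with "&".
        "httpRequestText": urlencode(fields),
    }


payload = build_form_payload(
    "https://quotes.toscrape.com/filter.aspx",
    {
        "author": "Albert Einstein",
        "tag": "----------",
        "__VIEWSTATE": "PLACEHOLDER",  # read from the first response in practice
    },
)
print(payload["httpRequestText"])
```

The resulting dictionary can be passed as the `json` argument of `requests.post`, as in the Python example above.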

#### Decoding HTML from an HTTP response body (i.e. from bytes to text)

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### curl

Use [file](https://www.darwinsys.com/file/) to find the character encoding of a previously-downloaded
response based solely on its body (i.e. not
following the HTML encoding sniffing algorithm).

```shell
file --mime-encoding output.html
```

#### JS

Use [content-type-parser](https://www.npmjs.com/package/content-type-parser), [html-encoding-sniffer](https://www.npmjs.com/package/html-encoding-sniffer) and [whatwg-encoding](https://www.npmjs.com/package/whatwg-encoding):

```js
const contentTypeParser = require('content-type-parser')
const htmlEncodingSniffer = require('html-encoding-sniffer')
const whatwgEncoding = require('whatwg-encoding')

// …

const httpResponseHeaders = response.data.httpResponseHeaders
let contentTypeCharset
httpResponseHeaders.forEach(function (item) {
  if (item.name.toLowerCase() === 'content-type') {
    contentTypeCharset = contentTypeParser(item.value).get('charset')
  }
})
const httpResponseBody = Buffer.from(response.data.httpResponseBody, 'base64')
const encoding = htmlEncodingSniffer(httpResponseBody, {
  transportLayerEncodingLabel: contentTypeCharset
})
const html = whatwgEncoding.decode(httpResponseBody, encoding)
```

#### Python

[web-poet](https://web-poet.readthedocs.io/en/stable/index.html) provides a response wrapper that automatically decodes the
response body following an encoding sniffing algorithm similar to the
one defined in the HTML standard.

Provided that you have extracted a response with both body and
headers, and you have Base64-decoded the
response body, you can decode the HTML bytes as
follows:

```python
from web_poet import HttpResponse

# …

headers = tuple(
    (item['name'], item['value'])
    for item in http_response_headers
)
response = HttpResponse(
    url='https://example.com',
    body=http_response_body,
    status=200,
    headers=headers,
)
html = response.text
```
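
To give an idea of what such encoding sniffing involves, here is a deliberately simplified, standard-library-only sketch. It is not web-poet's implementation and it skips most of the HTML standard's algorithm. It only checks, in order, a byte order mark, the `charset` parameter of the `Content-Type` header, and a `<meta>` charset declaration near the start of the body, and falls back to UTF-8. The function names are hypothetical.

```python
import codecs
import re
from email.message import Message

# Byte order marks and a codec that consumes each of them while decoding.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)
# Rough match for <meta charset="..."> and <meta ... content="...; charset=...">.
_META_CHARSET = re.compile(rb"<meta[^>]+charset=[\"']?([\w-]+)", re.IGNORECASE)


def header_charset(headers):
    """Return the charset parameter of the Content-Type header, if any."""
    for name, value in headers:
        if name.lower() == "content-type":
            message = Message()
            message["Content-Type"] = value
            return message.get_content_charset()
    return None


def sniff_encoding(body, headers=()):
    """Guess the encoding of HTML bytes (simplified, see the text above)."""
    for bom, encoding in _BOMS:
        if body.startswith(bom):
            return encoding
    candidates = [header_charset(headers)]
    match = _META_CHARSET.search(body[:1024])
    if match:
        candidates.append(match.group(1).decode("ascii"))
    for candidate in candidates:
        if not candidate:
            continue
        try:
            # Normalize the label and skip encodings Python does not know.
            return codecs.lookup(candidate).name
        except LookupError:
            continue
    return "utf-8"


def decode_html(body, headers=()):
    """Decode HTML bytes, replacing bytes that are invalid in the encoding."""
    return body.decode(sniff_encoding(body, headers), errors="replace")
```

Here `headers` is an iterable of `(name, value)` pairs, like the `headers` tuple built in the example above. For real projects, prefer a library that follows the standard more closely.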

#### Scrapy

In transparent mode, regular Scrapy requests
targeting HTML resources decode them by default. See
zapi-text.

#### Setting arbitrary headers in HTTP requests

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://httpbin.org/anything"},
    {"httpResponseBody", true},
    {
        "customHttpRequestHeaders",
        new List<Dictionary<string, object>>()
        {
            new Dictionary<string, object>()
            {
                {"name", "Accept-Language"},
                {"value", "fa"}
            }
        }
    }
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBody = System.Convert.FromBase64String(base64HttpResponseBody);

var responseData = JsonDocument.Parse(httpResponseBody);
var headerEnumerator = responseData.RootElement.GetProperty("headers").EnumerateObject();
var headers = new Dictionary<string, string>();
while (headerEnumerator.MoveNext())
{
    headers.Add(
        headerEnumerator.Current.Name.ToString(),
        headerEnumerator.Current.Value.ToString()
    );
}
```

#### CLI client

input.jsonl
```json
{"url": "https://httpbin.org/anything", "httpResponseBody": true, "customHttpRequestHeaders": [{"name": "Accept-Language", "value": "fa"}]}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq .headers
```

#### curl

input.json
```json
{
    "url": "https://httpbin.org/anything",
    "httpResponseBody": true,
    "customHttpRequestHeaders": [
        {
            "name": "Accept-Language",
            "value": "fa"
        }
    ]
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq .headers
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Collections;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> customHttpRequestHeader =
        ImmutableMap.of("name", "Accept-Language", "value", "fa");
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://httpbin.org/anything",
            "httpResponseBody",
            true,
            "customHttpRequestHeaders",
            Collections.singletonList(customHttpRequestHeader));
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          JsonObject data = JsonParser.parseString(httpResponseBody).getAsJsonObject();
          JsonObject headers = data.get("headers").getAsJsonObject();
          Gson gson = new GsonBuilder().setPrettyPrinting().create();
          System.out.println(gson.toJson(headers));
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://httpbin.org/anything',
    httpResponseBody: true,
    customHttpRequestHeaders: [
      {
        name: 'Accept-Language',
        value: 'fa'
      }
    ]
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  const headers = JSON.parse(httpResponseBody).headers
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://httpbin.org/anything',
        'httpResponseBody' => true,
        'customHttpRequestHeaders' => [
            [
                'name' => 'Accept-Language',
                'value' => 'fa',
            ],
        ],
    ],
]);
$api = json_decode($response->getBody());
$http_response_body = base64_decode($api->httpResponseBody);
$data = json_decode($http_response_body);
$headers = $data->headers;
```

#### Proxy mode

In proxy mode, the headers of your own
request are used automatically.

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    -H "Accept-Language: fa" \
    https://httpbin.org/anything \
    | jq .headers
```

#### Python

```python
import json
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://httpbin.org/anything",
        "httpResponseBody": True,
        "customHttpRequestHeaders": [
            {
                "name": "Accept-Language",
                "value": "fa",
            },
        ],
    },
)
http_response_body = b64decode(api_response.json()["httpResponseBody"])
headers = json.loads(http_response_body)["headers"]
```

#### Python client

```python
import asyncio
import json
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://httpbin.org/anything",
            "httpResponseBody": True,
            "customHttpRequestHeaders": [
                {
                    "name": "Accept-Language",
                    "value": "fa",
                },
            ],
        }
    )
    http_response_body: bytes = b64decode(api_response["httpResponseBody"])
    headers = json.loads(http_response_body)["headers"]
    print(json.dumps(headers, indent=2))

asyncio.run(main())
```

#### Scrapy

```python
import json

from scrapy import Request, Spider

class HTTPBinOrgSpider(Spider):
    name = "httpbin_org"

    async def start(self):
        yield Request(
            "https://httpbin.org/anything",
            headers={"Accept-Language": "fa"},
        )

    def parse(self, response):
        headers = json.loads(response.text)["headers"]
```

Output (first 5 lines):

```json
{
  "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
  "Accept-Encoding": "gzip, deflate, br",
  "Accept-Language": "fa",
  "Host": "httpbin.org",
```
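
As the examples above show, `customHttpRequestHeaders` is a list of name/value objects rather than a plain mapping. If your code keeps headers in a dictionary, a small conversion helper can be handy. The sketch below is only an illustration: both function names are hypothetical, and the reverse helper assumes a header list shaped like the one above (as `httpResponseHeaders` is in the JS decoding example).

```python
def to_custom_headers(headers):
    """Convert a {name: value} mapping into a list of name/value objects."""
    return [{"name": name, "value": value} for name, value in headers.items()]


def from_header_list(header_list):
    """Convert a list of name/value objects into a dict keyed by lowercase name.

    Header names are case-insensitive, so names are lowercased. If a name
    is repeated, the last value wins in this simplified version.
    """
    return {item["name"].lower(): item["value"] for item in header_list}


parameters = {
    "url": "https://httpbin.org/anything",
    "httpResponseBody": True,
    "customHttpRequestHeaders": to_custom_headers({"Accept-Language": "fa"}),
}
print(parameters["customHttpRequestHeaders"])
```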

#### Forcing data center or residential IPs

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using System.Xml.XPath;
using HtmlAgilityPack;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

string[] ipTypes = { "datacenter", "residential" };
for (int i = 0; i < ipTypes.Length; i++)
{
    var input = new Dictionary<string, object>(){
        {"url", "https://www.whatismyisp.com/"},
        {"httpResponseBody", true},
        {"ipType", ipTypes[i]}
    };
    var inputJson = JsonSerializer.Serialize(input);
    var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

    HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
    var body = await response.Content.ReadAsByteArrayAsync();

    var data = JsonDocument.Parse(body);
    var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
    var httpResponseBodyBytes = System.Convert.FromBase64String(base64HttpResponseBody);
    var httpResponseBody = System.Text.Encoding.UTF8.GetString(httpResponseBodyBytes);
    var htmlDocument = new HtmlDocument();
    htmlDocument.LoadHtml(httpResponseBody);
    var navigator = htmlDocument.CreateNavigator();
    var nodeIterator = (XPathNodeIterator)navigator.Evaluate("//h1/span/text()");
    nodeIterator.MoveNext();
    var isp = nodeIterator.Current.ToString();

    Console.WriteLine(isp);
}
```

#### CLI client

input.jsonl
```json
{"url": "https://www.whatismyisp.com/", "httpResponseBody": true, "ipType": "datacenter"}
{"url": "https://www.whatismyisp.com/", "httpResponseBody": true, "ipType": "residential"}
```

```shell
zyte-api input.jsonl 2> /dev/null \
    | xargs -d\\n -n 1 \
    bash -c "
        jq --raw-output .httpResponseBody <<< \"\$0\" \
        | base64 --decode \
        | xmllint --html --xpath 'string(//h1/span/text())' --noblanks - 2> /dev/null
"
```

#### curl

input.jsonl
```json
{"url": "https://www.whatismyisp.com/", "httpResponseBody": true, "ipType": "datacenter"}
{"url": "https://www.whatismyisp.com/", "httpResponseBody": true, "ipType": "residential"}
```

```shell
cat input.jsonl \
    | xargs -P 2 -d\\n -n 1 \
    bash -c "
        curl \
                --user YOUR_ZYTE_API_KEY: \
                --header 'Content-Type: application/json' \
                --data \"\$0\" \
                --compressed \
                https://api.zyte.com/v1/extract \
            2> /dev/null \
            | jq --raw-output .httpResponseBody \
            | base64 --decode \
            | xmllint --html --xpath 'string(//h1/span/text())' --noblanks - 2> /dev/null
"
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class Example {

  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    String[] ipTypes = {"datacenter", "residential"};
    for (String ipType : ipTypes) {
      Map<String, Object> parameters =
          ImmutableMap.of(
              "url", "https://www.whatismyisp.com/", "httpResponseBody", true, "ipType", ipType);
      String requestBody = new Gson().toJson(parameters);

      HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
      request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
      request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
      request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
      request.setEntity(new StringEntity(requestBody));

      CloseableHttpClient client = HttpClients.createDefault();
      client.execute(
          request,
          response -> {
            HttpEntity entity = response.getEntity();
            String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
            JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
            String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
            byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
            String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
            Document document = Jsoup.parse(httpResponseBody);
            String isp = document.select("h1 > span:first-of-type").text();
            System.out.println(isp);
            return null;
          });
    }
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')
const cheerio = require('cheerio')

const ipTypes = ['datacenter', 'residential']
for (const ipType of ipTypes) {
  axios.post(
    'https://api.zyte.com/v1/extract',
    {
      url: 'https://www.whatismyisp.com/',
      httpResponseBody: true,
      ipType
    },
    {
      auth: { username: 'YOUR_ZYTE_API_KEY' }
    }
  ).then((response) => {
    const httpResponseBody = Buffer.from(
      response.data.httpResponseBody,
      'base64'
    )
    const $ = cheerio.load(httpResponseBody)
    const isp = $('h1 > span:first-of-type').text()
    console.log(isp)
  })
}
```

#### PHP

```php
<?php

error_reporting(E_ERROR | E_PARSE);
$client = new GuzzleHttp\Client();
$ip_types = ['datacenter', 'residential'];
foreach ($ip_types as $ip_type) {
    $response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
        'auth' => ['YOUR_ZYTE_API_KEY', ''],
        'headers' => ['Accept-Encoding' => 'gzip'],
        'json' => [
            'url' => 'https://www.whatismyisp.com/',
            'httpResponseBody' => true,
            'ipType' => $ip_type,
        ],
    ]);
    $data = json_decode($response->getBody());
    $http_response_body = base64_decode($data->httpResponseBody);
    $doc = new DOMDocument();
    $doc->loadHTML($http_response_body);
    $xpath = new DOMXPath($doc);
    $isp = $xpath->query('//h1/span/text()')->item(0)->nodeValue;
    echo $isp.PHP_EOL;
}
```

#### Proxy mode

In proxy mode, use the
Zyte-IPType header.

```shell
for ip_type in datacenter residential
do
    curl \
        --proxy api.zyte.com:8011 \
        --proxy-user YOUR_ZYTE_API_KEY: \
        --header "Zyte-IPType: $ip_type" \
        --compressed \
        https://www.whatismyisp.com/ \
        2> /dev/null \
        | xmllint --html --xpath 'string(//h1/span/text())' --noblanks - 2> /dev/null
done
```

#### Python

```python
from base64 import b64decode

import requests
from parsel import Selector

for ip_type in ("datacenter", "residential"):
    api_response = requests.post(
        "https://api.zyte.com/v1/extract",
        auth=("YOUR_ZYTE_API_KEY", ""),
        json={
            "url": "https://www.whatismyisp.com/",
            "httpResponseBody": True,
            "ipType": ip_type,
        },
    )
    http_response_body_bytes = b64decode(api_response.json()["httpResponseBody"])
    http_response_body = http_response_body_bytes.decode()
    isp = Selector(http_response_body).css("h1 > span::text").get()
    print(isp)
```

#### Python client

```python
import asyncio
from base64 import b64decode

from parsel import Selector
from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    for ip_type in ("datacenter", "residential"):
        api_response = await client.get(
            {
                "url": "https://www.whatismyisp.com/",
                "httpResponseBody": True,
                "ipType": ip_type,
            },
        )
        http_response_body_bytes = b64decode(api_response["httpResponseBody"])
        http_response_body = http_response_body_bytes.decode()
        isp = Selector(http_response_body).css("h1 > span::text").get()
        print(isp)

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class WhatIsMyIspComSpider(Spider):
    name = "whatismyisp_com"

    async def start(self):
        for ip_type in ("datacenter", "residential"):
            yield Request(
                "https://www.whatismyisp.com/",
                meta={
                    "zyte_api_automap": {
                        "ipType": ip_type,
                    },
                },
            )

    def parse(self, response):
        print(response.css("h1 > span::text").get())
```

Output:

```none
[A web hosting company]
[An Internet service provider]
```
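
The `input.jsonl` files used by the CLI client and curl examples above hold one JSON object per line, one per `ipType` value. If you would rather generate such a file than write it by hand, a minimal sketch follows. The helper name `build_jsonl` is hypothetical, and the only `ipType` values it assumes are the two shown above.

```python
import json


def build_jsonl(url, ip_types=("datacenter", "residential")):
    """Return JSON Lines text with one Zyte API request per IP type."""
    lines = [
        json.dumps({"url": url, "httpResponseBody": True, "ipType": ip_type})
        for ip_type in ip_types
    ]
    # JSON Lines: one object per line, each terminated by a newline.
    return "\n".join(lines) + "\n"


jsonl = build_jsonl("https://www.whatismyisp.com/")
print(jsonl, end="")
```

Writing the returned text to `input.jsonl` gives a file equivalent to the one shown in the CLI client example.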

#### Disabling JavaScript in a browser request

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using System.Xml.XPath;
using HtmlAgilityPack;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://www.whatismybrowser.com/detect/is-javascript-enabled"},
    {"browserHtml", true},
    {"javascript", false}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var browserHtml = data.RootElement.GetProperty("browserHtml").ToString();
var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(browserHtml);
var navigator = htmlDocument.CreateNavigator();
var nodeIterator = (XPathNodeIterator)navigator.Evaluate("//*[@id=\"detected_value\"]/text()");
nodeIterator.MoveNext();
var isJavaScriptEnabled = nodeIterator.Current.ToString();
```

#### CLI client

input.jsonl
```json
{"url": "https://www.whatismybrowser.com/detect/is-javascript-enabled", "browserHtml": true, "javascript": false}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .browserHtml \
    | xmllint --html --xpath '//*[@id="detected_value"]/text()' - 2> /dev/null
```

#### curl

input.json
```json
{
    "url": "https://www.whatismybrowser.com/detect/is-javascript-enabled",
    "browserHtml": true,
    "javascript": false
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .browserHtml \
    | xmllint --html --xpath '//*[@id="detected_value"]/text()' - 2> /dev/null
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://www.whatismybrowser.com/detect/is-javascript-enabled",
            "browserHtml",
            true,
            "javascript",
            false);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String browserHtml = jsonObject.get("browserHtml").getAsString();
          Document document = Jsoup.parse(browserHtml);
          String isJavaScriptEnabled = document.select("#detected_value").text();
          System.out.println(isJavaScriptEnabled);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')
const cheerio = require('cheerio')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://www.whatismybrowser.com/detect/is-javascript-enabled',
    browserHtml: true,
    javascript: false
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const $ = cheerio.load(response.data.browserHtml)
  const isJavaScriptEnabled = $('#detected_value').text()
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://www.whatismybrowser.com/detect/is-javascript-enabled',
        'browserHtml' => true,
        'javascript' => false,
    ],
]);
$api = json_decode($response->getBody());
$doc = new DOMDocument();
$doc->loadHTML($api->browserHtml);
$xpath = new DOMXPath($doc);
$is_javascript_enabled = $xpath->query("//*[@id='detected_value']")->item(0)->textContent;
```

#### Python

```python
import requests
from parsel import Selector

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://www.whatismybrowser.com/detect/is-javascript-enabled",
        "browserHtml": True,
        "javascript": False,
    },
)
browser_html = api_response.json()["browserHtml"]
selector = Selector(browser_html)
is_javascript_enabled: str = selector.css("#detected_value::text").get()
```
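
The `auth=("YOUR_ZYTE_API_KEY", "")` argument above sends the API key as the user name of HTTP Basic authentication, with an empty password. As a minimal sketch of what that amounts to, the following standard-library code builds the equivalent `Authorization` header value. `build_auth_header` is a hypothetical helper name used for illustration, not part of any Zyte library:

```python
import base64


def build_auth_header(api_key: str) -> str:
    """Build a Basic auth header value (hypothetical helper).

    The API key is the user name and the password is empty, so the
    credentials string is the key followed by a colon.
    """
    credentials = f"{api_key}:".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


print(build_auth_header("YOUR_ZYTE_API_KEY"))
```

The Java and C# examples build the same header by hand because they do not rely on a built-in Basic authentication option.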

#### Python client

```python
import asyncio

from parsel import Selector
from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://www.whatismybrowser.com/detect/is-javascript-enabled",
            "browserHtml": True,
            "javascript": False,
        }
    )
    browser_html = api_response["browserHtml"]
    selector = Selector(browser_html)
    is_javascript_enabled = selector.css("#detected_value::text").get()
    print(is_javascript_enabled)

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class WhatIsMyBrowserComSpider(Spider):
    name = "whatismybrowser_com"

    async def start(self):
        yield Request(
            "https://www.whatismybrowser.com/detect/is-javascript-enabled",
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                    "javascript": False,
                },
            },
        )

    def parse(self, response):
        is_javascript_enabled: str = response.css("#detected_value::text").get()
```

Output:

```none
No
```

#### Appending arbitrary metadata to a request

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

var inputData = new List<List<object>>()
{
    new List<object>(){"https://toscrape.com", 1},
    new List<object>(){"https://books.toscrape.com", 2},
    new List<object>(){"https://quotes.toscrape.com", 3},
};
var output = new List<HttpResponseMessage>();

var handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All,
    MaxConnectionsPerServer = 15
};
var client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var responseTasks = new List<Task<HttpResponseMessage>>();
foreach (var entry in inputData)
{
    var input = new Dictionary<string, object>(){
        {"url", entry[0]},
        {"browserHtml", true},
        {"echoData", entry[1]}
    };
    var inputJson = JsonSerializer.Serialize(input);
    var content = new StringContent(inputJson, Encoding.UTF8, "application/json");
    var responseTask = client.PostAsync("https://api.zyte.com/v1/extract", content);
    responseTasks.Add(responseTask);
}

while (responseTasks.Any())
{
    var responseTask = await Task.WhenAny(responseTasks);
    responseTasks.Remove(responseTask);
    var response = await responseTask;
    output.Add(response);
}
```

#### CLI client

input.jsonl
```json
{"url": "https://toscrape.com", "browserHtml": true, "echoData": 1}
{"url": "https://books.toscrape.com", "browserHtml": true, "echoData": 2}
{"url": "https://quotes.toscrape.com", "browserHtml": true, "echoData": 3}
```

```shell
zyte-api --n-conn 15 input.jsonl -o output.jsonl
```

#### curl

input.jsonl
```json
{"url": "https://toscrape.com", "browserHtml": true, "echoData": 1}
{"url": "https://books.toscrape.com", "browserHtml": true, "echoData": 2}
{"url": "https://quotes.toscrape.com", "browserHtml": true, "echoData": 3}
```

```shell
cat input.jsonl \
    | xargs -P 15 -d\\n -n 1 \
    bash -c "
        curl \
            --user $ZYTE_API_KEY: \
            --header 'Content-Type: application/json' \
            --data \"\$0\" \
            --compressed \
            https://api.zyte.com/v1/extract \
        | jq .echoData \
        | awk '{print \$1}' \
        >> output.jsonl
"
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import org.apache.hc.client5.http.async.methods.SimpleHttpRequest;
import org.apache.hc.client5.http.async.methods.SimpleHttpResponse;
import org.apache.hc.client5.http.impl.async.CloseableHttpAsyncClient;
import org.apache.hc.client5.http.impl.async.HttpAsyncClients;
import org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager;
import org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManagerBuilder;
import org.apache.hc.client5.http.ssl.ClientTlsStrategyBuilder;
import org.apache.hc.core5.concurrent.FutureCallback;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.nio.ssl.TlsStrategy;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws ExecutionException, InterruptedException, IOException, ParseException {

    Object[][] input = {
      {"https://toscrape.com", 1},
      {"https://books.toscrape.com", 2},
      {"https://quotes.toscrape.com", 3}
    };
    List<Future<SimpleHttpResponse>> futures = new ArrayList<>();
    List<String> output = new ArrayList<>();

    int concurrency = 15;

    // https://issues.apache.org/jira/browse/HTTPCLIENT-2219
    final TlsStrategy tlsStrategy = ClientTlsStrategyBuilder.create().useSystemProperties().build();

    PoolingAsyncClientConnectionManager connectionManager =
        PoolingAsyncClientConnectionManagerBuilder.create().setTlsStrategy(tlsStrategy).build();
    connectionManager.setMaxTotal(concurrency);
    connectionManager.setDefaultMaxPerRoute(concurrency);

    CloseableHttpAsyncClient client =
        HttpAsyncClients.custom().setConnectionManager(connectionManager).build();
    try {
      client.start();
      for (int i = 0; i < input.length; i++) {
        Map<String, Object> parameters =
            ImmutableMap.of("url", input[i][0], "browserHtml", true, "echoData", input[i][1]);
        String requestBody = new Gson().toJson(parameters);

        SimpleHttpRequest request =
            new SimpleHttpRequest("POST", "https://api.zyte.com/v1/extract");
        request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
        request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
        request.setBody(requestBody, ContentType.APPLICATION_JSON);

        final Future<SimpleHttpResponse> future =
            client.execute(
                request,
                new FutureCallback<SimpleHttpResponse>() {
                  public void completed(final SimpleHttpResponse response) {
                    String apiResponse = response.getBodyText();
                    output.add(apiResponse);
                  }

                  public void failed(final Exception ex) {}

                  public void cancelled() {}
                });
        futures.add(future);
      }
      for (int i = 0; i < futures.size(); i++) {
        futures.get(i).get();
      }
    } finally {
      client.close();
    }
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const { ConcurrencyManager } = require('axios-concurrency')
const axios = require('axios')

const urls = [
  ['https://toscrape.com', 1],
  ['https://books.toscrape.com', 2],
  ['https://quotes.toscrape.com', 3]
]
const output = []

const client = axios.create()
ConcurrencyManager(client, 15)

Promise.all(
  urls.map((input) =>
    client.post(
      'https://api.zyte.com/v1/extract',
      { url: input[0], browserHtml: true, echoData: input[1] },
      {
        auth: { username: 'YOUR_ZYTE_API_KEY' }
      }
    ).then((response) => output.push(response.data))
  )
)
```

#### PHP

```php
<?php

$input = [
    ['https://toscrape.com', 1],
    ['https://books.toscrape.com', 2],
    ['https://quotes.toscrape.com', 3],
];
$output = [];
$promises = [];

$client = new GuzzleHttp\Client();

foreach ($input as $url_and_index) {
    $options = [
        'auth' => ['YOUR_ZYTE_API_KEY', ''],
        'headers' => ['Accept-Encoding' => 'gzip'],
        'json' => [
            'url' => $url_and_index[0],
            'browserHtml' => true,
            'echoData' => $url_and_index[1],
        ],
    ];
    $request = new \GuzzleHttp\Psr7\Request('POST', 'https://api.zyte.com/v1/extract');
    global $promises;
    $promises[] = $client->sendAsync($request, $options)->then(function ($response) {
        global $output;
        $output[] = json_decode($response->getBody());
    });
}

foreach ($promises as $promise) {
    $promise->wait();
}
```

#### Proxy mode

In proxy mode, you cannot set request metadata.

#### Python

```python
import asyncio

import aiohttp

input_data = [
    ("https://toscrape.com", 1),
    ("https://books.toscrape.com", 2),
    ("https://quotes.toscrape.com", 3),
]
output = []

async def extract(client, url, index):
    response = await client.post(
        "https://api.zyte.com/v1/extract",
        json={"url": url, "browserHtml": True, "echoData": index},
        auth=aiohttp.BasicAuth("YOUR_ZYTE_API_KEY"),
    )
    output.append(await response.json())

async def main():
    connector = aiohttp.TCPConnector(limit_per_host=15)
    async with aiohttp.ClientSession(connector=connector) as client:
        await asyncio.gather(
            *[extract(client, url, index) for url, index in input_data]
        )

asyncio.run(main())
```
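
The `output` list above holds responses in the order in which they complete, which may differ from the order of `input_data`. The following is a minimal sketch of how the `echoData` value can restore the original order. It uses hypothetical stand-in responses rather than real API calls, and `sort_by_echo_data` is an illustrative helper name:

```python
# Hypothetical stand-ins for API responses, listed in completion order.
output = [
    {"url": "https://quotes.toscrape.com/", "echoData": 3},
    {"url": "https://toscrape.com/", "echoData": 1},
    {"url": "https://books.toscrape.com/", "echoData": 2},
]


def sort_by_echo_data(responses):
    """Return responses ordered by the echoData value sent with each request."""
    return sorted(responses, key=lambda response: response["echoData"])


ordered = sort_by_echo_data(output)
print([response["url"] for response in ordered])
```

Sorting works here because the example sends integers as `echoData`. For other kinds of metadata, a dictionary keyed by the `echoData` value would serve the same purpose.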

#### Python client

```python
import asyncio
import json

from zyte_api import AsyncZyteAPI

input_data = [
    ("https://toscrape.com", 1),
    ("https://books.toscrape.com", 2),
    ("https://quotes.toscrape.com", 3),
]

async def main():
    client = AsyncZyteAPI(n_conn=15)
    queries = [
        {"url": url, "browserHtml": True, "echoData": index}
        for url, index in input_data
    ]
    async with client.session() as session:
        for future in session.iter(queries):
            response = await future
            print(json.dumps(response))

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

input_data = [
    ("https://toscrape.com", 1),
    ("https://books.toscrape.com", 2),
    ("https://quotes.toscrape.com", 3),
]

class ToScrapeSpider(Spider):
    name = "toscrape_com"

    custom_settings = {
        "CONCURRENT_REQUESTS": 15,
        "CONCURRENT_REQUESTS_PER_DOMAIN": 15,
    }

    async def start(self):
        for url, index in input_data:
            yield Request(
                url,
                meta={
                    "zyte_api_automap": {
                        "browserHtml": True,
                        "echoData": index,
                    },
                },
            )

    def parse(self, response):
        yield {
            "index": response.raw_api_response["echoData"],
            "html": response.text,
        }
```

Alternatively, you can use Scrapy’s `Request.cb_kwargs` directly for a
similar purpose:

```python
    async def start(self):
        for url, index in input_data:
            yield Request(
                url,
                cb_kwargs={"index": index},
                meta={
                    "zyte_api_automap": {
                        "browserHtml": True,
                    },
                },
            )

    def parse(self, response, index):
        yield {
            "index": index,
            "html": response.text,
        }
```

Output:

```json
{"url": "https://quotes.toscrape.com/", "statusCode": 200, "browserHtml": "<!DOCTYPE html><html lang=\"en\"><head>\n\t<meta charset=\"UTF-8\">\n\t<title>Quotes to Scrape</title>\n    <link rel=\"stylesheet\" href=\"/static/bootstrap.min.css\">\n    <link rel=\"stylesheet\" href=\"/static/main.css\">\n</head>\n<body>\n    <div class=\"container\">\n        <div class=\"row header-box\">\n            <div class=\"col-md-8\">\n                <h1>\n                    <a href=\"/\" style=\"text-decoration: none\">Quotes to Scrape</a>\n                </h1>\n            </div>\n            <div class=\"col-md-4\">\n                <p>\n                \n                    <a href=\"/login\">Login</a>\n                \n                </p>\n            </div>\n        </div>\n    \n\n<div class=\"row\">\n    <div class=\"col-md-8\">\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“The world as we have created it is a process of our thinking. 
It cannot be changed without changing our thinking.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">Albert Einstein</small>\n        <a href=\"/author/Albert-Einstein\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"change,deep-thoughts,thinking,world\"> \n            \n            <a class=\"tag\" href=\"/tag/change/page/1/\">change</a>\n            \n            <a class=\"tag\" href=\"/tag/deep-thoughts/page/1/\">deep-thoughts</a>\n            \n            <a class=\"tag\" href=\"/tag/thinking/page/1/\">thinking</a>\n            \n            <a class=\"tag\" href=\"/tag/world/page/1/\">world</a>\n            \n        </div>\n    </div>\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“It is our choices, Harry, that show what we truly are, far more than our abilities.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">J.K. Rowling</small>\n        <a href=\"/author/J-K-Rowling\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"abilities,choices\"> \n            \n            <a class=\"tag\" href=\"/tag/abilities/page/1/\">abilities</a>\n            \n            <a class=\"tag\" href=\"/tag/choices/page/1/\">choices</a>\n            \n        </div>\n    </div>\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“There are only two ways to live your life. One is as though nothing is a miracle. 
The other is as though everything is a miracle.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">Albert Einstein</small>\n        <a href=\"/author/Albert-Einstein\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"inspirational,life,live,miracle,miracles\"> \n            \n            <a class=\"tag\" href=\"/tag/inspirational/page/1/\">inspirational</a>\n            \n            <a class=\"tag\" href=\"/tag/life/page/1/\">life</a>\n            \n            <a class=\"tag\" href=\"/tag/live/page/1/\">live</a>\n            \n            <a class=\"tag\" href=\"/tag/miracle/page/1/\">miracle</a>\n            \n            <a class=\"tag\" href=\"/tag/miracles/page/1/\">miracles</a>\n            \n        </div>\n    </div>\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">Jane Austen</small>\n        <a href=\"/author/Jane-Austen\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"aliteracy,books,classic,humor\"> \n            \n            <a class=\"tag\" href=\"/tag/aliteracy/page/1/\">aliteracy</a>\n            \n            <a class=\"tag\" href=\"/tag/books/page/1/\">books</a>\n            \n            <a class=\"tag\" href=\"/tag/classic/page/1/\">classic</a>\n            \n            <a class=\"tag\" href=\"/tag/humor/page/1/\">humor</a>\n            \n        </div>\n    </div>\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“Imperfection is beauty, madness is genius and it's better to be 
absolutely ridiculous than absolutely boring.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">Marilyn Monroe</small>\n        <a href=\"/author/Marilyn-Monroe\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"be-yourself,inspirational\"> \n            \n            <a class=\"tag\" href=\"/tag/be-yourself/page/1/\">be-yourself</a>\n            \n            <a class=\"tag\" href=\"/tag/inspirational/page/1/\">inspirational</a>\n            \n        </div>\n    </div>\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“Try not to become a man of success. Rather become a man of value.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">Albert Einstein</small>\n        <a href=\"/author/Albert-Einstein\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"adulthood,success,value\"> \n            \n            <a class=\"tag\" href=\"/tag/adulthood/page/1/\">adulthood</a>\n            \n            <a class=\"tag\" href=\"/tag/success/page/1/\">success</a>\n            \n            <a class=\"tag\" href=\"/tag/value/page/1/\">value</a>\n            \n        </div>\n    </div>\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“It is better to be hated for what you are than to be loved for what you are not.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">André Gide</small>\n        <a href=\"/author/Andre-Gide\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"life,love\"> \n            \n            <a class=\"tag\" 
href=\"/tag/life/page/1/\">life</a>\n            \n            <a class=\"tag\" href=\"/tag/love/page/1/\">love</a>\n            \n        </div>\n    </div>\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“I have not failed. I've just found 10,000 ways that won't work.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">Thomas A. Edison</small>\n        <a href=\"/author/Thomas-A-Edison\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"edison,failure,inspirational,paraphrased\"> \n            \n            <a class=\"tag\" href=\"/tag/edison/page/1/\">edison</a>\n            \n            <a class=\"tag\" href=\"/tag/failure/page/1/\">failure</a>\n            \n            <a class=\"tag\" href=\"/tag/inspirational/page/1/\">inspirational</a>\n            \n            <a class=\"tag\" href=\"/tag/paraphrased/page/1/\">paraphrased</a>\n            \n        </div>\n    </div>\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“A woman is like a tea bag; you never know how strong it is until it's in hot water.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">Eleanor Roosevelt</small>\n        <a href=\"/author/Eleanor-Roosevelt\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"misattributed-eleanor-roosevelt\"> \n            \n            <a class=\"tag\" href=\"/tag/misattributed-eleanor-roosevelt/page/1/\">misattributed-eleanor-roosevelt</a>\n            \n        </div>\n    </div>\n\n    <div class=\"quote\" itemscope=\"\" itemtype=\"http://schema.org/CreativeWork\">\n        <span class=\"text\" itemprop=\"text\">“A day without sunshine is like, you know, 
night.”</span>\n        <span>by <small class=\"author\" itemprop=\"author\">Steve Martin</small>\n        <a href=\"/author/Steve-Martin\">(about)</a>\n        </span>\n        <div class=\"tags\">\n            Tags:\n            <meta class=\"keywords\" itemprop=\"keywords\" content=\"humor,obvious,simile\"> \n            \n            <a class=\"tag\" href=\"/tag/humor/page/1/\">humor</a>\n            \n            <a class=\"tag\" href=\"/tag/obvious/page/1/\">obvious</a>\n            \n            <a class=\"tag\" href=\"/tag/simile/page/1/\">simile</a>\n            \n        </div>\n    </div>\n\n    <nav>\n        <ul class=\"pager\">\n            \n            \n            <li class=\"next\">\n                <a href=\"/page/2/\">Next <span aria-hidden=\"true\">→</span></a>\n            </li>\n            \n        </ul>\n    </nav>\n    </div>\n    <div class=\"col-md-4 tags-box\">\n        \n            <h2>Top Ten tags</h2>\n            \n            <span class=\"tag-item\">\n            <a class=\"tag\" style=\"font-size: 28px\" href=\"/tag/love/\">love</a>\n            </span>\n            \n            <span class=\"tag-item\">\n            <a class=\"tag\" style=\"font-size: 26px\" href=\"/tag/inspirational/\">inspirational</a>\n            </span>\n            \n            <span class=\"tag-item\">\n            <a class=\"tag\" style=\"font-size: 26px\" href=\"/tag/life/\">life</a>\n            </span>\n            \n            <span class=\"tag-item\">\n            <a class=\"tag\" style=\"font-size: 24px\" href=\"/tag/humor/\">humor</a>\n            </span>\n            \n            <span class=\"tag-item\">\n            <a class=\"tag\" style=\"font-size: 22px\" href=\"/tag/books/\">books</a>\n            </span>\n            \n            <span class=\"tag-item\">\n            <a class=\"tag\" style=\"font-size: 14px\" href=\"/tag/reading/\">reading</a>\n            </span>\n            \n            <span class=\"tag-item\">\n            
<a class=\"tag\" style=\"font-size: 10px\" href=\"/tag/friendship/\">friendship</a>\n            </span>\n            \n            <span class=\"tag-item\">\n            <a class=\"tag\" style=\"font-size: 8px\" href=\"/tag/friends/\">friends</a>\n            </span>\n            \n            <span class=\"tag-item\">\n            <a class=\"tag\" style=\"font-size: 8px\" href=\"/tag/truth/\">truth</a>\n            </span>\n            \n            <span class=\"tag-item\">\n            <a class=\"tag\" style=\"font-size: 6px\" href=\"/tag/simile/\">simile</a>\n            </span>\n            \n        \n    </div>\n</div>\n\n    </div>\n    <footer class=\"footer\">\n        <div class=\"container\">\n            <p class=\"text-muted\">\n                Quotes by: <a href=\"https://www.goodreads.com/quotes\">GoodReads.com</a>\n            </p>\n            <p class=\"copyright\">\n                Made with <span class=\"zyte\">❤</span> by <a class=\"zyte\" href=\"https://www.zyte.com\">Zyte</a>\n            </p>\n        </div>\n    </footer>\n\n</body></html>", "echoData": 3}
{"url": "https://books.toscrape.com/", "statusCode": 200, "browserHtml": "<!DOCTYPE html><!--[if lt IE 7]>      <html lang=\"en-us\" class=\"no-js lt-ie9 lt-ie8 lt-ie7\"> <![endif]--><!--[if IE 7]>         <html lang=\"en-us\" class=\"no-js lt-ie9 lt-ie8\"> <![endif]--><!--[if IE 8]>         <html lang=\"en-us\" class=\"no-js lt-ie9\"> <![endif]--><!--[if gt IE 8]><!--><html lang=\"en-us\" class=\"no-js\"><!--<![endif]--><head>\n        <title>\n    All products | Books to Scrape - Sandbox\n</title>\n\n        <meta http-equiv=\"content-type\" content=\"text/html; charset=UTF-8\">\n        <meta name=\"created\" content=\"24th Jun 2016 09:29\">\n        <meta name=\"description\" content=\"\">\n        <meta name=\"viewport\" content=\"width=device-width\">\n        <meta name=\"robots\" content=\"NOARCHIVE,NOCACHE\">\n\n        <!-- Le HTML5 shim, for IE6-8 support of HTML elements -->\n        <!--[if lt IE 9]>\n        <script src=\"//html5shim.googlecode.com/svn/trunk/html5.js\"></script>\n        <![endif]-->\n\n        \n            <link rel=\"shortcut icon\" href=\"static/oscar/favicon.ico\">\n        \n\n        \n        \n    \n    \n        <link rel=\"stylesheet\" type=\"text/css\" href=\"static/oscar/css/styles.css\">\n    \n    <link rel=\"stylesheet\" href=\"static/oscar/js/bootstrap-datetimepicker/bootstrap-datetimepicker.css\">\n    <link rel=\"stylesheet\" type=\"text/css\" href=\"static/oscar/css/datetimepicker.css\">\n\n\n        \n        \n\n        \n\n        \n            \n            \n\n        \n    </head>\n\n    <body id=\"default\" class=\"default\">\n        \n        \n    \n    \n    <header class=\"header container-fluid\">\n        <div class=\"page_inner\">\n            <div class=\"row\">\n                <div class=\"col-sm-8 h1\"><a href=\"index.html\">Books to Scrape</a><small> We love being scraped!</small>\n</div>\n\n                \n            </div>\n        </div>\n    </header>\n\n    \n    \n<div 
class=\"container-fluid page\">\n    <div class=\"page_inner\">\n        \n    <ul class=\"breadcrumb\">\n        <li>\n            <a href=\"index.html\">Home</a>\n        </li>\n        <li class=\"active\">All products</li>\n    </ul>\n\n        <div class=\"row\">\n\n            <aside class=\"sidebar col-sm-4 col-md-3\">\n                \n                <div id=\"promotions_left\">\n                    \n                </div>\n                \n    \n    \n        \n        <div class=\"side_categories\">\n            <ul class=\"nav nav-list\">\n                \n                    <li>\n                        <a href=\"catalogue/category/books_1/index.html\">\n                            \n                                Books\n                            \n                        </a>\n\n                        <ul>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/travel_2/index.html\">\n                            \n                                Travel\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/mystery_3/index.html\">\n                            \n                                Mystery\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/historical-fiction_4/index.html\">\n                            \n                                Historical Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/sequential-art_5/index.html\">\n                       
     \n                                Sequential Art\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/classics_6/index.html\">\n                            \n                                Classics\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/philosophy_7/index.html\">\n                            \n                                Philosophy\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/romance_8/index.html\">\n                            \n                                Romance\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/womens-fiction_9/index.html\">\n                            \n                                Womens Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/fiction_10/index.html\">\n                            \n                                Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/childrens_11/index.html\">\n                            \n                                Childrens\n         
                   \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/religion_12/index.html\">\n                            \n                                Religion\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/nonfiction_13/index.html\">\n                            \n                                Nonfiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/music_14/index.html\">\n                            \n                                Music\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/default_15/index.html\">\n                            \n                                Default\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/science-fiction_16/index.html\">\n                            \n                                Science Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/sports-and-games_17/index.html\">\n                            \n                                Sports and Games\n                            \n                        
</a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/add-a-comment_18/index.html\">\n                            \n                                Add a comment\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/fantasy_19/index.html\">\n                            \n                                Fantasy\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/new-adult_20/index.html\">\n                            \n                                New Adult\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/young-adult_21/index.html\">\n                            \n                                Young Adult\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/science_22/index.html\">\n                            \n                                Science\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/poetry_23/index.html\">\n                            \n                                Poetry\n                            \n                        </a>\n\n                        </li>\n                        
\n                \n                    <li>\n                        <a href=\"catalogue/category/books/paranormal_24/index.html\">\n                            \n                                Paranormal\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/art_25/index.html\">\n                            \n                                Art\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/psychology_26/index.html\">\n                            \n                                Psychology\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/autobiography_27/index.html\">\n                            \n                                Autobiography\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/parenting_28/index.html\">\n                            \n                                Parenting\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/adult-fiction_29/index.html\">\n                            \n                                Adult Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n        
                <a href=\"catalogue/category/books/humor_30/index.html\">\n                            \n                                Humor\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/horror_31/index.html\">\n                            \n                                Horror\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/history_32/index.html\">\n                            \n                                History\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/food-and-drink_33/index.html\">\n                            \n                                Food and Drink\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/christian-fiction_34/index.html\">\n                            \n                                Christian Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/business_35/index.html\">\n                            \n                                Business\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a 
href=\"catalogue/category/books/biography_36/index.html\">\n                            \n                                Biography\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/thriller_37/index.html\">\n                            \n                                Thriller\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/contemporary_38/index.html\">\n                            \n                                Contemporary\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/spirituality_39/index.html\">\n                            \n                                Spirituality\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/academic_40/index.html\">\n                            \n                                Academic\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/self-help_41/index.html\">\n                            \n                                Self Help\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a 
href=\"catalogue/category/books/historical_42/index.html\">\n                            \n                                Historical\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/christian_43/index.html\">\n                            \n                                Christian\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/suspense_44/index.html\">\n                            \n                                Suspense\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/short-stories_45/index.html\">\n                            \n                                Short Stories\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/novels_46/index.html\">\n                            \n                                Novels\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/health_47/index.html\">\n                            \n                                Health\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/politics_48/index.html\">\n       
                     \n                                Politics\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/cultural_49/index.html\">\n                            \n                                Cultural\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/erotica_50/index.html\">\n                            \n                                Erotica\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"catalogue/category/books/crime_51/index.html\">\n                            \n                                Crime\n                            \n                        </a>\n\n                        </li>\n                        \n                            </ul></li>\n                        \n                \n            </ul>\n        </div>\n    \n    \n\n            </aside>\n\n            <div class=\"col-sm-8 col-md-9\">\n                \n                <div class=\"page-header action\">\n                    <h1>All products</h1>\n                </div>\n                \n\n                \n\n\n\n<div id=\"messages\">\n\n</div>\n\n\n                <div id=\"promotions\">\n                    \n                </div>\n\n                \n    <form method=\"get\" class=\"form-horizontal\">\n        \n        <div style=\"display:none\">\n            \n            \n        </div>\n\n        \n            \n                \n                    <strong>1000</strong> results - showing <strong>1</strong> to <strong>20</strong>.\n                \n            \n       
     \n        \n    </form>\n    \n        <section>\n            <div class=\"alert alert-warning\" role=\"alert\"><strong>Warning!</strong> This is a demo website for web scraping purposes. Prices and ratings here were randomly assigned and have no real meaning.</div>\n\n            <div>\n                <ol class=\"row\">\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/a-light-in-the-attic_1000/index.html\"><img src=\"media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg\" alt=\"A Light in the Attic\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/a-light-in-the-attic_1000/index.html\" title=\"A Light in the Attic\">A Light in the ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£51.77</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 
col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/tipping-the-velvet_999/index.html\"><img src=\"media/cache/26/0c/260c6ae16bce31c8f8c95daddd9f4a1c.jpg\" alt=\"Tipping the Velvet\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/tipping-the-velvet_999/index.html\" title=\"Tipping the Velvet\">Tipping the Velvet</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£53.74</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/soumission_998/index.html\"><img src=\"media/cache/3e/ef/3eef99c9d9adef34639f510662022830.jpg\" alt=\"Soumission\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n        
        <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/soumission_998/index.html\" title=\"Soumission\">Soumission</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£50.10</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/sharp-objects_997/index.html\"><img src=\"media/cache/32/51/3251cf3a3412f53f339e42cac2134093.jpg\" alt=\"Sharp Objects\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/sharp-objects_997/index.html\" title=\"Sharp Objects\">Sharp Objects</a></h3>\n        \n\n      
  \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£47.82</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/sapiens-a-brief-history-of-humankind_996/index.html\"><img src=\"media/cache/be/a5/bea5697f2534a2f86a3ef27b5a8c12a6.jpg\" alt=\"Sapiens: A Brief History of Humankind\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/sapiens-a-brief-history-of-humankind_996/index.html\" title=\"Sapiens: A Brief History of Humankind\">Sapiens: A Brief History ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£54.23</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn 
btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/the-requiem-red_995/index.html\"><img src=\"media/cache/68/33/68339b4c9bc034267e1da611ab3b34f8.jpg\" alt=\"The Requiem Red\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/the-requiem-red_995/index.html\" title=\"The Requiem Red\">The Requiem Red</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£22.65</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a 
href=\"catalogue/the-dirty-little-secrets-of-getting-your-dream-job_994/index.html\"><img src=\"media/cache/92/27/92274a95b7c251fea59a2b8a78275ab4.jpg\" alt=\"The Dirty Little Secrets of Getting Your Dream Job\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/the-dirty-little-secrets-of-getting-your-dream-job_994/index.html\" title=\"The Dirty Little Secrets of Getting Your Dream Job\">The Dirty Little Secrets ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£33.34</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/the-coming-woman-a-novel-based-on-the-life-of-the-infamous-feminist-victoria-woodhull_993/index.html\"><img src=\"media/cache/3d/54/3d54940e57e662c4dd1f3ff00c78cc64.jpg\" alt=\"The Coming Woman: A Novel Based on the Life of the Infamous Feminist, Victoria Woodhull\" 
class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/the-coming-woman-a-novel-based-on-the-life-of-the-infamous-feminist-victoria-woodhull_993/index.html\" title=\"The Coming Woman: A Novel Based on the Life of the Infamous Feminist, Victoria Woodhull\">The Coming Woman: A ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£17.93</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/the-boys-in-the-boat-nine-americans-and-their-epic-quest-for-gold-at-the-1936-berlin-olympics_992/index.html\"><img src=\"media/cache/66/88/66883b91f6804b2323c8369331cb7dd1.jpg\" alt=\"The Boys in the Boat: Nine Americans and Their Epic Quest for Gold at the 1936 Berlin Olympics\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p 
class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/the-boys-in-the-boat-nine-americans-and-their-epic-quest-for-gold-at-the-1936-berlin-olympics_992/index.html\" title=\"The Boys in the Boat: Nine Americans and Their Epic Quest for Gold at the 1936 Berlin Olympics\">The Boys in the ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£22.60</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/the-black-maria_991/index.html\"><img src=\"media/cache/58/46/5846057e28022268153beff6d352b06c.jpg\" alt=\"The Black Maria\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                
</p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/the-black-maria_991/index.html\" title=\"The Black Maria\">The Black Maria</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£52.15</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/starving-hearts-triangular-trade-trilogy-1_990/index.html\"><img src=\"media/cache/be/f4/bef44da28c98f905a3ebec0b87be8530.jpg\" alt=\"Starving Hearts (Triangular Trade Trilogy, #1)\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Two\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/starving-hearts-triangular-trade-trilogy-1_990/index.html\" title=\"Starving Hearts (Triangular Trade Trilogy, #1)\">Starving Hearts (Triangular Trade ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£13.99</p>\n    
\n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/shakespeares-sonnets_989/index.html\"><img src=\"media/cache/10/48/1048f63d3b5061cd2f424d20b3f9b666.jpg\" alt=\"Shakespeare's Sonnets\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/shakespeares-sonnets_989/index.html\" title=\"Shakespeare's Sonnets\">Shakespeare's Sonnets</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£20.66</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n             
           <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/set-me-free_988/index.html\"><img src=\"media/cache/5b/88/5b88c52633f53cacf162c15f4f823153.jpg\" alt=\"Set Me Free\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/set-me-free_988/index.html\" title=\"Set Me Free\">Set Me Free</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£17.46</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/scott-pilgrims-precious-little-life-scott-pilgrim-1_987/index.html\"><img src=\"media/cache/94/b1/94b1b8b244bce9677c2f29ccc890d4d2.jpg\" alt=\"Scott Pilgrim's Precious Little Life (Scott Pilgrim #1)\" class=\"thumbnail\"></a>\n  
                  \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/scott-pilgrims-precious-little-life-scott-pilgrim-1_987/index.html\" title=\"Scott Pilgrim's Precious Little Life (Scott Pilgrim #1)\">Scott Pilgrim's Precious Little ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£52.29</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/rip-it-up-and-start-again_986/index.html\"><img src=\"media/cache/81/c4/81c4a973364e17d01f217e1188253d5e.jpg\" alt=\"Rip it Up and Start Again\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n  
                  <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/rip-it-up-and-start-again_986/index.html\" title=\"Rip it Up and Start Again\">Rip it Up and ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£35.02</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/our-band-could-be-your-life-scenes-from-the-american-indie-underground-1981-1991_985/index.html\"><img src=\"media/cache/54/60/54607fe8945897cdcced0044103b10b6.jpg\" alt=\"Our Band Could Be Your Life: Scenes from the American Indie Underground, 1981-1991\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/our-band-could-be-your-life-scenes-from-the-american-indie-underground-1981-1991_985/index.html\" title=\"Our Band Could Be Your Life: Scenes from the American 
Indie Underground, 1981-1991\">Our Band Could Be ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£57.25</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/olio_984/index.html\"><img src=\"media/cache/55/33/553310a7162dfbc2c6d19a84da0df9e1.jpg\" alt=\"Olio\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/olio_984/index.html\" title=\"Olio\">Olio</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£23.88</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to 
basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/mesaerion-the-best-science-fiction-stories-1800-1849_983/index.html\"><img src=\"media/cache/09/a3/09a3aef48557576e1a85ba7efea8ecb7.jpg\" alt=\"Mesaerion: The Best Science Fiction Stories 1800-1849\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/mesaerion-the-best-science-fiction-stories-1800-1849_983/index.html\" title=\"Mesaerion: The Best Science Fiction Stories 1800-1849\">Mesaerion: The Best Science ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£37.59</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div 
class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/libertarianism-for-beginners_982/index.html\"><img src=\"media/cache/0b/bc/0bbcd0a6f4bcd81ccb1049a52736406e.jpg\" alt=\"Libertarianism for Beginners\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Two\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/libertarianism-for-beginners_982/index.html\" title=\"Libertarianism for Beginners\">Libertarianism for Beginners</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£51.33</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"catalogue/its-only-the-himalayas_981/index.html\"><img src=\"media/cache/27/a5/27a53d0bb95bdd88288eaf66c9230d7e.jpg\" alt=\"It's Only the Himalayas\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p 
class=\"star-rating Two\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"catalogue/its-only-the-himalayas_981/index.html\" title=\"It's Only the Himalayas\">It's Only the Himalayas</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£45.17</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                </ol>\n                \n\n\n\n    <div>\n        <ul class=\"pager\">\n            \n            <li class=\"current\">\n            \n                Page 1 of 50\n            \n            </li>\n            \n                <li class=\"next\"><a href=\"catalogue/page-2.html\">next</a></li>\n            \n        </ul>\n    </div>\n\n\n            </div>\n        </section>\n    \n\n\n            </div>\n\n        </div><!-- /row -->\n    </div><!-- /page_inner -->\n</div><!-- /container-fluid -->\n\n\n    \n<footer class=\"footer container-fluid\">\n    \n        \n    \n</footer>\n\n\n        \n        \n  \n            <!-- jQuery -->\n            <script src=\"http://ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js\"></script>\n            <script>window.jQuery || document.write('<script src=\"static/oscar/js/jquery/jquery-1.9.1.min.js\"><\\/script>')</script><script 
src=\"static/oscar/js/jquery/jquery-1.9.1.min.js\"></script>\n        \n  \n\n\n        \n        \n    \n        \n    <script type=\"text/javascript\" src=\"static/oscar/js/bootstrap3/bootstrap.min.js\"></script>\n    <!-- Oscar -->\n    <script src=\"static/oscar/js/oscar/ui.js\" type=\"text/javascript\" charset=\"utf-8\"></script>\n\n    <script src=\"static/oscar/js/bootstrap-datetimepicker/bootstrap-datetimepicker.js\" type=\"text/javascript\" charset=\"utf-8\"></script>\n    <script src=\"static/oscar/js/bootstrap-datetimepicker/locales/bootstrap-datetimepicker.all.js\" type=\"text/javascript\" charset=\"utf-8\"></script>\n\n\n        \n        \n    \n\n    \n\n\n        \n        <script type=\"text/javascript\">\n            $(function() {\n                \n    \n    \n    oscar.init();\n\n    oscar.search.init();\n\n            });\n        </script>\n\n        \n        <!-- Version: N/A -->\n        \n    \n\n</body></html>", "echoData": 2}
{"url": "https://toscrape.com/", "statusCode": 200, "browserHtml": "<!DOCTYPE html><html lang=\"en\"><head>\n        <meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">\n        <title>Scraping Sandbox</title>\n        <link href=\"./css/bootstrap.min.css\" rel=\"stylesheet\">\n        <link href=\"./css/main.css\" rel=\"stylesheet\">\n    </head>\n    <body>\n        <div class=\"container\">\n            <div class=\"row\">\n                <div class=\"col-md-1\"></div>\n                <div class=\"col-md-10 well\">\n                    <img class=\"logo\" src=\"img/zyte.png\" width=\"200px\">\n                    <h1 class=\"text-right\">Web Scraping Sandbox</h1>\n                </div>\n            </div>\n\n            <div class=\"row\">\n                <div class=\"col-md-1\"></div>\n                <div class=\"col-md-10\">\n                    <h2>Books</h2>\n                    <p>A <a href=\"http://books.toscrape.com\">fictional bookstore</a> that desperately wants to be scraped. It's a safe place for beginners learning web scraping and for developers validating their scraping technologies as well. 
Available at: <a href=\"http://books.toscrape.com\">books.toscrape.com</a></p>\n                    <div class=\"col-md-6\">\n                        <a href=\"http://books.toscrape.com\"><img src=\"./img/books.png\" class=\"img-thumbnail\"></a>\n                    </div>\n                    <div class=\"col-md-6\">\n                        <table class=\"table table-hover\">\n                            <tbody><tr><th colspan=\"2\">Details</th></tr>\n                            <tr><td>Amount of items </td><td>1000</td></tr>\n                            <tr><td>Pagination </td><td>✔</td></tr>\n                            <tr><td>Items per page </td><td>max 20</td></tr>\n                            <tr><td>Requires JavaScript </td><td>✘</td></tr>\n                        </tbody></table>\n                    </div>\n                </div>\n            </div>\n\n            <div class=\"row\">\n                <div class=\"col-md-1\"></div>\n                <div class=\"col-md-10\">\n                    <h2>Quotes</h2>\n                    <p><a href=\"http://quotes.toscrape.com/\">A website</a> that lists quotes from famous people. 
It has many endpoints showing the quotes in many different ways, each of them including new scraping challenges for you, as described below.</p>\n                    <div class=\"col-md-6\">\n                        <a href=\"http://quotes.toscrape.com\"><img src=\"./img/quotes.png\" class=\"img-thumbnail\"></a>\n                    </div>\n                    <div class=\"col-md-6\">\n                        <table class=\"table table-hover\">\n                            <tbody><tr><th colspan=\"2\">Endpoints</th></tr>\n                            <tr><td><a href=\"http://quotes.toscrape.com/\">Default</a></td><td>Microdata and pagination</td></tr>\n                            <tr><td><a href=\"http://quotes.toscrape.com/scroll\">Scroll</a> </td><td>infinite scrolling pagination</td></tr>\n                            <tr><td><a href=\"http://quotes.toscrape.com/js\">JavaScript</a> </td><td>JavaScript generated content</td></tr>\n                            <tr><td><a href=\"http://quotes.toscrape.com/js-delayed\">Delayed</a> </td><td>Same as JavaScript but with a delay (?delay=10000)</td></tr>\n                            <tr><td><a href=\"http://quotes.toscrape.com/tableful\">Tableful</a> </td><td>a table based messed-up layout</td></tr>\n                            <tr><td><a href=\"http://quotes.toscrape.com/login\">Login</a> </td><td>login with CSRF token (any user/passwd works)</td></tr>\n                            <tr><td><a href=\"http://quotes.toscrape.com/search.aspx\">ViewState</a> </td><td>an AJAX based filter form with ViewStates</td></tr>\n                            <tr><td><a href=\"http://quotes.toscrape.com/random\">Random</a> </td><td>a single random quote</td></tr>\n                        </tbody></table>\n                    </div>\n                </div>\n            </div>\n        </div>\n    \n\n</body></html>", "echoData": 1}
```

#### Sending a POST request

> ###### TIP
>
> For a more complete example featuring a request body and headers,
> see the HTML form example.

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://httpbin.org/anything"},
    {"httpResponseBody", true},
    {"httpRequestMethod", "POST"}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBody = System.Convert.FromBase64String(base64HttpResponseBody);

var responseData = JsonDocument.Parse(httpResponseBody);
var method = responseData.RootElement.GetProperty("method").ToString();
```

#### CLI client

input.jsonl
```json
{"url": "https://httpbin.org/anything", "httpResponseBody": true, "httpRequestMethod": "POST"}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq .method
```

#### curl

input.json
```json
{
    "url": "https://httpbin.org/anything",
    "httpResponseBody": true,
    "httpRequestMethod": "POST"
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq .method
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://httpbin.org/anything",
            "httpResponseBody",
            true,
            "httpRequestMethod",
            "POST");
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          JsonObject data = JsonParser.parseString(httpResponseBody).getAsJsonObject();
          String method = data.get("method").getAsString();
          System.out.println(method);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes(StandardCharsets.UTF_8));
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://httpbin.org/anything',
    httpResponseBody: true,
    httpRequestMethod: 'POST'
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  const method = JSON.parse(httpResponseBody).method
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://httpbin.org/anything',
        'httpResponseBody' => true,
        'httpRequestMethod' => 'POST',
    ],
]);
$data = json_decode($response->getBody());
$http_response_body = base64_decode($data->httpResponseBody);
$method = json_decode($http_response_body)->method;
```

#### Proxy mode

In proxy mode, Zyte API automatically uses the HTTP method of the request
that you send through the proxy.

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    -X POST \
    https://httpbin.org/anything \
    | jq .method
```
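As a rough Python counterpart to the curl command above, the sketch below builds a proxy mapping that mirrors `--proxy api.zyte.com:8011` and `--proxy-user YOUR_ZYTE_API_KEY:`. The helper name `build_proxy_settings` is hypothetical, and the `http://` scheme is an assumption based on curl treating a proxy address without a scheme as an HTTP proxy.

```python
from urllib.parse import quote


def build_proxy_settings(api_key: str, endpoint: str = "api.zyte.com:8011") -> dict:
    """Build a proxy mapping equivalent to the curl proxy options above.

    Hypothetical helper for illustration; it only formats strings and
    sends no request.
    """
    # The API key is the proxy username and the password is left empty,
    # matching `--proxy-user YOUR_ZYTE_API_KEY:` in the curl command.
    # Assumption: the proxy is addressed with the http:// scheme.
    proxy_url = f"http://{quote(api_key, safe='')}:@{endpoint}"
    # Route both plain HTTP and HTTPS traffic through the same proxy.
    return {"http": proxy_url, "https": proxy_url}


proxies = build_proxy_settings("YOUR_ZYTE_API_KEY")
print(proxies["https"])  # prints: http://YOUR_ZYTE_API_KEY:@api.zyte.com:8011
```

With the `requests` library, a mapping like this can be passed as the `proxies` argument of `requests.post("https://httpbin.org/anything", ...)`, along with `verify` set to the location of the Zyte CA certificate on your system.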

#### Python

```python
import json
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://httpbin.org/anything",
        "httpResponseBody": True,
        "httpRequestMethod": "POST",
    },
)
http_response_body = b64decode(api_response.json()["httpResponseBody"])
method = json.loads(http_response_body)["method"]
```

#### Python client

```python
import asyncio
import json
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://httpbin.org/anything",
            "httpResponseBody": True,
            "httpRequestMethod": "POST",
        }
    )
    http_response_body: bytes = b64decode(api_response["httpResponseBody"])
    method = json.loads(http_response_body)["method"]
    print(method)

asyncio.run(main())
```

#### Scrapy

```python
import json

from scrapy import Request, Spider

class HTTPBinOrgSpider(Spider):
    name = "httpbin_org"

    async def start(self):
        yield Request(
            "https://httpbin.org/anything",
            method="POST",
        )

    def parse(self, response):
        method = json.loads(response.text)["method"]
```

Output:

```json
"POST"
```
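Most of the examples above repeat the same decoding steps: read `httpResponseBody` from the API response, Base64-decode it, parse the result as JSON, and read its `method` key. Those steps can be sketched as a small helper. The function name `extract_method` and the locally built `sample_api_response` are hypothetical stand-ins, so the snippet runs without an API key or network access.

```python
import json
from base64 import b64decode, b64encode


def extract_method(api_response: dict) -> str:
    """Return the HTTP method echoed back by httpbin.org/anything."""
    # httpResponseBody arrives Base64-encoded, so decode it to bytes first.
    http_response_body = b64decode(api_response["httpResponseBody"])
    # httpbin.org/anything replies with JSON that includes a "method" key.
    return json.loads(http_response_body)["method"]


# Hypothetical stand-in for a real Zyte API response, built locally for
# illustration only.
sample_api_response = {
    "url": "https://httpbin.org/anything",
    "httpResponseBody": b64encode(b'{"method": "POST"}').decode("ascii"),
}

print(extract_method(sample_api_response))  # prints: POST
```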

#### Using network capture to intercept background requests sent during browser rendering

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://quotes.toscrape.com/scroll"},
    {"browserHtml", true},
    {
        "networkCapture",
        new List<Dictionary<string, object>>()
        {
            new Dictionary<string, object>()
            {
                {"filterType", "url"},
                {"httpResponseBody", true},
                {"value", "/api/"},
                {"matchType", "contains"}
            }
        }
    }
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var apiBody = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(apiBody);
var captureEnumerator = data.RootElement.GetProperty("networkCapture").EnumerateArray();
captureEnumerator.MoveNext();
var capture = captureEnumerator.Current;
var base64Body = capture.GetProperty("httpResponseBody").ToString();
var body = System.Convert.FromBase64String(base64Body);

var captureData = JsonDocument.Parse(body);
var quoteEnumerator = captureData.RootElement.GetProperty("quotes").EnumerateArray();
quoteEnumerator.MoveNext();
var quote = quoteEnumerator.Current;
var authorEnumerator = quote.GetProperty("author").EnumerateObject();
while (authorEnumerator.MoveNext())
{
    if (authorEnumerator.Current.Name.ToString() == "name")
    {
        Console.WriteLine(authorEnumerator.Current.Value.ToString());
        break;
    }
}
```

#### CLI client

input.jsonl
```json
{"url": "https://quotes.toscrape.com/scroll", "browserHtml": true, "networkCapture": [{"filterType": "url", "httpResponseBody": true, "value": "/api/", "matchType": "contains"}]}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output ".networkCapture[0].httpResponseBody" \
    | base64 --decode \
    | jq --raw-output ".quotes[0].author.name"
```

#### curl

input.json
```json
{
    "url": "https://quotes.toscrape.com/scroll",
    "browserHtml": true,
    "networkCapture": [
        {
            "filterType": "url",
            "httpResponseBody": true,
            "value": "/api/",
            "matchType": "contains"
        }
    ]
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output ".networkCapture[0].httpResponseBody" \
    | base64 --decode \
    | jq --raw-output ".quotes[0].author.name"
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonArray;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Collections;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {

  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> filter =
        ImmutableMap.of(
            "filterType",
            "url",
            "httpResponseBody",
            true,
            "value",
            "/api/",
            "matchType",
            "contains");
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://quotes.toscrape.com/scroll",
            "browserHtml",
            true,
            "networkCapture",
            Collections.singletonList(filter));
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          JsonArray captures = jsonObject.get("networkCapture").getAsJsonArray();
          JsonObject capture = captures.get(0).getAsJsonObject();
          byte[] bodyBytes =
              Base64.getDecoder().decode(capture.get("httpResponseBody").getAsString());
          String body = new String(bodyBytes, StandardCharsets.UTF_8);
          JsonObject data = JsonParser.parseString(body).getAsJsonObject();
          JsonObject quote = data.get("quotes").getAsJsonArray().get(0).getAsJsonObject();
          String authorName = quote.get("author").getAsJsonObject().get("name").getAsString();
          System.out.println(authorName);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes(StandardCharsets.UTF_8));
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://quotes.toscrape.com/scroll',
    browserHtml: true,
    networkCapture: [
      {
        filterType: 'url',
        httpResponseBody: true,
        value: '/api/',
        matchType: 'contains'
      }
    ]
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const capture = response.data.networkCapture[0]
  const data = JSON.parse(Buffer.from(capture.httpResponseBody, 'base64'))
  console.log(data.quotes[0].author.name)
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://quotes.toscrape.com/scroll',
        'browserHtml' => true,
        'networkCapture' => [
            [
                'filterType' => 'url',
                'httpResponseBody' => true,
                'value' => '/api/',
                'matchType' => 'contains',
            ],
        ],
    ],
]);
$api_response = json_decode($response->getBody());
$capture = $api_response->networkCapture[0];
$data = json_decode(base64_decode($capture->httpResponseBody));
echo $data->quotes[0]->author->name.PHP_EOL;
```

#### Python

```python
import json
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://quotes.toscrape.com/scroll",
        "browserHtml": True,
        "networkCapture": [
            {
                "filterType": "url",
                "httpResponseBody": True,
                "value": "/api/",
                "matchType": "contains",
            },
        ],
    },
)
capture = api_response.json()["networkCapture"][0]
data = json.loads(b64decode(capture["httpResponseBody"]).decode())
print(data["quotes"][0]["author"]["name"])
```

#### Python client

```python
import asyncio
import json
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://quotes.toscrape.com/scroll",
            "browserHtml": True,
            "networkCapture": [
                {
                    "filterType": "url",
                    "httpResponseBody": True,
                    "value": "/api/",
                    "matchType": "contains",
                },
            ],
        },
    )
    capture = api_response["networkCapture"][0]
    data = json.loads(b64decode(capture["httpResponseBody"]).decode())
    print(data["quotes"][0]["author"]["name"])

asyncio.run(main())
```

#### Scrapy

```python
import json
from base64 import b64decode

from scrapy import Request, Spider

class QuotesToScrapeComSpider(Spider):
    name = "quotes_toscrape_com"

    async def start(self):
        yield Request(
            "https://quotes.toscrape.com/scroll",
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                    "networkCapture": [
                        {
                            "filterType": "url",
                            "httpResponseBody": True,
                            "value": "/api/",
                            "matchType": "contains",
                        },
                    ],
                },
            },
        )

    def parse(self, response):
        capture = response.raw_api_response["networkCapture"][0]
        data = json.loads(b64decode(capture["httpResponseBody"]).decode())
        print(data["quotes"][0]["author"]["name"])
```

Output:

```none
Albert Einstein
```
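The examples above read only the first captured response. As a minimal offline sketch, the helper below decodes every `networkCapture` entry that carries a Base64-encoded `httpResponseBody` and parses it as JSON. The function name `decode_captures` and the sample payload are hypothetical, built only to mirror the response shape used above. Bodies that are not valid JSON are skipped, which is an assumption about what you would want rather than documented behavior.

```python
import json
from base64 import b64decode, b64encode


def decode_captures(api_response):
    """Return the JSON-decoded body of every capture that has one."""
    decoded = []
    for capture in api_response.get("networkCapture", []):
        body = capture.get("httpResponseBody")
        if body is None:
            continue
        try:
            decoded.append(json.loads(b64decode(body).decode()))
        except ValueError:
            # Covers invalid Base64, non-UTF-8 bytes, and non-JSON text.
            continue
    return decoded


# Hypothetical sample that imitates the shape of a Zyte API response.
sample_body = json.dumps({"quotes": [{"author": {"name": "Albert Einstein"}}]})
sample_response = {
    "networkCapture": [
        {"httpResponseBody": b64encode(sample_body.encode()).decode()},
    ]
}

for data in decode_captures(sample_response):
    print(data["quotes"][0]["author"]["name"])
```

With a real response, you would pass `api_response.json()` (or the dictionary returned by the Python client) instead of `sample_response`.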

#### Sending multiple requests in parallel

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

var urls = new string[2];
urls[0] = "https://books.toscrape.com/catalogue/page-1.html";
urls[1] = "https://books.toscrape.com/catalogue/page-2.html";
var output = new List<HttpResponseMessage>();

var handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All,
    MaxConnectionsPerServer = 15
};
var client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var responseTasks = new List<Task<HttpResponseMessage>>();
foreach (var url in urls)
{
    var input = new Dictionary<string, object>(){
        {"url", url},
        {"browserHtml", true}
    };
    var inputJson = JsonSerializer.Serialize(input);
    var content = new StringContent(inputJson, Encoding.UTF8, "application/json");
    var responseTask = client.PostAsync("https://api.zyte.com/v1/extract", content);
    responseTasks.Add(responseTask);
}

while (responseTasks.Any())
{
    var responseTask = await Task.WhenAny(responseTasks);
    responseTasks.Remove(responseTask);
    var response = await responseTask;
    output.Add(response);
}
```

#### CLI client

input.jsonl
```json
{"url": "https://books.toscrape.com/catalogue/page-1.html", "browserHtml": true}
{"url": "https://books.toscrape.com/catalogue/page-2.html", "browserHtml": true}
```

```shell
zyte-api --n-conn 15 input.jsonl -o output.jsonl
```

#### curl

input.jsonl
```json
{"url": "https://books.toscrape.com/catalogue/page-1.html", "browserHtml": true}
{"url": "https://books.toscrape.com/catalogue/page-2.html", "browserHtml": true}
```

```shell
cat input.jsonl \
    | xargs -P 15 -d\\n -n 1 \
    bash -c "
        curl \
            --user YOUR_ZYTE_API_KEY: \
            --header 'Content-Type: application/json' \
            --data \"\$0\" \
            --compressed \
            https://api.zyte.com/v1/extract \
        | awk '{print \$0}' \
        >> output.jsonl
"
```
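Both the CLI client and curl examples above read their queries from `input.jsonl`, one JSON object per line. If your URL list lives in code rather than in a file, the following sketch shows one way to produce that format. The helper name `to_jsonl` is hypothetical, and the URLs are the same sample pages used above.

```python
import json


def to_jsonl(urls, **params):
    """Serialize one query per line, adding the same parameters to each."""
    return "".join(json.dumps({"url": url, **params}) + "\n" for url in urls)


urls = [f"https://books.toscrape.com/catalogue/page-{n}.html" for n in (1, 2)]
jsonl = to_jsonl(urls, browserHtml=True)
print(jsonl, end="")
```

Writing the resulting string to a file named `input.jsonl` gives the same content as the file shown above.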

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import org.apache.hc.client5.http.async.methods.SimpleHttpRequest;
import org.apache.hc.client5.http.async.methods.SimpleHttpResponse;
import org.apache.hc.client5.http.impl.async.CloseableHttpAsyncClient;
import org.apache.hc.client5.http.impl.async.HttpAsyncClients;
import org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager;
import org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManagerBuilder;
import org.apache.hc.client5.http.ssl.ClientTlsStrategyBuilder;
import org.apache.hc.core5.concurrent.FutureCallback;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.nio.ssl.TlsStrategy;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws ExecutionException, InterruptedException, IOException, ParseException {

    String[] urls = {
      "https://books.toscrape.com/catalogue/page-1.html",
      "https://books.toscrape.com/catalogue/page-2.html"
    };
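    List<Future<SimpleHttpResponse>> futures = new ArrayList<>();
    // Callbacks run on the client's I/O threads, so the shared list must be thread-safe.
    List<String> output = java.util.Collections.synchronizedList(new ArrayList<>());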

    int concurrency = 15;

    // https://issues.apache.org/jira/browse/HTTPCLIENT-2219
    final TlsStrategy tlsStrategy = ClientTlsStrategyBuilder.create().useSystemProperties().build();

    PoolingAsyncClientConnectionManager connectionManager =
        PoolingAsyncClientConnectionManagerBuilder.create().setTlsStrategy(tlsStrategy).build();
    connectionManager.setMaxTotal(concurrency);
    connectionManager.setDefaultMaxPerRoute(concurrency);

    CloseableHttpAsyncClient client =
        HttpAsyncClients.custom().setConnectionManager(connectionManager).build();
    try {
      client.start();
      for (int i = 0; i < urls.length; i++) {
        Map<String, Object> parameters = ImmutableMap.of("url", urls[i], "browserHtml", true);
        String requestBody = new Gson().toJson(parameters);

        SimpleHttpRequest request =
            new SimpleHttpRequest("POST", "https://api.zyte.com/v1/extract");
        request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
        request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
        request.setBody(requestBody, ContentType.APPLICATION_JSON);

        final Future<SimpleHttpResponse> future =
            client.execute(
                request,
                new FutureCallback<SimpleHttpResponse>() {
                  public void completed(final SimpleHttpResponse response) {
                    String apiResponse = response.getBodyText();
                    JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
                    String browserHtml = jsonObject.get("browserHtml").getAsString();
                    output.add(browserHtml);
                  }

                  public void failed(final Exception ex) {}

                  public void cancelled() {}
                });
        futures.add(future);
      }
      for (int i = 0; i < futures.size(); i++) {
        futures.get(i).get();
      }
    } finally {
      client.close();
    }
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const { ConcurrencyManager } = require('axios-concurrency')
const axios = require('axios')

const urls = [
  'https://books.toscrape.com/catalogue/page-1.html',
  'https://books.toscrape.com/catalogue/page-2.html'
]
const output = []

const client = axios.create()
ConcurrencyManager(client, 15)

Promise.all(
  urls.map((url) =>
    client.post(
      'https://api.zyte.com/v1/extract',
      { url, browserHtml: true },
      {
        auth: { username: 'YOUR_ZYTE_API_KEY' }
      }
    ).then((response) => output.push(response.data))
  )
)
```

#### PHP

```php
<?php

$urls = [
  'https://books.toscrape.com/catalogue/page-1.html',
  'https://books.toscrape.com/catalogue/page-2.html',
];
$output = [];
$promises = [];

$client = new GuzzleHttp\Client();

foreach ($urls as $url) {
    $options = [
        'auth' => ['YOUR_ZYTE_API_KEY', ''],
        'headers' => ['Accept-Encoding' => 'gzip'],
        'json' => [
            'url' => $url,
            'browserHtml' => true,
        ],
    ];
    $request = new \GuzzleHttp\Psr7\Request('POST', 'https://api.zyte.com/v1/extract');
    // Capture $output by reference so that the callback works in any scope.
    $promises[] = $client->sendAsync($request, $options)->then(function ($response) use (&$output) {
        $output[] = json_decode($response->getBody());
    });
}

foreach ($promises as $promise) {
    $promise->wait();
}
```

#### Python

```python
import asyncio

import aiohttp

urls = [
    "https://books.toscrape.com/catalogue/page-1.html",
    "https://books.toscrape.com/catalogue/page-2.html",
]
output = []

async def extract(client, url):
    response = await client.post(
        "https://api.zyte.com/v1/extract",
        json={"url": url, "browserHtml": True},
        auth=aiohttp.BasicAuth("YOUR_ZYTE_API_KEY"),
    )
    output.append(await response.json())

async def main():
    connector = aiohttp.TCPConnector(limit_per_host=15)
    async with aiohttp.ClientSession(connector=connector) as client:
        await asyncio.gather(*[extract(client, url) for url in urls])

asyncio.run(main())
```

#### Python client

```python
import asyncio

from zyte_api import AsyncZyteAPI

urls = [
    "https://books.toscrape.com/catalogue/page-1.html",
    "https://books.toscrape.com/catalogue/page-2.html",
]

async def main():
    client = AsyncZyteAPI(n_conn=15)
    queries = [{"url": url, "browserHtml": True} for url in urls]
    async with client.session() as session:
        for future in session.iter(queries):
            response = await future
            print(response)

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

urls = [
    "https://books.toscrape.com/catalogue/page-1.html",
    "https://books.toscrape.com/catalogue/page-2.html",
]

class ToScrapeSpider(Spider):
    name = "toscrape_com"

    custom_settings = {
        "CONCURRENT_REQUESTS": 15,
        "CONCURRENT_REQUESTS_PER_DOMAIN": 15,
    }

    async def start(self):
        for url in urls:
            yield Request(
                url,
                meta={
                    "zyte_api_automap": {
                        "browserHtml": True,
                    },
                },
            )

    def parse(self, response):
        yield {
            "url": response.url,
            "browserHtml": response.text,
        }
```

Output:

```json
{"url": "https://books.toscrape.com/catalogue/page-1.html", "statusCode": 200, "browserHtml": "<!DOCTYPE html><!--[if lt IE 7]>      <html lang=\"en-us\" class=\"no-js lt-ie9 lt-ie8 lt-ie7\"> <![endif]--><!--[if IE 7]>         <html lang=\"en-us\" class=\"no-js lt-ie9 lt-ie8\"> <![endif]--><!--[if IE 8]>         <html lang=\"en-us\" class=\"no-js lt-ie9\"> <![endif]--><!--[if gt IE 8]><!--><html lang=\"en-us\" class=\"no-js\"><!--<![endif]--><head>\n        <title>\n    All products | Books to Scrape - Sandbox\n</title>\n\n        <meta http-equiv=\"content-type\" content=\"text/html; charset=UTF-8\">\n        <meta name=\"created\" content=\"24th Jun 2016 09:30\">\n        <meta name=\"description\" content=\"\">\n        <meta name=\"viewport\" content=\"width=device-width\">\n        <meta name=\"robots\" content=\"NOARCHIVE,NOCACHE\">\n\n        <!-- Le HTML5 shim, for IE6-8 support of HTML elements -->\n        <!--[if lt IE 9]>\n        <script src=\"//html5shim.googlecode.com/svn/trunk/html5.js\"></script>\n        <![endif]-->\n\n        \n            <link rel=\"shortcut icon\" href=\"../static/oscar/favicon.ico\">\n        \n\n        \n        \n    \n    \n        <link rel=\"stylesheet\" type=\"text/css\" href=\"../static/oscar/css/styles.css\">\n    \n    <link rel=\"stylesheet\" href=\"../static/oscar/js/bootstrap-datetimepicker/bootstrap-datetimepicker.css\">\n    <link rel=\"stylesheet\" type=\"text/css\" href=\"../static/oscar/css/datetimepicker.css\">\n\n\n        \n        \n\n        \n\n        \n            \n            \n\n        \n    </head>\n\n    <body id=\"default\" class=\"default\">\n        \n        \n    \n    \n    <header class=\"header container-fluid\">\n        <div class=\"page_inner\">\n            <div class=\"row\">\n                <div class=\"col-sm-8 h1\"><a href=\"../index.html\">Books to Scrape</a><small> We love being scraped!</small>\n</div>\n\n                \n            </div>\n        </div>\n    
</header>\n\n    \n    \n<div class=\"container-fluid page\">\n    <div class=\"page_inner\">\n        \n    <ul class=\"breadcrumb\">\n        <li>\n            <a href=\"../index.html\">Home</a>\n        </li>\n        <li class=\"active\">All products</li>\n    </ul>\n\n        <div class=\"row\">\n\n            <aside class=\"sidebar col-sm-4 col-md-3\">\n                \n                <div id=\"promotions_left\">\n                    \n                </div>\n                \n    \n    \n        \n        <div class=\"side_categories\">\n            <ul class=\"nav nav-list\">\n                \n                    <li>\n                        <a href=\"category/books_1/index.html\">\n                            \n                                Books\n                            \n                        </a>\n\n                        <ul>\n                        \n                \n                    <li>\n                        <a href=\"category/books/travel_2/index.html\">\n                            \n                                Travel\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/mystery_3/index.html\">\n                            \n                                Mystery\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/historical-fiction_4/index.html\">\n                            \n                                Historical Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/sequential-art_5/index.html\">\n                            \n          
                      Sequential Art\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/classics_6/index.html\">\n                            \n                                Classics\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/philosophy_7/index.html\">\n                            \n                                Philosophy\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/romance_8/index.html\">\n                            \n                                Romance\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/womens-fiction_9/index.html\">\n                            \n                                Womens Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/fiction_10/index.html\">\n                            \n                                Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/childrens_11/index.html\">\n                            \n                                Childrens\n                            \n                        </a>\n\n                        
</li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/religion_12/index.html\">\n                            \n                                Religion\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/nonfiction_13/index.html\">\n                            \n                                Nonfiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/music_14/index.html\">\n                            \n                                Music\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/default_15/index.html\">\n                            \n                                Default\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/science-fiction_16/index.html\">\n                            \n                                Science Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/sports-and-games_17/index.html\">\n                            \n                                Sports and Games\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a 
href=\"category/books/add-a-comment_18/index.html\">\n                            \n                                Add a comment\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/fantasy_19/index.html\">\n                            \n                                Fantasy\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/new-adult_20/index.html\">\n                            \n                                New Adult\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/young-adult_21/index.html\">\n                            \n                                Young Adult\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/science_22/index.html\">\n                            \n                                Science\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/poetry_23/index.html\">\n                            \n                                Poetry\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/paranormal_24/index.html\">\n                            \n                                
Paranormal\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/art_25/index.html\">\n                            \n                                Art\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/psychology_26/index.html\">\n                            \n                                Psychology\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/autobiography_27/index.html\">\n                            \n                                Autobiography\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/parenting_28/index.html\">\n                            \n                                Parenting\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/adult-fiction_29/index.html\">\n                            \n                                Adult Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/humor_30/index.html\">\n                            \n                                Humor\n                            \n                        </a>\n\n                        </li>\n                   
     \n                \n                    <li>\n                        <a href=\"category/books/horror_31/index.html\">\n                            \n                                Horror\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/history_32/index.html\">\n                            \n                                History\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/food-and-drink_33/index.html\">\n                            \n                                Food and Drink\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/christian-fiction_34/index.html\">\n                            \n                                Christian Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/business_35/index.html\">\n                            \n                                Business\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/biography_36/index.html\">\n                            \n                                Biography\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a 
href=\"category/books/thriller_37/index.html\">\n                            \n                                Thriller\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/contemporary_38/index.html\">\n                            \n                                Contemporary\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/spirituality_39/index.html\">\n                            \n                                Spirituality\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/academic_40/index.html\">\n                            \n                                Academic\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/self-help_41/index.html\">\n                            \n                                Self Help\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/historical_42/index.html\">\n                            \n                                Historical\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/christian_43/index.html\">\n                            \n                                
Christian\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/suspense_44/index.html\">\n                            \n                                Suspense\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/short-stories_45/index.html\">\n                            \n                                Short Stories\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/novels_46/index.html\">\n                            \n                                Novels\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/health_47/index.html\">\n                            \n                                Health\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/politics_48/index.html\">\n                            \n                                Politics\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/cultural_49/index.html\">\n                            \n                                Cultural\n                            \n                        </a>\n\n                        </li>\n                        \n  
              \n                    <li>\n                        <a href=\"category/books/erotica_50/index.html\">\n                            \n                                Erotica\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/crime_51/index.html\">\n                            \n                                Crime\n                            \n                        </a>\n\n                        </li>\n                        \n                            </ul></li>\n                        \n                \n            </ul>\n        </div>\n    \n    \n\n            </aside>\n\n            <div class=\"col-sm-8 col-md-9\">\n                \n                <div class=\"page-header action\">\n                    <h1>All products</h1>\n                </div>\n                \n\n                \n\n\n\n<div id=\"messages\">\n\n</div>\n\n\n                <div id=\"promotions\">\n                    \n                </div>\n\n                \n    <form method=\"get\" class=\"form-horizontal\">\n        \n        <div style=\"display:none\">\n            \n            \n        </div>\n\n        \n            \n                \n                    <strong>1000</strong> results - showing <strong>1</strong> to <strong>20</strong>.\n                \n            \n            \n        \n    </form>\n    \n        <section>\n            <div class=\"alert alert-warning\" role=\"alert\"><strong>Warning!</strong> This is a demo website for web scraping purposes. 
Prices and ratings here were randomly assigned and have no real meaning.</div>\n\n            <div>\n                <ol class=\"row\">\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"a-light-in-the-attic_1000/index.html\"><img src=\"../media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg\" alt=\"A Light in the Attic\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"a-light-in-the-attic_1000/index.html\" title=\"A Light in the Attic\">A Light in the ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£51.77</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a 
href=\"tipping-the-velvet_999/index.html\"><img src=\"../media/cache/26/0c/260c6ae16bce31c8f8c95daddd9f4a1c.jpg\" alt=\"Tipping the Velvet\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"tipping-the-velvet_999/index.html\" title=\"Tipping the Velvet\">Tipping the Velvet</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£53.74</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"soumission_998/index.html\"><img src=\"../media/cache/3e/ef/3eef99c9d9adef34639f510662022830.jpg\" alt=\"Soumission\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                  
  <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"soumission_998/index.html\" title=\"Soumission\">Soumission</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£50.10</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"sharp-objects_997/index.html\"><img src=\"../media/cache/32/51/3251cf3a3412f53f339e42cac2134093.jpg\" alt=\"Sharp Objects\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"sharp-objects_997/index.html\" title=\"Sharp Objects\">Sharp Objects</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£47.82</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    
\n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"sapiens-a-brief-history-of-humankind_996/index.html\"><img src=\"../media/cache/be/a5/bea5697f2534a2f86a3ef27b5a8c12a6.jpg\" alt=\"Sapiens: A Brief History of Humankind\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"sapiens-a-brief-history-of-humankind_996/index.html\" title=\"Sapiens: A Brief History of Humankind\">Sapiens: A Brief History ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£54.23</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 
col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"the-requiem-red_995/index.html\"><img src=\"../media/cache/68/33/68339b4c9bc034267e1da611ab3b34f8.jpg\" alt=\"The Requiem Red\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"the-requiem-red_995/index.html\" title=\"The Requiem Red\">The Requiem Red</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£22.65</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"the-dirty-little-secrets-of-getting-your-dream-job_994/index.html\"><img src=\"../media/cache/92/27/92274a95b7c251fea59a2b8a78275ab4.jpg\" alt=\"The Dirty Little Secrets of Getting Your Dream Job\" class=\"thumbnail\"></a>\n                    \n                \n            
</div>\n        \n\n        \n            \n                <p class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"the-dirty-little-secrets-of-getting-your-dream-job_994/index.html\" title=\"The Dirty Little Secrets of Getting Your Dream Job\">The Dirty Little Secrets ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£33.34</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"the-coming-woman-a-novel-based-on-the-life-of-the-infamous-feminist-victoria-woodhull_993/index.html\"><img src=\"../media/cache/3d/54/3d54940e57e662c4dd1f3ff00c78cc64.jpg\" alt=\"The Coming Woman: A Novel Based on the Life of the Infamous Feminist, Victoria Woodhull\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n        
            <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"the-coming-woman-a-novel-based-on-the-life-of-the-infamous-feminist-victoria-woodhull_993/index.html\" title=\"The Coming Woman: A Novel Based on the Life of the Infamous Feminist, Victoria Woodhull\">The Coming Woman: A ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£17.93</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"the-boys-in-the-boat-nine-americans-and-their-epic-quest-for-gold-at-the-1936-berlin-olympics_992/index.html\"><img src=\"../media/cache/66/88/66883b91f6804b2323c8369331cb7dd1.jpg\" alt=\"The Boys in the Boat: Nine Americans and Their Epic Quest for Gold at the 1936 Berlin Olympics\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            
<h3><a href=\"the-boys-in-the-boat-nine-americans-and-their-epic-quest-for-gold-at-the-1936-berlin-olympics_992/index.html\" title=\"The Boys in the Boat: Nine Americans and Their Epic Quest for Gold at the 1936 Berlin Olympics\">The Boys in the ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£22.60</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"the-black-maria_991/index.html\"><img src=\"../media/cache/58/46/5846057e28022268153beff6d352b06c.jpg\" alt=\"The Black Maria\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"the-black-maria_991/index.html\" title=\"The Black Maria\">The Black Maria</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£52.15</p>\n    \n\n<p class=\"instock availability\">\n    <i 
class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"starving-hearts-triangular-trade-trilogy-1_990/index.html\"><img src=\"../media/cache/be/f4/bef44da28c98f905a3ebec0b87be8530.jpg\" alt=\"Starving Hearts (Triangular Trade Trilogy, #1)\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Two\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"starving-hearts-triangular-trade-trilogy-1_990/index.html\" title=\"Starving Hearts (Triangular Trade Trilogy, #1)\">Starving Hearts (Triangular Trade ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£13.99</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    
</article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"shakespeares-sonnets_989/index.html\"><img src=\"../media/cache/10/48/1048f63d3b5061cd2f424d20b3f9b666.jpg\" alt=\"Shakespeare's Sonnets\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"shakespeares-sonnets_989/index.html\" title=\"Shakespeare's Sonnets\">Shakespeare's Sonnets</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£20.66</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"set-me-free_988/index.html\"><img src=\"../media/cache/5b/88/5b88c52633f53cacf162c15f4f823153.jpg\" alt=\"Set Me Free\" class=\"thumbnail\"></a>\n      
              \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"set-me-free_988/index.html\" title=\"Set Me Free\">Set Me Free</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£17.46</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"scott-pilgrims-precious-little-life-scott-pilgrim-1_987/index.html\"><img src=\"../media/cache/94/b1/94b1b8b244bce9677c2f29ccc890d4d2.jpg\" alt=\"Scott Pilgrim's Precious Little Life (Scott Pilgrim #1)\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n    
        \n        \n\n        \n            <h3><a href=\"scott-pilgrims-precious-little-life-scott-pilgrim-1_987/index.html\" title=\"Scott Pilgrim's Precious Little Life (Scott Pilgrim #1)\">Scott Pilgrim's Precious Little ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£52.29</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"rip-it-up-and-start-again_986/index.html\"><img src=\"../media/cache/81/c4/81c4a973364e17d01f217e1188253d5e.jpg\" alt=\"Rip it Up and Start Again\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"rip-it-up-and-start-again_986/index.html\" title=\"Rip it Up and Start Again\">Rip it Up and ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£35.02</p>\n    \n\n<p class=\"instock availability\">\n  
  <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"our-band-could-be-your-life-scenes-from-the-american-indie-underground-1981-1991_985/index.html\"><img src=\"../media/cache/54/60/54607fe8945897cdcced0044103b10b6.jpg\" alt=\"Our Band Could Be Your Life: Scenes from the American Indie Underground, 1981-1991\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"our-band-could-be-your-life-scenes-from-the-american-indie-underground-1981-1991_985/index.html\" title=\"Our Band Could Be Your Life: Scenes from the American Indie Underground, 1981-1991\">Our Band Could Be ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£57.25</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary 
btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"olio_984/index.html\"><img src=\"../media/cache/55/33/553310a7162dfbc2c6d19a84da0df9e1.jpg\" alt=\"Olio\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"olio_984/index.html\" title=\"Olio\">Olio</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£23.88</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"mesaerion-the-best-science-fiction-stories-1800-1849_983/index.html\"><img 
src=\"../media/cache/09/a3/09a3aef48557576e1a85ba7efea8ecb7.jpg\" alt=\"Mesaerion: The Best Science Fiction Stories 1800-1849\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"mesaerion-the-best-science-fiction-stories-1800-1849_983/index.html\" title=\"Mesaerion: The Best Science Fiction Stories 1800-1849\">Mesaerion: The Best Science ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£37.59</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"libertarianism-for-beginners_982/index.html\"><img src=\"../media/cache/0b/bc/0bbcd0a6f4bcd81ccb1049a52736406e.jpg\" alt=\"Libertarianism for Beginners\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Two\">\n                    <i class=\"icon-star\"></i>\n           
         <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"libertarianism-for-beginners_982/index.html\" title=\"Libertarianism for Beginners\">Libertarianism for Beginners</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£51.33</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"its-only-the-himalayas_981/index.html\"><img src=\"../media/cache/27/a5/27a53d0bb95bdd88288eaf66c9230d7e.jpg\" alt=\"It's Only the Himalayas\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Two\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"its-only-the-himalayas_981/index.html\" title=\"It's Only the Himalayas\">It's Only the Himalayas</a></h3>\n        \n\n        \n            <div 
class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£45.17</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                </ol>\n                \n\n\n\n    <div>\n        <ul class=\"pager\">\n            \n            <li class=\"current\">\n            \n                Page 1 of 50\n            \n            </li>\n            \n                <li class=\"next\"><a href=\"page-2.html\">next</a></li>\n            \n        </ul>\n    </div>\n\n\n            </div>\n        </section>\n    \n\n\n            </div>\n\n        </div><!-- /row -->\n    </div><!-- /page_inner -->\n</div><!-- /container-fluid -->\n\n\n    \n<footer class=\"footer container-fluid\">\n    \n        \n    \n</footer>\n\n\n        \n        \n  \n            <!-- jQuery -->\n            <script src=\"http://ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js\"></script>\n            <script>window.jQuery || document.write('<script src=\"../static/oscar/js/jquery/jquery-1.9.1.min.js\"><\\/script>')</script><script src=\"../static/oscar/js/jquery/jquery-1.9.1.min.js\"></script>\n        \n  \n\n\n        \n        \n    \n        \n    <script type=\"text/javascript\" src=\"../static/oscar/js/bootstrap3/bootstrap.min.js\"></script>\n    <!-- Oscar -->\n    <script src=\"../static/oscar/js/oscar/ui.js\" type=\"text/javascript\" charset=\"utf-8\"></script>\n\n    <script src=\"../static/oscar/js/bootstrap-datetimepicker/bootstrap-datetimepicker.js\" type=\"text/javascript\" charset=\"utf-8\"></script>\n    <script 
src=\"../static/oscar/js/bootstrap-datetimepicker/locales/bootstrap-datetimepicker.all.js\" type=\"text/javascript\" charset=\"utf-8\"></script>\n\n\n        \n        \n    \n\n    \n\n\n        \n        <script type=\"text/javascript\">\n            $(function() {\n                \n    \n    \n    oscar.init();\n\n    oscar.search.init();\n\n            });\n        </script>\n\n        \n        <!-- Version: N/A -->\n        \n    \n\n</body></html>", "echoData": "https://books.toscrape.com/catalogue/page-1.html"}
{"url": "https://books.toscrape.com/catalogue/page-2.html", "statusCode": 200, "browserHtml": "<!DOCTYPE html><!--[if lt IE 7]>      <html lang=\"en-us\" class=\"no-js lt-ie9 lt-ie8 lt-ie7\"> <![endif]--><!--[if IE 7]>         <html lang=\"en-us\" class=\"no-js lt-ie9 lt-ie8\"> <![endif]--><!--[if IE 8]>         <html lang=\"en-us\" class=\"no-js lt-ie9\"> <![endif]--><!--[if gt IE 8]><!--><html lang=\"en-us\" class=\"no-js\"><!--<![endif]--><head>\n        <title>\n    All products | Books to Scrape - Sandbox\n</title>\n\n        <meta http-equiv=\"content-type\" content=\"text/html; charset=UTF-8\">\n        <meta name=\"created\" content=\"24th Jun 2016 09:29\">\n        <meta name=\"description\" content=\"\">\n        <meta name=\"viewport\" content=\"width=device-width\">\n        <meta name=\"robots\" content=\"NOARCHIVE,NOCACHE\">\n\n        <!-- Le HTML5 shim, for IE6-8 support of HTML elements -->\n        <!--[if lt IE 9]>\n        <script src=\"//html5shim.googlecode.com/svn/trunk/html5.js\"></script>\n        <![endif]-->\n\n        \n            <link rel=\"shortcut icon\" href=\"../static/oscar/favicon.ico\">\n        \n\n        \n        \n    \n    \n        <link rel=\"stylesheet\" type=\"text/css\" href=\"../static/oscar/css/styles.css\">\n    \n    <link rel=\"stylesheet\" href=\"../static/oscar/js/bootstrap-datetimepicker/bootstrap-datetimepicker.css\">\n    <link rel=\"stylesheet\" type=\"text/css\" href=\"../static/oscar/css/datetimepicker.css\">\n\n\n        \n        \n\n        \n\n        \n            \n            \n\n        \n    </head>\n\n    <body id=\"default\" class=\"default\">\n        \n        \n    \n    \n    <header class=\"header container-fluid\">\n        <div class=\"page_inner\">\n            <div class=\"row\">\n                <div class=\"col-sm-8 h1\"><a href=\"../index.html\">Books to Scrape</a><small> We love being scraped!</small>\n</div>\n\n                \n            </div>\n        </div>\n    
</header>\n\n    \n    \n<div class=\"container-fluid page\">\n    <div class=\"page_inner\">\n        \n    <ul class=\"breadcrumb\">\n        <li>\n            <a href=\"../index.html\">Home</a>\n        </li>\n        <li class=\"active\">All products</li>\n    </ul>\n\n        <div class=\"row\">\n\n            <aside class=\"sidebar col-sm-4 col-md-3\">\n                \n                <div id=\"promotions_left\">\n                    \n                </div>\n                \n    \n    \n        \n        <div class=\"side_categories\">\n            <ul class=\"nav nav-list\">\n                \n                    <li>\n                        <a href=\"category/books_1/index.html\">\n                            \n                                Books\n                            \n                        </a>\n\n                        <ul>\n                        \n                \n                    <li>\n                        <a href=\"category/books/travel_2/index.html\">\n                            \n                                Travel\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/mystery_3/index.html\">\n                            \n                                Mystery\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/historical-fiction_4/index.html\">\n                            \n                                Historical Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/sequential-art_5/index.html\">\n                            \n          
                      Sequential Art\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/classics_6/index.html\">\n                            \n                                Classics\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/philosophy_7/index.html\">\n                            \n                                Philosophy\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/romance_8/index.html\">\n                            \n                                Romance\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/womens-fiction_9/index.html\">\n                            \n                                Womens Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/fiction_10/index.html\">\n                            \n                                Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/childrens_11/index.html\">\n                            \n                                Childrens\n                            \n                        </a>\n\n                        
</li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/religion_12/index.html\">\n                            \n                                Religion\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/nonfiction_13/index.html\">\n                            \n                                Nonfiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/music_14/index.html\">\n                            \n                                Music\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/default_15/index.html\">\n                            \n                                Default\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/science-fiction_16/index.html\">\n                            \n                                Science Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/sports-and-games_17/index.html\">\n                            \n                                Sports and Games\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a 
href=\"category/books/add-a-comment_18/index.html\">\n                            \n                                Add a comment\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/fantasy_19/index.html\">\n                            \n                                Fantasy\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/new-adult_20/index.html\">\n                            \n                                New Adult\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/young-adult_21/index.html\">\n                            \n                                Young Adult\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/science_22/index.html\">\n                            \n                                Science\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/poetry_23/index.html\">\n                            \n                                Poetry\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/paranormal_24/index.html\">\n                            \n                                
Paranormal\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/art_25/index.html\">\n                            \n                                Art\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/psychology_26/index.html\">\n                            \n                                Psychology\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/autobiography_27/index.html\">\n                            \n                                Autobiography\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/parenting_28/index.html\">\n                            \n                                Parenting\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/adult-fiction_29/index.html\">\n                            \n                                Adult Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/humor_30/index.html\">\n                            \n                                Humor\n                            \n                        </a>\n\n                        </li>\n                   
     \n                \n                    <li>\n                        <a href=\"category/books/horror_31/index.html\">\n                            \n                                Horror\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/history_32/index.html\">\n                            \n                                History\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/food-and-drink_33/index.html\">\n                            \n                                Food and Drink\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/christian-fiction_34/index.html\">\n                            \n                                Christian Fiction\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/business_35/index.html\">\n                            \n                                Business\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/biography_36/index.html\">\n                            \n                                Biography\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a 
href=\"category/books/thriller_37/index.html\">\n                            \n                                Thriller\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/contemporary_38/index.html\">\n                            \n                                Contemporary\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/spirituality_39/index.html\">\n                            \n                                Spirituality\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/academic_40/index.html\">\n                            \n                                Academic\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/self-help_41/index.html\">\n                            \n                                Self Help\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/historical_42/index.html\">\n                            \n                                Historical\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/christian_43/index.html\">\n                            \n                                
Christian\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/suspense_44/index.html\">\n                            \n                                Suspense\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/short-stories_45/index.html\">\n                            \n                                Short Stories\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/novels_46/index.html\">\n                            \n                                Novels\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/health_47/index.html\">\n                            \n                                Health\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/politics_48/index.html\">\n                            \n                                Politics\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/cultural_49/index.html\">\n                            \n                                Cultural\n                            \n                        </a>\n\n                        </li>\n                        \n  
              \n                    <li>\n                        <a href=\"category/books/erotica_50/index.html\">\n                            \n                                Erotica\n                            \n                        </a>\n\n                        </li>\n                        \n                \n                    <li>\n                        <a href=\"category/books/crime_51/index.html\">\n                            \n                                Crime\n                            \n                        </a>\n\n                        </li>\n                        \n                            </ul></li>\n                        \n                \n            </ul>\n        </div>\n    \n    \n\n            </aside>\n\n            <div class=\"col-sm-8 col-md-9\">\n                \n                <div class=\"page-header action\">\n                    <h1>All products</h1>\n                </div>\n                \n\n                \n\n\n\n<div id=\"messages\">\n\n</div>\n\n\n                <div id=\"promotions\">\n                    \n                </div>\n\n                \n    <form method=\"get\" class=\"form-horizontal\">\n        \n        <div style=\"display:none\">\n            \n            \n        </div>\n\n        \n            \n                \n                    <strong>1000</strong> results - showing <strong>21</strong> to <strong>40</strong>.\n                \n            \n            \n        \n    </form>\n    \n        <section>\n            <div class=\"alert alert-warning\" role=\"alert\"><strong>Warning!</strong> This is a demo website for web scraping purposes. 
Prices and ratings here were randomly assigned and have no real meaning.</div>\n\n            <div>\n                <ol class=\"row\">\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"in-her-wake_980/index.html\"><img src=\"../media/cache/5d/72/5d72709c6a7a9584a4d1cf07648bfce1.jpg\" alt=\"In Her Wake\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"in-her-wake_980/index.html\" title=\"In Her Wake\">In Her Wake</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£12.84</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"how-music-works_979/index.html\"><img 
src=\"../media/cache/5c/c8/5cc8e107246cb478960d4f0aba1e1c8e.jpg\" alt=\"How Music Works\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Two\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"how-music-works_979/index.html\" title=\"How Music Works\">How Music Works</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£37.32</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"foolproof-preserving-a-guide-to-small-batch-jams-jellies-pickles-condiments-and-more-a-foolproof-guide-to-making-small-batch-jams-jellies-pickles-condiments-and-more_978/index.html\"><img src=\"../media/cache/9f/59/9f59f01fa916a7bb8f0b28a4012179a4.jpg\" alt=\"Foolproof Preserving: A Guide to Small Batch Jams, Jellies, Pickles, Condiments, and More: A Foolproof Guide to Making Small Batch Jams, Jellies, Pickles, Condiments, and More\" class=\"thumbnail\"></a>\n                    \n                \n        
    </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"foolproof-preserving-a-guide-to-small-batch-jams-jellies-pickles-condiments-and-more-a-foolproof-guide-to-making-small-batch-jams-jellies-pickles-condiments-and-more_978/index.html\" title=\"Foolproof Preserving: A Guide to Small Batch Jams, Jellies, Pickles, Condiments, and More: A Foolproof Guide to Making Small Batch Jams, Jellies, Pickles, Condiments, and More\">Foolproof Preserving: A Guide ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£30.52</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"chase-me-paris-nights-2_977/index.html\"><img src=\"../media/cache/9c/2e/9c2e0eb8866b8e3f3b768994fd3d1c1a.jpg\" alt=\"Chase Me (Paris Nights #2)\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i 
class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"chase-me-paris-nights-2_977/index.html\" title=\"Chase Me (Paris Nights #2)\">Chase Me (Paris Nights ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£25.27</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"black-dust_976/index.html\"><img src=\"../media/cache/44/cc/44ccc99c8f82c33d4f9d2afa4ef25787.jpg\" alt=\"Black Dust\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"black-dust_976/index.html\" title=\"Black Dust\">Black Dust</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n            
    \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£34.53</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"birdsong-a-story-in-pictures_975/index.html\"><img src=\"../media/cache/af/6e/af6e796160fe63e0cf19d44395c7ddf2.jpg\" alt=\"Birdsong: A Story in Pictures\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"birdsong-a-story-in-pictures_975/index.html\" title=\"Birdsong: A Story in Pictures\">Birdsong: A Story in ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£54.64</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                
\n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"americas-cradle-of-quarterbacks-western-pennsylvanias-football-factory-from-johnny-unitas-to-joe-montana_974/index.html\"><img src=\"../media/cache/ef/0b/ef0bed08de4e083dba5e20fdb98d9c36.jpg\" alt=\"America's Cradle of Quarterbacks: Western Pennsylvania's Football Factory from Johnny Unitas to Joe Montana\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"americas-cradle-of-quarterbacks-western-pennsylvanias-football-factory-from-johnny-unitas-to-joe-montana_974/index.html\" title=\"America's Cradle of Quarterbacks: Western Pennsylvania's Football Factory from Johnny Unitas to Joe Montana\">America's Cradle of Quarterbacks: ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£22.50</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n         
               <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"aladdin-and-his-wonderful-lamp_973/index.html\"><img src=\"../media/cache/d6/da/d6da0371958068bbaf39ea9c174275cd.jpg\" alt=\"Aladdin and His Wonderful Lamp\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"aladdin-and-his-wonderful-lamp_973/index.html\" title=\"Aladdin and His Wonderful Lamp\">Aladdin and His Wonderful ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£53.13</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"worlds-elsewhere-journeys-around-shakespeares-globe_972/index.html\"><img src=\"../media/cache/2e/98/2e98c332bf8563b584784971541c4445.jpg\" alt=\"Worlds 
Elsewhere: Journeys Around Shakespeare’s Globe\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"worlds-elsewhere-journeys-around-shakespeares-globe_972/index.html\" title=\"Worlds Elsewhere: Journeys Around Shakespeare’s Globe\">Worlds Elsewhere: Journeys Around ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£40.30</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"wall-and-piece_971/index.html\"><img src=\"../media/cache/a5/41/a5416b9646aaa7287baa287ec2590270.jpg\" alt=\"Wall and Piece\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n             
       <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"wall-and-piece_971/index.html\" title=\"Wall and Piece\">Wall and Piece</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£44.18</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"the-four-agreements-a-practical-guide-to-personal-freedom_970/index.html\"><img src=\"../media/cache/0f/7e/0f7ee69495c0df1d35723f012624a9f8.jpg\" alt=\"The Four Agreements: A Practical Guide to Personal Freedom\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"the-four-agreements-a-practical-guide-to-personal-freedom_970/index.html\" title=\"The Four Agreements: A Practical Guide to Personal Freedom\">The Four Agreements: A ...</a></h3>\n        \n\n        \n            <div 
class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£17.66</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"the-five-love-languages-how-to-express-heartfelt-commitment-to-your-mate_969/index.html\"><img src=\"../media/cache/38/c5/38c56fba316c07305643a8065269594e.jpg\" alt=\"The Five Love Languages: How to Express Heartfelt Commitment to Your Mate\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"the-five-love-languages-how-to-express-heartfelt-commitment-to-your-mate_969/index.html\" title=\"The Five Love Languages: How to Express Heartfelt Commitment to Your Mate\">The Five Love Languages: ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£31.05</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n         
       \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"the-elephant-tree_968/index.html\"><img src=\"../media/cache/5d/7e/5d7ecde8e81513eba8a64c9fe000744b.jpg\" alt=\"The Elephant Tree\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"the-elephant-tree_968/index.html\" title=\"The Elephant Tree\">The Elephant Tree</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£23.82</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div 
class=\"image_container\">\n                \n                    \n                    <a href=\"the-bear-and-the-piano_967/index.html\"><img src=\"../media/cache/cf/bb/cfbb5e62715c6d888fd07794c9bab5d6.jpg\" alt=\"The Bear and the Piano\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"the-bear-and-the-piano_967/index.html\" title=\"The Bear and the Piano\">The Bear and the ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£36.89</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"sophies-world_966/index.html\"><img src=\"../media/cache/65/71/6571919836ec51ed54f0050c31d8a0cd.jpg\" alt=\"Sophie's World\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Five\">\n                    <i 
class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"sophies-world_966/index.html\" title=\"Sophie's World\">Sophie's World</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£15.94</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"penny-maybe_965/index.html\"><img src=\"../media/cache/12/53/1253c21c5ef3c6d075c5fa3f5fecee6a.jpg\" alt=\"Penny Maybe\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Three\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"penny-maybe_965/index.html\" title=\"Penny Maybe\">Penny Maybe</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n    
    <p class=\"price_color\">£33.29</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"maude-1883-1993she-grew-up-with-the-country_964/index.html\"><img src=\"../media/cache/f5/88/f5889d038f5d8e949b494d147c2dcf54.jpg\" alt=\"Maude (1883-1993):She Grew Up with the country\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Two\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"maude-1883-1993she-grew-up-with-the-country_964/index.html\" title=\"Maude (1883-1993):She Grew Up with the country\">Maude (1883-1993):She Grew Up ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£18.02</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to 
basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"in-a-dark-dark-wood_963/index.html\"><img src=\"../media/cache/23/85/238570a1c284e730dbc737a7e631ae2b.jpg\" alt=\"In a Dark, Dark Wood\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating One\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"in-a-dark-dark-wood_963/index.html\" title=\"In a Dark, Dark Wood\">In a Dark, Dark ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£19.63</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"behind-closed-doors_962/index.html\"><img 
src=\"../media/cache/e1/5c/e15c289ba58cea38519e1281e859f0c1.jpg\" alt=\"Behind Closed Doors\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Four\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"behind-closed-doors_962/index.html\" title=\"Behind Closed Doors\">Behind Closed Doors</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£52.22</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                        <li class=\"col-xs-6 col-sm-4 col-md-3 col-lg-3\">\n\n\n\n\n\n\n    <article class=\"product_pod\">\n        \n            <div class=\"image_container\">\n                \n                    \n                    <a href=\"you-cant-bury-them-all-poems_961/index.html\"><img src=\"../media/cache/e9/20/e9203b733126c4a0832a1c7885dc27cf.jpg\" alt=\"You can't bury them all: Poems\" class=\"thumbnail\"></a>\n                    \n                \n            </div>\n        \n\n        \n            \n                <p class=\"star-rating Two\">\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                    <i 
class=\"icon-star\"></i>\n                    <i class=\"icon-star\"></i>\n                </p>\n            \n        \n\n        \n            <h3><a href=\"you-cant-bury-them-all-poems_961/index.html\" title=\"You can't bury them all: Poems\">You can't bury them ...</a></h3>\n        \n\n        \n            <div class=\"product_price\">\n                \n\n\n\n\n\n\n    \n        <p class=\"price_color\">£33.63</p>\n    \n\n<p class=\"instock availability\">\n    <i class=\"icon-ok\"></i>\n    \n        In stock\n    \n</p>\n\n                \n                    \n\n\n\n\n\n\n    \n    <form>\n        <button type=\"submit\" class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\">Add to basket</button>\n    </form>\n\n\n                \n            </div>\n        \n    </article>\n\n</li>\n                    \n                </ol>\n                \n\n\n\n    <div>\n        <ul class=\"pager\">\n            \n                <li class=\"previous\"><a href=\"page-1.html\">previous</a></li>\n            \n            <li class=\"current\">\n            \n                Page 2 of 50\n            \n            </li>\n            \n                <li class=\"next\"><a href=\"page-3.html\">next</a></li>\n            \n        </ul>\n    </div>\n\n\n            </div>\n        </section>\n    \n\n\n            </div>\n\n        </div><!-- /row -->\n    </div><!-- /page_inner -->\n</div><!-- /container-fluid -->\n\n\n    \n<footer class=\"footer container-fluid\">\n    \n        \n    \n</footer>\n\n\n        \n        \n  \n            <!-- jQuery -->\n            <script src=\"http://ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js\"></script>\n            <script>window.jQuery || document.write('<script src=\"../static/oscar/js/jquery/jquery-1.9.1.min.js\"><\\/script>')</script><script src=\"../static/oscar/js/jquery/jquery-1.9.1.min.js\"></script>\n        \n  \n\n\n        \n        \n    \n        \n    <script 
type=\"text/javascript\" src=\"../static/oscar/js/bootstrap3/bootstrap.min.js\"></script>\n    <!-- Oscar -->\n    <script src=\"../static/oscar/js/oscar/ui.js\" type=\"text/javascript\" charset=\"utf-8\"></script>\n\n    <script src=\"../static/oscar/js/bootstrap-datetimepicker/bootstrap-datetimepicker.js\" type=\"text/javascript\" charset=\"utf-8\"></script>\n    <script src=\"../static/oscar/js/bootstrap-datetimepicker/locales/bootstrap-datetimepicker.all.js\" type=\"text/javascript\" charset=\"utf-8\"></script>\n\n\n        \n        \n    \n\n    \n\n\n        \n        <script type=\"text/javascript\">\n            $(function() {\n                \n    \n    \n    oscar.init();\n\n    oscar.search.init();\n\n            });\n        </script>\n\n        \n        <!-- Version: N/A -->\n        \n    \n\n</body></html>", "echoData": "https://books.toscrape.com/catalogue/page-2.html"}
```

#### Getting browser HTML in proxy mode

> ###### NOTE
>
> To run the examples below, install and configure the code example
> requirements and the Zyte CA certificate.

#### curl

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    -H "Zyte-Browser-Html: true" \
    https://toscrape.com
```

#### C#

```cs
using System;
using System.Net;
using System.Net.Http;

var proxy = new WebProxy("http://api.zyte.com:8011", true);
proxy.Credentials = new NetworkCredential("YOUR_ZYTE_API_KEY", "");

var httpClientHandler = new HttpClientHandler
{
    Proxy = proxy,
};

var client = new HttpClient(handler: httpClientHandler, disposeHandler: true);
client.DefaultRequestHeaders.Add("Zyte-Browser-Html", "true");
var message = new HttpRequestMessage(HttpMethod.Get, "https://toscrape.com");
var response = await client.SendAsync(message);
var body = await response.Content.ReadAsStringAsync();

Console.WriteLine(body);
```

#### Java

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.hc.client5.http.auth.AuthCache;
import org.apache.hc.client5.http.auth.AuthScope;
import org.apache.hc.client5.http.auth.CredentialsProvider;
import org.apache.hc.client5.http.classic.methods.HttpGet;
import org.apache.hc.client5.http.impl.auth.BasicAuthCache;
import org.apache.hc.client5.http.impl.auth.BasicScheme;
import org.apache.hc.client5.http.impl.auth.CredentialsProviderBuilder;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.client5.http.impl.routing.DefaultProxyRoutePlanner;
import org.apache.hc.client5.http.protocol.HttpClientContext;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHost;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;

class Example {
  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {

    HttpHost proxy = new HttpHost("api.zyte.com", 8011);
    DefaultProxyRoutePlanner routePlanner = new DefaultProxyRoutePlanner(proxy);
    CredentialsProvider credentialsProvider =
        CredentialsProviderBuilder.create()
            .add(new AuthScope(proxy), "YOUR_ZYTE_API_KEY", "".toCharArray())
            .build();

    AuthCache authCache = new BasicAuthCache();
    BasicScheme basicAuth = new BasicScheme();
    authCache.put(proxy, basicAuth);
    HttpClientContext context = HttpClientContext.create();
    context.setCredentialsProvider(credentialsProvider);
    context.setAuthCache(authCache);

    CloseableHttpClient client =
        HttpClients.custom()
            .setRoutePlanner(routePlanner)
            .setDefaultCredentialsProvider(credentialsProvider)
            .build();

    HttpGet request = new HttpGet("https://toscrape.com");
    request.setHeader("Zyte-Browser-Html", "true");
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String httpResponseBody = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          System.out.println(httpResponseBody);
          return null;
        });
  }
}
```

#### JS

```js
const axios = require('axios')

axios
  .get(
    'https://toscrape.com',
    {
      headers: {
        'Zyte-Browser-Html': 'true'
      },
      proxy: {
        protocol: 'http',
        host: 'api.zyte.com',
        port: 8011,
        auth: {
          username: 'YOUR_ZYTE_API_KEY',
          password: ''
        }
      }
    }
  )
  .then((response) => {
    const httpResponseBody = response.data
    console.log(httpResponseBody)
  })
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('GET', 'https://toscrape.com', [
    'headers' => [
        'Zyte-Browser-Html' => 'true',
    ],
    'proxy' => 'http://YOUR_ZYTE_API_KEY:@api.zyte.com:8011',
]);
$http_response_body = (string) $response->getBody();
fwrite(STDOUT, $http_response_body);
```

#### Python

```python
import requests

response = requests.get(
    "https://toscrape.com",
    headers={
        "Zyte-Browser-Html": "true",
    },
    proxies={
        scheme: "http://YOUR_ZYTE_API_KEY:@api.zyte.com:8011" for scheme in ("http", "https")
    },
)
http_response_body: bytes = response.content
print(http_response_body.decode())
```

#### Ruby

```ruby
# frozen_string_literal: true

require 'net/http'

url = URI('https://toscrape.com/')
proxy_host = 'api.zyte.com'
proxy_port = '8011'

http = Net::HTTP.new(url.host, url.port, proxy_host, proxy_port, 'YOUR_ZYTE_API_KEY', '')
http.use_ssl = true

request = Net::HTTP::Get.new(url)
request['Zyte-Browser-Html'] = 'true'

r = http.start do |h|
  h.request(request)
end

puts r.body
```

#### Scrapy

```python
from scrapy import Request, Spider

class ToScrapeSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request("https://toscrape.com", headers={"Zyte-Browser-Html": "true"})

    def parse(self, response):
        print(response.text)
```

Output (first 5 lines):

```html
<!DOCTYPE html><html lang="en"><head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
        <title>Scraping Sandbox</title>
        <link href="./css/bootstrap.min.css" rel="stylesheet">
        <link href="./css/main.css" rel="stylesheet">
```

#### Using proxy mode

#### C#

```cs
using System;
using System.Net;
using System.Net.Http;

var proxy = new WebProxy("http://api.zyte.com:8011", true);
proxy.Credentials = new NetworkCredential("YOUR_ZYTE_API_KEY", "");

var httpClientHandler = new HttpClientHandler
{
    Proxy = proxy,
};

var client = new HttpClient(handler: httpClientHandler, disposeHandler: true);
var message = new HttpRequestMessage(HttpMethod.Get, "https://toscrape.com");
var response = await client.SendAsync(message);
var body = await response.Content.ReadAsStringAsync();

Console.WriteLine(body);
```

#### curl

```bash
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    https://toscrape.com
```

#### Java

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.hc.client5.http.auth.AuthCache;
import org.apache.hc.client5.http.auth.AuthScope;
import org.apache.hc.client5.http.auth.CredentialsProvider;
import org.apache.hc.client5.http.classic.methods.HttpGet;
import org.apache.hc.client5.http.impl.auth.BasicAuthCache;
import org.apache.hc.client5.http.impl.auth.BasicScheme;
import org.apache.hc.client5.http.impl.auth.CredentialsProviderBuilder;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.client5.http.impl.routing.DefaultProxyRoutePlanner;
import org.apache.hc.client5.http.protocol.HttpClientContext;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHost;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;

class Example {
  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {

    HttpHost proxy = new HttpHost("api.zyte.com", 8011);
    DefaultProxyRoutePlanner routePlanner = new DefaultProxyRoutePlanner(proxy);
    CredentialsProvider credentialsProvider =
        CredentialsProviderBuilder.create()
            .add(new AuthScope(proxy), "YOUR_ZYTE_API_KEY", "".toCharArray())
            .build();

    AuthCache authCache = new BasicAuthCache();
    BasicScheme basicAuth = new BasicScheme();
    authCache.put(proxy, basicAuth);
    HttpClientContext context = HttpClientContext.create();
    context.setCredentialsProvider(credentialsProvider);
    context.setAuthCache(authCache);

    CloseableHttpClient client =
        HttpClients.custom()
            .setRoutePlanner(routePlanner)
            .setDefaultCredentialsProvider(credentialsProvider)
            .build();

    HttpGet request = new HttpGet("https://toscrape.com");
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String httpResponseBody = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          System.out.println(httpResponseBody);
          return null;
        });
  }
}
```

#### JS

```js
const axios = require('axios')

axios
  .get(
    'https://toscrape.com',
    {
      proxy: {
        protocol: 'http',
        host: 'api.zyte.com',
        port: 8011,
        auth: {
          username: 'YOUR_ZYTE_API_KEY',
          password: ''
        }
      }
    }
  )
  .then((response) => {
    const httpResponseBody = response.data
    console.log(httpResponseBody)
  })
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('GET', 'https://toscrape.com', [
    'proxy' => 'http://YOUR_ZYTE_API_KEY:@api.zyte.com:8011',
]);
$http_response_body = (string) $response->getBody();
fwrite(STDOUT, $http_response_body);
```

#### Python

> ###### NOTE
>
> You need to install and configure our CA certificate for
> the requests library.

```python
import requests

response = requests.get(
    "https://toscrape.com",
    proxies={
        scheme: "http://YOUR_ZYTE_API_KEY:@api.zyte.com:8011" for scheme in ("http", "https")
    },
)
http_response_body: bytes = response.content
print(http_response_body.decode())
```

#### Ruby

```ruby
# frozen_string_literal: true

require 'net/http'

url = URI('https://toscrape.com/')
proxy_host = 'api.zyte.com'
proxy_port = '8011'

http = Net::HTTP.new(url.host, url.port, proxy_host, proxy_port, 'YOUR_ZYTE_API_KEY', '')
http.use_ssl = true

r = http.start do |h|
  h.request(Net::HTTP::Get.new(url))
end

puts r.body
```

#### Scrapy

When using [scrapy-zyte-smartproxy](https://github.com/scrapy-plugins/scrapy-zyte-smartproxy), set the `ZYTE_SMARTPROXY_URL`
setting to `"http://api.zyte.com:8011"` and the
`ZYTE_SMARTPROXY_APIKEY` setting to [your Zyte API key](https://app.zyte.com/o/zyte-api/api-access).
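
As a minimal sketch, the settings described above could look like this in a project's `settings.py`. The `DOWNLOADER_MIDDLEWARES` entry and the `ZYTE_SMARTPROXY_ENABLED` flag are not mentioned above; they are assumptions based on the scrapy-zyte-smartproxy README, so check them against the version of the plugin that you use.

```python
# settings.py (sketch): route Scrapy requests through Zyte API proxy mode.

# Assumption: middleware path and priority as given in the
# scrapy-zyte-smartproxy README.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_zyte_smartproxy.ZyteSmartProxyMiddleware": 610,
}

# Assumption: the plugin must be switched on explicitly.
ZYTE_SMARTPROXY_ENABLED = True

# Values described in the text above.
ZYTE_SMARTPROXY_URL = "http://api.zyte.com:8011"
ZYTE_SMARTPROXY_APIKEY = "YOUR_ZYTE_API_KEY"  # Zyte API key, not a Scrapy Cloud key
```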

> ###### NOTE
>
> **Important**: Use your **Zyte API key** here, not a Scrapy Cloud API key. Make sure you get this from the Zyte API access page.

Then you can continue using Scrapy as usual and all requests will be
proxied through Zyte API automatically.

```python
from scrapy import Spider

class ToScrapeSpider(Spider):
    name = "toscrape_com"
    start_urls = ["https://toscrape.com"]

    def parse(self, response):
        print(response.text)
```

#### Using the HTTPS endpoint of proxy mode

#### curl

```bash
curl \
    --proxy https://api.zyte.com:8014 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    https://toscrape.com
```

#### JS

```js
const HttpsProxyAgent = require('https-proxy-agent')
const httpsAgent = new HttpsProxyAgent.HttpsProxyAgent('https://YOUR_ZYTE_API_KEY:@api.zyte.com:8014')
const axiosDefaultConfig = { httpsAgent }
const axios = require('axios').create(axiosDefaultConfig)

axios
  .get('https://toscrape.com')
  .then((response) => {
    const httpResponseBody = response.data
    console.log(httpResponseBody)
  })
```

#### Python

```python
import requests

response = requests.get(
    "https://toscrape.com",
    proxies={
        scheme: "https://YOUR_ZYTE_API_KEY:@api.zyte.com:8014"
        for scheme in ("http", "https")
    },
)
http_response_body: bytes = response.content
print(http_response_body.decode())
```

#### Sending arbitrary bytes in an HTTP request

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://httpbin.org/anything"},
    {"httpResponseBody", true},
    {"httpRequestMethod", "POST"},
    {"httpRequestBody", "Zm9v"}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBody = System.Convert.FromBase64String(base64HttpResponseBody);

var responseData = JsonDocument.Parse(httpResponseBody);
var requestBody = responseData.RootElement.GetProperty("data").ToString();

Console.WriteLine(requestBody);
```

#### CLI client

input.jsonl
```json
{"url": "https://httpbin.org/anything", "httpResponseBody": true, "httpRequestMethod": "POST", "httpRequestBody": "Zm9v"}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq --raw-output .data
```

#### curl

input.json
```json
{
    "url": "https://httpbin.org/anything",
    "httpResponseBody": true,
    "httpRequestMethod": "POST",
    "httpRequestBody": "Zm9v"
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
| jq --raw-output .httpResponseBody \
| base64 --decode \
| jq --raw-output .data
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://httpbin.org/anything",
            "httpResponseBody",
            true,
            "httpRequestMethod",
            "POST",
            "httpRequestBody",
            "Zm9v");
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          JsonObject data = JsonParser.parseString(httpResponseBody).getAsJsonObject();
          String body = data.get("data").getAsString();
          System.out.println(body);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://httpbin.org/anything',
    httpResponseBody: true,
    httpRequestMethod: 'POST',
    httpRequestBody: 'Zm9v'
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  const body = JSON.parse(httpResponseBody).data
  console.log(body)
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://httpbin.org/anything',
        'httpResponseBody' => true,
        'httpRequestMethod' => 'POST',
        'httpRequestBody' => 'Zm9v',
    ],
]);
$data = json_decode($response->getBody());
$http_response_body = base64_decode($data->httpResponseBody);
$body = json_decode($http_response_body)->data;
echo $body.PHP_EOL;
```

#### Proxy mode

In proxy mode, the body of your request is used automatically,
whether it is plain text or binary.

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    -X POST \
    -H "Content-Type: application/octet-stream" \
    --data foo \
    https://httpbin.org/anything \
    | jq .data
```

#### Python

```python
import json
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://httpbin.org/anything",
        "httpResponseBody": True,
        "httpRequestMethod": "POST",
        "httpRequestBody": "Zm9v",
    },
)
http_response_body = b64decode(api_response.json()["httpResponseBody"])
body: str = json.loads(http_response_body)["data"]
print(body)
```

#### Python client

```python
import asyncio
import json
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://httpbin.org/anything",
            "httpResponseBody": True,
            "httpRequestMethod": "POST",
            "httpRequestBody": "Zm9v",
        }
    )
    http_response_body: bytes = b64decode(api_response["httpResponseBody"])
    body = json.loads(http_response_body)["data"]
    print(body)

asyncio.run(main())
```

#### Scrapy

```python
import json

from scrapy import Request, Spider

class HTTPBinOrgSpider(Spider):
    name = "httpbin_org"

    async def start(self):
        yield Request(
            "https://httpbin.org/anything",
            method="POST",
            body=b"foo",
        )

    def parse(self, response):
        body = json.loads(response.body)["data"]
        print(body)
```

Output:

```none
foo
```
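
The `httpRequestBody` value `Zm9v` used in the Zyte API examples above is the Base64 encoding of the bytes `foo`. As a minimal sketch, a value like that can be produced for any byte string with the Python standard library; `encode_request_body` is a hypothetical helper name, not part of any Zyte library.

```python
from base64 import b64decode, b64encode


def encode_request_body(raw: bytes) -> str:
    """Return raw bytes as a Base64 string suitable for a JSON payload."""
    return b64encode(raw).decode("ascii")


payload = {
    "url": "https://httpbin.org/anything",
    "httpResponseBody": True,
    "httpRequestMethod": "POST",
    "httpRequestBody": encode_request_body(b"foo"),
}
print(payload["httpRequestBody"])  # prints Zm9v
```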

#### Sending cookies

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

The following code example sends a cookie to [httpbin.org](https://httpbin.org) and prints the
cookies that [httpbin.org](https://httpbin.org) reports having received:

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://httpbin.org/cookies"},
    {"httpResponseBody", true},
    {
        "requestCookies",
        new List<Dictionary<string, string>>()
        {
            new Dictionary<string, string>()
            {
                {"name", "foo"},
                {"value", "bar"},
                {"domain", "httpbin.org"}
            }
        }
    }
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBody = System.Convert.FromBase64String(base64HttpResponseBody);
var result = System.Text.Encoding.UTF8.GetString(httpResponseBody);

Console.WriteLine(result);
```

#### CLI client

input.jsonl
```json
{"url": "https://httpbin.org/cookies", "httpResponseBody": true, "requestCookies": [{"name": "foo", "value": "bar", "domain": "httpbin.org"}]}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode
```

#### curl

input.json
```json
{
    "url": "https://httpbin.org/cookies",
    "httpResponseBody": true,
    "requestCookies": [
        {
            "name": "foo",
            "value": "bar",
            "domain": "httpbin.org"
        }
    ]
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
| jq --raw-output .httpResponseBody \
| base64 --decode
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Collections;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, String> cookies =
        ImmutableMap.of("name", "foo", "value", "bar", "domain", "httpbin.org");
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://httpbin.org/cookies",
            "httpResponseBody",
            true,
            "requestCookies",
            Collections.singletonList(cookies));
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          System.out.println(httpResponseBody);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://httpbin.org/cookies',
    httpResponseBody: true,
    requestCookies: [
      {
        name: 'foo',
        value: 'bar',
        domain: 'httpbin.org'
      }
    ]
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  console.log(httpResponseBody.toString())
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://httpbin.org/cookies',
        'httpResponseBody' => true,
        'requestCookies' => [
            [
                'name' => 'foo',
                'value' => 'bar',
                'domain' => 'httpbin.org',
            ],
        ],
    ],
]);
$api = json_decode($response->getBody());
$http_response_body = base64_decode($api->httpResponseBody);
echo $http_response_body;
```

#### Proxy mode

In proxy mode, the `Cookie` header of your request is used
automatically to set cookies for the domain of the target URL.

> ###### NOTE
>
> Setting cookies for additional domains is not supported.

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    -H "Cookie: foo=bar" \
    https://httpbin.org/cookies
```
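
When a request needs more than one cookie, the `Cookie` header carries all of them as `name=value` pairs separated by `; `. A minimal sketch of building that header value in Python follows; `build_cookie_header` is a hypothetical helper, and it assumes cookie names and values that need no escaping.

```python
from typing import Mapping


def build_cookie_header(cookies: Mapping[str, str]) -> str:
    """Join cookie name/value pairs into a single Cookie header value."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


headers = {"Cookie": build_cookie_header({"foo": "bar", "baz": "qux"})}
print(headers["Cookie"])  # foo=bar; baz=qux
```

The resulting dictionary can be passed as request headers to an HTTP client that is configured to use proxy mode.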

#### Python

```python
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://httpbin.org/cookies",
        "httpResponseBody": True,
        "requestCookies": [
            {
                "name": "foo",
                "value": "bar",
                "domain": "httpbin.org",
            },
        ],
    },
)
http_response_body = b64decode(api_response.json()["httpResponseBody"])
print(http_response_body.decode())
```

#### Python client

```python
import asyncio
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://httpbin.org/cookies",
            "httpResponseBody": True,
            "requestCookies": [
                {
                    "name": "foo",
                    "value": "bar",
                    "domain": "httpbin.org",
                },
            ],
        }
    )
    http_response_body = b64decode(api_response["httpResponseBody"]).decode()
    print(http_response_body)

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class HTTPBinOrgSpider(Spider):
    name = "httpbin_org"

    async def start(self):
        yield Request(
            "https://httpbin.org/cookies",
            meta={
                "zyte_api_automap": {
                    "requestCookies": [
                        {
                            "name": "foo",
                            "value": "bar",
                            "domain": "httpbin.org",
                        },
                    ],
                },
            },
        )

    def parse(self, response):
        print(response.text)
```

Output:

```json
{
  "cookies": {
    "foo": "bar"
  }
}
```

#### Sending text (Unicode) in an HTTP request

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://httpbin.org/anything"},
    {"httpResponseBody", true},
    {"httpRequestMethod", "POST"},
    {"httpRequestText", "{\"foo\": \"bar\"}"}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBody = System.Convert.FromBase64String(base64HttpResponseBody);

var responseData = JsonDocument.Parse(httpResponseBody);
var requestBody = responseData.RootElement.GetProperty("data").ToString();

Console.WriteLine(requestBody);
```

#### CLI client

input.jsonl
```json
{"url": "https://httpbin.org/anything", "httpResponseBody": true, "httpRequestMethod": "POST", "httpRequestText": "{\"foo\": \"bar\"}"}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq --raw-output .data
```

#### curl

input.json
```json
{
    "url": "https://httpbin.org/anything",
    "httpResponseBody": true,
    "httpRequestMethod": "POST",
    "httpRequestText": "{\"foo\": \"bar\"}"
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
| jq --raw-output .httpResponseBody \
| base64 --decode \
| jq --raw-output .data
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://httpbin.org/anything",
            "httpResponseBody",
            true,
            "httpRequestMethod",
            "POST",
            "httpRequestText",
            "{\"foo\": \"bar\"}");
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          JsonObject data = JsonParser.parseString(httpResponseBody).getAsJsonObject();
          String body = data.get("data").getAsString();
          System.out.println(body);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://httpbin.org/anything',
    httpResponseBody: true,
    httpRequestMethod: 'POST',
    httpRequestText: '{"foo": "bar"}'
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  const body = JSON.parse(httpResponseBody).data
  console.log(body)
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://httpbin.org/anything',
        'httpResponseBody' => true,
        'httpRequestMethod' => 'POST',
        'httpRequestText' => '{"foo": "bar"}',
    ],
]);
$data = json_decode($response->getBody());
$http_response_body = base64_decode($data->httpResponseBody);
$body = json_decode($http_response_body)->data;
echo $body.PHP_EOL;
```

#### Proxy mode

In proxy mode, the body of your request is used automatically,
whether it is plain text or binary.

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    -X POST \
    -H "Content-Type: application/json" \
    --data '{"foo": "bar"}' \
    https://httpbin.org/anything \
    | jq .data
```

#### Python

```python
import json
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://httpbin.org/anything",
        "httpResponseBody": True,
        "httpRequestMethod": "POST",
        "httpRequestText": '{"foo": "bar"}',
    },
)
http_response_body = b64decode(api_response.json()["httpResponseBody"])
body: str = json.loads(http_response_body)["data"]
print(body)
```

#### Python client

```python
import asyncio
import json
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://httpbin.org/anything",
            "httpResponseBody": True,
            "httpRequestMethod": "POST",
            "httpRequestText": '{"foo": "bar"}',
        }
    )
    http_response_body = b64decode(api_response["httpResponseBody"])
    body = json.loads(http_response_body)["data"]
    print(body)

asyncio.run(main())
```

#### Scrapy

```python
import json

from scrapy import Request, Spider

class HTTPBinOrgSpider(Spider):
    name = "httpbin_org"

    async def start(self):
        yield Request(
            "https://httpbin.org/anything",
            method="POST",
            body='{"foo": "bar"}',
        )

    def parse(self, response):
        body = json.loads(response.body)["data"]
        print(body)
```

Output:

```json
{"foo": "bar"}
```
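
In the examples above, the JSON text is written by hand as a string. As a minimal sketch, the `httpRequestText` value can also be generated from a Python object with `json.dumps`, which avoids quoting mistakes; the payload below mirrors the one that the Python example sends.

```python
import json

# Serialize the data to JSON text; this string becomes the request body.
request_text = json.dumps({"foo": "bar"})

payload = {
    "url": "https://httpbin.org/anything",
    "httpResponseBody": True,
    "httpRequestMethod": "POST",
    "httpRequestText": request_text,
}
print(payload["httpRequestText"])  # {"foo": "bar"}
```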

#### Getting response headers

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://toscrape.com"},
    {"httpResponseHeaders", true}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var headerEnumerator = data.RootElement.GetProperty("httpResponseHeaders").EnumerateArray();
var headers = new Dictionary<string, string>();
while (headerEnumerator.MoveNext())
{
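    // Use the indexer rather than Add so that a repeated header name
    // (for example, Set-Cookie) does not throw; the last value wins.
    headers[headerEnumerator.Current.GetProperty("name").ToString()] =
        headerEnumerator.Current.GetProperty("value").ToString();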
}
```

#### CLI client

input.jsonl
```json
{"url": "https://toscrape.com", "httpResponseHeaders": true}
```

```shell
zyte-api input.jsonl \
    | jq .httpResponseHeaders
```

#### curl

input.json
```json
{
    "url": "https://toscrape.com",
    "httpResponseHeaders": true
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq .httpResponseHeaders
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonArray;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of("url", "https://toscrape.com", "httpResponseHeaders", true);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          JsonArray httpResponseHeaders = jsonObject.get("httpResponseHeaders").getAsJsonArray();
          Gson gson = new GsonBuilder().setPrettyPrinting().create();
          System.out.println(gson.toJson(httpResponseHeaders));
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com',
    httpResponseHeaders: true
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseHeaders = response.data.httpResponseHeaders
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com',
        'httpResponseHeaders' => true,
    ],
]);
$api = json_decode($response->getBody());
$http_response_headers = $api->httpResponseHeaders;
```

#### Proxy mode

In proxy mode, response headers
are always included in the HTTP response, so you do not need to request them
explicitly.
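
As a rough sketch of what this looks like from client code, the helper below builds a proxy mapping for an HTTP client such as `requests`. The helper name `build_proxies` is hypothetical, and the proxy endpoint `api.zyte.com:8011` is an assumption not stated on this page; check the proxy mode documentation for the actual value.

```python
def build_proxies(api_key, endpoint="api.zyte.com:8011"):
    """Return a proxy mapping that authenticates with the API key as username.

    The default endpoint is an assumption; adjust it to match the proxy mode
    documentation.
    """
    proxy_url = f"http://{api_key}:@{endpoint}"
    return {"http": proxy_url, "https": proxy_url}


proxies = build_proxies("YOUR_ZYTE_API_KEY")
print(proxies["https"])

# With the requests library, a call could then look like this
# (the certificate path is a placeholder):
#   response = requests.get(
#       "https://toscrape.com", proxies=proxies, verify="/path/to/zyte-ca.crt"
#   )
#   response.headers  # response headers, with no extra request parameter
```

The point of the sketch is that no `httpResponseHeaders` parameter appears anywhere: the headers are read from the HTTP response object itself.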

#### Python

```python
import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://toscrape.com",
        "httpResponseHeaders": True,
    },
)
http_response_headers = api_response.json()["httpResponseHeaders"]
```

#### Python client

```python
import asyncio
import json

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://toscrape.com",
            "httpResponseHeaders": True,
        }
    )
    http_response_headers = api_response["httpResponseHeaders"]
    print(json.dumps(http_response_headers, indent=2))

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class ToScrapeComSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request(
            "https://toscrape.com",
            meta={
                "zyte_api_automap": {
                    "httpResponseBody": False,
                    "httpResponseHeaders": True,
                },
            },
        )

    def parse(self, response):
        headers = response.headers
```

> ###### NOTE
>
> In transparent mode,
> [httpResponseHeaders](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpResponseHeaders) is sent by default for
> httpResponseBody requests, but sending it
> explicitly is still recommended, as future versions of
> scrapy-zyte-api may stop sending it
> by default.

Output (first 5 lines):

```json
[
  {
    "name": "date",
    "value": "Fri, 25 Aug 2023 07:08:05 GMT"
  },
```
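
As the output shows, `httpResponseHeaders` is a list of objects with `name` and `value` keys. A minimal sketch of turning that list into a case-insensitive lookup follows. The helper name `headers_to_dict` is hypothetical, and the second sample entry is made up for illustration; neither comes from Zyte API.

```python
def headers_to_dict(http_response_headers):
    """Map lower-cased header names to lists of values.

    Values are kept in a list because a header name may appear more than once.
    """
    result = {}
    for header in http_response_headers:
        result.setdefault(header["name"].lower(), []).append(header["value"])
    return result


# Sample input in the same shape as the output above.
# The Content-Type entry is illustrative, not taken from a real response.
sample = [
    {"name": "date", "value": "Fri, 25 Aug 2023 07:08:05 GMT"},
    {"name": "Content-Type", "value": "text/html"},
]
headers = headers_to_dict(sample)
print(headers["date"][0])
```

Lower-casing the names is a design choice here: HTTP header names are case-insensitive, so a normalized key avoids lookups that fail only because of capitalization.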

#### Taking a screenshot

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://toscrape.com"},
    {"screenshot", true}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64Screenshot = data.RootElement.GetProperty("screenshot").ToString();
var screenshot = System.Convert.FromBase64String(base64Screenshot);
```

#### CLI client

input.jsonl
```json
{"url": "https://toscrape.com", "screenshot": true}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .screenshot \
    | base64 --decode \
    > screenshot.jpg
```

#### curl

input.json
```json
{
    "url": "https://toscrape.com",
    "screenshot": true
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .screenshot \
    | base64 --decode \
    > screenshot.jpg
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of("url", "https://toscrape.com", "screenshot", true);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64Screenshot = jsonObject.get("screenshot").getAsString();
          byte[] screenshot = Base64.getDecoder().decode(base64Screenshot);
          try (FileOutputStream fos = new FileOutputStream("screenshot.jpg")) {
            fos.write(screenshot);
          }
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com',
    screenshot: true
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const screenshot = Buffer.from(response.data.screenshot, 'base64')
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com',
        'screenshot' => true,
    ],
]);
$api = json_decode($response->getBody());
$screenshot = base64_decode($api->screenshot);
```

#### Python

```python
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://toscrape.com",
        "screenshot": True,
    },
)
screenshot: bytes = b64decode(api_response.json()["screenshot"])
```

#### Python client

```python
import asyncio
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://toscrape.com",
            "screenshot": True,
        }
    )
    screenshot = b64decode(api_response["screenshot"])
    with open("screenshot.jpg", "wb") as f:
        f.write(screenshot)

asyncio.run(main())
```

#### Scrapy

```python
from base64 import b64decode

from scrapy import Request, Spider

class ToScrapeComSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request(
            "https://toscrape.com",
            meta={
                "zyte_api_automap": {
                    "screenshot": True,
                },
            },
        )

    def parse(self, response):
        screenshot: bytes = b64decode(response.raw_api_response["screenshot"])
```

Output:

![](zyte-api/usage/code-examples/output/screenshot.jpg)
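
The examples above decode the base64 `screenshot` field and save it with a `.jpg` extension. As a hedged sketch, the helper below decodes the field and checks the file signature before choosing an extension. The helper name `decode_screenshot` is hypothetical, the PNG branch assumes that a different image format could be requested, and the sample bytes are a stand-in, not a real screenshot.

```python
from base64 import b64decode, b64encode

# File signatures ("magic bytes") of the two image formats handled here.
SIGNATURES = {
    b"\xff\xd8\xff": "jpg",
    b"\x89PNG\r\n\x1a\n": "png",
}


def decode_screenshot(api_response):
    """Return (image_bytes, extension) for the screenshot in an API response."""
    image = b64decode(api_response["screenshot"])
    for signature, extension in SIGNATURES.items():
        if image.startswith(signature):
            return image, extension
    raise ValueError("unrecognized image format")


# Stand-in response: JPEG signature followed by filler bytes.
fake_response = {
    "screenshot": b64encode(b"\xff\xd8\xff\xe0 filler bytes").decode()
}
image, extension = decode_screenshot(fake_response)
print(extension)
```

Checking the signature guards against writing an error payload or an unexpected format to a file named `screenshot.jpg`.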

#### Start a client-managed session with a browser request and reuse it in an HTTP request

Start a session with a browser request to the home page of a website, and reuse
that session for an HTTP request to a different URL of that website.
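
The pattern reduces to two request bodies that carry the same `session.id`. The following sketch builds only those bodies, without sending them; the helper name `build_session_requests` is hypothetical and not part of any Zyte client.

```python
from uuid import uuid4


def build_session_requests(browser_url, http_url):
    """Return two Zyte API request bodies that share one client-managed session."""
    session_id = str(uuid4())
    browser_request = {
        "url": browser_url,
        "browserHtml": True,
        "session": {"id": session_id},
    }
    http_request = {
        "url": http_url,
        "httpResponseBody": True,
        "session": {"id": session_id},
    }
    return browser_request, http_request


browser_request, http_request = build_session_requests(
    "https://toscrape.com/", "https://books.toscrape.com/"
)
# Both bodies must reference the same session ID for the session to be reused.
print(browser_request["session"]["id"] == http_request["session"]["id"])
```

The browser request must be sent first and must complete before the HTTP request is sent, which is why the examples that follow issue the two requests sequentially.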

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var sessionId = Guid.NewGuid().ToString();

var browserInput = new Dictionary<string, object>(){
    {"url", "https://toscrape.com/"},
    {"browserHtml", true},
    {
        "session",
        new Dictionary<string, string>()
        {
            {"id", sessionId}
        }
    }
};
var browserInputJson = JsonSerializer.Serialize(browserInput);
var browserContent = new StringContent(browserInputJson, Encoding.UTF8, "application/json");
await client.PostAsync("https://api.zyte.com/v1/extract", browserContent);

var httpInput = new Dictionary<string, object>(){
    {"url", "https://books.toscrape.com/"},
    {"httpResponseBody", true},
    {
        "session",
        new Dictionary<string, string>()
        {
            {"id", sessionId}
        }
    }
};
var httpInputJson = JsonSerializer.Serialize(httpInput);
var httpContent = new StringContent(httpInputJson, Encoding.UTF8, "application/json");
HttpResponseMessage httpResponse = await client.PostAsync("https://api.zyte.com/v1/extract", httpContent);
var httpResponseBody = await httpResponse.Content.ReadAsByteArrayAsync();
var httpData = JsonDocument.Parse(httpResponseBody);
var base64HttpResponseBodyField = httpData.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBodyField = System.Convert.FromBase64String(base64HttpResponseBodyField);
var result = System.Text.Encoding.UTF8.GetString(httpResponseBodyField);

Console.WriteLine(result);
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import java.util.UUID;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    String sessionId = UUID.randomUUID().toString();
    Map<String, Object> session = ImmutableMap.of("id", sessionId);

    Map<String, Object> browserParameters =
        ImmutableMap.of("url", "https://toscrape.com/", "browserHtml", true, "session", session);
    String browserRequestBody = new Gson().toJson(browserParameters);

    HttpPost browserRequest = new HttpPost("https://api.zyte.com/v1/extract");
    browserRequest.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    browserRequest.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    browserRequest.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    browserRequest.setEntity(new StringEntity(browserRequestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        browserRequest,
        browserResponse -> {
          Map<String, Object> httpParameters =
              ImmutableMap.of(
                  "url",
                  "https://books.toscrape.com/",
                  "httpResponseBody",
                  true,
                  "session",
                  session);
          String httpRequestBody = new Gson().toJson(httpParameters);

          HttpPost httpRequest = new HttpPost("https://api.zyte.com/v1/extract");
          httpRequest.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
          httpRequest.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
          httpRequest.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
          httpRequest.setEntity(new StringEntity(httpRequestBody));

          client.execute(
              httpRequest,
              httpResponse -> {
                HttpEntity httpEntity = httpResponse.getEntity();
                String httpApiResponse = EntityUtils.toString(httpEntity, StandardCharsets.UTF_8);
                JsonObject httpJsonObject =
                    JsonParser.parseString(httpApiResponse).getAsJsonObject();
                String base64HttpResponseBody =
                    httpJsonObject.get("httpResponseBody").getAsString();
                byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
                String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
                System.out.println(httpResponseBody);
                return null;
              });

          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')
const crypto = require('crypto')

const sessionId = String(crypto.randomUUID())

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com/',
    browserHtml: true,
    session: { id: sessionId }
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((browserResponse) => {
  axios.post(
    'https://api.zyte.com/v1/extract',
    {
      url: 'https://books.toscrape.com/',
      httpResponseBody: true,
      session: { id: sessionId }
    },
    {
      auth: { username: 'YOUR_ZYTE_API_KEY' }
    }
  ).then((httpResponse) => {
    const httpResponseBody = Buffer.from(
      httpResponse.data.httpResponseBody,
      'base64'
    )
    console.log(httpResponseBody.toString())
  })
})
```

#### PHP

```php
<?php

// https://stackoverflow.com/a/15875555
function uuidv4()
{
    $data = random_bytes(16);

    $data[6] = chr(ord($data[6]) & 0x0F | 0x40); // set version to 0100
    $data[8] = chr(ord($data[8]) & 0x3F | 0x80); // set bits 6-7 to 10

    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($data), 4));
}

$client = new GuzzleHttp\Client();
$session_id = uuidv4();

$browser_response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com/',
        'browserHtml' => true,
        'session' => ['id' => $session_id],
    ],
]);
$http_response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://books.toscrape.com/',
        'httpResponseBody' => true,
        'session' => ['id' => $session_id],
    ],
]);
$http_data = json_decode($http_response->getBody());
$http_response_body = base64_decode($http_data->httpResponseBody);
echo $http_response_body;
```

#### Python

```python
from base64 import b64decode
from uuid import uuid4

import requests

session_id = str(uuid4())

browser_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://toscrape.com/",
        "browserHtml": True,
        "session": {"id": session_id},
    },
)
http_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://books.toscrape.com/",
        "httpResponseBody": True,
        "session": {"id": session_id},
    },
)
http_response_body = b64decode(http_response.json()["httpResponseBody"])
print(http_response_body.decode())
```

#### Python client

```python
import asyncio
from base64 import b64decode
from uuid import uuid4

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    session_id = str(uuid4())
    browser_response = await client.get(
        {
            "url": "https://toscrape.com/",
            "browserHtml": True,
            "session": {"id": session_id},
        }
    )
    http_response = await client.get(
        {
            "url": "https://books.toscrape.com/",
            "httpResponseBody": True,
            "session": {"id": session_id},
        }
    )
    http_response_body = b64decode(http_response["httpResponseBody"]).decode()
    print(http_response_body)

asyncio.run(main())
```

#### Scrapy

```python
from uuid import uuid4

from scrapy import Request, Spider

class ToScrapeComSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        session_id = str(uuid4())
        yield Request(
            "https://toscrape.com/",
            callback=self.parse_browser,
            cb_kwargs={"session_id": session_id},
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                    "session": {"id": session_id},
                },
            },
        )

    def parse_browser(self, response, session_id):
        yield response.follow(
            "https://books.toscrape.com/",
            callback=self.parse_http,
            meta={
                "zyte_api_automap": {
                    "session": {"id": session_id},
                },
            },
        )

    def parse_http(self, response):
        print(response.text)
```

#### Send HTTP requests with server-managed sessions started with browser requests

Set a no-op action in [sessionContextParameters](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/sessionContextParameters) to force
sessions to start with a browser request, while the requests themselves remain
HTTP requests.
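
As a sketch of the request body this produces, the helper below assembles the parameters used throughout this section. The helper name `build_request` and its `context_id` argument are hypothetical; the field names and the `waitForTimeout` action with a zero timeout are the ones used in the examples that follow.

```python
import json


def build_request(url, context_id="browser"):
    """Build a request body for an HTTP request whose server-managed
    session is started with a browser request."""
    return {
        "url": url,
        # The request itself is a plain HTTP request.
        "httpResponseBody": True,
        "sessionContext": [{"name": "id", "value": context_id}],
        "sessionContextParameters": {
            # Waiting for 0 does nothing, but defining an action forces
            # the session to start with a browser request.
            "actions": [{"action": "waitForTimeout", "timeout": 0}],
        },
    }


request_body = build_request("https://toscrape.com/")
print(json.dumps(request_body, indent=4))
```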

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://toscrape.com/"},
    {"httpResponseBody", true},
    {
        "sessionContext",
        new List<Dictionary<string, string>>()
        {
            new Dictionary<string, string>()
            {
                {"name", "id"},
                {"value", "browser"}
            }
        }
    },
    {
        "sessionContextParameters",
        new Dictionary<string, object>()
        {
            {
                "actions",
                new List<Dictionary<string, object>>()
                {
                    new Dictionary<string, object>()
                    {
                        {"action", "waitForTimeout"},
                        {"timeout", 0},
                    }
                }
            }
        }
    }
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBodyBytes = System.Convert.FromBase64String(base64HttpResponseBody);
var httpResponseBody = System.Text.Encoding.UTF8.GetString(httpResponseBodyBytes);

Console.WriteLine(httpResponseBody);
```

#### CLI client

input.jsonl
```json
{"url": "https://toscrape.com/", "httpResponseBody": true, "sessionContext": [{"name": "id", "value": "browser"}], "sessionContextParameters": {"actions": [{"action": "waitForTimeout", "timeout": 0}]}}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode
```

#### curl

input.json
```json
{
    "url": "https://toscrape.com/",
    "httpResponseBody": true,
    "sessionContext": [
        {
            "name": "id",
            "value": "browser"
        }
    ],
    "sessionContextParameters": {
        "actions": [
            {
                "action": "waitForTimeout",
                "timeout": 0
            }
        ]
    }
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .httpResponseBody \
    | base64 --decode
```

#### Java

```java
import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {

  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://toscrape.com/",
            "httpResponseBody",
            true,
            "sessionContext",
            ImmutableList.of(ImmutableMap.of("name", "id", "value", "browser")),
            "sessionContextParameters",
            ImmutableMap.of(
                "actions",
                ImmutableList.of(ImmutableMap.of("action", "waitForTimeout", "timeout", 0))));

    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          System.out.println(httpResponseBody);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com/',
    httpResponseBody: true,
    sessionContext: [
      {
        name: 'id',
        value: 'browser'
      }
    ],
    sessionContextParameters: {
      actions: [
        {
          action: 'waitForTimeout',
          timeout: 0
        }
      ]
    }
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  console.log(httpResponseBody.toString())
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com/',
        'httpResponseBody' => true,
        'sessionContext' => [
            [
                'name' => 'id',
                'value' => 'browser',
            ],
        ],
        'sessionContextParameters' => [
            'actions' => [
                [
                    'action' => 'waitForTimeout',
                    'timeout' => 0,
                ],
            ],
        ],
    ],
]);
$data = json_decode($response->getBody());
$http_response_body = base64_decode($data->httpResponseBody);
echo $http_response_body.PHP_EOL;
```

#### Python

```python
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://toscrape.com/",
        "httpResponseBody": True,
        "sessionContext": [{"name": "id", "value": "browser"}],
        "sessionContextParameters": {
            "actions": [
                {
                    "action": "waitForTimeout",
                    "timeout": 0,
                },
            ],
        },
    },
)
http_response_body_bytes = b64decode(api_response.json()["httpResponseBody"])
http_response_body = http_response_body_bytes.decode()
print(http_response_body)
```

#### Python client

```python
import asyncio
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    http_response = await client.get(
        {
            "url": "https://toscrape.com/",
            "httpResponseBody": True,
            "sessionContext": [{"name": "id", "value": "browser"}],
            "sessionContextParameters": {
                "actions": [
                    {
                        "action": "waitForTimeout",
                        "timeout": 0,
                    },
                ],
            },
        }
    )
    http_response_body = b64decode(http_response["httpResponseBody"]).decode()
    print(http_response_body)

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class ToScrapeComSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request(
            "https://toscrape.com/",
            meta={
                "zyte_api_automap": {
                    "sessionContext": [
                        {
                            "name": "id",
                            "value": "browser",
                        },
                    ],
                    "sessionContextParameters": {
                        "actions": [
                            {
                                "action": "waitForTimeout",
                                "timeout": 0,
                            },
                        ],
                    },
                },
            },
        )

    def parse(self, response):
        print(response.text)
```

#### Send HTTP requests with server-managed sessions started with a browser action that visits a specific URL

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "http://httpbin.org/cookies"},
    {"httpResponseBody", true},
    {
        "sessionContext",
        new List<Dictionary<string, string>>()
        {
            new Dictionary<string, string>()
            {
                {"name", "id"},
                {"value", "cookies"}
            }
        }
    },
    {
        "sessionContextParameters",
        new Dictionary<string, object>()
        {
            {
                "actions",
                new List<Dictionary<string, object>>()
                {
                    new Dictionary<string, object>()
                    {
                        {"action", "goto"},
                        {"url", "http://httpbin.org/cookies/set/foo/bar"},
                    }
                }
            }
        }
    }
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBodyBytes = System.Convert.FromBase64String(base64HttpResponseBody);
var httpResponseBody = System.Text.Encoding.UTF8.GetString(httpResponseBodyBytes);

Console.WriteLine(httpResponseBody);
```

#### CLI client

input.jsonl
```json
{"url": "http://httpbin.org/cookies", "httpResponseBody": true, "sessionContext": [{"name": "id", "value": "cookies"}], "sessionContextParameters": {"actions": [{"action": "goto", "url": "http://httpbin.org/cookies/set/foo/bar"}]}}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode
```

#### curl

input.json
```json
{
    "url": "http://httpbin.org/cookies",
    "httpResponseBody": true,
    "sessionContext": [
        {
            "name": "id",
            "value": "cookies"
        }
    ],
    "sessionContextParameters": {
        "actions": [
            {
                "action": "goto",
                "url": "http://httpbin.org/cookies/set/foo/bar"
            }
        ]
    }
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .httpResponseBody \
    | base64 --decode
```

#### Java

```java
import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {

  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "http://httpbin.org/cookies",
            "httpResponseBody",
            true,
            "sessionContext",
            ImmutableList.of(ImmutableMap.of("name", "id", "value", "cookies")),
            "sessionContextParameters",
            ImmutableMap.of(
                "actions",
                ImmutableList.of(
                    ImmutableMap.of(
                        "action", "goto", "url", "http://httpbin.org/cookies/set/foo/bar"))));

    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          System.out.println(httpResponseBody);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'http://httpbin.org/cookies',
    httpResponseBody: true,
    sessionContext: [
      {
        name: 'id',
        value: 'cookies'
      }
    ],
    sessionContextParameters: {
      actions: [
        {
          action: 'goto',
          url: 'http://httpbin.org/cookies/set/foo/bar'
        }
      ]
    }
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
  console.log(httpResponseBody.toString())
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'http://httpbin.org/cookies',
        'httpResponseBody' => true,
        'sessionContext' => [
            [
                'name' => 'id',
                'value' => 'cookies',
            ],
        ],
        'sessionContextParameters' => [
            'actions' => [
                [
                    'action' => 'goto',
                    'url' => 'http://httpbin.org/cookies/set/foo/bar',
                ],
            ],
        ],
    ],
]);
$data = json_decode($response->getBody());
$http_response_body = base64_decode($data->httpResponseBody);
echo $http_response_body.PHP_EOL;
```

#### Python

```python
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "http://httpbin.org/cookies",
        "httpResponseBody": True,
        "sessionContext": [
            {
                "name": "id",
                "value": "cookies",
            },
        ],
        "sessionContextParameters": {
            "actions": [
                {
                    "action": "goto",
                    "url": "http://httpbin.org/cookies/set/foo/bar",
                },
            ],
        },
    },
)
http_response_body_bytes = b64decode(api_response.json()["httpResponseBody"])
http_response_body = http_response_body_bytes.decode()
print(http_response_body)
```

#### Python client

```python
import asyncio
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "http://httpbin.org/cookies",
            "httpResponseBody": True,
            "sessionContext": [
                {
                    "name": "id",
                    "value": "cookies",
                },
            ],
            "sessionContextParameters": {
                "actions": [
                    {
                        "action": "goto",
                        "url": "http://httpbin.org/cookies/set/foo/bar",
                    },
                ],
            },
        },
    )
    http_response_body_bytes = b64decode(api_response["httpResponseBody"])
    http_response_body = http_response_body_bytes.decode()
    print(http_response_body)

asyncio.run(main())
```

#### Scrapy

> ###### TIP
>
> scrapy-zyte-api also provides its own session management
> API, similar to that of
> server-managed sessions, but
> built on top of client-managed sessions.

```python
from scrapy import Request, Spider

class HTTPBinOrgSpider(Spider):
    name = "httpbin_org"

    async def start(self):
        yield Request(
            "http://httpbin.org/cookies",
            meta={
                "zyte_api_automap": {
                    "sessionContext": [
                        {
                            "name": "id",
                            "value": "cookies",
                        },
                    ],
                    "sessionContextParameters": {
                        "actions": [
                            {
                                "action": "goto",
                                "url": "http://httpbin.org/cookies/set/foo/bar",
                            },
                        ],
                    },
                },
            },
        )

    def parse(self, response):
        print(response.text)
```

Output:

```json
{
  "cookies": {
    "foo": "bar"
  }
}
```

#### Send 2 consecutive requests through the same IP address using a client-managed session

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var sessionId = Guid.NewGuid().ToString();

for (int i = 0; i < 2; i++)
{
    var input = new Dictionary<string, object>(){
        {"url", "https://httpbin.org/ip"},
        {"httpResponseBody", true},
        {
            "session",
            new Dictionary<string, string>()
            {
                {"id", sessionId}
            }
        }
    };
    var inputJson = JsonSerializer.Serialize(input);
    var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

    HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
    var body = await response.Content.ReadAsByteArrayAsync();

    var data = JsonDocument.Parse(body);
    var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
    var httpResponseBodyBytes = System.Convert.FromBase64String(base64HttpResponseBody);
    var httpResponseBody = System.Text.Encoding.UTF8.GetString(httpResponseBodyBytes);

    var responseData = JsonDocument.Parse(httpResponseBody);
    var ipAddress = responseData.RootElement.GetProperty("origin").ToString();

    Console.WriteLine(ipAddress);
}
```

#### CLI client

input.jsonl
```json
{"url": "https://httpbin.org/ip", "httpResponseBody": true, "session": {"id": "e07843b4-fd72-4a02-82b4-3376c6ceba92"}}
{"url": "https://httpbin.org/ip", "httpResponseBody": true, "session": {"id": "e07843b4-fd72-4a02-82b4-3376c6ceba92"}}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    | jq --raw-output .origin
```

#### curl

input.json
```json
{
    "url": "https://httpbin.org/ip",
    "httpResponseBody": true,
    "session": {
        "id": "e07843b4-fd72-4a02-82b4-3376c6ceba92"
    }
}
```

```shell
for i in {1..2}
do
    curl \
        --user YOUR_ZYTE_API_KEY: \
        --header 'Content-Type: application/json' \
        --data @input.json \
        --compressed \
        https://api.zyte.com/v1/extract \
        | jq --raw-output .httpResponseBody \
        | base64 --decode \
        | jq --raw-output .origin
done
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import java.util.UUID;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    String sessionId = UUID.randomUUID().toString();
    CloseableHttpClient client = HttpClients.createDefault();

    for (int i = 0; i < 2; i++) {
      Map<String, Object> session = ImmutableMap.of("id", sessionId);
      Map<String, Object> parameters =
          ImmutableMap.of(
              "url", "https://httpbin.org/ip", "httpResponseBody", true, "session", session);
      String requestBody = new Gson().toJson(parameters);

      HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
      request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
      request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
      request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
      request.setEntity(new StringEntity(requestBody));

      client.execute(
          request,
          response -> {
            HttpEntity entity = response.getEntity();
            String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
            JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
            String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
            byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
            String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
            JsonObject data = JsonParser.parseString(httpResponseBody).getAsJsonObject();
            String body = data.get("origin").getAsString();
            System.out.println(body);
            return null;
          });
    }
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')
const crypto = require('crypto')

const sessionId = crypto.randomUUID()

async function main () {
  // Send both requests one after the other, reusing the same session ID.
  for (let i = 0; i < 2; i++) {
    const response = await axios.post(
      'https://api.zyte.com/v1/extract',
      {
        url: 'https://httpbin.org/ip',
        httpResponseBody: true,
        session: { id: sessionId }
      },
      {
        auth: { username: 'YOUR_ZYTE_API_KEY' }
      }
    )
    const httpResponseBody = Buffer.from(
      response.data.httpResponseBody,
      'base64'
    )
    const body = JSON.parse(httpResponseBody.toString()).origin
    console.log(body)
  }
}

main()
```

#### PHP

```php
<?php

// https://stackoverflow.com/a/15875555
function uuidv4()
{
    $data = random_bytes(16);

    $data[6] = chr(ord($data[6]) & 0x0F | 0x40); // set version to 0100
    $data[8] = chr(ord($data[8]) & 0x3F | 0x80); // set bits 6-7 to 10

    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($data), 4));
}

$client = new GuzzleHttp\Client();
$session_id = uuidv4();

for ($i = 0; $i < 2; ++$i) {
    $response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
        'auth' => ['YOUR_ZYTE_API_KEY', ''],
        'headers' => ['Accept-Encoding' => 'gzip'],
        'json' => [
            'url' => 'https://httpbin.org/ip',
            'httpResponseBody' => true,
            'session' => ['id' => $session_id],
        ],
    ]);
    $data = json_decode($response->getBody());
    $http_response_body = base64_decode($data->httpResponseBody);
    $body = json_decode($http_response_body)->origin;
    echo $body.PHP_EOL;
}
```

#### Proxy mode

In proxy mode, send the session ID in the
`Zyte-Session-ID` header.

```shell
for i in {1..2}
do
    curl \
        --proxy api.zyte.com:8011 \
        --proxy-user YOUR_ZYTE_API_KEY: \
        --header 'Content-Type: application/json' \
        --header 'Zyte-Session-ID: e07843b4-fd72-4a02-82b4-3376c6ceba92' \
        --compressed \
        https://httpbin.org/ip \
        | jq --raw-output .origin
done
```
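
The same proxy-mode request can be prepared from Python. The following is a minimal sketch, not an official client: the helper name `proxy_request_kwargs` and the certificate path are hypothetical, and it assumes the [requests](https://requests.readthedocs.io/) library and a locally saved Zyte CA certificate. It only builds the keyword arguments, so you can inspect them before sending anything.

```python
from uuid import uuid4


def proxy_request_kwargs(api_key: str, session_id: str, ca_cert_path: str) -> dict:
    """Build keyword arguments for requests.get() in Zyte API proxy mode."""
    # Same endpoint and credentials as the curl example: the API key is the
    # proxy user name and the password is empty.
    proxy_url = f"http://{api_key}:@api.zyte.com:8011"
    return {
        "proxies": {"http": proxy_url, "https": proxy_url},
        # Reusing the same header value across requests keeps them in one session.
        "headers": {"Zyte-Session-ID": session_id},
        # Assumption: path where you saved the Zyte CA certificate.
        "verify": ca_cert_path,
    }


# Hypothetical values, for illustration only.
kwargs = proxy_request_kwargs(
    "YOUR_ZYTE_API_KEY", str(uuid4()), "/path/to/zyte-ca.crt"
)
```

Calling `requests.get("https://httpbin.org/ip", **kwargs)` twice with the same `kwargs` should then report the same `origin` both times, as in the curl loop.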

#### Python

```python
import json
from base64 import b64decode
from uuid import uuid4

import requests

session_id = str(uuid4())

for _ in range(2):
    api_response = requests.post(
        "https://api.zyte.com/v1/extract",
        auth=("YOUR_ZYTE_API_KEY", ""),
        json={
            "url": "https://httpbin.org/ip",
            "httpResponseBody": True,
            "session": {"id": session_id},
        },
    )
    http_response_body = b64decode(api_response.json()["httpResponseBody"])
    body: str = json.loads(http_response_body)["origin"]
    print(body)
```

#### Python client

```python
import asyncio
import json
from base64 import b64decode
from uuid import uuid4

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    session_id = str(uuid4())
    for _ in range(2):
        api_response = await client.get(
            {
                "url": "https://httpbin.org/ip",
                "httpResponseBody": True,
                "session": {"id": session_id},
            },
        )
        http_response_body = b64decode(api_response["httpResponseBody"]).decode()
        data = json.loads(http_response_body)
        print(data["origin"])

asyncio.run(main())
```

#### Scrapy

> ###### TIP
>
> scrapy-zyte-api also provides its own session management
> API, similar to that of
> server-managed sessions, but
> built on top of client-managed sessions.

```python
import json
from uuid import uuid4

from scrapy import Request, Spider

class HTTPBinOrgSpider(Spider):
    name = "httpbin_org"

    async def start(self):
        session_id = str(uuid4())
        yield Request(
            "https://httpbin.org/ip",
            cb_kwargs={"session_id": session_id},
            meta={"zyte_api_automap": {"session": {"id": session_id}}},
        )

    def parse(self, response, session_id):
        print(json.loads(response.body)["origin"])
        yield Request(
            "https://httpbin.org/ip",
            meta={"zyte_api_automap": {"session": {"id": session_id}}},
            dont_filter=True,
            callback=self.parse2,
        )

    def parse2(self, response):
        print(json.loads(response.body)["origin"])
```

Output:

```none
203.0.113.122
203.0.113.122
```
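
If you collect the API responses instead of printing them, you can check programmatically that the session kept the same IP address. The helpers below are an illustrative sketch (the names `origin_from_api_response` and `used_same_ip` are not part of any Zyte library). They assume each response is a parsed Zyte API JSON response for `https://httpbin.org/ip` that includes `httpResponseBody`.

```python
import json
from base64 import b64decode, b64encode


def origin_from_api_response(api_response: dict) -> str:
    """Return the IP address that httpbin.org/ip reported for one response."""
    # httpResponseBody is Base64-encoded; decoding it yields the JSON body
    # returned by httpbin.org/ip, e.g. {"origin": "203.0.113.122"}.
    body = b64decode(api_response["httpResponseBody"])
    return json.loads(body)["origin"]


def used_same_ip(api_responses: list[dict]) -> bool:
    """Return True if every response reports the same origin IP address."""
    origins = {origin_from_api_response(r) for r in api_responses}
    return len(origins) == 1


# Offline demo with a hand-built response; no API request is sent.
sample = {
    "httpResponseBody": b64encode(b'{"origin": "203.0.113.122"}').decode()
}
print(used_same_ip([sample, sample]))
```

In real use you would pass the parsed responses from the loop, for example the `api_response.json()` values from the Python example.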

#### Access a [shadow DOM](https://developer.mozilla.org/en-US/docs/Web/Web_Components/Using_shadow_DOM)

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

To get content from the [shadow DOM](https://developer.mozilla.org/en-US/docs/Web/Web_Components/Using_shadow_DOM), use the `evaluate` action to create an
invisible DOM element, which you will get in [browserHtml](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/browserHtml), and
fill it with the desired content from the shadow DOM.

> ###### TIP
>
> If your `evaluate` action does not work as expected, check the
> [actions](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/actions) response field for errors.

The following example code shows how to access the shadow DOM paragraph from
[a shadow DOM example in CodePen](https://cdpn.io/TLadd/fullpage/PoGoQeV?anon=true&view=) using the
`evaluate` action with the following `source`:

```js
const div = document.createElement('div')
div.setAttribute('id', 'shadow-root-content')
// Hide, in case you also want to take a screenshot.
div.style.display = 'none'
const iframe = document.getElementById('result')
div.innerText = iframe
  .contentWindow.document
  .getElementById('shadow-root')
  .shadowRoot.querySelector('p').textContent
document.body.appendChild(div)
```

#### C#

```cs
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using System.Xml.XPath;
using HtmlAgilityPack;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://cdpn.io/TLadd/fullpage/PoGoQeV?anon=true&view="},
    {"browserHtml", true},
    {
        "actions",
        new List<Dictionary<string, object>>()
        {
            new Dictionary<string, object>()
            {
                {"action", "evaluate"},
                {"source", @"
                  const div = document.createElement('div')
                  div.setAttribute('id', 'shadow-root-content')
                  div.style.display = 'none'
                  const iframe = document.getElementById('result')
                  div.innerText = iframe
                    .contentWindow.document
                    .getElementById('shadow-root')
                    .shadowRoot.querySelector('p').textContent
                  document.body.appendChild(div)
                "}
            }
        }
    }
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var browserHtml = data.RootElement.GetProperty("browserHtml").ToString();
var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(browserHtml);
var navigator = htmlDocument.CreateNavigator();
var nodeIterator = (XPathNodeIterator)navigator.Evaluate("//*[@id=\"shadow-root-content\"]/text()");
nodeIterator.MoveNext();
var shadowText = nodeIterator.Current.ToString();
Console.WriteLine(shadowText);
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Collections;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class Example {

  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> actions =
        ImmutableMap.of(
            "action",
            "evaluate",
            "source",
            "const div = document.createElement('div')\n"
                + "div.setAttribute('id', 'shadow-root-content')\n"
                + "div.style.display = 'none'\n"
                + "const iframe = document.getElementById('result')\n"
                + "div.innerText = iframe\n"
                + "  .contentWindow.document\n"
                + "  .getElementById('shadow-root')\n"
                + "  .shadowRoot.querySelector('p').textContent\n"
                + "document.body.appendChild(div)");
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://cdpn.io/TLadd/fullpage/PoGoQeV?anon=true&view=",
            "browserHtml",
            true,
            "actions",
            Collections.singletonList(actions));
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String browserHtml = jsonObject.get("browserHtml").getAsString();
          Document document = Jsoup.parse(browserHtml);
          String shadowText = document.select("#shadow-root-content").text();
          System.out.println(shadowText);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')
const cheerio = require('cheerio')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://cdpn.io/TLadd/fullpage/PoGoQeV?anon=true&view=',
    browserHtml: true,
    actions: [
      {
        action: 'evaluate',
        source: `
          const div = document.createElement('div')
          div.setAttribute('id', 'shadow-root-content')
          div.style.display = 'none'
          const iframe = document.getElementById('result')
          div.innerText = iframe
            .contentWindow.document
            .getElementById('shadow-root')
            .shadowRoot.querySelector('p').textContent
          document.body.appendChild(div)
        `
      }
    ]
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const browserHtml = response.data.browserHtml
  const $ = cheerio.load(browserHtml)
  const shadowText = $('#shadow-root-content').text()
  console.log(shadowText)
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://cdpn.io/TLadd/fullpage/PoGoQeV?anon=true&view=',
        'browserHtml' => true,
        'actions' => [
            [
                'action' => 'evaluate',
                'source' => "
                    const div = document.createElement('div')
                    div.setAttribute('id', 'shadow-root-content')
                    div.style.display = 'none'
                    const iframe = document.getElementById('result')
                    div.innerText = iframe
                      .contentWindow.document
                      .getElementById('shadow-root')
                      .shadowRoot.querySelector('p').textContent
                    document.body.appendChild(div)
                ",
            ],
        ],
    ],
]);
$data = json_decode($response->getBody());
$doc = new DOMDocument();
$doc->loadHTML($data->browserHtml);
$xpath = new DOMXPath($doc);
$shadow_text = $xpath->query("//*[@id='shadow-root-content']")->item(0)->textContent;
echo $shadow_text.PHP_EOL;
```

#### Python

```python
import requests
from parsel import Selector

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://cdpn.io/TLadd/fullpage/PoGoQeV?anon=true&view=",
        "browserHtml": True,
        "actions": [
            {
                "action": "evaluate",
                "source": """
                    const div = document.createElement('div')
                    div.setAttribute('id', 'shadow-root-content')
                    div.style.display = 'none'
                    const iframe = document.getElementById('result')
                    div.innerText = iframe
                      .contentWindow.document
                      .getElementById('shadow-root')
                      .shadowRoot.querySelector('p').textContent
                    document.body.appendChild(div)
                """,
            },
        ],
    },
)
browser_html = api_response.json()["browserHtml"]
shadow_text = Selector(browser_html).css("#shadow-root-content::text").get()
print(shadow_text)
```

#### Python client

```python
import asyncio

from parsel import Selector
from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://cdpn.io/TLadd/fullpage/PoGoQeV?anon=true&view=",
            "browserHtml": True,
            "actions": [
                {
                    "action": "evaluate",
                    "source": """
                        const div = document.createElement('div')
                        div.setAttribute('id', 'shadow-root-content')
                        div.style.display = 'none'
                        const iframe = document.getElementById('result')
                        div.innerText = iframe
                          .contentWindow.document
                          .getElementById('shadow-root')
                          .shadowRoot.querySelector('p').textContent
                        document.body.appendChild(div)
                    """,
                },
            ],
        },
    )
    browser_html = api_response["browserHtml"]
    shadow_text = Selector(browser_html).css("#shadow-root-content::text").get()
    print(shadow_text)

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class CodePenSpider(Spider):
    name = "codepen"

    async def start(self):
        yield Request(
            "https://cdpn.io/TLadd/fullpage/PoGoQeV?anon=true&view=",
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                    "actions": [
                        {
                            "action": "evaluate",
                            "source": """
                                const div = document.createElement('div')
                                div.setAttribute('id', 'shadow-root-content')
                                div.style.display = 'none'
                                const iframe = document.getElementById('result')
                                div.innerText = iframe
                                  .contentWindow.document
                                  .getElementById('shadow-root')
                                  .shadowRoot.querySelector('p').textContent
                                document.body.appendChild(div)
                            """,
                        },
                    ],
                },
            },
        )

    def parse(self, response):
        shadow_text = response.css("#shadow-root-content::text").get()
        print(shadow_text)
```

Output:

```none
Shadow Paragraph
```
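
When the `evaluate` action does not produce the `shadow-root-content` element, the `actions` response field is the first place to look. The sketch below shows one way to filter it. It assumes that you also requested `actions` output and that a failed action entry carries an `error` key next to its `action` name. Treat that shape, the helper name `failed_actions`, and the sample data as assumptions to check against the API reference.

```python
def failed_actions(api_response: dict) -> list[dict]:
    """Return the entries of the ``actions`` response field that report an error."""
    # Assumption: a failed action includes a non-empty "error" message.
    return [
        action
        for action in api_response.get("actions", [])
        if action.get("error")
    ]


# Hand-written sample response, for illustration only.
sample_response = {
    "browserHtml": "<html></html>",
    "actions": [
        {"action": "evaluate", "error": "TypeError: iframe is null"},
    ],
}

for action in failed_actions(sample_response):
    print(f"{action['action']} failed: {action['error']}")
```

An empty list means that no action reported an error, so a missing element is more likely caused by the selector or by the page content itself.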

## Using the Zyte IDE

![image](zyte-api/ide/images/open-ide.png)

To open the **Zyte IDE**, select **Zyte API › Zyte IDE** in the sidebar of the
[Zyte dashboard](https://app.zyte.com/).

The **Zyte IDE** lets you:

- Write, debug,
  and deploy browser scripts, written using
  our [TypeScript](https://www.typescriptlang.org/) scripting API, to use as
  actions in browser requests.
- Build Zyte API requests visually, and debug existing requests.

### Zyte IDE requirements

To use the Zyte IDE you need a modern browser.

You also need to enable third-party cookies on the `zyte.group` domain. If
your browser blocks them, you will see an error like the following:

> Error loading webview: Error: Could not register service workers:
> NotSupportedError: Failed to register a ServiceWorker for scope …

### Browser script advantages

Browser scripts are written using the scripting API, a [TypeScript](https://www.typescriptlang.org/) API that exposes Zyte API actions.

The main advantage of browser scripts over action sequences is support for a
non-linear flow: [TypeScript](https://www.typescriptlang.org/) allows using conditional statements, loops, and so
on. For example, in a browser script you can check if an element is present in
a webpage, and run different actions depending on that.

Browser scripts also allow accessing [iframe](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe) elements through
`Page.getIframe()`.
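
As a minimal sketch of such a non-linear flow, the helper below clicks a button only if it is present on the page. The `PageLike` interface is a local stand-in that mirrors the documented signatures of `Page.querySelector()` and `Page.click()` so that the snippet compiles on its own, and the `#accept-cookies` selector is hypothetical. In a real browser script you would likely call such a helper from `do()` with the actual `Page` object.

```typescript
// Local stand-in for the subset of the scripting API Page interface used
// below, so that this sketch compiles outside the Zyte IDE.
interface PageLike {
    querySelector(selector: string): Promise<object | null | undefined>;
    click(selector: string): Promise<void>;
}

// Hypothetical selector of a cookie banner button on the target website.
const ACCEPT_BUTTON = "#accept-cookies";

// Click the banner button only if it exists, and report whether a click
// happened, so that the caller can branch on the outcome.
async function dismissBannerIfPresent(page: PageLike): Promise<boolean> {
    const button = await page.querySelector(ACCEPT_BUTTON);
    if (!button) {
        // No banner on this page: nothing to do.
        return false;
    }
    await page.click(ACCEPT_BUTTON);
    return true;
}
```

The falsy check covers both `null` and `undefined`, so the helper does not depend on which of the two the API resolves to when nothing matches.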

### `src/` and `dist/`

In the ![Explorer](zyte-api/ide/images/explorer.png) [Explorer](https://code.visualstudio.com/docs/getstarted/userinterface#_explorer) view of the Zyte IDE you get two
folders:

- `src/` is the development folder, where you develop your browser scripts.

  Changes to this folder are saved on the Zyte IDE cloud storage system.

  Multiple developers can work on this folder at the same time, and see the
  work of one another in real time.
- `dist/` is the deployment folder.

  It contains files built from the last version of `src/` that has been
  deployed to Zyte API.

  You should never write anything in the `dist/` folder, because all its
  contents are removed during deployment.

### Creating a new script

To create a new browser script:

1. Open the Zyte IDE.
2. Select ![Application Menu](zyte-api/ide/images/menu.png) (top-left corner) **› File › New File…**.
3. On the **Create New…** dialog, select **Smart Browser Interaction**.
4. On the **Use Case** dialog, select one of the following:
   - One of the special action interaction classes, to extend it in your new
     script.
   - **Others**, to create a script from scratch.
   - **Utils**, to create a file with utility code that you can reuse from
     browser scripts and other utility code files.
5. On the **Domain** dialog, enter the domain of the website for which you are
   writing the script (e.g. `toscrape.com`).

A TypeScript file is created in the `src/` folder based on your input data,
and will open on a tab of the Zyte IDE.

For example, if you select the **Others** use case and the `toscrape.com`
domain, a `src/Others/toscrape.com.ts` file is created with the following
content:

```typescript
import { BaseInteraction, Page } from "smartbrowser-core-interactions/index.ts";

interface Args {
    // add your arguments here
    // arg1: string;
}

export default class Interaction extends BaseInteraction {
    domains = ["example.com"];
    async do(page: Page, args: Args): Promise<void> {
        // implement your logic here
    }
}
```

You can now use the scripting API to implement
your browser script, and then debug and,
eventually, deploy that browser script.
To get started, try out our example browser scripts!

### Debugging a script

After you create an interaction class, you can run it on a webpage from the
Zyte IDE to see how it works:

1. Select ![Run Smart Browser Interaction](zyte-api/ide/images/run.png) (top-right corner).
2. On the **URL** dialog, enter a target URL (e.g. `https://toscrape.com`).
3. On the **Interaction Parameters(in JSON)** dialog, enter a JSON object of
   arguments for your interaction class. You can leave it empty to pass no
   parameters.

On pressing Enter, the Zyte IDE view splits vertically, and on the right-hand
view a tab loads a tool, **Smart Browser DevTools**, which runs your
interaction against the specified URL, showing you the result in real time, and
offering you debugging tools and data.

When your interaction finishes, an “execution finished” message pops up on the
bottom-right corner.

Once your interaction class is working as expected, you can deploy it and use it as a
browser action in your data extraction requests.

### Completing our Know Your Customer procedure

While all Zyte API features are available to every customer, the following
features are disabled by default:

- Custom script deployment
- Setting `ipType` to `residential`

Unless you are on a free trial, you can enable these
features by completing our [Know Your Customer](https://en.wikipedia.org/wiki/Know_your_customer) (KYC) procedure.

> ###### TIP
>
> If you are on an Enterprise plan, you have
> already completed our Know Your Customer (KYC) procedure.

To start your KYC application:

1. Open the Zyte IDE.
2. Select ![Zyte Smart Browser Devtools](zyte-api/ide/images/zyte.png) on the [Activity Bar](https://code.visualstudio.com/docs/getstarted/userinterface) (left-hand side).
3. On the **Zyte IDE** side view that opens, under **Deploy**, click the
   **Request Access** button.
   > ###### TIP
   >
   > If you see a **Deploy** button instead, you have probably already
   > passed our KYC procedure.
4. Fill and submit the form that opens.

Once our legal team has reviewed your application, we will notify you of the
outcome. If approved, you will instantly get access to all previously disabled
features.

### Deploying your changes

You can debug scripts from the Zyte IDE, but to use them as browser
actions in your data extraction requests you must first deploy your changes to Zyte API.

To deploy all your changes to Zyte API:

1. Select ![Zyte Smart Browser Devtools](zyte-api/ide/images/zyte.png) on the [Activity Bar](https://code.visualstudio.com/docs/getstarted/userinterface) (left-hand side).
2. On the **Zyte IDE** side view that opens, click the **Deploy** button.
   > ###### TIP
   >
   > If you see a **Request Access** button instead, see
   > Completing our Know Your Customer procedure.

The deployment process empties the `dist/` folder, builds files from the
`src/` folder into the `dist/` folder, and deploys the files in the
`dist/` folder to Zyte API.

### Using a script with Zyte API

Once you have deployed a script,
you can get the interaction ID as follows:

1. Select ![Zyte Smart Browser Devtools](zyte-api/ide/images/zyte.png) on the [Activity Bar](https://code.visualstudio.com/docs/getstarted/userinterface) (left-hand side).
2. On the **Zyte IDE** side view that opens, under **Interactions**,
   right-click on your interaction, and click **Copy Interaction ID**.

The interaction ID will be copied to your clipboard.

You can then invoke your script as a browser action
from your data extraction requests:

1. Set the `action` field to `"interaction"`.
2. Set the `id` field to the interaction ID from your clipboard.

For example, if your interaction ID is `Others-toscrape.com`,
use:

```json
{
    "action": "interaction",
    "id": "Others-toscrape.com"
}
```

To pass arguments to your script, set the `args` field to an object. That
object is passed to your script as the *args* parameter of
`BaseInteraction.do()`. For example:

```json
{
    "action": "interaction",
    "id": "Others-toscrape.com",
    "args": {
        "foo": "bar"
    }
}
```
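
As a sketch of how such an action might fit into a complete request, the following builds a request body that sets the `url`, `browserHtml`, and `actions` fields. The `buildInteractionRequest` helper and its types are illustrative and not part of any Zyte library, and `browserHtml` is only one of the outputs that you could request alongside actions.

```typescript
// Shape of an "interaction" browser action, following the JSON examples above.
interface InteractionAction {
    action: "interaction";
    id: string;
    args?: Record<string, unknown>;
}

// Subset of a Zyte API request body; this sketch assumes that you want
// browser HTML as output.
interface InteractionRequest {
    url: string;
    browserHtml: true;
    actions: InteractionAction[];
}

// Hypothetical helper: build a request body that runs a deployed script.
function buildInteractionRequest(
    url: string,
    id: string,
    args?: Record<string, unknown>
): InteractionRequest {
    const action: InteractionAction = { action: "interaction", id };
    if (args !== undefined) {
        // Only include "args" when there is something to pass to do().
        action.args = args;
    }
    return { url, browserHtml: true, actions: [action] };
}

const requestBody = buildInteractionRequest(
    "https://toscrape.com",
    "Others-toscrape.com",
    { foo: "bar" }
);
```

You would then serialize `requestBody` as JSON and send it as the body of your Zyte API request.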

## Scripting API

This is the reference documentation of the scripting API, a [TypeScript](https://www.typescriptlang.org/) API
that you can use to write browser scripts.

For usage examples, see Browser script examples.

### Using the scripting API

The `smartbrowser-core-interactions` module provides classes and functions
that you can use to implement browser scripts.

You can import those classes and functions from
`smartbrowser-core-interactions/index.ts`. For example:

```typescript
import { BaseInteraction, Page } from "smartbrowser-core-interactions/index.ts";
```

Below you can find the reference documentation of the complete scripting API.

### BaseInteraction and Page

`BaseInteraction()` and `Page()` are the base of the scripting
API.

#### *class* BaseInteraction()

Base class for browser scripts.

*abstract*

*exported from* `base_classes.Base`

#### BaseInteraction.do(page, args)

Entry point of a browser script.

* **Arguments:**
  * **page** (*Page*) – Current page.
  * **args** (*object*) – `args` action parameter passed through a Zyte API request, defaults to `{}`.
* **Returns:**
  **Promise<void>** –

#### *class* Page()

The webpage currently loaded.

*interface*

*exported from* `api.page`

#### Page.click(selector)

Click the first element matching *selector*.

* **Arguments:**
  * **selector** (*Selector|string*) – `Selector()` instance or [CSS selector](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors) string.
* **Returns:**
  **Promise<void>** –

#### Page.cookies()

Get the cookies associated with the current page.

* **Returns:**
  **Promise<Cookie[]>** –

#### Page.deleteCookie(deleteCookieRequest)

Remove a cookie from the set of cookies associated with the current page.

* **Arguments:**
  * **deleteCookieRequest** (*DeleteCookieRequest*)
* **Returns:**
  **Promise<void>** –

#### Page.evaluate(source)

Executes JavaScript code within page context.

* **Arguments:**
  * **source** (*string*) – string with JavaScript source to be executed in page context
* **Returns:**
  **Promise<any>** – Promise that resolves to data returned by the evaluated code, if any.

#### Page.fetch(url, options)

Send a request for *url* within the current [browsing context](https://developer.mozilla.org/en-US/docs/Glossary/Browsing_context) and
return a promise that resolves to a `FetchResponse()` object.

Note that the browsing context may limit what requests can be made, e.g.
through [CORS](https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS).

* **Arguments:**
  * **url** (*string*) – URL to fetch
  * **options** (*FetchOptions*) – Fetch options
* **Returns:**
  **Promise<FetchResponse>** – FetchResponse object
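
The following sketch shows one way to send a JSON `POST` request with `Page.fetch()` and fail loudly on an error status. The `*Like` interfaces are local stand-ins that cover only the documented members used here, and `postJson` is a hypothetical helper name.

```typescript
// Local stand-ins for the subset of the scripting API used below, so that
// this sketch compiles on its own.
interface FetchOptionsLike {
    method?: "GET" | "POST";
    headers?: Record<string, string>;
    body?: string;
}

interface FetchResponseLike {
    ok: boolean;
    status: number;
    json(): any;
}

interface PageLike {
    fetch(url: string, options: FetchOptionsLike): Promise<FetchResponseLike>;
}

// POST a JSON payload from within the page and return the parsed response.
async function postJson(page: PageLike, url: string, payload: unknown): Promise<unknown> {
    const response = await page.fetch(url, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(payload),
    });
    if (!response.ok) {
        throw new Error("Request to " + url + " failed with status " + response.status);
    }
    // Returning from an async function also works if json() returns a promise.
    return response.json();
}
```

Because the request runs within the browsing context of the page, a cross-origin URL may still be rejected by CORS before `postJson` sees a response.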

#### Page.getIframe(selector)

Get the first iframe matching *selector*.

* **Arguments:**
  * **selector** (*Selector|string*) –

    `Selector()` instance or [CSS selector](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors) string.
* **Returns:**
  **Promise<Page>** –
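
Since `Page.getIframe()` resolves to another `Page`, you can query the iframe content with the same methods that you use on the outer page. The sketch below reads the text of an element nested in an iframe; the `*Like` interfaces are local stand-ins for the documented API, and `getIframeText` is a hypothetical helper.

```typescript
// Local stand-ins for the subset of the scripting API used below, so that
// this sketch compiles on its own.
interface ElementLike {
    getText(): Promise<string>;
}

interface PageLike {
    getIframe(selector: string): Promise<PageLike>;
    querySelector(selector: string): Promise<ElementLike | null | undefined>;
}

// Return the text of the first element matching innerSelector inside the
// first iframe matching iframeSelector, or null if there is no such element.
async function getIframeText(
    page: PageLike,
    iframeSelector: string,
    innerSelector: string
): Promise<string | null> {
    // The iframe is exposed through the same interface as the outer page.
    const frame = await page.getIframe(iframeSelector);
    const element = await frame.querySelector(innerSelector);
    if (!element) {
        return null;
    }
    return await element.getText();
}
```

For example, `getIframeText(page, "iframe#result", "p")` would look for the first paragraph inside an iframe with a hypothetical `result` ID.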

#### Page.goto(url, options)

Navigate to *url*.

By default, the returned promise resolves once *url* loads. See
`GotoOptions.waitUntil` for other options.

* **Arguments:**
  * **url** (*string*) – Target URL
  * **options** (*GotoOptions*) – Navigation options
* **Returns:**
  **Promise<void>** –

#### Page.hide(selector)

Hide all elements matching *selector*.

* **Arguments:**
  * **selector** (*Selector|string*) –

    `Selector()` instance or [CSS selector](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors) string.
* **Returns:**
  **Promise<void>** –

#### Page.hover(selector)

Hover over the first element matching *selector*.

* **Arguments:**
  * **selector** (*Selector|string*) –

    `Selector()` instance or [CSS selector](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors) string.
* **Returns:**
  **Promise<void>** –

#### Page.keyPress(key)

Press *key*.

Supported key IDs are:

| Key Group   | Key                 | Key IDs                 |
|-------------|---------------------|-------------------------|
| Letters     | A                   | `"a"`, `"A"`, `"KeyA"`  |
|             | B                   | `"b"`, `"B"`, `"KeyB"`  |
|             | C                   | `"c"`, `"C"`, `"KeyC"`  |
|             | D                   | `"d"`, `"D"`, `"KeyD"`  |
|             | E                   | `"e"`, `"E"`, `"KeyE"`  |
|             | F                   | `"f"`, `"F"`, `"KeyF"`  |
|             | G                   | `"g"`, `"G"`, `"KeyG"`  |
|             | H                   | `"h"`, `"H"`, `"KeyH"`  |
|             | I                   | `"i"`, `"I"`, `"KeyI"`  |
|             | J                   | `"j"`, `"J"`, `"KeyJ"`  |
|             | K                   | `"k"`, `"K"`, `"KeyK"`  |
|             | L                   | `"l"`, `"L"`, `"KeyL"`  |
|             | M                   | `"m"`, `"M"`, `"KeyM"`  |
|             | N                   | `"n"`, `"N"`, `"KeyN"`  |
|             | O                   | `"o"`, `"O"`, `"KeyO"`  |
|             | P                   | `"p"`, `"P"`, `"KeyP"`  |
|             | Q                   | `"q"`, `"Q"`, `"KeyQ"`  |
|             | R                   | `"r"`, `"R"`, `"KeyR"`  |
|             | S                   | `"s"`, `"S"`, `"KeyS"`  |
|             | T                   | `"t"`, `"T"`, `"KeyT"`  |
|             | U                   | `"u"`, `"U"`, `"KeyU"`  |
|             | V                   | `"v"`, `"V"`, `"KeyV"`  |
|             | W                   | `"w"`, `"W"`, `"KeyW"`  |
|             | X                   | `"x"`, `"X"`, `"KeyX"`  |
|             | Y                   | `"y"`, `"Y"`, `"KeyY"`  |
|             | Z                   | `"z"`, `"Z"`, `"KeyZ"`  |
| Digits      | 0                   | `"0"`, `"Digit0"`       |
|             | 1                   | `"1"`, `"Digit1"`       |
|             | 2                   | `"2"`, `"Digit2"`       |
|             | 3                   | `"3"`, `"Digit3"`       |
|             | 4                   | `"4"`, `"Digit4"`       |
|             | 5                   | `"5"`, `"Digit5"`       |
|             | 6                   | `"6"`, `"Digit6"`       |
|             | 7                   | `"7"`, `"Digit7"`       |
|             | 8                   | `"8"`, `"Digit8"`       |
|             | 9                   | `"9"`, `"Digit9"`       |
| Symbols     | Ampersand           | `"&"`                   |
|             | Asterisk            | `"*"`                   |
|             | At sign             | `"@"`                   |
|             | Backslash           | `"Backslash"`, `"\\"`   |
|             | Backtick            | `"Backquote"`, `` "`" `` |
|             | Caret               | `"^"`                   |
|             | Closing brace       | `"}"`                   |
|             | Closing bracket     | `"BracketRight"`, `"]"` |
|             | Closing parenthesis | `")"`                   |
|             | Colon               | `":"`                   |
|             | Comma               | `"Comma"`, `","`        |
|             | Dollar sign         | `"$"`                   |
|             | Double quote        | `'"'`                   |
|             | Equal               | `"Equal"`, `"="`        |
|             | Exclamation mark    | `"!"`                   |
|             | Greater-than sign   | `">"`                   |
|             | Hash                | `"#"`                   |
|             | Question mark       | `"?"`                   |
|             | Less-than sign      | `"<"`                   |
|             | Minus sign          | `"Minus"`, `"-"`        |
|             | Opening brace       | `"{"`                   |
|             | Opening bracket     | `"BracketLeft"`, `"["`  |
|             | Opening parenthesis | `"("`                   |
|             | Percent sign        | `"%"`                   |
|             | Period              | `"Period"`, `"."`       |
|             | Plus sign           | `"+"`                   |
|             | Semicolon           | `"Semicolon"`, `";"`    |
|             | Single quote        | `"Quote"`, `"'"`        |
|             | Slash               | `"Slash"`, `"/"`        |
|             | Tilde               | `"~"`                   |
|             | Underscore          | `"_"`                   |
|             | Vertical bar        | `"\|"`                  |
| Whitespace  | Enter               | `"Enter"`, `"\n"`       |
|             | Space               | `"Space"`, `" "`        |
|             | Tab                 | `"Tab"`                 |
| Editing     | Backspace           | `"Backspace"`, `"\r"`   |
|             | Delete              | `"Delete"`              |
|             | Insert              | `"Insert"`              |
| Navigation  | Down arrow          | `"ArrowDown"`           |
|             | Left arrow          | `"ArrowLeft"`           |
|             | Page down           | `"PageDown"`            |
|             | Page up             | `"PageUp"`              |
|             | Right arrow         | `"ArrowRight"`          |
|             | Up arrow            | `"ArrowUp"`             |
| Modifier    | Alt                 | `"Alt"`                 |
|             | Caps Lock           | `"CapsLock"`            |
|             | Left Alt            | `"AltLeft"`             |
|             | Left Control        | `"ControlLeft"`         |
|             | Left Shift          | `"ShiftLeft"`           |
|             | Right Alt           | `"AltRight"`            |
|             | Right Control       | `"ControlRight"`        |
|             | Right Shift         | `"ShiftRight"`          |
|             | Shift               | `"Shift"`               |
| Numpad      | 0                   | `"Numpad0"`             |
|             | 1                   | `"Numpad1"`             |
|             | 2                   | `"Numpad2"`             |
|             | 3                   | `"Numpad3"`             |
|             | 4                   | `"Numpad4"`             |
|             | 5                   | `"Numpad5"`             |
|             | 6                   | `"Numpad6"`             |
|             | 7                   | `"Numpad7"`             |
|             | 8                   | `"Numpad8"`             |
|             | Add sign            | `"NumpadAdd"`           |
|             | Decimal sign        | `"NumpadDecimal"`       |
|             | Divide sign         | `"NumpadDivide"`        |
|             | Enter               | `"NumpadEnter"`         |
|             | Equal               | `"NumpadEqual"`         |
|             | Multiply sign       | `"NumpadMultiply"`      |
|             | Subtract sign       | `"NumpadSubtract"`      |
| UI          | Break key           | `"Pause"`               |
| Composition | Non-conversion      | `"NonConvert"`          |
| Other       | Null                | `"\u0000"`              |

* **Arguments:**
  * **key** (*string*) – Key ID.
* **Returns:**
  **Promise<void>** –
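
A common use of `Page.keyPress()` is submitting a form after typing into one of its fields. The sketch below combines it with `Page.type()`, assuming that the input field keeps focus after typing so that the Enter key press reaches it. `PageLike` is a local stand-in for the documented API, and `typeAndSubmit` is a hypothetical helper.

```typescript
// Local stand-in for the subset of the scripting API used below, so that
// this sketch compiles on its own.
interface PageLike {
    type(selector: string, text: string, delay?: number): Promise<void>;
    keyPress(key: string): Promise<void>;
}

// Type text into an input field and submit it with the Enter key.
async function typeAndSubmit(
    page: PageLike,
    inputSelector: string,
    text: string
): Promise<void> {
    // The delay is expressed in seconds: wait 0.05 s between key presses.
    await page.type(inputSelector, text, 0.05);
    // "Enter" is one of the key IDs from the table above.
    await page.keyPress("Enter");
}
```

If submitting the form triggers a page load, you may want to follow the call with `Page.waitForNavigation()`.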

#### Page.querySelector(selector)

Query the DOM for an element matching *selector* and return the first element
found.

If no element matches *selector*, the return value resolves to `null`.

If multiple elements match *selector*, only the first element is returned.
To get all elements, use `querySelectorAll()` instead.

This method throws an error if *selector* is invalid.

* **Arguments:**
  * **selector** (*Selector|string*) –

    `Selector()` instance or [CSS selector](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors) string.
* **Returns:**
  **Promise<ElementHandle|undefined>** – Promise that resolves to an `ElementHandle()` object for the first element matching *selector*.

#### Page.querySelectorAll(selector)

Query the DOM for all elements matching *selector*.

If no elements match *selector*, the return value resolves to `[]`.

This method throws an error if *selector* is invalid.

* **Arguments:**
  * **selector** (*Selector|string*) –

    `Selector()` instance or [CSS selector](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors) string.
* **Returns:**
  **Promise<ElementHandle[]>** – Promise that resolves to an array of all `ElementHandle()` objects matching *selector*.
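
As an illustration of how `Page.querySelectorAll()` can be combined with `ElementHandle.getText()` and `ElementHandle.getAttribute()`, the sketch below collects the text and `href` attribute of every matching element. The `*Like` interfaces are local stand-ins for the documented API, and `collectLinks` and `LinkData` are hypothetical names.

```typescript
// Local stand-ins for the subset of the scripting API used below, so that
// this sketch compiles on its own.
interface ElementLike {
    getAttribute(name: string): Promise<string | null>;
    getText(): Promise<string>;
}

interface PageLike {
    querySelectorAll(selector: string): Promise<ElementLike[]>;
}

interface LinkData {
    text: string;
    href: string | null;
}

// Collect the text and the href attribute of every element matching
// selector. The result is an empty array when nothing matches.
async function collectLinks(page: PageLike, selector: string): Promise<LinkData[]> {
    const elements = await page.querySelectorAll(selector);
    const links: LinkData[] = [];
    for (const element of elements) {
        links.push({
            text: (await element.getText()).trim(),
            href: await element.getAttribute("href"),
        });
    }
    return links;
}
```

The loop awaits each element in turn, which keeps the output in document order.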

#### Page.reload(options)

Refresh the current page.

* **Arguments:**
  * **options** (*GotoOptions*) – Navigation options
* **Returns:**
  **Promise<void>** –

#### Page.scrollBottom(options)

Scroll to the bottom of the page.

* **Arguments:**
  * **options** (*ScrollBottomOptions*) – Scrolling options
* **Returns:**
  **Promise<void>** –

#### Page.scrollTo(options)

Scroll to a given position in the document.

* **Arguments:**
  * **options** (*ScrollToOptions*) – Scrolling options
* **Returns:**
  **Promise<void>** –

#### Page.select(selector, values)

Selects an option in a [select](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/select)
element.

* **Arguments:**
  * **selector** (*Selector|string*) –

    `Selector()` instance or [CSS selector](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors) string.
  * **values** (*string[]*) – Array of [option](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/option) element [values](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/option#value) to select.
* **Returns:**
  **Promise<void>** –

#### Page.setCookie(cookie)

Set a cookie to be sent with subsequent requests within the current page context.

* **Arguments:**
  * **cookie** (*Cookie*) – Cookie to set. It must have `name` and `value` properties.
* **Returns:**
  **Promise<void>** –

#### Page.type(selector, text, delay)

Type *text* into the first element matching *selector*.

* **Arguments:**
  * **selector** (*Selector|string*) –

    `Selector()` instance or [CSS selector](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors) string.
  * **text** (*string*) – Input text.
  * **delay** (*number*) – Time to wait between key presses in seconds, defaults to 0.
* **Returns:**
  **Promise<void>** –

#### Page.url()

Returns a string with the URL of the current page.

* **Returns:**
  **Promise<string>** – Promise that resolves to the URL of the page.

#### Page.waitForNavigation(timeout, waitUntil)

Wait for navigation to finish.

* **Arguments:**
  * **timeout** (*number*) – Maximum time to wait, in seconds. Defaults to 30 seconds.
  * **waitUntil** (*"load"|"domcontentloaded"|"networkidle0"*) – When to consider that navigation has finished. One of: `"load"` ([load event](https://developer.mozilla.org/en-US/docs/Web/API/Window/load_event), default), `"domcontentloaded"` ([DOMContentLoaded event](https://developer.mozilla.org/en-US/docs/Web/API/Window/DOMContentLoaded_event)), or `"networkidle0"` (no ongoing network connections for at least 0.5 seconds).
* **Returns:**
  **Promise<Response>** – Promise that resolves to a [Response](https://developer.mozilla.org/en-US/docs/Web/API/Response) object.

#### Page.waitForSelector(selector, timeout)

Wait for *selector* to match an item in the [DOM](https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model).

If there is already a match by the time the method is called, the returned
promise resolves immediately.

* **Arguments:**
  * **selector** (*Selector|string*) –

    `Selector()` instance or [CSS selector](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors) string.
  * **timeout** (*number*) – Maximum wait time, in seconds. Defaults to 30 seconds.
* **Returns:**
  **Promise<void>** –

#### Page.waitForTimeout(timeout)

Return a promise that resolves after *timeout*.

* **Arguments:**
  * **timeout** (*number*) – Wait time, in seconds.
* **Returns:**
  **Promise<void>** –
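
The wait methods are easy to combine with loops. As a sketch, the helper below keeps clicking an element, such as a "load more" button, while it remains on the page, pausing between clicks with `Page.waitForTimeout()`. `PageLike` is a local stand-in for the documented API, and `clickWhilePresent` and its default values are assumptions made for illustration.

```typescript
// Local stand-in for the subset of the scripting API used below, so that
// this sketch compiles on its own.
interface PageLike {
    querySelector(selector: string): Promise<object | null | undefined>;
    click(selector: string): Promise<void>;
    waitForTimeout(timeout: number): Promise<void>;
}

// Click an element repeatedly while it remains on the page, pausing between
// clicks, and return the number of clicks. maxClicks bounds the loop so that
// a button that never goes away cannot keep the script running forever.
async function clickWhilePresent(
    page: PageLike,
    selector: string,
    maxClicks: number = 5,
    pauseSeconds: number = 0.5
): Promise<number> {
    let clicks = 0;
    while (clicks < maxClicks && (await page.querySelector(selector))) {
        await page.click(selector);
        clicks += 1;
        // Timeouts in the scripting API are expressed in seconds.
        await page.waitForTimeout(pauseSeconds);
    }
    return clicks;
}
```

Note that the pause is given in seconds, not milliseconds, to match the unit used by the scripting API.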

### Selector and ElementHandle

#### *class* Selector()

Expression that aims to match one or more [DOM](https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model)
elements.

*interface*

*exported from* `api.page`

#### Selector.allElements?

**type:** boolean

Whether to match all possible elements (`true`) or only the first one
(`false`, default).

#### Selector.state?

**type:** "attached"|"visible"|"hidden"

Visibility required for an element to match.

Possible values are:

- `"visible"` (default): only visible elements are matched.

  An element is visible if it has a non-empty bounding box and does not
  have `visibility` set to `hidden`.

  Note that elements with `display` set to `none` have an empty
  bounding box, and are hence not considered visible.
- `"hidden"`: only non-visible elements are matched.
- `"attached"`: any element may be matched, regardless of its
  visibility.

#### Selector.type

**type:** "css"|"xpath"

Whether `value` is a [CSS selector](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors)
expression or an [XPath 1.0](https://www.w3.org/TR/1999/REC-xpath-19991116/) expression.

You can find some resources to learn about these selector languages in the
[parsel documentation](https://parsel.readthedocs.io/en/latest/usage.html#learning-expression-languages).

#### Selector.value

**type:** string

Expression.
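
As a sketch of what `Selector()` objects could look like in practice, the snippet below builds an XPath selector and a CSS selector that matches elements regardless of their visibility. The `Selector` interface is declared locally to mirror the fields documented above, and the `xpath` helper and both expressions are hypothetical.

```typescript
// Local declaration mirroring the documented Selector fields, so that this
// sketch compiles on its own.
type SelectorState = "attached" | "visible" | "hidden";

interface Selector {
    type: "css" | "xpath";
    value: string;
    state?: SelectorState;
    allElements?: boolean;
}

// Hypothetical helper to build XPath selectors with less typing.
function xpath(value: string, state: SelectorState = "visible"): Selector {
    return { type: "xpath", value, state };
}

// Matches a visible "next page" link by XPath.
const nextLink: Selector = xpath('//li[@class="next"]/a');

// Hidden inputs have an empty bounding box, so the default "visible" state
// would skip them; "attached" matches them regardless of visibility.
const hiddenInputs: Selector = {
    type: "css",
    value: "input[type=hidden]",
    state: "attached",
    allElements: true,
};
```

Any method that accepts `Selector|string` should accept such objects in place of a CSS selector string.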

#### *class* ElementHandle()

[DOM](https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model)
element.

*interface*

*exported from* `api.page`

#### ElementHandle.getAttribute(name)

Return the value of the element attribute with the specified *name*, or
`null` if the attribute does not exist.

* **Arguments:**
  * **name** (*string*)
* **Returns:**
  **Promise<string|null>** –

#### ElementHandle.getText()

Return the text between the element start tag and end tag.

* **Returns:**
  **Promise<string>** –

#### ElementHandle.querySelector(selector)

Return a nested element matching *selector*.

* **Arguments:**
  * **selector** (*Selector|string*) – `Selector()` instance or [CSS selector](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors) string.
* **Returns:**
  **Promise<ElementHandle|null>** –

#### ElementHandle.screenshot()

Return the screenshot of the element in PNG format as a base64-encoded string.

* **Returns:**
  **Promise<string>** –

### FetchResponse

#### *class* FetchResponse()

Response from `Page.fetch()`.

Its API is a subset of the [Response](https://developer.mozilla.org/en-US/docs/Web/API/Response) API.

*interface*

*exported from* `api.page`

#### FetchResponse.headers

**type:** Record<string, string>

#### FetchResponse.ok

**type:** boolean

#### FetchResponse.status

**type:** number

#### FetchResponse.statusText

**type:** string

#### FetchResponse.type

**type:** string

#### FetchResponse.url

**type:** string

#### FetchResponse.bytes()

* **Returns:**
  **Uint8Array** –

#### FetchResponse.json()

* **Returns:**
  **any** –

#### FetchResponse.text()

* **Returns:**
  **string** –

### Cookies

#### *class* Cookie()

*interface*

*exported from* `api.page`

#### Cookie.domain?

**type:** string

#### Cookie.expires?

**type:** number

#### Cookie.httpOnly?

**type:** boolean

#### Cookie.name

**type:** string

#### Cookie.path?

**type:** string

#### Cookie.sameSite?

**type:** CookieSameSite

#### Cookie.secure?

**type:** boolean

#### Cookie.url?

**type:** string

#### Cookie.value

**type:** string

#### *class* DeleteCookieRequest()

*interface*

*exported from* `api.page`

#### DeleteCookieRequest.domain?

**type:** string

#### DeleteCookieRequest.name

**type:** string

#### DeleteCookieRequest.partitionKey?

**type:** string

#### DeleteCookieRequest.path?

**type:** string

#### DeleteCookieRequest.url?

**type:** string

#### *class* CookieSameSite()

One of: `"Strict"`, `"Lax"`, `"None"`.

### Option classes

#### *class* FetchOptions()

Options for `Page.fetch()`.

It is a subset of [RequestInit](https://developer.mozilla.org/en-US/docs/Web/API/RequestInit).

*interface*

*exported from* `api.page`

#### FetchOptions.body?

**type:** string

#### FetchOptions.cache?

**type:** "default"|"reload"|"no-cache"|"force-cache"|"only-if-cached"

#### FetchOptions.headers?

**type:** Record<string, string>

#### FetchOptions.method?

**type:** "GET"|"POST"|"PUT"|"DELETE"|"PATCH"|"OPTIONS"|"HEAD"

#### FetchOptions.mode?

**type:** "same-origin"|"no-cors"|"cors"

#### FetchOptions.redirect?

**type:** "follow"|"error"|"manual"

#### FetchOptions.referrer?

**type:** string

#### *class* GotoOptions()

Options for `Page.goto()`.

*interface*

*exported from* `api.page`

#### GotoOptions.timeout?

**type:** number

Maximum wait time, in seconds.

Defaults to 30 seconds.

Use 0 to disable the timeout.

#### GotoOptions.waitUntil

**type:** "load"|"networkidle0"

When to consider that navigation has finished.

Possible values:

- `"load"` ([load event](https://developer.mozilla.org/en-US/docs/Web/API/Window/load_event),
  default).
- `"networkidle0"` (no ongoing network connections for at least 0.5
  seconds).

#### *class* ScrollBottomOptions()

*interface*

*exported from* `api.page`

#### ScrollBottomOptions.maxPageHeight?

**type:** number

#### ScrollBottomOptions.maxScrollCount?

**type:** number

#### ScrollBottomOptions.maxScrollDelay?

**type:** number

#### ScrollBottomOptions.timeout?

**type:** number

#### *class* ScrollToOptions()

*interface*

*exported from* `api.page`

#### ScrollToOptions.left?

**type:** number

#### ScrollToOptions.top

**type:** number

### Special action interactions

Zyte maintains a set of subclasses of `BaseInteraction()` that all Zyte
API users may use as special actions.

When implementing a custom interaction, instead of extending
`BaseInteraction()`, you can extend one of these special action
interaction classes:

- `SearchKeywordInteraction()`
- `SetLocationInteraction()`

#### *class* SearchKeywordArgs()

Argument interface for `SearchKeywordInteraction()`.

*interface*

*exported from* `base_classes.SearchKeyword`

#### SearchKeywordArgs.keyword

**type:** string

Keyword or keywords to search for.

#### *class* SearchKeywordInteraction()

Interaction that uses the search box of a page.

See `SearchKeywordArgs()` for the interface of the *args* parameter
of the `do()` method of this class.

*abstract*

*exported from* `base_classes.SearchKeyword`

**Extends:** `BaseInteraction()`

#### SearchKeywordInteraction.keywordCssSelector

**type:** Selector|string

`Selector()` instance or [CSS selector](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors)
string to find the input field where the search keywords must be typed.

#### SearchKeywordInteraction.typeKeyword(page, keyword)

Type *keyword* into the search field on *page*, start the search, and
wait for the results page to load.

* **Arguments:**
  * **page** (*Page*) – Current page.
  * **keyword** (*string*) – Search keywords.
* **Returns:**
  **Promise<void>** –

#### *class* SetLocation.Address()

[Postal address](https://en.wikipedia.org/wiki/Address).

*interface*

*exported from* `base_classes.SetLocation`

#### SetLocation.Address.city

**type:** string

#### SetLocation.Address.country

**type:** string

[ISO 3166-1 alpha-2](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2)
country code.

#### SetLocation.Address.postalCode

**type:** string

[Postal code](https://en.wikipedia.org/wiki/Postal_code).

#### SetLocation.Address.region

**type:** string

Country subdivision of relevance for the [postal address](https://en.wikipedia.org/wiki/Address).

#### SetLocation.Address.streetAddress

**type:** string

[Street address](https://en.wiktionary.org/wiki/street_address).

#### *class* SetLocationArgs()

Argument interface for `SetLocationInteraction()`.

*interface*

*exported from* `base_classes.SetLocation`

#### SetLocationArgs.address

**type:** Address

[Postal address](https://en.wikipedia.org/wiki/Address).

#### *class* SetLocationInteraction()

Interaction to fill fields on the address form of a page.

See `SetLocationArgs()` for the interface of the *args* parameter of
the `do()` method of this class.

*abstract*

*exported from* `base_classes.SetLocation`

**Extends:** `BaseInteraction()`

#### SetLocationInteraction.postalCodeCssSelector

**type:** Selector|string

`Selector()` instance or [CSS selector](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors)
string to find the input field of the postal code.

#### SetLocationInteraction.typePostalCode(page, postalCode)

Type *postalCode* into the postal code field on *page*.

* **Arguments:**
  * **page** (*Page*) – Current page.
  * **postalCode** (*string*) – [Postal code](https://en.wikipedia.org/wiki/Postal_code).
* **Returns:**
  **Promise<void>**

### Versioning

Every new version of the `smartbrowser-core-interactions` module is
backward-compatible, and is automatically made available to every instance of
the Zyte IDE, and to code already deployed to Zyte API.

In the future, we may implement a versioning system to make it possible to
introduce backward-incompatible changes into the
`smartbrowser-core-interactions` module without breaking existing code.

## Browser script examples

The following sections showcase ready-to-use browser scripts. See the
Zyte IDE documentation to learn how to use them.

### Search

The following example subclasses `SearchKeywordInteraction()` to
implement search for docs.zyte.com:

```typescript
import { SearchKeywordInteraction } from "smartbrowser-core-interactions/index.ts";

export default class DocsZyteComSearchKeywordInteraction extends SearchKeywordInteraction {
    domains = ["docs.zyte.com"];
    keywordCssSelector = "input.sidebar-search";
}
```

To try the example, use `https://docs.zyte.com/` as target URL,
`{"keyword": "foo"}` as parameters, and any geolocation.

> ###### NOTE
>
> If you debug the example in real time from the Zyte IDE, you must widen the
> browser view until the Zyte docs search box becomes visible. Otherwise, the
> browser script does not work.
>
> ![](zyte-api/ide/examples/images/search.png)
>
> This is not a problem in Zyte API requests, where the default
> [viewport](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/viewport) is wide enough.

### Interactive form

The following example fills and submits the content filtering form at
[quotes.toscrape.com/search.aspx](https://quotes.toscrape.com/search.aspx):

```typescript
import { BaseInteraction, Page } from "smartbrowser-core-interactions/index.ts";

interface Args {
    author: string;
    tag: string;
}

export default class QuotesToScrapeComSearchInteraction extends BaseInteraction {
    domains = ["quotes.toscrape.com"];
    async do(page: Page, args: Args): Promise<void> {
        await page.select("#author", [args.author]);
        await page.waitForSelector({type: "css", "value": "#tag option[value]", state: "attached"});
        await page.select("#tag", [args.tag]);
        await page.waitForTimeout(3);
        await page.click('input[name="submit_button"]');
        await page.waitForSelector(".quote");
    }
}
```

To try the example, use `https://quotes.toscrape.com/search.aspx` as target
URL, `{"author": "Steve Martin", "tag": "humor"}` as parameters, and any
geolocation.

### API call

The following example shows how you can use JavaScript through the `evaluate`
[action](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/actions) to send an API request from a browser and get
the API response in the resulting [browserHtml](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/browserHtml):

```typescript
import { BaseInteraction, Page } from "smartbrowser-core-interactions/index.ts";

interface Args {}

export default class QuotesToScrapeComAPICall extends BaseInteraction {
    domains = ["quotes.toscrape.com"];
    async do(page: Page, args: Args): Promise<void> {
        const source = `
            fetch("http://quotes.toscrape.com/api/quotes?page=1")
                .then(response => response.json())
                .then(data => {
                    document.write(JSON.stringify(data))
                })
                .catch(error => {
                    document.write(JSON.stringify({error}));
                });
        `;
        await page.evaluate(source);
    }
}
```

To try the example, use `http://quotes.toscrape.com/scroll` as target
URL, `{}` as parameters, and any geolocation.
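
Because the script above replaces the page content with the serialized API response, the resulting browserHtml wraps that JSON in HTML. The following Python sketch shows one way you might recover it with the standard library. It assumes that the JSON ends up as the text content of `<body>`, and `json_from_browser_html` is a hypothetical helper name. API responses containing text that looks like HTML markup could break this assumption.

```python
import json
from html.parser import HTMLParser


class _BodyText(HTMLParser):
    """Collect the text found inside <body>, with character references decoded."""

    def __init__(self):
        super().__init__()
        self._in_body = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self._in_body = True

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False

    def handle_data(self, data):
        if self._in_body:
            self.chunks.append(data)


def json_from_browser_html(browser_html: str):
    """Parse the JSON document that the browser script wrote into the page."""
    parser = _BodyText()
    parser.feed(browser_html)
    parser.close()
    return json.loads("".join(parser.chunks))


# Made-up sample input, not a real API response.
sample = '<html><head></head><body>{"page": 1, "quotes": []}</body></html>'
print(json_from_browser_html(sample))
```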

## Migrating to Zyte API

Learn how to migrate from:

> ##### Browser automation tools
>
> Migrate from tools like Playwright, Puppeteer, Selenium, or Splash, for
> better productivity and scalability.

> ##### Bright Data Web Unlocker
>
> Enjoy browser HTML, screenshots, and browser actions.

> ##### ScrapingBee
>
> Migrate from ScrapingBee to Zyte API.

> ##### scrapy-zyte-api
>
> Upgrade from scrapy-zyte-smartproxy or from scrapy-crawlera.

> ##### ZenRows
>
> Migrate from ZenRows to Zyte API.

> ##### Zyte Smart Proxy Manager
>
> Enjoy lower ban rates, browser HTML, screenshots, browser actions, and
> smart geolocation.

## Migrating from browser automation to Zyte API

Learn how to migrate from browser automation tools, like [Playwright](https://playwright.dev/),
[Puppeteer](https://pptr.dev/), [Selenium](https://www.selenium.dev/), or [Splash](https://splash.readthedocs.io/en/stable/), to Zyte API.

### Feature comparison

The following table summarizes the feature differences between Zyte API and
browser automation tools:

| Feature           | Zyte API   | Browser automation   |
|-------------------|------------|----------------------|
| API               | HTTP       | Varies               |
| Website-aware API | Yes        | No                   |
| Avoid bans        | Yes        | Hard                 |
| Scalable          | Yes        | Hard                 |

### Migration examples

The following examples show common browser automation functionality implemented
with several browser automation tools, followed by the same functionality
implemented with Zyte API. Use these examples to get started porting your own
code.

To learn more about the browser automation features of Zyte API, see
the Zyte API browser automation documentation.

If your code requires a non-linear flow or something else that cannot be
translated into a JSON array with a static sequence of actions, you may need Zyte API browser scripts.
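
For reference, a static sequence of actions is expressed as a JSON array in the request body. The Python sketch below only builds and prints such a body, without sending it. The `build_request` helper is hypothetical, and the selector and action parameters are illustrative, so check the [actions](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/actions) reference for the exact fields that each action accepts.

```python
import json


def build_request(url: str, actions: list) -> dict:
    """Return a request body that runs actions and asks for browser HTML."""
    return {"url": url, "browserHtml": True, "actions": actions}


actions = [
    # Wait until at least one quote has been rendered.
    {"action": "waitForSelector", "selector": {"type": "css", "value": ".quote"}},
    # Scroll to the bottom of the page.
    {"action": "scrollBottom"},
]
body = build_request("https://quotes.toscrape.com/scroll", actions)
print(json.dumps(body, indent=2))
```

Anything that needs branching or loops driven by page state does not fit this shape, which is where browser scripts come in.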

#### Getting browser HTML

This is how you get a browser [DOM](https://en.wikipedia.org/wiki/Document_Object_Model) rendered as HTML using browser automation
tools:

#### Playwright

> ###### NOTE
>
> This example uses JavaScript with [Playwright](https://playwright.dev/) for browser automation.

```js
const playwright = require('playwright')

async function main () {
  const browser = await playwright.chromium.launch()
  const page = await browser.newPage()
  await page.goto('https://toscrape.com')
  const browserHtml = await page.content()
  await browser.close()
}

main()
```

#### Puppeteer

> ###### NOTE
>
> This example uses JavaScript with [Puppeteer](https://pptr.dev/) for browser
> automation.

```js
const puppeteer = require('puppeteer')

async function main () {
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.goto('https://toscrape.com')
  const browserHtml = await page.content()
  await browser.close()
}

main()
```

#### scrapy-playwright

> ###### NOTE
>
> This example uses [scrapy-playwright](https://github.com/scrapy-plugins/scrapy-playwright).

```python
from scrapy import Request, Spider

class ToScrapeSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request(
            "https://toscrape.com",
            meta={"playwright": True},
        )

    def parse(self, response):
        browser_html: str = response.text
```

#### scrapy-splash

> ###### NOTE
>
> This example uses [scrapy-splash](https://github.com/scrapy-plugins/scrapy-splash).

```python
from scrapy import Spider
from scrapy_splash import SplashRequest

class ToScrapeSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield SplashRequest("https://toscrape.com")

    def parse(self, response):
        browser_html: str = response.text
```

#### Selenium

> ###### NOTE
>
> This example uses [Selenium](https://www.selenium.dev/) with [Python bindings](https://pypi.org/project/selenium/) for browser
> automation.

```python
from selenium import webdriver

driver = webdriver.Firefox()
driver.get("https://toscrape.com")
browser_html = driver.page_source
driver.close()
```

#### Splash

> ###### NOTE
>
> This example uses Python with [Splash](https://splash.readthedocs.io/en/stable/) for browser automation and
> [requests](https://requests.readthedocs.io/en/latest/) to use the HTTP API of Splash.

```python
from urllib.parse import quote

import requests

splash_url = "YOUR_SPLASH_URL"
url = "https://toscrape.com"
response = requests.get(f"{splash_url}/render.html?url={quote(url)}")
browser_html: str = response.content.decode()
```

And this is how you do it using Zyte API:

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://toscrape.com"},
    {"browserHtml", true}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var browserHtml = data.RootElement.GetProperty("browserHtml").ToString();
```

#### CLI client

input.jsonl
```json
{"url": "https://toscrape.com", "browserHtml": true}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .browserHtml
```

#### curl

input.json
```json
{
    "url": "https://toscrape.com",
    "browserHtml": true
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .browserHtml
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of("url", "https://toscrape.com", "browserHtml", true);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String browserHtml = jsonObject.get("browserHtml").getAsString();
          System.out.println(browserHtml);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com',
    browserHtml: true
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const browserHtml = response.data.browserHtml
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com',
        'browserHtml' => true,
    ],
]);
$api = json_decode($response->getBody());
$browser_html = $api->browserHtml;
```

#### Proxy mode

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    -H "Zyte-Browser-Html: true" \
    https://toscrape.com
```

#### Python

```python
import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://toscrape.com",
        "browserHtml": True,
    },
)
browser_html: str = api_response.json()["browserHtml"]
```

#### Python client

```python
import asyncio

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://toscrape.com",
            "browserHtml": True,
        }
    )
    print(api_response["browserHtml"])

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class ToScrapeSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request(
            "https://toscrape.com",
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                },
            },
        )

    def parse(self, response):
        browser_html: str = response.text
```

Output (first 5 lines):

```html
<!DOCTYPE html><html lang="en"><head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
        <title>Scraping Sandbox</title>
        <link href="./css/bootstrap.min.css" rel="stylesheet">
        <link href="./css/main.css" rel="stylesheet">
```

See the browser HTML documentation of Zyte API for details.
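
The C# and Java examples above build the `Authorization` header by hand, while most of the other clients accept the API key as the user name of HTTP Basic authentication with an empty password. As a sketch of what those clients do for you, the hypothetical `basic_auth_header` helper below reproduces that header in Python using only the standard library. It assumes that the API key contains only ASCII characters.

```python
from base64 import b64encode


def basic_auth_header(api_key: str) -> str:
    """Return the Basic auth header value for an API key and an empty password."""
    # The credentials are "<user name>:<password>", here with an empty password.
    credentials = f"{api_key}:".encode()
    return "Basic " + b64encode(credentials).decode("ascii")


headers = {"Authorization": basic_auth_header("YOUR_ZYTE_API_KEY")}
```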

#### Taking a screenshot

This is how you take a screenshot using browser automation tools:

#### Playwright

> ###### NOTE
>
> This example uses JavaScript with [Playwright](https://playwright.dev/) for browser automation.

```js
const playwright = require('playwright')

async function main () {
  const browser = await playwright.chromium.launch()
  const context = await browser.newContext({ viewport: { width: 1920, height: 1080 } })
  const page = await context.newPage()
  await page.goto('https://toscrape.com')
  const screenshot = await page.screenshot({ type: 'jpeg' })
  await browser.close()
}

main()
```

#### Puppeteer

> ###### NOTE
>
> This example uses JavaScript with [Puppeteer](https://pptr.dev/) for browser
> automation.

```js
const puppeteer = require('puppeteer')

async function main () {
  const browser = await puppeteer.launch({ defaultViewport: { width: 1920, height: 1080 } })
  const page = await browser.newPage()
  await page.goto('https://toscrape.com')
  const screenshot = await page.screenshot({ type: 'jpeg' })
  await browser.close()
}

main()
```

#### scrapy-playwright

> ###### NOTE
>
> This example uses [scrapy-playwright](https://github.com/scrapy-plugins/scrapy-playwright).

```python
from scrapy import Request, Spider
from scrapy_playwright.page import PageMethod

class ToScrapeSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request(
            "https://toscrape.com",
            meta={
                "playwright": True,
                "playwright_context": "new",
                "playwright_context_kwargs": {
                    "viewport": {"width": 1920, "height": 1080},
                },
                "playwright_page_methods": [
                    PageMethod("screenshot", type="jpeg"),
                ],
            },
        )

    def parse(self, response):
        screenshot: bytes = response.meta["playwright_page_methods"][0].result
```

#### scrapy-splash

> ###### NOTE
>
> This example uses [scrapy-splash](https://github.com/scrapy-plugins/scrapy-splash).

```python
from scrapy import Spider
from scrapy_splash import SplashRequest

class ToScrapeSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield SplashRequest(
            "https://toscrape.com",
            endpoint="render.jpeg",
            args={
                "viewport": "1920x1080",
            },
        )

    def parse(self, response):
        screenshot: bytes = response.body
```

#### Selenium

> ###### NOTE
>
> This example uses [Selenium](https://www.selenium.dev/) with [Python bindings](https://pypi.org/project/selenium/) for browser
> automation and [Pillow](https://pillow.readthedocs.io/en/stable/) to convert the screenshot to JPEG.

```python
from io import BytesIO
from tempfile import NamedTemporaryFile

from PIL import Image
from selenium import webdriver

# https://stackoverflow.com/a/37183295
def set_viewport_size(driver, width, height):
    window_size = driver.execute_script(
        """
        return [window.outerWidth - window.innerWidth + arguments[0],
          window.outerHeight - window.innerHeight + arguments[1]];
        """,
        width,
        height,
    )
    driver.set_window_size(*window_size)

def get_jpeg_screenshot(driver):
    f = NamedTemporaryFile(suffix=".png")
    driver.save_screenshot(f.name)
    f.seek(0)
    image = Image.open(f)
    rgb_image = image.convert("RGB")
    image_io = BytesIO()
    rgb_image.save(image_io, format="JPEG")
    return image_io.getvalue()

driver = webdriver.Firefox()
set_viewport_size(driver, 1920, 1080)
driver.get("https://toscrape.com")
screenshot = get_jpeg_screenshot(driver)
driver.close()
```

#### Splash

> ###### NOTE
>
> This example uses Python with [Splash](https://splash.readthedocs.io/en/stable/) for browser automation and
> [requests](https://requests.readthedocs.io/en/latest/) to use the HTTP API of Splash.

```python
from urllib.parse import quote

import requests

splash_url = "YOUR_SPLASH_URL"
url = "https://toscrape.com"
response = requests.get(f"{splash_url}/render.jpeg?url={quote(url)}&viewport=1920x1080")
screenshot: bytes = response.content
```

And this is how you do it using Zyte API:

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://toscrape.com"},
    {"screenshot", true}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64Screenshot = data.RootElement.GetProperty("screenshot").ToString();
var screenshot = System.Convert.FromBase64String(base64Screenshot);
```

#### CLI client

input.jsonl
```json
{"url": "https://toscrape.com", "screenshot": true}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .screenshot \
    | base64 --decode \
    > screenshot.jpg
```

#### curl

input.json
```json
{
    "url": "https://toscrape.com",
    "screenshot": true
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .screenshot \
    | base64 --decode \
    > screenshot.jpg
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of("url", "https://toscrape.com", "screenshot", true);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64Screenshot = jsonObject.get("screenshot").getAsString();
          byte[] screenshot = Base64.getDecoder().decode(base64Screenshot);
          try (FileOutputStream fos = new FileOutputStream("screenshot.jpg")) {
            fos.write(screenshot);
          }
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com',
    screenshot: true
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const screenshot = Buffer.from(response.data.screenshot, 'base64')
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com',
        'screenshot' => true,
    ],
]);
$api = json_decode($response->getBody());
$screenshot = base64_decode($api->screenshot);
```

#### Python

```python
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://toscrape.com",
        "screenshot": True,
    },
)
screenshot: bytes = b64decode(api_response.json()["screenshot"])
```

#### Python client

```python
import asyncio
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://toscrape.com",
            "screenshot": True,
        }
    )
    screenshot = b64decode(api_response["screenshot"])
    with open("screenshot.jpg", "wb") as f:
        f.write(screenshot)

asyncio.run(main())
```

#### Scrapy

```python
from base64 import b64decode

from scrapy import Request, Spider

class ToScrapeComSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request(
            "https://toscrape.com",
            meta={
                "zyte_api_automap": {
                    "screenshot": True,
                },
            },
        )

    def parse(self, response):
        screenshot: bytes = b64decode(response.raw_api_response["screenshot"])
```

Output:

![](zyte-api/usage/code-examples/output/screenshot.jpg)

See the screenshot documentation of Zyte API for details.
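
When porting screenshot code, note that Zyte API returns the image as a Base64 string inside the JSON response rather than as raw bytes. The sketch below decodes it and checks for the JPEG signature before writing the file. `save_screenshot` is a hypothetical helper, and the signature check assumes the JPEG output used in the examples above.

```python
from base64 import b64decode
from pathlib import Path

# Every JPEG file starts with these three bytes.
JPEG_MAGIC = b"\xff\xd8\xff"


def save_screenshot(api_response: dict, path: str) -> int:
    """Decode api_response["screenshot"], write it to path, return the byte count."""
    data = b64decode(api_response["screenshot"])
    if not data.startswith(JPEG_MAGIC):
        raise ValueError("screenshot does not look like a JPEG image")
    return Path(path).write_bytes(data)
```

You could call it as `save_screenshot(api_response.json(), "screenshot.jpg")` after the `requests.post()` call of the Python example.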

#### Consuming scroll-based pagination

This is how you use browser automation tools to load a webpage on a web
browser, scroll to the bottom in a loop until it stops loading more content,
and get the resulting [DOM](https://en.wikipedia.org/wiki/Document_Object_Model) rendered as HTML:

#### Playwright

> ###### NOTE
>
> This example uses JavaScript with [Playwright](https://playwright.dev/) for browser automation and
> [cheerio](https://github.com/cheeriojs/cheerio) for HTML parsing.

```js
const cheerio = require('cheerio')
const playwright = require('playwright')

async function main () {
  const browser = await playwright.chromium.launch()
  const page = await browser.newPage()
  await page.goto('https://quotes.toscrape.com/scroll')
  await page.evaluate(async () => {
    const scrollInterval = setInterval(
      function () {
        const scrollingElement = (document.scrollingElement || document.body)
        scrollingElement.scrollTop = scrollingElement.scrollHeight
      },
      100
    )
    let previousHeight = null
    while (true) {
      const currentHeight = window.innerHeight + window.scrollY
      if (!previousHeight) {
        previousHeight = currentHeight
        await new Promise(resolve => setTimeout(resolve, 500))
      } else if (previousHeight === currentHeight) {
        clearInterval(scrollInterval)
        break
      } else {
        previousHeight = currentHeight
        await new Promise(resolve => setTimeout(resolve, 500))
      }
    }
  })
  const $ = cheerio.load(await page.content())
  const quoteCount = $('.quote').length
  await browser.close()
}

main()
```

#### Puppeteer

> ###### NOTE
>
> This example uses JavaScript with [Puppeteer](https://pptr.dev/) for browser
> automation and [cheerio](https://github.com/cheeriojs/cheerio) for HTML parsing.

```js
const cheerio = require('cheerio')
const puppeteer = require('puppeteer')

async function main () {
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.goto('https://quotes.toscrape.com/scroll')
  await page.evaluate(async () => {
    const scrollInterval = setInterval(
      function () {
        const scrollingElement = (document.scrollingElement || document.body)
        scrollingElement.scrollTop = scrollingElement.scrollHeight
      },
      100
    )
    let previousHeight = null
    while (true) {
      const currentHeight = window.innerHeight + window.scrollY
      if (!previousHeight) {
        previousHeight = currentHeight
        await new Promise(resolve => setTimeout(resolve, 500))
      } else if (previousHeight === currentHeight) {
        clearInterval(scrollInterval)
        break
      } else {
        previousHeight = currentHeight
        await new Promise(resolve => setTimeout(resolve, 500))
      }
    }
  })
  const $ = cheerio.load(await page.content())
  const quoteCount = $('.quote').length
  await browser.close()
}

main()
```

#### scrapy-playwright

> ###### NOTE
>
> This example uses [scrapy-playwright](https://github.com/scrapy-plugins/scrapy-playwright).

```python
from asyncio import sleep

from scrapy import Request, Spider

class QuotesToScrapeComSpider(Spider):
    name = "quotes_toscrape_com"

    async def start(self):
        yield Request(
            "https://quotes.toscrape.com/scroll",
            meta={
                "playwright": True,
                "playwright_include_page": True,
            },
        )

    # Based on https://stackoverflow.com/a/69193325
    async def scroll_to_bottom(self, page):
        await page.evaluate(
            """
            var scrollInterval = setInterval(
                function () {
                    var scrollingElement = (document.scrollingElement || document.body);
                    scrollingElement.scrollTop = scrollingElement.scrollHeight;
                },
                100
            );
            """
        )
        previous_height = None
        while True:
            current_height = await page.evaluate(
                "(window.innerHeight + window.scrollY)"
            )
            if not previous_height:
                previous_height = current_height
                await sleep(0.5)
            elif previous_height == current_height:
                await page.evaluate("clearInterval(scrollInterval)")
                break
            else:
                previous_height = current_height
                await sleep(0.5)

    async def parse(self, response):
        page = response.meta["playwright_page"]
        await self.scroll_to_bottom(page)
        body = await page.content()
        response = response.replace(body=body)
        quote_count = len(response.css(".quote"))
        await page.close()
```

#### scrapy-splash

> ###### NOTE
>
> This example uses [scrapy-splash](https://github.com/scrapy-plugins/scrapy-splash).

```python
from scrapy import Spider
from scrapy_splash import SplashRequest

# Based on https://stackoverflow.com/a/40366442
SCROLL_TO_BOTTOM_LUA = """
function main(splash)
    local num_scrolls = 10
    local scroll_delay = 0.1

    local scroll_to = splash:jsfunc("window.scrollTo")
    local get_body_height = splash:jsfunc(
        "function() {return document.body.scrollHeight;}"
    )
    assert(splash:go(splash.args.url))

    for _ = 1, num_scrolls do
        scroll_to(0, get_body_height())
        splash:wait(scroll_delay)
    end
    return splash:html()
end
"""

class QuotesToScrapeComSpider(Spider):
    name = "quotes_toscrape_com"

    async def start(self):
        yield SplashRequest(
            "https://quotes.toscrape.com/scroll",
            endpoint="execute",
            args={"lua_source": SCROLL_TO_BOTTOM_LUA},
        )

    def parse(self, response):
        quote_count = len(response.css(".quote"))
```

#### Selenium

> ###### NOTE
>
> This example uses [Selenium](https://www.selenium.dev/) with [Python bindings](https://pypi.org/project/selenium/) for browser
> automation and [Parsel](https://parsel.readthedocs.io/en/latest/) for HTML parsing.

```python
from time import sleep

from parsel import Selector
from selenium import webdriver

# Based on https://stackoverflow.com/a/69193325
def scroll_to_bottom(driver):
    driver.execute_script(
        """
        window.scrollInterval = setInterval(
            function () {
                var scrollingElement = (document.scrollingElement || document.body);
                scrollingElement.scrollTop = scrollingElement.scrollHeight;
            },
            100
        );
        """
    )
    previous_height = None
    while True:
        current_height = driver.execute_script(
            "return window.innerHeight + window.scrollY"
        )
        if not previous_height:
            previous_height = current_height
            sleep(0.5)
        elif previous_height == current_height:
            driver.execute_script("clearInterval(window.scrollInterval)")
            break
        else:
            previous_height = current_height
            sleep(0.5)

driver = webdriver.Firefox()
driver.get("https://quotes.toscrape.com/scroll")
scroll_to_bottom(driver)
selector = Selector(driver.page_source)
quote_count = len(selector.css(".quote"))
driver.close()
```

#### Splash

> ###### NOTE
>
> This example uses Python with [Splash](https://splash.readthedocs.io/en/stable/) for browser automation,
> [requests](https://requests.readthedocs.io/en/latest/) to use the HTTP API of Splash, and [Parsel](https://parsel.readthedocs.io/en/latest/) for HTML parsing.

```python
from urllib.parse import quote

import requests
from parsel import Selector

# Based on https://stackoverflow.com/a/40366442
SCROLL_TO_BOTTOM_LUA = """
function main(splash)
    local num_scrolls = 10
    local scroll_delay = 0.1

    local scroll_to = splash:jsfunc("window.scrollTo")
    local get_body_height = splash:jsfunc(
        "function() {return document.body.scrollHeight;}"
    )
    assert(splash:go(splash.args.url))

    for _ = 1, num_scrolls do
        scroll_to(0, get_body_height())
        splash:wait(scroll_delay)
    end
    return splash:html()
end
"""

splash_url = "YOUR_SPLASH_URL"
url = "https://quotes.toscrape.com/scroll"
response = requests.get(
    f"{splash_url}/execute?url={quote(url)}&lua_source={quote(SCROLL_TO_BOTTOM_LUA)}"
)
selector = Selector(text=response.content.decode())
quote_count = len(selector.css(".quote"))
```

And this is how you do it using Zyte API:

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using HtmlAgilityPack;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://quotes.toscrape.com/scroll"},
    {"browserHtml", true},
    {
        "actions",
        new List<Dictionary<string, object>>()
        {
            new Dictionary<string, object>()
            {
                {"action", "scrollBottom"}
            }
        }
    }
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var browserHtml = data.RootElement.GetProperty("browserHtml").ToString();
var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(browserHtml);
var navigator = htmlDocument.CreateNavigator();
var quoteCount = (double)navigator.Evaluate("count(//*[@class=\"quote\"])");
```

#### CLI client

input.jsonl
```json
{"url": "https://quotes.toscrape.com/scroll", "browserHtml": true, "actions": [{"action": "scrollBottom"}]}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .browserHtml \
    | xmllint --html --xpath 'count(//*[@class="quote"])' - 2> /dev/null
```

#### curl

input.json
```json
{
    "url": "https://quotes.toscrape.com/scroll",
    "browserHtml": true,
    "actions": [
        {
            "action": "scrollBottom"
        }
    ]
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .browserHtml \
    | xmllint --html --xpath 'count(//*[@class="quote"])' - 2> /dev/null
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Collections;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class Example {

  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> action = ImmutableMap.of("action", "scrollBottom");
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://quotes.toscrape.com/scroll",
            "browserHtml",
            true,
            "actions",
            Collections.singletonList(action));
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String browserHtml = jsonObject.get("browserHtml").getAsString();
          Document document = Jsoup.parse(browserHtml);
          int quoteCount = document.select(".quote").size();
          System.out.println(quoteCount);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')
const cheerio = require('cheerio')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://quotes.toscrape.com/scroll',
    browserHtml: true,
    actions: [
      {
        action: 'scrollBottom'
      }
    ]
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const browserHtml = response.data.browserHtml
  const $ = cheerio.load(browserHtml)
  const quoteCount = $('.quote').length
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://quotes.toscrape.com/scroll',
        'browserHtml' => true,
        'actions' => [
            ['action' => 'scrollBottom'],
        ],
    ],
]);
$data = json_decode($response->getBody());
$doc = new DOMDocument();
$doc->loadHTML($data->browserHtml);
$xpath = new DOMXPath($doc);
$quote_count = $xpath->query("//*[@class='quote']")->count();
```

#### Python

```python
import requests
from parsel import Selector

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://quotes.toscrape.com/scroll",
        "browserHtml": True,
        "actions": [
            {
                "action": "scrollBottom",
            },
        ],
    },
)
browser_html = api_response.json()["browserHtml"]
quote_count = len(Selector(browser_html).css(".quote"))
```

#### Python client

```python
import asyncio

from parsel import Selector
from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://quotes.toscrape.com/scroll",
            "browserHtml": True,
            "actions": [
                {
                    "action": "scrollBottom",
                },
            ],
        },
    )
    browser_html = api_response["browserHtml"]
    quote_count = len(Selector(browser_html).css(".quote"))
    print(quote_count)

asyncio.run(main())
```

#### Scrapy

```python
from scrapy import Request, Spider

class QuotesToScrapeComSpider(Spider):
    name = "quotes_toscrape_com"

    async def start(self):
        yield Request(
            "https://quotes.toscrape.com/scroll",
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                    "actions": [
                        {
                            "action": "scrollBottom",
                        },
                    ],
                },
            },
        )

    def parse(self, response):
        quote_count = len(response.css(".quote"))
```

Output:

```none
100
```

See the Zyte API actions documentation.

## Migrating from Bright Data Web Unlocker to Zyte API

Learn how to migrate from [Bright Data Web Unlocker](https://brightdata.com/products/web-unlocker) to Zyte API.

### Feature comparison

The following table summarizes the feature differences between both products:

| Feature         | Zyte API      | Web Unlocker   |
|-----------------|---------------|----------------|
| API             | HTTP or proxy | Proxy          |
| Browser HTML    | Yes           | No             |
| Screenshots     | Yes           | No             |
| Browser actions | Yes           | No             |
| Network capture | Yes           | No             |

### Proxy mode

Zyte API offers a proxy mode, which makes it easier to
migrate from Bright Data Web Unlocker.

> ###### NOTE
>
> Before you decide whether to use the proxy mode or the HTTP API, learn their differences.

To migrate, update your proxy endpoint and authentication, and migrate your request parameters.

### HTTP API

When migrating from Bright Data Web Unlocker to the HTTP API of Zyte API, the main challenge is switching from a proxy API
to an HTTP API. Zyte API returns its rich output data as JSON, so to read it
you need JSON parsing and sometimes base64-decoding.

For example, to get the same output as the following Web Unlocker request:

```bash
curl \
    --proxy zproxy.lum-superproxy.io:22225 \
    --proxy-user lum-customer-YOUR_USER-zone-YOUR_ZONE:YOUR_PASSWORD \
    https://toscrape.com/
```

Use Zyte API as follows:

```bash
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --compressed \
    --data '{"url": "https://toscrape.com", "httpResponseBody": true}' \
    https://api.zyte.com/v1/extract \
| jq --raw-output .httpResponseBody \
| base64 --decode
```

See the Zyte API usage documentation for richer Zyte API examples, covering
more scenarios and features. See also the parameter mapping section below to
migrate your request parameters.
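
As a minimal sketch of the JSON parsing and base64-decoding that the HTTP API
output requires, the following Python function turns the text of a Zyte API
response into the raw body bytes. The function name
`decode_http_response_body` and the sample payload are illustrative
assumptions; only the `httpResponseBody` field comes from the example above.

```python
import base64
import json


def decode_http_response_body(api_response_text: str) -> bytes:
    """Return the raw response body from a Zyte API JSON response."""
    data = json.loads(api_response_text)
    # httpResponseBody is base64-encoded, so decode it to get the raw bytes.
    return base64.b64decode(data["httpResponseBody"])


# Hypothetical API response, built locally so that the sketch runs offline.
sample_html = b"<!DOCTYPE html><title>Scraping Sandbox</title>"
sample_response = json.dumps(
    {
        "url": "https://toscrape.com/",
        "httpResponseBody": base64.b64encode(sample_html).decode("ascii"),
    }
)
body = decode_http_response_body(sample_response)
print(body.decode("utf-8"))
```

This does in Python what the `jq` and `base64 --decode` steps do in the shell
pipeline above.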

### Parameter mapping

If your Web Unlocker requests set the `country` parameter, migrate them as
follows:

#### Proxy mode

Use the `Zyte-Geolocation` request header.

For example, replace:

```bash
curl … --proxy-user lum-customer-YOUR_USER-zone-YOUR_ZONE-country-us:YOUR_PASSWORD …
```

With:

```bash
curl … -H Zyte-Geolocation:US …
```

#### HTTP API

Use the [geolocation](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/geolocation) request field.

For example, replace:

```bash
curl … --proxy-user lum-customer-YOUR_USER-zone-YOUR_ZONE-country-us:YOUR_PASSWORD …
```

With:

```bash
curl … --data '{…, "geolocation": "US", …}' …
```

## Migrating from ScrapingBee to Zyte API

Learn how to migrate from [ScrapingBee](https://www.scrapingbee.com/) to Zyte API.

### Feature comparison

The following table summarizes the feature differences between both products:

| Feature                         | ScrapingBee                           | Zyte API                                                                                              |
|---------------------------------|---------------------------------------|-------------------------------------------------------------------------------------------------------|
| Client software                 | Python, NodeJS                        | Python, Scrapy                                                                                        |
| Pricing                         | Fixed plans                           | Pay as you go; monthly commitment over $100                                                           |
| Ban avoidance                   | Manual, may increase costs            | Automatic, no extra costs                                                                             |
| Automatic extraction            | Google SERP, custom LLM prompts       | Standard schemas including Google SERP, custom LLM prompts                                            |
| Geolocation                     | 243 countries, no data center support | 249 countries, data center support                                                                    |
| Sessions                        | Client-managed only (5m)              | Client-managed (15m) and server-managed                                                               |
| Actions                         | Basic only (9)                        | Basic (15), advanced, website-specific and custom                                                     |
| Screenshots                     | Yes, can target an element            | Yes, cannot target an element                                                                         |
| Body size limit                 | 2 MB                                  | 10 MB                                                                                                 |
| Custom headers                  | Yes                                   | Only in HTTP requests, limited to `Referer` in browser requests, cannot disable ban-avoidance headers |
| Ad blocking                     | Yes                                   | No                                                                                                    |
| Resource blocking               | Yes                                   | No                                                                                                    |
| Custom proxies                  | Yes                                   | No                                                                                                    |
| Server-side CSS/XPath selectors | Yes                                   | No                                                                                                    |
| Rate limiting                   | Concurrency-based                     | RPM-based                                                                                             |
| Usage API                       | Yes, up to 6 requests per second      | Yes, up to 20 requests per second                                                                     |

#### Pricing

ScrapingBee offers 4 plans with a fixed price per month, each with a fixed
number of “credits” per month that you must spend within that month or lose.

With Zyte API you pay only for what you use, up to a $100 monthly
spending limit. If you need a higher spending limit, you must commit to paying
half of that limit as a monthly commitment, which you do not get back if you
spend less during a month.

With ScrapingBee, HTTP requests cost 1 credit each, while browser requests cost
5 credits each. If you need to use device residential IPs (“premium proxies”)
to avoid bans, costs rise to 10 credits per HTTP request (10×) and 25 credits
per browser request (5×). For scenarios where device residential IPs do not
avoid bans either, ScrapingBee offers special “stealth” proxies for browser
requests at 75 credits per request (15×). ScrapingBee also charges 20 credits
when targeting Google domains.

With Zyte API, request cost depends not only on the type of request
(HTTP or browser), but also on the tier of the target website.
That cost covers any tech that Zyte API may use to get you a ban-free
response, including browser rendering and device residential IPs. There is no
extra cost for Google domains, not even for automatic extraction of SERP data
([serp](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/serp)).

Unless you never use premium or stealth proxies, you mostly target
high-tier websites, and the number of credits per month that you need is
close to those included in one of ScrapingBee's plans, **Zyte API tends to be a
cheaper choice**.

For example, the $49 ScrapingBee plan includes 150k credits, i.e. 150k HTTP
requests. For tier 1-2 websites (i.e. most websites), Zyte API is cheaper.
Zyte API can also be cheaper for higher-tier websites if you need fewer than
150k requests: 114k requests for tier 3, 70k requests for tier 4, and 39k
requests for tier 5.
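
To illustrate how quickly credits are consumed, the following sketch applies
the ScrapingBee credit costs listed above to the 150k credits of the $49 plan.
The dictionary keys and the `max_requests` helper are hypothetical names
chosen for this example.

```python
# Credits charged per request, as listed in the pricing description above.
CREDITS_PER_REQUEST = {
    "http": 1,
    "browser": 5,
    "http_premium": 10,
    "browser_premium": 25,
    "browser_stealth": 75,
}


def max_requests(plan_credits: int, request_type: str) -> int:
    """Return how many requests of one type a credit budget covers."""
    return plan_credits // CREDITS_PER_REQUEST[request_type]


plan_credits = 150_000  # credits included in the $49 plan mentioned above
for request_type in CREDITS_PER_REQUEST:
    print(request_type, max_requests(plan_credits, request_type))
```

Under these costs, the same plan covers 150,000 plain HTTP requests but only
2,000 browser requests through stealth proxies.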

#### Ban handling

ScrapingBee makes it your responsibility to choose the right technologies
(browser rendering, device residential IPs, “stealth IPs”) to avoid bans, with
the corresponding cost increase.

Zyte API transparently chooses the leanest technology that works, at no extra
cost, and automatically adapts to website changes.

#### Automatic extraction

ScrapingBee supports automatic extraction through user-defined LLM prompts.

Zyte API supports automatic extraction for supported types *and*
user-defined LLM prompts to extract additional fields.

Both ScrapingBee and Zyte API support Google SERP extraction
([serp](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/serp)).

#### Rate limiting

ScrapingBee limits the number of concurrent requests that you can send,
starting at 5 with the most basic plan.

Zyte API limits the number of requests per minute (RPM) that you can send. It
is 3000 by default for all Zyte API keys, but you can request a higher
limit.

Advanced features like browser rendering or automatic extraction usually
increase response times. With RPM-based rate limiting, concurrency is
unlimited, so you can maintain your throughput regardless of which features
you use. With concurrency-based limits, your crawls slow down as you use
features that make requests slower.

For example, assuming an HTTP request takes 2 seconds and a
browser request takes 20 seconds, switching from HTTP
requests to browser requests with ScrapingBee would make your crawl 10 times
slower, while Zyte API would allow you to maintain a similar crawl speed by
using more concurrent requests to make up for the response time increase.
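
The arithmetic behind this example can be sketched as follows. The function
names are hypothetical, the limit of 5 concurrent requests is the basic-plan
figure mentioned above, and the response times are the assumed 2 and 20
seconds.

```python
def concurrency_limited_rpm(max_concurrency: int, response_seconds: float) -> float:
    """Requests per minute achievable with a fixed number of concurrent requests."""
    return max_concurrency * 60 / response_seconds


def concurrency_needed(target_rpm: float, response_seconds: float) -> float:
    """Concurrent requests needed to sustain a target RPM (Little's law)."""
    return target_rpm * response_seconds / 60


# Concurrency-based limit: 5 concurrent requests.
http_rpm = concurrency_limited_rpm(5, 2)      # 150.0
browser_rpm = concurrency_limited_rpm(5, 20)  # 15.0
print(http_rpm / browser_rpm)  # slowdown factor when switching to browser requests

# RPM-based limit: keep 150 RPM by raising concurrency instead.
print(concurrency_needed(150, 2), concurrency_needed(150, 20))
```

With a concurrency cap, throughput falls in proportion to response time. With
an RPM cap, the client keeps the same throughput by running 50 concurrent
requests instead of 5.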

### Migrating

The main differences between the HTTP APIs of ScrapingBee and Zyte API are how
request parameters are defined and how the response is encoded.

In **ScrapingBee**, you send a `GET` request, and you specify parameters in
the URL query string, URL-encoded, e.g.

```bash
curl "https://app.scrapingbee.com/api/v1/?api_key=YOUR_SCRAPINGBEE_API_KEY&url=https%3A%2F%2Ftoscrape.com"
```

The API response body comes straight from the target website:

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
        <title>Scraping Sandbox</title>
        …
```

HTTP response headers and cookies from the target website are also returned as
regular headers and cookies, with their names prefixed with `Spb-`:

```none
Spb-Content-Encoding: br
Spb-Content-Type: text/html
```

In **Zyte API**, you send a `POST` request, and you specify parameters in the
request body as JSON, e.g.

> ###### TIP
>
> Like ScrapingBee, Zyte API offers a proxy mode
> that you can use instead of the HTTP API if it makes things simpler.

```bash
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data '{"url": "https://toscrape.com", "httpResponseBody": true, "httpResponseHeaders": true}' \
    --compressed \
    https://api.zyte.com/v1/extract
```

The API response is a JSON object with all the response data from the target
website:

```json
{
    "url": "https://toscrape.com/",
    "statusCode": 200,
    "httpResponseBody": "PCFET0NUWVBFIGh0bWw+CjxodG1sIGxhbmc9ImVuIj4KICAgIDx…",
    "httpResponseHeaders": [
        {
            "name": "content-type",
            "value": "text/html"
        },
        {
            "name": "content-encoding",
            "value": "br"
        }
    ]
}
```

> ###### NOTE
>
> [httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/httpResponseBody) is base64-encoded to support binary
> responses, like images or PDF files.

Once you understand how to migrate a simple request like the one above, you can
migrate any other request the same way, replacing ScrapingBee parameters
with Zyte API counterparts.

### Parameter mapping

| ScrapingBee               | Zyte API                                                                                                                                                                                                                                           |
|---------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| (default)                 | [httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpResponseBody), [httpResponseHeaders](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpResponseHeaders)       |
| `api_key`                 | Use basic authentication                                                                                                                                                                                                                           |
| `url`                     | [url](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/url)                                                                                                                                                           |
| `render_js`               | [browserHtml](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/browserHtml)                                                                                                                                           |
| `js_scenario`             | See below                                                                                                                                                                                                                                          |
| `wait`                    | `waitForTimeout` action (see below)                                                                                                                                                                                                                |
| `wait_for`                | `waitForSelector` action (see below)                                                                                                                                                                                                               |
| `wait_browser`            | `waitForNavigation` action (see below)                                                                                                                                                                                                             |
| `block_ads`               | Not supported                                                                                                                                                                                                                                      |
| `block_resources`         | Not supported                                                                                                                                                                                                                                      |
| `window_width`            | [viewport](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/viewport)                                                                                                                                                 |
| `window_height`           | [viewport](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/viewport)                                                                                                                                                 |
| `premium_proxy`           | [ipType=residential](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/ipType) (not required to avoid bans)                                                                                                            |
| `country_code`            | [geolocation](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/geolocation) (does not require [ipType=residential](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/ipType))             |
| `stealth_proxy`           | N/A, ban avoidance is a transparent feature                                                                                                                                                                                                        |
| `own_proxy`               | Not supported                                                                                                                                                                                                                                      |
| `forward_headers`         | [customHttpRequestHeaders](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/customHttpRequestHeaders), [requestHeaders](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/requestHeaders) |
| `forward_headers_pure`    | Not supported                                                                                                                                                                                                                                      |
| `ai_query`                | [customAttributes](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/customAttributes)                                                                                                                                 |
| `ai_selector`             | Not supported                                                                                                                                                                                                                                      |
| `ai_extract_rules`        | [customAttributes](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/customAttributes)                                                                                                                                 |
| `extract_rules`           | Not supported                                                                                                                                                                                                                                      |
| `screenshot`              | [screenshot](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/screenshot)                                                                                                                                             |
| `screenshot_selector`     | Not supported                                                                                                                                                                                                                                      |
| `screenshot_full_page`    | [screenshotOptions.fullPage=true](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/screenshotOptions.fullPage)                                                                                                        |
| `json_response`           | See zapi-network-capture                                                                                                                                                                                                                           |
| `return_page_source`      | Not supported (use [httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpResponseBody) if you are only using browser rendering to avoid bans)                                                       |
| `scraping_config`         | Not supported                                                                                                                                                                                                                                      |
| `session_id`              | [session.id](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/session.id) (must be UUID4)                                                                                                                             |
| `timeout`                 | Not supported                                                                                                                                                                                                                                      |
| `cookies`                 | [requestCookies](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/requestCookies)                                                                                                                                     |
| `device`                  | [device](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/device)                                                                                                                                                     |
| `custom_google`           | N/A                                                                                                                                                                                                                                                |
| `transparent_status_code` | N/A, Zyte API returns the response or not based on whether or not it is a ban, not based on the status code                                                                                                                                        |

### Action mapping

ScrapingBee allows defining a sequence of browser actions through the
`"instructions"` JSON array of the `js_scenario` parameter. For example:

```json
{
    "instructions": [
        {"click": "#buttonId"}
    ]
}
```

URL-encoded, this becomes:

```none
js_scenario=%7B%22instructions%22%3A+%5B%7B%22click%22%3A+%22%23buttonId%22%7D%5D%7D
```
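As a minimal sketch, you can reproduce the query string above with the Python standard library. With their default settings, `json.dumps` and `urllib.parse.urlencode` yield the same encoding as the example:

```python
import json
from urllib.parse import urlencode

scenario = {"instructions": [{"click": "#buttonId"}]}

# json.dumps() serializes the scenario; urlencode() percent-encodes the
# resulting text and turns spaces into "+".
query = urlencode({"js_scenario": json.dumps(scenario)})
print(query)
```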

The Zyte API equivalent is the [actions](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/actions) field. The following is
a matching example:

```json
{
    "actions": [
        {
            "action": "click",
            "selector": {
                "type": "css",
                "value": "#buttonId"
            }
        }
    ]
}
```

These are ScrapingBee actions and their Zyte API counterparts:

- `click`: `click`
- `evaluate`: `evaluate`
- `fill`: `type`
- `infinite_scroll`: `scrollBottom`
- `scroll_x`: `scrollTo`
- `scroll_y`: `scrollTo`
- `wait`: `waitForTimeout`
- `wait_for`: `waitForSelector`
- `wait_for_and_click`: `waitForSelector`, `click`

The following Zyte API actions are not supported by ScrapingBee:

- `doubleClick`
- `goto`
- `hide`
- `hover`
- `keyPress`
- `reload`
- `searchKeyword`
- `select`
- `setLocation`
- `waitForRequest`
- `waitForResponse`

Zyte API also supports custom actions.
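The mapping above can be sketched as a small conversion helper. This is only an illustration for selector-based instructions: `to_zyte_actions` and `css` are hypothetical names, selectors are assumed to be CSS, and `waitForSelector` is assumed to take the same `selector` object as the `click` example above.

```python
def css(value):
    """Build a Zyte API selector object for a CSS selector."""
    return {"type": "css", "value": value}


def to_zyte_actions(instructions):
    """Convert ScrapingBee instructions into a list of Zyte API actions.

    Only click, wait_for and wait_for_and_click are handled in this sketch.
    """
    actions = []
    for instruction in instructions:
        for name, argument in instruction.items():
            if name == "click":
                actions.append({"action": "click", "selector": css(argument)})
            elif name == "wait_for":
                actions.append(
                    {"action": "waitForSelector", "selector": css(argument)}
                )
            elif name == "wait_for_and_click":
                # One ScrapingBee instruction becomes two Zyte API actions.
                actions.append(
                    {"action": "waitForSelector", "selector": css(argument)}
                )
                actions.append({"action": "click", "selector": css(argument)})
            else:
                raise NotImplementedError(f"Unhandled instruction: {name}")
    return actions


print(to_zyte_actions([{"wait_for_and_click": "#buttonId"}]))
```

Other instructions, such as `wait` or `fill`, need their own parameter conversion, so check the reference documentation of each action before extending the helper.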

## Migrating from scrapy-zyte-smartproxy to scrapy-zyte-api

This migration guide provides the steps necessary to migrate from
scrapy-zyte-smartproxy or scrapy-crawlera
to scrapy-zyte-api.

> ###### NOTE
>
> If you use Smart Proxy Manager, see
> spm-migrate for general migration information.

### Maybe keep scrapy-zyte-smartproxy

If you use scrapy-zyte-smartproxy for
Scrapy integration with Smart Proxy Manager, and
you only want to migrate to Zyte API to enjoy better
ban avoidance or pricing, you can
continue using scrapy-zyte-smartproxy: scrapy-zyte-smartproxy 2.3.1 and higher
support the proxy mode of Zyte API.

> ###### TIP
>
> If you are using scrapy-crawlera, you would need to migrate to
> scrapy-zyte-smartproxy to use Zyte API proxy mode.
> See the release notes of scrapy-zyte-smartproxy 2.0.0 for details. It might be worth migrating to
> scrapy-zyte-api instead.

To switch from Smart Proxy Manager to the proxy mode of Zyte API, replace your
Smart Proxy Manager API key with [your Zyte API key](https://app.zyte.com/o/zyte-api/api-access), and set the
ZYTE_SMARTPROXY_URL setting to `"http://api.zyte.com:8011"`.
Alternatively, you can enable Zyte API proxy mode for specific requests.
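As a sketch, and assuming that you configure scrapy-zyte-smartproxy through its `ZYTE_SMARTPROXY_ENABLED` and `ZYTE_SMARTPROXY_APIKEY` settings, the resulting configuration could look like this:

settings.py
```python
ZYTE_SMARTPROXY_ENABLED = True
# Use your Zyte API key here, not your Smart Proxy Manager API key.
ZYTE_SMARTPROXY_APIKEY = "YOUR_ZYTE_API_KEY"
# Point the plugin at the proxy mode endpoint of Zyte API.
ZYTE_SMARTPROXY_URL = "http://api.zyte.com:8011"
```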

You should also add 520 and 521 to the `RETRY_HTTP_CODES` setting:

settings.py
```python
from scrapy.settings.default_settings import RETRY_HTTP_CODES as DEFAULT_RETRY_HTTP_CODES

RETRY_HTTP_CODES = DEFAULT_RETRY_HTTP_CODES + [520, 521]
```

scrapy-zyte-smartproxy automatically translates Smart Proxy Manager headers
into their Zyte API counterparts where possible, and drops them otherwise.
However, you should eventually update your headers; see spm-migrate-map.

Using scrapy-zyte-smartproxy for Zyte API makes it easier to migrate from Smart
Proxy Manager. However, the proxy mode of Zyte API has feature
differences with the HTTP API.
Continue reading to learn how to migrate to scrapy-zyte-api.

> ###### TIP
>
> You can keep both scrapy-zyte-smartproxy and scrapy-zyte-api, and use
> one or the other for different requests or spiders.

### Set up scrapy-zyte-api

1. You need Python 3.8 or higher to use the latest version of scrapy-zyte-api.
2. You need Scrapy 2.0.1 or higher to use the latest version of
   scrapy-zyte-api.

   If you are using a lower version of Scrapy, please upgrade to a higher
   Scrapy version, and make sure your code works as expected with the newer
   Scrapy version before you continue the migration process.

   The [Scrapy release notes](https://docs.scrapy.org/en/latest/news.html)
   of every Scrapy version cover backward-incompatible changes and deprecation
   removals, which should help you upgrade your existing code as you upgrade
   Scrapy.
3. Install the latest version of scrapy-zyte-api:
   ```bash
   pip install --upgrade scrapy-zyte-api
   ```
4. Configure scrapy-zyte-api in your `settings.py` file. If your Scrapy
   version is 2.10 or higher, add the following settings:
   ```python
   ADDONS = {
       "scrapy_zyte_api.Addon": 500,
   }
   ZYTE_API_TRANSPARENT_MODE = False
   ```

   Otherwise add the following settings:
   ```python
   DOWNLOAD_HANDLERS = {
       "http": "scrapy_zyte_api.ScrapyZyteAPIDownloadHandler",
       "https": "scrapy_zyte_api.ScrapyZyteAPIDownloadHandler",
   }
   DOWNLOADER_MIDDLEWARES = {
       "scrapy_zyte_api.ScrapyZyteAPIDownloaderMiddleware": 1000,
   }
   REQUEST_FINGERPRINTER_CLASS = "scrapy_zyte_api.ScrapyZyteAPIRequestFingerprinter"
   SPIDER_MIDDLEWARES = {
       "scrapy_zyte_api.ScrapyZyteAPISpiderMiddleware": 100,
   }
   TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
   ```

   If any of these settings already exists in your `settings.py` file,
   modify the existing setting as needed instead of re-defining it. For
   example, if you already have `DOWNLOADER_MIDDLEWARES` defined, add
   `"scrapy_zyte_api.ScrapyZyteAPIDownloaderMiddleware": 1000,` to your
   existing definition, keeping existing downloader middlewares untouched.

   Also, make sure that these settings are not being overridden elsewhere. For
   example, make sure they are not defined in multiple lines of your
   `settings.py` file, and that they are not overridden in your [Scrapy
   Cloud project settings](https://support.zyte.com/support/solutions/articles/22000200670-customizing-scrapy-settings-in-scrapy-cloud).
   > ###### NOTE
   >
   > On projects that were not using the asyncio Twisted reactor, your
   > existing code may need changes, such as:
   > - [Handling a pre-installed Twisted reactor](https://docs.scrapy.org/en/latest/topics/asyncio.html#handling-a-pre-installed-reactor).
   >
   >   Some Twisted imports install the default, non-asyncio Twisted
   >   reactor as a side effect. Once a reactor is installed, it cannot be
   >   changed for the whole run time.
   > - [Converting Twisted Deferreds into asyncio Futures](https://docs.scrapy.org/en/latest/topics/asyncio.html#awaiting-on-deferreds).
   >
   >   Note that you might be using Deferreds without realizing it through
   >   some Scrapy functions and methods. For example, when you yield the
   >   return value of `self.crawler.engine.download()` from a spider
   >   callback, you are yielding a Deferred.
5. Add [your Zyte API key](https://app.zyte.com/o/zyte-api/api-access) to
   `settings.py` as well:
   ```python
   ZYTE_API_KEY = "YOUR_ZYTE_API_KEY"
   ```
6. The `COOKIES_ENABLED` setting is not enough to enable cookie support;
   you must also define an additional setting in
   `settings.py`:
   ```python
   ZYTE_API_EXPERIMENTAL_COOKIES_ENABLED = True
   ```

### Migrate

Your next steps depend on how you want to approach your migration. You can
migrate some requests,
migrate some spiders, or
migrate your entire project.

#### Migrate a request

Migrating requests makes sense if you want to keep scrapy-zyte-smartproxy but you need to drive specific requests
through scrapy-zyte-api for features only available through the HTTP API.

To migrate a Scrapy request, set the following fields in the request metadata:

```python
yield Request(
    ...,
    meta={
        "dont_proxy": True,
        "zyte_api_automap": True,
    },
)
```

> ###### TIP
>
> If your spider stops with the `plugin_conflict` finish reason, make
> sure the `ZYTE_API_TRANSPARENT_MODE` setting is `False`. Only set
> `ZYTE_API_TRANSPARENT_MODE` to `True` when migrating an
> entire spider or project.

#### Migrate a spider

Compared to migrating an entire project, migrating spiders one by one,
incrementally, can be more time consuming, but also less disruptive, giving you
time to validate the migration of each spider separately.

To migrate a Scrapy spider, use [custom_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.custom_settings) or [update_settings](https://docs.scrapy.org/en/master/topics/spiders.html#scrapy.Spider.update_settings) to
toggle scrapy-zyte-smartproxy, scrapy-crawlera, and scrapy-zyte-api:

```python
class MySpider(Spider):
    custom_settings = {
        "ZYTE_API_TRANSPARENT_MODE": True,
        "ZYTE_SMARTPROXY_ENABLED": False,
        "CRAWLERA_ENABLED": False,  # Only needed if you use scrapy-crawlera
    }
```

You can look at the stats of a crawl after migration to check that the
migration was successful: there should be `scrapy-zyte-api`-prefixed
stats indicating scrapy-zyte-api usage, and
there should be no scrapy-zyte-smartproxy stats, which are prefixed with either
`zyte_smartproxy` (Smart Proxy Manager) or `zyte_api_proxy` (Zyte API), or
scrapy-crawlera stats, which are prefixed with `crawlera`.
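The check described above can be sketched as a small helper that inspects stat names. `migration_looks_successful` is a hypothetical name, and the stat keys in the example call are illustrative rather than an exhaustive list of what each plugin reports.

```python
LEGACY_PREFIXES = ("zyte_smartproxy", "zyte_api_proxy", "crawlera")


def migration_looks_successful(stats):
    """Return True if stats show scrapy-zyte-api usage and no legacy plugin."""
    keys = [str(key) for key in stats]
    uses_zyte_api = any(key.startswith("scrapy-zyte-api") for key in keys)
    uses_legacy = any(key.startswith(LEGACY_PREFIXES) for key in keys)
    return uses_zyte_api and not uses_legacy


# Illustrative stats dictionary, not real crawl output.
print(migration_looks_successful({"scrapy-zyte-api/success": 10}))
```

In a spider, you could pass `self.crawler.stats.get_stats()` to such a helper from the `closed()` method.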

#### Migrate a project

To migrate a Scrapy project:

1. Disable scrapy-zyte-smartproxy or scrapy-crawlera.

   scrapy-zyte-smartproxy is enabled through the `ZYTE_SMARTPROXY_ENABLED`
   setting. scrapy-crawlera through `CRAWLERA_ENABLED`.

   To disable, find where you define that setting (e.g. `settings.py`,
   Scrapy Cloud settings), and remove it.

   Also, make sure you are not enabling those settings on specific spiders,
   e.g. through the [custom_settings](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.custom_settings) class attribute of a spider
   class, or in your cloud (e.g. in Scrapy Cloud, which
   allows overriding settings for specific spiders).
2. Configure Zyte API to run in transparent mode.

   If you use `scrapy_zyte_api.Addon`, remove the
   `ZYTE_API_TRANSPARENT_MODE = False` line from `settings.py`. The add-on
   enables transparent mode automatically.

   If you do *not* use `scrapy_zyte_api.Addon`, add the following line to
   `settings.py`:
   ```python
   ZYTE_API_TRANSPARENT_MODE = True
   ```

To check that the migration was successful, you can either check stats for each spider or remove
scrapy-zyte-smartproxy and scrapy-crawlera.

#### Remove proxy headers

Regardless of whether you are migrating only some spiders or your whole
project, review the code of requests that now go through Zyte API to look for
proxy headers, i.e. those prefixed with `X-Crawlera-` or `Zyte-`
(case-insensitive), and replace them with Zyte API counterparts according to
this table.
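To help with that review, the following sketch flags proxy headers in a dictionary of request headers. `find_proxy_headers` is a hypothetical helper, header names are assumed to be `str` keys, and the headers in the example call are only illustrative.

```python
PROXY_HEADER_PREFIXES = ("x-crawlera-", "zyte-")


def find_proxy_headers(headers):
    """Return the names of headers that look like proxy headers."""
    return sorted(
        name
        for name in headers
        # Lowercase the name so that the prefix match is case-insensitive.
        if name.lower().startswith(PROXY_HEADER_PREFIXES)
    )


print(
    find_proxy_headers(
        {"X-Crawlera-Profile": "desktop", "zyte-geolocation": "US", "Accept": "*/*"}
    )
)
```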

> ###### TIP
>
> You can usually find Scrapy requests by searching your code for uses
> of the `Request` class, but mind that there are other
> ways to create requests, including: `request.copy()`, `request.replace()`, `request.from_curl()`,
> `request_from_dict()`, `response.follow()` and `response.follow_all()`.

You can specify those parameters through a `zyte_api_automap` dictionary
in request metadata. For example, to set the [geolocation](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/geolocation) of a
request to the USA:

```python
yield Request(
    ...,
    meta={
        "zyte_api_automap": {
            "geolocation": "US",
        },
    },
)
```

For details, see automap.

#### Handle retries

scrapy-zyte-api implements an advanced retry mechanism, with a default retry policy that should work for most scenarios.

If retries for ban responses are being exceeded and
you want to increase retries, or if you want to retry permanent download
errors, you can try switching to the
aggressive retry policy:

settings.py
```python
ZYTE_API_RETRY_POLICY = "zyte_api.aggressive_retrying"
```

You can also create a custom retry policy. See the reference documentation of
`RetryFactory` and `AggressiveRetryFactory`
for examples.

When retries are exceeded for a given request, an exception is raised, and if
not caught, an error message is logged. See retry-non-successful to
learn how to handle such exceptions.

#### Adjust the crawl speed

If your crawl speed lowers significantly after migrating:

1. Ensure that you are not setting a `DOWNLOAD_DELAY`, which scrapy-zyte-smartproxy and scrapy-crawlera
   ignore, but scrapy-zyte-api respects.

   If you need to keep a download delay for some domains, you can use the
   `DOWNLOAD_SLOTS` setting. Note that
   requests sent through scrapy-zyte-api use a different slot, prefixed with
   `zyte-api@` (e.g. `zyte-api@example.com`).
2. Increase the `CONCURRENT_REQUESTS`
   and `CONCURRENT_REQUESTS_PER_DOMAIN` settings as needed.
3. If a higher concurrency does not improve your crawl speed, the cause may be
   rate limiting; if the
   `scrapy-zyte-api/throttle_ratio` Scrapy stat is high, you may want to
   request a higher limit.
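For step 1 above, a per-domain delay could be sketched as follows, assuming a Scrapy version that supports the `DOWNLOAD_SLOTS` setting. The `example.com` domain and the 2-second delay are placeholders.

settings.py
```python
# Do not set a global DOWNLOAD_DELAY; delay only the slot that
# scrapy-zyte-api uses for requests to example.com.
DOWNLOAD_SLOTS = {
    "zyte-api@example.com": {"delay": 2},
}
```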

If your crawl speed increases too much after migrating:

- If the AutoThrottle Scrapy extension is
  enabled (i.e. `AUTOTHROTTLE_ENABLED`
  is `True`, as it is by default in Scrapy Cloud),
  scrapy-zyte-api bypasses the extension for Zyte API requests, to let Zyte
  API handle rate limiting on its own.

  Set the `ZYTE_API_PRESERVE_DELAY` setting to `True` to prevent
  scrapy-zyte-api from bypassing the extension.

#### Memory may increase

Zyte API HTTP response bodies are Base64-encoded, making
them 33-37% larger, hence increasing memory usage.
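The lower end of that range follows from Base64 encoding every 3 bytes as 4 characters. A minimal sketch with a made-up 3000-byte body:

```python
import base64

raw = bytes(3000)  # 3000 zero bytes standing in for a response body
encoded = base64.b64encode(raw)

# Relative size increase caused by the encoding.
overhead = len(encoded) / len(raw) - 1
print(f"{len(raw)} bytes -> {len(encoded)} bytes (+{overhead:.0%})")
```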

If your spider runs out of memory after migration, consider:

- Increasing available memory. If you use Scrapy Cloud, use more units.
- Lowering `SCRAPER_SLOT_MAX_ACTIVE_SIZE` to a value that prevents exceeding
  available memory while allowing an acceptable crawl speed.

### Remove scrapy-zyte-smartproxy (optional)

Once you have migrated all your code and are happy with the result, you can
remove scrapy-zyte-smartproxy and scrapy-crawlera:

```bash
pip uninstall scrapy-zyte-smartproxy scrapy-crawlera
```

And remove from your code and from Scrapy Cloud any related Scrapy setting,
i.e. those prefixed with either `ZYTE_SMARTPROXY_` or `CRAWLERA_`,
including those that you used to disable scrapy-zyte-smartproxy in an earlier
migration step (no need to disable something that is not installed anymore).

## Migrating from ZenRows to Zyte API

Learn how to migrate from [ZenRows](https://www.zenrows.com/) to Zyte API.

### Feature comparison

The following table summarizes the feature differences between both products:

| Feature                       | Zyte API                                               | ZenRows                                                          |
|-------------------------------|--------------------------------------------------------|------------------------------------------------------------------|
| API                           | HTTP or proxy                                          | HTTP                                                             |
| Client software               | Python, Scrapy                                         | Python, NodeJS                                                   |
| Restricted website categories | No broad category restrictions [^1]                    | Banks, payment gateways, visas/permits, government               |
| Advanced ban avoidance        | Always available, automatic                            | Only Business+, manual                                           |
| Automatic extraction          | AI-powered, standard schemas                           | Undocumented website support, item type support or output schema |
| Markdown output               | No                                                     | Yes                                                              |
| Geolocation                   | 249 countries, data center support                     | 190 countries, no data center support                            |
| Sessions                      | Client-managed (15m) and server-managed                | Client-managed only (10m), no cookies                            |
| Actions                       | Basic (15), advanced, website-specific and custom      | Basic only (10)                                                  |
| Screenshots                   | JPEG/PNG, configurable viewport, cannot target element | PNG only, fixed viewport, can target element                     |
| Network capture               | Up to 5 MiB / 10 responses                             | Unlimited                                                        |
| Network blocking              | No                                                     | Yes                                                              |
| JavaScript disabling          | Yes                                                    | No                                                               |
| Server-side CSS selectors     | No                                                     | Yes                                                              |
| Rate limiting                 | RPM-based                                              | Concurrency-based                                                |
| Overuse handling              | Rate-limiting responses                                | Rate-limiting responses followed by IP blocking                  |

[^1]: Some specific websites may be blocked for legal or compliance reasons.

#### Automatic extraction

ZenRows supports automatic extraction, but their documentation does not provide
details on supported websites, item types or output schemas.

Zyte API automatic extraction is AI-based, i.e. it
works on any website of a supported type (e.g.
e-commerce, blogs/news, job postings), and we provide detailed documentation
about output schemas.

#### Sessions

ZenRows only supports client-managed sessions, and
limits them to 10 minutes. Moreover, their sessions do not maintain cookies;
you must do that on the client side.

Zyte API allows 15 minutes for client-managed sessions, but also supports
server-managed sessions with much longer
lifetimes and an easier API. Moreover, the Scrapy plugin supports an additional session management API.

#### Screenshots

Both ZenRows and Zyte API support PNG screenshots of the visible viewport or
the full page.

ZenRows allows taking a screenshot of a specific element.

Zyte API allows configuring the browser [viewport](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/viewport).

Zyte API can return both [browserHtml](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/browserHtml) and
[screenshot](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/screenshot) on the same request, i.e. get the browser HTML
matching a given screenshot. In ZenRows you would need 2 separate requests, and
the contents of each might not be a perfect match.

#### Rate limiting

ZenRows limits the number of concurrent requests that you can send, starting at
10 with the most basic plan.

Zyte API limits the number of requests per minute (RPM) that you can send. It
is 3000 by default for all Zyte API keys, but you can request a higher
limit.

For services like these that support advanced features like browser
rendering or automatic extraction, which usually increase response times, RPM rate limiting
allows you to maintain your throughput regardless of which features you use
thanks to unlimited concurrency, while concurrency-based limits slow down your
crawls as you use features that make requests slower.

For example, assuming an HTTP request takes 2 seconds
and a browser request takes 20 seconds, switching
from HTTP requests to browser requests with ZenRows would make your crawl 10
times slower, while Zyte API would allow you to maintain a similar crawl speed
by using more concurrent requests to make up for the response time increase.
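The arithmetic behind this example can be sketched as follows, assuming constant response times and the concurrency limit of 10 mentioned above. The function names are illustrative.

```python
def requests_per_minute(concurrency, seconds_per_request):
    """Throughput when `concurrency` requests are always in flight."""
    return concurrency * 60 / seconds_per_request


def concurrency_needed(target_rpm, seconds_per_request):
    """Concurrent requests needed to sustain `target_rpm`."""
    return target_rpm * seconds_per_request / 60


http_rpm = requests_per_minute(10, 2)      # HTTP requests, concurrency 10
browser_rpm = requests_per_minute(10, 20)  # browser requests, concurrency 10
print(http_rpm, browser_rpm, concurrency_needed(http_rpm, 20))
```

With a fixed concurrency of 10, throughput drops from 300 to 30 requests per minute. With an RPM-based limit, raising concurrency to 100 keeps it at 300.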

#### Overuse handling

When you exceed your concurrency with ZenRows, they start by sending
rate-limiting responses, but eventually they block your IP address for
increasing amounts of time.

With Zyte API, reaching your rate limit is not only allowed, but
encouraged, and you can request a higher limit
if you need it.

### Migrating

The main differences between the HTTP APIs of ZenRows and Zyte API are how
request parameters are defined and how the response is encoded.

In **ZenRows**, you send a `GET` request, and you specify parameters in the
URL query string, URL-encoded, e.g.

```bash
curl "https://api.zenrows.com/v1/?apikey=YOUR_ZENROWS_API_KEY&url=https%3A%2F%2Ftoscrape.com"
```

The API response body comes straight from the target website:

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
        <title>Scraping Sandbox</title>
        …
```

HTTP response headers from the target website are also received as regular
headers, only prefixed with `Zr-`, and the response URL (which might not
match the request URL, e.g. in case of redirection) is received as the special `Zr-Final-Url` header:

```none
Zr-Content-Encoding: br
Zr-Content-Type: text/html
Zr-Final-Url: https://toscrape.com/
```

In **Zyte API**, you send a `POST` request, and you specify parameters in the
request body as JSON, e.g.

```bash
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data '{"url": "https://toscrape.com", "httpResponseBody": true, "httpResponseHeaders": true}' \
    --compressed \
    https://api.zyte.com/v1/extract
```

The API response is a JSON object with all the response data from the target
website:

```json
{
    "url": "https://toscrape.com/",
    "statusCode": 200,
    "httpResponseBody": "PCFET0NUWVBFIGh0bWw+CjxodG1sIGxhbmc9ImVuIj4KICAgIDx…",
    "httpResponseHeaders": [
        {
            "name": "content-type",
            "value": "text/html"
        },
        {
            "name": "content-encoding",
            "value": "br"
        }
    ]
}
```

> ###### NOTE
>
> [httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/httpResponseBody) is base64-encoded to support binary
> responses, like images or PDF files.
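As a minimal sketch of handling such a response on the client side, the following code decodes the body and turns the header list into a dictionary. `api_response` is a hand-built stand-in for a parsed Zyte API response, not real output.

```python
import base64

# Stand-in for the parsed JSON response of Zyte API.
api_response = {
    "url": "https://toscrape.com/",
    "statusCode": 200,
    "httpResponseBody": base64.b64encode(b'<!DOCTYPE html>\n<html lang="en">').decode(),
    "httpResponseHeaders": [
        {"name": "content-type", "value": "text/html"},
    ],
}

# The body is Base64 text in the JSON; decode it to get the original bytes.
body = base64.b64decode(api_response["httpResponseBody"])

# A plain dict keeps only the last value of a repeated header name.
headers = {h["name"]: h["value"] for h in api_response["httpResponseHeaders"]}

print(body[:15], headers)
```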

Once you understand how to migrate a simple request like the one above, you can
migrate any other request the same way, replacing ZenRows parameters with
Zyte API counterparts.

### Parameter mapping

| ZenRows                | Zyte API                                                                                                                                                                                                                                     |
|------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| (default)              | [httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpResponseBody), [httpResponseHeaders](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpResponseHeaders) |
| `apikey`               | Use basic authentication                                                                                                                                                                                                                     |
| `url`                  | [url](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/url)                                                                                                                                                     |
| `js_render`            | [browserHtml](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/browserHtml)                                                                                                                                     |
| `custom_headers`       | [customHttpRequestHeaders](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/customHttpRequestHeaders)                                                                                                           |
| `premium_proxy`        | [ipType=residential](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/ipType) (not required to avoid bans)                                                                                                      |
| `proxy_country`        | [geolocation](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/geolocation) (does not require [ipType=residential](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/ipType))       |
| `session_id`           | [session.id](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/session.id) (must be UUID4)                                                                                                                       |
| `device`               | [device](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/device)                                                                                                                                               |
| `original_status`      | N/A (see [statusCode](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/statusCode))                                                                                                                        |
| `allowed_status_codes` | N/A (see zapi-successful-responses-wrapping-bad-responses)                                                                                                                                                                                   |
| `block_resources`      | Not supported                                                                                                                                                                                                                                |
| `json_response`        | See zapi-network-capture                                                                                                                                                                                                                     |
| `css_extractor`        | Not supported                                                                                                                                                                                                                                |
| `autoparse`            | See zapi-extract                                                                                                                                                                                                                             |
| `markdown_response`    | Not supported                                                                                                                                                                                                                                |
| `screenshot`           | [screenshot](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/screenshot)                                                                                                                                       |
| `screenshot_fullpage`  | [screenshotOptions.fullPage=true](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/screenshotOptions.fullPage)                                                                                                  |
| `screenshot_selector`  | Not supported                                                                                                                                                                                                                                |

For parameters defining browser actions, see zenrows-actions.
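A few rows of the table above can be sketched as a conversion function. `zenrows_to_zyte` is a hypothetical helper. It assumes the ZenRows parameters are given as a Python dictionary with a boolean `js_render`, and it upper-cases `proxy_country` on the assumption that Zyte API expects codes written like `"US"`.

```python
def zenrows_to_zyte(params):
    """Build a Zyte API request body from a few ZenRows parameters."""
    body = {"url": params["url"]}
    if params.get("js_render"):
        body["browserHtml"] = True
    else:
        # Default ZenRows behavior: raw response body plus headers.
        body["httpResponseBody"] = True
        body["httpResponseHeaders"] = True
    if "proxy_country" in params:
        body["geolocation"] = params["proxy_country"].upper()
    return body


print(zenrows_to_zyte({"url": "https://toscrape.com", "js_render": True}))
```

The API key is not part of the request body: as the table indicates, Zyte API uses basic authentication instead.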

### Action mapping

These are ZenRows actions and their Zyte API counterparts:

- `check`: `click`
- `click`: `click`
- `evaluate`: `evaluate`
- `fill`: `type`
- `scroll_x`: `scrollTo`
- `scroll_y`: `scrollTo`
- `select_option`: `select`
- `uncheck`: `click`
- `wait`: `waitForTimeout`
- `wait_for`: `waitForSelector`

`wait_for` only supports CSS selectors, while `waitForSelector` also
supports XPath selectors.

ZenRows also has a `solve_captcha` action that requires you to specify which
CAPTCHA you need to solve. Zyte API instead avoids bans automatically by
default (no action necessary), and allows CAPTCHA management
to be disabled through zapi-permissions-control.

The following Zyte API actions are not supported by ZenRows:

- `doubleClick`
- `goto`
- `hide`
- `hover`
- `keyPress`
- `reload`
- `scrollBottom`
- `searchKeyword`
- `setLocation`
- `waitForNavigation`
- `waitForRequest`
- `waitForResponse`

Zyte API also supports custom actions.

ZenRows actions have `frame_`-prefixed counterparts that work on [iframes](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe),
and a utility action (`frame_reveal`) to inject iframe contents into the
main DOM. On Zyte API you need to use custom actions
to interact with iframes.

## Migrating from Smart Proxy Manager to Zyte API

Learn how to migrate from Smart Proxy Manager to
Zyte API.

### Key differences

The following table summarizes the feature differences between both products:

| Feature                | Smart Proxy Manager   | Zyte API                                        |
|------------------------|-----------------------|-------------------------------------------------|
| API                    | Proxy                 | HTTP or proxy                                   |
| Ban avoidance          | Good                  | Great                                           |
| Device residential IPs | Add-on                | Automatic (configurable)                        |
| Session management     | Client-managed        | Server-managed or client-managed                |
| Geolocation            | Manual                | Automatic (configurable)                        |
| Browser HTML           | No                    | Yes (HTTP API only, proxy mode support planned) |
| Screenshots            | No                    | Yes (HTTP API only)                             |
| Browser actions        | No                    | Yes (HTTP API only)                             |
| Network capture        | No                    | Yes (HTTP API only)                             |
| HTTP redirection       | Not followed          | Followed by default, can be disabled            |
| User throttling        | Concurrency-based     | Request-based                                   |

See also spm-migrate-map below for some additional, lower-level
differences.

#### Ban avoidance

Smart Proxy Manager does a good job at avoiding bans through proxy rotation,
ban detection, retrying algorithms, and browser mimicking through browser
profiles.

Zyte API improves on it by using an actual browser, if that is required to
prevent bans on a particular website.

Zyte API also supports webpage interaction.

#### Device residential IPs

Zyte API supports both static data center IPs and device residential IPs. It
automatically chooses the right type of IP address as needed, but it also
allows you to force a specific IP type.

#### Session management

While Smart Proxy Manager only supports client-managed sessions, Zyte API supports both client-managed
sessions and server-managed sessions.

The main difference between both implementations of client-managed sessions is
that, in Zyte API, it is the client (you), not the server, who generates the
session ID. See [session.id](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/session.id) for details.
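As a minimal sketch, you can generate a client-managed session ID with the Python standard library, given that the ID must be a UUID4. The request body below is illustrative and only shows where the ID goes.

```python
import uuid

# The client generates the session ID; Zyte API requires a UUID4.
session_id = str(uuid.uuid4())

request_body = {
    "url": "https://toscrape.com",
    "httpResponseBody": True,
    "session": {"id": session_id},
}
print(request_body)
```

Reusing the same `session_id` in later requests keeps them in the same session.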

Additionally, for some scenarios, using browser actions can remove the need for multiple requests with a shared
session.

#### Geolocation

Both products let you choose which country of origin to use for a request.

However, with Zyte API you usually do not need to manually choose which country
of origin to use for each request, because Zyte API automatically chooses the
best country of origin based on the target website.

Smart Proxy Manager does support a richer list of countries of origin that you
can set manually. However, if you let Zyte API choose the right country of
origin, it can use additional countries not available for manual override.

Smart Proxy Manager also allows defining an account with a set of geolocations,
so that requests using that account pick a geolocation from that set. Zyte API
does not support this: you can either omit the geolocation and let Zyte API
choose the best geolocation, or set a specific geolocation for a given request.
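
If you relied on an account-level set of geolocations, one possible workaround is to pick a geolocation on the client side for each request. The sketch below assumes that `geolocation` takes a two-letter country code; the country list and the helper name `with_random_geolocation` are hypothetical.

```python
import random

# Hypothetical set of countries, standing in for an account-level set.
GEOLOCATIONS = ("US", "GB", "DE")


def with_random_geolocation(payload: dict, geolocations=GEOLOCATIONS, rng=random) -> dict:
    """Return a copy of a request payload with a geolocation picked from a set."""
    return {**payload, "geolocation": rng.choice(geolocations)}


request = with_random_geolocation(
    {"url": "https://toscrape.com", "httpResponseBody": True}
)
```

Omitting `geolocation` altogether remains the simpler option when you do not need to restrict the countries of origin.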

For more information, see zapi-geolocation.

### Authentication

You cannot use your Smart Proxy Manager API key for Zyte API. You need to
[get a separate API key to use Zyte API](https://app.zyte.com/o/zyte-api/api-access).

### Proxy mode

Zyte API offers a proxy mode, which makes it easier to
migrate from Smart Proxy Manager.

> ###### NOTE
>
> Before you decide whether to use the proxy mode or the HTTP API, learn their differences.

To migrate, update your proxy endpoint and API key. You may also need to update some proxy headers as
indicated below.

> ###### WARNING
>
> The proxy mode is not optimized for use in combination with
> browser automation tools. Consider using Zyte API’s browser
> automation features instead. See
> zapi-browser-automation.

The following example shows a basic request using Smart Proxy Manager:

#### C#

```cs
using System;
using System.IO;
using System.Net;
using System.Text;

var proxy = new WebProxy("http://proxy.zyte.com:8011", true);
proxy.Credentials = new NetworkCredential("YOUR_ZYTE_API_KEY", "");

var request = (HttpWebRequest)WebRequest.Create("https://toscrape.com");
request.Proxy = proxy;
request.PreAuthenticate = true;
request.AllowAutoRedirect = false;

var response = (HttpWebResponse)request.GetResponse();
var stream = response.GetResponseStream();
var reader = new StreamReader(stream);
var httpResponseBody = reader.ReadToEnd();
reader.Close();
response.Close();

Console.WriteLine(httpResponseBody);
```

#### curl

```bash
curl \
    --proxy proxy.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    https://toscrape.com
```

#### JS

```js
const axios = require('axios')

axios
  .get(
    'https://toscrape.com',
    {
      proxy: {
        protocol: 'http',
        host: 'proxy.zyte.com',
        port: 8011,
        auth: {
          username: 'YOUR_ZYTE_API_KEY',
          password: ''
        }
      }
    }
  )
  .then((response) => {
    const httpResponseBody = response.data
    console.log(httpResponseBody)
  })
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('GET', 'https://toscrape.com', [
    'proxy' => 'http://YOUR_ZYTE_API_KEY:@proxy.zyte.com:8011',
]);
$http_response_body = (string) $response->getBody();
fwrite(STDOUT, $http_response_body);
```

#### Python

> ###### NOTE
>
> You need to install and configure our CA certificate for
> the requests library.

```python
import requests

response = requests.get(
    "https://toscrape.com",
    proxies={
        scheme: "http://YOUR_ZYTE_API_KEY:@proxy.zyte.com:8011"
        for scheme in ("http", "https")
    },
)
http_response_body: bytes = response.content
print(http_response_body.decode())
```

#### Scrapy

After you install and configure [scrapy-zyte-smartproxy](https://github.com/scrapy-plugins/scrapy-zyte-smartproxy), you can use
Scrapy as usual and all requests will be proxied through Smart Proxy
Manager automatically.

```python
from scrapy import Spider

class ToScrapeSpider(Spider):
    name = "toscrape_com"
    start_urls = ["https://toscrape.com"]

    def parse(self, response):
        print(response.text)
```

The following example shows the equivalent request using the proxy mode of Zyte API:

#### C#

```cs
using System;
using System.Net;
using System.Net.Http;

var proxy = new WebProxy("http://api.zyte.com:8011", true);
proxy.Credentials = new NetworkCredential("YOUR_ZYTE_API_KEY", "");

var httpClientHandler = new HttpClientHandler
{
    Proxy = proxy,
};

var client = new HttpClient(handler: httpClientHandler, disposeHandler: true);
var message = new HttpRequestMessage(HttpMethod.Get, "https://toscrape.com");
var response = client.Send(message);
var body = await response.Content.ReadAsStringAsync();

Console.WriteLine(body);
```

#### curl

```bash
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    https://toscrape.com
```

#### Java

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.hc.client5.http.auth.AuthCache;
import org.apache.hc.client5.http.auth.AuthScope;
import org.apache.hc.client5.http.auth.CredentialsProvider;
import org.apache.hc.client5.http.classic.methods.HttpGet;
import org.apache.hc.client5.http.impl.auth.BasicAuthCache;
import org.apache.hc.client5.http.impl.auth.BasicScheme;
import org.apache.hc.client5.http.impl.auth.CredentialsProviderBuilder;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.client5.http.impl.routing.DefaultProxyRoutePlanner;
import org.apache.hc.client5.http.protocol.HttpClientContext;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHost;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;

class Example {
  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {

    HttpHost proxy = new HttpHost("api.zyte.com", 8011);
    DefaultProxyRoutePlanner routePlanner = new DefaultProxyRoutePlanner(proxy);
    CredentialsProvider credentialsProvider =
        CredentialsProviderBuilder.create()
            .add(new AuthScope(proxy), "YOUR_ZYTE_API_KEY", "".toCharArray())
            .build();

    AuthCache authCache = new BasicAuthCache();
    BasicScheme basicAuth = new BasicScheme();
    authCache.put(proxy, basicAuth);
    HttpClientContext context = HttpClientContext.create();
    context.setCredentialsProvider(credentialsProvider);
    context.setAuthCache(authCache);

    CloseableHttpClient client =
        HttpClients.custom()
            .setRoutePlanner(routePlanner)
            .setDefaultCredentialsProvider(credentialsProvider)
            .build();

    HttpGet request = new HttpGet("https://toscrape.com");
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String httpResponseBody = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          System.out.println(httpResponseBody);
          return null;
        });
  }
}
```

#### JS

```js
const axios = require('axios')

axios
  .get(
    'https://toscrape.com',
    {
      proxy: {
        protocol: 'http',
        host: 'api.zyte.com',
        port: 8011,
        auth: {
          username: 'YOUR_ZYTE_API_KEY',
          password: ''
        }
      }
    }
  )
  .then((response) => {
    const httpResponseBody = response.data
    console.log(httpResponseBody)
  })
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('GET', 'https://toscrape.com', [
    'proxy' => 'http://YOUR_ZYTE_API_KEY:@api.zyte.com:8011',
]);
$http_response_body = (string) $response->getBody();
fwrite(STDOUT, $http_response_body);
```

#### Python

> ###### NOTE
>
> You need to install and configure our CA certificate for
> the requests library.

```python
import requests

response = requests.get(
    "https://toscrape.com",
    proxies={
        scheme: "http://YOUR_ZYTE_API_KEY:@api.zyte.com:8011" for scheme in ("http", "https")
    },
)
http_response_body: bytes = response.content
print(http_response_body.decode())
```

#### Ruby

```ruby
# frozen_string_literal: true

require 'net/http'

url = URI('https://toscrape.com/')
proxy_host = 'api.zyte.com'
proxy_port = '8011'

http = Net::HTTP.new(url.host, url.port, proxy_host, proxy_port, 'YOUR_ZYTE_API_KEY', '')
http.use_ssl = true

r = http.start do |h|
  h.request(Net::HTTP::Get.new(url))
end

puts r.body
```

#### Scrapy

When using [scrapy-zyte-smartproxy](https://github.com/scrapy-plugins/scrapy-zyte-smartproxy), set the `ZYTE_SMARTPROXY_URL`
setting to `"http://api.zyte.com:8011"` and the
`ZYTE_SMARTPROXY_APIKEY` setting to [your Zyte API key](https://app.zyte.com/o/zyte-api/api-access) for Zyte API.

> ###### NOTE
>
> **Important**: Use your **Zyte API key** here, not a Scrapy Cloud API key. Make sure you get this from the Zyte API access page.
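
For illustration, the two settings above could look like this in the `settings.py` file of your project. This is only a fragment: it assumes that the rest of your scrapy-zyte-smartproxy configuration is already in place, and the API key value is a placeholder.

```python
# settings.py (fragment)

# Point scrapy-zyte-smartproxy at the proxy mode endpoint of Zyte API.
ZYTE_SMARTPROXY_URL = "http://api.zyte.com:8011"

# Placeholder: use your Zyte API key, not a Scrapy Cloud API key.
ZYTE_SMARTPROXY_APIKEY = "YOUR_ZYTE_API_KEY"
```
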

Then you can continue using Scrapy as usual and all requests will be
proxied through Zyte API automatically.

```python
from scrapy import Spider

class ToScrapeSpider(Spider):
    name = "toscrape_com"
    start_urls = ["https://toscrape.com"]

    def parse(self, response):
        print(response.text)
```

### HTTP API

> ###### TIP
>
> If you are using [scrapy-zyte-smartproxy](https://github.com/scrapy-plugins/scrapy-zyte-smartproxy) (previously
> scrapy-crawlera), see scrapy-zyte-smartproxy-migrate for detailed
> migration steps.

When migrating from Smart Proxy Manager to the HTTP API
of Zyte API, the main challenge is switching from a proxy API to an HTTP API.
To read its rich output data, you need JSON parsing and sometimes
base64-decoding.

The following example shows a basic request using Smart Proxy Manager:

#### C#

```cs
using System;
using System.IO;
using System.Net;
using System.Text;

var proxy = new WebProxy("http://proxy.zyte.com:8011", true);
proxy.Credentials = new NetworkCredential("YOUR_ZYTE_API_KEY", "");

var request = (HttpWebRequest)WebRequest.Create("https://toscrape.com");
request.Proxy = proxy;
request.PreAuthenticate = true;
request.AllowAutoRedirect = false;

var response = (HttpWebResponse)request.GetResponse();
var stream = response.GetResponseStream();
var reader = new StreamReader(stream);
var httpResponseBody = reader.ReadToEnd();
reader.Close();
response.Close();

Console.WriteLine(httpResponseBody);
```

#### curl

```bash
curl \
    --proxy proxy.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    https://toscrape.com
```

#### JS

```js
const axios = require('axios')

axios
  .get(
    'https://toscrape.com',
    {
      proxy: {
        protocol: 'http',
        host: 'proxy.zyte.com',
        port: 8011,
        auth: {
          username: 'YOUR_ZYTE_API_KEY',
          password: ''
        }
      }
    }
  )
  .then((response) => {
    const httpResponseBody = response.data
    console.log(httpResponseBody)
  })
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('GET', 'https://toscrape.com', [
    'proxy' => 'http://YOUR_ZYTE_API_KEY:@proxy.zyte.com:8011',
]);
$http_response_body = (string) $response->getBody();
fwrite(STDOUT, $http_response_body);
```

#### Python

> ###### NOTE
>
> You need to install and configure our CA certificate for
> the requests library.

```python
import requests

response = requests.get(
    "https://toscrape.com",
    proxies={
        scheme: "http://YOUR_ZYTE_API_KEY:@proxy.zyte.com:8011"
        for scheme in ("http", "https")
    },
)
http_response_body: bytes = response.content
print(http_response_body.decode())
```

#### Scrapy

After you install and configure [scrapy-zyte-smartproxy](https://github.com/scrapy-plugins/scrapy-zyte-smartproxy), you can use
Scrapy as usual and all requests will be proxied through Smart Proxy
Manager automatically.

```python
from scrapy import Spider

class ToScrapeSpider(Spider):
    name = "toscrape_com"
    start_urls = ["https://toscrape.com"]

    def parse(self, response):
        print(response.text)
```

The following example shows the equivalent request using the HTTP API of Zyte API:

> ###### NOTE
>
> Install and configure code example requirements and
> the Zyte CA certificate to run the example below.

#### C#

```cs
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_ZYTE_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://toscrape.com"},
    {"httpResponseBody", true}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBody = System.Convert.FromBase64String(base64HttpResponseBody);
```

#### CLI client

input.jsonl
```json
{"url": "https://toscrape.com", "httpResponseBody": true}
```

```shell
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    > output.html
```

#### curl

input.json
```json
{
    "url": "https://toscrape.com",
    "httpResponseBody": true
}
```

```shell
curl \
    --user YOUR_ZYTE_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    > output.html
```

#### Java

```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_ZYTE_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of("url", "https://toscrape.com", "httpResponseBody", true);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    CloseableHttpClient client = HttpClients.createDefault();
    client.execute(
        request,
        response -> {
          HttpEntity entity = response.getEntity();
          String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
          JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
          String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
          byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
          String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
          System.out.println(httpResponseBody);
          return null;
        });
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}
```

#### JS

```js
const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com',
    httpResponseBody: true
  },
  {
    auth: { username: 'YOUR_ZYTE_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
})
```

#### PHP

```php
<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_ZYTE_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com',
        'httpResponseBody' => true,
    ],
]);
$data = json_decode($response->getBody());
$http_response_body = base64_decode($data->httpResponseBody);
```

#### Proxy mode

With the proxy mode, you always get a response
body.

```shell
curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_ZYTE_API_KEY: \
    --compressed \
    https://toscrape.com \
> output.html
```

#### Python

```python
from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_ZYTE_API_KEY", ""),
    json={
        "url": "https://toscrape.com",
        "httpResponseBody": True,
    },
)
http_response_body: bytes = b64decode(api_response.json()["httpResponseBody"])
```

#### Python client

```python
import asyncio
from base64 import b64decode

from zyte_api import AsyncZyteAPI

async def main():
    client = AsyncZyteAPI()
    api_response = await client.get(
        {
            "url": "https://toscrape.com",
            "httpResponseBody": True,
        }
    )
    http_response_body = b64decode(api_response["httpResponseBody"]).decode()
    print(http_response_body)

asyncio.run(main())
```

#### Scrapy

In transparent mode, when you target a text
resource (e.g. HTML, JSON), regular Scrapy requests work out of the
box:

```python
from scrapy import Spider

class ToScrapeSpider(Spider):
    name = "toscrape_com"
    start_urls = ["https://toscrape.com"]

    def parse(self, response):
        http_response_text: str = response.text
```

While regular Scrapy requests also work for binary responses at the
moment, they may stop working in future versions of
scrapy-zyte-api, so passing
[httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/httpResponseBody) is recommended when targeting binary
resources:

```python
from scrapy import Request, Spider

class ToScrapeSpider(Spider):
    name = "toscrape_com"

    async def start(self):
        yield Request(
            "https://toscrape.com",
            meta={
                "zyte_api_automap": {
                    "httpResponseBody": True,
                },
            },
        )

    def parse(self, response):
        http_response_body: bytes = response.body
```

Output (first 5 lines):

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
        <title>Scraping Sandbox</title>
```

See zapi-usage for richer Zyte API examples, covering more scenarios
and features. See also Parameter mapping below to migrate your request
parameters.

If your code seems to run slower with Zyte API, see zapi-optimize.

There is no easy way to use Zyte API to drive requests from browser automation
tools. If you are using Smart Proxy Manager as a proxy for a browser automation
tool, consider using Zyte API for your browser automation needs instead. See
zapi-browser-automation.

### Parameter mapping

The following table shows a mapping of Smart Proxy Manager request
headers and their corresponding proxy mode headers and Zyte API parameters:

| Smart Proxy Manager        | Zyte API (proxy mode)    | Zyte API (HTTP API)                                                                                      |
|----------------------------|--------------------------|----------------------------------------------------------------------------------------------------------|
| `X-Crawlera-Client`        | zyte-client              | `User-Agent` header                                                                                      |
| x-crawlera-cookies bc      | See below                | See below                                                                                                |
| x-crawlera-jobid bc        | zyte-jobid               | [jobId](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/jobId)             |
| x-crawlera-max-retries     | Not planned              | Not planned                                                                                              |
| x-crawlera-no-bancheck     | Planned                  | Not planned                                                                                              |
| x-crawlera-profile bc      | zyte-device \*           | See below                                                                                                |
| x-crawlera-profile-pass bc | zyte-override-headers \* | See below                                                                                                |
| X-Crawlera-Region bc       | zyte-geolocation         | [geolocation](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/geolocation) |
| x-crawlera-session bc      | zyte-session-id          | [session.id](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/session.id)   |
| x-crawlera-timeout         | Not planned              | Not planned                                                                                              |
| x-crawlera-use-https       | N/A                      | N/A                                                                                                      |

Headers tagged with bc can be used in Zyte API proxy mode. See
Header backward compatibility below.
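
When migrating to proxy mode, the one-to-one rows of the table above could be applied with a small renaming helper. The sketch below is not an official tool: the helper name `migrate_headers` is hypothetical, and rows that need manual handling (cookies, profile, profile-pass) or have no replacement are left out on purpose.

```python
# Smart Proxy Manager header -> Zyte API proxy mode header, per the table above.
# Only rows with a direct, one-to-one replacement are listed.
SPM_TO_PROXY_MODE = {
    "x-crawlera-client": "zyte-client",
    "x-crawlera-jobid": "zyte-jobid",
    "x-crawlera-region": "zyte-geolocation",
    "x-crawlera-session": "zyte-session-id",
}


def migrate_headers(headers: dict) -> dict:
    """Rename mappable Smart Proxy Manager headers; keep all other headers as they are."""
    migrated = {}
    for name, value in headers.items():
        # Header names are case-insensitive, so look them up in lowercase.
        migrated[SPM_TO_PROXY_MODE.get(name.lower(), name)] = value
    return migrated


new_headers = migrate_headers({"X-Crawlera-Session": "abc", "Accept": "text/html"})
```
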

#### Replacing X-Crawlera-Cookies

x-crawlera-cookies supports 3 values:

- `enable` causes automatic cookies to override request cookies.

  To achieve this in Zyte API, do not set [requestCookies](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/requestCookies).
- `disable` causes request cookies to override automatic cookies.

  This is the default behavior of Zyte API: setting
  [requestCookies](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/requestCookies) overrides automatic cookies.
- `discard` causes both request cookies and automatic cookies to be
  discarded.

  To achieve this in Zyte API, do not set [requestCookies](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/requestCookies), and
  set [cookieManagement](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/cookieManagement) (zyte-cookie-management in
  proxy mode) to `discard`.
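
The three cases above can be sketched as a helper that turns an `X-Crawlera-Cookies` value into HTTP API request parameters. The helper name `apply_cookie_mode` is hypothetical, and the cookie object in the usage example assumes a name/value/domain shape; see the [requestCookies](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/requestCookies) reference for the actual format.

```python
from typing import Optional


def apply_cookie_mode(payload: dict, mode: str, request_cookies: Optional[list] = None) -> dict:
    """Translate an X-Crawlera-Cookies value into Zyte API request parameters."""
    # Start from a copy without any cookie-related parameters.
    result = {
        key: value
        for key, value in payload.items()
        if key not in ("requestCookies", "cookieManagement")
    }
    if mode == "enable":
        # Automatic cookies win: do not set requestCookies.
        return result
    if mode == "disable":
        # Request cookies override automatic cookies (default behavior).
        if request_cookies:
            result["requestCookies"] = request_cookies
        return result
    if mode == "discard":
        # Drop both request cookies and automatic cookies.
        result["cookieManagement"] = "discard"
        return result
    raise ValueError(f"Unsupported X-Crawlera-Cookies value: {mode!r}")


# Assumed cookie shape, for illustration only.
cookies = [{"name": "foo", "value": "bar", "domain": "toscrape.com"}]
base = {"url": "https://toscrape.com", "httpResponseBody": True}
disabled = apply_cookie_mode(base, "disable", cookies)
discarded = apply_cookie_mode(base, "discard", cookies)
```
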

#### Replacing X-Crawlera-Profile and X-Crawlera-Profile-Pass

In general, you can replace `X-Crawlera-Profile` with the
[device](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/device) Zyte API request parameter.

Mind, however, that the behavior of Zyte API is actually a middle ground
between the `desktop` (or `mobile`) and `pass` values of
x-crawlera-profile: browser-specific headers are always sent (unlike
`pass`, which disables them altogether), but you can override them (unlike
`desktop` or `mobile`, which force them unless you use
`X-Crawlera-Profile-Pass`). See zapi-body-request-headers for more
information.

#### Header backward compatibility

When using Zyte API proxy mode, migrating to
Zyte API proxy mode headers is recommended.

However, the following Smart Proxy Manager request headers can be used in Zyte
API proxy mode: x-crawlera-cookies, x-crawlera-jobid,
x-crawlera-profile, x-crawlera-profile-pass,
X-Crawlera-Region, x-crawlera-session.

If any of these Smart Proxy Manager headers is used, the response will include
x-crawlera-error if needed for the following error codes: `banned`, `invalid_request`, `bad_auth`,
`bad_proxy_auth`, `max_header_size_exceeded`, `internal_server_error`,
`timeout`, `domain_forbidden`.

> ###### TIP
>
> To force getting x-crawlera-error on a request without Smart
> Proxy Manager request headers, add a no-op Smart Proxy Manager request
> header, e.g. `X-Crawlera-Profile-Pass: Foo`.
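
If your existing code branches on this header, a check along these lines may keep working after the migration. It is a sketch that only reads the header from a mapping of response headers; the helper name `crawlera_error` is hypothetical.

```python
from typing import Mapping, Optional

# Error codes listed above for Zyte API proxy mode.
DOCUMENTED_ERROR_CODES = frozenset(
    {
        "banned",
        "invalid_request",
        "bad_auth",
        "bad_proxy_auth",
        "max_header_size_exceeded",
        "internal_server_error",
        "timeout",
        "domain_forbidden",
    }
)


def crawlera_error(headers: Mapping[str, str]) -> Optional[str]:
    """Return the X-Crawlera-Error value from response headers, or None if absent."""
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "x-crawlera-error":
            return value.strip()
    return None


error = crawlera_error({"Content-Type": "text/html", "X-Crawlera-Error": "banned"})
if error in DOCUMENTED_ERROR_CODES:
    print(f"Request failed with a documented error code: {error}")
```
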

### Unsubscribe from Smart Proxy Manager

Once you have successfully migrated to Zyte API, remember to unsubscribe from
Smart Proxy Manager. If in doubt, [reach out to us](https://support.zyte.com/support/tickets/new).

## Zyte API pricing

[Sign up](https://app.zyte.com/account/signup/zyteapi) to get a standard plan with no commitment
and some free credit.

Request cost depends on the target website and selected features.
Use our [cost estimator](https://app.zyte.com/o/cost-estimator) to calculate costs.

We charge only for successful responses and
provide volume discounts.

### Plans

|                     | Standard (PAYG)                                                    | Standard (commitment)                                               | Enterprise                                                        |
|---------------------|--------------------------------------------------------------------|---------------------------------------------------------------------|-------------------------------------------------------------------|
| How to enroll?      | Automatic on [signup](https://app.zyte.com/account/signup/zyteapi) | [Subscriptions](https://app.zyte.com/o/subscriptions/overview) page | [Contact sales](https://www.zyte.com/zyte-web-scraping-api/#form) |
| Initial free credit | $5 [^1]                                                            | $5 [^1]                                                             | $200                                                              |
| Rate limit          | 3000 RPM                                                           | 3000 RPM                                                            | Custom                                                            |
| Spending limit      | $100 / month                                                       | $200-$1000 / month                                                  | Custom                                                            |
| Commitment          |                                                                    | $100-$500                                                           | Custom                                                            |
| Volume discount     |                                                                    | 25%-52%                                                             | Custom                                                            |

[^1]: Shared across standard plans. Switching to another standard plan does not provide additional free credit.

**Enterprise** plans may also include:

- Assistance from expert engineers for onboarding and troubleshooting.
- Access to compliance experts (see also zapi-permissions-control).
- Add-on consultancy to design and scale projects.
- Premium 24/7 support and service level agreements (SLAs): 1-hour response
  time on weekdays, and 8 hours on weekends.
- Early access to new features and priority in feature requests.

### Initial free credit

On [sign up](https://app.zyte.com/account/signup/zyteapi), you receive $5 free credit for your first billing month.

After your free credit expires, your account is suspended
until you set a spending limit. Set a spending limit
before your free credit expires to ensure uninterrupted service.

Standard plans include $5 free credit, while an Enterprise plan includes $200 free credit.

Switching between standard plans does not provide additional free credit. Your
initial $5 credit carries over when you change your commitment level.

### Request costs

The **target website** and **request type** (HTTP or
browser) determine the request tier and base cost.

Requests using extended geolocations or
device residential IPs have different base costs
that depend only on request type, plus additional costs based on network
consumption.

Additional costs apply for:

- Actions: Based on CPU and network consumption [^3]
- Network captures: Based on output size
- Screenshots: $0.002 [^2]
- Automatic extraction: $0.0004-$0.0016 per data type [^2]
  (except [serp](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/serp), which is free)
- Custom attributes: Cost depends on the extraction
  [method](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/customAttributesOptions.method):
  - [“generate”](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/customAttributesOptions.method): Based on
    [inputTokens](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/customAttributes.metadata.inputTokens) ($0.002/1k) and
    [outputTokens](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/customAttributes.metadata.outputTokens) ($0.01/1k) [^2]
  - [“extract”](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/customAttributesOptions.method): Fixed $0.001 [^2]

[^2]: Cost before volume discount.
[^3]: Network costs scale with device residential IPs usage.
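
As a rough illustration of the custom attributes pricing above, the sketch below estimates the additional cost before volume discount. It assumes that token costs are prorated linearly per token, which the rates above do not state explicitly; use the cost estimator for actual figures. The function name `custom_attributes_cost` is hypothetical.

```python
from decimal import Decimal

# Rates from the list above, before volume discount.
INPUT_PRICE_PER_1K = Decimal("0.002")
OUTPUT_PRICE_PER_1K = Decimal("0.01")
EXTRACT_PRICE = Decimal("0.001")


def custom_attributes_cost(method: str, input_tokens: int = 0, output_tokens: int = 0) -> Decimal:
    """Estimate the additional cost of custom attributes for one request."""
    if method == "extract":
        # Fixed cost, regardless of token counts.
        return EXTRACT_PRICE
    if method == "generate":
        # Assumption: cost scales linearly with the number of tokens.
        total = INPUT_PRICE_PER_1K * input_tokens + OUTPUT_PRICE_PER_1K * output_tokens
        return total / 1000
    raise ValueError(f"Unknown method: {method!r}")


# 1,500 input tokens and 200 output tokens with the "generate" method.
estimate = custom_attributes_cost("generate", input_tokens=1500, output_tokens=200)
```

Under these assumptions, the estimate is $0.003 for input tokens plus $0.002 for output tokens, that is, $0.005.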

### Request tiers

Zyte API automatically uses the most cost-efficient technology for each website
and assigns price tiers accordingly.

There are 5 tiers each for HTTP requests and browser requests.
Every combination of target website and request type belongs to a tier that determines
the base cost. See [Pricing @ zyte.com](https://www.zyte.com/pricing/#pricing)
for tier pricing and distribution data.

Tier assignment is automatic. New combinations start with a temporary tier until
enough data is gathered for permanent assignment.

We review tier assignments quarterly. Affected customers receive 2 weeks' notice
of any tier changes.

### Successful responses

You are only charged for successful responses.
Rate-limiting and unsuccessful
responses are free.

### Spending limit

Your spending limit is the maximum monthly charge for Zyte API usage. When
reached, your account is suspended until the
next billing month.

Enterprise plans have custom spending limits managed
through your account manager.

On standard plans, you can choose between:

- A $100 spending limit, which is pay-as-you-go.
- A higher spending limit ($200, $400, $700 or $1000), where you pay 50% of
  your spending limit as a monthly commitment, plus any additional spend based
  on actual usage, up to your spending limit.

**Increasing spending limits:**

- Takes effect immediately and lifts account suspension
- For $200+ limits, you pay the monthly commitment difference immediately
- Changes your monthly commitment for future billing cycles

**Decreasing spending limits:**

- Takes effect next billing month

For limits above $1000, [contact sales](https://www.zyte.com/zyte-web-scraping-api/#form) for an Enterprise plan.

### Account suspension

When you reach your spending limit, requests return
account suspension responses.

Enterprise plans: Contact your account manager.

Standard plans: Increase your spending limit to immediately resume service.

### Monthly commitment

Enterprise plans: Custom monthly commitment

Standard plans:

- $100 spending limit: No monthly commitment (PAYG)
- $200+ spending limit: Monthly commitment = 50% of spending limit

Your monthly commitment determines your volume discount.

Monthly commitment is paid at the start of each billing month. If your actual
usage exceeds the commitment, you pay the additional spend on your next bill.
If usage is below the commitment, no refund is provided.

**Example:** $200 spending limit = $100 monthly commitment

- Month 1: Pay $100 commitment
- Actual usage: $150 → Next bill includes $50 additional spend
- Actual usage: $80 → Next bill includes $0 additional spend
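
The arithmetic of this example can be sketched in Python as follows. This is a minimal sketch that assumes only the standard-plan rules described in this section; the function names are hypothetical and not part of any Zyte tool.

```python
def monthly_commitment(spending_limit: float) -> float:
    """Return the monthly commitment for a standard-plan spending limit."""
    # The $100 spending limit is pay-as-you-go: no monthly commitment.
    if spending_limit <= 100:
        return 0.0
    # Higher spending limits commit to 50% of the limit.
    return spending_limit * 0.5


def additional_spend(spending_limit: float, usage: float) -> float:
    """Return the amount added to the next bill on top of the commitment."""
    # Usage below the commitment is not refunded, so the result is never negative.
    return max(0.0, usage - monthly_commitment(spending_limit))


print(monthly_commitment(200))     # 100.0
print(additional_spend(200, 150))  # 50.0
print(additional_spend(200, 80))   # 0.0
```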

### Volume discount

Volume discounts are applied to each request.

Enterprise plans: Custom volume discount

Standard plans: Volume discount based on monthly
commitment:

| Monthly commitment   | Volume discount   |
|----------------------|-------------------|
| $100                 | 25%               |
| $200                 | 40%               |
| $350                 | 48%               |
| $500                 | 52%               |

## Zyte API frequently asked questions

### How many concurrent requests can I send?

See zyte-api-concurrency.

### Is there a response size limit?

The size limit of [browserHtml](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/browserHtml) and
[httpResponseBody](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/response/200/httpResponseBody) (before base64-encoding) is 10 MB. Larger
responses are truncated. [HTTP compression](https://en.wikipedia.org/wiki/HTTP_compression) does not affect this limit.

### Can I set a more granular geolocation?

[geolocation](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/geolocation) only supports country granularity.

However, websites seldom limit content by IP address at a lower granularity.
Provided you use the right country, you should be able to get content for any
specific country subdivision (ZIP code, state, etc.).

The way to get content targeted at a specific country subdivision is usually
through cookies. The way to get the right cookies depends on the target
website:

- On some websites you can use our setLocation action
  or some other actions to configure the target subdivision.
  > ###### TIP
  >
  > You can use sessions to minimize the amount
  > of browser requests you need.
- On some websites you can manually set a cookie that
  forces content for the target subdivision.
- On some websites you may need to start a session
  and configure that session for the target subdivision.

### Can I override the `User-Agent` header?

- You can on HTTP requests, but Zyte API may override your value for certain websites if
  needed for ban avoidance.
- You cannot on browser requests.

## Get started with Scrapy Cloud

Scrapy Cloud is a service that allows running web scraping code in the cloud.

Scrapy Cloud is designed for [Scrapy](https://scrapy.org) projects, but can support other
technologies.

### First steps

1. Sign up for Scrapy Cloud on the [Zyte dashboard](https://app.zyte.com/) for free.
2. Follow our web scraping tutorial, which covers running a
   job in Scrapy Cloud.

### Using Scrapy Cloud

See Scrapy Cloud usage for general usage help.

Scrapy Cloud reference provides detailed reference documentation.

## Scrapy Cloud usage

### Basic usage

> ##### Projects
>
> Manage your projects.

> ##### Deployment
>
> Deploy code to projects.

> ##### Spiders
>
> Write and configure web crawlers.

> ##### Jobs
>
> Run spiders and scripts.

> ##### Items
>
> Handle data extracted by jobs.

### Advanced topics

> ##### Scripts
>
> Write and run Python scripts.

> ##### Units
>
> Manage your resources for jobs.

> ##### Reference
>
> See the complete API reference documentation.

## Scrapy Cloud projects

A Scrapy Cloud project represents a code base for web scraping.

You can have any number of Scrapy Cloud projects, each with its own code
base, or with a specific version (e.g. commit) of some code base.

A common approach is to keep 2 projects:

- A project for production, with a stable code base.
- A project for development, to test changes before moving them to
  the production project.

For information about managing projects, see:

- [Organizations and Projects](https://support.zyte.com/support/solutions/articles/22000200432-organizations-and-projects)
- [Inviting Users to Projects](https://support.zyte.com/support/solutions/articles/22000200430-inviting-users-to-projects)
- [Managing Organization and Project members](https://support.zyte.com/support/solutions/articles/22000271734-managing-organization-and-project-members)
- [Customizing Scrapy settings in Scrapy Cloud](https://support.zyte.com/support/solutions/articles/22000200670-customizing-scrapy-settings-in-scrapy-cloud)
- [Deleting projects](https://support.zyte.com/support/solutions/articles/22000200397-deleting-projects)

## Deploying code to Scrapy Cloud projects

For information about deploying your code to a project, see:

- [Deploying your spiders to Scrapy Cloud](https://support.zyte.com/support/solutions/articles/22000204081-deploying-your-spiders-to-scrapy-cloud)
- [Deploying a Project from a Github Repository](https://support.zyte.com/support/solutions/articles/22000201935-deploying-a-project-from-a-github-repository)
- [Versioning your deploys to Zyte Developer Tool Scrapy Cloud](https://support.zyte.com/support/solutions/articles/22000204254-versioning-your-deploys-to-zyte-developer-tool-scrapy-cloud)
- [Deploying non-code files](https://support.zyte.com/support/solutions/articles/22000200416-deploying-non-code-files)

> ###### TIP
>
> Coding Agent Add-Ons support Scrapy Cloud deployment.

### Stacks

For information about Scrapy Cloud stacks, see:

- [Changing the Deploy Environment With Scrapy Cloud Stacks](https://support.zyte.com/support/solutions/articles/22000200402-changing-the-deploy-environment-with-scrapy-cloud-stacks)
- [Deploying Python 3 spiders to Scrapy Cloud](https://support.zyte.com/support/solutions/articles/22000200387-deploying-python-3-spiders-to-scrapy-cloud)

#### Requirements

For information about installing additional Python packages into stacks, see:

- [Deploying Python Dependencies for Your Projects in Scrapy Cloud](https://support.zyte.com/support/solutions/articles/22000200400-deploying-python-dependencies-for-your-projects-in-scrapy-cloud)

#### Addons

When using a stack, you can use [Scrapy Cloud add-ons](https://support.zyte.com/support/solutions/articles/22000200395-scrapy-cloud-addons) to extend your code,
including:

- [Autothrottle](https://support.zyte.com/support/solutions/articles/22000200424-auto-throttle-addon),
  to crawl gently.
- [DeltaFetch](https://support.zyte.com/support/solutions/articles/22000200411-delta-fetch-addon),
  to crawl only new pages.
- [DotScrapy Persistence](https://support.zyte.com/support/solutions/articles/22000200401-dotscrapy-persistence-addon),
  to persist data between jobs.
- [Images](https://support.zyte.com/support/solutions/articles/22000200389-images-storage-addon),
  to download images into S3 storage.
- [Magic Fields](https://support.zyte.com/support/solutions/articles/22000200418-magic-fields-addon),
  to add item fields.
- [Page Storage](https://support.zyte.com/support/solutions/articles/22000200403-page-storage-addon),
  to store visited pages.
- [Query Cleaner](https://support.zyte.com/support/solutions/articles/22000200412-query-cleaner-addon),
  to clean request URL query parameters.

### Using Docker images

For information about using Docker images to deploy your code, see:

- [Deploying Custom Docker images on Scrapy Cloud](https://support.zyte.com/support/solutions/articles/22000200425-deploying-custom-docker-images-on-scrapy-cloud)
- [Errors while deploying Custom Image to Scrapy Cloud](https://support.zyte.com/support/solutions/articles/22000232799-errors-while-deploying-custom-image-to-scrapy-cloud)

## Scrapy Cloud spiders

A Scrapy Cloud spider is a Scrapy spider that is part
of a Scrapy project that has been deployed into a Scrapy Cloud project. You can start jobs to execute the code
of a spider.

Our web scraping tutorial covers creating, deploying, and
running spiders. For more information, see the Scrapy documentation.

### Spider templates and virtual spiders

Scrapy Cloud supports defining spider templates, which you can use from the
Scrapy Cloud UI to create virtual spiders. A virtual spider runs the code of
its spider template with predefined parameters.

#### Spider templates

To create a spider template:

1. Add [scrapy-spider-metadata](https://scrapy-spider-metadata.readthedocs.io/en/latest/) as a dependency to your Scrapy Cloud
   project.
2. On the spiders that you wish to use as templates, define [metadata](https://scrapy-spider-metadata.readthedocs.io/en/latest/metadata.html#defining-spider-metadata)
   including a `title` and `description` of your choice, and setting
   `template` to `True`:
   ```python
   from scrapy import Spider

   class MySpider(Spider):
       ...
       metadata = {
           "title": "My Template",
           "description": "Description of my template.",
           "template": True,
       }
   ```

When you redeploy your code, you can start
creating virtual spiders from your spider templates.

> ###### NOTE
>
> Spider templates are also regular spiders, and can be executed directly as well.

#### Virtual spiders

To create a virtual spider from a spider template, go to your Scrapy Cloud
project page and, on the left-hand sidebar, under **Spiders**, select **Create
spider**.

On the **Create Spider** page, you can select a template, define the parameters
of your new virtual spider, and save your spider.

You can then use your virtual spider from Scrapy Cloud as if it were a regular
spider.

Virtual spiders exist only in Scrapy Cloud, not in your code. However, changes
to the code of their spider template will affect them.

#### Spider parameters

Spider templates exist so that you can create virtual spiders from them, each
behaving differently based on its predefined parameters.

To expose parameters to the Scrapy Cloud UI so that they can be defined when
creating a virtual spider, add a [parameter specification](https://scrapy-spider-metadata.readthedocs.io/en/latest/params.html) to your template
spiders using [scrapy-spider-metadata](https://scrapy-spider-metadata.readthedocs.io/en/latest/):

```python
from pydantic import BaseModel
from scrapy import Spider
from scrapy_spider_metadata import Args

class MyParams(BaseModel):
    foo: str

class MySpider(Args[MyParams], Spider):
    ...
```

##### Parameter types

Scrapy Cloud supports the following parameter types:

- `bool`
- `int`, `float` (with `gt`, `lt`, `ge`, and `le` [numeric
  constraint](https://docs.pydantic.dev/latest/concepts/fields/#numeric-constraints) support)
- `str` (with [string constraint](https://docs.pydantic.dev/latest/concepts/fields/#string-constraints) support)

  Scrapy Cloud also supports defining a placeholder through
  [json_schema_extra](https://docs.pydantic.dev/latest/api/fields/#pydantic.fields.Field):
  ```python
  from pydantic import BaseModel, Field

  class MyParams(BaseModel):
      url: str = Field(
          json_schema_extra={
              "placeholder": "https://books.toscrape.com",
          },
      )
  ```
- `str` + `Enum`

  Define `enumMeta` in [json_schema_extra](https://docs.pydantic.dev/latest/api/fields/#pydantic.fields.Field) to give your enumeration
  choices an optional title and description:
  ```python
  from enum import Enum

  from pydantic import BaseModel, Field

  class Foo(str, Enum):
      bar: str = "bar"
      baz: str = "baz"

  class MyParams(BaseModel):
      foo: Foo = Field(
          json_schema_extra={
              "enumMeta": {
                  Foo.bar: {
                      "title": "Bar",
                      "description": "Bar description.",
                  },
                  Foo.baz: {
                      "title": "Baz",
                      "description": "Baz description.",
                  },
              },
          },
      )
  ```

##### Widgets

Scrapy Cloud also supports a few special UI widgets that you can enable through
the `widget` key of [json_schema_extra](https://docs.pydantic.dev/latest/api/fields/#pydantic.fields.Field), e.g.

```python
from pydantic import BaseModel, Field

class MyParams(BaseModel):
    foo: int = Field(
        json_schema_extra={
            "widget": "widget-id",
        },
    )
```

The following widgets are supported:

- `custom-attrs`, to specify a [custom attributes schema](https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/customAttributes).
- `request-limit`, to specify a maximum number of requests.
- `textarea`, for multi-line text input.

##### Parameter groups

Scrapy Cloud also supports defining 2 or more optional parameters so that
filling 1 of them (and only 1) is required:

```python
from pydantic import BaseModel, ConfigDict, Field

class MyParams(BaseModel):
    model_config = ConfigDict(
        json_schema_extra={
            "groups": [
                {
                    "id": "a-or-b",
                    "title": "A or B",
                    "description": "Fill A or B.",
                    "widget": "exclusive",
                },
            ],
        },
    )
    a: str = Field(
        json_schema_extra={
            "group": "a-or-b",
            "exclusiveRequired": True,
        },
    )
    b: str = Field(
        json_schema_extra={
            "group": "a-or-b",
            "exclusiveRequired": True,
        },
    )
```

## Scrapy Cloud jobs

A job is the execution of a Scrapy spider or a Python script in Scrapy Cloud.

### Running a job

For information about running a job, see:

- [Running a Scrapy spider](https://support.zyte.com/support/solutions/articles/22000200667-running-a-scrapy-spider)
  > ###### TIP
  >
  > Starting a job through the API allows
  > defining [Scrapy setting](https://docs.scrapy.org/en/latest/topics/settings.html) overrides for the job.
- [Managing your jobs in the Jobs Dashboard](https://support.zyte.com/support/solutions/articles/22000200452-managing-your-jobs-in-the-jobs-dashboard)
- [Understanding Job Outcomes](https://support.zyte.com/support/solutions/articles/22000200413-understanding-job-outcomes)
- [Deploy Project and run Spiders with settings of different environments](https://support.zyte.com/support/solutions/articles/22000236412-deploy-project-and-run-spiders-with-settings-of-different-environments)
- [Sharing data between spiders](https://support.zyte.com/support/solutions/articles/22000200420-sharing-data-between-spiders)
- [What are the differences between running a spider locally and on Scrapy Cloud?](https://support.zyte.com/support/solutions/articles/22000200426-what-are-the-differences-between-running-a-spider-locally-and-on-scrapy-cloud-)
- [Can I run the same spider in parallel?](https://support.zyte.com/support/solutions/articles/22000232777-can-i-run-the-same-spider-in-parallel-)
- [Starting jobs programmatically (HTTP API or Python)](https://support.zyte.com/support/solutions/articles/22000200394-running-custom-python-scripts)
- [Can I use an HTTP cache on Scrapy Cloud?](https://support.zyte.com/support/solutions/articles/22000201056-can-i-use-an-http-cache-on-scrapy-cloud-)
- [Why do I get “Rejected message because it was too big” error?](https://support.zyte.com/support/solutions/articles/22000218173-why-do-i-get-rejected-message-because-it-was-too-big-error-)
- [I clicked on STOP button but spider not stopped.](https://support.zyte.com/support/solutions/articles/22000222482-i-clicked-on-stop-button-but-spider-not-stopped-)

### Monitoring jobs

For information about monitoring jobs, see:

- [Getting Notifications on Certain Events](https://support.zyte.com/support/solutions/articles/22000200451-getting-notifications-on-certain-events)
- [Inspecting your spider’s runtime environment with the Job Console](https://support.zyte.com/support/solutions/articles/22000200427-inspecting-your-spider-s-runtime-environment-with-the-job-console)

### Scheduling periodic jobs

For information about scheduling periodic jobs, see:

- [Scheduling Periodic Jobs](https://support.zyte.com/support/solutions/articles/22000200419-scheduling-periodic-jobs)
- [Is it possible to schedule jobs to run sequentially?](https://support.zyte.com/support/solutions/articles/22000244891-is-it-possible-to-schedule-jobs-to-run-sequentially-)
- [How Can I Set a Number of Scrapy Cloud Units to Use for a Periodic Job?](https://support.zyte.com/support/solutions/articles/22000256286-how-can-i-set-a-number-of-scrapy-cloud-units-to-use-for-a-periodic-job-)
- [Can I Configure My Periodic Job To Run a Spider Every Minute?](https://support.zyte.com/support/solutions/articles/22000261680-can-i-configure-my-periodic-job-to-run-a-spider-every-minute-)

### See also

- Scrapy Cloud job items

## Scrapy Cloud job logs

The **Log** tab of a job contains all the messages logged by the
job.

It includes all messages logged with Python’s `logging`, both those from
Scrapy built-in components and your own code.

### Troubleshooting

Here you can find some help to figure out the meaning of common log messages.

#### Ignoring response

> [scrapy.spidermiddlewares.httperror] Ignoring response <403 https://example.com>: HTTP status code is not handled or not allowed

By default, after redirects have been followed and retries exceeded, Scrapy
ignores responses with an HTTP status code outside the 200-299 range.

Some HTTP status codes, such as 401, 403 or 429, may be the result of a
ban. Consider using Zyte API to avoid
bans.

If you want to handle those responses in your request callback, instead of
ignoring them:

- Set `handle_httpstatus_all` or `handle_httpstatus_list`
  in your request metadata to handle such responses for a specific request:
  ```python
  Request("https://example.com", meta={"handle_httpstatus_list": {403}})
  ```
- Use the `HTTPERROR_ALLOW_ALL` or
  `HTTPERROR_ALLOWED_CODES` settings to handle such responses for
  all requests.
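
For the project-wide option, a settings fragment could look like the sketch below. The chosen status codes are only an example taken from the ban-related codes mentioned above; adjust them to the responses that your callbacks actually handle.

```python
# settings.py of your Scrapy project

# Pass responses with these status codes to request callbacks for all
# requests, instead of letting Scrapy ignore them.
HTTPERROR_ALLOWED_CODES = [403, 429]

# Alternatively, pass every response to callbacks regardless of status code.
# HTTPERROR_ALLOW_ALL = True
```

With either setting enabled, your callbacks should check `response.status` before parsing, because they now receive error responses as well.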

## Scrapy Cloud job items

For information about job items, see:

- Downloading items
- [Configuring scraped fields](https://support.zyte.com/support/solutions/articles/22000200410-configuring-scraped-fields)
- [Providing feedback on scraped data](https://support.zyte.com/support/solutions/articles/22000200398-providing-feedback-on-scraped-data)
- [Publishing and sharing datasets](https://support.zyte.com/support/solutions/articles/22000200453-publishing-and-sharing-datasets)
- [Why do I get “Rejected message because it was too big” error?](https://support.zyte.com/support/solutions/articles/22000218173-why-do-i-get-rejected-message-because-it-was-too-big-error-)

## Downloading from Scrapy Cloud

After you run a job on Scrapy Cloud, you
can download your scraped data from the Zyte dashboard, from a URL, or through
the API.

### Downloading from the Zyte dashboard

To download your job data from the [Zyte dashboard](https://app.zyte.com):

1. Open the details page of your job (`https://app.zyte.com/p/<job ID>`).
2. Open the **Items** tab (`https://app.zyte.com/p/<job ID>/items`).
3. On the right-hand side, select **Download › <format>**.

   **<format>** can be one of: [CSV](https://en.wikipedia.org/wiki/Comma-separated_values), [JSON](https://www.json.org/json-en.html), [JSON Lines](https://jsonlines.org/), [XML](https://en.wikipedia.org/wiki/XML).

![](scrapy-cloud/usage/items/download-dash.png)

### Downloading from a URL

Download links from the Zyte dashboard are
transparent. Given a job ID and your Scrapy Cloud API key, you can build one
manually, for example to automate downloads.

For JSON, JSON Lines and XML, download URLs follow this pattern:

```none
https://storage.zyte.com/items/<job ID>?apikey=<Scrapy Cloud API key>&format=<format>
```

Where:

- **<job ID>** is the job ID, e.g. `00000/0/0`.
- **<Scrapy Cloud API key>** is your [Scrapy Cloud API key](https://app.zyte.com/o/settings/apikey).
- **<format>** is the output file format, one of: `json`, `jl`
  (JSON Lines), `xml`.

For CSV, the download URL is similar, but you:

- Must specify a comma-separated list of fields to export as well, in the
  `fields` query string parameter.
- Can use the `include_headers` query string parameter to indicate
  whether you want the field names in the first row (`1`) or not (`0`,
  default).

For example:

```none
https://storage.zyte.com/items/<job ID>?apikey=<Scrapy Cloud API key>&format=csv&fields=key,name,price,url&include_headers=1
```
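
To automate downloads, the URL patterns above can be assembled with the Python standard library. The sketch below only builds the URL and does not send any request; `build_items_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

STORAGE_ITEMS_URL = "https://storage.zyte.com/items/"


def build_items_url(job_id, api_key, fmt, fields=None, include_headers=False):
    """Build a download URL for the items of a job."""
    params = {"apikey": api_key, "format": fmt}
    if fmt == "csv":
        # CSV downloads require the list of fields to export.
        if not fields:
            raise ValueError("CSV downloads require a list of fields")
        params["fields"] = ",".join(fields)
        params["include_headers"] = "1" if include_headers else "0"
    # safe="," keeps the comma-separated field list unescaped in the URL.
    return f"{STORAGE_ITEMS_URL}{job_id}?{urlencode(params, safe=',')}"


print(build_items_url("00000/0/0", "APIKEY", "jl"))
print(build_items_url("00000/0/0", "APIKEY", "csv", ["key", "name"], True))
```

Because the resulting URL contains your API key, treat it as a secret and avoid writing it to logs.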

### See also

- export

## Scrapy Cloud scripts

In addition to Scrapy spiders, you can include
standalone Python scripts in your Scrapy project and run them on Scrapy Cloud.

Scrapy Cloud scripts need to be declared under `scripts` in your `setup.py`
file:

```python
from setuptools import setup, find_packages

setup(
    name="myproject",
    version="1.0",
    packages=find_packages(),
    scripts=["bin/hello.py"],
    entry_points={"scrapy": ["settings = myproject.settings"]},
)
```

When starting a job, you can select a script instead of a spider.
Scripts are listed with their file name, prefixed with `py:`; for example,
`py:hello.py` for the script in the example above.

To access your Scrapy project settings from a script, including those defined
in Scrapy Cloud, use the `sh_scrapy.utils.get_project_settings` function:

```python
from sh_scrapy.utils import get_project_settings

settings = get_project_settings()
```

> ###### NOTE
>
> This function was introduced in [scrapinghub-entrypoint-scrapy](https://github.com/scrapinghub/scrapinghub-entrypoint-scrapy) 0.12.
> If you cannot import it, make sure you are using a modern Scrapy
> stack, or add `scrapinghub-entrypoint-scrapy>=0.12` to your
> requirements.

## Scrapy Cloud units

Jobs run on Scrapy Cloud units.

To run a job:

1. You assign between 1 and 6 units to your job.
2. Your job remains in the queue of pending jobs until the selected number of
   units are available.
3. When your job starts, the selected number of units are allocated to your
   job for the duration of the job.
4. When the job finishes, the job units are released, ready to be used by
   another job.

You can only run as many parallel jobs as units you have. If you have 1 unit,
you can only run 1 job at a time. If you have 2 units, you can run 2 jobs in
parallel, each job using 1 unit.

Every unit assigned to a job gives that job 1 computing unit, 1 GB of memory,
and 2.5 GB of disk. A job running with 2 units has twice the compute power,
memory, and disk space of a job running with 1 unit.
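
The resource figures above scale linearly with the number of units, which the following sketch illustrates. The helper names are hypothetical, and `max_parallel_jobs` assumes that every job uses the same number of units.

```python
MEMORY_GB_PER_UNIT = 1
DISK_GB_PER_UNIT = 2.5


def job_resources(units: int) -> dict:
    """Return the memory and disk available to a job with the given units."""
    if not 1 <= units <= 6:
        raise ValueError("a job takes between 1 and 6 units")
    return {
        "memory_gb": units * MEMORY_GB_PER_UNIT,
        "disk_gb": units * DISK_GB_PER_UNIT,
    }


def max_parallel_jobs(total_units: int, units_per_job: int) -> int:
    """Return how many jobs of the same size can run at the same time."""
    return total_units // units_per_job


print(job_resources(2))         # {'memory_gb': 2, 'disk_gb': 5.0}
print(max_parallel_jobs(2, 1))  # 2
```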

## Scrapy Cloud reference

HTTP API
: HTTP API to interact with spiders, jobs, and other Scrapy Cloud resources.

Entry Point API
: Write custom Docker images that are compatible with Scrapy Cloud.

## Scrapy Cloud API

Scrapy Cloud provides an HTTP API for interacting with your spiders, jobs and scraped data.

### Getting started

#### Authentication

You’ll need to authenticate using your [Scrapy Cloud API key](https://app.zyte.com/o/settings/apikey).

> ###### IMPORTANT
>
> Scrapy Cloud uses a different API key than Zyte API.

There are two ways to authenticate:

HTTP Basic:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/foo
```

URL Parameter:

```
$ curl "https://storage.zyte.com/foo?apikey=YOUR_SCRAPY_CLOUD_API_KEY"
```
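
In Python, the HTTP Basic variant amounts to sending the API key as the user name with an empty password. Here is a minimal sketch using only the standard library; `authenticated_request` is a hypothetical helper, and the request is built but not sent.

```python
import base64
from urllib.request import Request


def authenticated_request(url: str, api_key: str) -> Request:
    """Build a request that authenticates with HTTP Basic."""
    # The API key is the user name; the password after the colon is empty.
    token = base64.b64encode(f"{api_key}:".encode()).decode()
    return Request(url, headers={"Authorization": f"Basic {token}"})


request = authenticated_request(
    "https://storage.zyte.com/foo", "YOUR_SCRAPY_CLOUD_API_KEY"
)
# Pass the request to urllib.request.urlopen() to actually send it.
print(request.get_header("Authorization"))
```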

#### Example

Running a spider is simple:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://app.zyte.com/api/run.json -d project=PROJECT -d spider=SPIDER
```

Where `YOUR_SCRAPY_CLOUD_API_KEY` is your Scrapy Cloud API key, `PROJECT`
is the spider’s project ID, and `SPIDER` is the name of the spider you want
to run.

It’s possible to override Scrapy settings for a job:

```
$ curl \
    -u YOUR_SCRAPY_CLOUD_API_KEY: \
    https://app.zyte.com/api/run.json \
    -d project=PROJECT \
    -d spider=SPIDER \
    -d job_settings='{"LOG_LEVEL": "DEBUG"}'
```

`job_settings` must be a valid JSON object. It is merged with the project and spider settings of the given spider.

### API endpoints

#### app.zyte.com

#### storage.zyte.com

#### Python client

You can use the [python-scrapinghub](https://github.com/scrapinghub/python-scrapinghub) library to interact with the Scrapy Cloud API.
Check the [documentation](https://python-scrapinghub.readthedocs.io/) for installation instructions and usage examples.

### Pagination

You can paginate the results for the majority of the APIs using a number of parameters.
The pagination parameters differ depending on the target host for a given endpoint.

#### app.zyte.com

| Parameter   | Description                          |
|-------------|--------------------------------------|
| count       | Number of results per page.          |
| offset      | Offset to retrieve specific records. |

#### storage.zyte.com

| Parameter   | Description                                                        |
|-------------|--------------------------------------------------------------------|
| count       | Number of results per page.                                        |
| index       | Offset to retrieve specific records. Multiple values supported.    |
| start       | Skip results before the given one. See a note about format below.  |
| startafter  | Return results after the given one. See a note about format below. |

> ###### NOTE
>
> The inconsistency in parameter names is due to historical reasons and will be fixed in upcoming platform updates.

> ###### NOTE
>
> While the `index` parameter takes just a short `<entity_id>` (e.g. `index=4`), the `start` and `startafter` parameters require the full form `<project_id>/<spider_id>/<job_id>/<entity_id>` (e.g. `start=1/2/3/4`, `startafter=1/2/3/3`).
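
The difference between the two hosts can be sketched as query-string builders. The function names are hypothetical, and the sketch only covers `count` with `offset` (app.zyte.com) and `count` with `start` (storage.zyte.com).

```python
from urllib.parse import urlencode


def app_page_params(page: int, page_size: int) -> str:
    """Query string for page number `page` (0-based) on app.zyte.com."""
    return urlencode({"count": page_size, "offset": page * page_size})


def storage_page_params(job_key: str, first_entity: int, page_size: int) -> str:
    """Query string for a page on storage.zyte.com.

    job_key is <project_id>/<spider_id>/<job_id>; start needs the full
    <project_id>/<spider_id>/<job_id>/<entity_id> form.
    """
    params = {"count": page_size, "start": f"{job_key}/{first_entity}"}
    # safe="/" keeps the slashes of the key unescaped.
    return urlencode(params, safe="/")


print(app_page_params(2, 100))              # count=100&offset=200
print(storage_page_params("1/2/3", 4, 100))  # count=100&start=1/2/3/4
```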

### Result formats

There are two ways to specify the format of results: using the `Accept` header, or using the `format` parameter.

The `Accept` header supports the following values:

* application/x-jsonlines
* application/json
* application/xml
* text/plain
* text/csv

The `format` parameter supports the following values:

* json
* jl
* xml
* csv
* text

[XML-RPC data types](http://en.wikipedia.org/wiki/XML-RPC#Data_types) are used for XML output.

#### CSV parameters

| Parameter       | Description                                                             | Required   |
|-----------------|-------------------------------------------------------------------------|------------|
| fields          | Comma delimited list of fields to include, in order from left to right. | Yes        |
| include_headers | When set to ‘1’ or ‘Y’, show header names in first row.                 | No         |
| sep             | Separator character.                                                    | No         |
| quote           | Quote character.                                                        | No         |
| escape          | Escape character.                                                       | No         |
| lineend         | Line end string.                                                        | No         |

When using CSV, you will need to specify the `fields` parameter to indicate required fields and their order. Example:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://storage.zyte.com/items/53/34/7?format=csv&fields=id,name&include_headers=1"
```

### Headers

*gzip* compression is supported. A client can signal that it can handle *gzip* responses with the `accept-encoding: gzip` request header. A compressed response includes the `content-encoding: gzip` header to signal the *gzip* content encoding.
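
On the client side, handling *gzip* comes down to sending the request header and decompressing only when the response header says so. The following is a minimal sketch with the standard library; `decode_body` is a hypothetical helper, and many HTTP clients already do this for you.

```python
import gzip

# Request header that tells the server gzip responses are acceptable.
REQUEST_HEADERS = {"Accept-Encoding": "gzip"}


def decode_body(body: bytes, response_headers: dict) -> bytes:
    """Decompress a response body if the response declares gzip encoding."""
    # Header names are case-insensitive, so normalize them before the lookup.
    normalized = {name.lower(): value for name, value in response_headers.items()}
    if normalized.get("content-encoding", "").lower() == "gzip":
        return gzip.decompress(body)
    return body


compressed = gzip.compress(b'{"name": "example"}')
print(decode_body(compressed, {"Content-Encoding": "gzip"}))
```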

You can use the `saveas` request parameter to specify a filename for browser downloads. For example, specifying `?saveas=foo.json` will cause a header of `Content-Disposition: Attachment; filename=foo.json` to be returned.

### Meta parameters

You can use the `meta` parameter to return metadata for the record in addition to its core data.

The following values are available:

| Parameter   | Description                                                           |
|-------------|-----------------------------------------------------------------------|
| \_key       | The item key in the format `:project_id/:spider_id/:job_id/:item_no`. |
| \_ts        | Timestamp in milliseconds for when the item was added.                |

Example:

```
$ curl "https://storage.zyte.com/items/53/34/7?meta=_key&meta=_ts"
{"_key":"1111111/1/1/0","_ts":1342078473363, ... }
```

> ###### NOTE
>
> If the data contains fields with the same name as the requested fields, they will both appear in the result.
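
Once requested, the metadata fields can be unpacked on the client. The sketch below parses one JSON Lines record shaped like the example output above, splitting `_key` into its parts and converting the millisecond `_ts` into a UTC datetime; `parse_item_meta` is a hypothetical helper.

```python
import json
from datetime import datetime, timezone


def parse_item_meta(line: str) -> dict:
    """Extract job key, item number and timestamp from an item record."""
    item = json.loads(line)
    project_id, spider_id, job_id, item_no = item["_key"].split("/")
    return {
        "job": f"{project_id}/{spider_id}/{job_id}",
        "item_no": int(item_no),
        # _ts is in milliseconds; datetime expects seconds.
        "added": datetime.fromtimestamp(item["_ts"] / 1000, tz=timezone.utc),
    }


meta = parse_item_meta('{"_key": "1111111/1/1/0", "_ts": 1342078473363}')
print(meta["job"], meta["item_no"], meta["added"].date())
```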

## Jobs API

The jobs API makes it easy to work with your spider’s jobs and lets you schedule, stop, update and delete them.

> ###### NOTE
>
> Most of the features provided by the API are also available through the
> python-scrapinghub client library.

### run.json

Schedules a job for a given spider.

| Parameter    | Description                                                                                                          | Required   |
|--------------|----------------------------------------------------------------------------------------------------------------------|------------|
| project      | Project ID.                                                                                                          | Yes        |
| spider       | Spider name.                                                                                                         | Yes        |
| jobq_id      | Spider ID as `spider` in `project`/`spider`/`job` identifier.                                                        | No         |
| add_tag      | Add specified tag to job.                                                                                            | No         |
| priority     | Job priority. Supported values: 0 (lowest) to 4 (highest). Default: 2.                                               | No         |
| job_settings | [Scrapy settings](https://docs.scrapy.org/en/latest/topics/settings.html) to override for the job, as a JSON object. | No         |
| units        | Amount of units to run job. Supported values: 1 to 6.                                                                | No         |

> ###### NOTE
>
> Any other parameter will be treated as a spider argument.

> ###### NOTE
>
> If you use the `jobq_id` parameter, the `spider` parameter is not required.

| Method   | Description                    | Supported parameters                                      |
|----------|--------------------------------|-----------------------------------------------------------|
| POST     | Schedule the specified spider. | project, spider, jobq_id, add_tag, priority, job_settings |

Example that specifies a spider name:

```
$ curl \
    -u YOUR_SCRAPY_CLOUD_API_KEY: \
    https://app.zyte.com/api/run.json \
    -d project=123 \
    -d spider=somespider \
    -d units=2 \
    -d add_tag=sometag \
    -d spiderarg1=example \
    -d job_settings='{"CLOSESPIDER_PAGECOUNT": "10"}'
{"status": "ok", "jobid": "123/1/1"}
```

Example that specifies a spider ID:

```
$ curl \
    -u YOUR_SCRAPY_CLOUD_API_KEY: \
    https://app.zyte.com/api/run.json \
    -d project=123  \
    -d jobq_id=1  \
    -d units=2  \
    -d add_tag=sometag  \
    -d spiderarg1=example  \
    -d job_settings='{"CLOSESPIDER_PAGECOUNT": "10"}'
{"status": "ok", "jobid": "123/1/1"}
```
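
The same request body can be assembled in Python. This sketch only builds the form-encoded body from the parameters in the table above and does not send it; `run_job_body` is a hypothetical helper, and any extra keyword argument is passed through as a spider argument.

```python
import json
from urllib.parse import urlencode


def run_job_body(project, spider=None, jobq_id=None, units=None,
                 add_tag=None, priority=None, job_settings=None,
                 **spider_args):
    """Build the form-encoded POST body for run.json."""
    if spider is None and jobq_id is None:
        raise ValueError("pass spider (name) or jobq_id (spider ID)")
    if units is not None and not 1 <= units <= 6:
        raise ValueError("units must be between 1 and 6")
    if priority is not None and not 0 <= priority <= 4:
        raise ValueError("priority must be between 0 and 4")
    params = {"project": project}
    optional = {"spider": spider, "jobq_id": jobq_id, "units": units,
                "add_tag": add_tag, "priority": priority}
    params.update({k: v for k, v in optional.items() if v is not None})
    if job_settings is not None:
        # job_settings travels as a JSON object in a single form field.
        params["job_settings"] = json.dumps(job_settings)
    # Any other parameter is treated as a spider argument.
    params.update(spider_args)
    return urlencode(params)


body = run_job_body(123, spider="somespider", units=2, add_tag="sometag",
                    job_settings={"CLOSESPIDER_PAGECOUNT": "10"},
                    spiderarg1="example")
print(body)
```

Send the body with a `POST` request to `https://app.zyte.com/api/run.json`, authenticating with your Scrapy Cloud API key.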

### jobs/list.{json,jl}

Retrieve job information for a given project, spider, or specific job.

| Parameter   | Description                          | Required   |
|-------------|--------------------------------------|------------|
| project     | Project ID.                          | Yes        |
| job         | Job ID.                              | No         |
| spider      | Spider name.                         | No         |
| state       | Return jobs with specified state.    | No         |
| has_tag     | Return jobs with specified tag.      | No         |
| lacks_tag   | Return jobs that lack specified tag. | No         |

Supported `state` values: `pending`, `running`, `finished`, `deleted`.

| Method   | Description               | Supported parameters                            |
|----------|---------------------------|-------------------------------------------------|
| GET      | Retrieve job information. | project, job, spider, state, has_tag, lacks_tag |

Examples:

```
# Retrieve the latest 3 finished jobs
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://app.zyte.com/api/jobs/list.json?project=123&spider=somespider&state=finished&count=3"
{
  "status": "ok",
  "count": 3,
  "total": 3,
  "jobs": [
    {
      "responses_received": 1,
      "items_scraped": 2,
      "close_reason": "finished",
      "logs": 29,
      "tags": [],
      "spider": "somespider",
      "updated_time": "2015-11-09T15:21:06",
      "priority": 2,
      "state": "finished",
      "version": "1447064100",
      "spider_type": "manual",
      "started_time": "2015-11-09T15:20:25",
      "id": "123/45/14544",
      "errors_count": 0,
      "elapsed": 138399
    },
    {
      "responses_received": 1,
      "items_scraped": 2,
      "close_reason": "finished",
      "logs": 29,
      "tags": [
        "consumed"
      ],
      "spider": "somespider",
      "updated_time": "2015-11-09T14:21:02",
      "priority": 2,
      "state": "finished",
      "version": "1447064100",
      "spider_type": "manual",
      "started_time": "2015-11-09T14:20:25",
      "id": "123/45/14543",
      "errors_count": 0,
      "elapsed": 3433762
    },
    {
      "responses_received": 1,
      "items_scraped": 2,
      "close_reason": "finished",
      "logs": 29,
      "tags": [
        "consumed"
      ],
      "spider": "somespider",
      "updated_time": "2015-11-09T13:21:08",
      "priority": 2,
      "state": "finished",
      "version": "1447064100",
      "spider_type": "manual",
      "started_time": "2015-11-09T13:20:31",
      "id": "123/45/14542",
      "errors_count": 0,
      "elapsed": 7034158
    }
  ]
}

# Retrieve all running jobs
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://app.zyte.com/api/jobs/list.json?project=123&state=running"
{
  "status": "ok",
  "count": 2,
  "total": 2,
  "jobs": [
    {
      "responses_received": 483,
      "items_scraped": 22,
      "logs": 20,
      "tags": [],
      "spider": "somespider",
      "elapsed": 17442,
      "priority": 2,
      "state": "running",
      "version": "1447064100",
      "spider_type": "manual",
      "started_time": "2015-11-09T15:25:07",
      "id": "123/45/13140",
      "errors_count": 0,
      "updated_time": "2015-11-09T15:26:43"
    },
    {
      "responses_received": 207,
      "items_scraped": 207,
      "logs": 468,
      "tags": [],
      "spider": "someotherspider",
      "elapsed": 4085,
      "priority": 3,
      "state": "running",
      "version": "1447064100",
      "spider_type": "manual",
      "started_time": "2015-11-09T13:00:46",
      "id": "123/67/11952",
      "errors_count": 0,
      "updated_time": "2015-11-09T15:26:57"
    }
  ]
}

# Retrieve all jobs that lack the tag `consumed`
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://app.zyte.com/api/jobs/list.json?project=123&lacks_tag=consumed"
{
  "status": "ok",
  "count": 3,
  "total": 3,
  "jobs": [
    {
      "responses_received": 208,
      "items_scraped": 208,
      "logs": 471,
      "tags": ["sometag"],
      "spider": "somespider",
      "elapsed": 1010,
      "priority": 3,
      "state": "running",
      "version": "1447064100",
      "spider_type": "manual",
      "started_time": "2015-11-09T13:00:46",
      "id": "123/45/11952",
      "errors_count": 0,
      "updated_time": "2015-11-09T15:28:27"
    },
    {
      "responses_received": 619,
      "items_scraped": 22,
      "close_reason": "finished",
      "logs": 29,
      "tags": ["sometag"],
      "spider": "someotherspider",
      "updated_time": "2015-11-09T15:27:20",
      "priority": 2,
      "state": "finished",
      "version": "1447064100",
      "spider_type": "manual",
      "started_time": "2015-11-09T15:25:07",
      "id": "123/67/13140",
      "errors_count": 0,
      "elapsed": 67409
    },
    {
      "responses_received": 3,
      "items_scraped": 20,
      "close_reason": "finished",
      "logs": 58,
      "tags": ["sometag", "someothertag"],
      "spider": "yetanotherspider",
      "updated_time": "2015-11-09T15:25:28",
      "priority": 2,
      "state": "finished",
      "version": "1447064100",
      "spider_type": "manual",
      "started_time": "2015-11-09T15:25:07",
      "id": "123/89/1627",
      "errors_count": 0,
      "elapsed": 179211
    }
  ]
}
```
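
Responses like the ones above are plain JSON, so they are easy to post-process on the client side. The sketch below is illustrative only: `job_ids` is a hypothetical helper, and `sample` is a trimmed stand-in for a real `jobs/list.json` response.

```python
import json


def job_ids(response_text, state=None, lacks_tag=None):
    """Return the IDs of the jobs in a jobs/list.json response.

    Optionally keep only jobs in a given state, or jobs without a tag.
    """
    data = json.loads(response_text)
    if data.get("status") != "ok":
        raise RuntimeError(f"unexpected status: {data.get('status')!r}")
    selected = []
    for job in data.get("jobs", []):
        if state is not None and job.get("state") != state:
            continue
        if lacks_tag is not None and lacks_tag in job.get("tags", []):
            continue
        selected.append(job["id"])
    return selected


# Trimmed sample response (most fields omitted for brevity).
sample = json.dumps({
    "status": "ok",
    "count": 2,
    "total": 2,
    "jobs": [
        {"id": "123/45/14544", "state": "finished", "tags": []},
        {"id": "123/45/14543", "state": "finished", "tags": ["consumed"]},
    ],
})
print(job_ids(sample, state="finished", lacks_tag="consumed"))
```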

### jobs/update.json

Updates information about jobs.

| Parameter   | Description                    | Required   |
|-------------|--------------------------------|------------|
| project     | Project ID.                    | Yes        |
| job         | Job ID.                        | Yes        |
| add_tag     | Add specified tag to job.      | No         |
| remove_tag  | Remove specified tag from job. | No         |

| Method   | Description             | Supported parameters              |
|----------|-------------------------|-----------------------------------|
| POST     | Update job information. | project, job, add_tag, remove_tag |

Example:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://app.zyte.com/api/jobs/update.json -d project=123 -d job=123/1/2 -d add_tag=consumed
```

### jobs/delete.json

Deletes one or more jobs.

| Parameter   | Description   | Required   |
|-------------|---------------|------------|
| project     | Project ID.   | Yes        |
| job         | Job ID.       | Yes        |

| Method   | Description    | Supported parameters   |
|----------|----------------|------------------------|
| POST     | Delete job(s). | project, job           |

Example:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://app.zyte.com/api/jobs/delete.json -d project=123 -d job=123/1/2 -d job=123/1/3
```
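
The example above repeats the `job` field once per job. A minimal sketch of building the same body in Python follows; `build_delete_payload` is a hypothetical name, and the code only prepares the body without sending it.

```python
from urllib.parse import urlencode


def build_delete_payload(project, jobs):
    """Build a form body that repeats the job field once per job ID."""
    fields = [("project", project)]
    fields.extend(("job", job) for job in jobs)
    # safe="/" keeps job IDs readable, matching the curl example.
    return urlencode(fields, safe="/")


payload = build_delete_payload(123, ["123/1/2", "123/1/3"])
print(payload)
```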

### jobs/stop.json

Stops one running job.

| Parameter   | Description   | Required   |
|-------------|---------------|------------|
| project     | Project ID.   | Yes        |
| job         | Job ID.       | Yes        |

| Method   | Description   | Supported parameters   |
|----------|---------------|------------------------|
| POST     | Stop job.     | project, job           |

Example:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://app.zyte.com/api/jobs/stop.json -d project=123 -d job=123/1/1
```

## Comments API

The comments API lets you add comments directly to scraped data, which can later be viewed on the items page.

### Comment object

| Field    | Description                            |
|----------|----------------------------------------|
| id       | Comment ID.                            |
| created  | Created date.                          |
| archived | Archived date.                         |
| author   | Comment author.                        |
| avatar   | User gravatar URL.                     |
| text     | Comment text.                          |
| editable | If set to true, comment can be edited. |

### comments/:comment_id

Edits or archives a comment.

| Parameter   | Description   | Required   |
|-------------|---------------|------------|
| comment_id  | Comment ID.   | Yes        |
| text        | Comment text. | PUT        |

| Method   | Description          | Supported parameters   |
|----------|----------------------|------------------------|
| PUT      | Update comment text. | comment_id, text       |
| DELETE   | Delete comment.      | comment_id             |

PUT example:

```
$ curl -X PUT -u YOUR_SCRAPY_CLOUD_API_KEY: --data 'text=my+new+text' "https://app.zyte.com/api/comments/12"
```

DELETE example:

```
$ curl -X DELETE -u YOUR_SCRAPY_CLOUD_API_KEY: "https://app.zyte.com/api/comments/12"
```

### comments/:project_id/:spider_id/:job_id

Retrieves all comments for a job, indexed by item or by item/field.

Example:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://app.zyte.com/api/comments/14/13/12"
{
    "0": [comment, comment, ...],
    "0/title": [comment, comment, ...],
    "12/url": [comment, comment, ...],
}
```

Where `comment` is a comment object as defined above.
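
The keys of that response are either an item index or an item index followed by a field name. As a rough sketch of how a client might regroup them, assuming the key shapes shown above (`split_comment_key` and `comments_by_item` are hypothetical helpers):

```python
def split_comment_key(key):
    """Split a key such as "12/url" into (item index, field name or None)."""
    item_no, _, field = key.partition("/")
    return int(item_no), (field or None)


def comments_by_item(index):
    """Regroup a comments response as {item index: {field or None: comments}}."""
    grouped = {}
    for key, comments in index.items():
        item_no, field = split_comment_key(key)
        grouped.setdefault(item_no, {})[field] = comments
    return grouped


# Placeholder strings stand in for real comment objects.
sample = {"0": ["c1"], "0/title": ["c2", "c3"], "12/url": ["c4"]}
print(comments_by_item(sample))
```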

### comments/:project_id/stats

Retrieves the number of items with unarchived comments for each job of the project.

Example:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://app.zyte.com/api/comments/51/stats"
{
    "51/422/2": 1,
    "51/414/2": 1,
    "51/421/2": 1,
    "51/423/2": 4,
    "51/413/3": 3,
    "51/418/2": 1
}
```

### comments/:project_id/:spider_id/:job_id/:item_no[/:field]

Retrieves, updates or archives comments.

| Parameter   | Description   | Required   |
|-------------|---------------|------------|
| text        | Comment text. | POST       |

| Method   | Description                                        | Supported parameters   |
|----------|----------------------------------------------------|------------------------|
| GET      | Retrieve comments for an item or field.            |                        |
| POST     | Update the specified comments with the given text. | text                   |
| DELETE   | Archive the specified comment.                     |                        |

GET examples:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://app.zyte.com/api/comments/14/13/12/11"
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://app.zyte.com/api/comments/14/13/12/11/logo"
```

POST examples:

```
$ curl -X POST --data 'text=some+text' -u YOUR_SCRAPY_CLOUD_API_KEY: "https://app.zyte.com/api/comments/14/13/12/11"
$ curl -X POST --data 'text=some+text' -u YOUR_SCRAPY_CLOUD_API_KEY: "https://app.zyte.com/api/comments/14/13/12/11/logo"
```

DELETE examples:

```
$ curl -X DELETE -u YOUR_SCRAPY_CLOUD_API_KEY: "https://app.zyte.com/api/comments/14/13/12/11"
$ curl -X DELETE -u YOUR_SCRAPY_CLOUD_API_KEY: "https://app.zyte.com/api/comments/14/13/12/11/logo"
```

## JobQ API

The JobQ API allows you to retrieve finished jobs from the queue.

> ###### NOTE
>
> Most of the features provided by the API are also available through the
> python-scrapinghub client library.

### jobq/:project_id/count

Count the jobs for the specified project.

| Parameter   | Description                                                | Required   |
|-------------|------------------------------------------------------------|------------|
| spider      | Filter results by spider name.                             | No         |
| state       | Filter results by state (pending/running/finished/deleted) | No         |
| startts     | UNIX timestamp at which to begin results, in milliseconds. | No         |
| endts       | UNIX timestamp at which to end results, in milliseconds.   | No         |
| has_tag     | Filter results by existing tags                            | No         |
| lacks_tag   | Filter results by missing tags                             | No         |

> ###### HINT
>
> You can repeat `has_tag` and `lacks_tag` multiple times. In that case, `has_tag` works as an `OR` operation, while `lacks_tag` works as an `AND` operation.

HTTP (assuming the project has only 2 jobs, the first tagged `tagA` and the second tagged `tagB`):

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://jobq.zyte.com/jobq/53/count"
2
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://jobq.zyte.com/jobq/53/count?has_tag=tagA&has_tag=tagB"
2
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://jobq.zyte.com/jobq/53/count?lacks_tag=tagA&lacks_tag=tagB"
0
```
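
The tag semantics can be modeled locally to check your expectations before querying the API. This is only an illustrative model of the `OR`/`AND` behavior described in the hint, not the server implementation; `matches` and `count_jobs` are hypothetical names.

```python
def matches(job_tags, has_tag=(), lacks_tag=()):
    """Return True if a job with job_tags passes the tag filters.

    has_tag: the job must carry at least one of these tags (OR).
    lacks_tag: the job must carry none of these tags (AND).
    """
    tags = set(job_tags)
    has_ok = not has_tag or any(tag in tags for tag in has_tag)
    lacks_ok = all(tag not in tags for tag in lacks_tag)
    return has_ok and lacks_ok


def count_jobs(jobs, **filters):
    """Count the jobs (given as lists of tags) that pass the filters."""
    return sum(1 for job_tags in jobs if matches(job_tags, **filters))


# Two jobs: the first tagged tagA, the second tagged tagB.
jobs = [["tagA"], ["tagB"]]
print(count_jobs(jobs))
print(count_jobs(jobs, has_tag=["tagA", "tagB"]))
print(count_jobs(jobs, lacks_tag=["tagA", "tagB"]))
```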

| Method   | Description                           | Supported parameters                              |
|----------|---------------------------------------|---------------------------------------------------|
| GET      | Count jobs for the specified project. | spider, state, startts, endts, has_tag, lacks_tag |

#### Examples

**Count jobs for a given project**

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://jobq.zyte.com/jobq/53/count
32110
```

### jobq/:project_id/list

Lists the jobs for the specified project, ordered from most recent to oldest.

| Field   | Description                                       |
|---------|---------------------------------------------------|
| ts      | The time at which the job was added to the queue. |

| Parameter   | Description                                                | Required   |
|-------------|------------------------------------------------------------|------------|
| spider      | Filter results by spider name.                             | No         |
| state       | Filter results by state (pending/running/finished/deleted) | No         |
| startts     | UNIX timestamp at which to begin results, in milliseconds. | No         |
| endts       | UNIX timestamp at which to end results, in milliseconds.   | No         |
| count       | Limit results by a given number of jobs                    | No         |
| start       | Skip N first jobs from results                             | No         |
| stop        | The job key at which to stop showing results.              | No         |
| key         | Get job data for a given set of job keys                   | No         |
| has_tag     | Filter results by existing tags                            | No         |
| lacks_tag   | Filter results by missing tags                             | No         |

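| Method   | Description                          | Supported parameters                                                       |
|----------|--------------------------------------|----------------------------------------------------------------------------|
| GET      | List jobs for the specified project. | spider, state, startts, endts, count, start, stop, key, has_tag, lacks_tag |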

#### Examples

**List jobs for a given project**

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://jobq.zyte.com/jobq/53/list
{"key":"53/7/81","ts":1397762393489}
{"key":"53/7/80","ts":1395111612849}
{"key":"53/7/78","ts":1393972804722}
{"key":"53/7/77","ts":1393972734215}
```

**List jobs finished between two timestamps**

If you pass the `startts` and `endts` parameters, the API will return only the jobs finished between them.

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://jobq.zyte.com/jobq/53/list?startts=1359774955431&endts=1359774955440"
{"key":"53/6/7","ts":1359774955439}
{"key":"53/3/3","ts":1359774955437}
{"key":"53/9/1","ts":1359774955431}
```
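
Because `startts` and `endts` are expressed in milliseconds, it is easy to pass seconds by mistake. A minimal sketch of the conversion follows; `to_millis` and `jobs_between_url` are hypothetical helpers, and the code only builds the URL.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_millis(moment):
    """Convert a timezone-aware datetime to a UNIX timestamp in milliseconds."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def jobs_between_url(project_id, start, end):
    """Build the JobQ list URL for jobs between two datetimes."""
    query = urlencode({"startts": to_millis(start), "endts": to_millis(end)})
    return f"https://jobq.zyte.com/jobq/{project_id}/list?{query}"


start = datetime(2013, 2, 2, 3, 15, 55, tzinfo=timezone.utc)
url = jobs_between_url(53, start, start + timedelta(seconds=1))
print(url)
```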

**Retrieve jobs finished after some job**

JobQ returns the list of jobs, with the most recently finished first. We recommend associating the key of the most recently finished job with the downloaded data. When you want to update your data later on, you can list the jobs and stop at the previously downloaded job, through the `stop` parameter.

Using HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://jobq.zyte.com/jobq/53/list?stop=53/7/81"
{"key":"53/7/83","ts":1403610146780}
{"key":"53/7/82","ts":1397827910849}
```
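
The incremental workflow described above could be sketched as follows. The helper names are hypothetical, and the HTTP call itself is left out: the sketch only builds the URL and parses a JSON Lines body such as the ones shown in the examples.

```python
import json
from urllib.parse import urlencode


def list_url(project_id, stop=None):
    """Build the JobQ list URL, optionally stopping at a known job key."""
    url = f"https://jobq.zyte.com/jobq/{project_id}/list"
    if stop:
        url += "?" + urlencode({"stop": stop}, safe="/")
    return url


def parse_job_lines(body):
    """Parse a JSON Lines response body into a list of dictionaries."""
    return [json.loads(line) for line in body.splitlines() if line.strip()]


def newest_key(jobs, previous=None):
    """Return the key to remember for the next run (most recent job first)."""
    return jobs[0]["key"] if jobs else previous


body = '{"key":"53/7/83","ts":1403610146780}\n{"key":"53/7/82","ts":1397827910849}\n'
jobs = parse_job_lines(body)
print(list_url(53, stop="53/7/81"))
print(newest_key(jobs, previous="53/7/81"))
```

Storing the value returned by `newest_key` next to the downloaded data gives you the `stop` value for the following run.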

## Job metadata API

The Job metadata API allows you to get metadata for the given jobs.

> ###### NOTE
>
> Most of the features provided by the API are also available through the
> python-scrapinghub client library.

### jobs/:project_id/:spider_id/:job_id[/:field_name]

Retrieve the metadata of a job, or a single metadata field.

#### Examples

**Get metadata for the job**

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/jobs/1/2/3

{
    "close_reason": "finished",
    "completed_by": "jobrunner",
    "deploy_id": 1,
    "finished_time": 1566311833872,
    "pending_time": 1566311800654,
    "priority": 2,
    "project": 1,
    "running_time": 1566311801163,
    "scheduled_by": "testuser",
    "scrapystats": {
        "downloader/request_bytes": 594,
        "downloader/request_count": 2,
        "downloader/request_method_count/GET": 2,
        "downloader/response_bytes": 1866,
        "downloader/response_count": 2,
        "downloader/response_status_count/200": 1,
        "downloader/response_status_count/404": 1,
        "elapsed_time_seconds": 3.211014,
        "finish_reason": "finished",
        "finish_time": 1566311822568.0,
        "item_scraped_count": 1,
        "log_count/DEBUG": 3,
        "log_count/INFO": 11,
        "log_count/WARNING": 1,
        "memusage/max": 72433664,
        "memusage/startup": 72433664,
        "response_received_count": 2,
        "robotstxt/request_count": 1,
        "robotstxt/response_count": 1,
        "robotstxt/response_status_count/404": 1,
        "scheduler/dequeued": 1,
        "scheduler/dequeued/disk": 1,
        "scheduler/enqueued": 1,
        "scheduler/enqueued/disk": 1,
        "start_time": 1566311819357.0
    },
    "spider": "testspider",
    "spider_args": {"arg1": "val1", "arg2": "val2"},
    "spider_type": "manual",
    "started_by": "jobrunner",
    "state": "finished",
    "tags": [
        "tag1",
        "tag2"
    ],
    "units": 2,
    "version": "6d32f52-master"
}
```

> ###### WARNING
>
> Treat the example response with caution. Some fields appear only under
> specific conditions, for example after a job is finished, deleted, or
> restored. Other fields depend heavily on the spider or job configuration.
> There may also be additional fields for internal use only, which can change
> at any time without prior notice.

**Get specific metadata field for the job**

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/jobs/1/2/3/tags

[
    "tag1",
    "tag2"
]
```
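
Since not every metadata field is always present, client code should read them defensively. The sketch below shows one possible approach; `summarize_job` is a hypothetical helper, and treating `running_time` and `finished_time` as millisecond UNIX timestamps is an assumption based on the example values above.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def summarize_job(meta):
    """Pick a few fields from job metadata, tolerating missing keys."""
    started = meta.get("running_time")
    finished = meta.get("finished_time")
    duration_ms = None
    finished_at = None
    if finished is not None:
        # Assumption: timestamps are UNIX times in milliseconds.
        finished_at = EPOCH + timedelta(milliseconds=finished)
        if started is not None:
            duration_ms = finished - started
    return {
        "state": meta.get("state"),
        "tags": meta.get("tags", []),
        "items": meta.get("scrapystats", {}).get("item_scraped_count", 0),
        "finished_at": finished_at,
        "duration_ms": duration_ms,
    }


meta = {
    "finished_time": 1566311833872,
    "running_time": 1566311801163,
    "scrapystats": {"item_scraped_count": 1},
    "state": "finished",
    "tags": ["tag1", "tag2"],
}
summary = summarize_job(meta)
print(summary["state"], summary["items"], summary["duration_ms"])
```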

## Items API

> ###### NOTE
>
> Even though these APIs support writing, they are most often used for reading. The crawlers running on Scrapy Cloud are the ones that write to these endpoints. However, both operations are documented here for completeness.

The Items API lets you interact with the items stored in the hubstorage backend for your projects. For example, you can download all the items for the job `'53/34/7'` through:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/items/53/34/7
```

> ###### NOTE
>
> Most of the features provided by the API are also available through the
> python-scrapinghub client library.

### Item object

| Field            | Description                                                   |
|------------------|---------------------------------------------------------------|
| \_type           | The item definition.                                          |
| \_template       | The template matched against. Portia only.                    |
| \_cached_page_id | Cached page ID. Used to identify the scraped page in storage. |

Scraped fields will be top level alongside the internal fields listed above.

### items/:project_id[/:spider_id][/:job_id][/:item_no][/:field_name]

Retrieve or insert items for a project, spider, or job, where `item_no` is the index of the item.

| Parameter   | Description                                                        | Required   |
|-------------|--------------------------------------------------------------------|------------|
| format      | Results format. See api-overview-resultformats.                    | No         |
| meta        | Meta keys to show.                                                 | No         |
| nodata      | If set, no data will be returned other than specified `meta` keys. | No         |

> ###### NOTE
>
> Pagination and meta parameters are supported, see api-overview-pagination and api-overview-metapar.

| Header        | Description                                                |
|---------------|------------------------------------------------------------|
| Content-Range | Can be used to specify a start index when inserting items. |

| Method   | Description                                         | Supported parameters   |
|----------|-----------------------------------------------------|------------------------|
| GET      | Retrieve items for a given project, spider, or job. | format, meta, nodata   |
| POST     | Insert items for a given job.                       | N/A                    |

> ###### NOTE
>
> Always use pagination parameters (`start`, `startafter`, and `count`) to limit the number of items in a response and avoid timeouts and other performance issues. See the pagination examples below for details.

#### Examples

**Retrieve all items from a given job**

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/items/53/34/7
```

**Retrieve the first item from a given job**

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/items/53/34/7/0
```

**Retrieve values from a single field**

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/items/53/34/7/1/fieldname
```

Here, `1` is the index (`item_no`) of the item whose field value is retrieved.

**Retrieve all items from a given spider**

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/items/53/34
```

**Retrieve all items from a given project**

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/items/53/
```

**[Pagination] Retrieve first N items from a given job**

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://storage.zyte.com/items/53/34/7?count=10"
```

**[Pagination] Retrieve N items from a given job starting from the given item**

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://storage.zyte.com/items/53/34/7?count=10&start=53/34/7/20"
```

**[Pagination] Retrieve N items from a given job starting from the item that follows the given one**

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://storage.zyte.com/items/53/34/7?count=10&startafter=53/34/7/19"
```

**[Pagination] Retrieve a few items from a given job by their IDs**

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://storage.zyte.com/items/53/34/7?index=5&index=6"
```
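
The pagination parameters above can be combined into a simple loop. The following is a minimal sketch under stated assumptions: `page_url` and `iter_items` are hypothetical helpers, `fetch_page` is a caller-supplied function that downloads and decodes one page, and `fake_fetch` is an offline stand-in used so that the example runs without network access.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def page_url(job_key, start_index, count):
    """Return the items URL for one page of a job."""
    params = {"count": count, "start": f"{job_key}/{start_index}"}
    return f"https://storage.zyte.com/items/{job_key}?" + urlencode(params, safe="/")


def iter_items(job_key, fetch_page, page_size=10):
    """Yield every item of a job, requesting one page at a time.

    fetch_page takes a URL and returns the decoded list of items.
    """
    start_index = 0
    while True:
        items = fetch_page(page_url(job_key, start_index, page_size))
        yield from items
        if len(items) < page_size:
            break  # a short or empty page means nothing is left
        start_index += len(items)


def fake_fetch(url):
    """Offline stand-in for an HTTP call: pretends the job has 25 items."""
    query = parse_qs(urlsplit(url).query)
    start = int(query["start"][0].rsplit("/", 1)[1])
    count = int(query["count"][0])
    return [{"index": i} for i in range(start, min(start + count, 25))]


items = list(iter_items("53/34/7", fake_fetch))
print(len(items))
```

In real code, `fetch_page` would issue an authenticated GET request and parse the response body.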

**Get meta field from items**

To get only metadata from items, pass the `nodata=1` parameter along with the meta field that you want to get.

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://storage.zyte.com/items/53/1/7?meta=_key&nodata=1"
{"_key":"53/1/7/0"}
{"_key":"53/1/7/1"}
{"_key":"53/1/7/2"}
```

**Get items in a specific format**

Check the available formats in the api-overview-resultformats section at the API Overview.

JSON:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://storage.zyte.com/items/53/34/7?meta=_key&nodata=1" -H "Accept: application/json"
[{"_key":"28144/1/1/0"},{"_key":"28144/1/1/1"},{"_key":"28144/1/1/2"}, ...]
```

JSON Lines:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://storage.zyte.com/items/53/34/7?meta=_key&nodata=1" -H "Accept: application/x-jsonlines"
{"_key":"28144/1/1/0"}
{"_key":"28144/1/1/1"}
{"_key":"28144/1/1/2"}
...
```
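
The two formats above carry the same data in different framing, so a client needs to parse them differently. A minimal sketch, where `parse_items` is a hypothetical helper keyed on the `Accept` value that was sent:

```python
import json


def parse_items(body, accept):
    """Decode a response body according to the format that was requested."""
    if accept == "application/json":
        # A single JSON array containing all items.
        return json.loads(body)
    if accept == "application/x-jsonlines":
        # One JSON object per line.
        return [json.loads(line) for line in body.splitlines() if line.strip()]
    raise ValueError(f"unsupported format: {accept}")


as_json = '[{"_key":"53/34/7/0"},{"_key":"53/34/7/1"}]'
as_lines = '{"_key":"53/34/7/0"}\n{"_key":"53/34/7/1"}\n'
print(parse_items(as_json, "application/json") == parse_items(as_lines, "application/x-jsonlines"))
```

JSON Lines is usually the more convenient choice for large jobs, because each line can be processed as soon as it arrives.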

**Add items to a job via POST**

Add the items stored in the file `items.jl` (JSON lines format) to the job `53/34/7`:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/items/53/34/7 -X POST -T items.jl
```

Use the `Content-Range` header to specify a start index:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/items/53/34/7 -X POST -T items.jl -H "content-range: items 500-/*"
```

The API returns `200` only if the data was stored successfully. There is no limit on the total amount of data you can send, but an `HTTP 413` response is returned if any single item is larger than 1 MB.
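
Before uploading, a client can serialize its items as JSON Lines and check each one against the per-item size limit. This is a sketch under assumptions: `to_jsonlines` and `content_range` are hypothetical helpers, and the limit is taken to be 1 MiB (1,048,576 bytes), which is one reading of the limit stated above.

```python
import json

MAX_ITEM_BYTES = 1024 * 1024  # assumed meaning of the per-item limit


def to_jsonlines(items, max_item_bytes=MAX_ITEM_BYTES):
    """Serialize items as JSON Lines, rejecting any oversized item."""
    lines = []
    for position, item in enumerate(items):
        line = json.dumps(item)
        if len(line.encode("utf-8")) > max_item_bytes:
            raise ValueError(f"item {position} exceeds {max_item_bytes} bytes")
        lines.append(line)
    return "".join(line + "\n" for line in lines)


def content_range(start_index):
    """Build the Content-Range header value for a given start index."""
    return f"items {start_index}-/*"


payload = to_jsonlines([{"a": 1}, {"b": 2}])
print(payload, end="")
print(content_range(500))
```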

### items/:project_id/:spider_id/:job_id/stats

Retrieve the item stats for a given job.

| Field               | Description                                |
|---------------------|--------------------------------------------|
| counts[field]       | The number of times the field was scraped. |
| totals.input_bytes  | The total size of all items in bytes.      |
| totals.input_values | The total number of items.                 |

| Parameter   | Description                       | Required   |
|-------------|-----------------------------------|------------|
| all         | Include hidden fields in results. | No         |

| Method   | Description                                | Supported parameters   |
|----------|--------------------------------------------|------------------------|
| GET      | Retrieve item stats for the specified job. | all                    |

#### Example

**Get the stats from a given job**

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/items/53/34/7/stats
{"counts":{"field1":9350,"field2":514},"totals":{"input_bytes":14390294,"input_values":10000}}
```
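
Dividing each field count by the total number of items gives the share of items in which a field was scraped, which is a quick way to spot sparsely populated fields. A minimal sketch, with `field_coverage` as a hypothetical helper applied to the example response above:

```python
import json


def field_coverage(stats):
    """Return {field: fraction of items containing that field}."""
    total = stats["totals"]["input_values"]
    if not total:
        return {}
    return {field: count / total for field, count in stats["counts"].items()}


stats = json.loads(
    '{"counts":{"field1":9350,"field2":514},'
    '"totals":{"input_bytes":14390294,"input_values":10000}}'
)
coverage = field_coverage(stats)
for field, share in coverage.items():
    print(f"{field}: {share:.2%}")
```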

## Logs API

The logs API lets you work with logs from your crawls.

### Log object

| Field   | Description                                      | Required   |
|---------|--------------------------------------------------|------------|
| message | Log message.                                     | Yes        |
| level   | Integer log level as defined in the table below. | Yes        |
| time    | UNIX timestamp of the message, in milliseconds.  | No         |

#### Log levels

|   Value | Log level   |
|---------|-------------|
|      10 | DEBUG       |
|      20 | INFO        |
|      30 | WARNING     |
|      40 | ERROR       |
|      50 | CRITICAL    |

### logs/:project_id/:spider_id/:job_id

Retrieve or upload logs for a given job.

| Parameter   | Description                                     | Required   |
|-------------|-------------------------------------------------|------------|
| format      | Results format. See api-overview-resultformats. | No         |

> ###### NOTE
>
> Pagination and meta parameters are supported, see api-overview-pagination and api-overview-metapar.

| Method   | Description    | Supported parameters   |
|----------|----------------|------------------------|
| GET      | Retrieve logs. | format                 |
| POST     | Upload logs.   |                        |

#### Retrieving logs

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/logs/1111111/1/1/
{"time":1444822757227,"level":20,"message":"Log opened."}
{"time":1444822757229,"level":20,"message":"[scrapy.log] Scrapy 1.0.3.post6+g2d688cd started"}
```
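
The numeric levels in the table above match the level numbers of Python's standard `logging` module, so log lines can be decoded with the standard library alone. A minimal sketch, where `parse_log_line` is a hypothetical helper:

```python
import json
import logging
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_log_line(line):
    """Decode one log entry into (datetime or None, level name, message)."""
    entry = json.loads(line)
    when = None
    if "time" in entry:  # the time field is optional
        when = EPOCH + timedelta(milliseconds=entry["time"])
    return when, logging.getLevelName(entry["level"]), entry["message"]


line = '{"time":1444822757227,"level":20,"message":"Log opened."}'
when, level, message = parse_log_line(line)
print(when.isoformat(), level, message)
```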

#### Submitting logs

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/logs/53/34/7 -X POST -T log.jl
```

## Requests API

The requests API allows you to work with request and response data from your crawls.

> ###### NOTE
>
> Most of the features provided by the API are also available through the
> python-scrapinghub client library.

### Request object

| Field    | Description                             | Required   |
|----------|-----------------------------------------|------------|
| time     | Request start timestamp in milliseconds | Yes        |
| method   | HTTP method. Default: GET               | Yes        |
| url      | Request URL.                            | Yes        |
| status   | HTTP response code.                     | Yes        |
| duration | Request duration in milliseconds.       | Yes        |
| rs       | Response size in bytes.                 | Yes        |
| parent   | The index of the parent request.        | No         |
| fp       | Request fingerprint.                    | No         |

> ###### NOTE
>
> Seed requests from start URLs have no `parent` field.

### requests/:project_id[/:spider_id][/:job_id][/:request_no]

Retrieve or insert request data for a project, spider or job, where `request_no` is the index of the request.

| Parameter   | Description                                                        | Required   |
|-------------|--------------------------------------------------------------------|------------|
| format      | Results format. See api-overview-resultformats.                    | No         |
| meta        | Meta keys to show.                                                 | No         |
| nodata      | If set, no data will be returned other than specified `meta` keys. | No         |

> ###### NOTE
>
> Pagination and meta parameters are supported, see api-overview-pagination and api-overview-metapar.

### requests/:project_id/:spider_id/:job_id

#### Examples

**Get the requests from a given job**

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/requests/53/34/7
{"parent":0,"duration":12,"status":200,"method":"GET","rs":1024,"url":"http://scrapy.org/","time":1351521736957}
```
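
Request data is returned one JSON object per line, which makes simple aggregations straightforward. The sketch below counts responses per HTTP status code; `status_counts` is a hypothetical helper, and the second line of `body` is made-up sample data added for illustration.

```python
import json
from collections import Counter


def status_counts(body):
    """Count requests per HTTP status code in a JSON Lines body."""
    return Counter(
        json.loads(line)["status"] for line in body.splitlines() if line.strip()
    )


body = (
    '{"parent":0,"duration":12,"status":200,"method":"GET","rs":1024,'
    '"url":"http://scrapy.org/","time":1351521736957}\n'
    # Hypothetical second request, not taken from a real job.
    '{"parent":0,"duration":8,"status":404,"method":"GET","rs":512,'
    '"url":"http://scrapy.org/missing","time":1351521736999}\n'
)
counts = status_counts(body)
print(dict(counts))
```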

**Adding requests**

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/requests/53/34/7 -X POST -T requests.jl
```

### requests/:project_id/:spider_id/:job_id/stats

Retrieve request stats for a given job.

| Field               | Description                              |
|---------------------|------------------------------------------|
| counts[field]       | The number of times the field occurs.    |
| totals.input_bytes  | The total size of all requests in bytes. |
| totals.input_values | The total number of requests.            |

#### Example

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/requests/53/34/7/stats
{"counts":{"url":21,"parent":19,"status":21,"method":21,"rs":21,"duration":21,"fp":21},"totals":{"input_bytes":2397,"input_values":21}}
```

## Activity API

Scrapy Cloud keeps track of certain project events such as when spiders are run
or new spiders are deployed. This activity log can be accessed in the dashboard by
clicking on **Activity** in the left sidebar, or programmatically through the
API described below.

### activity/:project_id

Retrieve messages for a specified project. Results are returned in reverse chronological order, most recent first.

| Parameter   | Description                          | Required   |
|-------------|--------------------------------------|------------|
| count       | Maximum number of results to return. | No         |

| Method   | Description                                     | Supported parameters   |
|----------|-------------------------------------------------|------------------------|
| GET      | Returns the messages for the specified project. | count                  |
| POST     | Creates a message.                              |                        |

GET example:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://storage.zyte.com/activity/1111111/?count=2"
{"event":"job:completed","job":"1111111/3/4","user":"jobrunner"}
{"event":"job:cancelled","job":"1111111/3/4","user":"example"}
```

POST example:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: -d '{"foo": 2}' https://storage.zyte.com/activity/1111111/
{"foo":4}
{"foo":3}
```

### activity/projects

Retrieve messages for multiple projects.

Results are returned in reverse chronological order, most recent first.

| Parameter   | Description                                                 | Required   |
|-------------|-------------------------------------------------------------|------------|
| count       | Maximum number of results to return.                        | No         |
| p           | Project ID. Multiple values supported.                      | No         |
| pcount      | Maximum number of results to return per project.            | No         |
| meta        | Meta parameter to add to results. See api-overview-metapar. | No         |

| Method   | Description                                      | Supported parameters   |
|----------|--------------------------------------------------|------------------------|
| GET      | Returns the messages for the specified projects. | count, p, pcount, meta |

GET example:

```
# Retrieve a single result for projects 1111111 and 2222222, using the `meta` parameter to include the project ID in the results.
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://storage.zyte.com/activity/projects/?pcount=1&meta=_project&p=1111111&p=2222222"
{"_project": 2222222, "bar": 1}
{"_project": 1111111, "foo": 4}
```
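
Because `p` can be repeated, the query string is easiest to build from a list of pairs rather than a dictionary. A minimal sketch follows; `activity_url` is a hypothetical helper that only builds the URL.

```python
from urllib.parse import urlencode


def activity_url(project_ids, count=None, pcount=None, meta=None):
    """Build the activity/projects URL, repeating p once per project."""
    params = []
    if count is not None:
        params.append(("count", count))
    if pcount is not None:
        params.append(("pcount", pcount))
    if meta is not None:
        params.append(("meta", meta))
    params.extend(("p", project_id) for project_id in project_ids)
    return "https://storage.zyte.com/activity/projects/?" + urlencode(params)


url = activity_url([1111111, 2222222], pcount=1, meta="_project")
print(url)
```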

## Collections API

*Collections* are key-value stores for an arbitrarily large
number of records. They are especially useful for storing information
produced or used by multiple scraping jobs.

> ###### NOTE
>
> The frontier API is best suited to store queues of URLs
> to be processed by scraping jobs.

### Quickstart

A **collection** is identified by a *project id*, a *type*, and a *name*.
A **record** can be any JSON dictionary. Records are identified by their `_key` field.

The examples below use project ID `78` and the regular storage type `s`
for a collection named `my_collection`.

> ###### NOTE
>
> Avoid using multiple collections with the same name and different types like `/s/my_collection` and `/cs/my_collection`. During operations on an entire collection, like renaming or deleting, Hubstorage will treat homonyms as a single entity and rename or delete both.

#### Create/Update a record:

```shell
$ curl -u $YOUR_SCRAPY_CLOUD_API_KEY: -X POST -d '{"_key": "foo", "value": "bar"}' \
    https://storage.zyte.com/collections/78/s/my_collection
```

#### Access a record:

```shell
$ curl -u $YOUR_SCRAPY_CLOUD_API_KEY: -X GET \
    https://storage.zyte.com/collections/78/s/my_collection/foo
```

#### Delete a record:

```shell
$ curl -u $YOUR_SCRAPY_CLOUD_API_KEY: -X DELETE \
    https://storage.zyte.com/collections/78/s/my_collection/foo
```

#### List records:

```shell
$ curl -u $YOUR_SCRAPY_CLOUD_API_KEY: -X GET \
    https://storage.zyte.com/collections/78/s/my_collection
```

#### Create/Update multiple records:

The `jsonline` format is used by default (JSON objects separated by newlines):

```shell
$ curl -u $YOUR_SCRAPY_CLOUD_API_KEY: -X POST -d $'{"_key": "foo", "value": "bar"}\n{"_key": "goo", "value": "baz"}' \
    https://storage.zyte.com/collections/78/s/my_collection
```
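As a minimal sketch of the same request body built in Python, the helper below serializes records one JSON object per line. The name `to_jsonlines` is hypothetical and not part of any Zyte library; sending the body over HTTP is left out.

```python
import json


def to_jsonlines(records):
    """Serialize records as one JSON object per line (hypothetical helper)."""
    lines = []
    for record in records:
        # Records are identified by their `_key` field, so require it up front.
        if "_key" not in record:
            raise ValueError("each record needs a '_key' field")
        lines.append(json.dumps(record))
    return "\n".join(lines)


body = to_jsonlines([
    {"_key": "foo", "value": "bar"},
    {"_key": "goo", "value": "baz"},
])
print(body)
```

The resulting string can be used as the `POST` body in place of the `$'...'` argument of the curl command above.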

### Details

The following collection types are available:

| Type   | Full name             | Hubstorage method          | Description                                                      |
|--------|-----------------------|----------------------------|------------------------------------------------------------------|
| s      | store                 | new_store                  | Basic set store                                                  |
| cs     | cached store          | new_cached_store           | Items expire after a month                                       |
| vs     | versioned store       | new_versioned_store        | Up to 3 copies of each item will be retained                     |
| vcs    | versioned cache store | new_versioned_cached_store | Multiple copies are retained, and each one expires after a month |

Records are `JSON` objects, with the following constraints:

- Their serialized size can’t be larger than `1 MB`;
- JavaScript’s `inf` values are not supported;
- Floating-point numbers can’t be larger than `2^64 - 1`.
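The constraints above can be checked on the client side before writing a record. The following is only a sketch: `record_problems` and `iter_scalars` are hypothetical names, and it assumes that "1 MB" means 1024 × 1024 bytes of UTF-8 encoded JSON, which the documentation does not state explicitly.

```python
import json
import math

# Assumption: "1 MB" is read here as 1024 * 1024 bytes of UTF-8 encoded JSON.
MAX_RECORD_BYTES = 1024 * 1024
MAX_FLOAT = 2**64 - 1


def iter_scalars(value):
    """Yield every non-container value nested inside dicts and lists."""
    if isinstance(value, dict):
        for item in value.values():
            yield from iter_scalars(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            yield from iter_scalars(item)
    else:
        yield value


def record_problems(record):
    """Return a list of constraint violations; an empty list means none found."""
    problems = []
    for value in iter_scalars(record):
        if isinstance(value, float):
            if math.isinf(value):
                problems.append("infinite value")
            elif value > MAX_FLOAT:
                problems.append("float too large")
    if len(json.dumps(record).encode("utf-8")) > MAX_RECORD_BYTES:
        problems.append("record too large")
    return problems


print(record_problems({"_key": "foo", "value": 1.5}))
print(record_problems({"_key": "foo", "value": float("inf")}))
```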

### API

#### collections/:project_id/list

List all collections.

```shell
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/collections/78/list
{"type":"s","name":"my_collection"}
{"type":"s","name":"my_collection_2"}
{"type":"cs","name":"my_other_collection"}
```

#### collections/:project_id/:type/:collection

Read, write or remove items in a collection.

| Parameter   | Description                                                     | Required   |
|-------------|-----------------------------------------------------------------|------------|
| key         | Read items with a specified key. Multiple values are supported. | No         |
| prefix      | Read items with a specified key prefix.                         | No         |
| prefixcount | Maximum number of values to return per prefix.                  | No         |
| startts     | UNIX timestamp at which to begin results, in milliseconds.      | No         |
| endts       | UNIX timestamp at which to end results, in milliseconds.        | No         |

| Method   | Description                                 | Supported parameters                     |
|----------|---------------------------------------------|------------------------------------------|
| GET      | Read items from the specified collection.   | key, prefix, prefixcount, startts, endts |
| POST     | Write items to the specified collection.    |                                          |
| DELETE   | Delete items from the specified collection. | key, prefix, prefixcount, startts, endts |

> ###### NOTE
>
> Pagination and meta parameters are supported,
> see api-overview-pagination and api-overview-metapar.

GET examples:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://storage.zyte.com/collections/78/s/my_collection?key=foo1&key=foo2"
{"value":"bar1"}
{"value":"bar2"}
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://storage.zyte.com/collections/78/s/my_collection?prefix=f"
{"value":"bar"}
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: "https://storage.zyte.com/collections/78/s/my_collection?startts=1402699941000&endts=1403039369570"
{"value":"bar"}
```

Prefix filters, unlike other filters, use indexes and should be used
when possible. You can use the `prefixcount` parameter to limit the
number of values returned for each prefix.

A common pattern is to download changes within a certain time period.
You can use the `startts` and `endts` parameters to select records
within a certain time window.

The current timestamp can be retrieved like so:

```
$ curl https://storage.zyte.com/system/ts
1403039369570
```

> ###### NOTE
>
> Timestamp filters may perform poorly when selecting a small number
> of records from a large collection.
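To illustrate how the filter parameters combine into a request URL, here is a small sketch using only the Python standard library. The helper names `collection_url` and `time_window` are hypothetical; the endpoint and parameter names are the ones listed in the tables above.

```python
from urllib.parse import urlencode

BASE_URL = "https://storage.zyte.com/collections"


def collection_url(project_id, type_, name, **filters):
    """Build a collection URL; list values become repeated parameters."""
    url = f"{BASE_URL}/{project_id}/{type_}/{name}"
    params = {key: value for key, value in filters.items() if value is not None}
    if params:
        url += "?" + urlencode(params, doseq=True)
    return url


def time_window(now_ms, hours):
    """Return (startts, endts) in milliseconds covering the last `hours` hours."""
    return now_ms - hours * 3600 * 1000, now_ms


# Multiple keys in a single request.
print(collection_url(78, "s", "my_collection", key=["foo1", "foo2"]))

# Records changed during the hour before a timestamp read from /system/ts.
startts, endts = time_window(1403039369570, hours=1)
print(collection_url(78, "s", "my_collection", startts=startts, endts=endts))
```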

#### collections/:project_id/:type/:collection/count

Count the number of items in a collection.

```shell
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/collections/78/s/my_collection/count
{"count":972,"scanned":972}
```

If the collection is large, the result may contain a `nextstart` field that
is used for pagination, see api-overview-pagination.

#### collections/:project_id/:type/:collection/:item

Read, write, or delete an individual item.

| Method   | Description                        |
|----------|------------------------------------|
| GET      | Read the item with the given key   |
| POST     | Write the item with the given key  |
| DELETE   | Delete the item with the given key |

```shell
$ curl -u $YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/collections/78/s/my_collection/foo
{"value":"bar"}
```

#### collections/:project_id/:type/:collection/:item/value

Read an individual item value.

```shell
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/collections/78/s/my_collection/foo/value
bar
```

#### collections/:project_id/:type/:collection/deleted

`POST` with a list of item keys to delete them.

> ###### NOTE
>
> This endpoint is designed to delete a large number of
> non-consecutive items. To delete consecutive items use
> `DELETE`-based endpoints, which are faster.

```shell
$ curl -u $YOUR_SCRAPY_CLOUD_API_KEY: -X POST -d '"foo"' -d '"bar"' \
    https://storage.zyte.com/collections/78/s/my_collection/deleted
```

#### collections/:project_id/delete?name=:collection

Delete an entire collection immediately.

```shell
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: -X POST "https://storage.zyte.com/collections/78/delete?name=my_collection"
```

#### collections/:project_id/rename?name=:collection&new_name=:new_name

Rename a collection and move all its items immediately.

```shell
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: -X POST "https://storage.zyte.com/collections/78/rename?name=my_collection&new_name=my_collection_renamed"
```

## Frontier API

The *Hub Crawl Frontier* (HCF) stores pages visited and outstanding requests to
make. It can be thought of as a persistent shared storage for a crawl scheduler.

Web pages are identified by a fingerprint. This can be the URL of the page, but
crawlers may use any other string (e.g. a hash of post parameters, if it
processes post requests), so there is no requirement for the fingerprint to be
a valid URL.

A project can have many frontiers and each frontier is broken down into slots.
A separate priority queue is maintained per slot. This means that requests
from each slot can be prioritized separately and crawled at different rates and
at different times.

Arbitrary data can be stored in both the crawl queue and with the set of
fingerprints.

A typical example would be to use the URL as a fingerprint and the hostname as
a slot. The crawler should ensure that each host is only crawled from one
process at any given time so that politeness can be maintained.
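As a sketch of that convention, the function below derives the slot from the hostname and uses the full URL as the fingerprint. `frontier_request` is a hypothetical helper; the field names `fp`, `p`, `qdata` and `fdata` are those of the request object described below.

```python
from urllib.parse import urlsplit


def frontier_request(url, priority=0, qdata=None, fdata=None):
    """Return (slot, request): hostname as slot, URL as fingerprint."""
    slot = urlsplit(url).hostname
    if not slot:
        raise ValueError(f"cannot derive a slot from {url!r}")
    request = {"fp": url}
    # Only include optional fields that differ from their defaults.
    if priority:
        request["p"] = priority
    if qdata is not None:
        request["qdata"] = qdata
    if fdata is not None:
        request["fdata"] = fdata
    return slot, request


slot, request = frontier_request(
    "https://example.com/page1.html", priority=1, qdata={"depth": 1}
)
print(slot, request)
```

Grouping requests by the returned slot makes it straightforward to assign each host to a single crawler process.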

> ###### NOTE
>
> Most of the features provided by the API are also available through the
> python-scrapinghub client library.

### Batch object

| Field    | Description                  |
|----------|------------------------------|
| id       | Batch ID.                    |
| requests | An array of request objects. |

### Request object

| Field   | Description                                                          | Required   |
|---------|----------------------------------------------------------------------|------------|
| fp      | Request fingerprint.                                                 | Yes        |
| qdata   | Data to be stored along with the fingerprint in the request queue.   | No         |
| fdata   | Data to be stored along with the fingerprint in the fingerprint set. | No         |
| p       | Priority: lower priority numbers are returned first. Defaults to 0.  | No         |

### /hcf/:project_id/:frontier/s/:slot

| Field    | Description                                      |
|----------|--------------------------------------------------|
| newcount | The number of new requests that have been added. |

| Method   | Description                               | Supported parameters   |
|----------|-------------------------------------------|------------------------|
| POST     | Enqueues a request in the specified slot. | fp, qdata, fdata, p    |
| DELETE   | Deletes the specified slot.               |                        |

#### POST examples

**Add a request to the frontier**

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: -d '{"fp":"/some/path.html"}'  \
    https://storage.zyte.com/hcf/78/test/s/example.com
{"newcount":1}
```

**Add requests with additional parameters**

By using the same priority as request depth, the website can be traversed in breadth-first order from the starting URL.

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: -d $'{"fp":"/"}\n{"fp":"page1.html", "p": 1, "qdata": {"depth": 1}}' \
    https://storage.zyte.com/hcf/78/test/s/example.com
{"newcount":2}
```
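The breadth-first idea can be sketched in Python by setting each request's priority to its depth, so that shallower pages are returned first. `breadth_first_body` is a hypothetical helper that only builds the request body; it does not send it.

```python
import json


def breadth_first_body(links_with_depth):
    """Build a newline-separated body where priority equals crawl depth."""
    lines = []
    for fingerprint, depth in links_with_depth:
        request = {"fp": fingerprint, "p": depth, "qdata": {"depth": depth}}
        lines.append(json.dumps(request))
    return "\n".join(lines)


body = breadth_first_body([("/", 0), ("page1.html", 1)])
print(body)
```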

#### DELETE example

The example below deletes the slot `example.com` from the frontier.

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: -X DELETE https://storage.zyte.com/hcf/78/test/s/example.com/
```

### /hcf/:project_id/:frontier/s/:slot/q

Retrieve requests for a given slot.

| Parameter   | Description                                 | Required   |
|-------------|---------------------------------------------|------------|
| mincount    | The minimum number of requests to retrieve. | No         |

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/hcf/78/test/s/example.com/q
{"id":"00013967d8af7b0001","requests":[["/",null]]}
{"id":"01013967d8af7e0001","requests":[["page1.html",{"depth":1}]]}
```

### /hcf/:project_id/:frontier/s/:slot/q/deleted

Delete a batch of requests.

Once a batch has been processed, clients should indicate that the batch is completed so that it will be removed and no longer returned when new batches are requested.

This can be achieved by posting the IDs of the completed batches:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: -d '"00013967d8af7b0001"' https://storage.zyte.com/hcf/78/test/s/example.com/q/deleted
```

You can specify the IDs as arrays or single values. As with the previous examples, multiple lines of input are accepted.
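A possible client-side flow is to parse the batches returned by the `q` endpoint, process their requests, and then build the body that marks them as completed. The sketch below assumes the response format shown in the example above; `parse_batches` and `deletion_body` are hypothetical helper names.

```python
import json


def parse_batches(response_text):
    """Parse one batch per line into (batch_id, [(fp, qdata), ...]) tuples."""
    batches = []
    for line in response_text.splitlines():
        if not line.strip():
            continue
        batch = json.loads(line)
        requests = [(fp, qdata) for fp, qdata in batch["requests"]]
        batches.append((batch["id"], requests))
    return batches


def deletion_body(batch_ids):
    """Build the body for q/deleted: one JSON-encoded batch ID per line."""
    return "\n".join(json.dumps(batch_id) for batch_id in batch_ids)


response = (
    '{"id":"00013967d8af7b0001","requests":[["/",null]]}\n'
    '{"id":"01013967d8af7e0001","requests":[["page1.html",{"depth":1}]]}\n'
)
batches = parse_batches(response)
print(deletion_body(batch_id for batch_id, _ in batches))
```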

### /hcf/:project_id/:frontier/s/:slot/f

Retrieve fingerprints for a given slot.

#### Example

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/hcf/78/test/s/example.com/f
{"fp":"/"}
{"fp":"page1.html"}
```

Results are ordered lexicographically by fingerprint value.

### /hcf/:project_id/list

Lists the frontiers for a given project.

#### Example

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/hcf/78/list
["test"]
```

### /hcf/:project_id/:frontier/list

Lists the slots for a given frontier.

#### Example

HTTP:

```
$ curl -u YOUR_SCRAPY_CLOUD_API_KEY: https://storage.zyte.com/hcf/78/test/list
["example.com"]
```

## Scrapy Cloud Write Entrypoint

> ###### NOTE
>
> This is the documentation of a low-level protocol that most Scrapy Cloud users don’t need to deal with. For higher-level documentation and user guides, check the [Help Center](https://support.zyte.com/support/home).

Scrapy Cloud Write Entrypoint is a write-only interface to Scrapy Cloud storage. Its main purpose is to
make it easy to write crawlers and scripts compatible with Scrapy Cloud in different programming languages
using [custom Docker images](https://support.zyte.com/support/solutions/articles/22000200425-deploying-custom-docker-images-on-scrapy-cloud).

Jobs in Scrapy Cloud run inside Docker containers. When a Job container is started, a [named pipe](http://man7.org/linux/man-pages/man7/fifo.7.html) is created
at the location stored in the `SHUB_FIFO_PATH` environment variable. To interface with Scrapy Cloud storage,
your crawler has to open this named pipe and write messages on it, following a simple text-based protocol
as described below.

### Protocol

Each message is a line of ASCII characters terminated by a newline character. A message consists of
the following parts:

- a 3-character command (one of “ITM”, “LOG”, “REQ”, “STA”, or “FIN”),
- followed by a space character,
- then followed by a payload as a [JSON](http://json.org/) object,
- and a final newline character `\n`.

This is what an example log message looks like:

```
LOG {"time": 1485269941065, "level": 20, "message": "Some log message"}
```

This example and all the following examples omit the trailing newline character because it’s
a non-printable character. This is how you would write the above example message in Python:

```python
pipe.write('LOG {"time": 1485269941065, "level": 20, "message": "Some log message"}\n')
pipe.flush()
```

Newline characters are used as message separators. So, make sure that the serialized JSON object payload
doesn’t contain newline characters between key/value pairs and that newline characters inside strings
for both keys and values are properly escaped, i.e. an actual `\` (reverse solidus, backslash) followed by `n`.
Here’s an example of two consecutive log messages that carry multiline messages in the payload:

```
LOG {"time": 1485269941065, "level": 20, "message": "First multiline message. Line 1\nLine 2"}
LOG {"time": 1485269941066, "level": 30, "message": "Second multiline message. Line 1\nLine 2"}
```

In Python this will look like this:

```python
pipe.write('LOG {"time": 1485269941065, "level": 20, "message": "First multiline message. Line 1\\nLine 2"}\n')
pipe.write('LOG {"time": 1485269941066, "level": 30, "message": "Second multiline message. Line 1\\nLine 2"}\n')
pipe.flush()
```

Unicode characters in the JSON object MUST be escaped using the standard JSON `\u` four-hex-digits syntax,
e.g. the item `{"ключ": "значение"}` should look like this:

```
ITM {"\u043a\u043b\u044e\u0447": "\u0437\u043d\u0430\u0447\u0435\u043d\u0438\u0435"}
```

The total size of the message MUST NOT exceed 1 MiB. If a message exceeds this size,
an error is logged instead.
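The rules above can be satisfied with Python's standard `json` module, whose default `ensure_ascii=True` escapes both non-ASCII characters and newlines inside strings. The following is a minimal sketch: `format_message` is a hypothetical helper, and it assumes that the 1 MiB limit applies to the whole line, including the command and the trailing newline.

```python
import json

COMMANDS = {"ITM", "LOG", "REQ", "STA", "FIN"}
MAX_MESSAGE_BYTES = 1024 * 1024  # 1 MiB; assumed to cover the whole line


def format_message(command, payload):
    """Return one protocol line: command, space, JSON payload, newline."""
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command!r}")
    # json.dumps emits a single line of ASCII by default: non-ASCII characters
    # become \uXXXX escapes and newlines inside strings become \n escapes.
    line = f"{command} {json.dumps(payload)}\n"
    if len(line.encode("ascii")) > MAX_MESSAGE_BYTES:
        raise ValueError("message exceeds 1 MiB")
    return line


print(format_message("ITM", {"ключ": "значение"}), end="")
print(format_message("LOG", {"level": 20, "message": "Line 1\nLine 2"}), end="")
```

Serializing with `json.dumps` rather than concatenating strings by hand avoids accidentally emitting a raw newline inside a payload.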

#### ITM command

The `ITM` command writes a single item into Scrapy Cloud storage.
The `ITM` payload has no predefined schema.

Example:

```
ITM {"key": "value"}
```

To support very simple scripts the Scrapy Cloud Write Entrypoint allows sending plain JSON objects as items,
i.e. without the 3-character command and space prefix. The following two messages are valid and equivalent:

```
ITM {"key": "value"}
```

```
{"key": "value"}
```

#### LOG command

The `LOG` command writes a single log message into Scrapy Cloud storage.
The schema for the `LOG` payload is described in log-object.

Example:

```
LOG {"level": 20, "message": "Some log message"}
```

#### REQ command

The `REQ` command writes a single request into Scrapy Cloud storage.
The schema for the `REQ` payload is described in request-object.

Example:

```
REQ {"url": "http://example.com", "method": "GET", "status": 200, "rs": 10, "duration": 20}
```

#### STA command

`STA` stands for stats and is used to populate the job stats page and to create graphs on the job details page.

| Field   | Description                                     | Required   |
|---------|-------------------------------------------------|------------|
| time    | UNIX timestamp of the message, in milliseconds. | No         |
| stats   | JSON object with arbitrary keys and values.     | Yes        |

If the following keys are present in the `STA` payload, their values are used to populate
the Scheduled Requests graph on the job details page:

- `scheduler/enqueued`
- `scheduler/dequeued`

The key names above were picked for compatibility with [Scrapy stats](https://doc.scrapy.org/en/latest/topics/stats.html).

Example:

```
STA {"time": 1485269941065, "stats": {"key": 0, "key2": 20.5, "scheduler/enqueued": 20, "scheduler/dequeued": 15}}
```
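As a sketch of how a script might produce such a message, the helper below fills in the current time in milliseconds when none is given. `sta_message` is a hypothetical name, not part of any Zyte library.

```python
import json
import time


def sta_message(stats, time_ms=None):
    """Build an STA line; `time` defaults to the current UNIX time in ms."""
    if time_ms is None:
        time_ms = int(time.time() * 1000)
    return "STA " + json.dumps({"time": time_ms, "stats": stats}) + "\n"


message = sta_message(
    {"scheduler/enqueued": 20, "scheduler/dequeued": 15},
    time_ms=1485269941065,
)
print(message, end="")
```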

#### FIN command

The `FIN` command is used to set the outcome of a crawler execution, once it’s finished.

| Field   | Description                                              | Required   |
|---------|----------------------------------------------------------|------------|
| outcome | String with custom outcome message, limited to 255 chars | Yes        |

Example:

```
FIN {"outcome": "finished"}
```

### Printing to stdout and stderr

The output printed by a job in Scrapy Cloud is automatically converted into log messages. Lines printed
to `stdout` are converted into `INFO` level log messages. Lines printed to `stderr` are converted
into `ERROR` level log messages. For example, if the script prints `Hello, world` to stdout,
the resulting `LOG` message looks like this:

```
LOG {"time": 1485269941065, "level": 20, "message": "Hello, world"}
```

There’s very basic support for multiline standard output: if some output consists of multiple lines
where the first line starts with a non-space character and subsequent lines start with a space character,
it is treated as a single log entry. For example, the following traceback in stderr:

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'e' is not defined
```

will produce the following log messages:

```
LOG {"time": 1485269941065, "level": 40, "message": "Traceback (most recent call last):\n  File \"<stdin>\", line 1, in <module>"}
LOG {"time": 1485269941066, "level": 40, "message": "NameError: name 'e' is not defined"}
```

The resulting log messages are subject to the 1 MiB limit, which means that output longer than 1023 KiB
is likely to cause errors.
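The grouping rule described above can be illustrated with a few lines of Python. This is only a sketch of the documented behavior, not the actual Scrapy Cloud implementation, and `group_output_lines` is a hypothetical name.

```python
def group_output_lines(lines):
    """Merge lines that start with a space into the preceding log entry."""
    entries = []
    for line in lines:
        if entries and line.startswith(" "):
            # Continuation line: append it to the current entry.
            entries[-1] += "\n" + line
        else:
            # A line starting with a non-space character opens a new entry.
            entries.append(line)
    return entries


traceback_lines = [
    "Traceback (most recent call last):",
    '  File "<stdin>", line 1, in <module>',
    "NameError: name 'e' is not defined",
]
for entry in group_output_lines(traceback_lines):
    print(repr(entry))
```

Because the final `NameError` line starts with a non-space character, it becomes its own entry, which matches the two log messages shown above.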

> ###### WARNING
>
> Even though you can write log messages by printing them to stdout and stderr, we recommend that you
> use the named pipe and `LOG` messages instead. Due to the way data is sent between processes,
> it is not possible to maintain the order of the messages coming from different sources
> (named pipe, stdout, stderr). Exclusive use of the named pipe both gives the best performance
> and guarantees that messages are received in exactly the same order they were sent.

### How to build a compatible crawler

Scripts or non-Scrapy spiders have to be deployed as [custom Docker images](https://support.zyte.com/support/solutions/articles/22000200425-deploying-custom-docker-images-on-scrapy-cloud).

Each spider needs to follow the pattern:

1. Get the path to the named pipe from the `SHUB_FIFO_PATH` environment variable.
2. Open the named pipe for writing. For example, in Python:
   ```python
   import os

   path = os.environ['SHUB_FIFO_PATH']
   pipe = open(path, 'w')
   ```
3. Write messages to the pipe. If you want to send a message instantly, you have to flush the stream,
   otherwise it may remain in the file buffer inside the crawler process. However, this is not always required,
   as the buffer is flushed once enough data is written or when the file object is closed
   (depending on the programming language you use):
   ```python
   # write item
   pipe.write('ITM {"a": "b"}\n')
   pipe.flush()
   # ...
   # write request
   pipe.write('REQ {"time": 1484337369817, "url": "http://example.com", "method": "GET", "status": 200, "rs": 10, "duration": 20}\n')
   pipe.flush()
   # ...
   # write log entry
   pipe.write('LOG {"time": 1484337369817, "level": 20, "message": "Some log message"}\n')
   pipe.flush()
   # ...
   # write stats
   pipe.write('STA {"time": 1485269941065, "stats": {"key": 0, "key2": 20.5}}\n')
   pipe.flush()
   # ...
   # set outcome
   pipe.write('FIN {"outcome": "finished"}\n')
   pipe.flush()
   ```
4. Close the named pipe when the crawl is finished:
   ```python
   pipe.close()
   ```
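The steps above can be wrapped in a small class so that the rest of the crawler never formats protocol lines by hand. This is a sketch under stated assumptions: `PipeWriter` and `open_writer` are hypothetical names, and the demo at the end writes to an in-memory buffer instead of the real named pipe.

```python
import io
import json
import os
import time


class PipeWriter:
    """Write protocol messages to any writable text stream."""

    def __init__(self, pipe):
        self._pipe = pipe

    def _write(self, command, payload):
        self._pipe.write(f"{command} {json.dumps(payload)}\n")
        self._pipe.flush()  # send each message immediately

    def item(self, item):
        self._write("ITM", item)

    def log(self, message, level=20):
        now_ms = int(time.time() * 1000)
        self._write("LOG", {"time": now_ms, "level": level, "message": message})

    def finish(self, outcome="finished"):
        self._write("FIN", {"outcome": outcome})


def open_writer():
    """Open the named pipe that Scrapy Cloud exposes via SHUB_FIFO_PATH."""
    return PipeWriter(open(os.environ["SHUB_FIFO_PATH"], "w"))


# Demo: an in-memory buffer stands in for the named pipe.
buffer = io.StringIO()
writer = PipeWriter(buffer)
writer.item({"a": "b"})
writer.finish()
print(buffer.getvalue(), end="")
```

Inside a Scrapy Cloud job, `open_writer()` would replace the in-memory buffer; the underlying file object should still be closed when the crawl ends.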

> ###### NOTE
>
> [scrapinghub-entrypoint-scrapy](https://github.com/scrapinghub/scrapinghub-entrypoint-scrapy/blob/master/sh_scrapy/writer.py) uses Scrapy Cloud Write Entrypoint; check its code if you need an example.

## Frequently Asked Questions about Scrapy Cloud

### Can I use browser automation?

Of course!

We recommend using zapi-browser, to enjoy automatic ban
avoidance and a powerful API. If you already have browser
automation code, see zapi-browser-automation.

Alternatively, you can use a third-party service or remote tool. If you are a
paying customer, you can use
Docker to run a browser automation tool like Playwright,
Puppeteer, Selenium or Splash alongside your spider.

### How can I configure an unlisted spider setting?

When adding a setting to your spider settings, select the first entry from the
drop-down list of settings, **Custom Name**. You can then enter the name and
value of your setting.

> ###### TIP
>
> You can also use the **Raw Settings** tab to edit all of your spider
> settings as plain text.

### Do I have to use Zyte API?

While Zyte API and Scrapy Cloud work
great together, they are separate products that you can use independently.

### Can I use third-party services in my spider code?

Yes, you can.

### What does “cancelled (stalled)” mean in my job’s outcome?

It means that your job was automatically cancelled by Scrapy Cloud because it
did not produce any output (logs, requests or items) for one hour.

This outcome may indicate that your spider is stuck or not configured correctly.
We recommend checking the job’s logs to investigate the issue.

### What does “killed by oom” mean in my job’s outcome?

It means that your job was killed by the operating system’s out-of-memory (OOM)
killer because it exceeded the memory available to its unit(s).

To fix this, consider:

- Using more units to give your job more memory. Each unit
  provides 1 GB of memory.
- Reducing your spider’s memory usage. For example, lower
  `SCRAPER_SLOT_MAX_ACTIVE_SIZE` to limit how many response bodies
  are held in memory at once.
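For instance, the setting could be lowered in the `settings.py` of a Scrapy project. The value below is purely illustrative (half of Scrapy's default of 5,000,000 bytes); the right value depends on your spider and should be found by testing.

```python
# settings.py of a Scrapy project

# Soft limit, in bytes, for response data being processed at once.
# Scrapy's default is 5_000_000; this halved value is only an example.
SCRAPER_SLOT_MAX_ACTIVE_SIZE = 2_500_000
```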

## Scrapy Cloud pricing

On signup, you get the following for free:

- A low-resource unit, with half as many
  resources as a regular unit.
- Your job data can be retained for up to 7 days before deletion.
- Your jobs can run for up to 1 hour.

If you purchase 1 or more units ($9/month per unit) you get the following:

- Your low-resource unit is replaced by your purchased units, each with twice as many resources.
- Your job data can be retained for up to 120 days before deletion.
- Your jobs have unlimited run time.
- You can schedule jobs and use Docker.

> ###### TIP
>
> Students get the benefits of purchasing 1 unit for free! [Learn more](https://www.zyte.com/scrapy-cloud-student-backpack/).

## Coding Agent Add-Ons

The following free-to-use Coding Agent Add-Ons help you write, run and test web
scraping code faster, with no vendor lock-in (they work with and without Zyte
services):

> ##### Zyte Web Data for Claude Code
>
> A **Claude Code** plugin for web scraping.

> ##### Web Scraping Copilot
>
> A **Visual Studio Code** extension for web scraping.

### Comparison

|                         | Zyte Web Data for Claude Code                           | Web Scraping Copilot                                                 |
|-------------------------|---------------------------------------------------------|----------------------------------------------------------------------|
| Supported harnesses     | [Claude Code](https://code.claude.com/docs/en/overview) | [Visual Studio Code](https://code.visualstudio.com/)                 |
| AI approach             | [Agent skills](https://agentskills.io/)                 | Custom agent and tools                                               |
| `Spider` class creation | Yes                                                     | No                                                                   |
| UI tooling              | Web app for reviewing schemas and extracted data        | Interactive tree views, test management, Scrapy Cloud job monitoring |

### Automatic extraction vs coding agents

zapi-extract is a feature of Zyte API that can extract
structured data from any URL.

zapi-extract is designed to be robust to site changes, which you cannot
get with Coding Agent Add-Ons. It increases per-request cost and response
times, but it can save you time and effort in the long run by eliminating the
need to write and maintain parsing code.

On the other hand, Coding Agent Add-Ons can be more flexible and cost-effective
for many use cases, and while generated code may break when sites change, they
make it easier and faster to adapt.

You should choose the approach that best fits your needs. It is also possible
to combine both, or for example use generated code by default and
zapi-extract as fallback for website changes.

## Zyte Web Data for Claude Code

**Zyte Web Data for Claude Code** is a free-to-use [Claude Code](https://code.claude.com/docs/en/overview) plugin powered by [agent skills](https://agentskills.io) for web scraping.

> ##### Install
>
> Install Zyte Web Data for Claude Code.

> ##### Tutorial
>
> Take your first steps with Zyte Web Data for Claude Code.

## Install Zyte Web Data for Claude Code

To install **Zyte Web Data for Claude Code**:

1. [Install Claude Code](https://code.claude.com/docs/en/quickstart).
2. Add the Claude Skills marketplace repository and install the plugin by
   running the following in a terminal:
   ```shell
   claude plugin marketplace add zyte-ai/claude-skills
   claude plugin install zyte-web-data@zyte-ai
   ```

   See [https://github.com/zyte-ai/claude-skills](https://github.com/zyte-ai/claude-skills) for details.

Follow the tutorial to learn more.

## Web Scraping Copilot

**Web Scraping Copilot** is a free [Visual Studio Code](https://code.visualstudio.com/) extension by [Zyte](https://www.zyte.com/)
that helps you generate web scraping code with [GitHub Copilot](https://github.com/features/copilot). It
streamlines working with Scrapy projects and includes
optional integration with Scrapy Cloud, making it easier
to deploy and monitor your web scraping jobs.

> ##### Requirements
>
> Find out what you need in order to use Web Scraping Copilot.

> ##### Install
>
> Install Web Scraping Copilot.

> ##### Features
>
> Discover the features of Web Scraping Copilot.

> ##### FAQ
>
> Find all the answers about Web Scraping Copilot.

> ##### Tutorial
>
> Follow the tutorial to learn the AI-assisted web scraping workflow.

> ##### User interface
>
> Learn about the user interface of Web Scraping Copilot.

## Web Scraping Copilot requirements

Web Scraping Copilot requires [Visual Studio Code 1.109+](https://code.visualstudio.com/Download).

> ###### TIP
>
> All other requirements can be installed and set up with the help of
> Web Scraping Copilot, but are listed below for reference.

### Minimum requirements

The core features require a Scrapy project and a Python virtual environment
that meets the following requirements:

- [Python 3.10+](https://www.python.org/downloads/)
- [Scrapy 2.7.0+](https://www.scrapy.org/download)
- [itemadapter 0.13.0+](https://github.com/scrapy/itemadapter)
- zyte-common-items 0.29.0+

### Code generation requirements

Code generation requires:

- [GitHub Copilot](https://github.com/features/copilot) and its [chat
  extension](https://marketplace.visualstudio.com/items?itemName=GitHub.copilot-chat).

  While any GitHub Copilot plan is technically supported, the limited
  requests of the Free plan can run out quickly, so Pro or better is
  recommended.
- The following additional packages in your virtual environment:
  - web-poet 0.22.0+
  - scrapy-poet 0.26.0+ (requires setup)
  - pytest 7.0.0+

## Install Web Scraping Copilot

[Install Web Scraping Copilot](vscode:extension/zyte.web-scraping) on Visual
Studio Code ([marketplace](https://marketplace.visualstudio.com/items?itemName=zyte.web-scraping)), open the sidebar view, and follow the setup
instructions.

Follow the tutorial to learn more.

### Troubleshooting

#### The MCP server fails to start

If the MCP server fails to start, check its output by opening **View › Command
Palette… › MCP: List Servers › Web Scraping Copilot › Show Output**.

If you see `realpath: command not found` in the output, chances are you are
running macOS 12 (Monterey). macOS 12 is end-of-life, so consider upgrading to a
newer macOS version. If you cannot upgrade, [install realpath](https://ports.macports.org/port/realpath/).

#### Other issues

If you cannot find your issue in this list, or the proposed workarounds do not
work for you, please [report it](https://github.com/zytedata/web-scraping-copilot/issues).

## Web Scraping Copilot features

Web Scraping Copilot provides the following features:

### Code generation

Generate maintainable web scraping code with [GitHub Copilot](https://github.com/features/copilot).

![image](_static/copilot/ai-workflow-0.1.0.gif)

See copilot-tutorial.

### Test management

Compare extracted data to expectations, including expected exceptions, check
target pages in the embedded browser, and more.

![image](_static/copilot/test-management-1.0.0.gif)

### Project setup

Start a new project in seconds.

![image](_static/copilot/new-project-1.0.0.gif)

### Scrapy Cloud integration

If you use Scrapy Cloud, you can deploy your spiders with
a click, and monitor cloud jobs from the spiders view.

![image](_static/copilot/scrapy-cloud-integration-0.1.0.png)

## Web Scraping Copilot FAQ

These are some frequently asked questions about Web Scraping Copilot:

### How much does Web Scraping Copilot cost?

**Web Scraping Copilot** in itself is **free**.

To use **code generation**, you do need a [GitHub Copilot](https://github.com/features/copilot) plan. The Free plan
is not recommended because you would spend your requests rather quickly.

To use Scrapy Cloud features, you need a Scrapy Cloud
account. The free plan is fine, though.

### Does the extension use AI from Zyte?

No. The extension uses the models available through your [GitHub Copilot](https://github.com/features/copilot) plan.

The extension provides instructions and prompts, and the MCP server tools use
[MCP sampling](https://modelcontextprotocol.io/specification/2025-11-25/client/sampling) to start separate chats in the background to handle the
different steps of code generation.

To control which models can be used by the MCP server, open the [Command
Palette](https://code.visualstudio.com/docs/getstarted/userinterface#_command-palette) (`Ctrl + Shift + P`) and select **MCP: List Servers › Web Scraping
Copilot › Configure Model Access**.

### Is my code sent to Zyte?

The code generation workflow that the extension facilitates does not send any
code to Zyte, only to [GitHub Copilot](https://github.com/features/copilot).

Scrapy Cloud deployment, if used, does upload your code
to Scrapy Cloud.

### Which model is best for code generation?

The model you use in the **main chat** should be somewhat smart, since workflow
management can be hard for smaller models. We recommend something like
**GPT‑5**, although GPT‑5 mini has shown good results in our tests.

The MCP web scraping **tools**, which generate expectations and code, are
designed to work well enough with models for which [GitHub Copilot](https://github.com/features/copilot) paid plans
(Pro or better) allow unlimited requests, like **GPT‑5 mini**. Those tools can
generate a large number of requests, so using a smarter model could be very
costly.

If you don’t mind the extra cost, however, [Claude Sonnet 4.6 offers great
quality](https://www.zyte.com/blog/llm-benchmark-claude-sonnet-46/).

## Web Scraping Copilot user interface

### Sidebar

The Web Scraping Copilot sidebar provides access to the main features of the
extension, organized in the following views: Items, Page Objects, Spiders, Zyte API,
Extension Status, and Feedback.

> The **Items** view provides details and access to all your Scrapy
> items.
>
> Specifically, it shows items:
>
> - Defined in your `items.py` file.
> - Declared as output items by your page objects.
> - Provided by the installed version of zyte-common-items.

> The **Page Objects** view provides details and access to all your
> web-poet page objects.
>
> It also makes it easy to use AI to add new page objects or to update
> some or all fields of existing page objects. See the Web Scraping Copilot
> tutorial.
>
> It also helps you run, view, manage, and debug tests for your page objects.

> The **Spiders** view provides access to all your Scrapy spiders, and helps you to run them locally.
>
> It also makes it easy to deploy them to Scrapy Cloud and monitor their jobs.

> The **Zyte API** view helps you set up scrapy-zyte-api to use Zyte API.

> The **Extension Status** view helps you set up the requirements of Web Scraping Copilot, and check if they are
> met.

> The **Feedback** view provides a link to [our issue tracker](https://github.com/zytedata/web-scraping-copilot/issues).
>
> It may sometimes also include links to surveys or other feedback
> channels.

## Installing the Zyte CA certificate

On some operating systems or web browsers, using Zyte services like Zyte
API may require installing the Zyte CA certificate.

You can tell that this applies to you when every attempt to use a given Zyte
service results in an error about SSL certificate verification.

To install the Zyte CA certificate, get it and follow the
instructions below for your operating system or web browser.

### Get the Zyte CA certificate

Download the certificate from [here](https://docs.zyte.com/_static/zyte-ca.crt).

### Operating systems

#### Windows 10

1. Press `Win key + R` and enter `mmc` in the Run dialog to open the Microsoft Management Console window.
2. Click `File` and select `Add/Remove Snap-ins`.
3. In the opened window select `Certificates` and press the `Add >` button.
4. In the Certificates Snap-in window select `Computer account > Local computer`, and press the `Finish` button to close the window.
5. Press the `OK` button in the Add or Remove Snap-in window.
6. Back in the Microsoft Management Console window, select `Certificates` under Console Root and right-click `Trusted Root Certification Authorities`.
7. From the context menu select `All Tasks > Import` to open the Certificate Import Wizard window from which you can add the Zyte CA certificate.

More details can be found [here](https://windowsreport.com/install-windows-10-root-certificates/).

#### macOS

1. Install Python certificates:
   ```bash
   /Applications/Python\ 3.x/Install\ Certificates.command
   ```

   > ###### NOTE
   >
   > Replace `3.x` with your Python version.
2. Install the Zyte CA certificate:
   1. Open the Keychain Access window (`Launchpad > Other > Keychain Access`).
   2. Select the `System` tab under Keychains, and drag and drop the
      downloaded certificate file (or select File > `Import Items...` and
      navigate to the file).
   3. Enter the administrator password to modify the keychain.
   4. Double-click the `Crawlera CA` certificate entry, expand Trust and,
      next to When using this certificate, select `Always Trust`.
   5. Close the window and enter the administrator password again to update
      the settings.

#### Linux

1. Install the downloaded Zyte CA certificate file:
   ```bash
   sudo cp zyte-ca.crt /usr/local/share/ca-certificates/zyte-ca.crt
   ```
2. Update stored Certificate Authority files:
   ```bash
   sudo update-ca-certificates
   ```

### Web browsers

#### Firefox

1. Open Preferences, go to the Privacy & Security tab, scroll down to the Certificates section, and click the View Certificates… button to open the Certificate Manager.
2. Under the Authorities tab, click the Import… button and navigate to the certificate file.
3. In the window that opens (You have been asked to trust a new Certificate Authority (CA)), check the first option, Trust this CA to identify websites, and click the OK button to finish importing the certificate.
4. Click the OK button to save the settings and exit the Certificate Manager.

#### Chrome

1. Click the triple-dot icon in the top right corner and choose `Settings`.
2. Scroll to the `Privacy and security` section and click on `Security`.
3. Scroll down to find and click `Manage Certificates`.
4. The next steps depend on the operating system:
   - On macOS, the previous action opens `Keychain Access`. See the macOS instructions above.
   - On Windows, the Certificates application appears. Select the `Trusted Root Certification Authorities` tab, click the `Import...` button, and navigate to the certificate file. Verify that the import was successful and that the installed certificate is displayed under the Trusted Root Certification Authorities tab, then close the Certificates window.

### Tech stacks

#### Node.js

Point the `NODE_EXTRA_CA_CERTS` environment variable to the Zyte CA
certificate.
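As a minimal sketch, assuming that the certificate was saved as `zyte-ca.crt`
in your home directory and that your script is called `my-script.js` (both are
hypothetical names), you could set the variable in your shell before starting
Node.js:

```bash
# Assumed location of the downloaded Zyte CA certificate; adjust as needed.
export NODE_EXTRA_CA_CERTS="$HOME/zyte-ca.crt"

# Node.js reads this variable when the process starts, so launch your
# script from the same shell session (my-script.js is a placeholder):
# node my-script.js
```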

#### Python

To use [requests](https://requests.readthedocs.io/en/latest/), build a CA bundle and point the
`REQUESTS_CA_BUNDLE` environment variable to it.
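For example, assuming that the bundle was saved as `ca-bundle-with-zyte.pem`
in your home directory and that your script is called `my_script.py` (both are
hypothetical names), a shell session could look like this:

```bash
# Assumed location of a CA bundle that includes the Zyte CA certificate.
export REQUESTS_CA_BUNDLE="$HOME/ca-bundle-with-zyte.pem"

# requests picks the variable up from the environment, so run your script
# from the same shell session (my_script.py is a placeholder):
# python my_script.py
```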

### Alternative files

#### CA bundle

Sometimes you cannot specify an *extra* certificate, like the Zyte CA
certificate, and instead you must specify a CA certificate *bundle* that
*includes* the Zyte CA certificate.

To generate such a CA certificate bundle:

1. Get a generic CA certificate bundle in PEM format, e.g. [curl’s](https://curl.se/ca/cacert.pem).
2. Append the contents of the Zyte CA certificate to the end
   of the generic CA certificate bundle file.

You can then use the resulting file as a CA certificate bundle that supports
Zyte domains.
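The two steps above could be scripted roughly as follows. This is only a
sketch: `build_ca_bundle` is a hypothetical helper name, and it assumes that
both input files are in PEM format and already exist on disk.

```bash
# Hypothetical helper: concatenate a generic CA bundle and the Zyte CA
# certificate into a new bundle file.
#   $1: generic CA bundle in PEM format (for example, curl's cacert.pem)
#   $2: Zyte CA certificate (assumed to be PEM-encoded)
#   $3: path of the bundle file to create
build_ca_bundle() {
    cat "$1" "$2" > "$3"
}
```

With the files in the current directory, you might call it as
`build_ca_bundle cacert.pem zyte-ca.crt ca-bundle-with-zyte.pem`, where the
file names are placeholders.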

#### PKCS#12

If you need the certificate in PKCS#12 format, you can generate it with the following OpenSSL command:

```bash
openssl pkcs12 -export -nokeys -password pass: -in zyte-ca.crt -out zyte-ca.p12
```