Migrating from browser automation to Zyte API#

Learn how to migrate from browser automation tools, like Playwright, Puppeteer, Selenium, or Splash, to Zyte API.

Feature comparison#

The following table summarizes the feature differences between Zyte API and browser automation tools:

Feature                        Zyte API   Browser automation
API                            HTTP       Varies
Generic interaction            Good       Better
Website-specific interaction   Basic      None
Avoid bans                     Yes        Hard
Scalable                       Yes        Hard

Avoiding bans#

A browser automation tool on its own can rarely avoid being banned from a website. You need to combine browser automation with a proxy management solution like Smart Proxy Manager.
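For example, here is a minimal sketch of routing Playwright (Python) through a proxy endpoint; the server address and credentials are placeholders, and the exact values depend on your proxy solution:

from playwright.sync_api import sync_playwright

with sync_playwright() as playwright:
    # Route all browser traffic through the proxy endpoint.
    browser = playwright.chromium.launch(
        proxy={
            "server": "http://proxy.example.com:8011",
            "username": "YOUR_PROXY_USERNAME",
            "password": "YOUR_PROXY_PASSWORD",
        }
    )
    page = browser.new_page()
    page.goto("https://toscrape.com")
    browser.close()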

An increasing number of automated ban solutions can tell a browser automation tool apart from a regular browser and ban it, forcing you to tweak your browser automation setup further, sometimes with website-specific changes.

You also need to make sure that your browser automation tool and your proxy management solution play well with each other: if the way your browser automation tool runs JavaScript and the way your proxy management solution formats request metadata do not match a single, consistent browser profile, you can get banned.

Scalability#

Browser automation has a very negative impact on crawl speed: rendering a page in a browser takes far longer than a plain HTTP request.

To crawl at a reasonable speed, you need to handle many requests in parallel, managing your browser automation instances and their resource usage, which can be hard to do at scale.
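Zyte API runs the browsers on its side, so parallelism on your end reduces to sending concurrent HTTP requests. A minimal sketch, reusing the AsyncClient from the python-zyte-api examples below, with an illustrative list of URLs:

import asyncio

from zyte_api.aio.client import AsyncClient


async def main():
    client = AsyncClient()
    urls = ["https://toscrape.com", "https://quotes.toscrape.com"]
    # Fan out the requests; there are no local browser instances to manage.
    api_responses = await asyncio.gather(
        *(client.request_raw({"url": url, "browserHtml": True}) for url in urls)
    )
    for api_response in api_responses:
        browser_html: str = api_response["browserHtml"]


asyncio.run(main())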

Migration examples#

Each of the following sections shows an example of a common browser automation feature implemented with a browser automation tool, followed by an example of the same feature implemented with Zyte API.

The browser automation tool examples feature the following software choices:

- Playwright
- Puppeteer
- scrapy-playwright (Playwright for Scrapy)
- scrapy-splash (Splash for Scrapy)
- Selenium
- Splash (HTTP API)

For the software choices of Zyte API examples, see Zyte API code examples.

Getting browser HTML#

Playwright:

const playwright = require('playwright')

async function main () {
  const browser = await playwright.chromium.launch()
  const page = await browser.newPage()
  await page.goto('https://toscrape.com')
  const browserHtml = await page.content()
  await browser.close()
}

main()

Puppeteer:

const puppeteer = require('puppeteer')

async function main () {
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.goto('https://toscrape.com')
  const browserHtml = await page.content()
  await browser.close()
}

main()

scrapy-playwright:

from scrapy import Request, Spider


class ToScrapeSpider(Spider):
    name = "toscrape_com"

    def start_requests(self):
        yield Request(
            "https://toscrape.com",
            meta={"playwright": True},
        )

    def parse(self, response):
        browser_html: str = response.text

scrapy-splash:

from scrapy import Spider
from scrapy_splash import SplashRequest


class ToScrapeSpider(Spider):
    name = "toscrape_com"

    def start_requests(self):
        yield SplashRequest("https://toscrape.com")

    def parse(self, response):
        browser_html: str = response.text

Selenium:

from selenium import webdriver

driver = webdriver.Firefox()
driver.get("https://toscrape.com")
browser_html = driver.page_source
driver.close()

Splash:

from urllib.parse import quote

import requests

splash_url = "YOUR_SPLASH_URL"
url = "https://toscrape.com"
response = requests.get(f"{splash_url}/render.html?url={quote(url)}")
browser_html: str = response.content.decode()

Note

Install and configure code example requirements to run the examples below.

C#:

using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://toscrape.com"},
    {"browserHtml", true}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var browserHtml = data.RootElement.GetProperty("browserHtml").ToString();

curl:

input.json#
{"url": "https://toscrape.com", "browserHtml": true}
curl \
    --user YOUR_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
| jq --raw-output .browserHtml

zyte-api CLI:

input.jsonl#
{"url": "https://toscrape.com", "browserHtml": true}
zyte-api input.jsonl 2> /dev/null \
| jq --raw-output .browserHtml

Java:

import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.CloseableHttpResponse;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of("url", "https://toscrape.com", "browserHtml", true);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    try (CloseableHttpClient client = HttpClients.createDefault()) {
      try (CloseableHttpResponse response = client.execute(request)) {
        HttpEntity entity = response.getEntity();
        String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
        JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
        String browserHtml = jsonObject.get("browserHtml").getAsString();
      }
    }
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}

JavaScript:

const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com',
    browserHtml: true
  },
  {
    auth: { username: 'YOUR_API_KEY' }
  }
).then((response) => {
  const browserHtml = response.data.browserHtml
})

PHP:

<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com',
        'browserHtml' => true,
    ],
]);
$api = json_decode($response->getBody());
$browser_html = $api->browserHtml;

Python:

import requests

api_response = requests.post(
    'https://api.zyte.com/v1/extract',
    auth=('YOUR_API_KEY', ''),
    json={
        'url': 'https://toscrape.com',
        'browserHtml': True,
    },
)
browser_html: str = api_response.json()['browserHtml']

python-zyte-api:

import asyncio

from zyte_api.aio.client import AsyncClient

async def main():
    client = AsyncClient()
    api_response = await client.request_raw(
        {
            'url': 'https://toscrape.com',
            'browserHtml': True,
        }
    )
    browser_html: str = api_response['browserHtml']

asyncio.run(main())

scrapy-zyte-api:

from scrapy import Request, Spider


class ToScrapeSpider(Spider):
    name = "toscrape_com"

    def start_requests(self):
        yield Request(
            "https://toscrape.com",
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                },
            },
        )

    def parse(self, response):
        browser_html: str = response.text

See Browser HTML.

Taking a screenshot#

Playwright:

const playwright = require('playwright')

async function main () {
  const browser = await playwright.chromium.launch()
  const context = await browser.newContext({ viewport: { width: 1920, height: 1080 } })
  const page = await context.newPage()
  await page.goto('https://toscrape.com')
  const screenshot = await page.screenshot({ type: 'jpeg' })
  await browser.close()
}

main()

Puppeteer:

const puppeteer = require('puppeteer')

async function main () {
  const browser = await puppeteer.launch({ defaultViewport: { width: 1920, height: 1080 } })
  const page = await browser.newPage()
  await page.goto('https://toscrape.com')
  const screenshot = await page.screenshot({ type: 'jpeg' })
  await browser.close()
}

main()

scrapy-playwright:

from scrapy import Request, Spider
from scrapy_playwright.page import PageMethod


class ToScrapeSpider(Spider):
    name = "toscrape_com"

    def start_requests(self):
        yield Request(
            "https://toscrape.com",
            meta={
                "playwright": True,
                "playwright_context": "new",
                "playwright_context_kwargs": {
                    "viewport": {"width": 1920, "height": 1080},
                },
                "playwright_page_methods": [
                    PageMethod("screenshot", type="jpeg"),
                ],
            },
        )

    def parse(self, response):
        screenshot: bytes = response.meta["playwright_page_methods"][0].result

scrapy-splash:

from scrapy import Spider
from scrapy_splash import SplashRequest


class ToScrapeSpider(Spider):
    name = "toscrape_com"

    def start_requests(self):
        yield SplashRequest(
            "https://toscrape.com",
            endpoint="render.jpeg",
            args={
                "viewport": "1920x1080",
            },
        )

    def parse(self, response):
        screenshot: bytes = response.body

Selenium:

from io import BytesIO
from tempfile import NamedTemporaryFile

from PIL import Image
from selenium import webdriver

# https://stackoverflow.com/a/37183295
def set_viewport_size(driver, width, height):
    window_size = driver.execute_script(
        """
        return [window.outerWidth - window.innerWidth + arguments[0],
          window.outerHeight - window.innerHeight + arguments[1]];
        """,
        width,
        height,
    )
    driver.set_window_size(*window_size)


def get_jpeg_screenshot(driver):
    f = NamedTemporaryFile(suffix=".png")
    driver.save_screenshot(f.name)
    f.seek(0)
    image = Image.open(f)
    rgb_image = image.convert("RGB")
    image_io = BytesIO()
    rgb_image.save(image_io, format="JPEG")
    return image_io.getvalue()


driver = webdriver.Firefox()
set_viewport_size(driver, 1920, 1080)
driver.get("https://toscrape.com")
screenshot = get_jpeg_screenshot(driver)
driver.close()

Splash:

from urllib.parse import quote

import requests

splash_url = "YOUR_SPLASH_URL"
url = "https://toscrape.com"
response = requests.get(f"{splash_url}/render.jpeg?url={quote(url)}&viewport=1920x1080")
screenshot: bytes = response.content

Note

Install and configure code example requirements to run the examples below.

C#:

using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://toscrape.com"},
    {"screenshot", true}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64Screenshot = data.RootElement.GetProperty("screenshot").ToString();
var screenshot = System.Convert.FromBase64String(base64Screenshot);

curl:

input.json#
{"url": "https://toscrape.com", "screenshot": true}
curl \
    --user YOUR_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
| jq --raw-output .screenshot \
| base64 --decode \
> screenshot.jpg

zyte-api CLI:

input.jsonl#
{"url": "https://toscrape.com", "screenshot": true}
zyte-api input.jsonl 2> /dev/null \
| jq --raw-output .screenshot \
| base64 --decode \
> screenshot.jpg

Java:

import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.CloseableHttpResponse;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of("url", "https://toscrape.com", "screenshot", true);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    try (CloseableHttpClient client = HttpClients.createDefault()) {
      try (CloseableHttpResponse response = client.execute(request)) {
        HttpEntity entity = response.getEntity();
        String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
        JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
        String base64Screenshot = jsonObject.get("screenshot").getAsString();
        byte[] screenshot = Base64.getDecoder().decode(base64Screenshot);
      }
    }
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}

JavaScript:

const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com',
    screenshot: true
  },
  {
    auth: { username: 'YOUR_API_KEY' }
  }
).then((response) => {
  const screenshot = Buffer.from(response.data.screenshot, 'base64')
})

PHP:

<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com',
        'screenshot' => true,
    ],
]);
$api = json_decode($response->getBody());
$screenshot = base64_decode($api->screenshot);

Python:

from base64 import b64decode

import requests

api_response = requests.post(
    'https://api.zyte.com/v1/extract',
    auth=('YOUR_API_KEY', ''),
    json={
        'url': 'https://toscrape.com',
        'screenshot': True,
    },
)
screenshot: bytes = b64decode(api_response.json()['screenshot'])

python-zyte-api:

import asyncio
from base64 import b64decode

from zyte_api.aio.client import AsyncClient

async def main():
    client = AsyncClient()
    api_response = await client.request_raw(
        {
            'url': 'https://toscrape.com',
            'screenshot': True,
        }
    )
    screenshot: bytes = b64decode(api_response['screenshot'])

asyncio.run(main())

scrapy-zyte-api:

from base64 import b64decode

from scrapy import Request, Spider


class ToScrapeComSpider(Spider):
    name = "toscrape_com"

    def start_requests(self):
        yield Request(
            "https://toscrape.com",
            meta={
                "zyte_api_automap": {
                    "screenshot": True,
                },
            },
        )

    def parse(self, response):
        screenshot: bytes = b64decode(response.raw_api_response["screenshot"])

See Screenshot.

Consuming scroll-based pagination#

Playwright:

const cheerio = require('cheerio')
const playwright = require('playwright')

async function main () {
  const browser = await playwright.chromium.launch()
  const page = await browser.newPage()
  await page.goto('https://quotes.toscrape.com/scroll')
  await page.evaluate(async () => {
    const scrollInterval = setInterval(
      function () {
        const scrollingElement = (document.scrollingElement || document.body)
        scrollingElement.scrollTop = scrollingElement.scrollHeight
      },
      100
    )
    let previousHeight = null
    while (true) {
      const currentHeight = window.innerHeight + window.scrollY
      if (!previousHeight) {
        previousHeight = currentHeight
        await new Promise(resolve => setTimeout(resolve, 500))
      } else if (previousHeight === currentHeight) {
        clearInterval(scrollInterval)
        break
      } else {
        previousHeight = currentHeight
        await new Promise(resolve => setTimeout(resolve, 500))
      }
    }
  })
  const $ = cheerio.load(await page.content())
  const quoteCount = $('.quote').length
  await browser.close()
}

main()

Puppeteer:

const cheerio = require('cheerio')
const puppeteer = require('puppeteer')

async function main () {
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.goto('https://quotes.toscrape.com/scroll')
  await page.evaluate(async () => {
    const scrollInterval = setInterval(
      function () {
        const scrollingElement = (document.scrollingElement || document.body)
        scrollingElement.scrollTop = scrollingElement.scrollHeight
      },
      100
    )
    let previousHeight = null
    while (true) {
      const currentHeight = window.innerHeight + window.scrollY
      if (!previousHeight) {
        previousHeight = currentHeight
        await new Promise(resolve => setTimeout(resolve, 500))
      } else if (previousHeight === currentHeight) {
        clearInterval(scrollInterval)
        break
      } else {
        previousHeight = currentHeight
        await new Promise(resolve => setTimeout(resolve, 500))
      }
    }
  })
  const $ = cheerio.load(await page.content())
  const quoteCount = $('.quote').length
  await browser.close()
}

main()

scrapy-playwright:

from asyncio import sleep

from scrapy import Request, Spider


class QuotesToScrapeComSpider(Spider):
    name = "quotes_toscrape_com"

    def start_requests(self):
        yield Request(
            "https://quotes.toscrape.com/scroll",
            meta={
                "playwright": True,
                "playwright_include_page": True,
            },
        )

    # Based on https://stackoverflow.com/a/69193325
    async def scroll_to_bottom(self, page):
        await page.evaluate(
            """
            var scrollInterval = setInterval(
                function () {
                    var scrollingElement = (document.scrollingElement || document.body);
                    scrollingElement.scrollTop = scrollingElement.scrollHeight;
                },
                100
            );
            """
        )
        previous_height = None
        while True:
            current_height = await page.evaluate(
                "(window.innerHeight + window.scrollY)"
            )
            if not previous_height:
                previous_height = current_height
                await sleep(0.5)
            elif previous_height == current_height:
                await page.evaluate("clearInterval(scrollInterval)")
                break
            else:
                previous_height = current_height
                await sleep(0.5)

    async def parse(self, response):
        page = response.meta["playwright_page"]
        await self.scroll_to_bottom(page)
        body = await page.content()
        response = response.replace(body=body)
        quote_count = len(response.css(".quote"))
        await page.close()

scrapy-splash:

from scrapy import Spider
from scrapy_splash import SplashRequest

# Based on https://stackoverflow.com/a/40366442
SCROLL_TO_BOTTOM_LUA = """
function main(splash)
    local num_scrolls = 10
    local scroll_delay = 0.1

    local scroll_to = splash:jsfunc("window.scrollTo")
    local get_body_height = splash:jsfunc(
        "function() {return document.body.scrollHeight;}"
    )
    assert(splash:go(splash.args.url))

    for _ = 1, num_scrolls do
        scroll_to(0, get_body_height())
        splash:wait(scroll_delay)
    end
    return splash:html()
end
"""


class QuotesToScrapeComSpider(Spider):
    name = "quotes_toscrape_com"

    def start_requests(self):
        yield SplashRequest(
            "https://quotes.toscrape.com/scroll",
            endpoint="execute",
            args={"lua_source": SCROLL_TO_BOTTOM_LUA},
        )

    def parse(self, response):
        quote_count = len(response.css(".quote"))

Selenium:

from time import sleep

from parsel import Selector
from selenium import webdriver

# Based on https://stackoverflow.com/a/69193325
def scroll_to_bottom(driver):
    driver.execute_script(
        """
        // Store the interval on window so a later execute_script call can clear it.
        window.scrollInterval = setInterval(
            function () {
                var scrollingElement = (document.scrollingElement || document.body);
                scrollingElement.scrollTop = scrollingElement.scrollHeight;
            },
            100
        );
        """
    )
    previous_height = None
    while True:
        current_height = driver.execute_script(
            "return window.innerHeight + window.scrollY"
        )
        if not previous_height:
            previous_height = current_height
            sleep(0.5)
        elif previous_height == current_height:
            driver.execute_script("clearInterval(window.scrollInterval)")
            break
        else:
            previous_height = current_height
            sleep(0.5)


driver = webdriver.Firefox()
driver.get("https://quotes.toscrape.com/scroll")
scroll_to_bottom(driver)
selector = Selector(driver.page_source)
quote_count = len(selector.css(".quote"))
driver.close()

Splash:

from urllib.parse import quote

import requests
from parsel import Selector

# Based on https://stackoverflow.com/a/40366442
SCROLL_TO_BOTTOM_LUA = """
function main(splash)
    local num_scrolls = 10
    local scroll_delay = 0.1

    local scroll_to = splash:jsfunc("window.scrollTo")
    local get_body_height = splash:jsfunc(
        "function() {return document.body.scrollHeight;}"
    )
    assert(splash:go(splash.args.url))

    for _ = 1, num_scrolls do
        scroll_to(0, get_body_height())
        splash:wait(scroll_delay)
    end
    return splash:html()
end
"""

splash_url = "YOUR_SPLASH_URL"
url = "https://quotes.toscrape.com/scroll"
response = requests.get(
    f"{splash_url}/execute?url={quote(url)}&lua_source={quote(SCROLL_TO_BOTTOM_LUA)}"
)
selector = Selector(text=response.content.decode())
quote_count = len(selector.css(".quote"))

Note

Install and configure code example requirements to run the examples below.

C#:

using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using HtmlAgilityPack;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://quotes.toscrape.com/scroll"},
    {"browserHtml", true},
    {
        "actions",
        new List<Dictionary<string, object>>()
        {
            new Dictionary<string, object>()
            {
                {"action", "scrollBottom"}
            }
        }
    }
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var browserHtml = data.RootElement.GetProperty("browserHtml").ToString();
var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(browserHtml);
var navigator = htmlDocument.CreateNavigator();
var quoteCount = (double)navigator.Evaluate("count(//*[@class=\"quote\"])");

curl:

input.json#
{
    "url": "https://quotes.toscrape.com/scroll",
    "browserHtml": true,
    "actions": [
        {
            "action": "scrollBottom"
        }
    ]
}
curl \
    --user YOUR_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
| jq --raw-output .browserHtml \
| xmllint --html --xpath 'count(//*[@class="quote"])' - 2> /dev/null

zyte-api CLI:

input.jsonl#
{"url":"https://quotes.toscrape.com/scroll","browserHtml":true,"actions":[{"action":"scrollBottom"}]}
zyte-api input.jsonl 2> /dev/null \
| jq --raw-output .browserHtml \
| xmllint --html --xpath 'count(//*[@class="quote"])' - 2> /dev/null

Java:

import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Collections;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.CloseableHttpResponse;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class Example {

  private static final String API_KEY = "YOUR_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> action = ImmutableMap.of("action", "scrollBottom");
    Map<String, Object> parameters =
        ImmutableMap.of(
            "url",
            "https://quotes.toscrape.com/scroll",
            "browserHtml",
            true,
            "actions",
            Collections.singletonList(action));
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    try (CloseableHttpClient client = HttpClients.createDefault()) {
      try (CloseableHttpResponse response = client.execute(request)) {
        HttpEntity entity = response.getEntity();
        String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
        JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
        String browserHtml = jsonObject.get("browserHtml").getAsString();
        Document document = Jsoup.parse(browserHtml);
        int quoteCount = document.select(".quote").size();
      }
    }
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}

JavaScript:

const axios = require('axios')
const cheerio = require('cheerio')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://quotes.toscrape.com/scroll',
    browserHtml: true,
    actions: [
      {
        action: 'scrollBottom'
      }
    ]
  },
  {
    auth: { username: 'YOUR_API_KEY' }
  }
).then((response) => {
  const browserHtml = response.data.browserHtml
  const $ = cheerio.load(browserHtml)
  const quoteCount = $('.quote').length
})

PHP:

<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://quotes.toscrape.com/scroll',
        'browserHtml' => true,
        'actions' => [
            ['action' => 'scrollBottom'],
        ],
    ],
]);
$data = json_decode($response->getBody());
$doc = new DOMDocument();
$doc->loadHTML($data->browserHtml);
$xpath = new DOMXPath($doc);
$quote_count = $xpath->query("//*[@class='quote']")->count();

Python:

import requests
from parsel import Selector

api_response = requests.post(
    'https://api.zyte.com/v1/extract',
    auth=('YOUR_API_KEY', ''),
    json={
        'url': 'https://quotes.toscrape.com/scroll',
        'browserHtml': True,
        'actions': [
            {
                'action': 'scrollBottom',
            },
        ],
    },
)
browser_html = api_response.json()['browserHtml']
quote_count = len(Selector(browser_html).css('.quote'))

python-zyte-api:

import asyncio

from parsel import Selector
from zyte_api.aio.client import AsyncClient

async def main():
    client = AsyncClient()
    api_response = await client.request_raw(
        {
            'url': 'https://quotes.toscrape.com/scroll',
            'browserHtml': True,
            'actions': [
                {
                    'action': 'scrollBottom',
                },
            ],
        },
    )
    browser_html = api_response['browserHtml']
    quote_count = len(Selector(browser_html).css('.quote'))

asyncio.run(main())

scrapy-zyte-api:

from scrapy import Request, Spider


class QuotesToScrapeComSpider(Spider):
    name = "quotes_toscrape_com"

    def start_requests(self):
        yield Request(
            "https://quotes.toscrape.com/scroll",
            meta={
                "zyte_api_automap": {
                    "browserHtml": True,
                    "actions": [
                        {
                            "action": "scrollBottom",
                        },
                    ],
                },
            },
        )

    def parse(self, response):
        quote_count = len(response.css(".quote"))

See Actions.