Migrating from ScrapingBee to Zyte API

Learn how to migrate from ScrapingBee to Zyte API.

Feature comparison

The following table summarizes the feature differences between the two products:

| Feature | ScrapingBee | Zyte API |
| --- | --- | --- |
| Client software | Python, NodeJS | Python, Scrapy |
| Pricing | Fixed plans | Pay as you go; monthly commitment over $100 |
| Ban avoidance | Manual, may increase costs | Automatic, no extra costs |
| Automatic extraction | Google SERP, custom LLM prompts | Standard schemas including Google SERP, custom LLM prompts, supports crawling |
| Geolocation | 243 countries, no data center support | 249 countries, data center support |
| Sessions | Client-managed only (5m) | Client-managed (15m) and server-managed |
| Actions | Basic only (9) | Basic (15), advanced, website-specific and custom |
| Screenshots | Yes, can target an element | Yes, cannot target an element |
| Body size limit | 2 MB | 10 MB |
| Custom headers | Yes | Only in HTTP requests, limited to Referer in browser requests, cannot disable ban-avoidance headers |
| Ad blocking | Yes | No |
| Resource blocking | Yes | No |
| Custom proxies | Yes | No |
| Server-side CSS/XPath selectors | Yes | No |
| Rate limiting | Concurrency-based | RPM-based |
| Usage API | Yes, up to 6 requests per second | Yes, up to 20 requests per second |

Pricing

ScrapingBee offers 4 plans with a fixed price per month, each with a fixed number of “credits” per month that you must spend within that month or lose.

With Zyte API, you pay only for what you use, up to a $100 monthly spending limit. If you need a higher spending limit, you must commit to paying half of it as a monthly commitment, which you do not get back if you spend less during a month.

With ScrapingBee, HTTP requests cost 1 credit each, while browser requests cost 5 credits each. If you need to use residential IPs (“premium proxies”) to avoid bans, costs rise to 10 credits per HTTP request (10×) and 25 credits per browser request (5×). For scenarios where residential IPs do not avoid bans either, ScrapingBee offers special “stealth” proxies for browser requests at 75 credits per request (15×). ScrapingBee also charges 20 credits when targeting Google domains.
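The credit multipliers above can be turned into a small calculator. The following is only a sketch based on the figures in the previous paragraph; the function names and the `proxy` labels are illustrative, not part of any ScrapingBee client, and the 20-credit Google case is left out.

```python
def scrapingbee_credits(browser: bool = False, proxy: str = "standard") -> int:
    """Credits charged for one request, using the figures quoted in the text."""
    if proxy == "stealth":
        # The text only quotes stealth proxies for browser requests.
        if not browser:
            raise ValueError("stealth proxies are quoted for browser requests only")
        return 75
    if proxy == "premium":
        return 25 if browser else 10
    if proxy == "standard":
        return 5 if browser else 1
    raise ValueError(f"unknown proxy type: {proxy!r}")


def requests_for_budget(credits: int, credits_per_request: int) -> int:
    """How many requests a credit budget covers at a given per-request cost."""
    return credits // credits_per_request


# Example: how far a 150,000-credit budget goes for each request type.
for browser, proxy in [(False, "standard"), (True, "standard"),
                       (False, "premium"), (True, "premium"), (True, "stealth")]:
    cost = scrapingbee_credits(browser, proxy)
    kind = "browser" if browser else "HTTP"
    print(f"{kind:7} {proxy:8} {cost:2} credits -> "
          f"{requests_for_budget(150_000, cost)} requests")
```

With these figures, the same budget covers 150,000 plain HTTP requests but only 2,000 stealth browser requests.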

With Zyte API, request cost varies depending not only on the type of request (HTTP or browser), but also on the tier of the target website, which covers the cost of any technology that Zyte API may use to get you a ban-free response, including browser rendering and residential IPs. There is no extra cost for Google domains, not even for automatic SERP extraction (serp).

Unless you never use premium or stealth proxies, you mostly target high-tier websites, and the number of credits that you need per month is close to the number included in one of ScrapingBee's plans, Zyte API tends to be the cheaper choice.

For example, the $49 ScrapingBee plan includes 150k credits, i.e. 150k HTTP requests. For tier 1-2 websites (i.e. most websites), Zyte API is cheaper. Zyte API can also be cheaper for higher-tier websites if you need fewer than 150k requests: fewer than 114k requests for tier 3, 70k requests for tier 4, and 39k requests for tier 5.
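The break-even counts above can be checked with simple arithmetic. The sketch below assumes that each break-even count is the number of Zyte API HTTP requests that $49 buys at that tier, so the per-request prices it prints are only figures implied by this paragraph, not published prices. All names are illustrative.

```python
PLAN_PRICE_USD = 49  # the fixed-price plan used in the example above

# Break-even request counts quoted above, from the lowest to the
# highest of the three higher tiers.
BREAK_EVEN_REQUESTS = [114_000, 70_000, 39_000]


def implied_price_per_1000(plan_price: float, break_even: int) -> float:
    """Pay-as-you-go price per 1,000 requests implied by a break-even count."""
    return plan_price / break_even * 1000


def pay_as_you_go_is_cheaper(needed_requests: int, break_even: int) -> bool:
    """True when the needed volume is below the break-even count."""
    return needed_requests < break_even


for break_even in BREAK_EVEN_REQUESTS:
    price = implied_price_per_1000(PLAN_PRICE_USD, break_even)
    print(f"break-even {break_even:>7} requests -> ~${price:.2f} per 1,000")
```

For instance, a crawl of 50,000 requests is below the 70k break-even but above the 39k one, so pay-as-you-go only wins for the former.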

Ban handling

ScrapingBee makes it your responsibility to choose the right technologies (browser rendering, residential IPs, “stealth IPs”) to avoid bans, with the corresponding cost increase.

Zyte API transparently chooses the leanest technology possible, at no extra cost, and automatically adapts to website changes.

Automatic extraction

ScrapingBee supports automatic extraction through user-defined LLM prompts.

Zyte API provides automatic extraction for supported data types, and lets you extract additional fields through user-defined LLM prompts. It also supports automatic crawling.

Both ScrapingBee and Zyte API support Google SERP extraction (serp).

Rate limiting

ScrapingBee limits the number of concurrent requests that you can send, starting at 5 with the most basic plan.

Zyte API limits the number of requests per minute (RPM) that you can send. It is 500 by default for all Zyte API keys, but you can get a higher limit for free.

Services like these support advanced features, such as browser rendering or automatic extraction, that usually increase response times. RPM-based rate limiting lets you maintain your throughput regardless of which features you use, thanks to unlimited concurrency, while concurrency-based limits slow down your crawls as you use features that make requests slower.

For example, assuming an HTTP request takes 2 seconds and a browser request takes 20 seconds, switching from HTTP requests to browser requests with ScrapingBee would make your crawl 10 times slower, while Zyte API would allow you to maintain a similar crawl speed by using more concurrent requests to make up for the response time increase.
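The arithmetic behind this example can be sketched as follows. The response times (2 and 20 seconds), the concurrency limit of 5 and the 500 RPM limit come from this section; the function names are illustrative.

```python
def rpm_under_concurrency_limit(concurrency: int, response_seconds: float) -> float:
    """Maximum requests per minute when only `concurrency` requests can be in flight."""
    return concurrency * 60 / response_seconds


def concurrency_to_sustain_rpm(rpm: float, response_seconds: float) -> float:
    """Average number of in-flight requests needed to sustain `rpm`."""
    return rpm / 60 * response_seconds


# Concurrency-based limit (5 concurrent requests): throughput drops
# in proportion to the response time.
print(rpm_under_concurrency_limit(5, 2))   # HTTP requests
print(rpm_under_concurrency_limit(5, 20))  # browser requests

# RPM-based limit (500 RPM): throughput stays the same, and only the
# number of concurrent requests that you need goes up.
print(round(concurrency_to_sustain_rpm(500, 2), 1))
print(round(concurrency_to_sustain_rpm(500, 20), 1))
```

Under the concurrency limit, the same crawl goes from 150 to 15 requests per minute, which is the 10× slowdown described above.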

Migrating

The main differences between the HTTP APIs of ScrapingBee and Zyte API are how request parameters are defined and how the response is encoded.

In ScrapingBee, you send a GET request, and you specify parameters in the URL query string, URL-encoded, e.g.

curl "https://app.scrapingbee.com/api/v1/?api_key=YOUR_API_KEY&url=https%3A%2F%2Ftoscrape.com"

The API response body comes straight from the target website:

<!DOCTYPE html>
<html lang="en">
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
        <title>Scraping Sandbox</title>

HTTP response headers and cookies from the target website are also returned as regular headers and cookies, prefixed with Spb-:

Spb-Content-Encoding: br
Spb-Content-Type: text/html
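As a point of reference for the migration, the ScrapingBee side can be sketched in Python with the standard library only. The helper names below are illustrative and no request is actually sent; the sketch only builds the request URL and strips the Spb- prefix from a dictionary of API response headers.

```python
from urllib.parse import urlencode

SCRAPINGBEE_ENDPOINT = "https://app.scrapingbee.com/api/v1/"


def scrapingbee_url(api_key: str, target_url: str, **params: str) -> str:
    """Build the GET URL, URL-encoding every parameter in the query string."""
    query = {"api_key": api_key, "url": target_url, **params}
    return f"{SCRAPINGBEE_ENDPOINT}?{urlencode(query)}"


def target_headers(api_headers: dict) -> dict:
    """Keep only the headers that come from the target website, without the prefix."""
    prefix = "spb-"
    return {
        name[len(prefix):]: value
        for name, value in api_headers.items()
        if name.lower().startswith(prefix)
    }


print(scrapingbee_url("YOUR_API_KEY", "https://toscrape.com"))
print(target_headers({"Spb-Content-Encoding": "br", "Spb-Content-Type": "text/html"}))
```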

Tip

Like ScrapingBee, Zyte API offers a proxy mode that you can use instead of the HTTP API if it makes things simpler.

In Zyte API, you send a POST request, and you specify parameters in the request body as JSON, e.g.

curl \
    --user YOUR_API_KEY: \
    --header 'Content-Type: application/json' \
    --data '{"url": "https://toscrape.com", "httpResponseBody": true, "httpResponseHeaders": true}' \
    --compressed \
    https://api.zyte.com/v1/extract

The API response is a JSON object with all the response data from the target website:

{
    "url": "https://toscrape.com/",
    "statusCode": 200,
    "httpResponseBody": "PCFET0NUWVBFIGh0bWw+CjxodG1sIGxhbmc9ImVuIj4KICAgIDx…",
    "httpResponseHeaders": [
        {
            "name": "content-type",
            "value": "text/html"
        },
        {
            "name": "content-encoding",
            "value": "br"
        }
    ]
}

Note

httpResponseBody is base64-encoded to support binary responses, like images or PDF files.
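The same request and the decoding of its response can be sketched in Python with the standard library only. The helper names are illustrative, and the request is built but not sent here. Basic authentication uses the API key as the user name and an empty password, as in the curl command above.

```python
import base64
import json
import urllib.request

ZYTE_ENDPOINT = "https://api.zyte.com/v1/extract"


def build_zyte_request(api_key: str, url: str) -> urllib.request.Request:
    """Build the POST request: JSON body, API key as basic-auth user name."""
    payload = {"url": url, "httpResponseBody": True, "httpResponseHeaders": True}
    token = base64.b64encode(f"{api_key}:".encode()).decode()
    return urllib.request.Request(
        ZYTE_ENDPOINT,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def decode_zyte_response(response: dict) -> tuple:
    """Return the raw body bytes and a name-to-value dictionary of headers."""
    body = base64.b64decode(response["httpResponseBody"])
    headers = {
        header["name"]: header["value"]
        for header in response.get("httpResponseHeaders", [])
    }
    return body, headers


request = build_zyte_request("YOUR_API_KEY", "https://toscrape.com")
print(request.get_method(), request.full_url)
```

To send the request, you could pass it to `urllib.request.urlopen()` and feed the parsed JSON response to `decode_zyte_response()`. Note that flattening the headers into a dictionary keeps only the last value of any repeated header.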

Once you understand how to migrate a simple request like the one above, you can migrate any other request the same way, replacing ScrapingBee parameters with Zyte API counterparts.

Parameter mapping

| ScrapingBee | Zyte API |
| --- | --- |
| (default) | httpResponseBody, httpResponseHeaders |
| api_key | Use basic authentication |
| url | url |
| render_js | browserHtml |
| js_scenario | See below |
| wait | waitForTimeout action (see below) |
| wait_for | waitForSelector action (see below) |
| wait_browser | waitForNavigation action (see below) |
| block_ads | Not supported |
| block_resources | Not supported |
| window_width | viewport |
| window_height | viewport |
| premium_proxy | ipType=residential (not required to avoid bans) |
| country_code | geolocation (does not require ipType=residential) |
| stealth_proxy | N/A, ban avoidance is a transparent feature |
| own_proxy | Not supported |
| forward_headers | customHttpRequestHeaders, requestHeaders |
| forward_headers_pure | Not supported |
| ai_query | customAttributes |
| ai_selector | Not supported |
| ai_extract_rules | customAttributes |
| extract_rules | Not supported |
| screenshot | screenshot |
| screenshot_selector | Not supported |
| screenshot_full_page | screenshotOptions.fullPage=true |
| json_response | See Network capture |
| return_page_source | Not supported (use httpResponseBody if you are only using browser rendering to avoid bans) |
| scraping_config | Not supported |
| session_id | session.id (must be UUID4) |
| timeout | Not supported |
| cookies | requestCookies |
| device | device |
| custom_google | N/A |
| transparent_status_code | N/A, Zyte API returns the response or not based on whether or not it is a ban, not based on the status code |
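Part of this mapping can be automated. The following is a minimal sketch, not an official tool: it covers only a handful of parameters (url, api_key, render_js, premium_proxy and stealth_proxy), and it assumes that a missing render_js corresponds to the "(default)" case of the mapping. The function and constant names are hypothetical.

```python
# ScrapingBee parameters that the mapping marks as "Not supported".
UNSUPPORTED = frozenset({
    "block_ads", "block_resources", "own_proxy", "forward_headers_pure",
    "ai_selector", "extract_rules", "screenshot_selector",
    "return_page_source", "scraping_config", "timeout",
})


def _enabled(value) -> bool:
    """Accept Python booleans and the "true"/"false" strings of query strings."""
    return value is True or str(value).lower() == "true"


def to_zyte_request(scrapingbee_params: dict) -> dict:
    """Translate a few ScrapingBee parameters into a Zyte API request body."""
    params = dict(scrapingbee_params)
    params.pop("api_key", None)        # sent through basic authentication instead
    params.pop("stealth_proxy", None)  # N/A: ban avoidance is transparent
    unsupported = sorted(UNSUPPORTED.intersection(params))
    if unsupported:
        raise ValueError(f"no Zyte API counterpart for: {', '.join(unsupported)}")

    request = {"url": params.pop("url")}
    if _enabled(params.pop("render_js", False)):
        request["browserHtml"] = True
    else:
        request["httpResponseBody"] = True
        request["httpResponseHeaders"] = True
    if _enabled(params.pop("premium_proxy", False)):
        request["ipType"] = "residential"

    if params:  # parameters with a counterpart that this sketch does not cover
        raise NotImplementedError(f"not handled by this sketch: {sorted(params)}")
    return request


print(to_zyte_request({"api_key": "YOUR_API_KEY", "url": "https://toscrape.com"}))
```

Failing loudly on unsupported or unhandled parameters is a deliberate choice here, so that a migration script does not silently drop behavior that the original request relied on.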

Action mapping

ScrapingBee allows defining a sequence of browser actions through the "instructions" JSON array of the js_scenario parameter. For example:

{
    "instructions": [
        {"click": "#buttonId"}
    ]
}

URL-encoded, this becomes:

js_scenario=%7B%22instructions%22%3A+%5B%7B%22click%22%3A+%22%23buttonId%22%7D%5D%7D

The Zyte API equivalent is the actions field. The following is a matching example:

{
    "actions": [
        {
            "action": "click",
            "selector": {
                "type": "css",
                "value": "#buttonId"
            }
        }
    ]
}

These are ScrapingBee actions and their Zyte API counterparts:

- click: click
- evaluate: evaluate
- fill: type
- infinite_scroll: scrollBottom
- scroll_x: scrollTo
- scroll_y: scrollTo
- wait: waitForTimeout
- wait_for: waitForSelector
- wait_for_and_click: waitForSelector, click

The following Zyte API actions are not supported by ScrapingBee:

- doubleClick
- goto
- hide
- hover
- keyPress
- reload
- searchKeyword
- select
- setLocation
- waitForRequest
- waitForResponse

Zyte API also supports custom actions.
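The conversion from a js_scenario value to an actions list can be sketched for the selector-based instructions. This is an illustration under stated assumptions, not a complete converter: it assumes that every selector is a CSS selector and that waitForSelector takes the same selector object as the click example above, and it raises an error for any other instruction. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode


def _css(selector: str) -> dict:
    """Selector object in the shape used by the click example above."""
    return {"type": "css", "value": selector}


def to_zyte_actions(js_scenario: dict) -> list:
    """Convert selector-based ScrapingBee instructions into Zyte API actions."""
    actions = []
    for instruction in js_scenario["instructions"]:
        for name, value in instruction.items():
            if name == "click":
                actions.append({"action": "click", "selector": _css(value)})
            elif name == "wait_for":
                actions.append({"action": "waitForSelector", "selector": _css(value)})
            elif name == "wait_for_and_click":
                # One ScrapingBee instruction becomes two Zyte API actions.
                actions.append({"action": "waitForSelector", "selector": _css(value)})
                actions.append({"action": "click", "selector": _css(value)})
            else:
                raise NotImplementedError(f"not handled by this sketch: {name}")
    return actions


scenario = {"instructions": [{"click": "#buttonId"}]}

# ScrapingBee side: the scenario travels URL-encoded in the query string.
print(urlencode({"js_scenario": json.dumps(scenario)}))

# Zyte API side: the actions go into the JSON request body.
print(json.dumps({"actions": to_zyte_actions(scenario)}, indent=4))
```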