Migrating from ScrapingBee to Zyte API#
Learn how to migrate from ScrapingBee to Zyte API.
Feature comparison#
The following table summarizes the feature differences between both products:
| Feature | ScrapingBee | Zyte API |
|---|---|---|
| Client software | Python, NodeJS | |
| Pricing | Fixed plans | Pay as you go; monthly commitment over $100 |
| Ban handling | Manual, may increase costs | Automatic, no extra costs |
| Automatic extraction | Google SERP, custom LLM prompts | Standard schemas including Google SERP, custom LLM prompts, supports crawling |
| Geolocation | 243 countries, no data center support | 249 countries, data center support |
| Sessions | Client-managed only (5m) | Client-managed (15m) and server-managed |
| Browser actions | Basic only (9) | Basic (15), advanced, website-specific and custom |
| Screenshots | Yes, can target an element | Yes, cannot target an element |
| Body size limit | 2 MB | 10 MB |
| Custom headers | Yes | Only in HTTP requests, limited to |
| Ad blocking | Yes | No |
| Resource blocking | Yes | No |
| Custom proxies | Yes | No |
| Server-side CSS/XPath selectors | Yes | No |
| Rate limiting | Concurrency-based | RPM-based |
| Usage API | Yes, up to 6 requests per second | Yes, up to 20 requests per second |
Pricing#
ScrapingBee offers 4 plans with a fixed monthly price, each including a fixed number of “credits” per month that you must spend within that month or lose.
With Zyte API you pay only for what you use, up to a $100 monthly spending limit. If you need a higher spending limit, you must commit to paying half of it every month, and you do not get that commitment back if you spend less during a month.
With ScrapingBee, HTTP requests cost 1 credit each, while browser requests cost 5 credits each. If you need to use residential IPs (“premium proxies”) to avoid bans, costs rise to 10 credits per HTTP request (10×) and 25 credits per browser request (5×). For scenarios where residential IPs do not avoid bans either, ScrapingBee offers special “stealth” proxies for browser requests at 75 credits per request (15×). ScrapingBee also charges 20 credits when targeting Google domains.
With Zyte API, request cost varies depending not only on the type of request (HTTP or browser), but also on the tier of the target website, which covers the cost of any tech that Zyte API may use to get you a ban-free response, including browser rendering and residential IPs. Google domains cost nothing extra, not even for automatic extraction of SERP data (serp).
Zyte API tends to be the cheaper choice unless all of the following apply: you never use premium or stealth proxies, you mostly target high-tier websites, and the number of credits that you need per month is close to what one of ScrapingBee's plans includes.
For example, the $49 ScrapingBee plan includes 150k credits, i.e. 150k HTTP requests. For tier 1-2 websites (i.e. most websites), Zyte API is cheaper. Zyte API can also be cheaper for higher-tier websites if you need fewer than 150k requests: 114k requests for tier 3, 70k requests for tier 4, and 39k requests for tier 5.
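To make the credit arithmetic concrete, the following minimal sketch computes how many requests of each kind a monthly credit budget covers, using only the per-request credit costs and the 150k-credit plan quoted above. The dictionary keys and the function name are illustrative, not part of either API.

```python
# Credits charged per request, as quoted in the text above.
CREDITS_PER_REQUEST = {
    "http": 1,
    "browser": 5,
    "http_premium": 10,
    "browser_premium": 25,
    "browser_stealth": 75,
    "google": 20,
}


def requests_per_plan(plan_credits: int) -> dict[str, int]:
    """Return how many requests of each kind a monthly credit budget covers."""
    return {
        kind: plan_credits // cost
        for kind, cost in CREDITS_PER_REQUEST.items()
    }


if __name__ == "__main__":
    # The $49 plan includes 150k credits.
    for kind, count in requests_per_plan(150_000).items():
        print(f"{kind}: {count:,} requests")
```

With 150k credits, the same plan that covers 150,000 plain HTTP requests covers only 2,000 stealth browser requests.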
Ban handling#
ScrapingBee makes it your responsibility to choose the right technologies (browser rendering, residential IPs, “stealth IPs”) to avoid bans, with the corresponding cost increase.
Zyte API transparently chooses the leanest technology that works, at no extra cost, and automatically adapts to website changes.
Automatic extraction#
ScrapingBee supports automatic extraction through user-defined LLM prompts.
Zyte API supports automatic extraction for a set of supported data types, plus user-defined LLM prompts to extract additional fields. It also supports automatic crawling.
Both ScrapingBee and Zyte API support Google SERP extraction (serp).
Rate limiting#
ScrapingBee limits the number of concurrent requests that you can send, starting at 5 with the most basic plan.
Zyte API limits the number of requests per minute (RPM) that you can send. It is 500 by default for all Zyte API keys, but you can get a higher limit for free.
Services like these support advanced features, such as browser rendering and automatic extraction, that usually increase response times. RPM-based rate limiting places no cap on concurrency, so you can maintain your throughput regardless of which features you use, while concurrency-based limits slow down your crawls as you use features that make requests slower.
For example, assuming an HTTP request takes 2 seconds and a browser request takes 20 seconds, switching from HTTP requests to browser requests with ScrapingBee would make your crawl 10 times slower, while Zyte API would allow you to maintain a similar crawl speed by using more concurrent requests to make up for the response time increase.
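The arithmetic behind this example can be sketched as follows. The concurrency limit of 5 and the 500 RPM limit are the figures quoted above, and the 2 s and 20 s response times are the same assumed values as in the example. The function names are illustrative.

```python
def rpm_under_concurrency_limit(concurrency: int, response_seconds: float) -> float:
    """Requests per minute achievable with at most `concurrency` requests in flight."""
    return concurrency * 60 / response_seconds


def concurrency_needed_for_rpm(rpm: float, response_seconds: float) -> float:
    """Average number of in-flight requests needed to sustain `rpm`."""
    return rpm * response_seconds / 60


if __name__ == "__main__":
    for label, seconds in (("HTTP", 2), ("browser", 20)):
        limited = rpm_under_concurrency_limit(5, seconds)
        needed = concurrency_needed_for_rpm(500, seconds)
        print(f"{label}: {limited:.0f} RPM at concurrency 5; "
              f"{needed:.0f} concurrent requests to sustain 500 RPM")
```

Under these assumptions, a concurrency limit of 5 yields 150 requests per minute for HTTP requests but only 15 for browser requests, while sustaining 500 RPM takes roughly 17 concurrent HTTP requests or roughly 167 concurrent browser requests.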
Migrating#
The main differences between the HTTP APIs of ScrapingBee and Zyte API are how request parameters are defined and how the response is encoded.
In ScrapingBee, you send a GET request and specify parameters in the URL query string, URL-encoded, e.g.
curl "https://app.scrapingbee.com/api/v1/?api_key=YOUR_API_KEY&url=https%3A%2F%2Ftoscrape.com"
The API response body comes straight from the target website:
<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Scraping Sandbox</title>
…
HTTP response headers and cookies from the target website are also received as regular headers and cookies, only prefixed with Spb-.
Spb-Content-Encoding: br
Spb-Content-Type: text/html
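When migrating code that reads these prefixed headers, it can help to see what recovering the original header names looks like. The following is a minimal sketch, assuming the response headers are available as a plain name-to-value mapping; the function name is hypothetical.

```python
def target_headers(response_headers: dict[str, str], prefix: str = "Spb-") -> dict[str, str]:
    """Return only the target website's headers, with the ScrapingBee prefix removed."""
    return {
        name[len(prefix):]: value
        for name, value in response_headers.items()
        # HTTP header names are case-insensitive, so compare in lowercase.
        if name.lower().startswith(prefix.lower())
    }


if __name__ == "__main__":
    headers = {
        "Spb-Content-Encoding": "br",
        "Spb-Content-Type": "text/html",
        "Date": "Thu, 01 Jan 1970 00:00:00 GMT",  # unprefixed header, dropped
    }
    print(target_headers(headers))
```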
Tip
Like ScrapingBee, Zyte API offers a proxy mode that you can use instead of the HTTP API if it makes things simpler.
In Zyte API, you send a POST request and specify parameters in the request body as JSON, e.g.
curl \
--user YOUR_API_KEY: \
--header 'Content-Type: application/json' \
--data '{"url": "https://toscrape.com", "httpResponseBody": true, "httpResponseHeaders": true}' \
--compressed \
https://api.zyte.com/v1/extract
The API response is a JSON object with all the response data from the target website:
{
    "url": "https://toscrape.com/",
    "statusCode": 200,
    "httpResponseBody": "PCFET0NUWVBFIGh0bWw+CjxodG1sIGxhbmc9ImVuIj4KICAgIDx…",
    "httpResponseHeaders": [
        {
            "name": "content-type",
            "value": "text/html"
        },
        {
            "name": "content-encoding",
            "value": "br"
        }
    ]
}
Note
httpResponseBody is base64-encoded to support binary responses, like images or PDF files.
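The request and the decoding step can be sketched in Python using only the standard library. This is a minimal sketch rather than an official client: the function names are hypothetical, the endpoint and request fields are those of the curl example above, and the Basic authentication header mirrors curl's `--user YOUR_API_KEY:` (API key as user name, empty password).

```python
import base64
import json
import urllib.request

API_URL = "https://api.zyte.com/v1/extract"


def build_request(url: str, api_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl example above."""
    payload = {"url": url, "httpResponseBody": True, "httpResponseHeaders": True}
    # `--user YOUR_API_KEY:` is HTTP Basic auth with an empty password.
    token = base64.b64encode(f"{api_key}:".encode()).decode()
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def decode_body(api_response: dict) -> bytes:
    """Return the target website's raw response body from the JSON response."""
    return base64.b64decode(api_response["httpResponseBody"])


def fetch(url: str, api_key: str) -> bytes:
    """Send the request and return the decoded body (performs a network call)."""
    with urllib.request.urlopen(build_request(url, api_key)) as response:
        return decode_body(json.load(response))
```

`decode_body` returns bytes on purpose: for HTML you would still decode them to text, while images or PDF files can be written to disk as they are.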
Once you understand how to migrate a simple request like the one above, you can migrate any other request the same way, replacing ScrapingBee parameters with Zyte API counterparts.
Parameter mapping#
| ScrapingBee | Zyte API |
|---|---|
| (default) | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | Not supported |
| | Not supported |
| | |
| | |
| | ipType=residential (not required to avoid bans) |
| | geolocation (does not require ipType=residential) |
| | N/A, ban avoidance is a transparent feature |
| | Not supported |
| | |
| | Not supported |
| | |
| | Not supported |
| | |
| | Not supported |
| | |
| | Not supported |
| | |
| | See Network capture |
| | Not supported (use httpResponseBody if you are only using browser rendering to avoid bans) |
| | Not supported |
| | session.id (must be UUID4) |
| | Not supported |
| | |
| | |
| | N/A |
| | N/A, Zyte API returns the response or not based on whether or not it is a ban, not based on the status code |
Action mapping#
ScrapingBee allows defining a sequence of browser actions through the "instructions" JSON array of the js_scenario parameter. For example:
{
    "instructions": [
        {"click": "#buttonId"}
    ]
}
URL-encoded, this becomes:
js_scenario=%7B%22instructions%22%3A+%5B%7B%22click%22%3A+%22%23buttonId%22%7D%5D%7D
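For reference, the encoded string above can be reproduced with Python's standard library. This is only a sketch of the encoding step; the function name is hypothetical.

```python
import json
from urllib.parse import urlencode


def encode_js_scenario(instructions: list[dict]) -> str:
    """Serialize ScrapingBee instructions and URL-encode them as a query parameter."""
    scenario = {"instructions": instructions}
    # json.dumps puts a space after ":" and ",", which urlencode turns into "+".
    return urlencode({"js_scenario": json.dumps(scenario)})


if __name__ == "__main__":
    print(encode_js_scenario([{"click": "#buttonId"}]))
```

With Zyte API this step disappears, since actions travel as regular JSON in the request body.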
The Zyte API equivalent is the actions field. The following is a matching example:
{
    "actions": [
        {
            "action": "click",
            "selector": {
                "type": "css",
                "value": "#buttonId"
            }
        }
    ]
}
ScrapingBee instructions map to Zyte API actions as follows:
click: click
evaluate: evaluate
fill: type
infinite_scroll: scrollBottom
scroll_x: scrollTo
scroll_y: scrollTo
wait: waitForTimeout
wait_for: waitForSelector
wait_for_and_click: waitForSelector, click
The following Zyte API actions have no ScrapingBee counterpart:
doubleClick
goto
hide
hover
keyPress
reload
searchKeyword
select
setLocation
waitForRequest
waitForResponse
Zyte API also supports custom actions.
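The mapping above can be sketched as a small converter. This is a minimal sketch under stated assumptions, not a complete migration tool: it only handles the selector-based instructions, it assumes every selector is a CSS selector, and it assumes that waitForSelector accepts the same selector object that the click example above uses. Other instructions need action-specific parameters that this page does not show, so the sketch rejects them. All function names are hypothetical.

```python
def css(selector: str) -> dict:
    """Wrap a CSS selector in the selector object used by the click example."""
    return {"type": "css", "value": selector}


def to_zyte_actions(js_scenario: dict) -> list[dict]:
    """Convert selector-based ScrapingBee instructions into Zyte API actions."""
    actions = []
    for instruction in js_scenario["instructions"]:
        # Each instruction is a single-key object, e.g. {"click": "#buttonId"}.
        [(name, value)] = instruction.items()
        if name == "click":
            actions.append({"action": "click", "selector": css(value)})
        elif name == "wait_for":
            # Assumption: waitForSelector takes the same "selector" object.
            actions.append({"action": "waitForSelector", "selector": css(value)})
        elif name == "wait_for_and_click":
            # One ScrapingBee instruction becomes two Zyte API actions.
            actions.append({"action": "waitForSelector", "selector": css(value)})
            actions.append({"action": "click", "selector": css(value)})
        else:
            raise NotImplementedError(f"no conversion sketched for {name!r}")
    return actions


if __name__ == "__main__":
    scenario = {"instructions": [{"wait_for_and_click": "#buttonId"}]}
    print({"actions": to_zyte_actions(scenario)})
```

Raising on unknown instructions, rather than skipping them, keeps a partially converted scenario from silently behaving differently after the migration.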