Migrating from Smart Proxy Manager to Zyte API#

Learn how to migrate from Smart Proxy Manager to Zyte API.

Key differences#

The following table summarizes the feature differences between both products:

Feature              Smart Proxy Manager  Zyte API
API                  Proxy                HTTP or proxy
Ban avoidance        Good                 Great
Residential proxies  Add-on               Automatic
Session management   Client-managed       Server-managed (client-managed planned)
Smart geolocation    No                   Yes
Browser HTML         No                   Yes (HTTP API only, proxy mode support planned)
Screenshots          No                   Yes (HTTP API only)
Browser actions      No                   Yes (HTTP API only)
HTTP redirection     Not followed         Followed
User throttling      Concurrency-based    Request-based

See also Parameter mapping below for some additional, lower-level differences.

Ban avoidance#

Smart Proxy Manager does a good job at avoiding bans through proxy rotation, ban detection, retrying algorithms, and browser mimicking through browser profiles.

Zyte API improves on this by using an actual browser when that is required to prevent bans on a particular website.

Zyte API also supports webpage interaction.

Residential proxies#

Zyte API supports both data center and residential IP addresses, and automatically chooses the right type of IP address as needed.

Session management#

Smart Proxy Manager supports client-managed sessions: you can create and reuse sessions that retain the same IP address and cookies.

Zyte API does not support client-side session management at the moment; it is a planned feature. However:

  • Session contexts enable setting prerequisites for server-managed sessions, addressing some of the scenarios for which Smart Proxy Manager session management is used.

  • Cookie reuse is often enough for session handling. In fact, Zyte API can already do better at session handling than Smart Proxy Manager for websites that require a browser to generate a valid session cookie.

  • For some scenarios, using browser actions can make your code more future-proof, less likely to cause a ban, and able to do in a single request what would otherwise take several.
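The cookie-reuse approach can be sketched as follows. The responseCookies and requestCookies field shapes follow the documented Zyte API cookie parameters, but cookies_for_next_request is our own illustrative helper, not part of any library:

```python
# Sketch: reuse cookies from one Zyte API response in a follow-up request.
# cookies_for_next_request is illustrative, not part of any client library.


def cookies_for_next_request(api_response: dict) -> list:
    """Build a requestCookies list from a response's responseCookies."""
    return [
        {"name": c["name"], "value": c["value"], "domain": c["domain"]}
        for c in api_response.get("responseCookies", [])
    ]


# Mock API response standing in for a real Zyte API result:
mock_api_response = {
    "url": "https://example.com",
    "responseCookies": [
        {"name": "session", "value": "abc123", "domain": "example.com"},
    ],
}
next_request_cookies = cookies_for_next_request(mock_api_response)
# next_request_cookies can now be sent as "requestCookies" in the next
# request to keep using the same session.
```
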

Geolocation#

Both products let you choose which country of origin to use for a request.

However, with Zyte API you usually do not need to manually choose which country of origin to use for each request, because Zyte API automatically chooses the best country of origin based on the target website.

Smart Proxy Manager does support a richer list of countries of origin that you can set manually. However, if you let Zyte API choose the right country of origin, it can use additional countries not available for manual override.

For more information, see Geolocation.

Authentication#

You cannot use your Smart Proxy Manager API key with Zyte API; you need to get a separate API key to use Zyte API.

Proxy mode#

Zyte API offers a proxy mode. It is not as powerful as the HTTP API, but it makes it easier to migrate from Zyte Smart Proxy Manager.

To migrate, update your proxy endpoint and API key. You must also update your proxy headers as indicated below, unless you are using scrapy-zyte-smartproxy 2.3.1 or higher, which translates them automatically.

Warning

The proxy mode is not optimized for use in combination with browser automation tools. Consider using Zyte API’s browser automation features instead. See Migrating from browser automation to Zyte API.

The following example shows a basic request using Smart Proxy Manager:

C#:

using System;
using System.IO;
using System.Net;
using System.Text;

var proxy = new WebProxy("http://proxy.zyte.com:8011", true);
proxy.Credentials = new NetworkCredential("YOUR_API_KEY", "");

var request = (HttpWebRequest)WebRequest.Create("https://toscrape.com");
request.Proxy = proxy;
request.PreAuthenticate = true;
request.AllowAutoRedirect = false;

var response = (HttpWebResponse)request.GetResponse();
var stream = response.GetResponseStream();
var reader = new StreamReader(stream);
var httpResponseBody = reader.ReadToEnd();
reader.Close();
response.Close();

Console.WriteLine(httpResponseBody);

curl:

curl \
    --proxy proxy.zyte.com:8011 \
    --proxy-user YOUR_API_KEY: \
    --compressed \
    https://toscrape.com

JavaScript:

const axios = require('axios')

axios
  .get(
    'https://toscrape.com',
    {
      proxy: {
        protocol: 'http',
        host: 'proxy.zyte.com',
        port: 8011,
        auth: {
          username: 'YOUR_API_KEY',
          password: ''
        }
      }
    }
  )
  .then((response) => {
    const httpResponseBody = response.data
    console.log(httpResponseBody)
  })

PHP:

<?php

$client = new GuzzleHttp\Client();
$response = $client->request('GET', 'https://toscrape.com', [
    'proxy' => 'http://YOUR_API_KEY:@proxy.zyte.com:8011',
]);
$http_response_body = (string) $response->getBody();
fwrite(STDOUT, $http_response_body);

Python:

import requests

response = requests.get(
    "https://toscrape.com",
    proxies={
        scheme: "http://YOUR_API_KEY:@proxy.zyte.com:8011"
        for scheme in ("http", "https")
    },
)
http_response_body: bytes = response.content
print(http_response_body.decode())

After you install and configure scrapy-zyte-smartproxy, you can use Scrapy as usual and all requests will be proxied through Smart Proxy Manager automatically.

from scrapy import Spider


class ToScrapeSpider(Spider):
    name = "toscrape_com"
    start_urls = ["https://toscrape.com"]

    def parse(self, response):
        print(response.text)

And this is an identical request using the proxy mode of Zyte API:

C#:

using System;
using System.IO;
using System.Net;
using System.Text;

var proxy = new WebProxy("http://api.zyte.com:8011", true);
proxy.Credentials = new NetworkCredential("YOUR_API_KEY", "");

var request = (HttpWebRequest)WebRequest.Create("https://toscrape.com");
request.Proxy = proxy;
request.PreAuthenticate = true;
request.AllowAutoRedirect = false;

var response = (HttpWebResponse)request.GetResponse();
var stream = response.GetResponseStream();
var reader = new StreamReader(stream);
var httpResponseBody = reader.ReadToEnd();
reader.Close();
response.Close();

Console.WriteLine(httpResponseBody);

curl:

curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_API_KEY: \
    --compressed \
    https://toscrape.com

JavaScript:

const axios = require('axios')

axios
  .get(
    'https://toscrape.com',
    {
      proxy: {
        protocol: 'http',
        host: 'api.zyte.com',
        port: 8011,
        auth: {
          username: 'YOUR_API_KEY',
          password: ''
        }
      }
    }
  )
  .then((response) => {
    const httpResponseBody = response.data
    console.log(httpResponseBody)
  })

PHP:

<?php

$client = new GuzzleHttp\Client();
$response = $client->request('GET', 'https://toscrape.com', [
    'proxy' => 'http://YOUR_API_KEY:@api.zyte.com:8011',
]);
$http_response_body = (string) $response->getBody();
fwrite(STDOUT, $http_response_body);

Python:

import requests

response = requests.get(
    "https://toscrape.com",
    proxies={
        scheme: "http://YOUR_API_KEY:@api.zyte.com:8011" for scheme in ("http", "https")
    },
)
http_response_body: bytes = response.content
print(http_response_body.decode())

When using scrapy-zyte-smartproxy, set the ZYTE_SMARTPROXY_URL setting to "http://api.zyte.com:8011" and the ZYTE_SMARTPROXY_APIKEY setting to your API key for Zyte API.
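In settings.py, those two settings look as follows (YOUR_API_KEY stands in for your actual Zyte API key):

```python
# settings.py: route scrapy-zyte-smartproxy through the Zyte API proxy mode.
ZYTE_SMARTPROXY_URL = "http://api.zyte.com:8011"
ZYTE_SMARTPROXY_APIKEY = "YOUR_API_KEY"  # Zyte API key, not a Smart Proxy Manager key
```
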

Then you can continue using Scrapy as usual and all requests will be proxied through Zyte API automatically.

from scrapy import Spider


class ToScrapeSpider(Spider):
    name = "toscrape_com"
    start_urls = ["https://toscrape.com"]

    def parse(self, response):
        print(response.text)

HTTP API#

To get access to all Zyte API features, the HTTP API is the way to go.

The main challenge then becomes switching from a proxy API to an HTTP API. Because Zyte API has a wider range of features and can therefore provide richer output, you need JSON parsing, and in some cases base64 decoding, to get your data.
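The parsing and decoding involved can be sketched locally. The mock response below is simplified (a real Zyte API response contains more fields), but the JSON-parse-then-base64-decode steps are the same:

```python
import base64
import json

# Simplified mock of a Zyte API response body; real responses have more fields.
raw_api_response = json.dumps({
    "url": "https://toscrape.com",
    "statusCode": 200,
    "httpResponseBody": base64.b64encode(b"<html>...</html>").decode(),
})

# Step 1: parse the JSON envelope.
data = json.loads(raw_api_response)
# Step 2: base64-decode the body field.
http_response_body = base64.b64decode(data["httpResponseBody"])
print(http_response_body.decode())  # → <html>...</html>
```
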

Tip

If you are using scrapy-zyte-smartproxy (previously scrapy-crawlera), see scrapy-zyte-smartproxy migration below for in-depth migration details.

The following example shows a basic request using Smart Proxy Manager:

C#:

using System;
using System.IO;
using System.Net;
using System.Text;

var proxy = new WebProxy("http://proxy.zyte.com:8011", true);
proxy.Credentials = new NetworkCredential("YOUR_API_KEY", "");

var request = (HttpWebRequest)WebRequest.Create("https://toscrape.com");
request.Proxy = proxy;
request.PreAuthenticate = true;
request.AllowAutoRedirect = false;

var response = (HttpWebResponse)request.GetResponse();
var stream = response.GetResponseStream();
var reader = new StreamReader(stream);
var httpResponseBody = reader.ReadToEnd();
reader.Close();
response.Close();

Console.WriteLine(httpResponseBody);

curl:

curl \
    --proxy proxy.zyte.com:8011 \
    --proxy-user YOUR_API_KEY: \
    --compressed \
    https://toscrape.com

JavaScript:

const axios = require('axios')

axios
  .get(
    'https://toscrape.com',
    {
      proxy: {
        protocol: 'http',
        host: 'proxy.zyte.com',
        port: 8011,
        auth: {
          username: 'YOUR_API_KEY',
          password: ''
        }
      }
    }
  )
  .then((response) => {
    const httpResponseBody = response.data
    console.log(httpResponseBody)
  })

PHP:

<?php

$client = new GuzzleHttp\Client();
$response = $client->request('GET', 'https://toscrape.com', [
    'proxy' => 'http://YOUR_API_KEY:@proxy.zyte.com:8011',
]);
$http_response_body = (string) $response->getBody();
fwrite(STDOUT, $http_response_body);

Python:

import requests

response = requests.get(
    "https://toscrape.com",
    proxies={
        scheme: "http://YOUR_API_KEY:@proxy.zyte.com:8011"
        for scheme in ("http", "https")
    },
)
http_response_body: bytes = response.content
print(http_response_body.decode())

After you install and configure scrapy-zyte-smartproxy, you can use Scrapy as usual and all requests will be proxied through Smart Proxy Manager automatically.

from scrapy import Spider


class ToScrapeSpider(Spider):
    name = "toscrape_com"
    start_urls = ["https://toscrape.com"]

    def parse(self, response):
        print(response.text)

And this is an identical request using the HTTP API of Zyte API:

Note

Install and configure code example requirements and the Zyte CA certificate to run the example below.

C#:

using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.All
};
HttpClient client = new HttpClient(handler);

var apiKey = "YOUR_API_KEY";
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey + ":");
var auth = System.Convert.ToBase64String(bytes);
client.DefaultRequestHeaders.Add("Authorization", "Basic " + auth);

client.DefaultRequestHeaders.Add("Accept-Encoding", "br, gzip, deflate");

var input = new Dictionary<string, object>(){
    {"url", "https://toscrape.com"},
    {"httpResponseBody", true}
};
var inputJson = JsonSerializer.Serialize(input);
var content = new StringContent(inputJson, Encoding.UTF8, "application/json");

HttpResponseMessage response = await client.PostAsync("https://api.zyte.com/v1/extract", content);
var body = await response.Content.ReadAsByteArrayAsync();

var data = JsonDocument.Parse(body);
var base64HttpResponseBody = data.RootElement.GetProperty("httpResponseBody").ToString();
var httpResponseBody = System.Convert.FromBase64String(base64HttpResponseBody);

command line:

input.jsonl#
{"url": "https://toscrape.com", "httpResponseBody": true}
zyte-api input.jsonl \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    > output.html

curl:

input.json#
{
    "url": "https://toscrape.com",
    "httpResponseBody": true
}
curl \
    --user YOUR_API_KEY: \
    --header 'Content-Type: application/json' \
    --data @input.json \
    --compressed \
    https://api.zyte.com/v1/extract \
    | jq --raw-output .httpResponseBody \
    | base64 --decode \
    > output.html

Java:

import com.google.common.collect.ImmutableMap;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.CloseableHttpResponse;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpHeaders;
import org.apache.hc.core5.http.ParseException;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;

class Example {
  private static final String API_KEY = "YOUR_API_KEY";

  public static void main(final String[] args)
      throws InterruptedException, IOException, ParseException {
    Map<String, Object> parameters =
        ImmutableMap.of("url", "https://toscrape.com", "httpResponseBody", true);
    String requestBody = new Gson().toJson(parameters);

    HttpPost request = new HttpPost("https://api.zyte.com/v1/extract");
    request.setHeader(HttpHeaders.CONTENT_TYPE, ContentType.APPLICATION_JSON);
    request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate");
    request.setHeader(HttpHeaders.AUTHORIZATION, buildAuthHeader());
    request.setEntity(new StringEntity(requestBody));

    try (CloseableHttpClient client = HttpClients.createDefault()) {
      try (CloseableHttpResponse response = client.execute(request)) {
        HttpEntity entity = response.getEntity();
        String apiResponse = EntityUtils.toString(entity, StandardCharsets.UTF_8);
        JsonObject jsonObject = JsonParser.parseString(apiResponse).getAsJsonObject();
        String base64HttpResponseBody = jsonObject.get("httpResponseBody").getAsString();
        byte[] httpResponseBodyBytes = Base64.getDecoder().decode(base64HttpResponseBody);
        String httpResponseBody = new String(httpResponseBodyBytes, StandardCharsets.UTF_8);
      }
    }
  }

  private static String buildAuthHeader() {
    String auth = API_KEY + ":";
    String encodedAuth = Base64.getEncoder().encodeToString(auth.getBytes());
    return "Basic " + encodedAuth;
  }
}

JavaScript:

const axios = require('axios')

axios.post(
  'https://api.zyte.com/v1/extract',
  {
    url: 'https://toscrape.com',
    httpResponseBody: true
  },
  {
    auth: { username: 'YOUR_API_KEY' }
  }
).then((response) => {
  const httpResponseBody = Buffer.from(
    response.data.httpResponseBody,
    'base64'
  )
})

PHP:

<?php

$client = new GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.zyte.com/v1/extract', [
    'auth' => ['YOUR_API_KEY', ''],
    'headers' => ['Accept-Encoding' => 'gzip'],
    'json' => [
        'url' => 'https://toscrape.com',
        'httpResponseBody' => true,
    ],
]);
$data = json_decode($response->getBody());
$http_response_body = base64_decode($data->httpResponseBody);

For comparison, with the proxy mode you always get the response body directly:

curl \
    --proxy api.zyte.com:8011 \
    --proxy-user YOUR_API_KEY: \
    --compressed \
    https://toscrape.com \
> output.html

Python (requests):

from base64 import b64decode

import requests

api_response = requests.post(
    "https://api.zyte.com/v1/extract",
    auth=("YOUR_API_KEY", ""),
    json={
        "url": "https://toscrape.com",
        "httpResponseBody": True,
    },
)
http_response_body: bytes = b64decode(api_response.json()["httpResponseBody"])

Python (python-zyte-api):

import asyncio
from base64 import b64decode

from zyte_api.aio.client import AsyncClient


async def main():
    client = AsyncClient()
    api_response = await client.request_raw(
        {
            "url": "https://toscrape.com",
            "httpResponseBody": True,
        }
    )
    http_response_body: bytes = b64decode(api_response["httpResponseBody"])


asyncio.run(main())

In transparent mode, when you target a text resource (e.g. HTML, JSON), regular Scrapy requests work out of the box:

from scrapy import Spider


class ToScrapeSpider(Spider):
    name = "toscrape_com"
    start_urls = ["https://toscrape.com"]

    def parse(self, response):
        http_response_text: str = response.text

While regular Scrapy requests also work for binary responses at the moment, they may stop working in future versions of scrapy-zyte-api, so passing httpResponseBody is recommended when targeting binary resources:

from scrapy import Request, Spider


class ToScrapeSpider(Spider):
    name = "toscrape_com"

    def start_requests(self):
        yield Request(
            "https://toscrape.com",
            meta={
                "zyte_api_automap": {
                    "httpResponseBody": True,
                },
            },
        )

    def parse(self, response):
        http_response_body: bytes = response.body

Output (first 5 lines):

<!DOCTYPE html>
<html lang="en">
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
        <title>Scraping Sandbox</title>

See Zyte API usage documentation for richer Zyte API examples, covering more scenarios and features.

If you notice that your code has become slower with Zyte API, that is expected: Zyte API requests usually have higher latency. To achieve similar or better speed, increase your concurrency (the number of requests you send in parallel) accordingly. See also Optimizing Zyte API usage.

There is no easy way to use Zyte API to drive requests from browser automation tools. If you are using Smart Proxy Manager as a proxy for a browser automation tool, consider using Zyte API for your browser automation needs instead. See Migrating from browser automation to Zyte API.

scrapy-zyte-smartproxy migration#

To migrate from scrapy-zyte-smartproxy (previously scrapy-crawlera), first set up scrapy-zyte-api:

  1. You need Python 3.7 or higher to use the latest version of scrapy-zyte-api.

  2. You need Scrapy 2.0.1 or higher to use the latest version of scrapy-zyte-api.

    If you are using a lower version of Scrapy, please upgrade to a higher Scrapy version, and make sure your code works as expected with the newer Scrapy version before you continue the migration process.

    The Scrapy release notes of every Scrapy version cover backward-incompatible changes and deprecation removals, which should help you upgrade your existing code as you upgrade Scrapy.

  3. Install the latest version of scrapy-zyte-api:

    pip install --upgrade scrapy-zyte-api
    
  4. Configure scrapy-zyte-api in your settings.py file. If your Scrapy version is 2.10 or newer, add the following settings:

    ADDONS = {
        "scrapy_zyte_api.Addon": 500,
    }
    ZYTE_API_TRANSPARENT_MODE = False
    

    Otherwise add the following settings:

    DOWNLOAD_HANDLERS = {
        "http": "scrapy_zyte_api.ScrapyZyteAPIDownloadHandler",
        "https": "scrapy_zyte_api.ScrapyZyteAPIDownloadHandler",
    }
    DOWNLOADER_MIDDLEWARES = {
        "scrapy_zyte_api.ScrapyZyteAPIDownloaderMiddleware": 1000,
    }
    REQUEST_FINGERPRINTER_CLASS = "scrapy_zyte_api.ScrapyZyteAPIRequestFingerprinter"
    SPIDER_MIDDLEWARES = {
        "scrapy_zyte_api.ScrapyZyteAPISpiderMiddleware": 100,
    }
    TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
    

    If any of these settings already exists in your settings.py file, modify the existing setting as needed instead of re-defining it. For example, if you already have DOWNLOADER_MIDDLEWARES defined, add "scrapy_zyte_api.ScrapyZyteAPIDownloaderMiddleware": 1000, to your existing definition, keeping existing downloader middlewares untouched.

    Also, make sure that these settings are not being overridden elsewhere. For example, make sure they are not defined in multiple lines of your settings.py file, and that they are not overridden in your Scrapy Cloud project settings.

    Note

    On projects that were not using the asyncio Twisted reactor, your existing code may need changes, such as:

    • Handling a pre-installed Twisted reactor.

      Some Twisted imports install the default, non-asyncio Twisted reactor as a side effect. Once a reactor is installed, it cannot be changed for the whole run time.

    • Converting Twisted Deferreds into asyncio Futures.

      Note that you might be using Deferreds without realizing it through some Scrapy functions and methods. For example, when you yield the return value of self.crawler.engine.download() from a spider callback, you are yielding a Deferred.

  5. Add your API key to settings.py as well:

    ZYTE_API_KEY = "YOUR_API_KEY"
    
  6. To enable cookie support, you must define an additional setting in settings.py:

    ZYTE_API_EXPERIMENTAL_COOKIES_ENABLED = True
    

    The COOKIES_ENABLED setting must also be True, which is its default value, so make sure you are not setting it to False anywhere in your code.

    Warning

    At the moment, scrapy-zyte-api sends cookie parameters within the experimental namespace, so responseCookies in the raw Zyte API response comes nested inside the experimental dictionary.
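As an example of the "modify, don't redefine" advice in step 4, a hypothetical project that already registers its own downloader middleware would keep that entry and add the scrapy-zyte-api one next to it (myproject.middlewares.MyMiddleware is a stand-in for whatever your project already uses):

```python
# settings.py: hypothetical example; myproject.middlewares.MyMiddleware
# stands in for the downloader middlewares the project already has.
DOWNLOADER_MIDDLEWARES = {
    "myproject.middlewares.MyMiddleware": 550,  # pre-existing entry, unchanged
    "scrapy_zyte_api.ScrapyZyteAPIDownloaderMiddleware": 1000,  # added
}
```
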

Your next steps depend on how you want to approach your migration. You can either migrate only some spiders or migrate your entire project at once.

Migrating spiders one by one, incrementally, can be more time-consuming, but it is also less disruptive, giving you time to validate the migration of each spider separately.

If you prefer to migrate only some of your spiders, while other spiders remain using Smart Proxy Manager, use custom_settings or update_settings in those spiders to toggle scrapy-zyte-smartproxy and scrapy-zyte-api:

class MySpider(Spider):
    custom_settings = {
        "ZYTE_API_TRANSPARENT_MODE": True,
        "ZYTE_SMARTPROXY_ENABLED": False,
        "CRAWLERA_ENABLED": False,  # Only needed if you use scrapy-crawlera
    }

On spiders using Smart Proxy Manager, you can still drive specific requests through Zyte API instead by setting the following fields in the request metadata:

yield Request(
    ...,
    meta={
        "dont_proxy": True,
        "zyte_api_automap": True,
    },
)

If you prefer to migrate your whole project at once instead of spider by spider:

  1. Disable scrapy-zyte-smartproxy or scrapy-crawlera.

    scrapy-zyte-smartproxy is enabled through the ZYTE_SMARTPROXY_ENABLED setting. scrapy-crawlera through CRAWLERA_ENABLED.

    To disable, find where you define that setting (e.g. settings.py, Scrapy Cloud settings), and remove it.

    Also, make sure you are not enabling those settings on specific spiders, e.g. through the custom_settings class attribute of a spider class, or in your CI environment (e.g. in Scrapy Cloud, which allows overriding settings for specific spiders).

  2. Configure Zyte API to run in transparent mode from your settings.py file. If you use scrapy_zyte_api.Addon just remove the ZYTE_API_TRANSPARENT_MODE = False line as the add-on enables the transparent mode itself. Otherwise add the following line:

    ZYTE_API_TRANSPARENT_MODE = True
    

Regardless of whether you are migrating only some spiders or your whole project, review the code of requests that now go through Zyte API to look for Smart Proxy Manager headers, i.e. those prefixed with X-Crawlera- (case-insensitive), and replace them with their Zyte API counterparts according to the table below.

You can specify those parameters through a zyte_api_automap dictionary in request metadata. For example, to set the geolocation of a request to the USA:

yield Request(
    ...,
    meta={
        "zyte_api_automap": {
            "geolocation": "US",
        },
    },
)

If you find that the migration has negatively affected the run time of your spiders, increase the CONCURRENT_REQUESTS and CONCURRENT_REQUESTS_PER_DOMAIN settings accordingly. If higher concurrency does not improve your run time, the cause may be rate limiting: if the scrapy-zyte-api/throttle_ratio Scrapy stat is high, you can open a support ticket to request a higher rate limit for your account.
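For example (illustrative numbers only; the right values depend on your project, the target websites, and your subscription):

```python
# settings.py: raise concurrency to compensate for higher per-request latency.
CONCURRENT_REQUESTS = 32  # Scrapy default is 16
CONCURRENT_REQUESTS_PER_DOMAIN = 16  # Scrapy default is 8
```
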

Once you have migrated all your code and are happy with the result, you can remove scrapy-zyte-smartproxy:

pip uninstall scrapy-zyte-smartproxy scrapy-crawlera

And remove any related Scrapy settings from your code, i.e. those prefixed with either ZYTE_SMARTPROXY_ or CRAWLERA_, including those that you used to disable scrapy-zyte-smartproxy in an earlier migration step (there is no need to disable something that is no longer installed).

Parameter mapping#

The following table shows a mapping of Smart Proxy Manager request headers and their corresponding proxy mode headers and Zyte API parameters:

Headers tagged with bc can be used in Zyte API proxy mode. See Header backward compatibility.

Replacing X-Crawlera-Cookies#

X-Crawlera-Cookies supports 3 values:

  • enable causes automatic cookies to override request cookies.

    To achieve this in Zyte API, do not set requestCookies.

  • disable causes request cookies to override automatic cookies.

    This is the default behavior of Zyte API: using requestCookies overrides automatic cookies.

  • discard causes both request cookies and automatic cookies to be discarded.

    To achieve this in Zyte API, do not set requestCookies, and set cookieManagement to discard.
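The three mappings above can be summarized in a small helper. The function below is purely illustrative, not part of Zyte API or any client library; it returns the extra Zyte API request fields that each X-Crawlera-Cookies value translates to:

```python
# Illustrative helper, not part of any library: map an X-Crawlera-Cookies
# value to the extra Zyte API request fields it translates to.


def cookie_fields(x_crawlera_cookies, request_cookies=None):
    if x_crawlera_cookies == "enable":
        # Automatic cookies win: simply do not set requestCookies.
        return {}
    if x_crawlera_cookies == "disable":
        # Request cookies win: the default behavior once requestCookies is set.
        return {"requestCookies": request_cookies or []}
    if x_crawlera_cookies == "discard":
        # Neither request cookies nor automatic cookies are used.
        return {"cookieManagement": "discard"}
    raise ValueError(f"Unsupported value: {x_crawlera_cookies!r}")
```
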

Replacing X-Crawlera-Profile and X-Crawlera-Profile-Pass#

In general, you can replace X-Crawlera-Profile with the device Zyte API request parameter.

Mind, however, that the behavior of Zyte API is actually a middle ground between the desktop (or mobile) and pass values of X-Crawlera-Profile: browser-specific headers are always sent (unlike pass, which disables them altogether), but you can override them (unlike desktop or mobile, which force them unless you use X-Crawlera-Profile-Pass). See Request headers for more information.
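For example, a request that previously carried X-Crawlera-Profile: desktop could send the device parameter instead, and still override individual headers through customHttpRequestHeaders. The payload below is a local sketch; sending it to the extract endpoint works as in the earlier examples:

```python
# Sketch of a Zyte API request payload replacing X-Crawlera-Profile.
payload = {
    "url": "https://toscrape.com",
    "httpResponseBody": True,
    "device": "desktop",  # replaces the X-Crawlera-Profile: desktop header
    # Unlike the desktop/mobile profiles, individual browser-specific
    # headers can still be overridden:
    "customHttpRequestHeaders": [
        {"name": "Referer", "value": "https://example.com"},
    ],
}
```
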

Header backward compatibility#

When using the Zyte API proxy mode, migrating from Smart Proxy Manager headers to their Zyte API proxy mode counterparts is recommended.

However, the following Smart Proxy Manager request headers can be used in Zyte API proxy mode: X-Crawlera-Cookies, X-Crawlera-JobId, X-Crawlera-Profile, X-Crawlera-Profile-Pass.

If any of these Smart Proxy Manager headers is used, the response will include X-Crawlera-Error if needed for the following error codes: banned, invalid_request, bad_auth, bad_proxy_auth, max_header_size_exceeded, internal_server_error, timeout, domain_forbidden.

Tip

To force getting X-Crawlera-Error on a request without Smart Proxy Manager request headers, add a no-op Smart Proxy Manager request header, e.g. X-Crawlera-Profile-Pass: Foo.