Enable Zyte API to avoid bans

Now that you have run your project in Scrapy Cloud, it is time to improve the project itself, starting with handling website bans.

Your target domain in this tutorial, toscrape.com, does not ban traffic. However, when targeting other websites, sooner or later you will run into bans.

You will now configure your web scraping code to use Zyte API to avoid bans on any website:

  1. Set up Zyte API integration in your Scrapy project.

    If you have a GitHub Copilot account, you can type the following in chat to set up Zyte API integration:

    Run the setupZyteAPI tool and follow its instructions.

    Alternatively, handle it manually:

    1. Install the latest version of scrapy-zyte-api:

      pip install --upgrade scrapy-zyte-api
      
    2. Configure scrapy-zyte-api in transparent mode by editing the add-ons section at the start of the web-scraping-tutorial/settings.py file:

      import scrapy_zyte_api
      
      ADDONS = {
          scrapy_zyte_api.Addon: 500,
      }
      
  2. Set your Zyte API key:

    1. Sign up for Zyte API.

      You get $5 of free credit for a month, and you should only need a fraction of that to complete this tutorial.

    2. Add or edit the following code at the end of web-scraping-tutorial/settings.py, replacing YOUR_ZYTE_API_KEY with your Zyte API key:

      ZYTE_API_KEY = "YOUR_ZYTE_API_KEY"
      
  3. Make sure scrapy-zyte-api gets installed when your project runs in Scrapy Cloud:

    1. Create an empty text file at web-scraping-tutorial/requirements.txt.

    2. Add the following line to your requirements.txt file:

      scrapy-zyte-api
      
    3. Add the following to your scrapinghub.yml file:

      requirements:
          file: requirements.txt
      

If you run your code again, it will work the same as before, except that requests are now sent through Zyte API, which avoids bans cost-efficiently.
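
Before running the spider again, it can help to confirm that the three steps above left the files in the expected state. The following is a minimal sketch of such a check, not part of Scrapy or scrapy-zyte-api: `check_zyte_api_setup` is a hypothetical helper that only does plain text matching, and it assumes that scrapinghub.yml lives in the same folder as settings.py and requirements.txt.

```python
from pathlib import Path


def _read(path):
    """Return the file's text, or an empty string if the file is missing."""
    return path.read_text(encoding="utf-8") if path.is_file() else ""


def check_zyte_api_setup(project_dir):
    """Return a list of problems found in the files edited in the steps above.

    Hypothetical helper for illustration; it only performs text checks
    and does not import or run the project.
    """
    root = Path(project_dir)
    problems = []

    # Steps 1 and 2: the add-on and the API key in settings.py.
    settings = _read(root / "settings.py")
    if "scrapy_zyte_api.Addon" not in settings:
        problems.append("settings.py: scrapy_zyte_api.Addon not found")
    if "ZYTE_API_KEY" not in settings:
        problems.append("settings.py: ZYTE_API_KEY is not set")
    elif '"YOUR_ZYTE_API_KEY"' in settings:
        problems.append("settings.py: the placeholder API key was not replaced")

    # Step 3: the dependency must be listed so Scrapy Cloud installs it.
    requirements = _read(root / "requirements.txt")
    listed = any(
        line.strip().startswith("scrapy-zyte-api")
        for line in requirements.splitlines()
    )
    if not listed:
        problems.append("requirements.txt: scrapy-zyte-api is not listed")

    # Assumption: scrapinghub.yml sits next to requirements.txt.
    if "requirements.txt" not in _read(root / "scrapinghub.yml"):
        problems.append("scrapinghub.yml: requirements.txt is not referenced")

    return problems
```

Calling `check_zyte_api_setup("web-scraping-tutorial")` from the parent folder returns an empty list when every check passes, and one message per missing piece otherwise. Because the checks are substring matches, they can miss unusual layouts (for example, a key loaded from another module), so treat the result as a hint rather than a guarantee.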

Continue to the next chapter to learn about browser automation.

Tip

  • We closely monitor the success rate for the most popular websites, but less popular websites may slip under our radar. If you ever find a website for which Zyte API does not work as expected (e.g. gives you a ban response or too many errors), you can reach out to our expert anti-ban team.

  • If you get an SSL error, install the Zyte CA certificate on your system and try again.