Scrapy Cloud job logs

The Log tab of a job contains all the messages logged by the job.

It includes all messages logged with Python’s logging module, both those from Scrapy’s built-in components and those from your own code.

Troubleshooting

This section explains the meaning of common log messages.

[scrapy.spidermiddlewares.httperror] Ignoring response <403 https://example.com>: HTTP status code is not handled or not allowed

By default, after redirects have been followed and retries have been exhausted, Scrapy ignores responses with an HTTP status code outside the 200-299 range.
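
The rule above, together with the request metadata keys and settings described later in this section, can be pictured as a small decision function. This is a simplified model for illustration only, not Scrapy's actual code: the function name `is_response_handled` and its parameters are hypothetical, the precedence between per-request keys and project settings is an assumption of this sketch, and other override mechanisms are left out.

```python
def is_response_handled(status, meta=None, allow_all=False, allowed_codes=()):
    """Return True if a response with this status would reach the callback.

    Hypothetical helper: `allow_all` stands in for HTTPERROR_ALLOW_ALL and
    `allowed_codes` for HTTPERROR_ALLOWED_CODES.
    """
    meta = meta or {}
    # Successful responses (200-299) always reach the callback.
    if 200 <= status < 300:
        return True
    # Per-request override: accept every status code.
    if meta.get("handle_httpstatus_all", False):
        return True
    # Per-request list; assumed here to take precedence over project settings.
    if "handle_httpstatus_list" in meta:
        return status in meta["handle_httpstatus_list"]
    # Project-wide settings.
    if allow_all:
        return True
    return status in allowed_codes
```

In this model, `is_response_handled(403)` is false, which corresponds to the "Ignoring response" log message, while passing `meta={"handle_httpstatus_list": [403]}` makes it true.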

Some HTTP status codes, such as 401, 403 or 429, may be the result of a ban. Consider using Zyte API to avoid bans.

If you want to handle those responses in your request callback instead of ignoring them, do one of the following:

- Set handle_httpstatus_all or handle_httpstatus_list in your request metadata to handle such responses for a specific request:

  ```
  from scrapy import Request
  Request("https://example.com", meta={"handle_httpstatus_list": [403]})
  ```

- Use the HTTPERROR_ALLOW_ALL or HTTPERROR_ALLOWED_CODES settings to handle such responses for all requests.
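
For the project-wide option, a settings fragment might look like the sketch below. The choice of 403 and 429 is only illustrative, and placing the values in a `settings.py` module assumes a standard Scrapy project layout.

```python
# settings.py (assumed project settings module)

# Let responses with these status codes reach request callbacks
# for all requests. The codes listed here are illustrative.
HTTPERROR_ALLOWED_CODES = [403, 429]

# Alternatively, let every response through regardless of status code.
# HTTPERROR_ALLOW_ALL = True
```

With either setting enabled, callbacks receive those responses, so they would typically check `response.status` before trying to parse the body.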