Scrapy Cloud job logs#
The Log tab of a job contains all the messages logged by the job.
It includes all messages logged with Python’s logging, both those from
Scrapy built-in components and your own code.
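As a minimal sketch of what "your own code" can look like, the snippet below logs through Python's standard logging module. The logger name `myproject.pipelines` and the `clean_price` helper are hypothetical names used only for illustration. A warning emitted this way during a job would be expected to appear in the Log tab alongside Scrapy's own messages.

```python
import logging

# Module-level logger; the name "myproject.pipelines" is a hypothetical example.
logger = logging.getLogger("myproject.pipelines")


def clean_price(raw):
    """Parse a price string, logging a warning when it cannot be parsed."""
    try:
        return float(raw.replace("$", "").replace(",", ""))
    except ValueError:
        # This message goes through Python's logging, so it ends up in the job log.
        logger.warning("Could not parse price: %r", raw)
        return None
```

Inside a spider, `self.logger` can be used in the same way, since it is also built on Python's logging.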
Troubleshooting#
This section explains the meaning of common log messages.
Ignoring response#
[scrapy.spidermiddlewares.httperror] Ignoring response <403 https://example.com>: HTTP status code is not handled or not allowed
By default, once redirects have been followed and retries have been exhausted, Scrapy ignores responses with an HTTP status code outside the 200-299 range.
Some HTTP status codes, such as 401, 403 or 429, may be the result of a ban. Consider using Zyte API to avoid bans.
If you want to handle those responses in your request callback, instead of ignoring them:
Set handle_httpstatus_all or handle_httpstatus_list in your request metadata to handle such responses for a specific request:
Request("https://example.com", meta={"handle_httpstatus_list": {403}})
Use the HTTPERROR_ALLOW_ALL or HTTPERROR_ALLOWED_CODES settings to handle such responses for all requests.