Scrapy Cloud job logs#
The Log tab of a job contains all the messages logged by the job.
It includes all messages logged with Python’s logging module, both those from Scrapy built-in components and those from your own code.
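As a minimal sketch of how a message from your own code reaches the log, the snippet below uses Python’s standard logging module. The logger name myproject.pipelines and the check_price helper are hypothetical, chosen only for illustration. With Scrapy’s default log format, the logger name is the part that appears in square brackets at the start of a message.

```python
import logging

# Hypothetical logger name; with Scrapy's default log format it is the
# text shown in square brackets before the message.
logger = logging.getLogger("myproject.pipelines")


def check_price(item: dict) -> bool:
    """Return True if the item has a price; log a warning otherwise."""
    if item.get("price") is None:
        # Anything logged through the logging module is part of the job log.
        logger.warning("Missing price for %s", item.get("url"))
        return False
    return True
```

Inside a spider, self.logger offers the same interface with the spider name as the logger name.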
Troubleshooting#
This section explains the meaning of some common log messages.
Ignoring response#
[scrapy.spidermiddlewares.httperror] Ignoring response <403 https://example.com>: HTTP status code is not handled or not allowed
By default, after redirects have been followed and retries have been exhausted, Scrapy ignores responses with an HTTP status code outside the 200-299 range.
Some HTTP status codes, such as 401, 403 or 429, may be the result of a ban. Consider using Zyte API to avoid bans.
If you want to handle those responses in your request callback, instead of ignoring them:
Set handle_httpstatus_all or handle_httpstatus_list in your request metadata to handle such responses for a specific request: Request("https://example.com", meta={"handle_httpstatus_list": {403}})
Use the HTTPERROR_ALLOW_ALL or HTTPERROR_ALLOWED_CODES settings to handle such responses for all requests.