Warning

Zyte API is replacing Smart Proxy Manager. See Migrating from Smart Proxy Manager to Zyte API.

Using Smart Proxy Manager with Splash#

Warning

For the code below to work you must first install the Zyte CA certificate.

Note

All the code in this documentation has been tested with Splash 3.5 and Python 3.9.5.

Installation#

  1. Set up the Zyte Smart Proxy Manager (formerly Crawlera) Headless Proxy as described in Using Headless Browsers with Zyte Smart Proxy Manager.

  2. Download and install Splash by following the official installation guide: https://splash.readthedocs.io/en/stable/install.html

Assuming you installed Splash using Docker, proceed to run Splash with:

docker run -it -p 8050:8050 --rm scrapinghub/splash

You can confirm Splash is running by accessing the Splash web UI at http://localhost:8050
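If you prefer to check from a script rather than a browser, Splash also exposes a `_ping` endpoint in its HTTP API. The following is a minimal sketch that queries it using only the Python standard library. The function name `splash_is_up` and the default base URL are assumptions made for illustration; adjust them to match your setup.

```python
import json
import urllib.request


def splash_is_up(base_url="http://localhost:8050", timeout=5.0):
    """Return True if the Splash instance at base_url answers /_ping with status "ok"."""
    try:
        with urllib.request.urlopen(f"{base_url}/_ping", timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a reply that is not JSON.
        return False
    # A healthy instance is expected to reply with a JSON object containing "status": "ok".
    return isinstance(body, dict) and body.get("status") == "ok"
```

Calling `splash_is_up()` after starting the container should return `True` once Splash has finished booting, and `False` if nothing is listening on the port.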

Using Web UI#

The following sample script tests the integration of Splash with Smart Proxy Manager. It assumes the Headless Proxy is already installed and listening on port 3128, the port the script points to.

Paste this code into the main page of the Splash web UI, enter a URL (e.g. http://example.com), and click the “Render me!” button.

function main(splash)
    splash:on_request(function (request)
        request:set_proxy{"host.docker.internal", 3128}
    end)

    splash:go(splash.args.url)
    return splash:png()
end
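The script above hard-codes the proxy address as `host.docker.internal:3128`. That hostname may not resolve in every Docker setup, so it can help to generate the script with the address that matches your environment. The sketch below does this from Python; `LUA_TEMPLATE` and `build_lua_script` are hypothetical helper names, not part of Splash or Smart Proxy Manager.

```python
# Template of the Lua script shown above. Doubled braces are literal braces
# for str.format; {host} and {port} are the placeholders that get filled in.
LUA_TEMPLATE = """\
function main(splash)
    splash:on_request(function (request)
        request:set_proxy{{"{host}", {port}}}
    end)

    splash:go(splash.args.url)
    return splash:png()
end
"""


def build_lua_script(proxy_host="host.docker.internal", proxy_port=3128):
    """Return the Lua script with the Headless Proxy address filled in."""
    return LUA_TEMPLATE.format(host=proxy_host, port=int(proxy_port))


if __name__ == "__main__":
    # Print the script so it can be pasted into the web UI or saved to a file.
    print(build_lua_script())
```

With the default arguments, the output matches the script above, so it can be used as a drop-in replacement.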

Using Python & Requests library#

To use Python and the Requests library with Splash, first save the Lua code shown above to a file named, for example, spm-splash.lua. Then save the Python code below to another file in the same directory, for example sample.py. Make sure the Requests library is installed before you continue.

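import requests

# Address where the Splash container is published on the host.
splash_server = 'http://localhost:8050'
url = 'https://example.com'

# Read the Lua script saved from the Web UI example.
with open('spm-splash.lua') as lua:
    lua_source = lua.read()

# Ask Splash to run the script against the target URL.
r = requests.post(
    '{}/execute'.format(splash_server),
    json={
        'lua_source': lua_source,
        'url': url,
    },
    timeout=100,
)
# Stop here on an HTTP error instead of saving an error body as a PNG.
r.raise_for_status()

# Save the returned screenshot.
with open('spm-splash.png', 'wb') as fp:
    fp.write(r.content)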

Now run the file using:

$ python sample.py

You’ll find a screenshot of the page at the requested URL, named spm-splash.png, in the same directory as your Lua and Python files.
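If the Lua script fails, the body returned by Splash may be an error message rather than image data, and writing it to disk produces a file that no image viewer can open. As a hedged safeguard, the sketch below checks for the standard PNG file signature before saving. `save_png` is a hypothetical helper name, and the default output path simply mirrors the one used in sample.py.

```python
from pathlib import Path

# The first eight bytes of every valid PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def save_png(data, path="spm-splash.png"):
    """Write data to path only if it starts with the PNG signature."""
    if not data.startswith(PNG_SIGNATURE):
        # Show the start of the body to help diagnose what was returned instead.
        raise ValueError("response is not a PNG image: {!r}".format(data[:80]))
    target = Path(path)
    target.write_bytes(data)
    return target
```

In sample.py, this would replace the final file-writing step with `save_png(r.content)`.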

Using Splash with Scrapy#

To use Zyte Smart Proxy Manager with Splash and Scrapy, see Using Smart Proxy Manager with Splash and Scrapy.