Job Posting Extraction
Job posting extraction supports pages that contain a single job posting, as found on job boards, career sections of company websites, or other sites. Many fields are extracted, such as job title, description, salary information and publication date.
This supports use cases such as market, technology and competitor analysis, finding leads, and many others.
Request example
If you request a job posting extraction and the extraction succeeds, the jobPosting field will be available in the query result:
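from autoextract.sync import request_raw

# One query per page; 'pageType' selects job posting extraction.
query = [{
    'url': 'http://example.com/example-job-page',
    'pageType': 'jobPosting'
}]
results = request_raw(query, api_key='[api key]')
# 'jobPosting' is present only if the extraction succeeded.
print(results[0]['jobPosting'])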
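Because the jobPosting field is only available when the extraction succeeds, indexing it directly can raise a KeyError. A minimal sketch of a more defensive lookup follows. The helper name job_postings is hypothetical and not part of the autoextract client, and the sample data is a stand-in shaped like the documented result, so no API call is made.

```python
def job_postings(results):
    """Return the 'jobPosting' dict of every result that has one.

    Hypothetical helper, not part of the autoextract client. It assumes
    each result is a dict and that a missing 'jobPosting' key means the
    extraction did not succeed for that query.
    """
    return [r['jobPosting'] for r in results if 'jobPosting' in r]


# Stand-in data shaped like the documented result (no API call needed).
sample_results = [
    {'jobPosting': {'title': 'Regional Manager',
                    'url': 'https://example.com/job',
                    'probability': 0.95}},
    {'query': {'userQuery': {'url': 'https://example.com/other'}}},  # no jobPosting
]

for job in job_postings(sample_results):
    # 'title' is optional, so use .get(); 'url' is always present.
    print(job.get('title'), job['url'])
```

With real data, the list returned by request_raw could be passed in place of sample_results.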
Available fields
The following fields are available for jobPosting:
title: string
The title of the job.

datePosted: string
Publication date of the online listing. ISO-formatted with a "T" separator; may contain a timezone.

validThrough: string
The date after which the item is no longer valid, e.g. the end of an offer. ISO-formatted with a "T" separator; may contain a timezone.

description: string
A description of the job posting including sub-headings, with newline separators.

descriptionHtml: string
Simplified HTML of the description, including sub-headings, image captions and embedded content.

employmentType: string
Type of employment (e.g. full-time, part-time, contract, temporary, seasonal, internship).

hiringOrganization: dictionary with a raw string field
Information about the organization offering the job position. Example:
{"raw": "ACME Corp."}

baseSalary: dictionary
The base salary of the job or of an employee in the proposed role. It is a dictionary with the following fields:
raw - string with the salary amount, as it appears on the website.
value - float number, the value of the base salary.
currency - string, the currency associated with the amount.
Example:
{ "raw": "$53,251 a year", "value": 53251.0, "currency": "$" }
All fields are optional, except for raw.

jobLocation: dictionary with a raw string field
A (typically single) geographic location associated with the job position. Example:
{"raw": "West New York, NJ 07093"}

probability: float
Probability that this is a single job posting page.

url: string
URL of the page from which this job posting was extracted.

All fields are optional, except for url and probability.
Fields without a valid value (null or empty array) are excluded from extraction
results.
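Since most fields may be missing from a result, client code has to tolerate their absence. The following is a minimal sketch of one way to do that, assuming only the field names and types listed above. The helper names parse_iso and summarize are hypothetical, and the handling of a trailing "Z" is a precaution rather than documented behaviour of the API.

```python
from datetime import datetime


def parse_iso(value):
    """Parse an ISO date string with a 'T' separator; None if absent."""
    if not value:
        return None
    # Older Python versions reject a trailing 'Z' in fromisoformat, so it
    # is rewritten as an explicit UTC offset (an assumption about input).
    if value.endswith('Z'):
        value = value[:-1] + '+00:00'
    return datetime.fromisoformat(value)


def summarize(job):
    """Collect a few fields, tolerating any optional field being absent."""
    salary = job.get('baseSalary') or {}
    employer = job.get('hiringOrganization') or {}
    return {
        'url': job['url'],                    # always present
        'probability': job['probability'],    # always present
        'title': job.get('title'),
        'posted': parse_iso(job.get('datePosted')),
        'salary_raw': salary.get('raw'),
        'salary_value': salary.get('value'),  # optional even inside baseSalary
        'employer': employer.get('raw'),
    }


# A posting carrying only the two mandatory fields still summarizes cleanly.
minimal = {'url': 'https://example.com/job', 'probability': 0.95}
print(summarize(minimal))
```

Only url and probability are accessed by index; every other field goes through .get() so that an excluded field yields None instead of an exception.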
Response example
Below is an example response with all job posting fields present:
[
    {
        "jobPosting": {
            "title": "Regional Manager",
            "datePosted": "2019-06-19T00:00:00",
            "validThrough": "2019-08-19T00:00:00",
            "description": "Job Description ...",
            "descriptionHtml": "<article>HTML for Job Description ...",
            "baseSalary": {
                "currency": "$",
                "raw": "$90000 gross",
                "value": 90000.0
            },
            "jobLocation": {
                "raw": "North Pole"
            },
            "hiringOrganization": {
                "raw": "ACME Corporation"
            },
            "employmentType": "Full-time",
            "probability": 0.95,
            "url": "https://example.com/job"
        },
        "webPage": {
            "inLanguages": [
                {"code": "en"},
                {"code": "es"}
            ]
        },
        "query": {
            "id": "1564747029122-9e02a1868d70b7a3",
            "domain": "example.com",
            "userQuery": {
                "pageType": "jobPosting",
                "url": "https://example.com/job"
            }
        },
        "algorithmVersion": "20.8.1"
    }
]
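A response shaped like the one above can be filtered on the probability field to drop pages that are unlikely to be single job postings. The sketch below works on a trimmed inline copy of the response so that it runs offline. The function name confident_postings and the default threshold of 0.5 are illustrative assumptions, not documented values, and it assumes each result carries a query.userQuery.url entry as in the example.

```python
import json

# Trimmed copy of the example response, kept inline so the sketch runs offline.
response_text = '''
[
  {
    "jobPosting": {
      "title": "Regional Manager",
      "probability": 0.95,
      "url": "https://example.com/job"
    },
    "query": {
      "userQuery": {"pageType": "jobPosting", "url": "https://example.com/job"}
    }
  }
]
'''


def confident_postings(results, threshold=0.5):
    """Yield (requested_url, jobPosting) pairs at or above the threshold.

    The default threshold is an illustrative assumption, not a documented value.
    """
    for result in results:
        job = result.get('jobPosting')
        if job is not None and job['probability'] >= threshold:
            yield result['query']['userQuery']['url'], job


for requested_url, job in confident_postings(json.loads(response_text)):
    print(requested_url, job.get('title'), job['probability'])
```

A suitable threshold depends on how costly a false positive is for the use case; a stricter value such as 0.9 keeps fewer but more reliable postings.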