Vehicle Extraction#
Vehicle extraction supports pages that contain a single vehicle listing for sale. It extracts general fields such as model name and price, as well as vehicle-specific fields such as VIN, mileage, engine and fuel type.
A related page type is Product List Extraction, which supports pages with multiple products.
Request example#
If you requested vehicle extraction, and the extraction succeeds, then the vehicle field will be available in the query result:
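from autoextract.sync import request_raw

# One query per page; 'pageType' selects vehicle extraction
query = [{
    'url': 'https://example.com/vehicle',
    'pageType': 'vehicle'
}]
results = request_raw(query, api_key='[api key]')
# Results are returned as a list, one item per query
print(results[0]['vehicle'])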
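The vehicle field is described as available when extraction succeeds, so a result may lack it otherwise. A minimal sketch of a defensive lookup, assuming each result is a plain dictionary as in the example above; get_vehicle is a hypothetical helper and not part of the autoextract client, and the sample results are hand-written, not real API output:

```python
def get_vehicle(result):
    """Return the extracted vehicle dict, or None if it is absent.

    Hypothetical helper: assumes `result` is one item of the list
    returned by request_raw.
    """
    vehicle = result.get('vehicle')
    return vehicle if vehicle else None


# Hand-written sample results for illustration only
ok = {'vehicle': {'name': 'Vehicle name', 'url': 'https://example.com/vehicle'}}
missing = {'query': {'userQuery': {'pageType': 'vehicle'}}}

print(get_vehicle(ok)['name'])
print(get_vehicle(missing))
```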
Available fields#
Vehicle is a subclass of product, so it has some of the product fields as well as fields that are specific to vehicles.
Vehicle-specific fields#
The following fields are specific to vehicle:
vehicleIdentificationNumber: string
The VIN is a unique identifier for the vehicle; it is different for every vehicle.
mileageFromOdometer: dictionary
A dictionary with the mileage of the vehicle. It may contain two fields:
value is an integer indicating the distance travelled by the vehicle.
unitCode is a string with a unit code, either SMI for miles or KMT for kilometers.
Example:
{"value": 43000, "unitCode": "KMT"}
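Because the unit code can be either SMI or KMT, comparing mileage across listings needs a conversion step. A minimal sketch, assuming the dictionary shape described above; mileage_in_km is a hypothetical helper, and the conversion factor is the standard international mile:

```python
MILES_TO_KM = 1.609344  # one international mile in kilometers


def mileage_in_km(mileage):
    """Convert a mileageFromOdometer dict to kilometers, or return None."""
    value = mileage.get("value")
    unit = mileage.get("unitCode")
    if value is None or unit is None:
        return None  # both fields may be missing
    if unit == "KMT":
        return float(value)
    if unit == "SMI":
        return value * MILES_TO_KM
    return None  # unit code not covered by this sketch


print(mileage_in_km({"value": 43000, "unitCode": "KMT"}))
```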
vehicleTransmission: string
Vehicle transmission: the type of component used to transmit power from a rotating power source to the wheels or other relevant component.
fuelType: string
The type of fuel suitable for the engine of the vehicle.
vehicleEngine: dictionary
Information about the engine or engines of the vehicle. Currently it is a dictionary with a "raw" string field. This field contains the raw text present on the site without any parsing.
Example:
{"raw": "4.4L"}
color: string
The exterior color of the car.
vehicleInteriorColor: string
The interior color of the car.
availableAtOrFrom: dictionary
The place where the car is located. Currently it is a dictionary with a "raw" string field. This field contains the raw text present on the site without any parsing.
Example:
{"raw": "New york"}
numberOfDoors: integer
The number of doors in the car.
vehicleSeatingCapacity: integer
The seating capacity of the car.
fuelEfficiency: list of dictionaries
The measure of fuel efficiency of the vehicle. It can be represented as distance per unit of fuel (e.g. 20 miles per gallon) or fuel per unit of distance (e.g. 8 liters per 100 km). The raw field contains the raw text present on the site without any parsing.
Example:
[ {"raw": "25 mpg (city)"}, {"raw": "40 mpg (highway)"} ]
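Since the raw field is unparsed text, any numeric use requires parsing on the client side. A rough sketch that pulls out values written as "N mpg", as in the example above; it assumes that wording and simply skips anything else (such as liters per 100 km). mpg_values is a hypothetical helper:

```python
import re

# Matches a number followed by "mpg", e.g. "25 mpg" or "31.5 MPG"
MPG_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*mpg", re.IGNORECASE)


def mpg_values(fuel_efficiency):
    """Collect numeric mpg values found in the raw strings."""
    values = []
    for item in fuel_efficiency:
        match = MPG_PATTERN.search(item.get("raw", ""))
        if match:
            values.append(float(match.group(1)))
    return values


print(mpg_values([{"raw": "25 mpg (city)"}, {"raw": "40 mpg (highway)"}]))
```

Real listings may phrase fuel efficiency in many other ways, so a production parser would need broader patterns.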
General product fields#
Many fields from product (see Product Extraction) are also extracted from vehicles:
name: string
The name of the vehicle.
offers: list of dictionaries
Vehicle offers. Each offer may contain price, currency, regularPrice and availability string fields. All fields are optional, but currency is present only if price is also present.
price is a string with a valid number (a dot is used as the decimal separator). It is the price a customer has to pay after discounts or special offers.
currency is the currency as given on the website, without extra normalization (for example, both “$” and “USD” are possible currencies). It is present only if price is also present.
regularPrice is the price before any discount or special offer. It is present only when price is different from regularPrice.
availability is the product availability, as a string. Allowed values:
"InStock" - includes limited availability, presale, preorder, and in-store only.
"OutOfStock" - includes discontinued and sold out.
Example:
[ { "price": "42000", "regularPrice": "45000.00", "currency": "USD", "availability": "InStock" } ]
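Prices arrive as strings, so they need converting before any arithmetic. A minimal sketch that computes the discount for one offer using decimal.Decimal to avoid floating-point rounding; discount is a hypothetical helper and assumes the price strings follow the format described above:

```python
from decimal import Decimal


def discount(offer):
    """Return regularPrice minus price as a Decimal, or None if either is missing."""
    price = offer.get("price")
    regular = offer.get("regularPrice")
    if price is None or regular is None:
        return None  # regularPrice is only present when it differs from price
    return Decimal(regular) - Decimal(price)


offer = {
    "price": "42000",
    "regularPrice": "45000.00",
    "currency": "USD",
    "availability": "InStock",
}
print(discount(offer))
```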
sku: string
Stock Keeping Unit identifier for the vehicle, assigned by the seller.
mpn: string
Manufacturer part number identifier for the vehicle. It is issued by the manufacturer and is the same across different websites for a vehicle.
brand: string
Brand or manufacturer of the vehicle.
breadcrumbs: list of dictionaries with name and link optional string fields
A list of breadcrumbs (a specific navigation element) with optional name and URL.
Example:
[ {"name": "Foo", "link": "http://example.com/foo"}, {"name": "Bar", "link": "http://example.com/foo/bar"}, {"name": "Baz"} ]
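Because both name and link are optional, code that consumes breadcrumbs should not assume either key exists. A small sketch that builds a readable path from the names that are present; breadcrumb_path and its separator are illustrative choices, not part of the API:

```python
def breadcrumb_path(breadcrumbs, separator=" > "):
    """Join the breadcrumb names that are present into a single string."""
    names = [crumb["name"] for crumb in breadcrumbs if crumb.get("name")]
    return separator.join(names)


crumbs = [
    {"name": "Foo", "link": "http://example.com/foo"},
    {"name": "Bar", "link": "http://example.com/foo/bar"},
    {"name": "Baz"},
]
print(breadcrumb_path(crumbs))
```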
mainImage: string
A URL or data URL value of the main image of the vehicle.
images: list of strings
A list of URL or data URL values of all images of the vehicle (may include the main image).
description: string
Description of the vehicle.
descriptionHtml: string
Simplified HTML of the description, including sub-headings, image captions and embedded content.
aggregateRating: dictionary
Aggregate information about the vehicle rating and reviews.
ratingValue is the average rating value, as a float.
bestRating is the best possible rating value, as a float.
reviewCount is the number of reviews or ratings for the product, as an integer.
Example (4.5 out of 5, based on 12 reviews):
{ "ratingValue": 4.5, "bestRating": 5, "reviewCount": 12 }
All fields are optional, but one of reviewCount or ratingValue must be present.
additionalProperty: list of dictionaries with name and value fields
A list of vehicle properties or characteristics. The name field contains the property name; the value field contains the property value.
Example:
[ {"name": "engine", "value": "I4"}, {"name": "drivetrain", "value": "All-Wheel Drive"}, {"name": "fuel type", "value": "Gasoline"} ]
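Two small transformations are often convenient with these fields: turning the additionalProperty list into a lookup table, and putting ratings with different bestRating values on a common 0 to 1 scale. A minimal sketch under the field descriptions above; both helper names are hypothetical:

```python
def properties_to_dict(additional_property):
    """Map property names to values; a later duplicate name overwrites an earlier one."""
    return {
        item["name"]: item["value"]
        for item in additional_property
        if "name" in item and "value" in item
    }


def normalized_rating(aggregate_rating):
    """Return ratingValue / bestRating, or None if either is missing or zero."""
    rating = aggregate_rating.get("ratingValue")
    best = aggregate_rating.get("bestRating")
    if rating is None or not best:
        return None
    return rating / best


props = [
    {"name": "engine", "value": "I4"},
    {"name": "drivetrain", "value": "All-Wheel Drive"},
    {"name": "fuel type", "value": "Gasoline"},
]
print(properties_to_dict(props)["drivetrain"])
print(normalized_rating({"ratingValue": 4.5, "bestRating": 5, "reviewCount": 12}))
```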
probability: float
Probability that the requested page is a single vehicle page.
canonicalUrl: string
Canonical URL of the vehicle, if available.
url: string
URL of the page from which this vehicle was extracted.
All fields are optional, except for url and probability.
Fields without a valid value (null or empty array) are excluded from extraction results.
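In practice this means only url and probability can be indexed directly; every other field should be read with a fallback. A short sketch of that pattern, which also skips pages with a low probability; the 0.5 threshold and the summarize helper are assumptions for illustration, not recommended values:

```python
def summarize(vehicle, min_probability=0.5):
    """Build a small summary, or return None for low-probability pages."""
    # url and probability are always present, so direct indexing is safe
    if vehicle["probability"] < min_probability:
        return None
    return {
        "url": vehicle["url"],
        # optional fields are omitted when empty, so use .get()
        "name": vehicle.get("name"),
        "vin": vehicle.get("vehicleIdentificationNumber"),
    }


sample = {"url": "https://example.com/vehicle", "probability": 0.95, "name": "Vehicle name"}
print(summarize(sample))
```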
Response example#
Below is an example response with all vehicle fields present:
[
  {
    "vehicle": {
      "name": "Vehicle name",
      "offers": [
        {
          "price": "42000",
          "currency": "USD",
          "availability": "InStock",
          "regularPrice": "48000"
        }
      ],
      "sku": "Vehicle sku",
      "mpn": "Vehicle model",
      "vehicleIdentificationNumber": "4T1BE32K25U056382",
      "mileageFromOdometer": {
        "value": 25000,
        "unitCode": "KMT"
      },
      "vehicleTransmission": "manual",
      "fuelType": "Petrol",
      "vehicleEngine": {
        "raw": "4.4L "
      },
      "availableAtOrFrom": {
        "raw": "New york"
      },
      "color": "black",
      "vehicleInteriorColor": "Silver",
      "numberOfDoors": 5,
      "vehicleSeatingCapacity": 6,
      "fuelEfficiency": [
        {
          "raw": "45 mpg (city)"
        }
      ],
      "brand": "vehicle brand",
      "breadcrumbs": [
        {
          "name": "Level 1",
          "link": "http://example.com"
        }
      ],
      "mainImage": "http://example.com/image.png",
      "images": [
        "http://example.com/image.png"
      ],
      "description": "vehicle description",
      "descriptionHtml": "<article>HTML description for Vehicle ...",
      "aggregateRating": {
        "ratingValue": 4.5,
        "bestRating": 5.0,
        "reviewCount": 31
      },
      "additionalProperty": [
        {
          "name": "property 1",
          "value": "value of property 1"
        }
      ],
      "probability": 0.95,
      "canonicalUrl": "https://example.com/vehicle/",
      "url": "https://example.com/vehicle"
    },
    "webPage": {
      "inLanguages": [
        {"code": "en"},
        {"code": "es"}
      ]
    },
    "query": {
      "id": "1564747029122-9e02a1868d70b7a2",
      "domain": "example.com",
      "userQuery": {
        "pageType": "vehicle",
        "url": "https://example.com/vehicle"
      }
    },
    "algorithmVersion": "20.8.1"
  }
]