Real Estate Extraction (beta)
Request example
If you request a real estate extraction and the extraction succeeds, the realEstate field will be available in the query result:
from autoextract.sync import request_raw

query = [{
    'url': 'http://example.com/example-real-estate-page',
    'pageType': 'realEstate'
}]
results = request_raw(query, api_key='[api key]')
print(results[0]['realEstate'])
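Because the realEstate field is only present when the extraction succeeds, indexing it directly can raise a KeyError. The following is a minimal sketch of a more defensive access pattern. The helper name split_results and the sample data are hypothetical, and the sketch assumes that an unsuccessful result simply lacks the realEstate key.

```python
def split_results(results):
    """Separate query results into extracted real estates and the rest."""
    extracted, missing = [], []
    for result in results:
        # Assumption: the key is absent when nothing was extracted.
        real_estate = result.get('realEstate')
        if real_estate is None:
            missing.append(result)
        else:
            extracted.append(real_estate)
    return extracted, missing


# Stand-in for the list returned by request_raw(); the values are made up.
sample_results = [
    {'realEstate': {'url': 'http://example.com/example-real-estate-page',
                    'probability': 0.95}},
    {'query': {'userQuery': {'url': 'http://example.com/other-page'}}},
]
extracted, missing = split_results(sample_results)
print(len(extracted), len(missing))
```

In real use, the list returned by request_raw would be passed in place of sample_results.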
Available fields
The following fields are available for real estates:
name : string
    The name of the real estate.
datePublished : string
    Publication date. ISO-formatted with 'T' separator, may contain a timezone.
datePublishedRaw : string
    Same date as datePublished but before parsing, i.e. as it appears on the website.
description : string
    Description of the real estate.
mainImage : string
    A URL or data URL value of the main image of the real estate.
images : list of strings
    A list of URL or data URL values of all images of the real estate (may include the main image).
yearBuilt : number
    The year the real estate was constructed. Example: 2008.
breadcrumbs : list of dictionaries with optional name and link string fields
    A list of breadcrumbs (a specific navigation element) with optional name and URL. Example:
    [ {"name": "Foo", "link": "http://example.com/foo"}, {"name": "Bar", "link": "http://example.com/foo/bar"}, {"name": "Baz"} ]
additionalProperty : list of dictionaries with name and value fields
    A list of real estate properties or characteristics.
    name - field contains the property name
    value - field contains the property value
    Example:
    [ {"name": "location", "value": "Tivat"}, {"name": "region", "value": "Tivat and Lustica"}, {"name": "type", "value": "Apartments / Developments"}, {"name": "size", "value": "41m2"} ]
address : dictionary
    A structured postal address of the real estate. Fields:
    postalCode - postal code of the address
    streetAddress - street address
    addressCountry - country name or a two-letter ISO 3166-1 alpha-2 country code
    addressLocality - locality in which the street address is, and which is in the region
    addressRegion - region in which the locality is, and which is in the country
    raw - complete address information, as it appears on the website
    Example:
    { "postalCode": "77701", "streetAddress": "3214 Brookview Drive", "addressCountry": "US", "addressLocality": "Beaumont", "addressRegion": "Texas", "raw": "3214 Brookview Drive, Beaumont, Texas 77701, US" }
    All fields are optional.
area : dictionary
    A structured area of the real estate. Fields:
    value - a number with the area of the real estate
    unitCode - unit of the area. Allowed values:
        SQMT - square meter
        SQFT - square foot
        ACRE - acre
    raw - area in the raw format, as it appears on the website
    Example:
    { "value": 54.0, "unitCode": "SQMT", "raw": "54 m²" }
    Fields value and unitCode are optional.
numberOfBathroomsTotal : number
    The total number of bathrooms in the real estate.
numberOfFullBathrooms : number
    The number of full bathrooms in the real estate.
numberOfPartialBathrooms : number
    The number of half bathrooms in the real estate.
numberOfBedrooms : number
    The number of bedrooms in the real estate.
numberOfRooms : number
    The number of rooms (excluding bathrooms and closets) of the real estate.
identifier : string
    The identifier of the real estate.
tradeActions : list of dictionaries
    A list of structures describing possible trade actions that can be done on the real estate.
    Each dictionary in the list can have the following fields:
    tradeType - type of a trade action, a string. Allowed values:
        "BuyAction" - the real estate is for sale
        "RentAction" - the real estate is for rent
    price - a string with an offer price of the real estate
    currency - currency of the price, a string
    Example:
    [ { "tradeType": "RentAction", "price": "1700.0", "currency": "USD" } ]
    All fields are optional, but currency can be present only if price is also present.
probability : float
    Probability that the requested page is a single real estate page.
url : string
    URL of the page from which this real estate was extracted.
All fields are optional, except for url and probability.
Fields without a valid value (null or empty array) are excluded from extraction results.
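Since most fields are optional and may be missing from the result, code that consumes a realEstate dictionary has to tolerate absent keys. The sketch below shows one possible way to normalize three of the structured fields. All helper names are hypothetical, the unit factors are the standard square foot and acre definitions, and the sketch assumes that price strings are plain decimal numbers like the "1700.0" in the example above.

```python
from decimal import Decimal

# Conversion factors to square meters for the documented unitCode values.
SQUARE_METERS_PER_UNIT = {
    "SQMT": 1.0,
    "SQFT": 0.09290304,
    "ACRE": 4046.8564224,
}


def properties_to_dict(real_estate):
    """Flatten additionalProperty into a {name: value} mapping."""
    return {
        prop["name"]: prop.get("value")
        for prop in real_estate.get("additionalProperty", [])
        if "name" in prop
    }


def area_in_square_meters(real_estate):
    """Return the area in square meters, or None if it cannot be computed."""
    area = real_estate.get("area", {})
    factor = SQUARE_METERS_PER_UNIT.get(area.get("unitCode"))
    # value and unitCode are both optional; only raw may be present.
    if factor is None or "value" not in area:
        return None
    return area["value"] * factor


def prices_by_trade_type(real_estate):
    """Map each tradeType to (price as Decimal, currency or None)."""
    prices = {}
    for action in real_estate.get("tradeActions", []):
        # Assumption: price is a plain decimal string such as "1700.0".
        if "tradeType" in action and "price" in action:
            prices[action["tradeType"]] = (
                Decimal(action["price"]),
                action.get("currency"),
            )
    return prices
```

Each helper returns an empty or None value instead of raising when the corresponding field was excluded from the extraction result.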
Response example
Below is an example response with all real estate fields present:
[
    {
        "realEstate": {
            "name": "Real Estate name",
            "datePublished": "2020-06-18T00:00:00",
            "datePublishedRaw": "June 18, 2020",
            "description": "Real Estate description",
            "mainImage": "http://example.com/image.png",
            "images": [
                "http://example.com/image.png"
            ],
            "yearBuilt": 2018,
            "breadcrumbs": [
                {
                    "name": "Level 1",
                    "link": "http://example.com"
                }
            ],
            "additionalProperty": [
                {
                    "name": "property 1",
                    "value": "value of property 1"
                }
            ],
            "address": {
                "postalCode": "77701",
                "streetAddress": "3214 Brookview Drive",
                "addressCountry": "US",
                "addressLocality": "Beaumont",
                "addressRegion": "Texas",
                "raw": "3214 Brookview Drive, Beaumont, Texas 77701, US"
            },
            "area": {
                "value": 54.0,
                "unitCode": "SQMT",
                "raw": "54 m²"
            },
            "numberOfBathroomsTotal": 3,
            "numberOfFullBathrooms": 2,
            "numberOfPartialBathrooms": 1,
            "numberOfBedrooms": 2,
            "numberOfRooms": 3,
            "identifier": "XYZ",
            "tradeActions": [
                {
                    "tradeType": "RentAction",
                    "price": "1700.0",
                    "currency": "USD"
                }
            ],
            "probability": 0.95,
            "url": "http://example.com/example-real-estate-page"
        },
        "webPage": {
            "inLanguages": [
                {"code": "en"},
                {"code": "es"}
            ]
        },
        "query": {
            "id": "1564747029122-9e02a1868d70b7a3",
            "domain": "example.com",
            "userQuery": {
                "pageType": "realEstate",
                "url": "http://example.com/example-real-estate-page"
            }
        },
        "algorithmVersion": "20.8.1"
    }
]
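A response shaped like the example above could be reduced to a compact summary as sketched here. The function name summarize and the chosen summary keys are hypothetical. The sketch relies on datetime.fromisoformat from the Python standard library, which accepts the 'T'-separated format shown; note that Python versions before 3.11 do not accept a trailing "Z" timezone designator.

```python
import json
from datetime import datetime


def summarize(response_text):
    """Build one summary dict per result that contains a realEstate field."""
    summaries = []
    for result in json.loads(response_text):
        estate = result.get("realEstate")
        if estate is None:
            continue  # nothing was extracted for this query
        published = estate.get("datePublished")
        languages = result.get("webPage", {}).get("inLanguages", [])
        summaries.append({
            "url": estate["url"],  # url is always present
            "name": estate.get("name"),
            "published": datetime.fromisoformat(published) if published else None,
            "languages": [lang["code"] for lang in languages],
        })
    return summaries
```

Optional fields are read with .get so that a result with only url and probability still produces a summary.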