Zyte Data article list schema v1.0
Standard Article List Schema (1.0)
The standard article list schema used in Zyte's offering. It covers the typical set of attributes present in article listings published online.
Responses
Response Schema: application/json
url (required) | string (URL) | The main URL of the article list: the URL of the final response, after any redirects. If the page contains no article list data or could not be reached, the returned item still contains the "url" field, a "metadata" field with a timestamp in "dateDownloaded", and any other available data points.
canonicalUrl | string | The canonical form of the URL, as selected by the website.
articles | Array of objects | List of article details found on the page. The order of the articles reflects their position on the page.
breadcrumbs | Array of objects or objects (any of) | The list of breadcrumbs, each with a URL and an optional category name.
paginationNext | object | Details of the next page URL, if available.
pageNumber | integer | Current page number, if displayed explicitly on the list page. Numbering starts at 1.
metadata | object | Metadata about the data extraction process.
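The field rules above can be sketched in Python. This is a minimal illustration rather than an official client: the function name `summarize_item` and the keys of its return value are hypothetical, and the only schema facts it relies on are that "url" is required, that every other field may be absent, and that "metadata" carries a "dateDownloaded" timestamp.

```python
from datetime import datetime
from typing import Any


def summarize_item(item: dict[str, Any]) -> dict[str, Any]:
    """Summarize one article list item using only fields named in the schema.

    Hypothetical helper for illustration; not part of any Zyte library.
    """
    # "url" is the only required attribute, so treat its absence as an error.
    if "url" not in item:
        raise ValueError('item is missing the required "url" field')

    # Every other field is optional; fall back to empty or None values.
    articles = item.get("articles") or []
    downloaded_raw = (item.get("metadata") or {}).get("dateDownloaded")
    downloaded = (
        # Swap the trailing "Z" for an explicit offset so that
        # datetime.fromisoformat also accepts it on Python < 3.11.
        datetime.fromisoformat(downloaded_raw.replace("Z", "+00:00"))
        if downloaded_raw
        else None
    )
    return {
        "url": item["url"],
        "article_count": len(articles),
        "has_article_data": bool(articles),
        "page_number": item.get("pageNumber"),
        "date_downloaded": downloaded,
    }
```

An item returned for a page without article list data would then yield `has_article_data` set to `False` while still exposing the URL and the download timestamp.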
Response samples
- 200
{
  "articles": [
    {
      "articleBody": "We held the 2022 Web Data Extraction Summit three weeks ago. I wanted to extend a huge thank you to everyone who came, especially our guest speakers, who shared some great insights throughout the day.",
      "authors": [
        {
          "name": "Honorable Zytan",
          "nameRaw": "Honorable Zytan"
        }
      ],
      "datePublished": "2022-10-25T00:00:00Z",
      "datePublishedRaw": "October 25, 2022",
      "headline": "Reflecting on the 2022 Web Data Extraction Summit | Zyte",
      "images": [ ],
      "inLanguage": "en",
      "mainImage": { },
      "probability": 0.8437236322711676
    },
    {
      "articleBody": "We are thrilled with the success of this year’s Extract Summit that took place last week in London, UK. Finally, we were able to go back to the original in-person format.",
      "authors": [
        {
          "name": "Inquisitive Zytan",
          "nameRaw": "Inquisitive Zytan"
        }
      ],
      "datePublished": "2022-10-05T00:00:00Z",
      "datePublishedRaw": "October 5, 2022",
      "headline": "6 Key Takeaways from Extract Summit 2022",
      "images": [ ],
      "inLanguage": "en",
      "mainImage": { },
      "probability": 0.7396970292225937
    },
    {
      "articleBody": "We couldn’t be more excited with the fourth edition of the Web Data Extraction Summit being just two weeks away. After two straight years of going virtual, our upcoming edition will provide the opportunity to meet in person again. And this year it’s going to be in London!",
      "authors": [
        {
          "name": "Contemporary Zytan",
          "nameRaw": "Contemporary Zytan"
        }
      ],
      "datePublished": "2022-09-13T00:00:00Z",
      "datePublishedRaw": "September 13, 2022",
      "headline": "Web Data Extraction Summit 2022",
      "images": [ ],
      "inLanguage": "en",
      "mainImage": { },
      "probability": 0.301
    },
    {
      "articleBody": "We are delighted to announce that Extract Summit 2022 will be returning to an in-person format after two years of being virtual. This time, it’s going to be in London!",
      "authors": [
        {
          "name": "Irreplaceable Zytan",
          "nameRaw": "Irreplaceable Zytan"
        }
      ],
      "datePublished": "2022-06-14T00:00:00Z",
      "datePublishedRaw": "June 14, 2022",
      "headline": "5 Reasons to Attend Extract Summit 2022",
      "inLanguage": "en",
      "probability": 0.5705504884376111
    },
    {
      "articleBody": "It’s a wrap! Last week, for the third time, Extract Summit brought together web data experts and enthusiasts to learn, share and inspire. Sessions, workshops, panels, contests – this year’s summit had so much to offer, I don’t even know where to start.",
      "authors": [
        {
          "name": "Persistent Zytan",
          "nameRaw": "Persistent Zytan"
        }
      ],
      "datePublished": "2021-10-12T00:00:00Z",
      "datePublishedRaw": "October 12, 2021",
      "headline": "Extract Summit 2021: Highlights and key takeaways",
      "images": [ ],
      "inLanguage": "en",
      "mainImage": { },
      "probability": 0.9417928575864742
    },
    {
      "articleBody": "The year so far has been quite interesting for the Web Data Extraction industry. As more and more companies are realizing the power of web extracted data, the demand for web data is rising quickly. This exponential growth and need for web data have driven many innovations leading to the development of new tools and techniques that have revolutionized web data extraction making it faster, more accurate, and more reliable.",
      "authors": [
        {
          "name": "Vigorous Zytan",
          "nameRaw": "Vigorous Zytan"
        }
      ],
      "datePublished": "2021-08-19T00:00:00Z",
      "datePublishedRaw": "August 19, 2021",
      "headline": "Web Data Extraction Summit 2021",
      "inLanguage": "en",
      "probability": 0.7128533149281218
    },
    {
      "articleBody": "We are delighted to announce that Extract Summit 2021 will be a virtual event again in 2021 brought to you by Zyte (formerly Scrapinghub). We had to make a quick decision last year and turn the event into a virtual one - the good news was it allowed so many of our community to join us from all over the world. We had 3,000 web data extraction experts sign up.",
      "authors": [
        {
          "name": "Snappy Zytan",
          "nameRaw": "Snappy Zytan"
        }
      ],
      "datePublished": "2021-05-18T00:00:00Z",
      "datePublishedRaw": "May 18, 2021",
      "headline": "Extract Summit 2021 - come join the web data revolution",
      "inLanguage": "en",
      "probability": 0.9161820802236562
    },
    {
      "articleBody": "Web data extraction has become one of the most important tools for businesses to grow and stay ahead of the competition. From developing better pricing strategies to identifying hidden risks and building better products, web data extraction provides the power to transform infinite web data into a structured format that can help you make profitable decisions.",
      "authors": [
        {
          "name": "Thriving Zytan",
          "nameRaw": "Thriving Zytan"
        }
      ],
      "datePublished": "2020-12-30T00:00:00Z",
      "datePublishedRaw": "December 30, 2020",
      "headline": "Announcing The Web Data Extraction Summit 2020",
      "images": [ ],
      "inLanguage": "en",
      "mainImage": { },
      "probability": 0.7354702084642959
    },
    {
      "articleBody": "Web data extraction has become one of the most important tools for businesses to grow and stay ahead of the competition. From developing better pricing strategies to identifying hidden risks and building better products, web data extraction provides the power to transform infinite web data into a structured format that can help you make profitable decisions.",
      "authors": [
        {
          "name": "Amiable Zytan",
          "nameRaw": "Amiable Zytan"
        }
      ],
      "datePublished": "2020-09-24T00:00:00Z",
      "datePublishedRaw": "September 24, 2020",
      "headline": "Announcing the Web Data Extraction Summit 2020",
      "inLanguage": "en",
      "probability": 0.6197986051804278
    },
    {
      "articleBody": "In our article last week, we answered some of the best questions we got during Extract Summit. In today’s post, we share with you the second part of this series. We are covering questions on web scraping infrastructure and how machine learning can be used in web scraping.",
      "authors": [
        {
          "name": "Helpful Zytan",
          "nameRaw": "Helpful Zytan"
        }
      ],
      "datePublished": "2019-10-17T00:00:00Z",
      "datePublishedRaw": "October 17, 2019",
      "headline": "Web scraping questions & answers part II",
      "images": [ ],
      "inLanguage": "en",
      "mainImage": { },
      "probability": 0.7749464166304918
    }
  ],
  "breadcrumbs": [ ],
  "paginationNext": {
    "text": "Next »"
  },
  "pageNumber": 1,
  "metadata": {
    "dateDownloaded": "2022-12-31T13:01:54Z"
  }
}
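A response shaped like the sample above might be post-processed as in the following Python sketch. It assumes that "probability" expresses extraction confidence for each article, which the schema text does not spell out, and the helper name `confident_articles` and the default threshold of 0.5 are hypothetical choices for illustration only.

```python
from datetime import datetime
from typing import Any, Optional


def confident_articles(
    item: dict[str, Any], threshold: float = 0.5
) -> list[tuple[Optional[str], Optional[datetime]]]:
    """Return (headline, published) pairs for articles at or above threshold.

    Hypothetical helper; the threshold is an assumption, not a documented value.
    """
    selected = []
    for article in item.get("articles", []):
        # Skip articles whose probability falls below the chosen threshold;
        # a missing probability is treated as 0.0.
        if article.get("probability", 0.0) < threshold:
            continue
        raw = article.get("datePublished")
        # "datePublished" is an ISO 8601 timestamp ending in "Z" in the sample;
        # convert the suffix so fromisoformat accepts it on Python < 3.11.
        published = (
            datetime.fromisoformat(raw.replace("Z", "+00:00")) if raw else None
        )
        selected.append((article.get("headline"), published))
    return selected
```

Applied to the sample with the default threshold, this would keep nine of the ten articles and drop "Web Data Extraction Summit 2022", whose probability is 0.301.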