Warning
Zyte Automatic Extraction will be discontinued starting April 30th, 2024. It is replaced by Zyte API. See Migrating from Automatic Extraction to Zyte API.
Comment Extraction
Comment extraction supports pages with comments, usually under a blog post or a news article. All comments on a page are returned, including fields such as the comment text and its publication date.
This supports use cases such as news and media monitoring, analytics, brand monitoring, mentions, sentiment analysis, and many others.
A related page type is Forum Post Extraction, which supports extraction of posts made under a single topic on a forum.
Request example
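As a rough illustration, a comment-extraction query could be assembled and its result unpacked along these lines. This is a minimal Python sketch, not the service's documented client: the JSON-array request body and the pageType value "comments" are assumptions that this page does not state, the helper names build_comments_query and comments_from_results are hypothetical, and no endpoint or authentication is shown.

```python
import json
from typing import Any


def build_comments_query(page_url: str) -> str:
    """Serialize a single-query request body for comment extraction."""
    # Assumption: a request is a JSON array of queries, each naming the page
    # URL and a pageType of "comments"; neither detail is stated on this page.
    return json.dumps([{"url": page_url, "pageType": "comments"}])


def comments_from_results(results: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Collect the `comments` field of every query result that has one."""
    # A result without a `comments` key (for example, a failed extraction)
    # is skipped rather than treated as an error.
    return [result["comments"] for result in results if "comments" in result]
```

Sending the body and decoding the JSON reply are deliberately left out; only the comments key name is taken from this page.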
If you requested a comment extraction and the extraction succeeds, the comments field will be available in the query result.
Available fields
Top-level
The following fields are available for comments:
url: string. URL of the page from which the comments were extracted.
comments: list of dictionaries. List of comments; fields are described below.
The url field is required.
Individual comments
Each comment inside the comments field has the following fields available:
text: string. Text of the comment.
datePublished: string. Comment date, ISO-formatted with a ‘T’ separator; may contain a timezone.
datePublishedRaw: string. Same as datePublished, but before parsing/normalization, i.e. as it appeared on the site.
upvoteCount: integer. Number of upvotes received by the comment.
downvoteCount: integer. Number of downvotes received by the comment.
probability: float. Probability that this is a comment.
Comments refer to an article or blog post available on the same page.
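The per-comment fields listed above could be consumed along these lines. This is a minimal Python sketch under stated assumptions: the function names, the 0.5 probability threshold, and the sample values are all hypothetical and are not taken from a real response. Every field except probability is read with a fallback, since only probability is guaranteed to be present.

```python
from datetime import datetime
from typing import Any, Optional


def parse_date(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO-formatted datePublished value, or return None if absent."""
    if not value:
        return None
    # fromisoformat accepts the 'T' separator and an optional UTC offset
    # such as "+00:00"; a trailing "Z" is only accepted on Python 3.11+.
    return datetime.fromisoformat(value)


def confident_comments(
    comments_field: dict[str, Any], threshold: float = 0.5
) -> list[dict[str, Any]]:
    """Keep comments whose probability is at least `threshold`."""
    kept = []
    for comment in comments_field.get("comments", []):
        if comment["probability"] < threshold:
            continue
        kept.append(
            {
                "text": comment.get("text"),
                "published": parse_date(comment.get("datePublished")),
                # Missing counts stay None: absent is not the same as zero.
                "upvotes": comment.get("upvoteCount"),
                "downvotes": comment.get("downvoteCount"),
            }
        )
    return kept


# Hypothetical sample data, made up for illustration only.
sample = {
    "url": "https://example.com/blog/post",
    "comments": [
        {
            "text": "Great article!",
            "datePublished": "2020-01-30T12:34:56+00:00",
            "datePublishedRaw": "Jan 30, 2020",
            "upvoteCount": 3,
            "probability": 0.95,
        },
        {"text": "Subscribe to our newsletter", "probability": 0.12},
    ],
}

for item in confident_comments(sample):
    print(item["text"], item["published"], item["upvotes"])
```

The threshold is a tuning choice for the consumer; the service itself only reports the probability.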
All fields are optional, except for probability. Fields without a valid value (null or empty array) are excluded from extraction results.
Response example
Below is an example response with all comment fields present: