Reference

Spiders

class zyte_spider_templates.BaseSpider(*args: Any, **kwargs: Any)

class zyte_spider_templates.EcommerceSpider(*args: Any, **kwargs: Any)

Yield products from an e-commerce website.

See EcommerceSpiderParams for supported parameters.

Pages

class zyte_spider_templates.pages.HeuristicsProductNavigationPage(request_url: RequestUrl, product_navigation: ProductNavigation, response: AnyResponse, page_params: PageParams)

Parameter mixins

pydantic model zyte_spider_templates.params.ExtractFromParam

field extract_from: ExtractFrom | None = None: Whether to perform extraction using a browser request (browserHtml) or an HTTP request (httpResponseBody).

enum zyte_spider_templates.params.ExtractFrom(value)

Member Type: str

Valid values are as follows:

httpResponseBody: str = <ExtractFrom.httpResponseBody: 'httpResponseBody'>: Use HTTP responses. Cost-efficient and fast extraction method, which works well on many websites.

browserHtml: str = <ExtractFrom.browserHtml: 'browserHtml'>: Use browser rendering. Often provides the best quality.
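Because the member type is str, each member compares equal to its string value and can be looked up from that value. The following is a minimal sketch of that behaviour, not the package's code: it defines a local stand-in enum with the two documented members so that it runs without the package installed, and `parse_extract_from` is a hypothetical helper introduced only for illustration.

```python
from enum import Enum


class ExtractFrom(str, Enum):
    """Local stand-in mirroring the two documented members."""

    httpResponseBody = "httpResponseBody"
    browserHtml = "browserHtml"


def parse_extract_from(value):
    """Hypothetical helper: turn an optional string into an enum member.

    None is passed through, matching the documented field default of None.
    """
    if value is None:
        return None
    # Lookup by value; an unknown string raises ValueError.
    return ExtractFrom(value)


if __name__ == "__main__":
    print(parse_extract_from("browserHtml") is ExtractFrom.browserHtml)
    print(ExtractFrom.httpResponseBody == "httpResponseBody")
    print(parse_extract_from(None))
```

Passing a value outside the two documented members raises ValueError in this sketch; how the real pydantic model reports such input is not covered here.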

pydantic model zyte_spider_templates.params.GeolocationParam

field geolocation: Geolocation | None = None: Two-character ISO 3166-1 alpha-2 country code, as specified in https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/geolocation.

enum zyte_spider_templates.params.Geolocation(value)

Member Type: str

Valid values are as follows:

AF: str = <Geolocation.AF: 'AF'>

AL: str = <Geolocation.AL: 'AL'>

DZ: str = <Geolocation.DZ: 'DZ'>

AS: str = <Geolocation.AS: 'AS'>

AD: str = <Geolocation.AD: 'AD'>

AO: str = <Geolocation.AO: 'AO'>

AI: str = <Geolocation.AI: 'AI'>

AQ: str = <Geolocation.AQ: 'AQ'>

AG: str = <Geolocation.AG: 'AG'>

AR: str = <Geolocation.AR: 'AR'>

AM: str = <Geolocation.AM: 'AM'>

AW: str = <Geolocation.AW: 'AW'>

AU: str = <Geolocation.AU: 'AU'>

AT: str = <Geolocation.AT: 'AT'>

AZ: str = <Geolocation.AZ: 'AZ'>

BS: str = <Geolocation.BS: 'BS'>

BH: str = <Geolocation.BH: 'BH'>

BD: str = <Geolocation.BD: 'BD'>

BB: str = <Geolocation.BB: 'BB'>

BY: str = <Geolocation.BY: 'BY'>

BE: str = <Geolocation.BE: 'BE'>

BZ: str = <Geolocation.BZ: 'BZ'>

BJ: str = <Geolocation.BJ: 'BJ'>

BM: str = <Geolocation.BM: 'BM'>

BT: str = <Geolocation.BT: 'BT'>

BO: str = <Geolocation.BO: 'BO'>

BQ: str = <Geolocation.BQ: 'BQ'>

BA: str = <Geolocation.BA: 'BA'>

BW: str = <Geolocation.BW: 'BW'>

BV: str = <Geolocation.BV: 'BV'>

BR: str = <Geolocation.BR: 'BR'>

IO: str = <Geolocation.IO: 'IO'>

BN: str = <Geolocation.BN: 'BN'>

BG: str = <Geolocation.BG: 'BG'>

BF: str = <Geolocation.BF: 'BF'>

BI: str = <Geolocation.BI: 'BI'>

CV: str = <Geolocation.CV: 'CV'>

KH: str = <Geolocation.KH: 'KH'>

CM: str = <Geolocation.CM: 'CM'>

CA: str = <Geolocation.CA: 'CA'>

KY: str = <Geolocation.KY: 'KY'>

CF: str = <Geolocation.CF: 'CF'>

TD: str = <Geolocation.TD: 'TD'>

CL: str = <Geolocation.CL: 'CL'>

CN: str = <Geolocation.CN: 'CN'>

CX: str = <Geolocation.CX: 'CX'>

CC: str = <Geolocation.CC: 'CC'>

CO: str = <Geolocation.CO: 'CO'>

KM: str = <Geolocation.KM: 'KM'>

CG: str = <Geolocation.CG: 'CG'>

CD: str = <Geolocation.CD: 'CD'>

CK: str = <Geolocation.CK: 'CK'>

CR: str = <Geolocation.CR: 'CR'>

HR: str = <Geolocation.HR: 'HR'>

CU: str = <Geolocation.CU: 'CU'>

CW: str = <Geolocation.CW: 'CW'>

CY: str = <Geolocation.CY: 'CY'>

CZ: str = <Geolocation.CZ: 'CZ'>

CI: str = <Geolocation.CI: 'CI'>

DK: str = <Geolocation.DK: 'DK'>

DJ: str = <Geolocation.DJ: 'DJ'>

DM: str = <Geolocation.DM: 'DM'>

DO: str = <Geolocation.DO: 'DO'>

EC: str = <Geolocation.EC: 'EC'>

EG: str = <Geolocation.EG: 'EG'>

SV: str = <Geolocation.SV: 'SV'>

GQ: str = <Geolocation.GQ: 'GQ'>

ER: str = <Geolocation.ER: 'ER'>

EE: str = <Geolocation.EE: 'EE'>

SZ: str = <Geolocation.SZ: 'SZ'>

ET: str = <Geolocation.ET: 'ET'>

FK: str = <Geolocation.FK: 'FK'>

FO: str = <Geolocation.FO: 'FO'>

FJ: str = <Geolocation.FJ: 'FJ'>

FI: str = <Geolocation.FI: 'FI'>

FR: str = <Geolocation.FR: 'FR'>

GF: str = <Geolocation.GF: 'GF'>

PF: str = <Geolocation.PF: 'PF'>

TF: str = <Geolocation.TF: 'TF'>

GA: str = <Geolocation.GA: 'GA'>

GM: str = <Geolocation.GM: 'GM'>

GE: str = <Geolocation.GE: 'GE'>

DE: str = <Geolocation.DE: 'DE'>

GH: str = <Geolocation.GH: 'GH'>

GI: str = <Geolocation.GI: 'GI'>

GR: str = <Geolocation.GR: 'GR'>

GL: str = <Geolocation.GL: 'GL'>

GD: str = <Geolocation.GD: 'GD'>

GP: str = <Geolocation.GP: 'GP'>

GU: str = <Geolocation.GU: 'GU'>

GT: str = <Geolocation.GT: 'GT'>

GG: str = <Geolocation.GG: 'GG'>

GN: str = <Geolocation.GN: 'GN'>

GW: str = <Geolocation.GW: 'GW'>

GY: str = <Geolocation.GY: 'GY'>

HT: str = <Geolocation.HT: 'HT'>

HM: str = <Geolocation.HM: 'HM'>

VA: str = <Geolocation.VA: 'VA'>

HN: str = <Geolocation.HN: 'HN'>

HK: str = <Geolocation.HK: 'HK'>

HU: str = <Geolocation.HU: 'HU'>

IS: str = <Geolocation.IS: 'IS'>

IN: str = <Geolocation.IN: 'IN'>

ID: str = <Geolocation.ID: 'ID'>

IR: str = <Geolocation.IR: 'IR'>

IQ: str = <Geolocation.IQ: 'IQ'>

IE: str = <Geolocation.IE: 'IE'>

IM: str = <Geolocation.IM: 'IM'>

IL: str = <Geolocation.IL: 'IL'>

IT: str = <Geolocation.IT: 'IT'>

JM: str = <Geolocation.JM: 'JM'>

JP: str = <Geolocation.JP: 'JP'>

JE: str = <Geolocation.JE: 'JE'>

JO: str = <Geolocation.JO: 'JO'>

KZ: str = <Geolocation.KZ: 'KZ'>

KE: str = <Geolocation.KE: 'KE'>

KI: str = <Geolocation.KI: 'KI'>

KP: str = <Geolocation.KP: 'KP'>

KR: str = <Geolocation.KR: 'KR'>

KW: str = <Geolocation.KW: 'KW'>

KG: str = <Geolocation.KG: 'KG'>

LA: str = <Geolocation.LA: 'LA'>

LV: str = <Geolocation.LV: 'LV'>

LB: str = <Geolocation.LB: 'LB'>

LS: str = <Geolocation.LS: 'LS'>

LR: str = <Geolocation.LR: 'LR'>

LY: str = <Geolocation.LY: 'LY'>

LI: str = <Geolocation.LI: 'LI'>

LT: str = <Geolocation.LT: 'LT'>

LU: str = <Geolocation.LU: 'LU'>

MO: str = <Geolocation.MO: 'MO'>

MG: str = <Geolocation.MG: 'MG'>

MW: str = <Geolocation.MW: 'MW'>

MY: str = <Geolocation.MY: 'MY'>

MV: str = <Geolocation.MV: 'MV'>

ML: str = <Geolocation.ML: 'ML'>

MT: str = <Geolocation.MT: 'MT'>

MH: str = <Geolocation.MH: 'MH'>

MQ: str = <Geolocation.MQ: 'MQ'>

MR: str = <Geolocation.MR: 'MR'>

MU: str = <Geolocation.MU: 'MU'>

YT: str = <Geolocation.YT: 'YT'>

MX: str = <Geolocation.MX: 'MX'>

FM: str = <Geolocation.FM: 'FM'>

MD: str = <Geolocation.MD: 'MD'>

MC: str = <Geolocation.MC: 'MC'>

MN: str = <Geolocation.MN: 'MN'>

ME: str = <Geolocation.ME: 'ME'>

MS: str = <Geolocation.MS: 'MS'>

MA: str = <Geolocation.MA: 'MA'>

MZ: str = <Geolocation.MZ: 'MZ'>

MM: str = <Geolocation.MM: 'MM'>

NA: str = <Geolocation.NA: 'NA'>

NR: str = <Geolocation.NR: 'NR'>

NP: str = <Geolocation.NP: 'NP'>

NL: str = <Geolocation.NL: 'NL'>

NC: str = <Geolocation.NC: 'NC'>

NZ: str = <Geolocation.NZ: 'NZ'>

NI: str = <Geolocation.NI: 'NI'>

NE: str = <Geolocation.NE: 'NE'>

NG: str = <Geolocation.NG: 'NG'>

NU: str = <Geolocation.NU: 'NU'>

NF: str = <Geolocation.NF: 'NF'>

MK: str = <Geolocation.MK: 'MK'>

MP: str = <Geolocation.MP: 'MP'>

NO: str = <Geolocation.NO: 'NO'>

OM: str = <Geolocation.OM: 'OM'>

PK: str = <Geolocation.PK: 'PK'>

PW: str = <Geolocation.PW: 'PW'>

PS: str = <Geolocation.PS: 'PS'>

PA: str = <Geolocation.PA: 'PA'>

PG: str = <Geolocation.PG: 'PG'>

PY: str = <Geolocation.PY: 'PY'>

PE: str = <Geolocation.PE: 'PE'>

PH: str = <Geolocation.PH: 'PH'>

PN: str = <Geolocation.PN: 'PN'>

PL: str = <Geolocation.PL: 'PL'>

PT: str = <Geolocation.PT: 'PT'>

PR: str = <Geolocation.PR: 'PR'>

QA: str = <Geolocation.QA: 'QA'>

RO: str = <Geolocation.RO: 'RO'>

RU: str = <Geolocation.RU: 'RU'>

RW: str = <Geolocation.RW: 'RW'>

RE: str = <Geolocation.RE: 'RE'>

BL: str = <Geolocation.BL: 'BL'>

SH: str = <Geolocation.SH: 'SH'>

KN: str = <Geolocation.KN: 'KN'>

LC: str = <Geolocation.LC: 'LC'>

MF: str = <Geolocation.MF: 'MF'>

PM: str = <Geolocation.PM: 'PM'>

VC: str = <Geolocation.VC: 'VC'>

WS: str = <Geolocation.WS: 'WS'>

SM: str = <Geolocation.SM: 'SM'>

ST: str = <Geolocation.ST: 'ST'>

SA: str = <Geolocation.SA: 'SA'>

SN: str = <Geolocation.SN: 'SN'>

RS: str = <Geolocation.RS: 'RS'>

SC: str = <Geolocation.SC: 'SC'>

SL: str = <Geolocation.SL: 'SL'>

SG: str = <Geolocation.SG: 'SG'>

SX: str = <Geolocation.SX: 'SX'>

SK: str = <Geolocation.SK: 'SK'>

SI: str = <Geolocation.SI: 'SI'>

SB: str = <Geolocation.SB: 'SB'>

SO: str = <Geolocation.SO: 'SO'>

ZA: str = <Geolocation.ZA: 'ZA'>

GS: str = <Geolocation.GS: 'GS'>

SS: str = <Geolocation.SS: 'SS'>

ES: str = <Geolocation.ES: 'ES'>

LK: str = <Geolocation.LK: 'LK'>

SD: str = <Geolocation.SD: 'SD'>

SR: str = <Geolocation.SR: 'SR'>

SJ: str = <Geolocation.SJ: 'SJ'>

SE: str = <Geolocation.SE: 'SE'>

CH: str = <Geolocation.CH: 'CH'>

SY: str = <Geolocation.SY: 'SY'>

TW: str = <Geolocation.TW: 'TW'>

TJ: str = <Geolocation.TJ: 'TJ'>

TZ: str = <Geolocation.TZ: 'TZ'>

TH: str = <Geolocation.TH: 'TH'>

TL: str = <Geolocation.TL: 'TL'>

TG: str = <Geolocation.TG: 'TG'>

TK: str = <Geolocation.TK: 'TK'>

TO: str = <Geolocation.TO: 'TO'>

TT: str = <Geolocation.TT: 'TT'>

TN: str = <Geolocation.TN: 'TN'>

TM: str = <Geolocation.TM: 'TM'>

TC: str = <Geolocation.TC: 'TC'>

TV: str = <Geolocation.TV: 'TV'>

TR: str = <Geolocation.TR: 'TR'>

UG: str = <Geolocation.UG: 'UG'>

UA: str = <Geolocation.UA: 'UA'>

AE: str = <Geolocation.AE: 'AE'>

GB: str = <Geolocation.GB: 'GB'>

US: str = <Geolocation.US: 'US'>

UM: str = <Geolocation.UM: 'UM'>

UY: str = <Geolocation.UY: 'UY'>

UZ: str = <Geolocation.UZ: 'UZ'>

VU: str = <Geolocation.VU: 'VU'>

VE: str = <Geolocation.VE: 'VE'>

VN: str = <Geolocation.VN: 'VN'>

VG: str = <Geolocation.VG: 'VG'>

VI: str = <Geolocation.VI: 'VI'>

WF: str = <Geolocation.WF: 'WF'>

EH: str = <Geolocation.EH: 'EH'>

YE: str = <Geolocation.YE: 'YE'>

ZM: str = <Geolocation.ZM: 'ZM'>

ZW: str = <Geolocation.ZW: 'ZW'>

AX: str = <Geolocation.AX: 'AX'>

pydantic model zyte_spider_templates.params.MaxRequestsParam

field max_requests: int | None = 100

The maximum number of Zyte API requests allowed for the crawl.

Requests whose error responses cannot be retried, or that exceed their retry limit, also count toward this limit, but they incur no costs and do not increase the request count in Scrapy Cloud.

pydantic model zyte_spider_templates.params.UrlParam

field url: str = ''

Initial URL for the crawl. Enter the full URL, including http(s); you can copy and paste it from your browser. Example: https://toscrape.com/

Constraints:

pattern = ^https?://[^:/\s]+(:\d{1,5})?(/[^\s]*)*(#[^\s]*)?$
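To see what the constraint accepts, the pattern can be tried directly with Python's `re` module. This is a sketch under one assumption: that the character classes in the pattern are the usual regex escapes `\s` (whitespace) and `\d` (digit). `is_valid_start_url` is a hypothetical helper name, not part of the package.

```python
import re

# Assumption: the escapes in the documented pattern are \s and \d.
URL_PATTERN = re.compile(r"^https?://[^:/\s]+(:\d{1,5})?(/[^\s]*)*(#[^\s]*)?$")


def is_valid_start_url(url: str) -> bool:
    """Hypothetical helper: True if url matches the documented constraint."""
    return URL_PATTERN.match(url) is not None


if __name__ == "__main__":
    candidates = [
        "https://toscrape.com/",                      # scheme, host, path
        "http://localhost:8000/products?page=2#top",  # port, query, fragment
        "toscrape.com",                               # missing scheme
        "ftp://toscrape.com",                         # unsupported scheme
        "https://example.com/a b",                    # whitespace in path
    ]
    for candidate in candidates:
        print(candidate, is_valid_start_url(candidate))
```

Under that assumption the first two candidates match and the last three do not: the scheme must be http or https, and no whitespace is allowed anywhere in the URL.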

pydantic model zyte_spider_templates.spiders.ecommerce.EcommerceCrawlStrategyParam

field crawl_strategy: EcommerceCrawlStrategy = EcommerceCrawlStrategy.full: Determines how the start URL and follow-up URLs are crawled.

enum zyte_spider_templates.spiders.ecommerce.EcommerceCrawlStrategy(value)

Member Type: str

Valid values are as follows:

full: str = <EcommerceCrawlStrategy.full: 'full'>: Follow most links within the domain of the URL in an attempt to discover and extract as many products as possible.

navigation: str = <EcommerceCrawlStrategy.navigation: 'navigation'>

Follow pagination, subcategories, and product detail pages.

Pagination Only is a better choice if the target URL does not have subcategories, or if Zyte API is misidentifying some URLs as subcategories.

pagination_only: str = <EcommerceCrawlStrategy.pagination_only: 'pagination_only'>: Follow pagination and product detail pages. Subcategory links are ignored.
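The difference between the three strategies can be summarized as a lookup from strategy to the kinds of links it follows. The sketch below is an illustration of the descriptions above, not the spider's implementation: the enum is a local stand-in, the link categories ("pagination", "subcategory", "product", "other") and the `should_follow` helper are hypothetical names, and full is simplified to "every in-domain link" even though the description says only "most links".

```python
from enum import Enum


class EcommerceCrawlStrategy(str, Enum):
    """Local stand-in mirroring the three documented members."""

    full = "full"
    navigation = "navigation"
    pagination_only = "pagination_only"


# Hypothetical link categories, named here only for illustration.
FOLLOWED_LINK_KINDS = {
    # Simplification: "most links" is modelled as all in-domain links.
    EcommerceCrawlStrategy.full: {"pagination", "subcategory", "product", "other"},
    EcommerceCrawlStrategy.navigation: {"pagination", "subcategory", "product"},
    # Subcategory links are ignored.
    EcommerceCrawlStrategy.pagination_only: {"pagination", "product"},
}


def should_follow(strategy, link_kind: str) -> bool:
    """Hypothetical helper: would this strategy follow this kind of link?

    Accepts either an enum member or its string value.
    """
    return link_kind in FOLLOWED_LINK_KINDS[EcommerceCrawlStrategy(strategy)]


if __name__ == "__main__":
    for strategy in EcommerceCrawlStrategy:
        print(strategy.value, should_follow(strategy, "subcategory"))
```

In this model, switching from navigation to pagination_only changes exactly one thing: subcategory links stop being followed, which is the case the note above recommends when subcategories are absent or misidentified.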