E-commerce spider template (ecommerce)

Basic use

scrapy crawl ecommerce -a url="https://books.toscrape.com"
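
The parameters documented below can be passed the same way, each as its own -a name=value argument. As a minimal sketch, the command line could be assembled in Python as follows; build_crawl_command is a hypothetical helper written for this illustration, not part of Scrapy or of the spider templates, and the value 50 is an arbitrary example.

```python
import shlex


def build_crawl_command(spider: str, **params: object) -> list[str]:
    """Build a `scrapy crawl` argument list (hypothetical helper, for illustration)."""
    command = ["scrapy", "crawl", spider]
    for name, value in params.items():
        if value is None:
            # Skip unset parameters so the spider falls back to its defaults.
            continue
        command += ["-a", f"{name}={value}"]
    return command


command = build_crawl_command(
    "ecommerce",
    url="https://books.toscrape.com",
    max_requests=50,  # arbitrary example value
    geolocation=None,  # unset, so it is left out of the command
)

# shlex.join quotes arguments where needed for a POSIX shell.
print(shlex.join(command))
```

The resulting list can also be handed to subprocess.run directly, which avoids shell quoting altogether.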


pydantic model zyte_spider_templates.spiders.ecommerce.EcommerceSpiderParams
field crawl_strategy: EcommerceCrawlStrategy = EcommerceCrawlStrategy.full

Determines how the start URL and follow-up URLs are crawled.

field extract_from: ExtractFrom | None = None

Whether to perform extraction using a browser request (browserHtml) or an HTTP request (httpResponseBody).

field geolocation: Geolocation | None = None

Two-character ISO 3166-1 alpha-2 country code, as specified in https://docs.zyte.com/zyte-api/usage/reference.html#operation/extract/request/geolocation.
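
As a rough illustration, a value can be checked for the alpha-2 shape before it is passed to the spider. This sketch assumes the conventional uppercase form of ISO 3166-1 alpha-2 codes; it only checks the shape and does not confirm that a code is assigned or that Zyte API supports it. looks_like_alpha2 is a hypothetical helper, not part of the library.

```python
import re

# Two uppercase ASCII letters: the shape of an ISO 3166-1 alpha-2 code.
_ALPHA2_SHAPE = re.compile(r"[A-Z]{2}")


def looks_like_alpha2(code: str) -> bool:
    """Return True if `code` has the alpha-2 shape (hypothetical helper)."""
    return _ALPHA2_SHAPE.fullmatch(code) is not None


for candidate in ("US", "us", "USA", "U1"):
    print(candidate, looks_like_alpha2(candidate))
```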

field max_requests: int | None = 100

The maximum number of Zyte API requests allowed for the crawl.

Requests with error responses that cannot be retried or that exceed their retry limit also count here, but they incur no costs and do not increase the request count in Scrapy Cloud.

field url: str [Required]

Initial URL for the crawl. Enter the full URL, including http(s); you can copy and paste it from your browser. Example: https://toscrape.com/

  • pattern = ^https?://[^:/\s]+(:\d{1,5})?(/[^\s]*)*(#[^\s]*)?$
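
To see which URLs this constraint accepts, the pattern can be tried out with Python's re module. The sketch below reads the character classes in the pattern as \s (whitespace) and \d (digit); is_valid_start_url is a hypothetical helper for illustration, and the spider itself performs this validation through its pydantic model.

```python
import re

# Pattern from the `url` field constraint, with \s and \d character classes.
URL_PATTERN = re.compile(r"^https?://[^:/\s]+(:\d{1,5})?(/[^\s]*)*(#[^\s]*)?$")


def is_valid_start_url(url: str) -> bool:
    """Return True if `url` matches the documented pattern (hypothetical helper)."""
    return URL_PATTERN.match(url) is not None


examples = [
    "https://books.toscrape.com",         # scheme and host only
    "https://toscrape.com/",              # trailing slash
    "https://example.com:8080/path#top",  # port, path, and fragment
    "books.toscrape.com",                 # missing http(s) scheme
    "ftp://example.com",                  # unsupported scheme
    "https://example.com/a b",            # whitespace is not allowed
]

for url in examples:
    print(f"{url!r}: {is_valid_start_url(url)}")
```

The first three URLs match and the last three do not, which is why the field description asks for the full URL including http(s).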