Introduction
Jeremy Chou edited this page Mar 6, 2022
Redis-based components for Scrapy.
- Free software: MIT license
- Documentation: https://scrapy-redis.readthedocs.org.
- Python versions: 2.7, 3.4+
-
Distributed crawling/scraping
- You can start multiple spider instances that share a single Redis queue. Best suited for broad multi-domain crawls.
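The shared queue is enabled through Scrapy settings. A minimal sketch, using the scheduler and duplicate-filter class paths from the scrapy-redis documentation (the `REDIS_URL` value assumes a local Redis instance):

```python
# settings.py -- every spider instance started with these settings pulls
# requests from, and pushes requests to, the same Redis-backed queue.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"              # Redis-backed scheduler
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"  # dedup shared across instances
SCHEDULER_PERSIST = True                                    # keep the queue between runs
REDIS_URL = "redis://localhost:6379"                        # assumed local Redis server
```

Because the queue and the duplicate filter both live in Redis, adding capacity is just starting another identical spider process.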
-
Distributed post-processing
- Scraped items get pushed into a Redis queue, meaning you can start as many post-processing processes as needed, all sharing the items queue.
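A post-processing worker only needs to pop serialized items off that shared list and decode them. A minimal sketch, assuming items were serialized as JSON (as the scrapy-redis item pipeline does) into a `myspider:items` list, where `myspider` is an assumed spider name:

```python
import json

def decode_item(payload: bytes) -> dict:
    """Decode one serialized item popped from the shared Redis items queue."""
    return json.loads(payload)

# A worker loop would block-pop from the list (sketch; needs the third-party
# `redis` package and a running Redis server, so it is shown as a comment):
#
#   import redis
#   r = redis.StrictRedis(host="localhost", port=6379)
#   while True:
#       _, payload = r.blpop("myspider:items")  # blocks until an item arrives
#       item = decode_item(payload)
#       ...  # your post-processing here
```

Since `BLPOP` hands each item to exactly one consumer, running more worker processes divides the work without any extra coordination.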
-
Scrapy plug-and-play components
- Scheduler + Duplication Filter, Item Pipeline, Base Spiders.
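These components coordinate through a small set of per-spider Redis keys. A sketch of the default key layout for a spider named `demo` (key patterns follow the scrapy-redis defaults; the spider name is an assumption):

```python
# Default Redis keys used by the plug-and-play components for one spider.
SPIDER = "demo"  # assumed spider name
KEYS = {
    "requests": f"{SPIDER}:requests",      # scheduler's shared request queue
    "dupefilter": f"{SPIDER}:dupefilter",  # set of seen-request fingerprints
    "items": f"{SPIDER}:items",            # list filled by the item pipeline
    "start_urls": f"{SPIDER}:start_urls",  # list read by the Redis base spiders
}
```

Pushing a URL onto `demo:start_urls` is how you feed work to an idle Redis-based spider.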
-
Requirements
- Python 2.7, 3.4 or 3.5
- Redis >= 2.8
- Scrapy >= 1.0
- redis-py >= 2.10