-
Hello, and thanks for your interest in Crawlee! We haven't looked into selenium-driverless yet, but at a glance it looks interesting. Could you explain how it differs from "regular" Selenium as far as scraping and crawling are concerned? By the way, a feature request for a Selenium-based crawler already exists (#284), but it hasn't gained much traction.
-
Hi,
First, thank you so much for your effort on this package; it's really great.
My question is: are there any plans to add new crawlers, such as one based on selenium-driverless? Recently I've been trying PlaywrightCrawler on many websites, and some of them detect it and respond with a Cloudflare block, whereas the same sites let selenium-driverless through easily.
For example, this site with the following code:
Expected behavior
The crawler should visit the homepage, extract the product links, then visit each product and extract its data. Very basic crawling.
Actual behavior
The homepage loads successfully, but all further requests are blocked by Cloudflare.
What I tried
Of course, adding a whole new crawler is a headache and far from simple, but I'd like to hear your opinion.
If you can suggest any flags I could pass to PlaywrightCrawler to make it harder to detect, I'd be very thankful too.
Thanks in advance.