Home
CoderHXL edited this page Mar 25, 2023 · 5 revisions
x-crawl is a flexible Node.js crawler library. It can crawl pages, control pages, batch network requests, batch-download file resources, and crawl on a polling schedule. It supports both asynchronous and synchronous crawling. Because it runs on Node.js, it is flexible and simple to use, and friendly to JS/TS developers.
If you find x-crawl useful, you can give the repository a Star to support it; your Star motivates further updates.
- Supports both asynchronous and synchronous crawling.
- Flexible usage: request configuration can be written in several ways, and crawl results can be obtained in several ways.
- Flexible crawl interval: no interval, a fixed interval, or a random interval, so you decide whether to use or avoid high-concurrency crawling.
- With simple configuration it can crawl pages, batch network requests, batch-download file resources, crawl on a polling schedule, and more.
- Crawls SPAs (single-page applications) to generate pre-rendered content (i.e. "SSR", server-side rendering). The content can be parsed with the jsdom library, or you can parse it yourself.
- Supports form submissions, keystrokes, event actions, screenshots of generated pages, and more.
- Captures and records the success and failure of each crawl, and highlights the resulting reminders.
- Written in TypeScript, with type definitions and generics.
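
To make the interval and result-recording features above more concrete, here is a minimal, self-contained sketch of the idea. It does not use x-crawl and is not x-crawl's API: the names `IntervalTime`, `resolveInterval`, `TaskResult`, and `runBatch` are hypothetical and exist only for illustration. It assumes tasks run one after another, with an optional wait between them.

```typescript
// Hypothetical types and helpers for illustration only; NOT x-crawl's API.
type IntervalTime = number | { min: number; max: number }

// Resolve the wait (in ms) before the next task:
// undefined -> no interval, number -> fixed, { min, max } -> random in [min, max).
function resolveInterval(
  interval: IntervalTime | undefined,
  random: () => number = Math.random
): number {
  if (interval === undefined) return 0
  if (typeof interval === 'number') return interval
  return interval.min + random() * (interval.max - interval.min)
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms))

// One record per task, so successes and failures can both be reported.
interface TaskResult<T> {
  id: number
  isSuccess: boolean
  data: T | null
  error: unknown
}

// Run tasks sequentially, waiting between them and recording each outcome.
async function runBatch<T>(
  tasks: Array<() => Promise<T>>,
  interval?: IntervalTime
): Promise<TaskResult<T>[]> {
  const results: TaskResult<T>[] = []
  for (let i = 0; i < tasks.length; i++) {
    // No wait before the first task.
    if (i > 0) await sleep(resolveInterval(interval))
    try {
      const data = await tasks[i]()
      results.push({ id: i + 1, isSuccess: true, data, error: null })
    } catch (error) {
      // A failed task is recorded instead of aborting the whole batch.
      results.push({ id: i + 1, isSuccess: false, data: null, error })
    }
  }
  return results
}
```

In this sketch, calling `runBatch(tasks)` runs the tasks back to back, `runBatch(tasks, 2000)` waits two seconds between tasks, and `runBatch(tasks, { min: 1000, max: 3000 })` waits a random time in that range. Recording failures rather than throwing lets the caller report every outcome once the batch finishes.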