Home

x-crawl

x-crawl is a flexible Node.js crawler library. It can crawl pages, control pages, batch network requests, batch-download file resources, and crawl on a polling schedule. It supports both asynchronous and synchronous crawling. Because it runs on Node.js, it is flexible and simple to use, and friendly to JS/TS developers.

If you find x-crawl useful, you can give the x-crawl repository a Star to support it; your Star is the motivation for further updates.

Features

Supports crawling data asynchronously or synchronously.
Flexible usage: request configuration can be written in several ways, and crawl results can be obtained in several ways.
Flexible crawl interval: no interval, a fixed interval, or a random interval, so you decide whether to use or avoid highly concurrent crawling.
Simple configuration is enough to crawl pages, batch network requests, batch-download file resources, crawl on a polling schedule, and so on.
Crawls SPAs (single-page applications) to generate pre-rendered content (i.e. "SSR", server-side rendering), parses the content with the jsdom library, and also lets you parse it yourself.
Supports form submissions, keystrokes, event actions, screenshots of generated pages, and so on.
Captures and records crawl successes and failures, and highlights them in its notifications.
Written in TypeScript, with type definitions and generics.
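
To make the three interval modes above concrete, here is a minimal sketch of how "no interval", "fixed interval", and "random interval" could be modelled between sequential tasks. This is not x-crawl's API or its implementation: the names `IntervalSetting`, `resolveInterval`, `sleep`, and `runWithInterval` are hypothetical and exist only for illustration.

```typescript
// Hypothetical types and helpers for illustration only; not part of x-crawl.
// undefined = no interval, number = fixed interval (ms),
// { min, max } = random interval (ms) picked between the two bounds.
type IntervalSetting = undefined | number | { min: number; max: number };

// Resolve a setting to a concrete delay in milliseconds.
// `random` is injectable so the result can be checked deterministically.
function resolveInterval(
  setting: IntervalSetting,
  random: () => number = Math.random
): number {
  if (setting === undefined) return 0;
  if (typeof setting === "number") return setting;
  const { min, max } = setting;
  return Math.round(min + random() * (max - min));
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Run tasks one after another, waiting between tasks (not before the first).
async function runWithInterval<T>(
  tasks: Array<() => Promise<T>>,
  setting: IntervalSetting
): Promise<T[]> {
  const results: T[] = [];
  let first = true;
  for (const task of tasks) {
    if (!first) await sleep(resolveInterval(setting));
    first = false;
    results.push(await task());
  }
  return results;
}
```

Under these assumptions, passing `undefined` runs the tasks back to back, a number spaces them evenly, and a `{ min, max }` object spreads them out irregularly, which is the usual way to avoid hitting a target with high-concurrency bursts.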
