javaarchive/Atlas

Distributed Web Scraping Platform

So I reinvented a task queue I guess?

Development

Note: the database structures used are only compatible with PostgreSQL, so for development you need to run Postgres through the docker-compose.dev.yml file (see the scripts folder for a script that starts one).

Production

Please change the postgres password.
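
One way to come up with a replacement password is to generate a random one with Node. This is only a sketch: `generatePassword` is a made-up helper, not part of Atlas, and it assumes the compose file hands the password to the database through `POSTGRES_PASSWORD`, the variable the official postgres Docker image reads. Check the compose file for where the password actually lives.

```javascript
// Hypothetical helper (not part of Atlas): generate a random password
// to use in place of the default Postgres one.
const { randomBytes } = require("crypto");

function generatePassword(bytes = 24) {
  // base64url output contains only letters, digits, "-" and "_",
  // so it is safe to paste into env files and connection strings.
  return randomBytes(bytes).toString("base64url");
}

// Assumption: the password is supplied via POSTGRES_PASSWORD.
console.log(`POSTGRES_PASSWORD=${generatePassword()}`);
```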

Setup

Current development setup.

# for postgres
scripts/database.sh

In another terminal:

npm install # first time only
npm start

Eventually I'll make a Dockerfile that you can just run, but not yet.

Warning

Namespaces are kinda broken right now because I didn't segment the cache by namespace. For now, just run a separate instance for each namespace.
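
For illustration, here is roughly what segmenting a cache by namespace could look like. This is a minimal sketch and not the Atlas code: `NamespacedCache`, the namespace names, and the task ids are all hypothetical. The idea is simply that the namespace becomes part of the cache key, so the same id in two namespaces no longer collides.

```javascript
// Hypothetical sketch (not the Atlas implementation): an in-memory cache
// whose keys include the namespace, so entries are kept apart per namespace.
class NamespacedCache {
  constructor() {
    this.store = new Map();
  }

  // Encode [namespace, id] as JSON so that no choice of delimiter inside
  // either value can make two different pairs produce the same key.
  key(namespace, id) {
    return JSON.stringify([namespace, id]);
  }

  get(namespace, id) {
    return this.store.get(this.key(namespace, id));
  }

  set(namespace, id, value) {
    this.store.set(this.key(namespace, id), value);
  }
}

// The same task id in two namespaces maps to two separate entries.
const cache = new NamespacedCache();
cache.set("news", "task-1", { url: "https://example.com/a" });
cache.set("shop", "task-1", { url: "https://example.com/b" });
```

With a cache keyed only by id, the second `set` would overwrite the first, which is the kind of collision the warning describes.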

About

a fun distributed crawling server
