See the slides or the report (in French) for details.
- rbenv
- docker (+ docker-compose)
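As a quick sanity check before installing, the requirements above can be verified from a shell. This is a minimal sketch and not part of the repository: the `missing_tools` function name is hypothetical, and it only checks that each command is on the `PATH`, not that its version is suitable.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with marinewatch): print every command
# from the argument list that cannot be found on the PATH.
missing_tools() {
  local tool rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  # Non-zero exit status when at least one tool is missing.
  return "$rc"
}

# Usage (prints nothing when everything is installed):
# missing_tools rbenv docker docker-compose
```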
git clone [email protected]:gautierdelorme/marinewatch.git
cd marinewatch
docker-compose up -d
rbenv install
rbenv rehash
gem install bundler
rbenv rehash
cd ./code/web_api
bundle install
rbenv rehash
- Docker as container manager
- Batch and Streaming processing using Spark and HDFS (written in Scala)
- Neo4j as graph database
- Web and Streaming APIs built in Ruby (using Sinatra)
- CLI tool written in pure bash
- Git as version control system
- code/: contains all the marinewatch code
  - mwspark: Spark Scala application
    - Batch processing to generate structured data to be imported in Neo4j
    - Streaming processing to update Neo4j data in real time
  - web_api: Web API to get the shortest path between two geo coordinates (supported formats: html, json and kml)
  - streaming_api: Streaming API to push new updates from boats
- data/: contains all data files
  - input: Data files used by Spark jobs
  - output: Data files generated by Spark jobs
- neo4j/:
  - conf: Neo4j config files
  - data: Neo4j databases
  - logs: Logs generated by Neo4j
  - plugins: Plugins used by Neo4j
- docker-compose.yml: Docker config file
- marinewatch-cli: CLI tool used to manage the app
Important:
- Docker must be running
./marinewatch-cli -b 40
./marinewatch-cli -c
./marinewatch-cli -u dbname
# create new database named dbname
# import new data inside, start the database
# create an index on (latitude,longitude)
./marinewatch-cli -d dbname
./marinewatch-cli -s
# You can see the result from this endpoint for example
# http://localhost:4567/route?from=39.425,6.825&to=6.225,103.050
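The example endpoint above can be built from four coordinates. The helper below is a minimal sketch: `route_url` is a hypothetical name, the `latitude,longitude` order of each pair is an assumption based on the example values, and the host and port are simply the ones shown in the example URL.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of marinewatch-cli): build a /route URL.
# Arguments: from_lat from_lon to_lat to_lon [base_url]
# Assumption: each pair is ordered latitude,longitude as in the example URL.
route_url() {
  local from_lat=$1 from_lon=$2 to_lat=$3 to_lon=$4
  local base=${5:-http://localhost:4567}
  printf '%s/route?from=%s,%s&to=%s,%s\n' \
    "$base" "$from_lat" "$from_lon" "$to_lat" "$to_lon"
}

# Usage (requires the web server started with ./marinewatch-cli -s):
# curl -s "$(route_url 39.425 6.825 6.225 103.050)"
```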
./marinewatch-cli -s -b 40 -u dbname
$ ./marinewatch-cli -h
Usage: marinewatch-cli [-h] [-b <int>] [-c <string>] [-u <string>] [-d <string>] [-s] [-t]
-h Help. Display this message and quit.
-b <int> Run batch process with specified accuracy.
-c <string> Run streaming process listening on specified address.
-u <string> Create new database with name.
-d <string> Start database with name.
-s Start web server.
-t Start streaming server.
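The usage message above maps naturally onto a `getopts` option string. The following is a minimal sketch of how such parsing could look in bash; it is an illustration only, not the actual `marinewatch-cli` source, and the `parse_args` function and all variable names are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: mirrors the options listed in the usage message.
# The real marinewatch-cli script may parse its arguments differently.
parse_args() {
  local OPTIND=1 opt
  batch_accuracy="" stream_address="" create_db="" start_db=""
  web=0 streaming=0
  # Leading ':' enables silent error handling; 'x:' means -x takes an argument.
  while getopts ":hb:c:u:d:st" opt; do
    case $opt in
      h) echo "Usage: marinewatch-cli [-h] [-b <int>] [-c <string>] [-u <string>] [-d <string>] [-s] [-t]"
         return 0 ;;
      b) batch_accuracy=$OPTARG ;;
      c) stream_address=$OPTARG ;;
      u) create_db=$OPTARG ;;
      d) start_db=$OPTARG ;;
      s) web=1 ;;
      t) streaming=1 ;;
      :) echo "Option -$OPTARG requires an argument." >&2; return 1 ;;
      \?) echo "Unknown option: -$OPTARG" >&2; return 1 ;;
    esac
  done
}

# Usage:
# parse_args -s -b 40 -u dbname
# echo "web=$web accuracy=$batch_accuracy db=$create_db"
```

Keeping the parsing in one function that only sets variables makes it easy to combine several options in a single call, as in `-s -b 40 -u dbname`.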
- ✅ Improve speed
- ✅ Use datasets with better accuracy (1/40)
- ✅ Add Spark Streaming processing
- Do not restrict cost to boat density
- ...
This project is licensed under the MIT License. See the LICENSE file for details.