The goal of this project is to:
- Create a Docker Container that runs Spark
- Use Prometheus to get metrics from Spark applications and Node-exporter
- Use Grafana to display the metrics collected
- Stream messages from Kafka to BigQuery with Spark
- Spark version running is 3.0.2
- For the full list of available Spark monitoring metrics, see here.
- The containerized environment consists of one Master and one Worker.
- To track metrics across Spark apps, set appName explicitly; otherwise spark.metrics.namespace defaults to spark.app.id, which changes on every invocation of the app.
- The main Scala application is Kafka Streaming Project-assembly-0.2.0.jar, a streaming job that ingests Kafka messages into BigQuery.
- The Dockerfile for Spark/Hadoop is also available here, so it can be added to the docker-compose.yaml file as seen here.
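
The metrics namespace described above can be pinned through Spark's configuration. A minimal sketch follows; the property names are standard Spark 3.0 settings, but the app name and values are hypothetical placeholders:

```properties
# spark-defaults.conf -- pin the metrics namespace to a stable app name
# (values here are examples; adjust to your deployment)
spark.app.name           KafkaToBigQueryStream
spark.metrics.namespace  ${spark.app.name}

# metrics.properties -- expose metrics for Prometheus to scrape
# (the PrometheusServlet sink ships with Spark 3.0+)
*.sink.prometheusServlet.class=org.apache.spark.metrics.sink.PrometheusServlet
*.sink.prometheusServlet.path=/metrics/prometheus
```

With the namespace fixed, Prometheus time series keep the same labels across restarts, so Grafana dashboards do not need to be rebuilt per run.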
Assuming Docker is installed, build and run the containers with:
```shell
docker-compose -f docker-compose.spark.yaml -f docker-compose.kafka.yaml build && docker-compose -f docker-compose.spark.yaml -f docker-compose.kafka.yaml up
```
To shut down the containers, run:
```shell
docker-compose -f docker-compose.spark.yaml -f docker-compose.kafka.yaml down
```
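
For reference, the Kafka-to-BigQuery streaming job described above can be sketched in Scala roughly as follows. This is an illustrative sketch, not the actual contents of the project's jar: the broker address, topic, BigQuery table, and GCS bucket names are hypothetical placeholders, and it assumes the spark-bigquery-connector is on the classpath:

```scala
import org.apache.spark.sql.SparkSession

object KafkaToBigQuery {
  def main(args: Array[String]): Unit = {
    // A fixed appName keeps spark.metrics.namespace stable across runs
    val spark = SparkSession.builder()
      .appName("KafkaToBigQueryStream")
      .getOrCreate()

    // Read the raw Kafka stream (broker and topic names are placeholders)
    val stream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "kafka:9092")
      .option("subscribe", "events")
      .load()
      .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

    // Stream into BigQuery via the spark-bigquery-connector
    // (dataset/table and temporary GCS bucket below are placeholders)
    stream.writeStream
      .format("bigquery")
      .option("table", "my_dataset.events")
      .option("temporaryGcsBucket", "my-temp-bucket")
      .option("checkpointLocation", "/tmp/checkpoints/events")
      .start()
      .awaitTermination()
  }
}
```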