A hands-on DevOps project showcasing end-to-end monitoring, alerting, and observability for modern infrastructure and applications. It demonstrates how to design, deploy, and manage a production-grade monitoring stack using tools like Prometheus, cAdvisor, CephFS, Grafana, and Alertmanager, integrated with CI/CD pipelines and cloud-native environments (Docker Swarm).
- Screenshots
- Tech Stack
- Prerequisites
- Quick Start
- Documentation
- Features
- Tasks (automation)
- Roadmap
- License
- Contributing
- Contact
List of tools used in the project
This project uses Devbox to manage the development environment. Devbox provides a consistent, isolated environment with all the necessary CLI tools pre-installed.
- Install Docker: follow the installation instructions for your operating system.
The rest of the tools are already installed in the Devbox environment.
- Install Devbox: follow the installation instructions for your operating system.
- Clone the Repository
git clone https://github.com/sean-njela/demo_monitoring.git
cd demo_monitoring
- Start the Devbox Environment and poetry environment
devbox shell # Start the Devbox environment (this also starts the poetry environment)
poetry install # Install dependencies
poetry env activate # Use the output to activate the poetry environment (only if Devbox does not activate it)
Note - The first time you run devbox shell, it will take a few minutes to install the necessary tools. After that it will be much faster.
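Once the steps above are done, a quick sanity check can confirm that the CLI tools this README relies on are actually on your PATH. The following is a minimal sketch, not part of the project: `missing_tools` is a hypothetical helper name, and the tool list in the usage comment is only the set of commands mentioned in this README.

```shell
# Print the name of each given tool that is not found on PATH.
# Prints nothing when every tool is available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage (inside the Devbox shell):
# missing_tools docker devbox poetry task
```

An empty result means every listed tool was found; any name printed is a tool to install or a sign that the Devbox shell is not active.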
task dev # this one command runs everything needed to set up the environment. yes, really.
# Give everything a minute to set up, then:
task status # check that all containers are running (the monitoring stack should show 6 services)
Everything ran well if you see the following output:
task: [status] docker stack ls
[status] NAME SERVICES
[status] monitoring 6
[status] portainer 2
task: [status] docker service ls
[status] ID NAME MODE REPLICAS IMAGE PORTS
[status] mwlzgz7v5yr8 monitoring_cadvisor global 1/1 gcr.io/cadvisor/cadvisor:v0.47.2 *:8080->8080/tcp
[status] yvf13xmyw1gx monitoring_grafana replicated 1/1 grafana/grafana:10.0.3 *:3000->3000/tcp
[status] bg08vgtsdo1k monitoring_nginx-app replicated 1/1 nginx:alpine *:8081->80/tcp
[status] w3wivuzmohvg monitoring_nginx-exporter replicated 1/1 nginx/nginx-prometheus-exporter:0.11.0 *:9113->9113/tcp
[status] r9i5x9gkc9wv monitoring_node-exporter global 1/1 prom/node-exporter:v1.5.0 *:9100->9100/tcp
[status] ymbf3o7ksmha monitoring_prometheus replicated 1/1 prom/prometheus:v2.47.0 *:9090->9090/tcp
[status] yktdcwwswder portainer_agent global 1/1 portainer/agent:lts
[status] ny2dcmtg4pqw portainer_portainer replicated 1/1 portainer/portainer-ee:lts *:9000->9000/tcp, *:9443->9443/tcp
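Reading the REPLICAS column by eye works for a handful of services, but the same check can be scripted. The sketch below assumes you feed it `NAME REPLICAS` pairs, which `docker service ls --format` can produce; `unready_services` is a hypothetical helper name and is not one of the project's tasks.

```shell
# Read "NAME REPLICAS" pairs on stdin (for example "monitoring_grafana 1/1")
# and print every service whose running count differs from its desired count.
# Exit status is 1 if any service is not fully up, otherwise 0.
unready_services() {
  awk '{
    split($2, r, "/")
    if (r[1] != r[2]) { print $1; bad = 1 }
  } END { exit bad }'
}

# Usage against a live swarm (requires Docker):
# docker service ls --format '{{.Name}} {{.Replicas}}' | unready_services
```

No output and a zero exit status correspond to the healthy listing shown above, where every service reports 1/1.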
Then run the following to expose the URLs:
task info
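The output of `task info` is not reproduced here. If you want to derive local URLs yourself from the PORTS column of `docker service ls`, something like the following sketch may help. `published_urls` is a hypothetical helper, and the `http://localhost` prefix is an assumption: some published ports (for example Portainer's 9443) may expect HTTPS, and exporter ports serve metrics rather than a UI.

```shell
# Read "NAME PORTS" lines on stdin, where PORTS looks like
# "*:9000->9000/tcp, *:9443->9443/tcp", and print "NAME http://localhost:PORT"
# for each published port. Services without published ports print nothing.
published_urls() {
  awk '{
    name = $1
    for (i = 2; i <= NF; i++) {
      # Keep the published (host-side) port between ":" and "->".
      if (match($i, /:[0-9]+->/)) {
        port = substr($i, RSTART + 1, RLENGTH - 3)
        print name, "http://localhost:" port
      }
    }
  }'
}

# Usage against a live swarm (requires Docker):
# docker service ls --format '{{.Name}} {{.Ports}}' | published_urls
```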
As you make changes you can run the following command to refresh/redeploy the stack:
task deploy
# and as a follow up run:
task status
For more info on redeployments, consult the docs, under safe-workflow-for-updating-a-swarm-stack
For full documentation, setup instructions, and architecture details, visit the docs or run:
task docs
Docs available at: http://127.0.0.1:8000/
- Metrics Collection & Visualization: real-time system, application, and container insights
- Reliability & Scalability: designing a monitoring stack built for production
This project is designed for a simple, one-command setup. All necessary actions are orchestrated through Taskfile.yaml.
task setup # setup the environment
task dev # automated local provisioning
task cleanup-dev # cleanup the dev environment
The Taskfile.gitflow.yaml provides a structured Git workflow using Git Flow. This helps in managing features, releases, and hotfixes in a standardized way. These tasks are run the same way as any other task. Using gitflow is optional.
task init # Initialize Git Flow with 'main', gh-pages and 'develop'
task sync # Sync current branch with latest 'develop' and handle main updates
task release:finish # Finishes and publishes a release (merges, tags, pushes), e.g. task release:finish version="1.2.0"
To see all tasks:
task --list-all
If you do not want the gitflow tasks, you can remove the Taskfile.gitflow.yaml file and unlink it from the Taskfile.yaml file (remove the includes section). If you cannot find the section, use CTRL + F to search for Taskfile.yaml.
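If you prefer the terminal to an editor search, the includes section can also be located from the shell. This is a rough sketch under the assumption that the Taskfile has a top-level `includes:` key with indented entries beneath it, which is the usual Task layout; `show_includes` is a hypothetical helper, not something the project ships.

```shell
# Print the top-level "includes:" block of a Taskfile with line numbers:
# the "includes:" line plus the indented lines directly below it.
show_includes() {
  awk '
    /^includes:/           { inside = 1; print NR ": " $0; next }
    inside && /^[ \t]*$/   { next }                      # skip blank lines in the block
    inside && /^[ \t]/     { print NR ": " $0; next }    # indented entry of the block
                           { inside = 0 }                # any other line ends the block
  ' "$1"
}

# Usage:
# show_includes Taskfile.yaml
```

The printed line numbers show which lines to delete or edit when unlinking the gitflow Taskfile.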
Important notes to remember whilst using the project
For comprehensive troubleshooting, refer to the Troubleshooting section, or open the GitHub Pages site here and use the search bar to look up your issue (use individual keywords, not the issue name).
- Metrics Collection & Visualization: real-time system, application, and container insights
- Alerting & Incident Response: proactive notifications via Slack/Email/PagerDuty
Contributions welcome! Open an issue or submit a PR.
Distributed under the MIT License. See LICENSE for more info.
Your Name - @linkedin - @twitter/x - [email protected]
Project Link: https://github.com/sean-njela/demo_monitoring
About Me