Compute Energy & Emissions Monitoring Stack (CEEMS) contains a Prometheus exporter to
export metrics of compute units and a REST API server, meant to be used as a JSON
data source in Grafana, that exposes the metadata and aggregated metrics of each
compute unit. Optionally, it includes a TSDB load balancer that supports basic load
balancing based on the retention periods of two or more TSDBs.

"Compute Unit" in the current context has a wider scope. It can be a batch job in HPC,
a VM in the cloud, a pod in k8s, _etc_. The main objective of the repository is to quantify
current exporter only exposes the GPU index to compute unit mapping. These two metrics
can be used together using PromQL to show the GPU metrics of a given compute unit.
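
For illustration, a minimal sketch of such a PromQL join. `DCGM_FI_DEV_POWER_USAGE` is
a metric exposed by NVIDIA's dcgm-exporter; the mapping metric name, label names and
`uuid` value below are assumptions for illustration, not CEEMS' actual names:

```
# Attach the owning compute unit's uuid to per-GPU power readings by joining
# on the GPU index label, assuming both exporters agree on that label.
# The mapping metric has value 1, so multiplication keeps the power value.
DCGM_FI_DEV_POWER_USAGE
  * on (gpu) group_left (uuid)
  gpu_compute_unit_map{uuid="0f0c5f4d"}
```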

## Current stack objective

Using this stack with Prometheus and Grafana will enable users to have access to time
series metrics of their compute units, be it a batch job, a VM or a pod. The users will
also be able to see the total energy consumed and the total emissions generated by their
individual workloads and by their project/namespace.

This monorepo contains the following apps that can be used with Grafana and Prometheus:

- `ceems_exporter`: This is the Prometheus exporter that exposes individual compute unit
metrics, RAPL energy, IPMI power consumption, emission factor and GPU to compute unit
mapping.

- `ceems_api_server`: This is a simple REST API server that exposes project and compute unit
information of users by querying a SQLite3 DB. This server can be used as a
[JSON API DataSource](https://grafana.github.io/grafana-json-datasource/installation/) or
[Infinity DataSource](https://grafana.com/grafana/plugins/yesoreyeram-infinity-datasource/)
in Grafana to construct dashboards for users. The DB contains aggregate metrics of each
compute unit along with aggregate metrics of each project.

- `ceems_lb`: This is a basic load balancer meant to work with TSDB instances.

Currently, only SLURM is supported as a resource manager. In the future, support for
OpenStack and Kubernetes will be added.

Pre-compiled binaries of the apps can be downloaded from the releases page.

### Build

As the `ceems_api_server` uses SQLite3 as DB backend, we are dependent on CGO for
compiling that app. On the other hand, `ceems_exporter` is a pure Go application.
Thus, in order to build from sources, users need to execute two build commands:

```
make build
```

that builds the `ceems_exporter` binary, and

```
CGO_BUILD=1 make build
```

which builds the `ceems_api_server` and `ceems_lb` apps.

All the applications will be placed in the `bin` folder at the root of the repository.

### Running tests

```
make tests
CGO_BUILD=1 make tests
```

## CEEMS Exporter

Currently, the exporter supports only the SLURM resource manager.
`ceems_exporter` provides the following collectors:

These metrics are mainly used to estimate the proportion of CPU and memory usage of
individual compute units and to estimate the energy consumption of each compute unit
based on these proportions.
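
As a hedged sketch of this estimation, assuming illustrative metric names (a node-level
IPMI power reading and a per-unit CPU time counter; neither are CEEMS' actual metric
names):

```
# Share of a node's power attributed to each compute unit, weighted by the
# unit's share of the node's total CPU time over the last 5 minutes.
ipmi_dcmi_power_watts
  * on (instance) group_right ()
  (
    rate(unit_cpu_seconds_total[5m])
    / on (instance) group_left ()
    sum by (instance) (rate(unit_cpu_seconds_total[5m]))
  )
```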

## CEEMS API server

As discussed in the introduction, `ceems_api_server` exposes usage and compute unit
details of users _via_ API endpoints. This data is gathered from the underlying
resource manager at a configured interval of time and stored in a local DB.

## CEEMS Load Balancer

Taking Prometheus TSDB as an example, Prometheus advises using the local file system to
store the data. This ensures performance and data integrity. However, storing data on
local disk is not fault tolerant unless the data is replicated elsewhere. Cloud native
projects like [Thanos](https://thanos.io/) and [Cortex](https://cortexmetrics.io/)
address this issue. This load balancer is meant to provide the basic functionality
proposed by Thanos, Cortex, _etc_.

The core idea is to replicate the Prometheus data using
[Prometheus' remote write](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write)
functionality onto a remote storage that is fault tolerant and has higher storage
capacity, but with degraded query performance. In this scenario, we have two TSDBs with
the following characteristics:

- TSDB using local disk: faster query performance with limited storage space
- TSDB using remote storage: slower query performance with bigger storage space
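
For illustration, a minimal sketch of the relevant configuration on the "hot" Prometheus
instance; the remote endpoint URL is an assumption:

```
# prometheus.yml ("hot" instance): replicate ingested samples to the
# "cold" instance backed by fault-tolerant remote storage.
remote_write:
  - url: "http://cold-tsdb.example.com/api/v1/write"
```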

The TSDB using local disk (the "hot" instance) will have a shorter retention period,
while the one using remote storage (the "cold" instance) can have a longer retention
period. The CEEMS load balancer is capable of introspecting each query and then routing
the request to either the "hot" or the "cold" TSDB instance.

Besides, the CEEMS load balancer is capable of providing basic access control on
TSDB if the DB of the CEEMS API server is provided. When a user makes a TSDB query
for a given compute unit identified by a `uuid`, the CEEMS load balancer will check
with the DB whether the user owns that compute unit and decide whether or not to proxy
the request to TSDB. This is very handy, as Grafana does not impose any access control
on data sources, and the current load balancer can provide such functionality.
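
As an illustration of this flow, consider a query like the following (the metric name
and `uuid` value are hypothetical):

```
# The load balancer extracts uuid="0f0c5f4d" from the query, verifies against
# the CEEMS API server DB that the requesting user owns this compute unit,
# and only then proxies the request to a TSDB backend.
avg_over_time(unit_cpu_usage{uuid="0f0c5f4d"}[30m])
```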

<!-- In the case of SLURM, the app executes `sacct` command to get
info on jobs. However, `sacct` command needs to be executed as either `root` or `slurm`
using different methods like `setuid` sticky bit. -->
## Linux capabilities

Linux capabilities can be assigned to either a file or a process. For instance, capabilities
on the `ceems_exporter` and `ceems_api_server` binaries can be set as follows:

```
sudo setcap cap_sys_ptrace,cap_dac_read_search,cap_setuid,cap_setgid+ep /full/path/to/ceems_exporter
sudo setcap cap_setuid,cap_setgid+ep /full/path/to/ceems_api_server
```

This will assign all the capabilities that are necessary to run `ceems_exporter`.

When using either file capabilities or process capabilities, the flags
`--collector.slurm.job.props.path` and `--collector.slurm.gpu.job.map.path` can be
omitted and there is no need to set up prolog and epilog scripts.
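
As one way to grant process capabilities (a deployment sketch, not something CEEMS
mandates; the paths and exact capability set mirror the `setcap` example above), a
systemd unit can use ambient capabilities:

```
# ceems_exporter.service (fragment): grant capabilities to the process at
# start-up instead of setting file capabilities on the binary.
[Service]
ExecStart=/full/path/to/ceems_exporter
AmbientCapabilities=CAP_SYS_PTRACE CAP_DAC_READ_SEARCH CAP_SETUID CAP_SETGID
CapabilityBoundingSet=CAP_SYS_PTRACE CAP_DAC_READ_SEARCH CAP_SETUID CAP_SETGID
```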

### `ceems_api_server`

The API server can be started as follows:

```
/path/to/ceems_api_server \
    --resource.manager.slurm \
    --storage.data.path="/var/lib/ceems" \
    --log.level="debug"
```

Data files like the SQLite3 DB created for the server will be placed in the
`/var/lib/ceems` directory. Note that if this directory does not exist,
`ceems_api_server` will attempt to create it if it has enough privileges. If it
fails to create the directory, an error will be raised.

<!-- To execute `sacct` command as `slurm` user, the command becomes the following:

```
/path/to/ceems_api_server \
    --slurm.sacct.path="/usr/local/bin/sacct" \
    --slurm.sacct.run.as.slurmuser \
    --path.data="/var/lib/ceems" \
    --log.level="debug"
```

Note that this approach needs capabilities assigned to the process. On the other hand, if
we want to use the `sudo` approach to execute `sacct` command, the command becomes:

```
/path/to/ceems_api_server \
    --slurm.sacct.path="/usr/local/bin/sacct" \
    --slurm.sacct.run.with.sudo \
    --path.data="/var/lib/ceems" \
    --log.level="debug"
```

This requires an entry in the sudoers file that permits the user starting
`ceems_api_server` to execute `sudo sacct` without a password. -->

`ceems_api_server` updates the local DB with job information regularly. The frequency
of this update and the period for which the data will be retained can be configured.
For instance, the following command updates the DB every 30 minutes and keeps the data
for the past one year.

```
/path/to/ceems_api_server \
    --resource.manager.slurm \
    --storage.data.path="/var/lib/ceems" \
    --storage.data.update.interval="30m" \
    --storage.data.retention.period="1y" \
    --log.level="debug"
```

### `ceems_lb`

A basic config file used by `ceems_lb` is as follows:

```
strategy: resource-based
db_path: data/ceems_api_server.db
backends:
  - url: "http://localhost:9090"
    skip_tls_verify: true
  - url: "http://localhost:9091"
    skip_tls_verify: true
```

- Keyword `strategy` can take `round-robin`, `least-connection` or `resource-based`
as values. Using the `resource-based` strategy, the queries are proxied to backend TSDB
instances based on the data available with each instance, as
[described in CEEMS load balancer](#ceems-load-balancer).
- Keyword `db_path` takes the path to the CEEMS API server DB file. This file is
optional; if provided, it offers basic access control.
- Keyword `backends` takes a list of TSDB backends.

The load balancer can be started as follows:

```
/path/to/ceems_lb \
    --config.file=config.yml \
    --log.level="debug"
```

This will start the load balancer on port `9030` by default. In Grafana, this load
balancer must be configured as the Prometheus data source URL, since requests are
proxied by the load balancer.
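
For instance, a minimal sketch of a Grafana data source provisioning file pointing at
the load balancer (the data source name is arbitrary):

```
# provisioning/datasources/ceems.yaml
apiVersion: 1

datasources:
  - name: CEEMS-TSDB
    type: prometheus
    access: proxy
    url: http://localhost:9030
```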

## TLS and basic auth

Exporter and API server support TLS and basic auth using the
[Prometheus exporter-toolkit](https://github.com/prometheus/exporter-toolkit). To enable TLS and
basic auth, users need to use the `--web-config-file` CLI flag as follows:

```
ceems_exporter --web-config-file=web-config.yaml
ceems_api_server --web-config-file=web-config.yaml
ceems_lb --web-config-file=web-config.yaml
```

A sample `web-config.yaml` file can be fetched from the
[exporter-toolkit repository](https://github.com/prometheus/exporter-toolkit).
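
For reference, a minimal sketch of such a file following the exporter-toolkit web
configuration format (paths, the user name and the elided password hash are
placeholders):

```
tls_server_config:
  cert_file: /path/to/server.crt
  key_file: /path/to/server.key

basic_auth_users:
  # bcrypt hash of the user's password
  alice: $2y$10$...
```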