elastic
diff --git a/‎modules/apm/NAMING.md
+15-8 b/‎modules/apm/NAMING.md
diff --git a/‎modules/apm/src/main/java/org/elasticsearch/telemetry/apm/AbstractInstrument.java
+3-7 b/‎modules/apm/src/main/java/org/elasticsearch/telemetry/apm/AbstractInstrument.java
diff --git a/‎modules/apm/src/main/java/org/elasticsearch/telemetry/apm/internal/MetricNameValidator.java
+142 b/‎modules/apm/src/main/java/org/elasticsearch/telemetry/apm/internal/MetricNameValidator.java
diff --git a/‎modules/apm/src/test/java/org/elasticsearch/telemetry/apm/APMMeterRegistryTests.java
+12-27 b/‎modules/apm/src/test/java/org/elasticsearch/telemetry/apm/APMMeterRegistryTests.java
diff --git a/‎modules/apm/src/test/java/org/elasticsearch/telemetry/apm/MeterRegistryConcurrencyTests.java
+1-1 b/‎modules/apm/src/test/java/org/elasticsearch/telemetry/apm/MeterRegistryConcurrencyTests.java
@@ -17,13 +17,13 @@ The **hierarchy** should be built by putting "more common" elements at the begin
 
 Example:
 * prefer `es.indices.docs.deleted.total `to `es.indices.total.deleted.docs`
-* This way you can later add` es.indices.docs.count, es.indices.docs.ingested.total`, etc.)
+* This way you can later add `es.indices.docs.total`, `es.indices.docs.ingested.total`, etc.
 
 Prefix metrics:
 * Always use `es` as our root application name: this will give us a separate namespace and avoid any possibility of clashes with other metrics, and quick identification of Elasticsearch metrics on a dashboard.
 * Follow the root prefix with a simple module name, team or area of code. E.g. `snapshot, repositories, indices, threadpool`. Notice the mix of singular and plural - here this is intentional, to reflect closely the existing names in the codebase (e.g. `reindex` and `indices`)
-* In building a metric name, look for existing prefixes (e.g. module name and/or area of code, e.g. `blob_cache`) and for existing sub-elements as well (e.g. `error`) to build a good, consistent name. E.g. prefer the consistent use of `error.count` rather than introducing `failures`, `failed.count` or `errors`.` `
-* Avoid having sub-metrics under a name that is also a metric (e.g. do not create names like `es.repositories.elements`,` es.repositories.elements.utilization`; use` es.repositories.element.count` and` es.repositories.element.utilization `instead). Such metrics are hard to handle well in Elasticsearch, or in some internal structures (e.g. nested maps).
+* When building a metric name, look for existing prefixes (e.g. module name and/or area of code, such as `blob_cache`) and for existing sub-elements (e.g. `error`) to build a good, consistent name. For example, prefer the consistent use of `error.total` rather than introducing `failures`, `failed.total` or `errors`.
+* Avoid having sub-metrics under a name that is also a metric (e.g. do not create names like `es.repositories.elements` and `es.repositories.elements.utilization`; use `es.repositories.element.total` and `es.repositories.element.utilization` instead). Such metrics are hard to handle well in Elasticsearch, or in some internal structures (e.g. nested maps).
 
 Keep the hierarchy compact: do not add elements if you don’t need to. There is a description field when registering a metric, prefer using that as an explanation. \
 For example, if emitting existing metrics from node stats, do not use the whole “object path”, but choose the most significant terms.
@@ -35,7 +35,7 @@ The metric name can be generated but there should be no dynamic or variable cont
 * Rule of thumb: you should be able to do aggregations (e.g. sum, avg) across a dimension of a given metric (without the need to aggregate over different metric names); on the other hand, any aggregation across any dimension of a given metric should be meaningful.
 * There might be exceptions of course. For example:
     * When similar metrics have significantly different implementations/related metrics.  \
-      If we have only common metrics like  `es.repositories.element.count, es.repositories.element.utilization, es.repositories.writes.total` for every blob storage implementation, then `s3,azure` should be an attribute. \
+      If we have only common metrics like  `es.repositories.element.total, es.repositories.element.utilization, es.repositories.writes.total` for every blob storage implementation, then `s3,azure` should be an attribute. \
       If we have specific metrics, e.g. for s3 storage classes, prefer using prefixed metric names for the specific metrics:  <code>es.repositories.<strong>s3</strong>.deep_archive_access.total</code> (but keep `es.repositories.elements`)
     * When you have a finite and fixed set of names it might be OK to have them in the name (e.g. "`young`" and "`old`" for GC generations).
 
@@ -47,12 +47,19 @@ Examples :
 * <code>es.indices.storage.write.<strong>io</strong></code>, instead of <code>es.indices.storage.write.<strong>bytes_per_sec</strong></code>
 * These can all be composed with the suffixes below, e.g. <code>es.process.jvm.collection.<strong>time.total</strong></code>, <code>es.indices.storage.write.<strong>total</strong></code> to represent the monotonic sum of time spent in GC and the total number of bytes written to indices respectively.
 
-**Pluralization** and **suffixes**:
-* If the metric is unit-less, use plural: `es.threadpool.activethreads`, `es.indices.docs`
-* Use `total` as a suffix for monotonic sums (e.g. <code>es.indices.docs.deleted.<strong>total</strong></code>)
-* Use `count` to represent the count of "things" in the metric name/namespace (e.g. if we have `es.process.jvm.classes.loaded`, we will express the number of classes currently loaded by the JVM as <code>es.process.jvm.classes.loaded.<strong>count</strong></code>, and the total number of classes loaded since the JVM started as <code>es.process.jvm.classes.loaded.<strong>total</strong></code>
+**Suffixes**:
+* Use `total` as a suffix for monotonic metrics, i.e. counters that only ever increase (e.g. <code>es.indices.docs.deleted.<strong>total</strong></code>)
+  * Note: even though an async counter reports a total cumulative value, it is still monotonic.
+* Use `current` for non-monotonic metrics (such as gauges and upDownCounters)
+  * `current` vs `total`: we can have <code>es.process.jvm.classes.loaded.<strong>current</strong></code> to express the number of classes currently loaded by the JVM, and <code>es.process.jvm.classes.loaded.<strong>total</strong></code> to express the total number of classes loaded since the JVM started
 * Use `ratio` to represent the ratio of two measures with identical unit (or unit-less) or measures that represent a fraction in the range [0, 1]. Examples:
     * Exception: consider using utilization when the ratio is between a usage and its limit, e.g. the ratio between <code>es.process.jvm.heap.<strong>usage</strong></code> and <code>es.process.jvm.heap.<strong>limit</strong></code> should be <code>es.process.jvm.heap.<strong>utilization</strong></code>
+* Use `status` to represent enum-like gauges. For example, <code>es.health.overall.red.status</code> has values 1/0 to represent true/false
+* Use `usage` to represent the amount used out of the known resource size
+* Use `size` to represent the overall size of the resource measured
+* Use `utilization` to represent a fraction of usage out of the overall size of a resource measured
+* Use `histogram` to represent instruments of type histogram
+* Use `time` to represent passage of time
 * If it has a unit of measure, then it should not be plural (and also not include the unit of measure, see above). Examples:  <code>es.process.jvm.collection.time, es.process.mem.virtual.usage<strong>, </strong>es.indices.storage.utilization</code>
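
A minimal sketch of how the suffix rules above map onto instrument kinds. The `SuffixSketch` class, its `Kind` enum and the `suffixFor`/`metricName` helpers are hypothetical names introduced for illustration only; they are not part of the Elasticsearch codebase. The mapping is just a default: a gauge may equally warrant `ratio`, `usage`, `status`, etc., depending on what it measures.

```java
// Hypothetical helper, not part of the Elasticsearch codebase.
final class SuffixSketch {
    // Assumed, simplified set of instrument kinds for this sketch.
    enum Kind { COUNTER, ASYNC_COUNTER, UP_DOWN_COUNTER, GAUGE, HISTOGRAM }

    // Picks a default suffix following the guideline text above.
    static String suffixFor(Kind kind) {
        switch (kind) {
            case COUNTER:
            case ASYNC_COUNTER:
                return "total";      // monotonic, only ever increases
            case UP_DOWN_COUNTER:
            case GAUGE:
                return "current";    // non-monotonic
            case HISTOGRAM:
                return "histogram";
            default:
                throw new IllegalArgumentException("Unknown kind: " + kind);
        }
    }

    // Appends the default suffix to an already dotted prefix.
    static String metricName(String prefix, Kind kind) {
        return prefix + "." + suffixFor(kind);
    }

    public static void main(String[] args) {
        // prints es.process.jvm.classes.loaded.current
        System.out.println(metricName("es.process.jvm.classes.loaded", Kind.GAUGE));
        // prints es.process.jvm.classes.loaded.total
        System.out.println(metricName("es.process.jvm.classes.loaded", Kind.ASYNC_COUNTER));
    }
}
```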
 
 ### Attributes
 
@@ -11,6 +11,7 @@
 import io.opentelemetry.api.metrics.Meter;
 
 import org.elasticsearch.core.Nullable;
+import org.elasticsearch.telemetry.apm.internal.MetricNameValidator;
 import org.elasticsearch.telemetry.metric.Instrument;
 
 import java.security.AccessController;
@@ -23,6 +24,7 @@
  * An instrument that contains the name, description and unit.  The delegate may be replaced when
  * the provider is updated.
  * Subclasses should implement the builder, which is used on initialization and provider updates.
+ *
  * @param <T> delegated instrument
  */
 public abstract class AbstractInstrument<T> implements Instrument {
@@ -50,19 +52,13 @@ void setProvider(@Nullable Meter meter) {
     }
 
     protected abstract static class Builder<T> {
-        private static final int MAX_NAME_LENGTH = 255;
 
         protected final String name;
         protected final String description;
         protected final String unit;
 
         public Builder(String name, String description, String unit) {
-            if (name.length() > MAX_NAME_LENGTH) {
-                throw new IllegalArgumentException(
-                    "Instrument name [" + name + "] with length [" + name.length() + "] exceeds maximum length [" + MAX_NAME_LENGTH + "]"
-                );
-            }
-            this.name = Objects.requireNonNull(name);
+            this.name = MetricNameValidator.validate(name);
             this.description = Objects.requireNonNull(description);
             this.unit = Objects.requireNonNull(unit);
         }
 
@@ -0,0 +1,142 @@
+/*
+ * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
+ * or more contributor license agreements. Licensed under the Elastic License
+ * 2.0 and the Server Side Public License, v 1; you may not use this file except
+ * in compliance with, at your election, the Elastic License 2.0 or the Server
+ * Side Public License, v 1.
+ */
+
+package org.elasticsearch.telemetry.apm.internal;
+
+import java.util.Objects;
+import java.util.Set;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+import java.util.stream.Collectors;
+
+public class MetricNameValidator {
+    private static final Pattern ALLOWED_CHARACTERS = Pattern.compile("[a-z][a-z0-9_]*");
+    static final Set<String> ALLOWED_SUFFIXES = Set.of(
+        "total",
+        "current",
+        "ratio",
+        "status" /*a workaround for enums */,
+        "usage",
+        "size",
+        "utilization",
+        "histogram",
+        "time"
+    );
+    static final int MAX_METRIC_NAME_LENGTH = 255;
+
+    static final int MAX_ELEMENT_LENGTH = 30;
+    static final int MAX_NUMBER_OF_ELEMENTS = 10;
+
+    private MetricNameValidator() {}
+
+    /**
+     * Validates a metric name as per the guidelines in NAMING.md
+     *
+     * @param metricName metric name to be validated
+     * @return the validated metric name, unchanged
+     * @throws IllegalArgumentException an exception indicating an incorrect metric name
+     */
+    public static String validate(String metricName) {
+        Objects.requireNonNull(metricName);
+        validateMaxMetricNameLength(metricName);
+
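+        // limit -1 keeps trailing empty elements, so names such as "es.a.total." or "." fail validation
+        // with an IllegalArgumentException instead of slipping through or yielding an empty array
+        String[] elements = metricName.split("\\.", -1);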
+        hasESPrefix(elements, metricName);
+        hasAtLeast3Elements(elements, metricName);
+        hasNotBreachNumberOfElementsLimit(elements, metricName);
+        lastElementIsFromAllowList(elements, metricName);
+        perElementValidations(elements, metricName);
+        return metricName;
+    }
+
+    private static void validateMaxMetricNameLength(String metricName) {
+        if (metricName.length() > MAX_METRIC_NAME_LENGTH) {
+            throw new IllegalArgumentException(
+                "Metric name length "
+                    + metricName.length()
+                    + " is longer than max metric name length: "
+                    + MAX_METRIC_NAME_LENGTH
+                    + ". Name was: "
+                    + metricName
+            );
+        }
+    }
+
+    private static void lastElementIsFromAllowList(String[] elements, String name) {
+        String lastElement = elements[elements.length - 1];
+        if (ALLOWED_SUFFIXES.contains(lastElement) == false) {
+            throw new IllegalArgumentException(
+                "Metric name should end with one of ["
+                    + ALLOWED_SUFFIXES.stream().collect(Collectors.joining(","))
+                    + "] "
+                    + "Last element was: "
+                    + lastElement
+                    + ". "
+                    + "Name was: "
+                    + name
+            );
+        }
+    }
+
+    private static void hasNotBreachNumberOfElementsLimit(String[] elements, String name) {
+        if (elements.length > MAX_NUMBER_OF_ELEMENTS) {
+            throw new IllegalArgumentException(
+                "Metric name should have at most "
+                    + MAX_NUMBER_OF_ELEMENTS
+                    + " elements. It had: "
+                    + elements.length
+                    + ". The name was: "
+                    + name
+            );
+        }
+    }
+
+    private static void hasAtLeast3Elements(String[] elements, String name) {
+        if (elements.length < 3) {
+            throw new IllegalArgumentException(
+                "Metric name should consist of at least 3 elements: an \"es\" prefix, a group and a name. The name was: " + name
+            );
+        }
+    }
+
+    private static void hasESPrefix(String[] elements, String name) {
+        if (elements[0].equals("es") == false) {
+            throw new IllegalArgumentException(
+                "Metric name should start with \"es.\" prefix and use \".\" as a separator. Name was: " + name
+            );
+        }
+    }
+
+    private static void perElementValidations(String[] elements, String name) {
+        for (String element : elements) {
+            hasOnlyAllowedCharacters(element, name);
+            hasNotBreachLengthLimit(element, name);
+        }
+    }
+
+    private static void hasNotBreachLengthLimit(String element, String name) {
+        if (element.length() > MAX_ELEMENT_LENGTH) {
+            throw new IllegalArgumentException(
+                "Metric name's element should not be longer than "
+                    + MAX_ELEMENT_LENGTH
+                    + " characters. Was: "
+                    + element.length()
+                    + ". Name was: "
+                    + name
+            );
+        }
+    }
+
+    private static void hasOnlyAllowedCharacters(String element, String name) {
+        Matcher matcher = ALLOWED_CHARACTERS.matcher(element);
+        if (matcher.matches() == false) {
+            throw new IllegalArgumentException(
+                "Metric name should only use [a-z][a-z0-9_]* characters. "
+                    + "Element does not match: \""
+                    + element
+                    + "\". "
+                    + "Name was: "
+                    + name
+            );
+        }
+    }
+}
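
To make the rules easier to try out, here is a minimal standalone sketch that follows the same checks as the validator above but returns a boolean instead of throwing. `MetricNameRulesSketch` and `isValid` are hypothetical names, not part of this PR, and the limits are copied from the constants in the diff. As a choice of this sketch, the name is split with a limit of `-1` so that a trailing `.` produces an empty last element and is rejected.

```java
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical standalone sketch; it follows the rules of MetricNameValidator but is not that class.
final class MetricNameRulesSketch {
    private static final Pattern ELEMENT = Pattern.compile("[a-z][a-z0-9_]*");
    private static final Set<String> SUFFIXES = Set.of(
        "total", "current", "ratio", "status", "usage", "size", "utilization", "histogram", "time"
    );

    // Returns true only when the name passes every rule.
    static boolean isValid(String name) {
        if (name == null || name.length() > 255) {
            return false;
        }
        // limit -1 keeps trailing empty elements (a choice made for this sketch)
        String[] elements = name.split("\\.", -1);
        if (elements.length < 3 || elements.length > 10) {
            return false;
        }
        if (elements[0].equals("es") == false) {
            return false;
        }
        if (SUFFIXES.contains(elements[elements.length - 1]) == false) {
            return false;
        }
        for (String element : elements) {
            if (element.length() > 30 || ELEMENT.matcher(element).matches() == false) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid("es.indices.docs.deleted.total")); // true
        System.out.println(isValid("es.indices.docs.count"));         // false: "count" is not an allowed suffix
        System.out.println(isValid("indices.docs.total"));            // false: missing "es" prefix
        System.out.println(isValid("es.total"));                      // false: fewer than 3 elements
        System.out.println(isValid("es.Indices.docs.total"));         // false: uppercase character
    }
}
```

The test names used in this PR, such as `es.test.name.total` and `es.test.dh.histogram`, pass these checks, while the old names (`name`, `dc`, `lh`) do not.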
@@ -35,10 +35,8 @@
 import java.util.List;
 import java.util.function.Supplier;
 
-import static org.hamcrest.Matchers.containsString;
 import static org.hamcrest.Matchers.equalTo;
 import static org.hamcrest.Matchers.hasSize;
-import static org.hamcrest.Matchers.instanceOf;
 import static org.hamcrest.Matchers.sameInstance;
 
 public class APMMeterRegistryTests extends ESTestCase {
@@ -84,8 +82,8 @@ public void testMeterIsOverridden() {
     public void testLookupByName() {
         var apmMeter = new APMMeterService(TELEMETRY_ENABLED, () -> testOtel, () -> noopOtel).getMeterRegistry();
 
-        DoubleCounter registeredCounter = apmMeter.registerDoubleCounter("name", "desc", "unit");
-        DoubleCounter lookedUpCounter = apmMeter.getDoubleCounter("name");
+        DoubleCounter registeredCounter = apmMeter.registerDoubleCounter("es.test.name.total", "desc", "unit");
+        DoubleCounter lookedUpCounter = apmMeter.getDoubleCounter("es.test.name.total");
 
         assertThat(lookedUpCounter, sameInstance(registeredCounter));
     }
@@ -103,19 +101,6 @@ public void testNoopIsSetOnStop() {
         assertThat(meter, sameInstance(noopOtel));
     }
 
-    public void testMaxNameLength() {
-        APMMeterService apmMeter = new APMMeterService(TELEMETRY_ENABLED, () -> testOtel, () -> noopOtel);
-        apmMeter.start();
-        int max_length = 255;
-        var counter = apmMeter.getMeterRegistry().registerLongCounter("a".repeat(max_length), "desc", "count");
-        assertThat(counter, instanceOf(LongCounter.class));
-        IllegalArgumentException iae = expectThrows(
-            IllegalArgumentException.class,
-            () -> apmMeter.getMeterRegistry().registerLongCounter("a".repeat(max_length + 1), "desc", "count")
-        );
-        assertThat(iae.getMessage(), containsString("exceeds maximum length [255]"));
-    }
-
     public void testAllInstrumentsSwitchProviders() {
         TestAPMMeterService apmMeter = new TestAPMMeterService(
             Settings.builder().put(APMAgentSettings.TELEMETRY_METRICS_ENABLED_SETTING.getKey(), false).build(),
@@ -125,18 +110,18 @@ public void testAllInstrumentsSwitchProviders() {
         APMMeterRegistry registry = apmMeter.getMeterRegistry();
 
         Supplier<DoubleWithAttributes> doubleObserver = () -> new DoubleWithAttributes(1.5, Collections.emptyMap());
-        DoubleCounter dc = registry.registerDoubleCounter("dc", "", "");
-        DoubleUpDownCounter dudc = registry.registerDoubleUpDownCounter("dudc", "", "");
-        DoubleHistogram dh = registry.registerDoubleHistogram("dh", "", "");
-        DoubleAsyncCounter dac = registry.registerDoubleAsyncCounter("dac", "", "", doubleObserver);
-        DoubleGauge dg = registry.registerDoubleGauge("dg", "", "", doubleObserver);
+        DoubleCounter dc = registry.registerDoubleCounter("es.test.dc.total", "", "");
+        DoubleUpDownCounter dudc = registry.registerDoubleUpDownCounter("es.test.dudc.current", "", "");
+        DoubleHistogram dh = registry.registerDoubleHistogram("es.test.dh.histogram", "", "");
+        DoubleAsyncCounter dac = registry.registerDoubleAsyncCounter("es.test.dac.total", "", "", doubleObserver);
+        DoubleGauge dg = registry.registerDoubleGauge("es.test.dg.current", "", "", doubleObserver);
 
         Supplier<LongWithAttributes> longObserver = () -> new LongWithAttributes(100, Collections.emptyMap());
-        LongCounter lc = registry.registerLongCounter("lc", "", "");
-        LongUpDownCounter ludc = registry.registerLongUpDownCounter("ludc", "", "");
-        LongHistogram lh = registry.registerLongHistogram("lh", "", "");
-        LongAsyncCounter lac = registry.registerLongAsyncCounter("lac", "", "", longObserver);
-        LongGauge lg = registry.registerLongGauge("lg", "", "", longObserver);
+        LongCounter lc = registry.registerLongCounter("es.test.lc.total", "", "");
+        LongUpDownCounter ludc = registry.registerLongUpDownCounter("es.test.ludc.current", "", "");
+        LongHistogram lh = registry.registerLongHistogram("es.test.lh.histogram", "", "");
+        LongAsyncCounter lac = registry.registerLongAsyncCounter("es.test.lac.total", "", "", longObserver);
+        LongGauge lg = registry.registerLongGauge("es.test.lg.current", "", "", longObserver);
 
         apmMeter.setEnabled(true);
 
 
@@ -28,7 +28,7 @@
 import static org.hamcrest.Matchers.sameInstance;
 
 public class MeterRegistryConcurrencyTests extends ESTestCase {
-    private final String name = "name";
+    private final String name = "es.test.name.total";
     private final String description = "desc";
     private final String unit = "kg";
     private final Meter noopMeter = OpenTelemetry.noop().getMeter("noop");