diff --git a/.yamllint b/.yamllint deleted file mode 100644 index 976409871..000000000 --- a/.yamllint +++ /dev/null @@ -1,10 +0,0 @@ ---- - -extends: default - -rules: - line-length: - max: 160 - level: warning - -ignore-from-file: .gitignore diff --git a/roles/telemetry_chargeback/.yamllint b/roles/telemetry_chargeback/.yamllint new file mode 100644 index 000000000..18483eb1a --- /dev/null +++ b/roles/telemetry_chargeback/.yamllint @@ -0,0 +1,18 @@ +--- +# Ansible-lint compatible yamllint config for this role only. +# See: https://ansible.readthedocs.io/projects/lint/rules/yaml/ +extends: default + +rules: + comments: + min-spaces-from-content: 1 + comments-indentation: false + braces: + min-spaces-inside: 0 + max-spaces-inside: 1 + octal-values: + forbid-implicit-octal: true + forbid-explicit-octal: true + line-length: + max: 160 + level: warning diff --git a/roles/telemetry_chargeback/README.md b/roles/telemetry_chargeback/README.md index 192b72a3d..f0693781e 100644 --- a/roles/telemetry_chargeback/README.md +++ b/roles/telemetry_chargeback/README.md @@ -5,7 +5,7 @@ The **`telemetry_chargeback`** role is designed to test the **RHOSO Cloudkitty** The role performs two main functions: 1. **CloudKitty Validation** - Enables and configures the CloudKitty hashmap rating module, then validates its state. -2. **Synthetic Data Generation** - Generates synthetic Loki log data for testing chargeback scenarios using a Python script and Jinja2 template. +2. **Synthetic Data Generation & Analysis** - Generates synthetic Loki log data for testing chargeback scenarios and calculates metric totals. The role automatically discovers and processes all scenario files matching `test_*.yml` in the `files/` directory. For each scenario it runs: generate synthetic data, compute syn-totals, ingest to Loki, flush Loki ingester memory, and get cost via CloudKitty rating summary (using begin/end from syn-totals). Retrieve-from-Loki is included in the load_loki_data flow. 
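The "compute syn-totals" step in the pipeline above reduces to a per-metric sum of `qty * price` over all ingested log entries (this is the aggregation `gen_synth_loki_metrics_totals.py` performs before writing `synth_rate`). A minimal sketch with hypothetical inline entries, not the role's actual scenario values:

```python
from collections import defaultdict

# Minimal sketch of the syn-totals aggregation: every log entry
# contributes qty * price to its metric's total; total_rate sums
# all metrics. The entries below are hypothetical examples.
entries = [
    {"type": "ceilometer_cpu_num", "qty": 1.0, "price": 0.3},
    {"type": "ceilometer_cpu_num", "qty": 1.0, "price": 0.3},
    {"type": "ceilometer_image_size", "qty": 20.6, "price": 0.02},
]

metric_totals = defaultdict(float)
for entry in entries:
    metric_totals[entry["type"]] += float(entry["qty"]) * float(entry["price"])

total_rate = round(sum(metric_totals.values()), 4)  # aggregate over all metrics
```

The resulting per-metric totals correspond to the `synth_rate` keys in the totals YAML, with `total_rate` as the aggregate.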
After all scenarios, the role runs cleanup (`cleanup_ck.yml`) to remove the local flush cert directory. Requirements ------------ @@ -15,7 +15,7 @@ It relies on the following being available on the target or control host: * The **OpenStack CLI client** must be installed and configured with administrative credentials. * Required Python libraries for the `openstack` CLI (e.g., `python3-openstackclient`). * Connectivity to the OpenStack API endpoint. -* **Python 3** with the following libraries for synthetic data generation: +* **Python 3** with the following libraries for synthetic data generation and analysis: * `PyYAML` * `Jinja2` @@ -23,6 +23,7 @@ It is expected to be run **after** a successful deployment and configuration of * **OpenStack:** A functional OpenStack cloud (RHOSO) environment. * **Cloudkitty:** The Cloudkitty service must be installed, configured, and running. +* **Loki / OpenShift (for ingest and flush):** When using ingest and flush tasks, the control host must have `oc` CLI access, and the Cloudkitty Loki stack (route, certificates, ingester) must be deployed. The role sets Loki push/query URLs and extracts certificates via `setup_loki_env.yml`. Role Variables -------------- @@ -40,23 +41,97 @@ These variables are used internally by the role and typically do not need to be | Variable | Default Value | Description | |----------|---------------|-------------| -| `logs_dir_zuul` | `/home/zuul/ci-framework-data/logs` | Remote directory for log files. | -| `artifacts_dir_zuul` | `/home/zuul/ci-framework-data/artifacts` | Directory for generated artifacts. | -| `ck_synth_script` | `{{ role_path }}/files/gen_synth_loki_data.py` | Path to the synthetic data generation script. | -| `ck_data_template` | `{{ role_path }}/template/loki_data_templ.j2` | Path to the Jinja2 template for Loki data format. | -| `ck_data_config` | `{{ role_path }}/files/test_static.yml` | Path to the scenario configuration file. 
| -| `ck_output_file_local` | `{{ artifacts_dir_zuul }}/loki_synth_data.json` | Local path for generated synthetic data. | -| `ck_output_file_remote` | `{{ logs_dir_zuul }}/gen_loki_synth_data.log` | Remote destination for synthetic data. | +| `logs_dir_zuul` | `{{ ansible_env.HOME }}/ci-framework-data/logs` | Remote directory for log files. | +| `artifacts_dir_zuul` | `{{ ansible_env.HOME }}/ci-framework-data/artifacts` | Directory for generated artifacts. | +| `cloudkitty_scenario_dir` | `{{ role_path }}/files` | Directory containing scenario files (`test_*.yml`). | +| `cloudkitty_synth_data_suffix` | `-synth_data.json` | Suffix for generated synthetic data files. | +| `cloudkitty_loki_data_suffix` | `-loki_data.json` | Suffix for Loki query result JSON files. | +| `cloudkitty_synth_totals_suffix` | `-synth_metrics_totals.yml` | Suffix for generated metric totals files (from synthetic data). | +| `cloudkitty_loki_totals_suffix` | `-loki_totals.yml` | Suffix for CloudKitty rating summary output files (from loki_rate task). | +| `cloudkitty_loki_totals_metrics_suffix` | `-loki_metrics_totals.yml` | Suffix for metric totals computed from Loki-retrieved JSON (retrieve_loki_data task). | +| `cloudkitty_synth_script` | `{{ role_path }}/files/gen_synth_loki_data.py` | Path to the synthetic data generation script. | +| `cloudkitty_data_template` | `{{ role_path }}/templates/loki_data_templ.j2` | Path to the Jinja2 template for Loki data format. | +| `cloudkitty_totals_script` | `{{ role_path }}/files/gen_synth_loki_metrics_totals.py` | Path to the metric totals calculation script. | + +### Loki / OpenShift Variables (vars/main.yml) + +Used by setup, ingest, flush, and retrieve tasks when running against Loki on OpenShift: + +| Variable | Default Value | Description | +|----------|---------------|-------------| +| `cert_secret_name` | `cert-cloudkitty-client-internal` | OpenShift secret name for client certificates. 
| +| `cert_dir` | `{{ ansible_user_dir }}/ck-certs` | Local directory for extracted ingest/query certs. | +| `client_secret` | `secret/cloudkitty-lokistack-gateway-client-http` | Secret for flush client certs. | +| `ca_configmap` | `cm/cloudkitty-lokistack-ca-bundle` | ConfigMap for CA bundle. | +| `remote_cert_dir` | `osp-certs` | Directory inside the OpenStack pod for certs. | +| `local_cert_dir` | `{{ ansible_env.HOME }}/ci-framework-data/flush_certs` | Local directory for flush certs (removed by cleanup_ck.yml after the run). | +| `logql_query` | `{service="cloudkitty"}` (overridable via `loki_query`) | LogQL query for Loki. | +| `cloudkitty_namespace` | `openstack` | OpenShift namespace for Cloudkitty/Loki resources. | +| `openstackpod` | `openstackclient` | OpenStack client pod name for exec/cp. | +| `lookback` | `6` | Days lookback for Loki query time range. | +| `limit` | `50` | Limit for Loki query results. | + +Loki push/query URLs are set dynamically in `setup_loki_env.yml` from the Cloudkitty Loki route. + +### Synthetic Data Scripts + +**gen_synth_loki_data.py** — Generates Loki-format JSON from a scenario YAML and template. The role invokes it with `-r` so that timestamps in the output are in **reverse** order (youngest first, oldest last). When run manually you can omit `-r` for chronological order (oldest first, youngest last). + +| Option | Description | +|--------|--------------| +| `--tmpl` | Path to the Jinja2 template (e.g. `loki_data_templ.j2`). | +| `-t`, `--test` | Path to the scenario YAML (e.g. `test_static_basic.yml`). | +| `-o`, `--output` | Path to the output JSON file. | +| `-p`, `--project-id` | Optional; overrides `groupby.project_id` in every log entry. | +| `-u`, `--user-id` | Optional; overrides `groupby.user_id` in every log entry. | +| `-r`, `--reverse` | Reverse timestamp order in JSON output (youngest first, oldest last). | +| `--debug` | Enable debug logging. 
| + +**gen_synth_loki_metrics_totals.py** — Reads the synthetic (or Loki-retrieved) JSON and writes a YAML with aggregated metrics and time bounds. The output is used by the role for validation and for the Loki query time range. + +Output YAML structure: + +* **time** — `begin`, `end` (ISO strings), `begin_nano`, `end_nano` (nanosecond timestamps for the first and last time step; used by the Loki query in `retrieve_loki_data.yml`). +* **data_log** — `total_time_steps`, `metrics_per_step`, `log_count`. +* **synth_rate** — Per-metric rates and `total_rate`. + +### Dynamically Set Variables + +Set in **main.yml** from the OpenStack CLI (`openstack project show admin` / `openstack user show admin`): + +| Variable | Description | +|----------|-------------| +| `cloudkitty_project_id` | ID of the OpenStack project named `admin` (empty string if not found). Passed as `-p` to the synthetic data generator when non-empty. | +| `cloudkitty_user_id` | ID of the OpenStack user named `admin` (empty string if not found). Passed as `-u` to the synthetic data generator when non-empty. | + +Set in **gen_synth_loki_data.yml** for each scenario file during the loop: + +| Variable | Description | +|----------|-------------| +| `cloudkitty_data_file` | Local path for generated JSON data (`{{ artifacts_dir_zuul }}/{{ scenario_name }}-synth_data.json`) | +| `cloudkitty_synth_totals_file` | Local path for calculated metric totals (`{{ artifacts_dir_zuul }}/{{ scenario_name }}{{ cloudkitty_synth_totals_suffix }}`) | +| `cloudkitty_test_file` | Path to the scenario configuration file (`{{ cloudkitty_scenario_dir }}/{{ scenario_name }}.yml`) | Scenario Configuration ---------------------- -The synthetic data generation is controlled by a YAML configuration file (`files/test_static.yml`). This file defines: +The synthetic data generation is controlled by YAML configuration files in the `files/` directory. Any file matching `test_*.yml` will be automatically discovered and processed. 
Files whose names start with an underscore (e.g. `_test_*.yml`) are **not** discovered by the role; they can be used as reference or for manual runs. + +Each scenario file defines: + +* **generation** — Time range configuration (days, step_seconds). +* **log_types** — List of log type definitions. Each entry has **type** (identifier and value in output), unit, description, qty, price, groupby, and metadata. The **groupby** dict typically includes dimension keys (e.g. id, user_id, project_id, tenant_id); the generator merges **date_fields** into groupby at run time. +* **required_fields** — Top-level keys required for each log type (e.g. type, unit, qty, price, groupby, metadata). +* **date_fields** — Date field names to merge into groupby (week_of_the_year, day_of_the_year, month, year). +* **loki_stream** — Loki stream configuration (service name). + +**groupby.id** should be consistent by metric type across all scenario files so that the same type always uses the same id. The reference mapping is defined in `_test_all_qty_zero.yml` (e.g. `ceilometer_cpu_num` → me1, `ceilometer_image_size` → me2, `ceilometer_ip_floating` → me7). + +Example scenario files: -* **generation** - Time range configuration (days, step_seconds) -* **log_types** - List of log type definitions with name, type, unit, qty, price, groupby, and metadata -* **required_fields** - Fields required for validation -* **date_fields** - Date fields to add to groupby (week_of_the_year, day_of_the_year, month, year) -* **loki_stream** - Loki stream configuration (service name) +* `test_static_basic.yml` — Basic static values for qty and price. +* `test_static_basic_gid.yml` — Same as above with explicit groupby ids. +* `test_dyn_basic.yml` — Dynamic values distributed across time steps. +* `_test_all_qty_zero.yml` — Reference scenario (all quantities zero); defines the standard groupby.id mapping. Not auto-discovered. 
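The `test_*.yml` / `_test_*.yml` split falls out of plain pattern matching: a glob-style pattern anchors at the first character, so underscore-prefixed reference files never match. A quick sketch of the discovery filter, using the scenario filenames listed above:

```python
from fnmatch import fnmatch

# Underscore-prefixed files never match "test_*.yml", so reference
# scenarios such as _test_all_qty_zero.yml are skipped automatically.
names = [
    "test_static_basic.yml",
    "test_static_basic_gid.yml",
    "test_dyn_basic.yml",
    "_test_all_qty_zero.yml",
]
discovered = sorted(n for n in names if fnmatch(n, "test_*.yml"))
```

This mirrors the role's auto-discovery behavior; the role itself applies the pattern via its task files rather than this exact code.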
Dependencies ------------ diff --git a/roles/telemetry_chargeback/files/gen_db_summary.py b/roles/telemetry_chargeback/files/gen_db_summary.py new file mode 100644 index 000000000..5a9cecd82 --- /dev/null +++ b/roles/telemetry_chargeback/files/gen_db_summary.py @@ -0,0 +1,247 @@ +#!/usr/bin/env python3 +""" +Extract [timestep, log_entry] arrays from a Loki JSON file (or text), +sort by timestep, and print each plus log_count. + +Uses the same input arguments as gen_synth_loki_metrics_totals.py. +""" +import argparse +import json +import sys +from collections import Counter +from pathlib import Path + +REQUIRED_KEYS = frozenset({"start", "end", "type", "unit", "qty", "price", "groupby"}) + + +def _is_valid_timestep(s: str) -> bool: + """Return True if s is a string of at least 19 digits.""" + return isinstance(s, str) and s.isdigit() and len(s) >= 19 + + +def _has_required_keys(obj: dict) -> bool: + """Return True if obj has all required keys (start, end, type, unit, qty, price, groupby).""" + return REQUIRED_KEYS.issubset(obj.keys()) + + +def _extract_from_loki_json(data: dict) -> list[tuple[str, str]]: + """ + Extract [timestep, log_entry_json_str] pairs from Loki-style JSON. + Returns list of (timestep_str, log_entry_json_str). + """ + streams = data.get("streams") + if streams is None: + streams = data.get("data", {}).get("result", []) + if not isinstance(streams, list): + return [] + + pairs = [] + for stream in streams: + for val in stream.get("values", []): + if not isinstance(val, (list, tuple)) or len(val) < 2: + continue + ts_str = val[0] + log_str = val[1] + if not isinstance(ts_str, str) or not isinstance(log_str, str): + continue + if not _is_valid_timestep(ts_str): + continue + try: + entry = json.loads(log_str) + if isinstance(entry, dict) and _has_required_keys(entry): + pairs.append((ts_str, log_str)) + except json.JSONDecodeError: + continue + return pairs + + +def _find_arrays_in_text(text: str) -> list[tuple[str, str]]: + """ + Find all [...]. 
arrays in text where: + - First element is a string of >= 19 digits (timestep). + - Second element is a JSON object string with required keys. + Handles arrays that span multiple lines. + """ + pairs = [] + i = 0 + n = len(text) + while i < n: + start = text.find("[", i) + if start == -1: + break + depth = 0 + j = start + in_string = None + escape = False + while j < n: + c = text[j] + if escape: + escape = False + j += 1 + continue + if c == "\\" and in_string is not None: + escape = True + j += 1 + continue + if in_string is not None: + if c == in_string: + in_string = None + j += 1 + continue + if c in ('"', "'"): + in_string = c + j += 1 + continue + if c == "[": + depth += 1 + j += 1 + continue + if c == "]": + depth -= 1 + if depth == 0: + slice_str = text[start : j + 1] + try: + arr = json.loads(slice_str) + if ( + isinstance(arr, list) + and len(arr) >= 2 + and isinstance(arr[0], str) + and _is_valid_timestep(arr[0]) + and isinstance(arr[1], str) + ): + entry = json.loads(arr[1]) + if isinstance(entry, dict) and _has_required_keys(entry): + pairs.append((arr[0], arr[1])) + except json.JSONDecodeError: + pass + i = j + 1 + break + j += 1 + continue + j += 1 + else: + i = start + 1 + return pairs + + +def extract_and_sort(json_path: Path) -> list[tuple[str, str]]: + """ + Load JSON (or text) from json_path, extract [timestep, log_entry] pairs, + and return them sorted by timestep (ascending). 
+ """ + raw = json_path.read_text(encoding="utf-8", errors="replace") + + # Try parsing as Loki JSON first + try: + data = json.loads(raw) + if isinstance(data, dict): + pairs = _extract_from_loki_json(data) + if pairs: + pairs.sort(key=lambda p: int(p[0])) + return pairs + except json.JSONDecodeError: + pass + + # Fallback: extract array strings from raw text (multi-line safe) + pairs = _find_arrays_in_text(raw) + pairs.sort(key=lambda p: int(p[0])) + return pairs + + +def main() -> None: + parser = argparse.ArgumentParser( + description="Extract and sort Loki log entries from JSON; print each and log_count." + ) + parser.add_argument( + "-j", "--json", + required=True, + type=Path, + help="Path to the input JSON file.", + ) + parser.add_argument( + "-o", "--output", + type=Path, + default=None, + help="Output path (default: same as input with _total.yml suffix, e.g. file.json -> file_total.yml).", + ) + parser.add_argument( + "--debug", + type=Path, + default=None, + metavar="DIR", + help="Debug directory: create if needed and write timestep list to _diff.txt.", + ) + args = parser.parse_args() + + if not args.json.exists(): + print(f"Error: input file not found: {args.json}", file=sys.stderr) + sys.exit(1) + + output_path = args.output if args.output is not None else (args.json.parent / f"{args.json.stem}_total.yml") + pairs = extract_and_sort(args.json) + out = output_path.open("w", encoding="utf-8") + + # Debug: write timestep list to _diff.txt in --debug directory (skip if "" or null) + _debug_dir = str(args.debug).strip() if args.debug is not None else "" + if _debug_dir and _debug_dir != ".": + args.debug.mkdir(parents=True, exist_ok=True) + debug_path = args.debug / f"{args.json.stem}_diff.txt" + with debug_path.open("w", encoding="utf-8") as dbg: + for timestep, log_str in pairs: + print(json.dumps([timestep, log_str], ensure_ascii=False), file=dbg) + + # Count unique timesteps and entries per timestep + log_count = len(pairs) + counts_per_timestep = 
Counter(ts for ts, _ in pairs) + total_timesteps = len(counts_per_timestep) + counts = list(counts_per_timestep.values()) + metrics_per_step = counts[0] if counts else 0 + if counts and not all(c == metrics_per_step for c in counts): + metrics_per_step = "ERROR" + + # Timesteps with error: those whose count differs from the expected (first) count + expected_count = counts[0] if counts else 0 + timesteps_with_error = [ + ts for ts, cnt in counts_per_timestep.items() + if cnt != expected_count + ] if counts else [] + + # Time bounds from first (lowest) and last (highest) timestep + if pairs: + begin_nano = int(pairs[0][0]) + end_nano = int(pairs[-1][0]) + first_entry = json.loads(pairs[0][1]) + last_entry = json.loads(pairs[-1][1]) + time_begin = first_entry.get("start") + time_begin_end = first_entry.get("end") + time_end_begin = last_entry.get("start") + time_end = last_entry.get("end") + else: + begin_nano = end_nano = time_begin = time_begin_end = time_end_begin = time_end = None + + try: + print("---", file=out) + print("time:", file=out) + print(" begin_step:", file=out) + print(f" nanosec: {begin_nano}" if begin_nano is not None else " nanosec: null", file=out) + print(f" begin: {repr(time_begin)}" if time_begin is not None else " begin: null", file=out) + print(f" end: {repr(time_begin_end)}" if time_begin_end is not None else " end: null", file=out) + print(" end_step:", file=out) + print(f" nanosec: {end_nano}" if end_nano is not None else " nanosec: null", file=out) + print(f" begin: {repr(time_end_begin)}" if time_end_begin is not None else " begin: null", file=out) + print(f" end: {repr(time_end)}" if time_end is not None else " end: null", file=out) + print("data_log:", file=out) + print(f" total_timesteps: {total_timesteps}", file=out) + print(f" metrics_per_step: {metrics_per_step}", file=out) + print(f" log_count: {log_count}", file=out) + finally: + out.close() + + # Only errors go to stdout: timestep and its metric count for inconsistent timesteps 
+ if metrics_per_step == "ERROR": + for ts in sorted(timesteps_with_error, key=int): + print(ts, counts_per_timestep[ts], file=sys.stdout) + + +if __name__ == "__main__": + main() diff --git a/roles/telemetry_chargeback/files/gen_synth_loki_data.py b/roles/telemetry_chargeback/files/gen_synth_loki_data.py index f05796e29..e2d4358ef 100755 --- a/roles/telemetry_chargeback/files/gen_synth_loki_data.py +++ b/roles/telemetry_chargeback/files/gen_synth_loki_data.py @@ -5,10 +5,44 @@ import yaml from datetime import datetime, timezone, timedelta from pathlib import Path -from typing import Dict, Any +from typing import Dict, Any, List, Union from jinja2 import Environment +def _get_value_for_step( + values: List[Union[int, float]], + step_idx: int, + num_steps: int +) -> Union[int, float]: + """ + Get the appropriate value from a list based on the current step index. + + Values are distributed evenly across all steps. For example, if there are + 12 steps and 4 values, each value covers 3 steps: + - Steps 0-2: values[0] + - Steps 3-5: values[1] + - Steps 6-8: values[2] + - Steps 9-11: values[3] + + Args: + values: List of values to choose from. + step_idx: Current step index (0-based). + num_steps: Total number of steps. + + Returns: + The value corresponding to the current step. 
+ """ + num_values = len(values) + if num_values == 1: + return values[0] + + # Calculate how many steps each value covers + steps_per_value = num_steps / num_values + # Determine which value index to use, clamping to valid range + value_idx = min(int(step_idx // steps_per_value), num_values - 1) + return values[value_idx] + + # --- Configure logging with a default level that can be changed --- logging.basicConfig( level=logging.INFO, @@ -73,7 +107,10 @@ def generate_loki_data( start_time: datetime, end_time: datetime, time_step_seconds: int, - config: Dict[str, Any] + config: Dict[str, Any], + project_id: Union[str, int, None] = None, + user_id: Union[str, int, None] = None, + reverse_timestamps: bool = False, ): """ Generate synthetic Loki log data by preparing a data list and rendering. @@ -85,6 +122,12 @@ def generate_loki_data( end_time (datetime): The end time for data generation. time_step_seconds (int): The duration of each log entry in seconds. config (Dict[str, Any]): Configuration dictionary loaded from file. + project_id: Optional value to inject as groupby.project_id in every + log entry in the output (overrides test_* file value when set). + user_id: Optional value to inject as groupby.user_id in every + log entry in the output (overrides test_* file value when set). + reverse_timestamps (bool): If True, reverse the order of timestamps + in the JSON output (youngest first, oldest last). 
""" # Hardcoded constant for invalid timestamps invalid_timestamp = "INVALID_TIMESTAMP" @@ -175,37 +218,45 @@ def generate_loki_data( logger.error(f"Invalid log type configuration: {log_type_config}") raise ValueError("Each log type in log_types must be a dictionary") - log_type_name = log_type_config.get("name") - if not log_type_name: - logger.error("Each log type must have a 'name' field") - raise ValueError("Each log type must have a 'name' field") + # "type" is log-type identifier (dict key) and output value + type_key = log_type_config.get("type") + if not type_key: + logger.error("Each log type must have a 'type' field") + raise ValueError("Each log type must have a 'type' field") # Validate required fields - missing = [f for f in required_fields if f not in log_type_config] + required_for_item = [f for f in required_fields if f != "name"] + missing = [f for f in required_for_item if f not in log_type_config] if missing: logger.error( - f"Missing required fields in {log_type_name} config: {missing}" + f"Missing required fields in {type_key!r} config: {missing}" ) raise ValueError( - f"Missing required fields in {log_type_name}: {missing}" + f"Missing required fields in {type_key!r}: {missing}" ) # Build groupby from config groupby = log_type_config.get("groupby", {}) if not isinstance(groupby, dict): logger.error( - f"groupby must be a dictionary for {log_type_name}" + f"groupby must be a dictionary for {type_key!r}" ) raise ValueError( - f"groupby must be a dictionary for {log_type_name}" + f"groupby must be a dictionary for {type_key!r}" ) - log_types[log_type_name] = { - "type": log_type_config["type"], + # Ensure qty and price are lists for step-based distribution + qty_val = log_type_config["qty"] + price_val = log_type_config["price"] + qty_list = qty_val if isinstance(qty_val, list) else [qty_val] + price_list = price_val if isinstance(price_val, list) else [price_val] + + log_types[type_key] = { + "type": type_key, "unit": log_type_config["unit"], 
"description": log_type_config.get("description"), - "qty": log_type_config["qty"], - "price": log_type_config["price"], + "qty": qty_list, + "price": price_list, "groupby": groupby.copy(), "metadata": log_type_config.get("metadata", {}) } @@ -231,15 +282,21 @@ def tojson_preserve_order(obj): # --- Render the template in one pass with all the data --- logger.info("Rendering final output...") + if reverse_timestamps: + log_data_list.reverse() + logger.debug( + "Reversed timestamp order (youngest first, oldest last)." + ) + + # Calculate total number of steps for value distribution + num_steps = len(log_data_list) + logger.debug(f"Total number of time steps: {num_steps}") + # Pre-calculate log types with date fields for each time step log_types_list = [] for idx, item in enumerate(log_data_list): - # For the last entry, use end_time to ensure it shows today's date - if idx == len(log_data_list) - 1: - dt = end_time - else: - epoch_seconds = item["nanoseconds"] / 1_000_000_000 - dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc) + epoch_seconds = item["nanoseconds"] / 1_000_000_000 + dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc) iso_year, iso_week, _ = dt.isocalendar() day_of_year = dt.timetuple().tm_yday @@ -267,6 +324,17 @@ def tojson_preserve_order(obj): log_type_with_dates = log_type_data.copy() log_type_with_dates["groupby"] = log_type_data["groupby"].copy() log_type_with_dates["groupby"].update(date_fields) + if project_id is not None: + log_type_with_dates["groupby"]["project_id"] = project_id + if user_id is not None: + log_type_with_dates["groupby"]["user_id"] = user_id + # Select qty and price based on step index distribution + log_type_with_dates["qty"] = _get_value_for_step( + log_type_data["qty"], idx, num_steps + ) + log_type_with_dates["price"] = _get_value_for_step( + log_type_data["price"], idx, num_steps + ) log_types_with_dates[log_type_name] = log_type_with_dates log_types_list.append(log_types_with_dates) @@ -296,8 +364,19 
@@ def tojson_preserve_order(obj): ) except IOError as e: logger.error(f"Failed to write to output file '{output_path}': {e}") - except Exception as e: - logger.error(f"An unexpected error occurred during file write: {e}") + raise + + # --- Step 5: Validate that the output is valid JSON --- + try: + with output_path.open('r') as f_in: + json.load(f_in) + logger.info("Output file validated as valid JSON.") + except json.JSONDecodeError as e: + logger.error( + f"Output file is not valid JSON: {e}. " + f"Delete '{output_path}' and fix the template or data." + ) + sys.exit(1) def main(): @@ -324,8 +403,30 @@ def main(): required=True, help="Path to the output file." ) + parser.add_argument( + "-p", "--project-id", + type=str, + default=None, + metavar="ID", + help="Optional alphanumeric value to use as groupby.project_id in " + "every log entry in the output (overrides value from test file)." + ) + parser.add_argument( + "-u", "--user-id", + type=str, + default=None, + metavar="ID", + help="Optional alphanumeric value to use as groupby.user_id in " + "every log entry in the output (overrides value from test file)." + ) # --- Optional Utility Arguments --- + parser.add_argument( + "-r", "--reverse", + action="store_true", + help="Reverse timestamp order in JSON output: youngest first, " + "oldest last (default is oldest first, youngest last)." 
+ ) parser.add_argument( "--debug", action="store_true", @@ -362,7 +463,10 @@ def main(): start_time=start_time_utc, end_time=end_time_utc, time_step_seconds=step_seconds, - config=config + config=config, + project_id=args.project_id, + user_id=args.user_id, + reverse_timestamps=args.reverse, ) except FileNotFoundError: logger.error( diff --git a/roles/telemetry_chargeback/files/gen_synth_loki_metrics_totals.py b/roles/telemetry_chargeback/files/gen_synth_loki_metrics_totals.py new file mode 100644 index 000000000..04dd88be2 --- /dev/null +++ b/roles/telemetry_chargeback/files/gen_synth_loki_metrics_totals.py @@ -0,0 +1,163 @@ +#!/usr/bin/env python3 +""" +Calculate metric totals and aggregate total from a Loki JSON file. + +Output is in YAML format. +""" +import argparse +import json +import sys +import yaml +from pathlib import Path + + +def calculate_totals(json_path: Path, output_path: Path): + """ + Read Loki JSON, calculate step totals (qty * price), and sum them up. + + Args: + json_path: Path to the input JSON file. + output_path: Path to the output YAML file. 
+ """ + try: + with json_path.open('r') as f: + data = json.load(f) + except Exception as e: + print(f"Error reading JSON file {json_path}: {e}") + sys.exit(1) + + # Support both synthetic format (top-level "streams") and Loki API format + # (top-level "data" -> "result" from query_range response) + streams = data.get('streams') + if streams is None: + streams = data.get('data', {}).get('result', []) + if not isinstance(streams, list): + streams = [] + + metric_totals = {} + aggregate_total = 0.0 + time_steps_set = set() + log_count = 0 + # Per-timestamp start/end from log entries (same for all entries at step) + time_step_bounds = {} + + # Extract values from the Loki JSON structure + for stream in streams: + for val_pair in stream.get('values', []): + log_count += 1 + try: + # The first element is the timestamp (nanoseconds) + timestamp = val_pair[0] + time_steps_set.add(timestamp) + + # The second element is a JSON string containing the log entry + entry = json.loads(val_pair[1]) + + # Start/end for this time step (same for all entries at step) + if timestamp not in time_step_bounds: + time_step_bounds[timestamp] = { + "begin": entry.get("start"), + "end": entry.get("end"), + } + + m_type = entry.get('type') + if m_type is None: + m_type = 'unknown' + + qty = float(entry.get('qty', 0)) + price = float(entry.get('price', 0)) + + step_total = qty * price + + if m_type not in metric_totals: + metric_totals[m_type] = 0.0 + + metric_totals[m_type] += step_total + aggregate_total += step_total + except (json.JSONDecodeError, ValueError, IndexError) as e: + print(f"Warning: Skipping malformed entry: {e}") + continue + + # First and last time step timestamps (order by numeric value) + sorted_ts = ( + sorted(time_steps_set, key=lambda t: int(t)) if time_steps_set else [] + ) + timestamp_begin = ( + time_step_bounds[sorted_ts[0]]["begin"] if sorted_ts else None + ) + timestamp_end = ( + time_step_bounds[sorted_ts[-1]]["end"] if sorted_ts else None + ) + begin_nano = 
int(sorted_ts[0]) if sorted_ts else None + end_nano = int(sorted_ts[-1]) if sorted_ts else None + + # Prepare data for YAML output with time section and rates. + # log_count = total [timestamp, log_entry] pairs. When each timestep has + # the same number of metrics, log_count == total_time_steps * + # metrics_per_step. + total_time_steps = len(time_steps_set) + metrics_per_step = ( + log_count // total_time_steps if total_time_steps > 0 else 0 + ) + if total_time_steps > 0 and log_count % total_time_steps != 0: + print( + f"Warning: log_count ({log_count}) is not divisible by " + f"total_time_steps ({total_time_steps}). " + "Expected log_count = total_time_steps × metrics_per_step." + ) + + synth_rate = { + m: round(t, 4) for m, t in sorted(metric_totals.items()) + } + synth_rate["total_rate"] = round(aggregate_total, 4) + + output_data = { + "time": { + "begin": timestamp_begin, + "begin_nano": begin_nano, + "end": timestamp_end, + "end_nano": end_nano, + }, + "data_log": { + "total_time_steps": total_time_steps, + "metrics_per_step": metrics_per_step, + "log_count": log_count, + }, + "synth_rate": synth_rate, + } + + # Write to output file in YAML format + try: + with output_path.open('w') as f_out: + f_out.write("---\n") + yaml.dump( + output_data, f_out, default_flow_style=False, sort_keys=False + ) + print( + f"Successfully calculated totals and wrote YAML to {output_path}" + ) + except Exception as e: + print(f"Error writing to output file {output_path}: {e}") + sys.exit(1) + + +def main(): + """Main entry point for the script.""" + parser = argparse.ArgumentParser( + description="Calculate totals from Loki JSON data" + ) + parser.add_argument( + "-j", "--json", required=True, type=Path, + help="Path to the input JSON file." + ) + parser.add_argument( + "-o", "--output", required=True, type=Path, + help="Path to the output YAML file." 
+    )
+    # The role's tasks always pass --debug (possibly with an empty value),
+    # so the flag must be accepted here; the directory is not used by
+    # calculate_totals itself.
+    parser.add_argument(
+        "--debug", type=str, default="",
+        help="Optional directory for debug artifacts."
+    )
+
+    args = parser.parse_args()
+    calculate_totals(args.json, args.output)
+
+
+if __name__ == "__main__":
+    main()
diff --git a/roles/telemetry_chargeback/files/test_dyn_basic.yml b/roles/telemetry_chargeback/files/test_dyn_basic.yml
new file mode 100644
index 000000000..796d7c6f1
--- /dev/null
+++ b/roles/telemetry_chargeback/files/test_dyn_basic.yml
@@ -0,0 +1,144 @@
+---
+# Scenario configuration for synthetic Loki log data generation
+
+# Time range configuration
+generation:
+  days: 1
+  step_seconds: 14400
+
+# Log type definitions (single "type" = identifier and value pushed to output)
+log_types:
+  - type: ceilometer_image_size
+    unit: MiB
+    description: "Size of ceilometer image"
+    qty:
+      - 20.6
+      - 25.0
+      - 26.0
+      - 30.5
+    price:
+      - 0.02
+      - 0.03
+      - 0.04
+      - 0.07
+      - 0.10
+    groupby:
+      id: null
+      user_id: null
+      project_id: null
+      tenant_id: tenant-01
+    metadata:
+      container_format: bare
+      disk_format: qcow2
+
+  - type: ceilometer_cpu_num
+    unit: scalar
+    description: "max number of cpus used in time step"
+    qty:
+      - 1.0
+    price:
+      - 0.3
+    groupby:
+      id: null
+      user_id: null
+      project_id: null
+      tenant_id: tenant-02
+    metadata:
+      flavor_name: m1.tiny
+      flavor_id: "1"
+      vcpus: ""
+
+  - type: ceilometer_ip_floating
+    unit: ip
+    description: null
+    qty:
+      - 0.0
+    price:
+      - 0.50
+    groupby:
+      id: null
+      user_id: null
+      project_id: null
+      tenant_id: tenant-01
+    metadata:
+      state: null
+
+  - type: ceilometer_disk_ephemeral_size
+    unit: GiB
+    description: "Max at each timestep"
+    qty:
+      - 0.0
+    price:
+      - 0.0
+    groupby:
+      id: null
+      user_id: null
+      project_id: null
+      tenant_id: tenant-01
+    metadata:
+      type: null
+
+  - type: ceilometer_disk_root_size
+    unit: GiB
+    description: null
+    qty:
+      - 0.0
+    price:
+      - 0.0
+    groupby:
+      id: null
+      user_id: null
+      project_id: null
+      tenant_id: tenant-02
+    metadata:
+      disk_format: null
+
+  - type: ceilometer_network_outgoing_bytes
+    unit: B
+    description: null
+    qty:
+      - 0.0
+    price:
+      - 0.0
+    groupby:
+      id: null
+      user_id: null
+      project_id: null
+      tenant_id: tenant-01
+    metadata:
+      vm_instance: null
+
+  - type: ceilometer_network_incoming_bytes
+    unit: B
+    description: null
+    qty:
+      - 0.0
+    price:
+      - 0.0
+    groupby:
+      id: null
+      user_id: null
+      project_id: null
+      tenant_id: tenant-02
+    metadata:
+      vm_instance: null
+
+# Required fields for validation (top-level fields only, not nested in groupby)
+required_fields:
+  - type
+  - unit
+  - qty
+  - price
+  - groupby
+  - metadata
+
+# Date field names to add to groupby
+date_fields:
+  - week_of_the_year
+  - day_of_the_year
+  - month
+  - year
+
+# Loki stream configuration
+loki_stream:
+  service: cloudkitty
diff --git a/roles/telemetry_chargeback/files/test_static.yml b/roles/telemetry_chargeback/files/test_static.yml
deleted file mode 100644
index f94a3c1d2..000000000
--- a/roles/telemetry_chargeback/files/test_static.yml
+++ /dev/null
@@ -1,57 +0,0 @@
-# Scenario configuration for synthetic Loki log data generation
-
-# Time range configuration
-generation:
-  days: 1
-  step_seconds: 7200
-
-# Log type definitions
-log_types:
-  - name: ceilometer_image_size
-    type: ceilometer_image_size
-    unit: MiB
-    description: null
-    qty: 20.6
-    price: 0.02
-    groupby:
-      id: cd65d30f-8b94-4fa3-95dc-e3b429f479b2
-      project_id: 0030775de80e4d84a4fd0d73e0a1b3a7
-      user_id: null
-    metadata:
-      container_format: bare
-      disk_format: qcow2
-
-  - name: instance
-    type: instance
-    unit: instance
-    description: null
-    qty: 1.0
-    price: 0.3
-    groupby:
-      id: de168c31-ed44-4a1a-a079-51bd238a91d6
-      project_id: 9cf5bcfc61a24682acc448af2d062ad2
-      user_id: c29ab6e886354bbd88ee9899e62d1d40
-    metadata:
-      flavor_name: m1.tiny
-      flavor_id: "1"
-      vcpus: ""
-
-# Required fields for validation (top-level fields only, not nested in groupby)
-required_fields:
-  - type
-  - unit
-  - qty
-  - price
-  - groupby
-  - metadata
-
-# Date field names to add to groupby
-date_fields:
-  - week_of_the_year
-  - day_of_the_year
-  - month
-  - year
-
-# Loki stream configuration
-loki_stream:
-  service: cloudkitty
diff --git a/roles/telemetry_chargeback/tasks/chargeback_tests.yml b/roles/telemetry_chargeback/tasks/chargeback_tests.yml
index df07fb503..0b06b4d41 100644
--- a/roles/telemetry_chargeback/tasks/chargeback_tests.yml
+++ b/roles/telemetry_chargeback/tasks/chargeback_tests.yml
@@ -3,12 +3,14 @@
   ansible.builtin.command:
     cmd: "{{ openstack_cmd }} rating module enable hashmap"
   register: enable_hashmap
-  changed_when: True
+  changed_when: true
   failed_when: enable_hashmap.rc != 0
 
 - name: Find the current value of hashmap
   ansible.builtin.shell:
-    cmd: "{{ openstack_cmd }} rating module get hashmap -c Priority -f csv | tail -n +2"
+    cmd: "set -o pipefail && {{ openstack_cmd }} rating module get hashmap -c Priority -f csv | tail -n +2"
+    args:
+      executable: /bin/bash
   register: get_hashmap_priority
   changed_when: false
 
@@ -18,7 +20,7 @@
   register: set_hashmap_priority
   when: get_hashmap_priority.stdout | trim != '100'
   failed_when: (set_hashmap_priority.rc | default(42)) >= 1 or get_hashmap_priority.stdout == ""
-  changed_when: True
+  changed_when: true
 
 - name: Get status of all CloudKitty rating modules
   ansible.builtin.command:
@@ -31,7 +33,7 @@
     that:
       - "'hashmap' in module_list.stdout"
      - "'True' in (module_list.stdout_lines | select('search', 'hashmap') | first)"
-    fail_msg: "FAILED: CloudKitty module validation failed . Module states are not as expected."
+    fail_msg: "FAILED: CloudKitty module validation failed. Module states are not as expected."
     success_msg: "SUCCESS: CloudKitty modules (hashmap=True) are configured correctly."
 - name: TEST Set priority for CloudKitty hashmap module
diff --git a/roles/telemetry_chargeback/tasks/cleanup_ck.yml b/roles/telemetry_chargeback/tasks/cleanup_ck.yml
new file mode 100644
index 000000000..84e050c30
--- /dev/null
+++ b/roles/telemetry_chargeback/tasks/cleanup_ck.yml
@@ -0,0 +1,5 @@
+---
+- name: Cleanup local certificates
+  ansible.builtin.file:
+    path: "{{ local_cert_dir }}"
+    state: absent
diff --git a/roles/telemetry_chargeback/tasks/flush_loki_data.yml b/roles/telemetry_chargeback/tasks/flush_loki_data.yml
new file mode 100644
index 000000000..45e95c654
--- /dev/null
+++ b/roles/telemetry_chargeback/tasks/flush_loki_data.yml
@@ -0,0 +1,52 @@
+---
+# Flush Loki Ingester Memory to Storage
+
+- name: Flush Execution inside openstack CLI
+  block:
+    # create dir
+    - name: Create directory inside openstack CLI
+      ansible.builtin.command:
+        cmd: "oc exec -n {{ cloudkitty_namespace }} {{ openstackpod }} -- mkdir -p {{ remote_cert_dir }}"
+      changed_when: false
+
+    # certs to Flush data to Loki
+    - name: Create a directory to extract certificates
+      ansible.builtin.file:
+        path: "{{ local_cert_dir }}"
+        state: directory
+        mode: '0755'
+
+    # copy all certs
+    - name: Copy certificates to openstack CLI
+      ansible.builtin.command:
+        cmd: "oc cp {{ local_cert_dir }}/. {{ cloudkitty_namespace }}/{{ openstackpod }}:{{ remote_cert_dir }}/"
+      changed_when: true
+
+    # flush loki
+    - name: Trigger Flush
+      ansible.builtin.command:
+        cmd: >
+          oc exec -n {{ cloudkitty_namespace }} {{ openstackpod }} --
+          curl -v -X POST {{ ingester_flush_url }}
+          --cert {{ remote_cert_dir }}/tls.crt
+          --key {{ remote_cert_dir }}/tls.key
+          --cacert {{ remote_cert_dir }}/service-ca.crt
+      register: flush_response
+      changed_when: true
+      failed_when: flush_response.rc != 0
+
+    # Status
+    - name: Verify Flush Status
+      ansible.builtin.assert:
+        that:
+          - "'204' in flush_response.stderr or '200' in flush_response.stderr"
+        fail_msg: "Flush failed"
+        success_msg: "Ingester Memory Flushed successfully"
+
+  rescue:
+    - name: Debug Failure Output
+      ansible.builtin.debug:
+        msg:
+          - "Failure"
+          - "Stdout: {{ flush_response.stdout | default('') }}"
+          - "Stderr: {{ flush_response.stderr | default('') }}"
diff --git a/roles/telemetry_chargeback/tasks/gen_synth_loki_data.yml b/roles/telemetry_chargeback/tasks/gen_synth_loki_data.yml
index e37b54c6b..96b21b4ab 100644
--- a/roles/telemetry_chargeback/tasks/gen_synth_loki_data.yml
+++ b/roles/telemetry_chargeback/tasks/gen_synth_loki_data.yml
@@ -1,39 +1,40 @@
 ---
-- name: Check for preexisting output file
+- name: "Set variables dynamically {{ item }}"
+  ansible.builtin.set_fact:
+    cloudkitty_data_file: "{{ artifacts_dir_zuul }}/{{ item }}{{ cloudkitty_synth_data_suffix }}"
+    cloudkitty_synth_totals_file: "{{ artifacts_dir_zuul }}/{{ item }}{{ cloudkitty_synth_totals_metrics_suffix }}"
+    cloudkitty_test_file: "{{ cloudkitty_scenario_dir }}/{{ item }}.yml"
+
+- name: "Check for preexisting output file"
   ansible.builtin.stat:
-    path: "{{ ck_output_file_local }}"
+    path: "{{ cloudkitty_data_file }}"
   register: file_preexists
 
-- name: TEST Generate Synthetic Data
+- name: "Generate Synthetic Data {{ item }}"
   ansible.builtin.command:
     cmd: >
-      python3 "{{ ck_synth_script }}"
-      --tmpl "{{ ck_data_template }}"
-      -t "{{ ck_data_config }}"
-      -o "{{ ck_output_file_local }}"
+      python3 "{{ cloudkitty_synth_script }}"
+      -r
+      --tmpl "{{ cloudkitty_data_template }}"
+      -t "{{ cloudkitty_test_file }}"
+      -o "{{ cloudkitty_data_file }}"
+      {% if cloudkitty_project_id | default('') %} -p "{{ cloudkitty_project_id }}"{% endif %}
   register: script_output
-  when: not file_preexists.stat.exists | bool
+  when: not file_preexists.stat.exists | bool
   changed_when: script_output.rc == 0
 
-- name: Read the content of the file
-  ansible.builtin.slurp:
-    src: "{{ ck_output_file_local }}"
-  register: slurped_file
-
-- name: TEST Validate JSON format of synthetic data file
-  ansible.builtin.assert:
-    that:
-      # This filter will trigger a task failure if the string isn't valid JSON
-      - slurped_file.content | b64decode | from_json is defined
-    fail_msg: "The file does not contain valid JSON format."
-    success_msg: "JSON format validated successfully."
-
-- name: Print output_file_remote path
-  ansible.builtin.debug:
-    msg: "Synthetic data file: {{ ck_output_file_remote }}"
+- name: "Generate chargeback rating from synthetic data file {{ item }}"
+  ansible.builtin.command:
+    cmd: >
+      python3 "{{ cloudkitty_totals_script }}"
+      -j "{{ cloudkitty_data_file }}"
+      -o "{{ cloudkitty_synth_totals_file }}"
+      --debug "{{ cloudkitty_debug_dir }}"
+  register: synth_rating_info
+  when: not file_preexists.stat.exists | bool
+  changed_when: synth_rating_info.rc == 0
 
-- name: Copy output file to remote host
-  ansible.builtin.copy:
-    src: "{{ ck_output_file_local }}"
-    dest: "{{ ck_output_file_remote }}"
-    mode: '0644'
+- name: "Load metrics from YAML file"
+  ansible.builtin.include_vars:
+    file: "{{ cloudkitty_synth_totals_file }}"
+    name: synth_data_rates
diff --git a/roles/telemetry_chargeback/tasks/ingest_loki_data.yml b/roles/telemetry_chargeback/tasks/ingest_loki_data.yml
new file mode 100644
index 000000000..d6a5e8fe2
--- /dev/null
+++ b/roles/telemetry_chargeback/tasks/ingest_loki_data.yml
@@ -0,0 +1,42 @@
+---
+# Ingest the data log generated by gen_synth_loki_data.yml into Loki
+
+- name: Ingest data log to Loki via API
+  block:
+
+    - name: Read log file content
+      ansible.builtin.slurp:
+        src: "{{ artifacts_dir_zuul }}/{{ item }}{{ cloudkitty_synth_data_suffix }}"
+      register: log_file_content
+
+    - name: Push data to Loki
+      ansible.builtin.uri:
+        url: "{{ loki_push_url }}"
+        method: POST
+        body: "{{ log_file_content['content'] | b64decode | from_json }}"
+        body_format: json
+        client_cert: "{{ cert_dir }}/tls.crt"
+        client_key: "{{ cert_dir }}/tls.key"
+        validate_certs: false
+        status_code: 204
+        return_content: true
+      register: loki_response
+      ignore_errors: false
+      failed_when: loki_response.status != 204
+
+    # Success
+    - name: Confirm Success
+      ansible.builtin.debug:
+        msg: "Ingestion Successful!"
+
+  rescue:
+    # Rescue block
+    - name: Debug failure
+      ansible.builtin.debug:
+        msg: "{{ loki_response.status | default('N/A') }}"
+
+    # Failure
+    - name: Report Ingestion Failure
+      ansible.builtin.fail:
+        msg: "Ingestion Failed"
      ignore_errors: false
diff --git a/roles/telemetry_chargeback/tasks/load_loki_data.yml b/roles/telemetry_chargeback/tasks/load_loki_data.yml
new file mode 100644
index 000000000..6d1a58604
--- /dev/null
+++ b/roles/telemetry_chargeback/tasks/load_loki_data.yml
@@ -0,0 +1,12 @@
+---
+- name: "Ingest Cloudkitty data log: {{ item }}"
+  ansible.builtin.include_tasks:
+    file: ingest_loki_data.yml
+
+- name: "Flush data to Loki storage: {{ item }}"
+  ansible.builtin.include_tasks:
+    file: flush_loki_data.yml
+
+- name: "Retrieve data log from Loki: {{ item }}"
+  ansible.builtin.include_tasks:
+    file: retrieve_loki_data.yml
diff --git a/roles/telemetry_chargeback/tasks/loki_rate.yml b/roles/telemetry_chargeback/tasks/loki_rate.yml
new file mode 100644
index 000000000..9447adfec
--- /dev/null
+++ b/roles/telemetry_chargeback/tasks/loki_rate.yml
@@ -0,0 +1,29 @@
+---
+- name: "TEST Get Rate and Qty by type from Cloudkitty {{ item }}"
+  ansible.builtin.command:
+    cmd: "{{ openstack_cmd }} rating summary get -g type"
+  register: cost_totals_by_type
+  changed_when: true
+  failed_when: cost_totals_by_type.rc != 0
+
+- name: "**INFO** Print the rating by type {{ item }}"
+  ansible.builtin.debug:
+    var: cost_totals_by_type
+
+- name: "Save rating-by-type output to a file {{ item }}"
+  ansible.builtin.copy:
+    content: |
+      {{ cost_totals_by_type.stdout }}
+    dest: "{{ artifacts_dir_zuul }}/{{ item }}{{ cloudkitty_loki_totals_suffix }}"
+    mode: '0644'
+
+- name: "TEST Get Rate and Qty Summary from Cloudkitty {{ item }}"
+  ansible.builtin.command:
+    cmd: "{{ openstack_cmd }} rating summary get"
+  register: cost_totals_summary
+  changed_when: true
+  failed_when: cost_totals_summary.rc != 0
+
+- name: "**INFO** Print the rating summary {{ item }}"
+  ansible.builtin.debug:
+    var: cost_totals_summary
diff --git a/roles/telemetry_chargeback/tasks/main.yml b/roles/telemetry_chargeback/tasks/main.yml
index 98a94b233..abf89299d 100644
--- a/roles/telemetry_chargeback/tasks/main.yml
+++ b/roles/telemetry_chargeback/tasks/main.yml
@@ -1,6 +1,63 @@
 ---
-- name: "Validate Chargeback Feature"
+- name: "Validate Chargeback Feature is deployed correctly"
   ansible.builtin.include_tasks: "chargeback_tests.yml"
 
-- name: "Generate Synthetic Data"
-  ansible.builtin.include_tasks: "gen_synth_loki_data.yml"
+- name: "Setup Loki Environment"
+  ansible.builtin.include_tasks: "setup_loki_env.yml"
+
+- name: "Cloudkitty debug ON"
+  ansible.builtin.set_fact:
+    cloudkitty_debug_dir: "{{ artifacts_dir_zuul }}/debug_ck_db"
+  when: cloudkitty_debug | bool
+
+- name: "Cloudkitty debug OFF"
+  ansible.builtin.set_fact:
+    cloudkitty_debug_dir: ""
+  when: not cloudkitty_debug | bool
+
+- name: Get admin project ID for CI
+  ansible.builtin.command:
+    cmd: "{{ openstack_cmd }} project show admin -f value -c id"
+  register: get_admin_project_id
+  changed_when: false
+  failed_when: false
+
+- name: Set admin project ID for CI
+  ansible.builtin.set_fact:
+    cloudkitty_project_id: "{{ (get_admin_project_id.stdout | trim) | default('') }}"
+
+- name: Get admin user ID for CI
+  ansible.builtin.command:
+    cmd: "{{ openstack_cmd }} user show admin -f value -c id"
+  register: get_admin_user_id
+  changed_when: false
+  failed_when: false
+
+- name: Set admin user ID for CI
+  ansible.builtin.set_fact:
+    cloudkitty_user_id: "{{ (get_admin_user_id.stdout | trim) | default('') }}"
+
+- name: "Find test files"
+  ansible.builtin.find:
+    paths: "{{ cloudkitty_scenario_dir }}"
+    patterns: "test_*.yml"
+  register: found_files_raw
+
+- name: "Extract only the filenames into a clean list"
+  ansible.builtin.set_fact:
+    found_files: "{{ found_files_raw.files | map(attribute='path') | map('basename') | map('regex_replace', '\\.yml$', '') | list }}"
+
+- name: "Run scenario file through workflow"
+  block:
+    - name: "Process and Loop if files exist"
+      ansible.builtin.include_tasks: run_test_scenarios.yml
+      loop: "{{ found_files }}"
+      when: found_files | length > 0
+
+    - name: Cleanup after job run
+      ansible.builtin.include_tasks: cleanup_ck.yml
+
+  rescue:
+    - name: "Log failure"
+      ansible.builtin.debug:
+        msg: "Running test scenarios loop failed."
+
+    # Re-raise so a failed scenario is not silently swallowed by the rescue
+    - name: "Fail the role after logging"
+      ansible.builtin.fail:
+        msg: "Running test scenarios loop failed."
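The `find | map('basename') | map('regex_replace', '\.yml$', '')` chain in `main.yml` reduces each scenario path to a bare scenario name. A minimal Python sketch of the same transformation (the example paths below are hypothetical, not from a real run):

```python
from pathlib import Path

# Hypothetical scenario files, as ansible.builtin.find would report them
paths = [
    "/role/files/test_dyn_basic.yml",
    "/role/files/test_other.yml",
]

# Equivalent of: map('basename') | map('regex_replace', '\\.yml$', '')
found_files = [Path(p).stem for p in paths]
print(found_files)  # → ['test_dyn_basic', 'test_other']
```

Each resulting name is then used both to locate `files/<name>.yml` and to prefix every artifact that scenario produces.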
diff --git a/roles/telemetry_chargeback/tasks/retrieve_loki_data.yml b/roles/telemetry_chargeback/tasks/retrieve_loki_data.yml
new file mode 100644
index 000000000..99e7c8ea7
--- /dev/null
+++ b/roles/telemetry_chargeback/tasks/retrieve_loki_data.yml
@@ -0,0 +1,71 @@
+---
+- name: "Expected Count {{ item }}"
+  ansible.builtin.debug:
+    msg: "Input file has {{ synth_data_rates.data_log.log_count }} data entries that Loki has to return"
+
+# Query Loki
+- name: "Retrieve Logs from Loki via API {{ item }}"
+  block:
+    - name: Query Loki API
+      ansible.builtin.uri:
+        url: "{{ loki_query_url }}?query={{ logql_query | urlencode }}&start={{ synth_data_rates.time.begin_nano }}&limit={{ limit }}"
+        method: GET
+        client_cert: "{{ cert_dir }}/tls.crt"
+        client_key: "{{ cert_dir }}/tls.key"
+        ca_path: "{{ cert_dir }}/ca.crt"
+        validate_certs: false
+        return_content: true
+        body_format: json
+      register: loki_response
+      # Wait condition
+      until:
+        - loki_response.status == 200
+        - loki_response.json.status == 'success'
+        - loki_response.json.data.result | length > 0
+        - (loki_response.json.data.result | map(attribute='values') | map('length') | sum) >= (synth_data_rates.data_log.log_count | int)
+      retries: 25
+      delay: 60
+
+    - name: Save Loki Data to JSON file
+      ansible.builtin.copy:
+        content: "{{ loki_response.json | to_json }}"
+        dest: "{{ artifacts_dir_zuul }}/{{ item }}{{ cloudkitty_loki_data_suffix }}"
+        mode: '0644'
+
+    # Validate
+    - name: "Verify Data Integrity {{ item }}"
+      vars:
+        actual_count: "{{ loki_response.json.data.result | map(attribute='values') | map('length') | sum }}"
+      ansible.builtin.assert:
+        that:
+          - loki_response.json.status == 'success'
+          - loki_response.json.data.result | length > 0
+          - actual_count | int == (synth_data_rates.data_log.log_count | int)
+        fail_msg: >-
+          Query did not return all data entries. Expected
+          {{ synth_data_rates.data_log.log_count }} log entries, but Loki
+          only returned {{ actual_count }}
+        success_msg: "Query returned all data entries. Input file had {{ synth_data_rates.data_log.log_count }} entries and Loki returned {{ actual_count }}"
+
+  rescue:
+    - name: Debug failure
+      ansible.builtin.debug:
+        msg:
+          - "Status: {{ loki_response.status | default('Unknown') }}"
+          - "Body: {{ loki_response.content | default('No Content') }}"
+          - "Msg: {{ loki_response.msg | default('Request failed') }}"
+
+    # Failure
+    - name: Report Retrieval Failure
+      ansible.builtin.fail:
+        msg: "Retrieval Failed"
+
+- name: "Generate chargeback stats from Loki-retrieved data file: {{ item }}"
+  ansible.builtin.command:
+    cmd: >
+      python3 "{{ cloudkitty_totals_script }}"
+      -j "{{ artifacts_dir_zuul }}/{{ item }}{{ cloudkitty_loki_data_suffix }}"
+      -o "{{ artifacts_dir_zuul }}/{{ item }}{{ cloudkitty_loki_totals_metrics_suffix }}"
+      --debug "{{ cloudkitty_debug_dir }}"
+  register: synth_rating_info
+  changed_when: synth_rating_info.rc == 0
diff --git a/roles/telemetry_chargeback/tasks/run_test_scenarios.yml b/roles/telemetry_chargeback/tasks/run_test_scenarios.yml
new file mode 100644
index 000000000..1d12c4dc5
--- /dev/null
+++ b/roles/telemetry_chargeback/tasks/run_test_scenarios.yml
@@ -0,0 +1,29 @@
+---
+- name: "Generate Synthetic Data for each file: {{ item }}"
+  ansible.builtin.include_tasks: "gen_synth_loki_data.yml"
+
+- name: "Load data to Loki: {{ item }}"
+  ansible.builtin.include_tasks: "load_loki_data.yml"
+
+- name: "Get total rate from Loki: {{ item }}"
+  ansible.builtin.include_tasks: "loki_rate.yml"
+
+# Compare uploaded data totals vs downloaded data totals
+- name: "Read the synthetic totals file"
+  ansible.builtin.slurp:
+    src: "{{ artifacts_dir_zuul }}/{{ item }}{{ cloudkitty_synth_totals_metrics_suffix }}"
+  register: synth_data
+
+- name: "Read the Loki totals YAML file"
+  ansible.builtin.slurp:
+    src: "{{ artifacts_dir_zuul }}/{{ item }}{{ cloudkitty_loki_totals_metrics_suffix }}"
+  register: loki_data
+
+- name: "TEST Compare synthetic data vs loki data results {{ item }}"
+  ansible.builtin.assert:
+    that:
+      # Compare the entry counts and per-type rates computed by gen_db_summary.py
+      - (synth_data.content | b64decode | from_yaml).data_log == (loki_data.content | b64decode | from_yaml).data_log
+      - (synth_data.content | b64decode | from_yaml).synth_rate == (loki_data.content | b64decode | from_yaml).synth_rate
+    fail_msg: |
+      FAILED! Totals for {{ item }} differ between the synthetic input
+      and the Loki-retrieved data.
+    success_msg: "PASSED - Data totals are identical."
diff --git a/roles/telemetry_chargeback/tasks/setup_loki_env.yml b/roles/telemetry_chargeback/tasks/setup_loki_env.yml
new file mode 100644
index 000000000..b3cf7c121
--- /dev/null
+++ b/roles/telemetry_chargeback/tasks/setup_loki_env.yml
@@ -0,0 +1,70 @@
+---
+# Setup Loki Environment
+
+# Dynamic URLs
+- name: Get Loki Public Route Host
+  ansible.builtin.command:
+    cmd: oc get route cloudkitty-lokistack -n {{ cloudkitty_namespace }} -o jsonpath='{.spec.host}'
+  register: loki_route
+  changed_when: false
+
+- name: Set Loki URLs
+  ansible.builtin.set_fact:
+    # Base URL
+    loki_base_url: "https://{{ loki_route.stdout }}"
+
+    # Internal Flush URL (Service DNS: https://<service>.<namespace>.svc:3100/flush)
+    ingester_flush_url: "https://cloudkitty-lokistack-ingester-http.{{ cloudkitty_namespace }}.svc:3100/flush"
+
+- name: Set Derived Loki URLs
+  ansible.builtin.set_fact:
+    loki_push_url: "{{ loki_base_url }}/api/logs/v1/cloudkitty/loki/api/v1/push"
+    loki_query_url: "{{ loki_base_url }}/api/logs/v1/cloudkitty/loki/api/v1/query_range"
+
+- name: Debug URLs
+  ansible.builtin.debug:
+    msg:
+      - "Loki Route: {{ loki_base_url }}"
+      - "Push URL: {{ loki_push_url }}"
+      - "Flush URL: {{ ingester_flush_url }}"
+      - "Query URL: {{ loki_query_url }}"
+
+# Certs to Ingest & Retrieve data to/from Loki
+- name: Ensure Local Certificate Directory Exists
+  ansible.builtin.file:
+    path: "{{ cert_dir }}"
+    state: directory
+    mode: '0755'
+
+- name: Extract Certificates from Openshift Secret
+  ansible.builtin.command:
+    cmd: >
+      oc extract secret/{{ cert_secret_name }}
+      --to={{ cert_dir }}
+      --confirm
+      -n {{ cloudkitty_namespace }}
+  changed_when: true
+
+# Certs to Flush data to Loki
+- name: Extract Client Certificates
+  ansible.builtin.command:
+    cmd: >
+      oc extract {{ client_secret }}
+      --to={{ local_cert_dir }}
+      --confirm
+      -n {{ cloudkitty_namespace }}
+  changed_when: true
+
+- name: Extract CA Bundle
+  ansible.builtin.command:
+    cmd: >
+      oc extract {{ ca_configmap }}
+      --to={{ local_cert_dir }}
+      --confirm
+      -n {{ cloudkitty_namespace }}
+  changed_when: true
diff --git a/roles/telemetry_chargeback/template/loki_data_templ.j2 b/roles/telemetry_chargeback/templates/loki_data_templ.j2
similarity index 100%
rename from roles/telemetry_chargeback/template/loki_data_templ.j2
rename to roles/telemetry_chargeback/templates/loki_data_templ.j2
diff --git a/roles/telemetry_chargeback/vars/main.yml b/roles/telemetry_chargeback/vars/main.yml
index 1014a6a9e..1875dfb7f 100644
--- a/roles/telemetry_chargeback/vars/main.yml
+++ b/roles/telemetry_chargeback/vars/main.yml
@@ -1,9 +1,37 @@
 ---
-logs_dir_zuul: "/home/zuul/ci-framework-data/logs"
-artifacts_dir_zuul: "/home/zuul/ci-framework-data/artifacts"
-
-ck_synth_script: "{{ role_path }}/files/gen_synth_loki_data.py"
-ck_data_template: "{{ role_path }}/template/loki_data_templ.j2"
-ck_data_config: "{{ role_path }}/files/test_static.yml"
-ck_output_file_local: "{{ artifacts_dir_zuul }}/loki_synth_data.json"
-ck_output_file_remote: "{{ logs_dir_zuul }}/gen_loki_synth_data.log"
+logs_dir_zuul: "{{ ansible_env.HOME }}/ci-framework-data/logs"
+artifacts_dir_zuul: "{{ ansible_env.HOME }}/ci-framework-data/artifacts"
+
+cloudkitty_debug: true
+cloudkitty_scenario_dir: "{{ role_path }}/files"
+cloudkitty_synth_data_suffix: "-synth_data.json"
+cloudkitty_loki_data_suffix: "-loki_data.json"
+cloudkitty_synth_totals_metrics_suffix: "-synth_metrics_totals.yml"
+cloudkitty_loki_totals_metrics_suffix: "-loki_metrics_totals.yml"
+cloudkitty_loki_totals_suffix: "-loki_totals.yml"
+
+cloudkitty_synth_script: "{{ role_path }}/files/gen_synth_loki_data.py"
+cloudkitty_data_template: "{{ role_path }}/templates/loki_data_templ.j2"
+cloudkitty_totals_script: "{{ role_path }}/files/gen_db_summary.py"
+
+# Cloudkitty certificates
+cert_secret_name: "cert-cloudkitty-client-internal"
+cert_dir: "{{ ansible_user_dir }}/ck-certs"
+
+client_secret: "secret/cloudkitty-lokistack-gateway-client-http"
+ca_configmap: "cm/cloudkitty-lokistack-ca-bundle"
+remote_cert_dir: "osp-certs"
+local_cert_dir: "{{ ansible_env.HOME }}/ci-framework-data/flush_certs"
+
+# LogQL Query
+logql_query: "{{ loki_query | default('{service=\"cloudkitty\"}') }}"
+
+# OpenShift settings
+cloudkitty_namespace: "openstack"
+openstackpod: "openstackclient"
+
+# Time window settings
+lookback: 6
+limit: 50
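For a sanity check on the totals logic, the per-type aggregation performed by `gen_db_summary.py` (sum of `qty * price` per entry type) can be exercised against a hand-built payload in the synthetic `streams` shape. The timestamp and entry values below are illustrative, not from a real run:

```python
import json

# Two log entries at one timestep, in the synthetic "streams" shape
streams = [{
    "values": [
        ["1700000000000000000",
         json.dumps({"type": "ceilometer_image_size", "qty": 20.6, "price": 0.02})],
        ["1700000000000000000",
         json.dumps({"type": "ceilometer_cpu_num", "qty": 1.0, "price": 0.3})],
    ]
}]

metric_totals = {}
aggregate_total = 0.0
for stream in streams:
    for ts, raw in stream["values"]:
        # Each value pair is [nanosecond timestamp, JSON-encoded log entry]
        entry = json.loads(raw)
        step_total = float(entry["qty"]) * float(entry["price"])
        metric_totals[entry["type"]] = metric_totals.get(entry["type"], 0.0) + step_total
        aggregate_total += step_total

print(round(metric_totals["ceilometer_image_size"], 4))  # → 0.412  (20.6 * 0.02)
print(round(aggregate_total, 4))                         # → 0.712  (0.412 + 0.3)
```

These rounded per-type sums are what end up under `synth_rate` in the totals YAML, which is why the same script run on the synthetic file and on the Loki-retrieved file should produce identical results.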