|
1 | | -# Azure-init Tracing System |
| 1 | +# `libazureinit-kvp` |
2 | 2 |
|
3 | | -## Overview |
| 3 | +`libazureinit-kvp` is the storage layer for Hyper-V KVP (Key-Value Pair) |
| 4 | +pool files used by Azure guests. |
4 | 5 |
|
5 | | -Azure-init implements a comprehensive tracing system that captures detailed information about the provisioning process. |
6 | | -This information is crucial for monitoring, debugging, and troubleshooting VM provisioning issues in Azure environments. |
7 | | -The tracing system is built on a multi-layered architecture that provides flexibility and robustness. |
| 6 | +It defines: |
| 7 | +- `KvpStore`: storage trait with explicit read/write/delete semantics. |
| 8 | +- `HyperVKvpStore`: production implementation backed by the Hyper-V |
| 9 | + binary pool file format. |
| 10 | +- `KvpLimits`: exported key/value byte limits for Hyper-V and Azure. |
8 | 11 |
|
9 | | -## Architecture |
| 12 | +## Record Format |
10 | 13 |
|
11 | | -The tracing architecture consists of four specialized layers, each handling a specific aspect of the tracing process: |
| 14 | +The Hyper-V pool file record format is fixed width: |
| 15 | +- Key field: 512 bytes |
| 16 | +- Value field: 2048 bytes |
| 17 | +- Total record size: 2560 bytes |
12 | 18 |
|
13 | | -### 1. EmitKVPLayer |
| 19 | +Records are appended to the file and zero-padded to fixed widths. |
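The fixed-width layout above can be sketched as follows. This is a minimal illustration, not the crate's code: `encode_record` and the constant names are hypothetical, and the only facts taken from the text are the 512-byte key field, the 2048-byte value field, and the zero padding.

```rust
// Field widths taken from the record format described above.
const KEY_FIELD_SIZE: usize = 512;
const VALUE_FIELD_SIZE: usize = 2048;
const RECORD_SIZE: usize = KEY_FIELD_SIZE + VALUE_FIELD_SIZE; // 2560

/// Hypothetical helper (not part of the crate's API): lay out one record.
/// Returns `None` if the key is empty or either field would overflow.
fn encode_record(key: &str, value: &str) -> Option<Vec<u8>> {
    let (k, v) = (key.as_bytes(), value.as_bytes());
    if k.is_empty() || k.len() > KEY_FIELD_SIZE || v.len() > VALUE_FIELD_SIZE {
        return None;
    }
    // Start from an all-zero buffer so unused bytes are already zero padding.
    let mut record = vec![0u8; RECORD_SIZE];
    record[..k.len()].copy_from_slice(k);
    record[KEY_FIELD_SIZE..KEY_FIELD_SIZE + v.len()].copy_from_slice(v);
    Some(record)
}
```

Appending the returned buffer to the pool file would add exactly one 2560-byte record.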
14 | 20 |
|
15 | | -**Purpose**: Processes spans and events by capturing metadata, generating key-value pairs (KVPs), and writing to Hyper-V's data exchange file. |
| 21 | +## Store Semantics |
16 | 22 |
|
17 | | -**Key Functions**: |
18 | | -- Captures span lifecycle events (creation, entry, exit, closing) |
19 | | -- Processes emitted events within spans |
20 | | -- Formats data as KVPs for Hyper-V consumption |
21 | | -- Writes encoded data to `/var/lib/hyperv/.kvp_pool_1` |
| 23 | +### `write(key, value)` |
22 | 24 |
|
23 | | -Additionally, events emitted with a `health_report` field are written as special provisioning reports using the key `PROVISIONING_REPORT`. |
| 25 | +- Append-only behavior: each call appends one new record. |
| 26 | +- Duplicate keys are allowed in the file. |
| 27 | +- Returns an error when: |
| 28 | + - key is empty |
| 29 | + - key byte length exceeds `max_key_size` |
| 30 | + - value byte length exceeds `max_value_size` |
| 31 | + - an I/O error occurs |
| 32 | +- Oversized values are rejected by the store (no silent truncation). |
| 33 | + Higher layers are responsible for chunking/splitting when required. |
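The rejection rules for `write` listed above might look like the sketch below. `Limits`, `ValidationError`, and `validate` are illustrative names invented for this example; the crate's real types and error variants may differ. Only the field names `max_key_size` and `max_value_size` come from the text.

```rust
/// Hypothetical stand-in for the exported limits; only the two field
/// names are taken from the documentation above.
struct Limits {
    max_key_size: usize,
    max_value_size: usize,
}

/// Hypothetical error type covering the non-I/O rejection cases.
#[derive(Debug, PartialEq)]
enum ValidationError {
    EmptyKey,
    KeyTooLarge { len: usize, max: usize },
    ValueTooLarge { len: usize, max: usize },
}

/// Check a key/value pair against byte limits before appending.
/// Oversized input is rejected rather than truncated.
fn validate(limits: &Limits, key: &str, value: &str) -> Result<(), ValidationError> {
    if key.is_empty() {
        return Err(ValidationError::EmptyKey);
    }
    // `str::len` counts bytes, which is what the limits are expressed in.
    if key.len() > limits.max_key_size {
        return Err(ValidationError::KeyTooLarge { len: key.len(), max: limits.max_key_size });
    }
    if value.len() > limits.max_value_size {
        return Err(ValidationError::ValueTooLarge { len: value.len(), max: limits.max_value_size });
    }
    Ok(())
}
```

Returning a distinct error for each case lets a higher layer decide whether to chunk the value or give up, instead of discovering data loss later.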
24 | 34 |
|
25 | | -**Integration with Azure**: |
26 | | -- The `/var/lib/hyperv/.kvp_pool_1` file is monitored by the Hyper-V `hv_kvp_daemon` service |
27 | | -- This enables key metrics and logs to be transferred from the VM to the Azure platform |
28 | | -- Administrators can access this data through the Azure portal or API |
| 35 | +### `read(key)` |
29 | 36 |
|
30 | | -### 2. OpenTelemetryLayer |
| 37 | +- Scans records and returns the value from the most recent matching key |
| 38 | + (last-write-wins). |
| 39 | +- Returns `Ok(None)` when the key is missing or the file does not exist.
31 | 40 |
|
32 | | -**Purpose**: Propagates tracing context and prepares span data for export. |
| 41 | +### `entries()` |
33 | 42 |
|
34 | | -**Key Functions**: |
35 | | -- Maintains distributed tracing context across service boundaries |
36 | | -- Exports standardized trace data to compatible backends |
37 | | -- Enables integration with broader monitoring ecosystems |
| 43 | +- Returns `HashMap<String, String>`. |
| 44 | +- Deduplicates duplicate keys using last-write-wins, matching `read`. |
| 45 | +- This exposes a logical unique-key view even though the file itself is |
| 46 | + append-only and may contain multiple records per key. |
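The last-write-wins view shared by `read` and `entries` can be sketched over an in-memory copy of the pool file. The function names here are hypothetical, and the sketch assumes that fields hold UTF-8 text ending at the first zero byte and that a trailing partial record is ignored; the text above does not confirm either detail.

```rust
use std::collections::HashMap;

const KEY_FIELD_SIZE: usize = 512;
const VALUE_FIELD_SIZE: usize = 2048;
const RECORD_SIZE: usize = KEY_FIELD_SIZE + VALUE_FIELD_SIZE;

/// Strip zero padding from a fixed-width field (assumed UTF-8).
fn decode_field(field: &[u8]) -> String {
    let end = field.iter().position(|&b| b == 0).unwrap_or(field.len());
    String::from_utf8_lossy(&field[..end]).into_owned()
}

/// Hypothetical sketch: build the deduplicated view of all records.
/// Records are visited in file order, so a later insert for the same
/// key overwrites the earlier one (last-write-wins).
fn entries_from_bytes(data: &[u8]) -> HashMap<String, String> {
    let mut map = HashMap::new();
    for record in data.chunks_exact(RECORD_SIZE) {
        let (key, value) = record.split_at(KEY_FIELD_SIZE);
        map.insert(decode_field(key), decode_field(value));
    }
    map
}

/// Reading one key is the same scan, keeping only the last match.
fn read_from_bytes(data: &[u8], wanted: &str) -> Option<String> {
    data.chunks_exact(RECORD_SIZE)
        .filter(|r| decode_field(&r[..KEY_FIELD_SIZE]) == wanted)
        .last()
        .map(|r| decode_field(&r[KEY_FIELD_SIZE..]))
}
```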
38 | 47 |
|
39 | | -### 3. Stderr Layer |
| 48 | +### `delete(key)` |
40 | 49 |
|
41 | | -**Purpose**: Formats and logs trace data to stderr. |
| 50 | +- Rewrites the file, omitting every record whose key matches.
| 51 | +- Returns `true` if at least one record was removed, else `false`. |
42 | 52 |
|
43 | | -**Key Functions**: |
44 | | -- Provides human-readable logging for immediate inspection |
45 | | -- Supports debugging during development |
46 | | -- Captures trace events even when other layers might fail |
| 53 | +## Truncate Semantics (`truncate_if_stale`) |
47 | 54 |
|
48 | | -### 4. File Layer |
| 55 | +`HyperVKvpStore::truncate_if_stale` clears stale records from previous |
| 56 | +boots by comparing file `mtime` to the current boot timestamp. |
49 | 57 |
|
50 | | -**Purpose**: Writes formatted logs to a file (default path: `/var/log/azure-init.log`). |
| 58 | +- If the file predates the current boot: truncate to zero length.
| 59 | +- If the file is current: leave unchanged.
| 60 | +- If lock contention occurs (`WouldBlock`): return `Ok(())` and skip. |
| 61 | +- Non-contention lock failures are returned as errors. |
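The staleness check itself can be sketched with the standard library alone. This is an assumption-laden illustration: `truncate_if_older_than` is a hypothetical name, the boot timestamp is passed in by the caller instead of being discovered, and file locking (including the `WouldBlock` skip path) is left out entirely.

```rust
use std::fs::{self, OpenOptions};
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Hypothetical sketch: truncate `path` if its mtime is earlier than
/// `boot_time`. Returns `Ok(true)` when the file was truncated and
/// `Ok(false)` when it was left unchanged. No locking is performed.
fn truncate_if_older_than(path: &Path, boot_time: SystemTime) -> io::Result<bool> {
    let modified = fs::metadata(path)?.modified()?;
    if modified >= boot_time {
        // Written during the current boot: keep the records.
        return Ok(false);
    }
    // Left over from a previous boot: drop all records.
    OpenOptions::new().write(true).open(path)?.set_len(0)?;
    Ok(true)
}
```

A real implementation would take the lock before comparing timestamps, so that a concurrent writer cannot append between the check and the truncation.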
51 | 62 |
|
52 | | -**Key Functions**: |
53 | | -- Provides a persistent log for post-provisioning inspection |
54 | | -- Uses file permissions `0600` when possible |
55 | | -- Log level controlled by `AZURE_INIT_LOG` (defaults to `info` for the file layer) |
| 63 | +## Limits and Azure Compatibility |
56 | 64 |
|
57 | | -## How the Layers Work Together |
| 65 | +`KvpLimits` is exported so callers (including diagnostics layers) can |
| 66 | +enforce and reuse exact bounds. |
58 | 67 |
|
59 | | -Despite operating independently, these layers collaborate to provide comprehensive tracing: |
| 68 | +- `KvpLimits::hyperv()` |
| 69 | + - `max_key_size = 512` |
| 70 | + - `max_value_size = 2048` |
| 71 | +- `KvpLimits::azure()` |
| 72 | + - `max_key_size = 512` |
| 73 | + - `max_value_size = 1022` (UTF-16: 511 characters + null terminator) |
60 | 74 |
|
61 | | -1. **Independent Processing**: Each layer processes spans and events without dependencies on other layers |
62 | | -2. **Ordered Execution**: Layers are executed in the order they are registered in `setup_layers` (stderr, OpenTelemetry, KVP if enabled, file if available) |
63 | | -3. **Complementary Functions**: Each layer serves a specific purpose in the tracing ecosystem: |
64 | | - - `EmitKVPLayer` focuses on Azure Hyper-V integration |
65 | | - - `OpenTelemetryLayer` handles standardized tracing and exports |
66 | | - - `Stderr Layer` provides immediate visibility for debugging |
| 75 | +Why the Azure limit is lower for values:
| 76 | +- Hyper-V record format allows 2048-byte values. |
| 77 | +- Azure host handling is stricter; values beyond 1022 bytes are |
| 78 | + silently truncated by host-side consumers. |
| 79 | +- For Azure VMs, use `KvpLimits::azure()` and rely on higher-level |
| 80 | + chunking when larger payloads must be preserved. |
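As a rough sketch of the higher-level chunking mentioned above, a caller could split a long value into pieces that each fit within `max_value_size`. `chunk_value` is a hypothetical helper, not part of this crate, and how the chunks are keyed and reassembled is up to the higher layer and is not shown.

```rust
/// Hypothetical helper: split `value` into pieces of at most `max_bytes`
/// bytes each, cutting only on UTF-8 character boundaries so every
/// piece remains valid text.
fn chunk_value(value: &str, max_bytes: usize) -> Vec<&str> {
    // A UTF-8 character is at most 4 bytes, so this guarantees progress.
    assert!(max_bytes >= 4, "max_bytes must fit at least one character");
    let mut chunks = Vec::new();
    let mut rest = value;
    while !rest.is_empty() {
        let mut end = rest.len().min(max_bytes);
        // Back up until the cut does not land inside a character.
        while !rest.is_char_boundary(end) {
            end -= 1;
        }
        let (head, tail) = rest.split_at(end);
        chunks.push(head);
        rest = tail;
    }
    chunks
}
```

With the Azure value limit of 1022 bytes, a 2500-byte ASCII payload would be split into chunks of 1022, 1022, and 456 bytes, each small enough to store without host-side truncation.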
67 | 81 |
|
68 | | -### Configuration |
| 82 | +## Record Count Behavior |
69 | 83 |
|
70 | | -The tracing system's behavior is controlled through configuration files and environment variables, allowing more control over what data is captured and where it's sent: |
| 84 | +There is no explicit record-count cap in this storage layer. |
| 85 | +The file grows with each append until external constraints (disk space, |
| 86 | +retention policy, or caller behavior) are applied. |
71 | 87 |
|
72 | | -- `telemetry.kvp_diagnostics` (config): Enables/disables KVP emission. Default: `true`. |
73 | | -- `telemetry.kvp_filter` (config): Optional `EnvFilter`-style directives to select which spans/events go to KVP. |
74 | | -- `azure_init_log_path.path` (config): Target path for the file layer. Default: `/var/log/azure-init.log`. |
75 | | -- `AZURE_INIT_KVP_FILTER` (env): Overrides `telemetry.kvp_filter`. Precedence: env > config > default. |
76 | | -- `AZURE_INIT_LOG` (env): Controls stderr and file fmt layers’ levels (defaults: stderr=`error`, file=`info`). |
| 88 | +## References |
77 | 89 |
|
78 | | -The KVP layer uses a conservative default filter aimed at essential provisioning signals; adjust that via the settings above as needed. |
79 | | -For more on how to use these configuration variables, see the [configuration documentation](./configuration.md#complete-configuration-example). |
80 | | - |
81 | | -## Practical Usage |
82 | | - |
83 | | -### Instrumenting Functions |
84 | | - |
85 | | -To instrument code with tracing, use the `#[instrument]` attribute on functions: |
86 | | - |
87 | | -```rust |
88 | | -use tracing::{instrument, Level, event}; |
89 | | - |
90 | | -#[instrument(fields(user_id = ?user.id))] |
91 | | -async fn provision_user(user: User) -> Result<(), Error> { |
92 | | - event!(Level::INFO, "Starting user provisioning"); |
93 | | - |
94 | | - // Function logic |
95 | | - |
96 | | - event!(Level::INFO, "User provisioning completed successfully"); |
97 | | - Ok(()) |
98 | | -} |
99 | | -``` |
100 | | - |
101 | | -### Emitting Events |
102 | | - |
103 | | -To record specific points within a span: |
104 | | - |
105 | | -```rust |
106 | | -use tracing::{event, Level}; |
107 | | - |
108 | | -fn configure_ssh_keys(user: &str, keys: &[String]) { |
109 | | - event!(Level::INFO, user = user, key_count = keys.len(), "Configuring SSH keys"); |
110 | | - |
111 | | - for (i, key) in keys.iter().enumerate() { |
112 | | - event!(Level::DEBUG, user = user, key_index = i, "Processing SSH key"); |
113 | | - // Process each key |
114 | | - } |
115 | | - |
116 | | - event!(Level::INFO, user = user, "SSH keys configured successfully"); |
117 | | -} |
118 | | -``` |
119 | | - |
120 | | -## Reference Documentation |
121 | | - |
122 | | -For more details on how the Hyper-V Data Exchange Service works, refer to the official documentation: |
123 | | -[Hyper-V Data Exchange Service (KVP)](https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/integration-services#hyper-v-data-exchange-service-kvp) |
124 | | - |
125 | | -For OpenTelemetry integration details: |
126 | | -[OpenTelemetry for Rust](https://opentelemetry.io/docs/instrumentation/rust/) |
| 90 | +- [Hyper-V Data Exchange Service (KVP)](https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/integration-services#hyper-v-data-exchange-service-kvp) |