Commit ea1643b

Document telemetry export lifecycle and timing
Added comprehensive section 6.5 explaining exactly when telemetry exports occur:
- Statement close: Aggregates metrics, exports only if batch full
- Connection close: ALWAYS exports all pending metrics via aggregator.close()
- Process exit: NO automatic export unless close() was called
- Batch size/timer: Automatic background exports

Included:
- Code examples showing actual implementation
- Summary table comparing all lifecycle events
- Best practices for ensuring telemetry export (SIGINT/SIGTERM handlers)
- Key differences from JDBC (JVM shutdown hooks vs manual close)

Clarified that aggregator.close() does three things:
1. Stops the periodic flush timer
2. Completes any remaining incomplete statements
3. Performs final flush to export all buffered metrics

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>
1 parent b9cf684 commit ea1643b

1 file changed: +152 -0 lines changed
spec/telemetry-design.md

Lines changed: 152 additions & 0 deletions
@@ -1632,6 +1632,158 @@ export default class DBSQLClient extends EventEmitter implements IDBSQLClient, I
---

## 6.5 Telemetry Export Lifecycle

This section clarifies **when** telemetry logs are exported during each lifecycle event.

### Export Triggers

Telemetry export can be triggered by:
1. **Batch size threshold** - When pending metrics reach the configured batch size (default: 100)
2. **Periodic timer** - Every flush interval (default: 5 seconds)
3. **Statement close** - Completes statement aggregation; triggers a batch export only if the batch is full
4. **Connection close** - Final flush of all pending metrics
5. **Terminal error** - Immediate flush for non-retryable errors

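The two automatic triggers in the list above (batch size threshold and periodic timer) can be sketched as a small buffer. This is a minimal illustration under stated assumptions, not the driver's implementation: `BufferedExporter`, `add`, `start`, `stop`, and the `exportBatch` callback are hypothetical names, and only the defaults (100 metrics, 5 seconds) come from this section.

```typescript
// Hypothetical sketch: none of these names are the driver's actual classes.
type Metric = { name: string; value: number };

class BufferedExporter {
  private pending: Metric[] = [];
  private timer: ReturnType<typeof setInterval> | null = null;
  private readonly exportBatch: (batch: Metric[]) => void;
  private readonly batchSize: number;
  private readonly flushIntervalMs: number;

  constructor(exportBatch: (batch: Metric[]) => void, batchSize = 100, flushIntervalMs = 5000) {
    this.exportBatch = exportBatch;
    this.batchSize = batchSize;
    this.flushIntervalMs = flushIntervalMs;
  }

  // Periodic timer trigger: flush whatever is buffered on every interval.
  start(): void {
    if (this.timer === null) {
      this.timer = setInterval(() => this.flush(), this.flushIntervalMs);
    }
  }

  // Batch size trigger: export as soon as the buffer reaches the threshold.
  add(metric: Metric): void {
    this.pending.push(metric);
    if (this.pending.length >= this.batchSize) {
      this.flush();
    }
  }

  // Hand the current buffer to the exporter and start a fresh one.
  flush(): void {
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = [];
    this.exportBatch(batch);
  }

  // Stop the timer (if running) and export anything still buffered.
  stop(): void {
    if (this.timer !== null) {
      clearInterval(this.timer);
      this.timer = null;
    }
    this.flush();
  }
}

// Usage with a batch size of 2; the timer is not started here.
const exported: Metric[][] = [];
const exporter = new BufferedExporter((batch) => { exported.push(batch); }, 2);
exporter.add({ name: 'a', value: 1 }); // buffered only
exporter.add({ name: 'b', value: 2 }); // threshold reached, batch exported
exporter.add({ name: 'c', value: 3 }); // buffered until the timer fires or stop() runs
exporter.stop(); // final flush exports the remaining metric
```

Swapping the buffer out before calling the exporter keeps metrics added during an export out of the batch that is already on its way.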
### Statement Close (DBSQLOperation.close())

**What happens:**
```typescript
// In DBSQLOperation.close()
try {
  // 1. Emit statement.complete event with latency and metrics
  this.telemetryEmitter.emitStatementComplete({
    statementId: this.statementId,
    sessionId: this.sessionId,
    latencyMs: Date.now() - this.startTime,
    resultFormat: this.resultFormat,
    chunkCount: this.chunkCount,
    bytesDownloaded: this.bytesDownloaded,
    pollCount: this.pollCount,
  });

  // 2. Mark statement complete in aggregator
  this.telemetryAggregator.completeStatement(this.statementId);
} catch (error: any) {
  // All exceptions swallowed
  logger.log(LogLevel.debug, `Error in telemetry: ${error.message}`);
}
```

**Export behavior:**
- Statement metrics are **aggregated and added to the pending batch**
- Closing a statement **does NOT export on its own**; an export happens at this point **ONLY if the batch size threshold is reached**
- Otherwise, metrics remain buffered until the next timer flush or connection close

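The buffering described above can be illustrated with a toy aggregator that folds per-statement events into one summary and only moves it to the pending batch on completion. This is a sketch under assumptions: apart from `completeStatement`, every name here (`StatementAggregator`, `recordChunk`, `pendingCount`, `exportedBatches`) is hypothetical, and the real aggregator may track different fields.

```typescript
// Hypothetical sketch: only completeStatement mirrors a name used in this section.
type StatementSummary = { statementId: string; chunkCount: number; bytesDownloaded: number };

class StatementAggregator {
  private readonly inFlight = new Map<string, StatementSummary>();
  private pending: StatementSummary[] = [];
  readonly exportedBatches: StatementSummary[][] = [];
  private readonly batchSize: number;

  constructor(batchSize: number) {
    this.batchSize = batchSize;
  }

  // Fold one downloaded chunk into the running summary for its statement.
  recordChunk(statementId: string, bytes: number): void {
    const entry = this.inFlight.get(statementId) ?? { statementId, chunkCount: 0, bytesDownloaded: 0 };
    entry.chunkCount += 1;
    entry.bytesDownloaded += bytes;
    this.inFlight.set(statementId, entry);
  }

  // Move the finished summary into the pending batch; export only if the batch is full.
  completeStatement(statementId: string): void {
    const entry = this.inFlight.get(statementId);
    if (entry === undefined) return;
    this.inFlight.delete(statementId);
    this.pending.push(entry);
    if (this.pending.length >= this.batchSize) {
      this.flush();
    }
  }

  pendingCount(): number {
    return this.pending.length;
  }

  // Stand-in for the real export: record the batch and reset the buffer.
  flush(): void {
    if (this.pending.length === 0) return;
    this.exportedBatches.push(this.pending);
    this.pending = [];
  }
}

// Usage with a batch size of 2.
const agg = new StatementAggregator(2);
agg.recordChunk('stmt-1', 1024);
agg.recordChunk('stmt-1', 2048);
agg.completeStatement('stmt-1'); // buffered, nothing exported yet
const afterFirstClose = { pending: agg.pendingCount(), exports: agg.exportedBatches.length };
agg.recordChunk('stmt-2', 512);
agg.completeStatement('stmt-2'); // batch is now full, so it is exported
```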
### Connection Close (DBSQLClient.close())

**What happens:**
```typescript
// In DBSQLClient.close()
try {
  // 1. Close aggregator (stops timer, completes statements, final flush)
  if (this.telemetryAggregator) {
    this.telemetryAggregator.close();
  }

  // 2. Release telemetry client (decrements ref count, closes if last)
  if (this.telemetryClientProvider) {
    await this.telemetryClientProvider.releaseClient(this.host);
  }

  // 3. Release feature flag context (decrements ref count)
  if (this.featureFlagCache) {
    this.featureFlagCache.releaseContext(this.host);
  }
} catch (error: any) {
  logger.log(LogLevel.debug, `Telemetry cleanup error: ${error.message}`);
}
```

**Export behavior:**
- **ALWAYS flushes** all pending metrics via `aggregator.close()`, which:
  1. Stops the periodic flush timer
  2. Completes any incomplete statements in the aggregation map
  3. Performs a final flush so that all buffered telemetry is exported before the connection closes
- Errors raised during this cleanup are logged at debug level and swallowed, so they never surface to the caller

**Aggregator.close() implementation:**
```typescript
// In MetricsAggregator.close()
close(): void {
  const logger = this.context.getLogger();

  try {
    // Step 1: Stop flush timer
    if (this.flushTimer) {
      clearInterval(this.flushTimer);
      this.flushTimer = null;
    }

    // Step 2: Complete any remaining statements
    for (const statementId of this.statementMetrics.keys()) {
      this.completeStatement(statementId);
    }

    // Step 3: Final flush
    this.flush();
  } catch (error: any) {
    logger.log(LogLevel.debug, `MetricsAggregator.close error: ${error.message}`);
  }
}
```

### Process Exit (Node.js shutdown)

**What happens:**
- **NO automatic export** if `DBSQLClient.close()` was not called
- Any telemetry still buffered is lost if the process exits without proper cleanup
- **Best practice**: Always call `client.close()` before exit

**Recommended pattern:**
```typescript
import { DBSQLClient } from '@databricks/sql';

const client = new DBSQLClient();

// Close the client on termination signals so buffered telemetry is flushed
const shutdown = async () => {
  try {
    await client.close(); // Ensures final telemetry flush
  } finally {
    process.exit(0);
  }
};

process.on('SIGINT', shutdown);
process.on('SIGTERM', shutdown);
```

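Signal handlers like the ones above run once per signal, so a second Ctrl-C while the first close is still in progress would call `close()` again. A possible refinement, sketched here under assumptions, is a handler that reuses the in-flight shutdown and reports failure through the exit code. `createShutdownHandler` and `Closeable` are hypothetical names, and the `exit` callback is injected only so the sketch can be exercised without terminating the process.

```typescript
// Hypothetical helper: not part of the driver's API.
interface Closeable {
  close(): Promise<unknown>;
}

function createShutdownHandler(
  resource: Closeable,
  exit: (code: number) => void,
): () => Promise<void> {
  let shutdown: Promise<void> | null = null;

  return () => {
    // Reuse the in-flight shutdown so repeated signals do not close twice.
    if (shutdown === null) {
      shutdown = resource.close().then(
        () => exit(0), // clean close: buffered telemetry had its final flush
        () => exit(1), // close() rejected: exit with a non-zero code
      );
    }
    return shutdown;
  };
}

// Registration (shown as comments so the sketch has no side effects):
//   const handler = createShutdownHandler(client, (code) => process.exit(code));
//   process.on('SIGINT', handler);
//   process.on('SIGTERM', handler);
```

Note that listeners for Node's `'exit'` event must be synchronous, so an asynchronous `close()` cannot be completed from there, and `SIGKILL` cannot be handled at all.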
### Summary Table

| Event | Statement Aggregated | Export Triggered | Notes |
|-------|---------------------|------------------|-------|
| **Statement Close** | ✅ Yes | ⚠️ Only if batch full | Metrics buffered, not immediately exported |
| **Batch Size Reached** | N/A | ✅ Yes | Automatic export when 100 metrics are buffered (default batch size) |
| **Periodic Timer** | N/A | ✅ Yes | Every 5 seconds (configurable) |
| **Connection Close** | ✅ Yes (incomplete) | ✅ Yes (guaranteed) | Completes all statements, flushes all metrics |
| **Process Exit** | ❌ No | ❌ No | Lost unless `close()` was called first |
| **Terminal Error** | N/A | ✅ Yes (immediate) | Auth errors, 4xx errors flushed right away |

### Key Differences from JDBC
1772+
1773+
**Node.js behavior:**
1774+
- Statement close does **not** automatically export (buffered until batch/timer/connection-close)
1775+
- Connection close **always** exports all pending metrics
1776+
- Process exit does **not** guarantee export (must call `close()` explicitly)
1777+
1778+
**JDBC behavior:**
1779+
- Similar buffering and batch export strategy
1780+
- JVM shutdown hooks provide more automatic cleanup
1781+
- Connection close behavior is the same (guaranteed flush)
1782+
1783+
**Recommendation**: Always call `client.close()` in a `finally` block or using `try-finally` to ensure telemetry is exported before the process exits.
1784+
1785+
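The `finally` recommendation can be captured in a small wrapper. This is a minimal sketch under assumptions: `withClient` and `ClosableClient` are hypothetical names, and the only thing assumed about the client is that it exposes an asynchronous `close()`.

```typescript
// Hypothetical helper: not part of the driver's API.
interface ClosableClient {
  close(): Promise<unknown>;
}

async function withClient<C extends ClosableClient, R>(
  client: C,
  work: (client: C) => Promise<R>,
): Promise<R> {
  try {
    return await work(client);
  } finally {
    // Runs whether work() resolved or threw, so the final telemetry flush is not skipped.
    await client.close();
  }
}
```

One caveat of this shape: if `close()` itself rejects, that error replaces any error thrown by `work()`.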
---
## 7. Privacy & Compliance

### 7.1 Data Privacy
