---
title: Observing Lambdas using the OpenTelemetry Collector Extension Layer
author: '[Dominik Süß](https://github.com/theSuess) (Grafana)'
linkTitle: Observing Lambdas
date: 2025-02-05
sig: FaaS
issue: 5961
cSpell:ignore: Dominik
---

Getting telemetry data out of modern applications is very straightforward (or at
least it should be). You set up a collector which either receives data from your
application or asks it to provide an up-to-date state of various counters. This
happens every minute or so, and if it’s a second late or early, no one really
bats an eye. But what if the application isn’t around for long? What if every
second waiting for the data to be collected is billed? Then you’re most likely
thinking of Function-as-a-Service (FaaS) environments, the most well-known being
AWS Lambda.

In this execution model, functions are called directly, and the environment is
frozen afterward. You’re only billed for actual execution time and no longer
need a server waiting for incoming requests. This is also where the term
serverless comes from. Keeping the function alive until metrics can be collected
isn’t really an option, and even if you were willing to pay for that, different
invocations have completely separate contexts and don’t necessarily know about
the other executions happening simultaneously. You might now be saying: "I'll
just push all the data at the end of my execution, no issues here!", but that
doesn’t solve the problem. You’ll still have to pay for the time it takes to
send the data, and with many invocations, this adds up.

But there is another way! Lambda extension layers allow you to run any process
alongside your code, sharing the execution runtime and providing additional
services. With the
[opentelemetry-lambda](https://github.com/open-telemetry/opentelemetry-lambda/blob/main/collector/README.md)
extension layer, you get a local endpoint to send data to while it keeps track
of the Lambda lifecycle and ensures your telemetry gets to the storage layer.

## How does it work?

When your function is called for the first time, the extension layer starts an
instance of the OpenTelemetry Collector. The Collector build is a stripped-down
version, providing only the components necessary in the context of Lambda. It
registers with the Lambda
[Extensions API](https://docs.aws.amazon.com/lambda/latest/dg/runtimes-extensions-api.html)
and
[Telemetry API](https://docs.aws.amazon.com/lambda/latest/dg/telemetry-api.html).
By doing this, it receives notifications whenever your function is executed,
emits a log line, or the execution context is about to be shut down.
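The lifecycle integration builds on the documented Extensions API endpoints. As a minimal Python sketch, assuming the standard `AWS_LAMBDA_RUNTIME_API` environment variable and an illustrative extension name (the real Collector layer implements this in Go):

```python
import json
import os
import urllib.request

# The Extensions API is served by the Lambda runtime; outside of Lambda,
# AWS_LAMBDA_RUNTIME_API is unset, so a placeholder is used here.
RUNTIME_API = os.environ.get("AWS_LAMBDA_RUNTIME_API", "localhost:9001")
BASE = f"http://{RUNTIME_API}/2020-01-01/extension"


def registration_request(name, events=("INVOKE", "SHUTDOWN")):
    """Build the POST request that registers an extension for lifecycle events."""
    return urllib.request.Request(
        f"{BASE}/register",
        data=json.dumps({"events": list(events)}).encode(),
        headers={"Lambda-Extension-Name": name},
        method="POST",
    )


def event_loop(extension_id):
    """Poll /event/next; Lambda freezes the process while no event is pending."""
    while True:
        req = urllib.request.Request(
            f"{BASE}/event/next",
            headers={"Lambda-Extension-Identifier": extension_id},
        )
        with urllib.request.urlopen(req) as resp:
            event = json.load(resp)
        if event["eventType"] == "SHUTDOWN":
            break  # last chance to flush buffered telemetry
```

Registering returns a `Lambda-Extension-Identifier` header, which identifies the extension in all subsequent calls.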

### This is where the magic happens

Up until now, this just seems like extra work for nothing. You'll still have to
wait for the Collector to export the data, right? This is where the special
`decouple` processor comes in. It separates the receiving and exporting
components while interfacing with the Lambda lifecycle. This allows the Lambda
to return, even if not all data has been sent. At the next invocation (or on
shutdown) the Collector continues exporting the data while your function does
its thing.

{{< figure src="diagram-execution-timing.svg" caption="Diagram showcasing how execution timing differs with and without a Collector">}}

## How can I use it?

As of November 2024, the opentelemetry-lambda project publishes
[releases of the Collector extension layer](https://github.com/open-telemetry/opentelemetry-lambda/releases/tag/layer-collector%2F0.12.0).
It can be configured through a configuration file hosted either in an S3 bucket
or on an arbitrary HTTP server. It is also possible to bundle the configuration
file with your Lambda code. In both cases, there are tradeoffs to consider:
remote configuration files add to the cold start duration, as an additional
request needs to be made, while bundling the configuration increases the
management overhead when controlling the configuration for multiple Lambdas.
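Whichever option you choose, the Collector reads the location from the `OPENTELEMETRY_COLLECTOR_CONFIG_URI` environment variable. As a sketch (the bucket name, region, and hostname are placeholders):

```shell
# Bundled with the function code (no extra request at cold start):
OPENTELEMETRY_COLLECTOR_CONFIG_URI=/var/task/collector.yaml

# Hosted in an S3 bucket (extra request at cold start, central management):
OPENTELEMETRY_COLLECTOR_CONFIG_URI=s3://my-config-bucket.s3.eu-west-1.amazonaws.com/collector.yaml

# Served by an arbitrary HTTP server:
OPENTELEMETRY_COLLECTOR_CONFIG_URI=https://config.example.com/collector.yaml
```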

The simplest way to get started is with an embedded configuration. For this, add
a file called `collector.yaml` to your function. This is a regular Collector
configuration file. To take advantage of the Lambda-specific extensions, they
need to be configured. As an example, the configuration shown next receives
traces and logs from the Telemetry API and sends them to another endpoint.

```yaml
receivers:
  telemetryapi:
exporters:
  otlphttp/external:
    endpoint: 'external-collector:4318'
processors:
  batch:
  decouple:
service:
  pipelines:
    traces:
      receivers: [telemetryapi]
      processors: [batch, decouple]
      exporters: [otlphttp/external]
    logs:
      receivers: [telemetryapi]
      processors: [batch, decouple]
      exporters: [otlphttp/external]
```

If omitted, the `decouple` processor is added to the pipeline automatically. It
is included explicitly in this example to illustrate the entire pipeline. For
more information, see
[Autoconfiguration](https://github.com/open-telemetry/opentelemetry-lambda/tree/main/collector#auto-configuration).
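Attaching the layer to an existing function can be done via the AWS CLI, for example. This is a sketch: the layer ARN below is a placeholder, so substitute the ARN for your region, architecture, and version from the release notes.

```shell
aws lambda update-function-configuration \
  --function-name my-function \
  --layers arn:aws:lambda:<region>:<account>:layer:<collector-layer-name>:<version>
```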

Afterward, set the `OPENTELEMETRY_COLLECTOR_CONFIG_URI` environment variable to
`/var/task/collector.yaml`. Once the function is redeployed, you’ll see your
function logs appear! You can see this in action in the video below.

<p>
  <video controls style="width: 100%">
    <source src="./video-lambda-real-time.webm" />
  </video>
</p>

Every log line your Lambda produces will be sent to the `external-collector`
endpoint specified. You don't need to modify the code at all! From there,
telemetry data flows to your backend as usual. Since the transmission of
telemetry data might be frozen while the Lambda is not active, logs can arrive
delayed. They'll arrive either during the next execution or during the shutdown
interval.
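The local endpoint also works for telemetry you emit yourself. The following is a hypothetical Python handler, assuming an `otlp` receiver has been added to `collector.yaml` and listens on the default OTLP/HTTP port 4318; in practice you would use an OpenTelemetry SDK rather than hand-rolling the payload.

```python
import json
import time
import urllib.request

# Assumed local endpoint of the Collector extension's OTLP/HTTP receiver.
COLLECTOR = "http://localhost:4318/v1/logs"


def otlp_log_payload(body, service_name="my-function"):
    """Build an OTLP/HTTP JSON payload containing a single log record."""
    return {
        "resourceLogs": [{
            "resource": {"attributes": [{
                "key": "service.name",
                "value": {"stringValue": service_name},
            }]},
            "scopeLogs": [{
                "scope": {"name": "manual"},
                "logRecords": [{
                    "timeUnixNano": str(time.time_ns()),
                    "severityText": "INFO",
                    "body": {"stringValue": body},
                }],
            }],
        }]
    }


def handler(event, context):
    payload = json.dumps(otlp_log_payload("function invoked")).encode()
    req = urllib.request.Request(
        COLLECTOR,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Returns as soon as the local Collector accepts the data; the decouple
    # processor lets the function finish before the actual export happens.
    urllib.request.urlopen(req)
    return {"statusCode": 200}
```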

If you want further insight into your applications, also see the
[language-specific auto-instrumentation layers](https://github.com/open-telemetry/opentelemetry-lambda/?tab=readme-ov-file#extension-layer-language-support).