|
| 1 | +--- |
| 2 | +title: Elastic Contributes its Continuous Profiling Agent to OpenTelemetry |
| 3 | +linkTitle: Elastic Contributes Profiling Agent |
| 4 | +date: 2024-06-07 |
| 5 | +# prettier-ignore |
| 6 | +cSpell:ignore: Bahubali Christos Dmitry Filimonov Geisendörfer Halliday Kalkanis Shetti |
| 7 | +author: |
| 8 | + >- |
| 9 | + [Bahubali Shetti](https://github.com/bshetti) (Elastic), [Alexander |
| 10 | + Wert](https://github.com/AlexanderWert) (Elastic), [Morgan |
| 11 | + McLean](https://github.com/mtwo) (Splunk), [Ryan |
| 12 | + Perry](https://github.com/Rperry2174) (Grafana) |
| 13 | +issue: https://github.com/open-telemetry/community/issues/1918 |
| 14 | +sig: Profiling SIG |
| 15 | +--- |
| 16 | + |
| 17 | +Following significant collaboration between |
| 18 | +[Elastic](https://www.elastic.co/observability-labs/blog/elastic-donation-proposal-to-contribute-profiling-agent-to-opentelemetry) |
| 19 | +and [OpenTelemetry's profiling community](/blog/2024/profiling/), which included |
| 20 | +a thorough review process, we’re excited to announce that the OpenTelemetry |
| 21 | +project has accepted |
| 22 | +[Elastic's donation of its continuous profiling agent](https://github.com/open-telemetry/community/issues/1918). |
| 23 | + |
| 24 | +This marks a significant milestone in establishing profiling as a core telemetry |
| 25 | +signal in OpenTelemetry. Elastic’s [eBPF-based](https://ebpf.io/) profiling |
| 26 | +agent observes code across different programming languages and runtimes, |
| 27 | +third-party libraries, kernel operations, and system resources with low CPU and |
| 28 | +memory overhead in production. Both SREs and developers can now benefit from |
| 29 | +these capabilities: quickly identifying performance bottlenecks, maximizing |
| 30 | +resource utilization, reducing carbon footprint, and optimizing cloud spend. |
| 31 | + |
| 32 | +Elastic’s decision to contribute the project to OpenTelemetry was made to |
| 33 | +accelerate OpenTelemetry’s mission and enable effective observability through |
| 34 | +high-quality, portable telemetry. This collaboration also demonstrates a |
| 35 | +commitment to vendor neutrality and community-driven development, strengthening |
| 36 | +the overall profiling and observability ecosystems. |
| 37 | + |
| 38 | +The donation came about through constructive cooperation between Elastic and |
| 39 | +the OpenTelemetry community. We look forward to jointly establishing |
| 40 | +continuous profiling as an integral part of OpenTelemetry. |
| 41 | + |
| 42 | +With today’s acceptance, Elastic’s continuous profiling agent will be |
| 43 | +contributed to OpenTelemetry. The agent will now be jointly supported by |
| 44 | +Elastic’s team and a diverse set of official maintainers from different |
| 45 | +companies: |
| 46 | + |
| 47 | +- Dmitry Filimonov (Grafana Labs) |
| 48 | +- Felix Geisendörfer (Datadog) |
| 49 | +- Jonathan Halliday (Red Hat) |
| 50 | +- Christos Kalkanis (Elastic) |
| 51 | + |
| 52 | +## What is continuous profiling? |
| 53 | + |
| 54 | +[Continuous profiling](https://www.cncf.io/blog/2022/05/31/what-is-continuous-profiling/) |
| 55 | +is a technique used to understand the behavior of a software application by |
| 56 | +collecting information about its execution over time. This includes tracking the |
| 57 | +duration of function calls, memory usage, CPU usage, and other system resources |
| 58 | +along with associated metadata. |
| 59 | + |
| 60 | +## Benefits of continuous profiling |
| 61 | + |
| 62 | +Traditional profiling solutions, typically used for one-off, development-time |
| 63 | +optimizations, can have significant drawbacks limiting adoption in production |
| 64 | +environments: |
| 65 | + |
| 66 | +- Significant cost and performance overhead due to code instrumentation |
| 67 | +- Disruptive service restarts |
| 68 | +- Inability to get visibility into third-party libraries |
| 69 | + |
| 70 | +Continuous profiling, however, runs in the background with minimal overhead, |
| 71 | +providing real-time, actionable insights without the need to replicate issues in |
| 72 | +separate environments. |
| 73 | + |
| 74 | +This allows SREs, DevOps, and developers to see how code affects performance and |
| 75 | +cost, making code and infrastructure improvements easier. |
| 76 | + |
| 77 | +## Contribution of comprehensive profiling abilities |
| 78 | + |
| 79 | +The continuous profiling agent that Elastic is donating is |
| 80 | +[based on eBPF](https://ebpf.io/), which makes it a whole-system, always-on |
| 81 | +solution that observes your code, third-party libraries, kernel operations, and |
| 82 | +other code you don't own. It eliminates the need for code instrumentation |
| 83 | +(run-time/bytecode), recompilation, or service restarts, and it runs with low |
| 84 | +CPU (~1%) and memory overhead in production environments. |
| 85 | + |
| 86 | +The donated profiling agent helps identify non-optimal code paths, uncover |
| 87 | +"unknown unknowns", and gain comprehensive visibility into the runtime behavior |
| 88 | +of all applications. It supports a wide range of runtimes and languages, such |
| 89 | +as: |
| 90 | + |
| 91 | +- C/C++ |
| 92 | +- Rust |
| 93 | +- Zig |
| 94 | +- Go |
| 95 | +- Java |
| 96 | +- Python |
| 97 | +- Ruby |
| 98 | +- PHP |
| 99 | +- Node.js / V8 |
| 100 | +- Perl |
| 101 | +- .NET |
| 102 | + |
| 103 | +## Benefits to OpenTelemetry |
| 104 | + |
| 105 | +This contribution not only boosts the standardization of continuous profiling |
| 106 | +for observability but also accelerates its adoption as a key signal in |
| 107 | +OpenTelemetry. Customers benefit from a vendor-agnostic method of collecting |
| 108 | +profiling data and correlating it with existing signals, such as traces, |
| 109 | +metrics, and logs, which opens new potential for observability insights and a |
| 110 | +more efficient troubleshooting experience. |
| 111 | + |
| 112 | +### User benefits of OpenTelemetry Profiling |
| 113 | + |
| 114 | +OpenTelemetry-based continuous profiling unlocks the following possibilities for |
| 115 | +users: |
| 116 | + |
| 117 | +- Continuous profiling data complements the existing signals (traces, metrics, |
| 118 | + and logs) by providing detailed, code-level insights into the services' |
| 119 | + behavior. |
| 120 | + |
| 121 | +- Seamless correlation with other OpenTelemetry signals such as traces, |
| 122 | + increasing fidelity and investigatory depth. |
| 123 | + |
| 124 | +- Estimate environmental impact: Combining profiling data with OpenTelemetry's |
| 125 | + resource information (i.e., resource attributes) makes it possible to derive |
| 126 | + insights into the services' carbon footprint. |
| 127 | + |
| 128 | +- Through a detailed breakdown of services' resource utilization, profiling data |
| 129 | + provides actionable information on performance optimization opportunities. |
| 130 | + |
| 131 | +- Improved vendor neutrality: a vendor-agnostic eBPF-based profiling agent |
| 132 | + removes the need to rely on proprietary agents to collect profiling telemetry. |
| 133 | + |
| 134 | +With these benefits, SREs, developers, and DevOps teams can now manage their |
| 135 | +applications’ overall efficiency in the cloud while ensuring their engineering |
| 136 | +teams optimize it. |
| 137 | + |
| 138 | +As the next step, the OpenTelemetry Profiling SIG, which Elastic is a part of, |
| 139 | +will work jointly on integrating the donated agent into OpenTelemetry's |
| 140 | +component ecosystem. We look forward to providing a fully integrated and usable |
| 141 | +version of the new OpenTelemetry eBPF profiling agent to users soon. |