[EXPORTER] ostream log exporter, fix memory ownership issues #3417

Merged: 9 commits merged into open-telemetry:main on May 21, 2025

Conversation

Member

@marcalff marcalff commented May 15, 2025

Fixes #3135
Fixes #2651

Changes

Fixed the following memory ownership issues in the ostream log exporter:

  • In class ReadWriteLogRecord, member body_ now owns a copy of the log body
  • In class ReadWriteLogRecord, member attributes_map_ now owns a copy of log attributes

This prevents the use of stale pointers, which could previously lead to crashes.
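
To illustrate the ownership change, here is a minimal, self-contained sketch. It is not the code from this PR: `OwningRecord`, `MakeRecord` and `Demo` are hypothetical names, and the real `ReadWriteLogRecord` uses the SDK's own attribute types rather than plain `std::string`. The sketch only shows the general idea that a record which copies its body and attributes stays valid after the caller's buffers are destroyed, whereas a record holding only views into them would not.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <string_view>

// Hypothetical stand-in for a log record that owns its data.
// Assumption: names and types are illustrative, not the opentelemetry-cpp API.
class OwningRecord
{
public:
  // Copy the body instead of keeping a view into the caller's buffer.
  void SetBody(std::string_view body) { body_ = std::string(body); }

  // Copy both key and value so the map never refers to caller-owned memory.
  void SetAttribute(std::string_view key, std::string_view value)
  {
    attributes_map_[std::string(key)] = std::string(value);
  }

  const std::string &GetBody() const { return body_; }
  const std::map<std::string, std::string> &GetAttributes() const { return attributes_map_; }

private:
  std::string body_;                                   // owned copy of the log body
  std::map<std::string, std::string> attributes_map_;  // owned copies of the attributes
};

// Simulates a caller whose local strings are destroyed on return,
// as happens when a record is queued and exported later.
std::unique_ptr<OwningRecord> MakeRecord()
{
  std::string body = "request failed";
  std::string user = "alice";
  auto record      = std::make_unique<OwningRecord>();
  record->SetBody(body);
  record->SetAttribute("user", user);
  return record;  // body and user die here; the record keeps its own copies
}

// Reads the record after the caller's strings are gone.
std::string Demo()
{
  std::unique_ptr<OwningRecord> record = MakeRecord();
  assert(record->GetAttributes().count("user") == 1);
  return record->GetBody() + " user=" + record->GetAttributes().at("user");
}
```

In this sketch, `Demo()` returns `"request failed user=alice"`. Had `OwningRecord` stored `std::string_view` members instead, the same read would touch freed memory, which is the kind of stale-pointer access described above.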

For significant contributions please make sure you have completed the following items:

  • CHANGELOG.md updated for non-trivial changes
  • Unit tests have been added
  • Changes in public API reviewed


netlify bot commented May 15, 2025

Deploy Preview for opentelemetry-cpp-api-docs canceled.

🔨 Latest commit: 9303dee
🔍 Latest deploy log: https://app.netlify.com/projects/opentelemetry-cpp-api-docs/deploys/682d9d3ab897e40008da7177

@marcalff changed the title from "POC, alternate fix for log record lifecycle." to "[POC], [ALTERNATE], Fix lifetime for sdk::ReadWriteLogRecord" on May 15, 2025

codecov bot commented May 15, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.86%. Comparing base (f7babf1) to head (9303dee).
Report is 1 commits behind head on main.

@@            Coverage Diff             @@
##             main    #3417      +/-   ##
==========================================
- Coverage   90.06%   89.86%   -0.19%     
==========================================
  Files         212      212              
  Lines        6937     6941       +4     
==========================================
- Hits         6247     6237      -10     
- Misses        690      704      +14     
Files with missing lines Coverage Δ
sdk/src/logs/read_write_log_record.cc 96.43% <100.00%> (+0.05%) ⬆️

... and 7 files with indirect coverage changes

@marcalff changed the title from "[POC], [ALTERNATE], Fix lifetime for sdk::ReadWriteLogRecord" to "[EXPORTER] ostream log exporter, fixed memory ownership issues" on May 15, 2025
@marcalff marked this pull request as ready for review May 15, 2025 20:49
@marcalff requested a review from a team as a code owner May 15, 2025 20:49
@marcalff changed the title from "[EXPORTER] ostream log exporter, fixed memory ownership issues" to "[EXPORTER] ostream log exporter, fix memory ownership issues" on May 15, 2025
Member

@lalitb lalitb left a comment

Nicely done, thanks. I would ideally prefer to get rid of ReadableLogRecord, and just have ReadWriteLogRecord (probably renamed to LogData), and so have the API consistent with SpanData, but that's not related to the issue :)

@ThomsonTan
Contributor

Is this a breaking change if ReadWriteLogRecord is used by the user? Mention this in the changelog if so.

@ThomsonTan
Contributor

> Nicely done, thanks. I would ideally prefer to get rid of ReadableLogRecord, and just have ReadWriteLogRecord (probably renamed to LogData), and so have the API consistent with SpanData, but that's not related to the issue :)

Seems #3147 includes the renaming to LogRecordData.

@marcalff
Member Author

Added an important changes section in the CHANGELOG, to address @ThomsonTan comments.

Waiting for @owent approval, as an alternative fix to PR #3147.

@marcalff
Member Author

Additional testing:

@owent
Member

owent commented May 19, 2025

> Added an important changes section in the CHANGELOG, to address @ThomsonTan comments.

> Waiting for @owent approval, as an alternative fix to PR #3147.

This changes the API, and may prevent users' code from compiling. Personally, I prefer to use #3147, which only changes the APIs in the v2 version.

@lalitb
Member

lalitb commented May 19, 2025

> This changes the API, and may prevent users' code from compiling. Personally, I prefer to use #3147, which only changes the APIs in the v2 version.

@owent - this is only an SDK breaking change, so do we still need to have v2?

@marcalff
Member Author

>> This changes the API, and may prevent users' code from compiling. Personally, I prefer to use #3147, which only changes the APIs in the v2 version.

> @owent - this is only an SDK breaking change, so do we still need to have v2?

See related discussion #3147 (comment)

@owent
Member

owent commented May 20, 2025

>> This changes the API, and may prevent users' code from compiling. Personally, I prefer to use #3147, which only changes the APIs in the v2 version.

> @owent - this is only an SDK breaking change, so do we still need to have v2?

Sorry, I shouldn't have used the ABI version macro. My goal is to allow users time to migrate their code when upgrading otel-cpp, rather than causing immediate compilation errors.

@lalitb
Member

lalitb commented May 20, 2025

> Sorry, I shouldn't have used the ABI version macro. My goal is to allow users time to migrate their code when upgrading otel-cpp, rather than causing immediate compilation errors.

Thanks for clarifying. As I understand it, compile-time errors would only occur in the following cases:

  • Custom exporters using ReadWriteLogRecord or ReadableLogRecord.
  • Custom processors that rely on ReadWriteLogRecord or ReadableLogRecord for record filtering or redaction.

None of the core processors and exporters are affected by this change. Given that this is likely a small subset of users, I think introducing these breaking changes should be fine. The changelog should clearly specify the migration steps for affected users in this scenario.

@owent
Member

owent commented May 20, 2025

>> Sorry, I shouldn't have used the ABI version macro. My goal is to allow users time to migrate their code when upgrading otel-cpp, rather than causing immediate compilation errors.

> Thanks for clarifying. As I understand it, compile-time errors would only occur in the following cases:

>   • Custom exporters using ReadWriteLogRecord or ReadableLogRecord.
>   • Custom processors that rely on ReadWriteLogRecord or ReadableLogRecord for record filtering or redaction.

> None of the core processors and exporters are affected by this change. Given that this is likely a small subset of users, I think introducing these breaking changes should be fine. The changelog should clearly specify the migration steps for affected users in this scenario.

Yes, but in #3147, a new record class LogRecordData is added, which is used by the ostream exporter. This class is added only to mirror the name of trace::SpanData, but it is almost the same as ReadWriteLogRecord; the changes to ReadWriteLogRecord are only meant to keep ABI compatibility, to resolve the comment in #3147 (comment).

@marcalff
Member Author

>> Sorry, I shouldn't have used the ABI version macro. My goal is to allow users time to migrate their code when upgrading otel-cpp, rather than causing immediate compilation errors.

> Thanks for clarifying. As I understand it, compile-time errors would only occur in the following cases:

>   • Custom exporters using ReadWriteLogRecord or ReadableLogRecord.
>   • Custom processors that rely on ReadWriteLogRecord or ReadableLogRecord for record filtering or redaction.

> None of the core processors and exporters are affected by this change. Given that this is likely a small subset of users, I think introducing these breaking changes should be fine. The changelog should clearly specify the migration steps for affected users in this scenario.

The changelog already gives details about this, in the important changes section.

I agree that the potential build break will affect only a small subset of users, which is why this should be fine.

Also, note that people affected by the build break will be users who deliberately reuse some SDK classes in their own implementation: they should be fully aware of risks when choosing to do so.

> Yes, but in #3147, a new record class LogRecordData is added, which is used by the ostream exporter. This class is added only to mirror the name of trace::SpanData, but it is almost the same as ReadWriteLogRecord; the changes to ReadWriteLogRecord are only meant to keep ABI compatibility, to resolve the comment in #3147 (comment).

The comment about ABI changes in the SDK is technically correct, as ReadWriteLogRecord indeed changed (in the initial fix from #3147).

However, I think this comment should not be blocking: opentelemetry-cpp does not guarantee binary stability for the SDK implementation, so if an SDK class changes, it changes.

The rationale for creating yet another class like LogRecordData is to preserve a binary compatible ReadWriteLogRecord, but if there is no need to preserve binary compatibility (there is none), everything can be greatly simplified.

This entire fix (the present PR) is +70 -57 lines, and this includes the documentation in the changelog, and unit tests adjustments.

Please, let's not blow the fix scope out of proportion, when it can be done simply.

@owent
Member

owent commented May 21, 2025

>>> Sorry, I shouldn't have used the ABI version macro. My goal is to allow users time to migrate their code when upgrading otel-cpp, rather than causing immediate compilation errors.

>> Thanks for clarifying. As I understand it, compile-time errors would only occur in the following cases:

>>   • Custom exporters using ReadWriteLogRecord or ReadableLogRecord.
>>   • Custom processors that rely on ReadWriteLogRecord or ReadableLogRecord for record filtering or redaction.

>> None of the core processors and exporters are affected by this change. Given that this is likely a small subset of users, I think introducing these breaking changes should be fine. The changelog should clearly specify the migration steps for affected users in this scenario.

> The changelog already gives details about this, in the important changes section.

> I agree that the potential build break will affect only a small subset of users, which is why this should be fine.

> Also, note that people affected by the build break will be users who deliberately reuse some SDK classes in their own implementation: they should be fully aware of risks when choosing to do so.

>> Yes, but in #3147, a new record class LogRecordData is added, which is used by the ostream exporter. This class is added only to mirror the name of trace::SpanData, but it is almost the same as ReadWriteLogRecord; the changes to ReadWriteLogRecord are only meant to keep ABI compatibility, to resolve the comment in #3147 (comment).

> The comment about ABI changes in the SDK is technically correct, as ReadWriteLogRecord indeed changed (in the initial fix from #3147).

> However, I think this comment should not be blocking: opentelemetry-cpp does not guarantee binary stability for the SDK implementation, so if an SDK class changes, it changes.

> The rationale for creating yet another class like LogRecordData is to preserve a binary compatible ReadWriteLogRecord, but if there is no need to preserve binary compatibility (there is none), everything can be greatly simplified.

> This entire fix (the present PR) is +70 -57 lines, and this includes the documentation in the changelog, and unit tests adjustments.

> Please, let's not blow the fix scope out of proportion, when it can be done simply.

OK, I closed #3147, let's use this PR.

@marcalff marcalff merged commit b4254a4 into open-telemetry:main May 21, 2025
67 checks passed
Successfully merging this pull request may close these issues.

  • heap-use-after-free with BatchLogRecordProcessor
  • common::MakeAttributes creates garbled output for std::string
5 participants