Make it possible to set capabilities #366

Open
wants to merge 3 commits into
base: main

jaronoff97
Contributor

Closes #365

@jaronoff97 jaronoff97 requested a review from a team as a code owner March 24, 2025 20:21
@jaronoff97
Contributor Author

For context, I need this to solve open-telemetry/opentelemetry-operator#3822. Introducing leader election to the bridge for HA would mean one of the bridge pods would need to change its capabilities while running. I could accomplish this by shutting down and then restarting the OpAMP client, but that's very heavy and unnecessary IMO.


codecov bot commented Mar 24, 2025

Codecov Report

Attention: Patch coverage is 77.35849% with 12 lines in your changes missing coverage. Please review.

Project coverage is 80.02%. Comparing base (4b62964) to head (cf4a028).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| client/internal/clientcommon.go | 68.96% | 4 Missing and 5 partials ⚠️ |
| client/internal/clientstate.go | 83.33% | 2 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #366      +/-   ##
==========================================
- Coverage   80.10%   80.02%   -0.09%     
==========================================
  Files          25       25              
  Lines        2423     2458      +35     
==========================================
+ Hits         1941     1967      +26     
- Misses        374      380       +6     
- Partials      108      111       +3     


@@ -10,6 +10,8 @@ import (
"github.com/open-telemetry/opamp-go/protobufs"
)

var _ OpAMPClient = &httpClient{}
Contributor


nit: We use both styles in the code, but this one is more common.

Suggested change
- var _ OpAMPClient = &httpClient{}
+ var _ OpAMPClient = (*httpClient)(nil)
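
For readers unfamiliar with the idiom in this suggestion, here is a minimal sketch of both styles of compile-time interface assertion. `Greeter` and `englishGreeter` are hypothetical stand-ins for `OpAMPClient` and `httpClient`; they are not taken from the opamp-go codebase.

```go
package main

import "fmt"

// Greeter is a hypothetical interface standing in for OpAMPClient.
type Greeter interface {
	Greet() string
}

// englishGreeter is a hypothetical implementation standing in for httpClient.
type englishGreeter struct{}

// Greet has a pointer receiver, so only *englishGreeter satisfies Greeter.
func (g *englishGreeter) Greet() string { return "hello" }

// Style suggested in the review: convert nil to a typed pointer.
// The compiler verifies the method set without constructing a value.
var _ Greeter = (*englishGreeter)(nil)

// Style used in the PR: take the address of a composite literal.
// The compiler performs the same check; only the expression differs.
var _ Greeter = &englishGreeter{}

func main() {
	var g Greeter = &englishGreeter{}
	fmt.Println(g.Greet())
}
```

Both declarations fail to compile if a method is missing or has the wrong signature, so the choice between them is mostly a matter of consistency within a codebase.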

@@ -22,6 +22,8 @@ const (
defaultShutdownTimeout = 5 * time.Second
)

var _ OpAMPClient = &wsClient{}
Contributor


Suggested change
- var _ OpAMPClient = &wsClient{}
+ var _ OpAMPClient = (*wsClient)(nil)

Contributor

@andykellr andykellr left a comment


LGTM

@tigrannajaryan
Member

The spec does not say whether this is a supported operation (changing capabilities after they are first reported). Let me think about it.

Member

@tigrannajaryan tigrannajaryan left a comment


Blocking temporarily; give me a bit of time to think about the implications of this.

@andykellr
Contributor

andykellr commented Mar 25, 2025

> The spec does not say whether this is a supported operation (changing capabilities after they are first reported). Let me think about it.

Sounds good. I held off on merging until you had a chance to look at it.

My reading of the spec was that, since changing capabilities is neither specifically mentioned nor prohibited, it would be OK to allow it. Some servers may expect capabilities to stay the same for the life of the connection (based on the existing Go implementation, not on a requirement of the spec), but since they are passed with every message, I think the server should be able to adjust as needed.

@tigrannajaryan
Member

tigrannajaryan commented Mar 25, 2025

I don't think we can make this blanket change.

Some capabilities are checked at Start(), and the corresponding invariants for those capabilities are verified at the same time.

For example, in PrepareStart() we verify that if AcceptsPackages is set then PackagesStateProvider is also provided. Later in receivedProcessor we rely on this invariant. If we break the invariant (which you can easily do via SetCapabilities) then receivedProcessor will attempt to use a nil PackagesStateProvider (which will either crash or error, I didn't look further).
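
To make the concern concrete, here is a minimal sketch of how a blanket setter can break an invariant that was validated only at start. Every name in it (`Capabilities`, `client`, `checkInvariants`, `setCapabilitiesUnchecked`, `setCapabilitiesChecked`) is a hypothetical stand-in; it does not reproduce the actual opamp-go internals.

```go
package main

import (
	"errors"
	"fmt"
)

// Capabilities is a hypothetical bitmask standing in for the agent capabilities.
type Capabilities uint64

const (
	AcceptsRemoteConfig Capabilities = 1 << iota
	AcceptsPackages
)

// PackagesStateProvider is a hypothetical stand-in for the real provider interface.
type PackagesStateProvider interface {
	State() string
}

// client is a hypothetical, heavily simplified client.
type client struct {
	caps     Capabilities
	provider PackagesStateProvider
}

var errNoProvider = errors.New("AcceptsPackages is set but no PackagesStateProvider was provided")

// checkInvariants models the kind of validation described for PrepareStart:
// AcceptsPackages requires a non-nil provider.
func checkInvariants(caps Capabilities, p PackagesStateProvider) error {
	if caps&AcceptsPackages != 0 && p == nil {
		return errNoProvider
	}
	return nil
}

// prepareStart validates the invariant once, at start.
func (c *client) prepareStart() error { return checkInvariants(c.caps, c.provider) }

// setCapabilitiesUnchecked models a blanket setter that skips validation.
func (c *client) setCapabilitiesUnchecked(caps Capabilities) { c.caps = caps }

// setCapabilitiesChecked re-runs the validation and rejects invalid changes.
func (c *client) setCapabilitiesChecked(caps Capabilities) error {
	if err := checkInvariants(caps, c.provider); err != nil {
		return err
	}
	c.caps = caps
	return nil
}

func main() {
	c := &client{caps: AcceptsRemoteConfig} // no provider configured
	fmt.Println("start error:", c.prepareStart())
	fmt.Println("checked set error:", c.setCapabilitiesChecked(AcceptsRemoteConfig|AcceptsPackages))

	c.setCapabilitiesUnchecked(AcceptsRemoteConfig | AcceptsPackages)
	fmt.Println("invariant broken:", checkInvariants(c.caps, c.provider) != nil)
}
```

In this sketch the unchecked setter leaves the client claiming `AcceptsPackages` with a nil provider, which is the state later code would trip over.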

At least from an implementation perspective, this is not a change we can make.

I am also not sure that changing capabilities on the fly works conceptually for every single capability.

Before we make this blanket change, I would like to see an analysis that explains why it is OK to change each particular capability on the fly (I am not sure that is true). Then we will need to make sure the implementation is ready for that (it currently is not). We will also need to update the spec to explain that this is an allowed mode of operation.

As an alternative, is there a particular capability that you need to change after Start()? We can look at supporting just that.

@jaronoff97
Contributor Author

Yes, I did notice that. I think we could do instantiation on demand, though, based on the capabilities being set. Initially, I just need to be able to enable AcceptsRemoteConfig, and I'd be okay with limiting SetCapabilities to only allow flipping that on and off for now, adding more allowed changes in the future.

@tigrannajaryan
Member

I think AcceptsRemoteConfig is doable. I don't see any special invariants for this capability in the codebase.

We can do this:

  1. Modify the OpAMP spec to explicitly call out that the agent may send a different set of capabilities in subsequent AgentToServer messages after the initial one. Explain that only AcceptsRemoteConfig can change.
  2. Restrict SetCapabilities to only allow certain capabilities to be changed after Start. For now it will only allow AcceptsRemoteConfig. We will need to vet all new capabilities one by one if we want to allow changing them.
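
The restriction in step 2 could be sketched as an allow-list mask over the capability bits. This is only an illustration under stated assumptions: `Capabilities`, the constants, `mutableAfterStart`, and `validateChange` are hypothetical names, and the bit values are not the real protobuf values.

```go
package main

import "fmt"

// Capabilities is a hypothetical bitmask standing in for the agent capabilities.
type Capabilities uint64

// Hypothetical capability bits; the values are illustrative only.
const (
	ReportsStatus Capabilities = 1 << iota
	AcceptsRemoteConfig
	AcceptsPackages
)

// mutableAfterStart lists the bits that may be flipped after Start.
// Per the proposal, only AcceptsRemoteConfig is allowed for now.
const mutableAfterStart = AcceptsRemoteConfig

// validateChange rejects any request that flips a bit outside the allow-list.
func validateChange(current, requested Capabilities) error {
	changed := current ^ requested // bits that differ between the two sets
	if disallowed := changed &^ mutableAfterStart; disallowed != 0 {
		return fmt.Errorf("capability bits %#b cannot be changed after Start", uint64(disallowed))
	}
	return nil
}

func main() {
	current := ReportsStatus

	// Turning AcceptsRemoteConfig on is inside the allow-list.
	fmt.Println(validateChange(current, current|AcceptsRemoteConfig))

	// Turning AcceptsPackages on is outside the allow-list.
	fmt.Println(validateChange(current, current|AcceptsPackages))
}
```

Vetting a new capability would then amount to OR-ing its bit into `mutableAfterStart` once its invariants have been reviewed.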

Development

Successfully merging this pull request may close these issues.

Client should be able to change capabilities while running
3 participants