Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition causes metric updates to be lost #132

Open
pwinckles opened this issue Nov 16, 2024 · 1 comment · Fixed by #133 · May be fixed by #137
Open

Race condition causes metric updates to be lost #132

pwinckles opened this issue Nov 16, 2024 · 1 comment · Fixed by #133 · May be fixed by #137

Comments

@pwinckles
Copy link
Contributor

pwinckles commented Nov 16, 2024

If a metric is updated while QuiescentRegistryListener is restarting PcpMmvWriter due to a monitorable being added, then it's possible that the metric update will be lost. This can happen because the metric update calls PcpMmvWriter.updateMetric(), which immediately returns if started is false. When QuiescentRegistryListener restarts PcpMmvWriter, it first stops it, setting started to false, and then starts it again. PcpMmvWriter.start() iterates over all of the metric data, writing it again, prior to setting started to true. So, the race occurs if a metric is updated between the time the old value is rewritten in PcpMmvWriter.start() and the time started is set to true.

pwinckles added a commit to pwinckles/parfait that referenced this issue Nov 17, 2024
Fixes a race condition where metric updates can be lost if the happen
while `PcpMmvWriter` is being reset. Previously, updates were dropped
if the writer was not in a started state. If this happened while the
the writer was executing `start()`, then it's possible the updated
value would be lost if the old metric value had already been written.

The issue is addressed here by blocking the metric update when the
writer is in a `STARTING` state, until the writer has been fully
started. If the writer is in a `STARTED` state, the update is applied
immediately. If the writer is in a `STOPPED` state, the update is
dropped immediately.

Resolves performancecopilot#132
@pwinckles pwinckles reopened this Feb 21, 2025
@pwinckles
Copy link
Contributor Author

Unfortunately, my prior PR only partially addressed the issue. The race condition still exists if a monitorable is updated before the writer is started again but after the monitorable has been re-added to the writer. I hope to get a PR up to fix this case tomorrow.

@pwinckles pwinckles linked a pull request Feb 21, 2025 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant