Auto restart all instance types if there is no activity #3474
Comments
Do we have any updates on this feature? The issue of the monitoring instance, which causes message consumption to frequently stop, keeps occurring.
@YurivanRuler We don't have roadmaps but thanks for engaging and letting us know this is important to you.
How about 10 months later? :-)
Hi @ramonsmits, do you have any updates on this issue? During peak loads we notice that the audit services stop consuming messages. Presently, we've implemented the workaround by restarting the audit services hourly, but we are seeking a more stable solution. Additionally, during peak moments, the scheduled service restarts could potentially slow down ingestion. Any insights or recommendations would be greatly appreciated.
Hi @YurivanRuler, I'm afraid we still can't provide any timelines on this. Thanks for bringing this back to our attention. Once we start working on the issue, we'll keep you up to date on this issue.
Auto-restart must be able to deal with any orphaned child processes due to:
Describe the feature.
Is your feature related to a problem? Please describe.
Sporadically, users report that ingestion has halted. The logs provide no insight and show no errors, but ingestion resumes after the user restarts the instance.
Users detect such issues because their queues reach their quotas, which has other side effects.
Describe the requested feature
The feature could keep track of how much time has passed since the last ingested message. If there has been no activity for, say, 5 minutes, the instance should trigger a termination sequence so that the host restarts it.
It could be that there are actually no messages in the queue and the restart was not required, but then at least the instance is running as a fresh process.
Alternatively, the restart could happen after a fixed duration, although that can already be handled in the environment via a scheduled task (for example, restarting the service every day at 02:00 AM).
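As a rough illustration of the idle check described above, here is a minimal Python sketch. It is not ServiceControl code: `InactivityWatchdog`, `record_message`, and `check_and_exit` are hypothetical names, and it assumes the host (a service manager or container runtime) is configured to restart the process when it exits with a non-zero code.

```python
import time


class InactivityWatchdog:
    """Tracks the time since the last ingested message (hypothetical helper)."""

    def __init__(self, idle_limit_seconds=300.0, clock=time.monotonic):
        self.idle_limit_seconds = idle_limit_seconds
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_activity = clock()  # start counting from process start

    def record_message(self):
        """Call from the ingestion path each time a message is processed."""
        self._last_activity = self._clock()

    def idle_seconds(self):
        return self._clock() - self._last_activity

    def should_terminate(self):
        """True once no message has been ingested within the idle limit."""
        return self.idle_seconds() >= self.idle_limit_seconds


def check_and_exit(watchdog, exit_fn):
    """Run periodically; exits non-zero so the host can restart the process."""
    if watchdog.should_terminate():
        exit_fn(1)
        return True
    return False
```

In a real process, `check_and_exit(watchdog, sys.exit)` could be called from a timer. A monotonic clock is used so that system clock adjustments do not trigger a false restart.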
Optionally expose the "last message received timestamp" in a JSON result on a /health API.
Queue monitoring
This logic could be enhanced by querying the age of the oldest message in the queue (or, alternatively, the length of the queue). If the queue is empty, no activity is expected. Otherwise, inactivity indicates that the message pump is no longer working; the instance is in an unrecoverable state and should terminate.
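The queue-aware decision can be sketched as below, together with the kind of JSON document a health result might return. This is only an illustration under assumptions: the function names and the field names `lastMessageReceivedAt` and `queueLength` are made up here, and how the queue length is obtained depends on the transport.

```python
import json
from datetime import datetime, timezone


def pump_looks_stuck(idle_seconds, queue_length, idle_limit_seconds=300.0):
    """Idle with an empty queue is expected; idle with a backlog is not."""
    if queue_length == 0:
        return False
    return idle_seconds >= idle_limit_seconds


def health_payload(last_message_at, queue_length):
    """Build a JSON body; the field names are invented for this sketch."""
    return json.dumps({
        "lastMessageReceivedAt": last_message_at.astimezone(timezone.utc).isoformat(),
        "queueLength": queue_length,
    })
```

With this check, an instance that has been idle for ten minutes is left alone when the queue is empty, and only flagged for termination when messages are waiting.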
Workaround