@remusb remusb commented Oct 17, 2018

Note: Please remember to review the Datadog Contribution Guidelines
if you have not yet done so.

What does this PR do?

Adds a check to probe.sh that first verifies that supervisorctl status exits with 0. If status cannot run at all, the probe fails before it tries to parse the output.

Motivation

We have a scenario where the collector fails, and I expected the health check to fail and the task to be recycled.

Upon checking, I found that supervisorctl itself hits an exception, and the egrep check does not handle this case.

Traceback (most recent call last):
  File "/opt/datadog-agent/bin/supervisorctl", line 6, in <module>
    from pkg_resources import load_entry_point
  File "build/bdist.linux-x86_64/egg/pkg_resources/__init__.py", line 36, in <module>
  File "/opt/datadog-agent/embedded/lib/python2.7/email/parser.py", line 12, in <module>
    from email.feedparser import FeedParser
  File "/opt/datadog-agent/embedded/lib/python2.7/email/feedparser.py", line 158, in <module>
    class FeedParser:
  File "/opt/datadog-agent/embedded/lib/python2.7/email/feedparser.py", line 161, in FeedParser
    def __init__(self, _factory=message.Message):
AttributeError: 'module' object has no attribute 'Message'
root@ip-10-71-29-36:/# echo $?
0

Exit code: 0
A scheduler that relies on the exit code of the probe will not catch this failure.
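One plausible way a crash like this ends up as exit code 0 is that the exit status of a shell pipeline is the status of its last command, so a probe that only inspects the output of the status command never sees that the command itself failed. The sketch below illustrates that general behaviour and is not the actual contents of probe.sh: naive_probe, crashing_status, and the state pattern are all hypothetical stand-ins.

```shell
#!/bin/sh
# Hypothetical probe: judges health only by grepping the status output.
# The state pattern is an assumption for illustration.
naive_probe() {
    # The pipeline's exit status is grep's, not that of the status command.
    if "$@" | grep -Eq 'STOPPED|FATAL|BACKOFF'; then
        return 1    # an unhealthy state was listed
    fi
    return 0        # nothing unhealthy was found, including when there is no output
}

# Stand-in for a status command that crashes before printing any status.
crashing_status() {
    echo 'Traceback (most recent call last):' >&2
    return 1
}

naive_probe crashing_status
echo "naive probe exit code: $?"
```

The traceback goes to stderr, grep sees empty input and finds no unhealthy state, and the probe reports success.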

After adding the check for the supervisorctl exit code:

root@ip-10-71-29-36:/# /probe.sh
Traceback (most recent call last):
  File "/opt/datadog-agent/bin/supervisorctl", line 6, in <module>
    from pkg_resources import load_entry_point
  File "build/bdist.linux-x86_64/egg/pkg_resources/__init__.py", line 36, in <module>
  File "/opt/datadog-agent/embedded/lib/python2.7/email/parser.py", line 12, in <module>
    from email.feedparser import FeedParser
  File "/opt/datadog-agent/embedded/lib/python2.7/email/feedparser.py", line 158, in <module>
    class FeedParser:
  File "/opt/datadog-agent/embedded/lib/python2.7/email/feedparser.py", line 161, in FeedParser
    def __init__(self, _factory=message.Message):
AttributeError: 'module' object has no attribute 'Message'
root@ip-10-71-29-36:/# echo $?
1

Exit code: 1

Running supervisorctl status first and requiring it to exit with 0 solves this. Any exception, or any run that cannot even list the status, should mark the container as failed.
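The two-step check described above could be sketched roughly as follows. This is a minimal illustration rather than the actual diff: the function name check_status and the assumption that RUNNING and EXITED are the only healthy states are hypothetical, and the status command is passed in as arguments so the sketch does not depend on a particular install path.

```shell
#!/bin/sh
# Hypothetical sketch of the probe logic; not the real probe.sh.
# Usage: check_status <status command> [args...]
check_status() {
    # Step 1: run the status command once and capture its stdout.
    # If it cannot run or exits non-zero, fail before parsing anything.
    status_output=$("$@") || return 1

    # Step 2: fail if any listed process is in a state other than
    # RUNNING or EXITED (assumed healthy states, for illustration only).
    if printf '%s\n' "$status_output" | grep -Ev 'RUNNING|EXITED' | grep -q .; then
        return 1
    fi
    return 0
}
```

A probe script built on this would end with something like `check_status /opt/datadog-agent/bin/supervisorctl status`, so the function's return value becomes the exit code the scheduler sees. Capturing the output in step 1 also means the status command runs only once.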

Testing Guidelines

N/A. Happy to add a test, with some guidance, if the probe is covered by a test scenario anywhere.

Additional Notes

This may have implications for issue #314.
In our environment, even with the extra check, the probe completes in under 1s. Naturally, this will depend on how many resources are allocated to the container.
