gitops-engine directly calls kubectl command code to create/apply/replace/delete K8s resources on the cluster. This ensures that the logic used by gitops-engine consumers (such as Argo CD) interacts with those K8s resources in a way that is compatible with kubectl.
However, at present, gitops-engine does not specify a timeout value for 'kubectl create/apply/replace' commands.
This means that in rare cases (such as cluster/network issues), the kubectl operation will remain running forever, waiting for an I/O operation that may never complete.
Normally this would just be a small memory leak (i.e. not necessarily the end of the world). However, in order to call the kubectl command code, gitops-engine writes manifest files to '/dev/shm', which are then passed to kubectl via the '-f' file option.
This means that those long-running I/O operations also leak K8s manifest files to '/dev/shm': the manifest files must remain in '/dev/shm' while the I/O operation is in progress. '/dev/shm' appears to be limited to 64 MB, which can fill quickly.
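For illustration only, the pattern is roughly as follows. This is a minimal sketch: the real code in resource_ops.go invokes the kubectl command code in-process rather than shelling out to the binary, and the file naming and flags differ, but the '/dev/shm' file lifecycle is the same.

```go
package example

import (
	"fmt"
	"os"
	"os/exec"
)

// applyManifest illustrates the file-based flow described above.
func applyManifest(manifestYAML []byte) error {
	// The manifest is written to tmpfs-backed /dev/shm and handed to
	// kubectl via '-f'. It must stay there until kubectl finishes.
	f, err := os.CreateTemp("/dev/shm", "manifest-*.yaml")
	if err != nil {
		return err
	}
	defer os.Remove(f.Name()) // never runs if the command below blocks forever
	if _, err := f.Write(manifestYAML); err != nil {
		return err
	}
	f.Close()

	// No timeout is set: a cluster/network hang leaves this call blocked
	// indefinitely, and the manifest file above is never cleaned up.
	cmd := exec.Command("kubectl", "apply", "-f", f.Name())
	out, err := cmd.CombinedOutput()
	if err != nil {
		return fmt.Errorf("kubectl apply failed: %v: %s", err, out)
	}
	return nil
}
```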
When examining the contents of '/dev/shm' from users who have reported this issue, we see a large number of miscellaneous manifests that are hours or days old (dating back to the last Pod restart).
The proposed solution (PR attached) is to add a long default timeout to calls to kubectl's apply command.
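As a rough sketch of the idea (not the actual PR): bound the operation with a long default timeout so that a hung kubectl call eventually returns and the '/dev/shm' manifest file can be removed. The timeout value and helper below are illustrative, and the sketch wraps the kubectl binary rather than the in-process command code.

```go
package example

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

// defaultKubectlTimeout is an illustrative value, not the one in the PR.
const defaultKubectlTimeout = 30 * time.Minute

// applyWithTimeout sketches the proposed approach: a long default timeout
// bounds the apply so the caller always returns and can clean up.
func applyWithTimeout(manifestPath string) error {
	ctx, cancel := context.WithTimeout(context.Background(), defaultKubectlTimeout)
	defer cancel()

	// exec.CommandContext kills the process when the context expires.
	cmd := exec.CommandContext(ctx, "kubectl", "apply", "-f", manifestPath)
	out, err := cmd.CombinedOutput()
	if ctx.Err() == context.DeadlineExceeded {
		return fmt.Errorf("kubectl apply timed out after %s", defaultKubectlTimeout)
	}
	if err != nil {
		return fmt.Errorf("kubectl apply failed: %v: %s", err, out)
	}
	return nil
}
```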
jgwest changed the title from 'Apply/ReplaceResource' to "'Apply/ReplaceResource' in resource_ops.go may leak files to '/dev/shm' if the kubectl 'apply/replace' command never times out" on May 4, 2024.
jgwest changed the title to "'Apply/ReplaceResource' in resource_ops.go may leak files to '/dev/shm' since the kubectl 'apply/replace' commands never time out" on May 4, 2024.
jgwest added a commit to jgwest/gitops-engine that referenced this issue on May 4, 2024.
Related: #568