Releases: SAP/component-operator-runtime
v0.3.84
v0.3.83
This release is about revisiting/improving the timeout handling of components.
Improving the logic of the processing/timeout flow
Every component has a processing timeout. Components can specify the timeout value by implementing the component.TimeoutConfiguration interface. Otherwise (or if a zero timeout is specified), the timeout defaults to the effective requeue interval, which in turn defaults to 10 minutes.
Note that a component can be in a 'processing' or 'non-processing' state (which is not directly related to status.state being Processing). Here, 'processing' means that status.processingSince is non-initial. When a component is reconciled, a component digest is calculated from the component's annotations, spec and references in the spec (see below for more details about references). Whenever this component digest differs from the current status.processingDigest, status.processingSince is set to the current time, and status.processingDigest is set to the new component digest.
Roughly speaking, that means a new timeout countdown is started.
In addition to 'processing', a component can be in a 'timeout' state; this is the case if the status.processingSince timestamp lies more than the specified timeout duration in the past. If a component gets into the 'timeout' state
- in non-error situations, then the component status (that is, status.state) will be set to Error with condition reason Timeout
- in error situations, then the component status will be set according to the error (that is, Error or Pending), and the condition reason is set to Timeout.
That means a timeout can always be reliably detected by checking if the condition reason equals Timeout.
A 'processing' component will be set to 'non-processing' (that is, status.processingSince is cleared) if the component becomes ready (in that case, in addition, one immediate requeue is triggered).
Calculation of the component digest
At the beginning of the reconciliation of a component, a (component) digest is calculated that considers
- the metadata.annotations of the component
- the metadata.generation, respectively the spec, of the component
- the loaded content of all spec fields having one of the following types: ConfigMapReference, ConfigMapKeyReference, SecretReference, SecretKeyReference, Reference.
Such references will be automatically loaded at the beginning of the reconcile iteration; for the built-in ConfigMap and Secret reference types the logic is part of the framework, and for types implementing the
type Reference[T Component] interface {
Load(ctx context.Context, clnt client.Client, component T) error
Digest() string
}
interface, the loading and digest logic is to be provided by the implementation. Besides being used in the timeout handling as status.processingDigest, the component digest
- is used when calculating event annotations
- is passed to generators in their context
- is used when calculating the object digest of dependent objects with an effective reconcile policy of OnObjectOrComponentChange.
Roughly speaking, the component digest should identify the result of reconciling the component as exactly as possible; that means applying two components with identical digests should produce the same cluster state.
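To illustrate the idea (not the framework's actual algorithm, which is not spelled out here), a digest over the listed inputs could be built by hashing a canonical serialization. The digestInput type, its fields, and the choice of SHA-256 over JSON are assumptions made for this sketch.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// digestInput collects the inputs named above. ReferenceDigests stands for
// the digests of the loaded references (for example, the values returned by
// Digest() of custom reference types).
type digestInput struct {
	Annotations      map[string]string `json:"annotations"`
	Generation       int64             `json:"generation"`
	ReferenceDigests []string          `json:"referenceDigests"`
}

// componentDigest hashes the JSON form of the input. encoding/json writes
// map keys in sorted order, so equal inputs yield equal digests.
func componentDigest(in digestInput) (string, error) {
	raw, err := json.Marshal(in)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	base := digestInput{
		Annotations:      map[string]string{"example.com/a": "1", "example.com/b": "2"},
		Generation:       3,
		ReferenceDigests: []string{"secret-digest"},
	}
	d1, _ := componentDigest(base)
	d2, _ := componentDigest(base)

	changed := base
	changed.Generation = 4
	d3, _ := componentDigest(changed)

	fmt.Println(d1 == d2) // identical input, identical digest
	fmt.Println(d1 == d3) // changed generation, different digest
}
```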
Incompatible changes
Besides the changes outlined above (which should not have a negative impact), this release contains the following incompatible changes:
- so far, if a retriable error occurred, then status.state was set to Pending with reason Pending, respectively to DeletionPending with reason DeletionPending; the reason values are changed to Retrying and DeletionRetrying, respectively
- a new reason Restarting was added, which will be used with status.state being Pending, if the processing state of a component is reset due to a component digest change.
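Consumers that evaluate the condition reason may want to accept both the old and the new values for a while. The following is a hedged sketch of such a check; isRetrying and isRestarting are hypothetical helpers on the consumer side, not part of the framework.

```go
package main

import "fmt"

// isRetrying reports whether a component is waiting after a retriable error.
// It accepts the reason values used before this release (Pending,
// DeletionPending) as well as the new ones (Retrying, DeletionRetrying).
func isRetrying(state, reason string) bool {
	switch state {
	case "Pending":
		return reason == "Retrying" || reason == "Pending"
	case "DeletionPending":
		return reason == "DeletionRetrying" || reason == "DeletionPending"
	}
	return false
}

// isRestarting reports whether the processing state was reset because the
// component digest changed (new reason Restarting with state Pending).
func isRestarting(state, reason string) bool {
	return state == "Pending" && reason == "Restarting"
}

func main() {
	fmt.Println(isRetrying("Pending", "Retrying"))                 // new reason
	fmt.Println(isRetrying("DeletionPending", "DeletionPending")) // old reason
	fmt.Println(isRetrying("Pending", "Restarting"))              // not a retry
	fmt.Println(isRestarting("Pending", "Restarting"))
}
```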
v0.3.82: fix(deps): update non-minor dependencies (#257)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
v0.3.81: fix(deps): update node.js to v23.10.0 (#253)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
v0.3.80: chore(deps): update dependency autoprefixer to v10.4.21 (#249)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
v0.3.79
Enhancements
So far, the framework emitted a large number of component events, mostly while the component is in Processing state. That often exceeded the burst of the event broadcaster provided by controller-runtime (b=25, r=1/300, see https://github.com/kubernetes/client-go/blob/b46275ad754db4dd7695a48cd3ca673e0154dd9e/tools/record/events_cache.go#L43).
This changes now. If identical subsequent events are produced for a component, only the first one will be emitted within 5 minutes; after 5 minutes, one more instance of the throttled event may be sent, and so on.
v0.3.78
Notable changes
- It is now written in stone: hooks must not change the component's metadata or spec; this was always the intention, but now it is explicitly forbidden.
- The component digest (which is, for example, passed to generators and influences status.ProcessingDigest) now considers the metadata.generation of the component.
v0.3.77
Enhancements
- New methods are added to cluster.Client:
type Client interface {
    // ...
    // Return a rest config for this client.
    Config() *rest.Config
    // Return a http client for this client.
    HttpClient() *http.Client
}
- In addition, there is a new reconciler option
type ReconcilerOptions struct {
    // ...
    // NewClient allows to modify or replace the default client used by the reconciler.
    // The returned client is used by the reconciler to manage the component instances, and passed to hooks.
    // Its scheme therefore must recognize the component type.
    NewClient NewClientFunc
}
with
type NewClientFunc func(clnt cluster.Client) (cluster.Client, error)
This allows replacing or modifying the default component/hook client that would be used by the reconciler
- to manage component instances
- when calling hooks.
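The shape of such an option can be sketched with local stand-in types. Everything below is illustrative: Client is a drastically reduced placeholder for cluster.Client, its Describe method is hypothetical, and effectiveClient only models how a reconciler might apply the option.

```go
package main

import (
	"errors"
	"fmt"
)

// Client is a reduced stand-in for cluster.Client. Describe is a hypothetical
// method that only exists to make the wrapping observable in this sketch.
type Client interface {
	Describe() string
}

// NewClientFunc mirrors the signature shown above, using the stand-in type.
type NewClientFunc func(clnt Client) (Client, error)

// ReconcilerOptions contains only the option discussed here.
type ReconcilerOptions struct {
	NewClient NewClientFunc
}

type defaultClient struct{}

func (defaultClient) Describe() string { return "default" }

// wrappedClient decorates another client.
type wrappedClient struct {
	Client
}

func (w wrappedClient) Describe() string { return "wrapped(" + w.Client.Describe() + ")" }

// effectiveClient models how the option could be applied: without NewClient
// the default client is used, otherwise the returned client replaces it.
func effectiveClient(opts ReconcilerOptions, def Client) (Client, error) {
	if opts.NewClient == nil {
		return def, nil
	}
	clnt, err := opts.NewClient(def)
	if err != nil {
		return nil, err
	}
	if clnt == nil {
		return nil, errors.New("NewClient returned a nil client")
	}
	return clnt, nil
}

func main() {
	opts := ReconcilerOptions{
		NewClient: func(clnt Client) (Client, error) {
			return wrappedClient{Client: clnt}, nil
		},
	}
	clnt, err := effectiveClient(opts, defaultClient{})
	if err != nil {
		panic(err)
	}
	fmt.Println(clnt.Describe())
}
```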
v0.3.76: chore(deps): update dependency postcss to v8.5.3 (#238)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
v0.3.75
Enhancements
Additional managed types
By its nature, component-operator-runtime tries to handle extension types (such as CRDs or API groups added through APIService federation), and instances of these types, in a smart way.
That is, if the component contains extension types, and also instances of these types, it tries to process things in the right order; that means, during apply the instances will be applied as late as possible (to ensure that controllers and webhooks are up); and during delete, the instances will be deleted as early as possible (to ensure that controllers and webhooks are still there). Furthermore, during deletion, foreign instances (that is, instances of these types that are not part of the component) block the deletion of the whole component.
Sometimes, components implicitly add extension types to the cluster, in the sense that the extension types are not explicitly part of the manifests, but are added behind the scenes by controllers once they are running. Typical examples are Crossplane providers.
This PR tries to add some relief in this situation. Components can now list 'additional managed types' by implementing the TypeConfiguration interface; these 'additional managed types' will be treated in the same way as extension types which are explicitly mentioned in the manifest.
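A component declaring additional managed types might look roughly like the following. Note the assumptions: the interface and the TypeInfo struct are redeclared locally, the method name GetAdditionalManagedTypes and the Group/Kind fields are guesses at the shape of the real API, and the group example.crossplane.io is made up.

```go
package main

import "fmt"

// TypeInfo identifies a type by API group and kind (assumed shape).
type TypeInfo struct {
	Group string
	Kind  string
}

// TypeConfiguration is a local stand-in for the framework interface;
// the method name is an assumption made for this sketch.
type TypeConfiguration interface {
	GetAdditionalManagedTypes() []TypeInfo
}

// ProviderSpec is a hypothetical component spec for a component whose
// controllers register further types once they are running.
type ProviderSpec struct{}

// GetAdditionalManagedTypes lists types that are not part of the manifests
// but should be treated like explicitly deployed extension types.
func (s *ProviderSpec) GetAdditionalManagedTypes() []TypeInfo {
	return []TypeInfo{
		{Group: "example.crossplane.io", Kind: "Bucket"},
	}
}

// isManagedType reports whether group/kind is listed by the component.
// Only exact matches are considered in this sketch.
func isManagedType(component any, group, kind string) bool {
	tc, ok := component.(TypeConfiguration)
	if !ok {
		return false
	}
	for _, t := range tc.GetAdditionalManagedTypes() {
		if t.Group == group && t.Kind == kind {
			return true
		}
	}
	return false
}

func main() {
	spec := &ProviderSpec{}
	fmt.Println(isManagedType(spec, "example.crossplane.io", "Bucket"))
	fmt.Println(isManagedType(spec, "apps", "Deployment"))
}
```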
Improved APIService handling
Up to now, APIService objects were deployed along with the other regular (that was: unmanaged) objects of the current apply wave. As a consequence, if the federated API server was not yet ready, stale group version errors were returned by the discovery API of the main API server. To overcome this problem, APIService objects now receive special handling, in the sense that they are reconciled (in the apply wave) after all other regular objects, and before all managed instances. That means: within each apply order, objects are deployed to readiness in three sub-stages
- regular objects (all 'normal' objects)
- late objects (currently, this is only APIService objects)
- instances of managed types (that is, instances of types which are added in this component as CRD or through an APIService)
Within each of these sub-stages, the static ordering defined in sortObjectsForApply() is effective.
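The three sub-stages can be pictured as a stable partition of the objects of one apply order. The sketch below is not the framework's code: the object type, the stage names, and the managed-type lookup are simplified assumptions, and the incoming slice order stands in for the static ordering of sortObjectsForApply().

```go
package main

import (
	"fmt"
	"sort"
)

// object is a simplified stand-in for a dependent object.
type object struct {
	Group string
	Kind  string
	Name  string
}

type stage int

const (
	stageRegular stage = iota
	stageLate
	stageManagedInstance
)

// stageOf assigns an object to one of the three sub-stages; managed is the
// set of "group/kind" keys of types added by this component.
func stageOf(o object, managed map[string]bool) stage {
	switch {
	case o.Group == "apiregistration.k8s.io" && o.Kind == "APIService":
		return stageLate
	case managed[o.Group+"/"+o.Kind]:
		return stageManagedInstance
	default:
		return stageRegular
	}
}

// sortBySubStage orders objects by sub-stage; the stable sort keeps the
// existing relative order within each sub-stage.
func sortBySubStage(objs []object, managed map[string]bool) {
	sort.SliceStable(objs, func(i, j int) bool {
		return stageOf(objs[i], managed) < stageOf(objs[j], managed)
	})
}

func main() {
	managed := map[string]bool{"example.io/Widget": true}
	objs := []object{
		{Group: "example.io", Kind: "Widget", Name: "w1"},
		{Group: "apiregistration.k8s.io", Kind: "APIService", Name: "v1.example.io"},
		{Group: "apps", Kind: "Deployment", Name: "server"},
		{Group: "", Kind: "Service", Name: "server"},
	}
	sortBySubStage(objs, managed)
	for _, o := range objs {
		fmt.Println(o.Kind, o.Name)
	}
}
```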
More robust handling of external recreations happening during deletion
Previously there was a rare race condition while deleting objects (either during component delete or component apply).
The old logic was:
1. Delete objects that are to be deleted (if they are in phase ScheduledForDeletion during apply or if the whole component is being deleted); if successful (that is, the API server responds with 2xx), then the inventory status of the dependent object is set to Deleting.
2. Wait until the object is gone.
Now, if the object was recreated by someone right between 1. and 2., then the reconciler got stuck.
Note that this rarely happens in practice (also because the critical period is very, very short).
To overcome this, we now check the deletion timestamp of the dependent object (if still or again existing). If it has none, then we check the owner; if it is not us, then we give the object up (because apparently, someone else has just recreated it).