fix(custom-resources): waiter state machine retry fails with ExecutionAlreadyExists #35988
Exemption Request
This fix is in runtime code (Lambda function execution) and does not change CloudFormation templates or infrastructure. The existing integration tests verify infrastructure creation, which is unaffected by this change. Unit tests provide comprehensive coverage of the runtime behavior change.
2a0d935 to 6d329d8
Alternatively, we could forward the request ID from the Lambda. That should never repeat.
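A minimal sketch of what this alternative might look like, assuming the handler has access to the Lambda context's `awsRequestId`. The `LambdaContextLike` and `WaiterExecutionRequest` shapes and the `buildWaiterRequest` helper are hypothetical and not taken from `framework.ts`; whether the request ID is actually unique across every retry path is the commenter's assumption, not something this sketch verifies.

```typescript
// Hypothetical, trimmed-down shapes for illustration only.
interface LambdaContextLike {
  awsRequestId: string; // request ID of the current Lambda invocation
}

interface WaiterExecutionRequest {
  stateMachineArn: string;
  name?: string; // optional execution name
  input: string;
}

// Use the Lambda invocation's request ID as the execution name instead of
// CloudFormation's RequestId, which is reused when a deployment is retried.
function buildWaiterRequest(
  stateMachineArn: string,
  resourceEvent: object,
  context: LambdaContextLike,
): WaiterExecutionRequest {
  return {
    stateMachineArn,
    name: context.awsRequestId,
    input: JSON.stringify(resourceEvent),
  };
}
```

Compared with omitting `name` entirely, this keeps the execution traceable to a specific Lambda invocation, at the cost of relying on that ID never repeating.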
I have confirmed that this PR fixes the issue.
✅ Updated pull request passes all PRLinter validations. Dismissing previous PRLinter review.
…ustom-resources-waiter-retry-execution-name
Pull request has been modified.
Integration test failures are expected due to the changed asset. They are not caused by the new integ-runner engine. You'll need to work with your PR reviewer to update all snapshots. For framework changes like this, I'd typically recommend that a CDK team member do this for you.
Description
Fixes an issue where retrying a CloudFormation deployment that uses a custom resource with an async waiter fails with an `ExecutionAlreadyExists` error.
Root Cause
The custom resource provider framework uses CloudFormation's `RequestId` as the Step Functions execution name when starting the waiter state machine. When CloudFormation retries a failed deployment, it reuses the same `RequestId`. Since Step Functions execution names must be unique for 90 days, subsequent retry attempts fail with `ExecutionAlreadyExists`.
Solution
Removed the `name` parameter from the `startExecution` call, allowing Step Functions to auto-generate unique execution names. This is the recommended approach per the AWS Step Functions StartExecution API Reference, where the `name` parameter is optional and Step Functions will automatically generate a universally unique identifier (UUID) as the execution name if not provided.
Changes
- Removed `name: resourceEvent.RequestId` from the waiter state machine execution call in `framework.ts`
- […] `name` field
- […] `name` is not included in the `startExecution` call
Testing
- […] `waiter state machine execution does not include name field (allows retries)` to verify the fix
- […] `name` being undefined
Related Issue
Fixes #35957
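To make the root cause and the fix concrete, here is a small self-contained sketch. It is not the code from `framework.ts`: the `ResourceEvent` and `StartExecutionInput` shapes, both builder functions, and the `FakeStateMachine` class are hypothetical stand-ins, and the fake only imitates the name-uniqueness rule described above (it does not model the 90-day window or real UUID generation).

```typescript
// Hypothetical, simplified shapes; the real framework and SDK types differ.
interface ResourceEvent {
  RequestId: string;
  [key: string]: unknown;
}

interface StartExecutionInput {
  stateMachineArn: string;
  name?: string; // optional: the service generates a name when omitted
  input: string;
}

// Before the fix: CloudFormation's RequestId doubles as the execution name.
function buildInputBefore(arn: string, event: ResourceEvent): StartExecutionInput {
  return { stateMachineArn: arn, name: event.RequestId, input: JSON.stringify(event) };
}

// After the fix: no name is sent, so every call gets a generated one.
function buildInputAfter(arn: string, event: ResourceEvent): StartExecutionInput {
  return { stateMachineArn: arn, input: JSON.stringify(event) };
}

// In-memory stand-in for Step Functions that rejects duplicate execution names.
class FakeStateMachine {
  private readonly names = new Set<string>();
  private counter = 0;

  startExecution(req: StartExecutionInput): string {
    // A simple counter replaces the UUID the real service would generate.
    const name = req.name ?? `auto-${++this.counter}`;
    if (this.names.has(name)) {
      throw new Error('ExecutionAlreadyExists');
    }
    this.names.add(name);
    return name;
  }
}
```

With the "before" builder, a second `startExecution` call for the same `RequestId` throws `ExecutionAlreadyExists` in this fake, mirroring the retry failure; with the "after" builder, repeated calls succeed because each execution receives a distinct generated name.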
Verification
The fix was verified by:
- […] `name` field is not included
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license