Add taskName and input to retry context #1423

Open: wants to merge 4 commits into base: master
92 changes: 92 additions & 0 deletions examples/src/main/java/io/dapr/examples/workflows/README.md
@@ -53,6 +53,8 @@ Those examples contain the following workflow patterns:
4. [External Event Pattern](#external-event-pattern)
5. [Child-workflow Pattern](#child-workflow-pattern)
6. [Compensation Pattern](#compensation-pattern)
7. [Suspend/Resume Pattern](#suspendresume-pattern)
8. [RetryHandler](#retryhandler)

### Chaining Pattern
In the chaining pattern, a sequence of activities executes in a specific order.
@@ -707,4 +709,94 @@ The client log:
```text
Started a new external-event model workflow with instance ID: 23410d96-1afe-4698-9fcd-c01c1e0db255
workflow instance with ID: 23410d96-1afe-4698-9fcd-c01c1e0db255 completed.
```

### RetryHandler

Dapr can automatically retry a failed activity or child workflow using either a `WorkflowTaskRetryHandler` or a
`WorkflowTaskRetryPolicy`. An example of `WorkflowTaskRetryPolicy` in use can be found in the child workflow example.

A `WorkflowTaskRetryHandler` gives you complete control over whether a failed activity or child workflow retries or fails.
You do this by implementing the `handle` method of this interface.

The example RetryHandler below always returns `true`, which allows unlimited retries. If a task of type `FailureActivity`
fails, the handler reads the input passed to the activity, an `Instant` in this case, and uses it to calculate a backoff time.
```java
public class DemoRetryHandler implements WorkflowTaskRetryHandler {

  @Override
  public boolean handle(WorkflowTaskRetryContext retryContext) {
    WorkflowContext workflowContext = retryContext.getWorkflowContext();
    Logger logger = retryContext.getWorkflowContext().getLogger();
    Object input = retryContext.getInput();
    String taskName = retryContext.getTaskName();

    if (taskName.equalsIgnoreCase(FailureActivity.class.getName())) {
      logger.info("FailureActivity Input: {}", input);
      Instant timestampInput = (Instant) input;
      // Add a second to make sure the retry happens after the time to success has passed
      Instant timeToSuccess = timestampInput.plusSeconds(FailureActivity.TIME_TO_SUCCESS + 1);
      long timeToWait = timestampInput.until(timeToSuccess, TimeUnit.SECONDS.toChronoUnit());

      logger.info("Waiting {} seconds before retrying.", timeToWait);
      workflowContext.createTimer(Duration.ofSeconds(timeToWait)).await();
      logger.info("Send request to FailureActivity");
    }

    return true;
  }
}
```
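
The backoff arithmetic in the handler can be checked in isolation. The following is a minimal sketch using only `java.time`; the `BackoffSketch` class and its `timeToWait` method are hypothetical names introduced for illustration, and the constant simply mirrors `FailureActivity.TIME_TO_SUCCESS` (10 seconds) from this example. Because the wait is measured from the activity's input timestamp rather than from the current time, it always comes out to 11 seconds, which matches the `Waiting 11 seconds before retrying.` line in the worker logs.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of the Dapr SDK: isolates the wait-time arithmetic.
final class BackoffSketch {
  // Mirrors FailureActivity.TIME_TO_SUCCESS from the example (seconds).
  static final long TIME_TO_SUCCESS = 10;

  // Returns the span between the activity input and one second past the success threshold.
  static Duration timeToWait(Instant activityInput) {
    Instant timeToSuccess = activityInput.plusSeconds(TIME_TO_SUCCESS + 1);
    return Duration.between(activityInput, timeToSuccess);
  }

  public static void main(String[] args) {
    Instant input = Instant.parse("2025-06-16T18:13:42.820Z");
    System.out.println(timeToWait(input).getSeconds()); // prints 11
  }
}
```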

Start the workflow and client using the following commands:
<!-- STEP
name: Run RetryHandler workflow
match_order: none
output_match_mode: substring
expected_stdout_lines:
- "Starting RetryWorkflow: io.dapr.examples.workflows.retryhandler.DemoRetryWorkflow"
- "RetryWorkflow is calling Activity: io.dapr.examples.workflows.retryhandler.FailureActivity"
- "Starting Activity: io.dapr.examples.workflows.retryhandler.FailureActivity"
- "Input timestamp:"
- "Throwing exception for Activity: io.dapr.examples.workflows.retryhandler.FailureActivity"
- "FailureActivity Input:"
- "Waiting 11 seconds before retrying."
- "Completing Activity: io.dapr.examples.workflows.retryhandler.FailureActivity"
background: true
sleep: 60
timeout_seconds: 60
-->

```sh
dapr run --app-id demoworkflowworker --resources-path ./components/workflows --dapr-grpc-port 50001 -- java -jar target/dapr-java-sdk-examples-exec.jar io.dapr.examples.workflows.retryhandler.DemoRetryHandlerWorker
```

```sh
java -jar target/dapr-java-sdk-examples-exec.jar io.dapr.examples.workflows.retryhandler.DemoRetryHandlerClient
```

<!-- END_STEP -->

The worker logs:
```text
== APP == 2025-06-16 14:13:42,821 {HH:mm:ss.SSS} [pool-2-thread-1] INFO io.dapr.workflows.WorkflowContext - Starting RetryWorkflow: io.dapr.examples.workflows.retryhandler.DemoRetryWorkflow
== APP == 2025-06-16 14:13:42,821 {HH:mm:ss.SSS} [pool-2-thread-1] INFO io.dapr.workflows.WorkflowContext - RetryWorkflow is calling Activity: io.dapr.examples.workflows.retryhandler.FailureActivity
== APP == 2025-06-16 14:13:42,851 {HH:mm:ss.SSS} [pool-2-thread-1] INFO i.d.e.w.retryhandler.FailureActivity - Starting Activity: io.dapr.examples.workflows.retryhandler.FailureActivity
== APP == 2025-06-16 14:13:42,861 {HH:mm:ss.SSS} [pool-2-thread-1] INFO i.d.e.w.retryhandler.FailureActivity - Input timestamp: 2025-06-16T18:13:42.820Z
== APP == 2025-06-16 14:13:42,861 {HH:mm:ss.SSS} [pool-2-thread-1] INFO i.d.e.w.retryhandler.FailureActivity - Throwing exception for Activity: io.dapr.examples.workflows.retryhandler.FailureActivity
== APP == 2025-06-16 14:13:42,901 {HH:mm:ss.SSS} [pool-2-thread-1] INFO io.dapr.workflows.WorkflowContext - FailureActivity Input: 2025-06-16T18:13:42.820Z
== APP == 2025-06-16 14:13:42,901 {HH:mm:ss.SSS} [pool-2-thread-1] INFO io.dapr.workflows.WorkflowContext - Waiting 11 seconds before retrying.
== APP == Jun 16, 2025 2:13:52 PM io.dapr.durabletask.TaskOrchestrationExecutor$ContextImplTask$RetriableTask shouldRetry
== APP == INFO: shouldRetryBasedOnHandler: true
== APP == 2025-06-16 14:13:53,052 {HH:mm:ss.SSS} [pool-2-thread-1] INFO i.d.e.w.retryhandler.FailureActivity - Starting Activity: io.dapr.examples.workflows.retryhandler.FailureActivity
== APP == 2025-06-16 14:13:53,052 {HH:mm:ss.SSS} [pool-2-thread-1] INFO i.d.e.w.retryhandler.FailureActivity - Input timestamp: 2025-06-16T18:13:42.820Z
== APP == 2025-06-16 14:13:53,052 {HH:mm:ss.SSS} [pool-2-thread-1] INFO i.d.e.w.retryhandler.FailureActivity - Completing Activity: io.dapr.examples.workflows.retryhandler.FailureActivity
== APP == Jun 16, 2025 2:13:53 PM io.dapr.durabletask.TaskOrchestrationExecutor$ContextImplTask$RetriableTask shouldRetry
== APP == INFO: shouldRetryBasedOnHandler: true
```

The client log:
```text
Started a new external-event workflow with instance ID: 9f3c70b6-329d-4715-95ed-6ec9bc55ca39
workflow instance with ID: 9f3c70b6-329d-4715-95ed-6ec9bc55ca39 completed with result: 2025-06-16T18:06:24.068590500Z
```
@@ -0,0 +1,48 @@
/*
* Copyright 2025 The Dapr Authors
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
* http://www.apache.org/licenses/LICENSE-2.0
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
limitations under the License.
*/

package io.dapr.examples.workflows.retryhandler;

import io.dapr.workflows.WorkflowContext;
import io.dapr.workflows.WorkflowTaskRetryContext;
import io.dapr.workflows.WorkflowTaskRetryHandler;
import org.slf4j.Logger;

import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.TimeUnit;

public class DemoRetryHandler implements WorkflowTaskRetryHandler {

  @Override
  public boolean handle(WorkflowTaskRetryContext retryContext) {
    WorkflowContext workflowContext = retryContext.getWorkflowContext();
    Logger logger = retryContext.getWorkflowContext().getLogger();
    Object input = retryContext.getInput();
    String taskName = retryContext.getTaskName();

    if(taskName.equalsIgnoreCase(FailureActivity.class.getName())) {
@siri-varma (Contributor) commented on Jun 16, 2025:

Interesting, so we can have multiple if blocks

if (taskName.equals..(ChildWorkf1.class)) {
// do something
}

if (taskName.equals..(ChildWorkf2.class)) {
// do something
}

if (taskName.equals..(ChildActivity1.class)) {
// do something
}

Wouldn't it be more maintainable to colocate each workflow or activity with its corresponding retry logic? I understand the concern about code duplication, but centralizing all retry handling in a single utility class does not seem optimal.

@cicoyle , @artur-ciocanu , @salaboy I would like to know your thoughts too.

@TheForbiddenAi (Contributor Author) commented on Jun 16, 2025:

In the case of multiple activities, I would recommend breaking them up into separate retry handlers that call each other (see the chain of responsibility design pattern). The main benefit of chaining handlers is that you can reuse logic from prior handlers to reduce code duplication. For example:

class HandlerOne implements WorkflowTaskRetryHandler {

  private WorkflowTaskRetryHandler nextHandler;

  public boolean handle(WorkflowTaskRetryContext context) {
    // handler logic here
    if (nextHandler != null) {
      return nextHandler.handle(context);
    }
    return true;
  }

  void setNextHandler(WorkflowTaskRetryHandler handler) {
    this.nextHandler = handler;
  }
}

In a workflow, for instance, it would look like this (HandlerTwo follows the same layout as HandlerOne above):

public class TestWorkflow implements Workflow {

  @Override
  public WorkflowStub create() {
    return context -> {
      Logger logger = context.getLogger();
      logger.info("Starting RetryWorkflow: {}", context.getName());

      HandlerOne handlerOne = new HandlerOne();
      HandlerTwo handlerTwo = new HandlerTwo();

      handlerOne.setNextHandler(handlerTwo);
      WorkflowTaskOptions taskOptions = new WorkflowTaskOptions(handlerOne);

      logger.info("RetryWorkflow is calling Activity: {}", FailureActivity.class.getName());
      Instant currentTime = context.getCurrentInstant();
      Instant result = context.callActivity(FailureActivity.class.getName(), currentTime, taskOptions, Instant.class).await();

      logger.info("RetryWorkflow finished with: {}",  result);
      context.complete(result);
    };
  }

}

Ideally, you would define an interface or abstract class that holds any common methods and logic (e.g. setNextHandler) when using this pattern, but for the sake of simplicity, I left it out of the example above.
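
A minimal sketch of such a shared base class follows. To keep it compilable on its own, it uses simplified stand-ins (`RetryContext`, `RetryHandler`) rather than the SDK's `WorkflowTaskRetryContext` and `WorkflowTaskRetryHandler`; those stand-ins, the `attempt` field, and the two concrete handlers are all hypothetical names chosen for illustration. It also makes one design choice that differs slightly from HandlerOne above: any link in the chain can veto the retry by returning `false`.

```java
// Simplified stand-ins for the SDK retry types (hypothetical; the real types carry more data).
final class RetryContext {
  final String taskName;
  final int attempt;

  RetryContext(String taskName, int attempt) {
    this.taskName = taskName;
    this.attempt = attempt;
  }
}

interface RetryHandler {
  boolean handle(RetryContext context);
}

// Chaining logic lives in one place, so concrete handlers only decide their own step.
abstract class ChainedRetryHandler implements RetryHandler {
  private RetryHandler next;

  ChainedRetryHandler setNext(RetryHandler next) {
    this.next = next;
    return this;
  }

  @Override
  public final boolean handle(RetryContext context) {
    if (!shouldRetry(context)) {
      return false; // this link vetoes the retry
    }
    // End of chain means retry; otherwise defer to the next link.
    return next == null || next.handle(context);
  }

  protected abstract boolean shouldRetry(RetryContext context);
}

// Stops retrying once the attempt counter reaches the limit.
final class AttemptLimitHandler extends ChainedRetryHandler {
  private final int maxAttempts;

  AttemptLimitHandler(int maxAttempts) {
    this.maxAttempts = maxAttempts;
  }

  @Override
  protected boolean shouldRetry(RetryContext context) {
    return context.attempt < maxAttempts;
  }
}

// Only retries tasks with a specific name.
final class TaskNameHandler extends ChainedRetryHandler {
  private final String taskName;

  TaskNameHandler(String taskName) {
    this.taskName = taskName;
  }

  @Override
  protected boolean shouldRetry(RetryContext context) {
    return taskName.equals(context.taskName);
  }
}

final class ChainDemo {
  static RetryHandler build() {
    return new AttemptLimitHandler(3).setNext(new TaskNameHandler("FailureActivity"));
  }

  public static void main(String[] args) {
    RetryHandler chain = build();
    System.out.println(chain.handle(new RetryContext("FailureActivity", 1))); // true
    System.out.println(chain.handle(new RetryContext("FailureActivity", 3))); // false
    System.out.println(chain.handle(new RetryContext("OtherActivity", 1)));   // false
  }
}
```

With the real SDK types, the head of the chain would presumably be passed to `WorkflowTaskOptions` the same way `handlerOne` is in the workflow above.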

@TheForbiddenAi (Contributor Author) commented on Jun 16, 2025:

> Additionally, I'm curious about scenarios where retries are driven solely by the names of workflow/activities. Could you share an example where that pattern would be more effective?

Perhaps you want variable backoff times depending on what each activity is doing. Additionally, if you know the name of the activity, you know the type of its input. Having the name is also useful for tracking which task is being retried in the logs, especially if you are running multiple activities concurrently.

Another possible situation is: let's say you have Task A. This task makes a request to a service under the assumption that this service has access to other data. If this service doesn't have this data, then Task A fails. To recover from this failure, you may want to spin off a new activity, we'll call this Task B, to do some action that creates that data.

In this hypothetical situation, you would only want Task B to be called if Task A fails to prevent duplicating data. Once Task B completes, we could then return to our retry handler to retry Task A automatically.

@siri-varma (Contributor) commented on Jun 17, 2025:

@TheForbiddenAi I was revisiting this comment earlier this morning and had a couple of thoughts:

Regarding logs: if you include a logger statement inside each activity, it should help you identify which activity is currently running, right?

As for the second scenario, one approach could be to structure the workflow so that it starts with Activity A. If Activity A fails due to missing data, the workflow can then trigger Activity B. Once B completes successfully, you can start Activity A within the same workflow.

Also, the way I see it, retries are for transient failures, but in your case Activity A fails with a permanent failure.

Let me know your thoughts.

@TheForbiddenAi (Contributor Author) commented:

For logs: if you have multiple workflows running, or multiple activities running asynchronously, it would be very difficult to differentiate the retry handler's log lines from one another.

As for the second scenario, yes, that is a valid approach. However, it would require a try/catch and then manually calling the activity again. If you had several of these scenarios for the same activity, you would have to chain try/catch blocks, which would not be great for readability or maintainability.
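
The manual-recovery approach discussed above can be sketched in plain Java. This is only an illustration under simplifying assumptions: `taskA` and `taskB` are ordinary methods standing in for activities, not Dapr activity calls, and `ManualRecoverySketch` is a hypothetical name. It shows where the try/catch lands in the workflow body, and why each additional failure scenario for Task A would add another catch-and-recall branch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a workflow body; taskA/taskB are plain methods, not Dapr activities.
final class ManualRecoverySketch {
  final List<String> calls = new ArrayList<>();
  private boolean dataExists = false;

  // Task A fails until the data it depends on exists.
  String taskA() {
    calls.add("A");
    if (!dataExists) {
      throw new IllegalStateException("missing data");
    }
    return "A done";
  }

  // Task B creates the data that Task A needs.
  void taskB() {
    calls.add("B");
    dataExists = true;
  }

  // Manual recovery: run A, and on failure run B and then call A again by hand.
  String run() {
    try {
      return taskA();
    } catch (IllegalStateException e) {
      taskB();
      return taskA();
    }
  }

  public static void main(String[] args) {
    ManualRecoverySketch sketch = new ManualRecoverySketch();
    System.out.println(sketch.run() + " " + sketch.calls); // prints "A done [A, B, A]"
  }
}
```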

      logger.info("FailureActivity Input: {}", input);
      Instant timestampInput = (Instant) input;
      // Add a second to make sure the retry happens after the time to success has passed
      Instant timeToSuccess = timestampInput.plusSeconds(FailureActivity.TIME_TO_SUCCESS + 1);
      long timeToWait = timestampInput.until(timeToSuccess, TimeUnit.SECONDS.toChronoUnit());

      logger.info("Waiting {} seconds before retrying.", timeToWait);
      workflowContext.createTimer(Duration.ofSeconds(timeToWait)).await();
      logger.info("Send request to FailureActivity");
    }

    return true;
  }
}
@@ -0,0 +1,47 @@
/*
* Copyright 2025 The Dapr Authors
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
* http://www.apache.org/licenses/LICENSE-2.0
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
limitations under the License.
*/

package io.dapr.examples.workflows.retryhandler;

import io.dapr.workflows.client.DaprWorkflowClient;
import io.dapr.workflows.client.WorkflowInstanceStatus;

import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.TimeoutException;

public class DemoRetryHandlerClient {
  /**
   * The main method to start the client.
   *
   * @param args Input arguments (unused).
   * @throws InterruptedException If program has been interrupted.
   */
  public static void main(String[] args) {
    try (DaprWorkflowClient client = new DaprWorkflowClient()) {
      String instanceId = client.scheduleNewWorkflow(DemoRetryWorkflow.class);
      System.out.printf("Started a new external-event workflow with instance ID: %s%n", instanceId);

      // Block until the orchestration completes. Then print the final status, which includes the output.
      WorkflowInstanceStatus workflowInstanceStatus = client.waitForInstanceCompletion(
          instanceId,
          Duration.ofSeconds(30),
          true);

      System.out.printf("workflow instance with ID: %s completed with result: %s%n", instanceId,
          workflowInstanceStatus.readOutputAs(Instant.class));
    } catch (TimeoutException | InterruptedException e) {
      throw new RuntimeException(e);
    }
  }
}
@@ -0,0 +1,36 @@
/*
* Copyright 2025 The Dapr Authors
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
* http://www.apache.org/licenses/LICENSE-2.0
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
limitations under the License.
*/

package io.dapr.examples.workflows.retryhandler;

import io.dapr.workflows.runtime.WorkflowRuntime;
import io.dapr.workflows.runtime.WorkflowRuntimeBuilder;

public class DemoRetryHandlerWorker {
  /**
   * The main method of this app.
   *
   * @param args The port the app will listen on.
   * @throws Exception An Exception.
   */
  public static void main(String[] args) throws Exception {
    // Register the Workflow with the builder.
    WorkflowRuntimeBuilder builder = new WorkflowRuntimeBuilder().registerWorkflow(DemoRetryWorkflow.class);
    builder.registerActivity(FailureActivity.class);

    // Build and then start the workflow runtime pulling and executing tasks
    WorkflowRuntime runtime = builder.build();
    System.out.println("Start workflow runtime");
    runtime.start();
  }
}
@@ -0,0 +1,43 @@
/*
* Copyright 2025 The Dapr Authors
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
* http://www.apache.org/licenses/LICENSE-2.0
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
limitations under the License.
*/

package io.dapr.examples.workflows.retryhandler;

import io.dapr.workflows.Workflow;
import io.dapr.workflows.WorkflowStub;
import io.dapr.workflows.WorkflowTaskOptions;
import io.dapr.workflows.WorkflowTaskRetryHandler;
import org.slf4j.Logger;

import java.time.Instant;

public class DemoRetryWorkflow implements Workflow {

  @Override
  public WorkflowStub create() {
    return context -> {
      Logger logger = context.getLogger();
      logger.info("Starting RetryWorkflow: {}", context.getName());

      WorkflowTaskRetryHandler retryHandler = new DemoRetryHandler();
      WorkflowTaskOptions taskOptions = new WorkflowTaskOptions(retryHandler);

      logger.info("RetryWorkflow is calling Activity: {}", FailureActivity.class.getName());
      Instant currentTime = context.getCurrentInstant();
      Instant result = context.callActivity(FailureActivity.class.getName(), currentTime, taskOptions, Instant.class).await();

      logger.info("RetryWorkflow finished with: {}", result);
      context.complete(result);
    };
  }
}
@@ -0,0 +1,44 @@
/*
* Copyright 2025 The Dapr Authors
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
* http://www.apache.org/licenses/LICENSE-2.0
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
limitations under the License.
*/

package io.dapr.examples.workflows.retryhandler;

import io.dapr.workflows.WorkflowActivity;
import io.dapr.workflows.WorkflowActivityContext;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.time.Instant;

public class FailureActivity implements WorkflowActivity {

  private static final Logger LOGGER = LoggerFactory.getLogger(FailureActivity.class);
  public static final long TIME_TO_SUCCESS = 10;

  @Override
  public Object run(WorkflowActivityContext ctx) {
    LOGGER.info("Starting Activity: {}", ctx.getName());

    Instant timestamp = ctx.getInput(Instant.class);

    LOGGER.info("Input timestamp: {}", timestamp);
    if (timestamp.plusSeconds(TIME_TO_SUCCESS).isBefore(Instant.now())) {
      LOGGER.info("Completing Activity: {}", ctx.getName());
      return Instant.now();
    }

    LOGGER.info("Throwing exception for Activity: {}", ctx.getName());

    throw new RuntimeException("Failure!");
  }
}