Flow-Bench

Dataset

This dataset supports our approach of using LLMs to translate natural language into an intermediate representation with Python syntax, which in turn facilitates conversion into widely adopted business process definition languages.

The approach and the methodology used to create and validate the dataset are described in the arXiv paper (https://arxiv.org/abs/2505.11646).

The dataset consists of 101 incremental build test cases targeted at supporting and evaluating approaches and research in natural language-driven business process automation.

To ensure compact and clear representations of prior context and expected workflows, FLOW-BENCH adopts a constrained subset of Python syntax. This subset includes assignment statements, conditional statements (if-statements), loops (for and while), and function calls.
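As a rough illustration of what such a constrained subset means in practice, the sketch below uses Python's standard `ast` module to flag constructs outside an allow-list. The function name `disallowed_constructs` and the exact allow-list are assumptions made for this example; they are not part of the dataset or its tooling.

```python
import ast

# Node types assumed to cover the subset described above: assignments,
# if-statements, for/while loops, and function calls, plus the expression
# nodes needed to write them. This allow-list is an illustrative guess.
ALLOWED_NODES = (
    ast.Module, ast.Assign, ast.If, ast.For, ast.While, ast.Expr,
    ast.Call, ast.Name, ast.Attribute, ast.Constant, ast.Compare,
    ast.BoolOp, ast.UnaryOp, ast.keyword,
    ast.expr_context, ast.cmpop, ast.boolop, ast.unaryop,
)


def disallowed_constructs(source: str) -> list[str]:
    """Return the names of AST node types that fall outside the allow-list."""
    tree = ast.parse(source)
    return sorted({
        type(node).__name__
        for node in ast.walk(tree)
        if not isinstance(node, ALLOWED_NODES)
    })


flow = (
    "repositories = GitHub_Repository__3_0_0__retrievewithwhere_Repository()\n"
    "for repo in repositories:\n"
    "    new_issue = GitHub_Issue__3_0_0__retrievewithwhere_Issue()\n"
)
print(disallowed_constructs(flow))                      # nothing flagged
print(disallowed_constructs("def f():\n    return 1"))  # function definitions are flagged
```

A check like this only inspects syntax; it does not verify that the called functions exist in the API catalog.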

The conditional_ootb.yaml file contains the 101 tests. An example test is shown below:

  - _metadata:
      tags:
        - "97"
        - conditional_update
        - conditional_update_replace
      uid: 97
    input:
      utterance: |-
        Instead of retrieving all the issues just create a new issue in each repo
      prior_sequence:
        - |-
          repositories = GitHub_Repository__3_0_0__retrievewithwhere_Repository()
          for repo in repositories:
            new_issue = GitHub_Issue__3_0_0__retrievewithwhere_Issue()
      prior_context: []
      bpmn:
        $ref: "context/uid_97_context.bpmn"
    expected_output:
      sequence:
        - |-
          repositories = GitHub_Repository__3_0_0__retrievewithwhere_Repository()
          for repo in repositories:
            updated_issue = GitHub_Issue__3_0_0__create_Issue()
      bpmn:
        $ref: "output/uid_97_output.bpmn"

Each test contains metadata with tags indicating whether the test is conditional or linear, and whether it is an update, a delete, or (implicitly) a creation. The prior_sequence field holds the Pythonic representation of the previously created BPMN, and the bpmn field under input points to the corresponding BPMN representation in the context folder. The expected_output field holds the ground-truth Pythonic representation as well as a reference to its BPMN representation, which can be found in the output folder.
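To make the relationship between prior_sequence and the expected sequence concrete, the following sketch lists the API calls in each and reports which ones changed for test 97. The helper `called_apis` is hypothetical and written only for this illustration; it is not the evaluation method used in the paper.

```python
import ast


def called_apis(sequence: str) -> list[str]:
    """Return the names of functions called in a sequence, in source order."""
    calls = [
        node for node in ast.walk(ast.parse(sequence))
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    ]
    calls.sort(key=lambda call: (call.lineno, call.col_offset))
    return [call.func.id for call in calls]


# Sequences copied from the example test with uid 97.
prior = (
    "repositories = GitHub_Repository__3_0_0__retrievewithwhere_Repository()\n"
    "for repo in repositories:\n"
    "  new_issue = GitHub_Issue__3_0_0__retrievewithwhere_Issue()\n"
)
expected = (
    "repositories = GitHub_Repository__3_0_0__retrievewithwhere_Repository()\n"
    "for repo in repositories:\n"
    "  updated_issue = GitHub_Issue__3_0_0__create_Issue()\n"
)

removed = [api for api in called_apis(prior) if api not in called_apis(expected)]
added = [api for api in called_apis(expected) if api not in called_apis(prior)]
print("removed:", removed)
print("added:", added)
```

For this test the difference is a single call inside the loop, which matches the conditional_update_replace tag: the retrieve call is replaced by a create call.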

The ootb_catalog.json file lists the unique identifier (id) and the description of each API. An example entry is shown below:

{
    "id": "bambooHR_benefits__2_0_0__retrievewithwhere_benefits",
    "description": "Retrieve all the benefit deduction types"
}
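A catalog of such entries can be turned into a lookup table from id to description. The sketch below assumes the file is a JSON array of objects shaped like the entry above; that layout, and the helper name `build_catalog_index`, are assumptions for illustration. The catalog text is inlined here so the example runs without the file.

```python
import json

# Assumed layout: a JSON array of {"id": ..., "description": ...} objects.
catalog_text = """
[
    {
        "id": "bambooHR_benefits__2_0_0__retrievewithwhere_benefits",
        "description": "Retrieve all the benefit deduction types"
    }
]
"""


def build_catalog_index(text: str) -> dict[str, str]:
    """Map each API id to its description."""
    return {entry["id"]: entry["description"] for entry in json.loads(text)}


index = build_catalog_index(catalog_text)
print(index["bambooHR_benefits__2_0_0__retrievewithwhere_benefits"])
```

With the real file, the same function could be applied to the text read from ootb_catalog.json, provided the layout assumption holds.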

Approach

For details on the approach to generating flows and the evaluation results on the test suite, refer to Sections 3 and 4 of the arXiv paper, respectively.

Videos

Here are some videos showcasing our approach for multiple use cases.

forloop.mp4

ifconditions.mp4

incremental_build.mp4

How to Cite

@misc{duesterwald2025flowbench,
      title={FLOW-BENCH: Towards Conversational Generation of Enterprise Workflows}, 
      author={Evelyn Duesterwald and Siyu Huo and Vatche Isahagian and K. R. Jayaram and Ritesh Kumar and Vinod Muthusamy and Punleuk Oum and Debashish Saha and Gegi Thomas and Praveen Venkateswaran},
      year={2025},
      url={https://arxiv.org/abs/2505.11646}, 
}

Contributors

In alphabetical order

Evelyn Duesterwald
Siyu Huo
Vatche Isahagian
K. R. Jayaram
Ritesh Kumar
Vinod Muthusamy
Punleuk Oum
Debashish Saha
Gegi Thomas
Praveen Venkateswaran
