-
Notifications
You must be signed in to change notification settings - Fork 118
[DRAFT] Workflow graph extraction and o11y updates #397
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[DRAFT] Workflow graph extraction and o11y updates #397
Conversation
🦋 Changeset detectedLatest commit: de123cd The changes in this PR will be included in the next version bump. This PR includes changesets to release 10 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
@karthikscale3 is attempting to deploy a commit to the Vercel Labs Team on Vercel. A member of the Team first needs to authorize it. |
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
| const graphNode = | ||
| nodesWithStepId.length === 1 | ||
| ? nodesWithStepId[0] | ||
| : nodesWithStepId[occurrenceIndex]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| : nodesWithStepId[occurrenceIndex]; | |
| : nodesWithStepId[occurrenceIndex % nodesWithStepId.length]; |
When there are multiple graph nodes for the same step and more runtime invocations than graph nodes, the code silently drops the extra step executions instead of mapping them to a node.
View Details
Analysis
Missing bounds check in processStepGroup causes silent data loss with multiple loop invocations
What fails: mapRunToExecution() silently drops step execution data when a workflow step with multiple graph nodes is invoked more times than there are nodes defined for that step. The processStepGroup() function returns undefined without mapping execution data to any node, resulting in missing entries in nodeExecutions map.
How to reproduce:
- Create a workflow with a step that appears in multiple graph nodes (e.g., multiple conditional branches or loop unrollings)
- Execute the workflow where that step is invoked more times than the number of graph nodes (e.g., 3 invocations but only 2 graph nodes)
- Call
mapRunToExecution(run, steps, events, graph)with the execution data - The executions beyond the number of available nodes will not appear in the returned
WorkflowRunExecution.nodeExecutionsmap
Example scenario: 2 graph nodes for step//my-function but 3 runtime invocations
- First invocation maps to
node-1✓ - Second invocation maps to
node-2✓ - Third invocation:
nodesWithStepId[2]returnsundefined, function returns early at line 221-223, execution data dropped ✗
Root cause: Line 205 accesses nodesWithStepId[occurrenceIndex] directly without bounds checking. When occurrenceIndex >= nodesWithStepId.length, array access returns undefined, causing the if (!graphNode) check at line 221 to return early, skipping all execution data mapping for that invocation.
Fix: Use modulo operator to cycle through available nodes: nodesWithStepId[occurrenceIndex % nodesWithStepId.length]. This ensures all invocations are always mapped to a valid node, with excess invocations cycling back through the available node definitions.
| filename: string, | ||
| source: string, | ||
| mode: 'workflow' | 'step' | 'client' | false, | ||
| mode: 'workflow' | 'step' | 'client' | 'graph' | false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this should be a separate swc transform instead of another "mode" - or are we reusing enough code that you think it's simpler to just keep everything in one plugin? I'm worried this plugin is getting too bloated but not a v strong opinion. @ijjk got any thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm slightly inclined towards breaking it into its own swc transform since this graph generation isn't core for the runtime
also graph generation could run in parallel to the workflow/step generation in the esbuild plugin - or could even be disabled by configuration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, good points. I was originally thinking about providing a configuration option. Also, breaking it up into its own swc transform could help keeping it separate from the core. I do not have any strong opinions on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This becomes irrelevant now that the extraction is done on the bundle and not through SWC transform anymore. Kinda simplifies everything.
| workflows: {}, | ||
| }; | ||
|
|
||
| for (const workflowFile of workflowFiles) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of doing it here, I imagine it's better to actually do it post bundling?
I'm curious how this plugin handles
- a workflow calling a function defined in another non-step function fomr another file, which in turn calls multiples steps, and
- workflows and steps defined inside npm packages, for example,
DurableAgentfrom the@workflow/aipackage
I always imagined that swc could be used to emit some meadata comments with segments of the graph, but the final manifest generation would happen in an esbuild plugin that happens after the swc transform
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was considering doing it post bundling initially. But, we would lose the capability to provide realtime updates to the graph on the dashboard during development. Also, it get a bit tricky to identify workflow and step boundaries from a bundle, as the directives are already transformed and removed at that stage. Having said that, I am still going to do a spike task to explore this path for a bit.
Re: these two scenarios:
- a workflow calling a function defined in another non-step function from another file, which in turn calls multiples steps, and
So, not at the moment. The graph traversal is shallow right now. And I am aware of the downsides to it and this is one of them. I am incrementally improving the traversal algorithm to go deeper to cover all possible scenarios - which is also the reason why I am going back to doing this post bundling cos then we know exactly what workflows and steps are pulled in post transformation. I will have a better answer to this question in a day or two.
- workflows and steps defined inside npm packages, for example, DurableAgent from the @workflow/ai package
Great point! I had not considered this. So I did a small experiment and realized that the DurableAgent exported from the package also transforms with in the project where it's imported and consumed. This is great, cos this means the graph generation does discover the workflow defined in it.
But, it still has the same limitation when it comes to discovering the steps - because of the shallow traversal we have. So in essence, it discovers and shows the workflows of the imported package, but, it won't show the steps yet. Let me think through this further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, some updates:
I actually went with the post bundling approach and completely scrapped the graph mode transformation. You were right and great call on this!
It started getting rather too complex to extract the structure for the scenarios you suggested. On the bright side the post bundling approach extracts everything accurately since we operate just on a single file.
This makes the changes quite simple and most of the PR is now focused on UI updates and enhancements to the web project(o11y dashboard).
| }; | ||
| use swc_workflow::{StepTransform, TransformMode}; | ||
|
|
||
| #[testing::fixture("tests/graph/**/input.js")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should just run this on the existing fixtures, including the error fixtures, instead of new graph fixtures. feel free to add new fixtures to the master set too to try the map-callback, loop-step-call, etc.
.changeset/graph-step-traversal.md
Outdated
| @@ -0,0 +1,6 @@ | |||
| --- | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't forget changeset for builders, next, and all the other packages when you're ready to actually ssubmit the incremental PRs for review)
| // Try to find by exact workflowName match first | ||
| const workflow = Object.values(graphManifest.workflows).find((w) => | ||
| run.workflowName.includes(w.workflowName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // Try to find by exact workflowName match first | |
| const workflow = Object.values(graphManifest.workflows).find((w) => | |
| run.workflowName.includes(w.workflowName) | |
| // Parse the run's workflow name to extract the function name | |
| const parsed = parseWorkflowName(run.workflowName); | |
| if (!parsed) return null; | |
| // Find the workflow with exact matching on the function name | |
| const workflow = Object.values(graphManifest.workflows).find( | |
| (w) => w.workflowName === parsed.shortName |
The workflow matching logic uses .includes() instead of exact name matching, which can incorrectly match workflows with overlapping names. For example, a workflow named "workflow" would match runs for "data-processing-workflow".
View Details
Analysis
Incorrect workflow matching in run-detail-view uses substring instead of exact matching
What fails: RunDetailView incorrectly matches workflows due to substring matching using .includes() instead of exact function name comparison.
How to reproduce:
- Define two workflows with overlapping names:
workflowandmyWorkflow - Trigger a run for
myWorkflow(workflowName:workflow//src/main.ts//myWorkflow) - The component attempts to find the matching workflow graph
What happens: The .includes() method matches workflow first because the run's workflowName string contains "workflow" as a substring, even though the actual function name is myWorkflow. This causes the wrong workflow graph to be displayed or a "Workflow Graph Not Found" error.
Expected: The component should parse the run's workflowName (format: workflow//path/to/file.ts//functionName) and perform exact matching on the function name against w.workflowName in the graph manifest.
Root cause: In packages/web/src/components/run-detail-view.tsx lines 86-88, the code used:
run.workflowName.includes(w.workflowName)This substring match is incorrect because:
run.workflowNamecontains the full qualified name:workflow//src/main.ts//myWorkfloww.workflowNameis just the function name:myWorkflow- A workflow named "workflow" matches any run containing "workflow" substring, including runs for "myWorkflow" and "data-processing-workflow"
Fix: Parse the run's workflow name using the already-imported parseWorkflowName() function and perform exact matching on the extracted function name.
- Make header sticky at top of page with proper positioning - Remove border line above tabs in run detail view - Enable scrollable layout for larger workflow graphs - Increase workflow sheet width from 50vw to 75vw for better visibility
…/improve-dashboard
- Add new TypeScript-based graph-extractor.ts for post-bundle CFG extraction - Update builders to use post-bundle extraction instead of SWC graph mode - Supports accurate CFG with loops, parallels, and conditionals - Works with imported step functions from external packages - No changes to Rust SWC plugin (graph mode still exists but unused)
8832152 to
33237b2
Compare
af742c6 to
f5cbe81
Compare
I am working on a few updates to the o11y dashboard.
* Wrote a new SWC transform mode called 'graph mode' that uses the AST visitor pattern to extract the workflow structure as a graph, outputting it to a JSON manifest file (.swc/graph-manifest.json) with a React Flow-compatible structure, enabling anyone to build visualization UIs on top.I am still testing this and I will break it up into a series of PRs to make reviews easier. Looking for initial thoughts/feedback.