fix: terminate worker threads after block evaluation in policy-service#6075

Open
vshvets-bc wants to merge 1 commit into hashgraph:develop from Climission:fix/policy-memory-leaks

@vshvets-bc
Collaborator

Three policy blocks spawn worker_threads Workers via `new Worker(...)`
but never call worker.terminate(). Node Worker threads do not auto-exit
when the script body returns — they keep the V8 isolate alive (~30 MB
each) waiting for more messages on the parentPort. Every formula /
custom-logic / data-transformation invocation leaked one worker thread.
A heavy policy could accumulate multi-GB of leaked workers over the
course of an MRV submission.

Adds a small cleanup() helper that calls worker.terminate() (with a
swallowed .catch() for already-exited workers, e.g. via OOM-kill).
Called on the terminating message, on error, and on every reject path.

- math-block.ts: terminate on the single 'done' message + on error
- custom-logic-block.ts (JS branch): terminate after the final 'done'
  (data.final === true) so debug messages are still forwarded for
  multi-emit custom-logic runs. Python branch already has timeout-based
  termination + an 'exit' handler and is left untouched.
- data-transformation-addon.ts: terminate after final result + on error

No public API change. Behavior unchanged for callers; resources are now
released after each block evaluation.
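
The pattern described above can be sketched in isolation as follows. This is a minimal illustration, not the code from the PR: the inline worker source, the `evaluateOnce` name, and the `{ type: 'done', result }` message shape are assumptions made for the example. The worker deliberately keeps a `parentPort` listener registered, which is what keeps its event loop (and therefore the thread) alive after the script body has run.

```typescript
import { Worker } from 'node:worker_threads';

// Hypothetical inline worker: it answers once, but its 'message' listener
// stays registered, so the thread would not exit without terminate().
const workerSource = `
const { parentPort, workerData } = require('worker_threads');
parentPort.on('message', () => {});
parentPort.postMessage({ type: 'done', result: workerData.a + workerData.b });
`;

function evaluateOnce(a: number, b: number): Promise<number> {
    return new Promise<number>((resolve, reject) => {
        const worker = new Worker(workerSource, { eval: true, workerData: { a, b } });
        // Release the worker thread; ignore failures from terminate() itself.
        const cleanup = () => { worker.terminate().catch(() => {}); };
        worker.on('message', (msg: { type: string; result: number }) => {
            if (msg.type === 'done') {
                cleanup();
                resolve(msg.result);
            }
        });
        worker.on('error', (err) => {
            cleanup();
            reject(err);
        });
    });
}
```

With this sketch, `evaluateOnce(2, 3)` resolves to 5 and the Node process can then exit, because the worker no longer holds the event loop open.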

vshvets-bc requested review from a team as code owners May 14, 2026 11:00
vshvets-bc changed the title from "Fix policy-service: terminate worker threads after block evaluation" to "Fix: terminate worker threads after block evaluation in policy-service." May 14, 2026
Pyatakov changed the title from "Fix: terminate worker threads after block evaluation in policy-service." to "fix: terminate worker threads after block evaluation in policy-service" May 17, 2026
Pyatakov self-assigned this May 17, 2026

// Terminate the worker after the final result / on error so the V8
// isolate is released. Without this the worker thread stays alive
// after the script body returns, leaking ~30 MB per transformation.
const cleanup = () => { worker.terminate().catch(() => { /* noop */ }); };

Contributor

There is no 'exit' handler. One should be added for the case where a worker exits without emitting either 'done' or 'error' (e.g., process.exit(0) inside the worker, an OOM-kill, or any unhandled internal throw that terminates the isolate).

Suggested change
worker.on('exit', (code) => {
    cleanup();
    if (code !== 0 && code !== null) {
        reject(new Error(`Data transformation worker exited with code ${code}`));
    }
});
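
To make the concern in this comment concrete, here is a hedged sketch of an 'exit' handler combined with a `settled` guard. It is a variation for illustration, not the reviewer's suggestion verbatim: the `runWorker` name, the `settled` flag, and the message shape are assumptions, and unlike the suggested change this sketch also rejects when the worker exits with code 0 before producing a result. The guard is there because Node reports exit code 1 for a worker stopped via `terminate()`, so the 'exit' event also fires after a normal cleanup.

```typescript
import { Worker } from 'node:worker_threads';

function runWorker(source: string): Promise<unknown> {
    return new Promise<unknown>((resolve, reject) => {
        const worker = new Worker(source, { eval: true });
        let settled = false;
        const cleanup = () => { worker.terminate().catch(() => {}); };
        // Settle the promise at most once, releasing the worker first.
        const finish = (settle: () => void) => {
            if (settled) { return; }
            settled = true;
            cleanup();
            settle();
        };
        worker.on('message', (msg: { type: string; result?: unknown }) => {
            if (msg.type === 'done') {
                finish(() => resolve(msg.result));
            }
        });
        worker.on('error', (err) => finish(() => reject(err)));
        // Covers a worker that exits without sending 'done' or raising 'error'.
        worker.on('exit', (code) => {
            finish(() => reject(new Error(`Worker exited with code ${code} before sending a result`)));
        });
    });
}
```

In this sketch a worker whose script is just `process.exit(0);` causes the promise to reject instead of staying pending forever, while a worker that posts a 'done' message resolves as before.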

// Terminate the worker after the final result / on error so the V8
// isolate is released. Without this the worker thread stays alive
// after the script body returns, leaking ~30 MB per transformation.
const cleanup = () => { worker.terminate().catch(() => { /* noop */ }); };
Contributor

Please remove '/* noop */' here and in the two other places

Comment on lines +53 to +55
// Terminate the worker after the final result / on error so the V8
// isolate is released. Without this the worker thread stays alive
// after the script body returns, leaking ~30 MB per transformation.
Contributor

One-line comment is enough, please replace here and in the other two places with:
// Release the worker's V8 isolate; without this each invocation leaks ~30 MB.

@@ -87,15 +84,21 @@ export class MathBlock {
        return new Promise<IPolicyDocument>(async (resolve, reject) => {
            const workerFile = path.join(path.dirname(filename), '..', 'helpers', 'workers', 'math-worker.js');
            const worker = new Worker(workerFile, { workerData });
            // Terminate the worker once it finishes so the V8 isolate is released.
            // Otherwise every formula evaluation leaks a worker thread (~30 MB).
            const cleanup = () => { worker.terminate().catch(() => { /* noop */ }); };
Contributor

Suggested change
- const cleanup = () => { worker.terminate().catch(() => { /* noop */ }); };
+ const cleanup = () => { worker.terminate().catch(() => {}); };
+ worker.on('exit', (code) => {
+     cleanup();
+     if (code !== 0 && code !== null) {
+         reject(new Error(`Math worker exited with code ${code}`));
+     }
+ });

// the V8 isolate is released. Without this the worker thread
// stays alive after the script body returns, leaking ~30 MB per
// custom-logic invocation.
const cleanup = () => { worker.terminate().catch(() => { /* noop */ }); };
Contributor

Suggested change
- const cleanup = () => { worker.terminate().catch(() => { /* noop */ }); };
+ const cleanup = () => { worker.terminate().catch(() => {}); };
+ worker.on('exit', (code) => {
+     cleanup();
+     if (code !== 0 && code !== null) {
+         reject(new Error(`Custom logic worker exited with code ${code}`));
+     }
+ });
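
For the custom-logic case, where the PR description says debug messages must still be forwarded until the final 'done' (data.final === true), the flow might look roughly like the sketch below. The message shapes, the inline worker source, and the `runCustomLogic` name are assumptions for illustration and are not taken from custom-logic-block.ts.

```typescript
import { Worker } from 'node:worker_threads';

// Assumed message shapes for this sketch only.
type LogicMessage =
    | { type: 'debug'; message: string }
    | { type: 'done'; result: number; final: boolean };

// Hypothetical worker that emits twice and keeps listening afterwards.
const customLogicSource = `
const { parentPort } = require('worker_threads');
parentPort.on('message', () => {});
parentPort.postMessage({ type: 'debug', message: 'step 1' });
parentPort.postMessage({ type: 'done', result: 1, final: false });
parentPort.postMessage({ type: 'debug', message: 'step 2' });
parentPort.postMessage({ type: 'done', result: 2, final: true });
`;

function runCustomLogic(onDebug: (message: string) => void): Promise<number[]> {
    return new Promise<number[]>((resolve, reject) => {
        const worker = new Worker(customLogicSource, { eval: true });
        const results: number[] = [];
        const cleanup = () => { worker.terminate().catch(() => {}); };
        worker.on('message', (data: LogicMessage) => {
            if (data.type === 'debug') {
                onDebug(data.message); // forwarded while the worker is still running
                return;
            }
            results.push(data.result);
            // Terminate only on the final 'done', so earlier emits are not cut off.
            if (data.final) {
                cleanup();
                resolve(results);
            }
        });
        worker.on('error', (err) => {
            cleanup();
            reject(err);
        });
    });
}
```

Terminating on the first 'done' instead would drop the second debug message and the second result in this sketch, which is the behavior the PR avoids by checking the final flag.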
