Contributor

@elitegreg elitegreg commented Oct 2, 2025

Summary of Changes

  • Replace blocking threads with async tasks
  • Fixes shutdown freeze caused by blocking threads. This allows systemd to restart the process on catastrophic failure.

Closes #1215

Testing Verification

  • Existing tests pass

Contributor

@bgm-malbeclabs bgm-malbeclabs left a comment

Feels like we need to show more evidence that this fixes the issue, since the tests don't seem to have materially changed. Also, why move to non-blocking? I have a pretty good idea, but the reason for the changes isn't clear from the description.

@elitegreg
Contributor Author

Feels like we need to show more evidence that this fixes the issue, since the tests don't seem to have materially changed. Also, why move to non-blocking? I have a pretty good idea, but the reason for the changes isn't clear from the description.

The blocking tasks in the existing activator prevent it from shutting down, so we have to use SIGKILL. As a consequence, when things fail catastrophically, the process hangs on shutdown instead of exiting and being restarted by systemd.
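To illustrate the failure mode described above, here is a minimal sketch using only the standard library. It is not the code from this PR, which moves the work onto async tasks. It swaps in a plain thread with a timed receive to show the underlying idea: a worker parked in a blocking `recv()` never observes a shutdown request, while one that wakes up regularly can exit on its own. The names `run_worker` and `demo_clean_shutdown` are hypothetical.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{self, RecvTimeoutError};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical worker: drains `rx`, but wakes up every 20 ms to re-check
/// `shutdown`. A plain `rx.recv()` here would block for as long as a sender
/// is alive and would never see the flag.
fn run_worker(rx: mpsc::Receiver<u32>, shutdown: Arc<AtomicBool>) -> usize {
    let mut processed = 0;
    while !shutdown.load(Ordering::Relaxed) {
        match rx.recv_timeout(Duration::from_millis(20)) {
            Ok(_msg) => processed += 1,
            Err(RecvTimeoutError::Timeout) => continue,
            Err(RecvTimeoutError::Disconnected) => break,
        }
    }
    processed
}

/// Starts the worker, sends up to `n` messages, requests shutdown, and joins.
/// Returns how many messages the worker handled before it stopped.
fn demo_clean_shutdown(n: u32) -> usize {
    let (tx, rx) = mpsc::channel();
    let shutdown = Arc::new(AtomicBool::new(false));
    let worker = {
        let shutdown = Arc::clone(&shutdown);
        thread::spawn(move || run_worker(rx, shutdown))
    };

    for i in 0..n {
        if tx.send(i).is_err() {
            break;
        }
    }

    // Request shutdown while the sender is still alive, so the worker exits
    // because of the flag and not because the channel disconnected.
    shutdown.store(true, Ordering::Relaxed);
    let processed = worker.join().expect("worker panicked");
    drop(tx);
    processed
}
```

The worker may stop before draining every queued message, so the returned count is at most `n`. The point of the sketch is only that `join()` returns without needing SIGKILL.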

@elitegreg elitegreg requested a review from snormore October 2, 2025 12:23
@elitegreg
Contributor Author

Updated the summary of the PR and also included the GH issue

let pubkey = keyed_account.pubkey.clone();
match rpckeyedaccount_decode(keyed_account) {
    Ok(Some((pubkey, account))) => {
        tx.send((pubkey, account)).await.unwrap();
Contributor

You could do something like:

if let Err(e) = tx.send(...).await {
    print_some_error_or_whatever
    break;
}

The unwrap would crash the entire activator process otherwise
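A runnable version of that suggestion might look like the sketch below. It uses `std::sync::mpsc` rather than the async channel in the diff, so that it stays self-contained. The pattern is the same in both cases: `send` returns an `Err` once the receiver is gone, and the sender can log and stop instead of panicking. The function name `forward_all` and the `(String, u64)` item type are assumptions made for illustration.

```rust
use std::sync::mpsc;

/// Forwards items to `tx`, stopping at the first failed send instead of
/// panicking. Returns how many items were delivered.
fn forward_all(tx: &mpsc::Sender<(String, u64)>, items: Vec<(String, u64)>) -> usize {
    let mut sent = 0;
    for item in items {
        // `send` only fails when the receiving side has been dropped.
        if let Err(e) = tx.send(item) {
            eprintln!("receiver dropped, stopping forwarder: {e}");
            break;
        }
        sent += 1;
    }
    sent
}
```

With the async channel the call would carry an `.await`, but the `if let Err(e) = ... { ...; break; }` shape stays the same.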

Contributor Author

Good catch. I never intended to leave this in; I was just trying to get to a compilation milestone.

        error!("Activator thread exited unexpectedly with reason: {err:?}");
    }
    _ = websocket_task(Arc::clone(&client), tx.clone(), shutdown.clone()) => {
        info!("Websocket task finished, stopping activator...");
Contributor

Do we not have to shutdown.store(true, Ordering::Relaxed) here as well?
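The concern behind this question can be sketched with standard-library threads. This is an illustration under assumptions, not the activator's code: `poll_until_shutdown` and `supervise` are hypothetical names, and a channel message stands in for the websocket task returning. If the supervisor does not set the shared flag when one task finishes, any sibling that loops on that flag keeps running and the final join never returns.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::Duration;

/// Hypothetical long-running worker: loops until the shared flag is set.
fn poll_until_shutdown(shutdown: Arc<AtomicBool>) -> u32 {
    let mut ticks = 0;
    while !shutdown.load(Ordering::Relaxed) {
        thread::sleep(Duration::from_millis(5));
        ticks += 1;
    }
    ticks
}

/// Runs two workers. When the short-lived one finishes, the supervisor sets
/// the flag so the long-lived one stops too. Returns true if both joined.
fn supervise() -> bool {
    let shutdown = Arc::new(AtomicBool::new(false));
    let (done_tx, done_rx) = mpsc::channel::<&'static str>();

    let long_lived = {
        let shutdown = Arc::clone(&shutdown);
        thread::spawn(move || poll_until_shutdown(shutdown))
    };
    // Stands in for the websocket task returning.
    let short_lived = thread::spawn(move || {
        let _ = done_tx.send("websocket");
    });

    // Roughly analogous to the branch that fires when one task finishes.
    let finished = done_rx.recv().expect("a worker reports completion");
    eprintln!("{finished} task finished, stopping...");

    // Without this store, `long_lived.join()` below would block forever.
    shutdown.store(true, Ordering::Relaxed);

    short_lived.join().is_ok() && long_lived.join().is_ok()
}
```

Whether the store is actually needed in the PR depends on whether any remaining task still reads `shutdown` after that branch completes, which the excerpt shown here does not settle.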

Development

Successfully merging this pull request may close these issues.

Activator ignores SIGTERM/SIGINT and never exits
