WebsocketProvider hangs after inactive period #4251

Open
jibrail-idris opened this issue Jul 18, 2023 · 7 comments
Labels
investigate Under investigation and may be a bug. v6 Issues regarding v6

@jibrail-idris

Ethers Version

6.5.1

Search Terms

#1053

Describe the Problem

The websocket connection hangs after a period of inactivity. This has already been discussed in #1053.

The fix proposed there, in #1053 (comment), is:

const EXPECTED_PONG_BACK = 15000
const KEEP_ALIVE_CHECK_INTERVAL = 7500

export const startConnection = () => {
  provider = new ethers.providers.WebSocketProvider(config.ETH_NODE_WSS)

  let pingTimeout = null
  let keepAliveInterval = null

  provider._websocket.on('open', () => {
    keepAliveInterval = setInterval(() => {
      logger.debug('Checking if the connection is alive, sending a ping')

      provider._websocket.ping()

      // Use `WebSocket#terminate()`, which immediately destroys the connection,
      // instead of `WebSocket#close()`, which waits for the close timer.
      // Delay should be equal to the interval at which your server
      // sends out pings plus a conservative assumption of the latency.
      pingTimeout = setTimeout(() => {
        provider._websocket.terminate()
      }, EXPECTED_PONG_BACK)
    }, KEEP_ALIVE_CHECK_INTERVAL)

    // TODO: handle contract listeners setup + indexing
  })

  provider._websocket.on('close', () => {
    logger.error('The websocket connection was closed')
    clearInterval(keepAliveInterval)
    clearTimeout(pingTimeout)
    startConnection()
  })

  provider._websocket.on('pong', () => {
    logger.debug('Received pong, so connection is alive, clearing the timeout')
    clearTimeout(pingTimeout)
  })
}
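
For comparison, the same liveness check can be written with a single interval and a flag, which is the pattern the `ws` README describes for detecting broken connections, rather than the two timers used above. This is only a sketch, independent of ethers: `HeartbeatSocket` and `Heartbeat` are illustrative names that exist in neither ethers nor `ws`, and it assumes the socket exposes `ping()` and `terminate()` the way the `ws` package's WebSocket does.

```typescript
// Minimal surface the heartbeat needs (assumption: matches the `ws` WebSocket).
interface HeartbeatSocket {
  ping(): void;
  terminate(): void;
}

// Illustrative helper, not part of ethers or ws.
class Heartbeat {
  private readonly socket: HeartbeatSocket;
  private awaitingPong = false;
  private timer: ReturnType<typeof setInterval> | null = null;

  constructor(socket: HeartbeatSocket) {
    this.socket = socket;
  }

  // Run check() every intervalMs until stop() is called.
  start(intervalMs: number): void {
    this.stop();
    this.timer = setInterval(() => this.check(), intervalMs);
  }

  // One liveness check: if the previous ping was never answered, destroy the
  // connection; otherwise send a new ping and wait for its pong.
  check(): void {
    if (this.awaitingPong) {
      this.socket.terminate();
      return;
    }
    this.awaitingPong = true;
    this.socket.ping();
  }

  // Call from the socket's "pong" handler.
  onPong(): void {
    this.awaitingPong = false;
  }

  // Call from the socket's "close" handler.
  stop(): void {
    if (this.timer !== null) {
      clearInterval(this.timer);
      this.timer = null;
    }
    this.awaitingPong = false;
  }
}
```

With this variant a dead connection is terminated on the first check after a missed pong, so the effective timeout equals the check interval and there is no second timer to clear.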

In v6, the ping() function is missing from the websocket object. The socket exposed by the provider is typed as WebSocketLike:

export interface WebSocketLike {
    onopen: null | ((...args: Array<any>) => any);
    onmessage: null | ((...args: Array<any>) => any);
    onerror: null | ((...args: Array<any>) => any);

    readyState: number;

    send(payload: any): void;
    close(code?: number, reason?: string): void;
}
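
Because `WebSocketLike` declares no `ping()`, one way to keep keep-alive code type-safe is to check for it at runtime before calling it. The following is a minimal sketch under that assumption: `PingableSocket`, `isPingable` and `pingIfSupported` are hypothetical names, and the local `WebSocketLike` simply mirrors the interface quoted above so the block stands alone.

```typescript
// Local copy of the interface quoted above, so the sketch is self-contained.
interface WebSocketLike {
  onopen: null | ((...args: Array<any>) => any);
  onmessage: null | ((...args: Array<any>) => any);
  onerror: null | ((...args: Array<any>) => any);
  readyState: number;
  send(payload: any): void;
  close(code?: number, reason?: string): void;
}

// Hypothetical extension describing a socket that also supports ping/terminate
// (for example a Node.js `ws` socket, as opposed to a browser WebSocket).
interface PingableSocket extends WebSocketLike {
  ping(): void;
  terminate(): void;
}

// Type guard: true only when both extra methods exist at runtime.
function isPingable(socket: WebSocketLike): socket is PingableSocket {
  const candidate = socket as Partial<PingableSocket>;
  return (
    typeof candidate.ping === "function" &&
    typeof candidate.terminate === "function"
  );
}

// Sends a ping if the socket supports it; returns whether a ping was sent.
function pingIfSupported(socket: WebSocketLike): boolean {
  if (!isPingable(socket)) return false;
  socket.ping();
  return true;
}
```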

I'm not sure whether there is a supported way to keep the websocket connection alive, but the code snippet above from #1053 (comment) seems to be the only feasible approach.

Code Snippet

No response

Contract ABI

No response

Errors

No response

Environment

No response

Environment (Other)

No response

@jibrail-idris jibrail-idris added investigate Under investigation and may be a bug. v6 Issues regarding v6 labels Jul 18, 2023
@hubchub

hubchub commented Jul 20, 2023

#1053 seems like it was a fix for ethers v5. Do we have a fix for v6? I was looking at putting up a PR with a fix I've been using in a fork but did not want to duplicate work

@fallonp

fallonp commented Sep 20, 2023

Any update on this?

Websocket connections dropping is going to be a big issue for a lot of people. Surely fixing this should be a higher priority?

@iquirino

iquirino commented Oct 2, 2023

+1

@iquirino

iquirino commented Oct 3, 2023

> #1053 seems like it was a fix for ethers v5. Do we have a fix for v6? I was looking at putting up a PR with a fix I've been using in a fork but did not want to duplicate work

Can you share your fork with the solution?

@iquirino

iquirino commented Oct 3, 2023

I've found a way to implement that old solution on v6:

import { Networkish, WebSocketProvider } from "ethers";
import WebSocket from "ws";

const EXPECTED_PONG_BACK = 15000;
const KEEP_ALIVE_CHECK_INTERVAL = 60 * 1000; //7500;

const debug = (message: string) => {
  console.debug(new Date().toISOString(), message);
};

export const ResilientWebsocket = (
  url: string,
  network: Networkish,
  task: (provider: WebSocketProvider) => void
) => {
  let terminate = false;
  let pingTimeout: NodeJS.Timeout | null = null;
  let keepAliveInterval: NodeJS.Timeout | null = null;
  let ws: WebSocket | null;

  const sleep = (ms: number) =>
    new Promise((resolve) => setTimeout(resolve, ms));

  const startConnection = () => {
    ws = new WebSocket(url);
    ws.on("open", async () => {
      keepAliveInterval = setInterval(() => {
        if (!ws) {
          debug("No websocket, exiting keep alive interval");
          return;
        }
        debug("Checking if the connection is alive, sending a ping");

        ws.ping();

        // Use `WebSocket#terminate()`, which immediately destroys the connection,
        // instead of `WebSocket#close()`, which waits for the close timer.
        // Delay should be equal to the interval at which your server
        // sends out pings plus a conservative assumption of the latency.
        pingTimeout = setTimeout(() => {
          if (ws) ws.terminate();
        }, EXPECTED_PONG_BACK);
      }, KEEP_ALIVE_CHECK_INTERVAL);

      const wsp = new WebSocketProvider(() => ws!, network);

      while (ws?.readyState !== WebSocket.OPEN) {
        debug("Waiting for websocket to be open");
        await sleep(1000);
      }

      wsp._start();

      while (!wsp.ready) {
        debug("Waiting for websocket provider to be ready");
        await sleep(1000);
      }

      task(wsp);
    });

    ws.on("close", () => {
      console.error("The websocket connection was closed");
      if (keepAliveInterval) clearInterval(keepAliveInterval);
      if (pingTimeout) clearTimeout(pingTimeout);
      if (!terminate) startConnection();
    });

    ws.on("pong", () => {
      debug("Received pong, so connection is alive, clearing the timeout");
      if (pingTimeout) clearTimeout(pingTimeout);
    });

    return ws;
  };

  startConnection();

  return () => {
    terminate = true;
    if (keepAliveInterval) clearInterval(keepAliveInterval);
    if (pingTimeout) clearTimeout(pingTimeout);
    if (ws) {
      ws.removeAllListeners();
      ws.terminate();
    }
  };
};

Usage:

const terminate = ResilientWebsocket(
  WEBSOCKET_URL,
  Number(CHAIN_ID),
  async (provider) => {
    console.log("connected");
  }
);

You can then close the connection (and stop it from reconnecting) at any time by calling terminate().

Edit: fixed following @antoinefarley's comment: #4251 (comment)

@antoinefarley

@iquirino Your solution works for our setup, although

    ws.on("pong", () => {
      debug("Received pong, so connection is alive, clearing the timeout");
      if (pingTimeout) clearInterval(pingTimeout);
    });

should be replaced with:

    ws.on("pong", () => {
      debug("Received pong, so connection is alive, clearing the timeout");
      if (pingTimeout) clearTimeout(pingTimeout);
    });

@iquirino

iquirino commented Oct 25, 2023

> @iquirino Your solution works for our setup, although
>
>     ws.on("pong", () => {
>       debug("Received pong, so connection is alive, clearing the timeout");
>       if (pingTimeout) clearInterval(pingTimeout);
>     });
>
> should be replaced with:
>
>     ws.on("pong", () => {
>       debug("Received pong, so connection is alive, clearing the timeout");
>       if (pingTimeout) clearTimeout(pingTimeout);
>     });

Hey @antoinefarley, thank you for your reply. I've updated my last comment with the right code so that people don't copy the wrong version.

So, the correct code for v6 is:

import { Networkish, WebSocketProvider } from "ethers";
import WebSocket from "ws";

const EXPECTED_PONG_BACK = 15000;
const KEEP_ALIVE_CHECK_INTERVAL = 60 * 1000; //7500;

const debug = (message: string) => {
  console.debug(new Date().toISOString(), message);
};

export const ResilientWebsocket = (
  url: string,
  network: Networkish,
  task: (provider: WebSocketProvider) => void
) => {
  let terminate = false;
  let pingTimeout: NodeJS.Timeout | null = null;
  let keepAliveInterval: NodeJS.Timeout | null = null;
  let ws: WebSocket | null;

  const sleep = (ms: number) =>
    new Promise((resolve) => setTimeout(resolve, ms));

  const startConnection = () => {
    ws = new WebSocket(url);
    ws.on("open", async () => {
      keepAliveInterval = setInterval(() => {
        if (!ws) {
          debug("No websocket, exiting keep alive interval");
          return;
        }
        debug("Checking if the connection is alive, sending a ping");

        ws.ping();

        // Use `WebSocket#terminate()`, which immediately destroys the connection,
        // instead of `WebSocket#close()`, which waits for the close timer.
        // Delay should be equal to the interval at which your server
        // sends out pings plus a conservative assumption of the latency.
        pingTimeout = setTimeout(() => {
          if (ws) ws.terminate();
        }, EXPECTED_PONG_BACK);
      }, KEEP_ALIVE_CHECK_INTERVAL);

      const wsp = new WebSocketProvider(() => ws!, network);

      while (ws?.readyState !== WebSocket.OPEN) {
        debug("Waiting for websocket to be open");
        await sleep(1000);
      }

      wsp._start();

      while (!wsp.ready) {
        debug("Waiting for websocket provider to be ready");
        await sleep(1000);
      }

      task(wsp);
    });

    ws.on("close", () => {
      console.error("The websocket connection was closed");
      if (keepAliveInterval) clearInterval(keepAliveInterval);
      if (pingTimeout) clearTimeout(pingTimeout);
      if (!terminate) startConnection();
    });

    ws.on("pong", () => {
      debug("Received pong, so connection is alive, clearing the timeout");
      if (pingTimeout) clearTimeout(pingTimeout);
    });

    return ws;
  };

  startConnection();

  return () => {
    terminate = true;
    if (keepAliveInterval) clearInterval(keepAliveInterval);
    if (pingTimeout) clearTimeout(pingTimeout);
    if (ws) {
      ws.removeAllListeners();
      ws.terminate();
    }
  };
};

Usage:

const terminate = ResilientWebsocket(
  WEBSOCKET_URL,
  Number(CHAIN_ID),
  async (provider) => {
    console.log("connected");
  }
);

You can then close the connection (and stop it from reconnecting) at any time by calling terminate().
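
One thing worth noting about the close handler above: it calls startConnection() immediately, so an endpoint that keeps dropping the connection can lead to rapid back-to-back reconnect attempts. A possible refinement, sketched here as an assumption rather than something tested against a real node, is to wait longer after each consecutive failure. `reconnectDelayMs` and its default values are hypothetical and not part of ethers or `ws`.

```typescript
// Delay before reconnect attempt number `attempt` (0-based):
// baseMs * 2^attempt, capped at maxMs. Defaults are arbitrary examples.
function reconnectDelayMs(
  attempt: number,
  baseMs: number = 1000,
  maxMs: number = 30000
): number {
  // Treat negative or fractional attempt counts as the nearest valid integer.
  const exponent = Math.max(0, Math.floor(attempt));
  return Math.min(maxMs, baseMs * 2 ** exponent);
}
```

In the close handler this could replace the direct call with something like `setTimeout(startConnection, reconnectDelayMs(attempt++))`, with `attempt` reset to 0 in the open handler.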
