Hello! We've been trying to integrate this library into our code base. We use Socket IO quite heavily, and balancing traffic based on the sid, without using the master Node process as a proxy, sounded like a very appealing idea.
We've hit a couple of roadblocks along the way, and I'm wondering if it's just down to how we've integrated it. I wanted to share the issues I struggled with and seek some guidance from others who have looked at this before. I'm on the verge of going back to proxying requests in the master and balancing based on the sid that way.
- I couldn't use the `setupWorker` function out of the box. We have multiple Socket IO servers attached to our HTTP server (to cater for multiple paths), and running `setupWorker` multiple times attaches multiple process message listeners. I had to split the function into the Engine IO connection listener (which I attach to each Socket IO server) and the process message listener, which I made sure is only attached once.
- We have some serious issues with long polling. From what I understand, browsers reuse connections under the hood when using HTTP/1.1, and that causes problems for long polling. For example, the handshake request could come in on Worker 1, and that worker would "claim" the sid because the Engine IO connection event is triggered there first. But later requests could be sent over another connection and routed to a completely different worker, which causes the dreaded `Session ID unknown` error. I think the mechanism of transferring connections to a worker doesn't align well with how connections are used in an HTTP/1.1 context: polling requests can end up on any open connection, regardless of which worker that connection is attached to.
  An important consequence of this is that sticky sessions persist longer than they have to (longer than the lifetime of the Socket IO socket). A new socket often gets "claimed" by the same worker, because its handshake tends to reuse the open connection left by the old socket and so hits the worker directly without passing through the master. That essentially means all requests for a user are very likely to end up on a single worker.
- Given the above, I'm a bit skeptical about the `least connection` load-balancing method. It counts Engine IO connections, which don't seem to correspond directly to the actual open connections on the HTTP server. Should it count connections through event listeners on the HTTP server instead, or are Engine IO connections a good approximation of actual connections to workers?
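
To illustrate the first point, a minimal sketch of such a split might look like the following. It is not the library's code: `createWorkerSetup`, the `sid:claim` / `sid:release` message types and the `onMasterMessage` handler are hypothetical names chosen for the example. The only assumptions about the real objects are that each engine emits `connection` with a socket that has an `id` and emits `close`.

```javascript
// Hypothetical helper: returns an attach() function that installs the
// process-level "message" listener at most once, no matter how many
// Engine IO servers are registered afterwards.
function createWorkerSetup(proc, onMasterMessage) {
  let processListenerAttached = false;

  return function attach(engine) {
    // Process-level part: attached once for the whole worker.
    if (!processListenerAttached) {
      proc.on("message", onMasterMessage);
      processListenerAttached = true;
    }

    // Per-server part: tell the master which sid this worker now owns,
    // and release it again when the session closes.
    engine.on("connection", (socket) => {
      proc.send({ type: "sid:claim", data: socket.id });
      socket.once("close", () => {
        proc.send({ type: "sid:release", data: socket.id });
      });
    });
  };
}
```

In a worker this would presumably be used as `const attach = createWorkerSetup(process, handler)` followed by one `attach(io.engine)` call per Socket IO server.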
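
The long-polling problem comes down to routing per TCP connection versus routing per request. As a rough sketch of the per-request alternative (the "proxy in the master" approach), the sid can be read from the `sid` query parameter that Engine IO polling requests carry after the handshake. `routeRequest`, `sidToWorker` and `pickFallback` are hypothetical names for illustration, not part of any library.

```javascript
// Extract the Engine IO session id from a request URL.
// Handshake requests carry no sid, so this returns null for them.
function sidFromUrl(requestUrl) {
  // The base is only needed so that relative request URLs can be parsed.
  const { searchParams } = new URL(requestUrl, "http://localhost");
  return searchParams.get("sid");
}

// Hypothetical per-request router: a known sid goes to the worker that
// owns it; anything else (handshake, unknown sid) goes to the fallback.
function routeRequest(requestUrl, sidToWorker, pickFallback) {
  const sid = sidFromUrl(requestUrl);
  if (sid !== null && sidToWorker.has(sid)) {
    return sidToWorker.get(sid);
  }
  return pickFallback();
}
```

Because the decision is made for every request, it does not matter which keep-alive connection the browser happens to reuse. The cost is that the master has to parse and forward each request.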
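
For the last question, counting actual connections in a worker could be sketched as below, using the `connection` event of Node's `http.Server` and the `close` event of the underlying socket. `trackOpenConnections` and the `report` callback are hypothetical. Whether sockets handed over from the master are announced through this event in a given setup is an assumption that would need checking.

```javascript
// Hypothetical helper: counts open TCP connections on an HTTP server and
// calls report(count) on every change, e.g. to forward the count to the
// master for least-connection balancing.
function trackOpenConnections(httpServer, report) {
  let open = 0;

  httpServer.on("connection", (socket) => {
    open += 1;
    report(open);

    // "close" fires once per socket, so the counter cannot go negative.
    socket.once("close", () => {
      open -= 1;
      report(open);
    });
  });

  // Accessor for the current count.
  return () => open;
}
```

A worker might call it as `trackOpenConnections(httpServer, (n) => process.send({ type: "open-connections", data: n }))`, where the message type is again made up for the example.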