Cover photo by fabio
EDIT: The second part of the series is released: "Scalable Websocket Server Implemented by ChatGPT". Check it out, it has a fully wor...
Appreciated post for an underrated problem.
Having built several products that rely on websocket transports, scaling is a challenging problem to architect - so looking forward to follow up articles (pls link if already published)
Somewhat Node.js specific, I've been reliant on socket.io, which I'm sure ws fans are aware is not strictly ws compliant since the library enhances the vanilla protocol to solve several pain points inherent in large apps that use socket transports (reconnection, long poll fallback etc)
I've recently implemented socket.io's own drop in replacement for pm2 that mediates socket connections across a clustered socket.io server on my RPA product and so far have seen quite a significant difference in performance.
Not a silver bullet, but if you're using socket.io and perhaps Feathers.js I highly recommend trying out @socket.io/pm2
socket.io/docs/v4/pm2/
Thanks for the contribution, great points! I wasn't aware of the pm2 adapter, but it can certainly increase performance, as it uses pm2 process communication via pm2.sendDataToProcessId. I haven't shared the follow-up article yet, but it is definitely on my list. You can follow me on dev.to or subscribe to the mailing list at nooptoday.com to get notified.
Amazing. Followed. Subbed to your news letter. Looking forward to the follow up.
The socket.io adapter only works for a single clustered server instance AFAIK. I have been experimenting with multi-instance servers over pm2's IPC, but there are still plenty of yaks to shave when getting NGINX involved, so I was naturally intrigued by your posts.
@nooptoday just remembered when Phoenix, the Elixir web framework, managed to sustain 2 million concurrent connections back in 2015.
phoenixframework.org/blog/the-road...
Phoenix is one of my favorite frameworks and inspiration for much of my work. Haven't had much luck hiring Elixir devs so opted for Node.js and Feathers which is as close as I could get. Still planning to migrate a product into Phoenix within the next year, budget permitting.
I always recommend trying out Phoenix and Elixir, coz it's really fun. After reading your approach to hashed connection management, I'm going to spend more time on Phoenix to see how they manage connections under the hood.
Interesting implementation here also gist.github.com/Aetherus/2779c154b...
Thanks again for the inspiration
Elixir is definitely the number 1 solution for handling large amounts of concurrent processes. I think it is more about the language itself than the framework. As you pointed out, it is hard to find & hire Elixir devs, so we usually see it used in companies that operate at really large scale, such as WhatsApp. Though I'm not familiar with the language, I will definitely try out the example from the gist, much appreciated!
You nailed it there. The BEAM vm is a work of art imho and tbh so is Elixir. Long been fascinated with the Erlang vm and the Ericsson hyper concurrent telecom use case it was born under, but Erlang is a monster language to write. José Valim really did us all a solid with Elixir... lol if I had the resources I'd certainly fund evangelizing the language to create a much larger developer pool to hire from - maybe there's still time.
Would be really keen to get your impression of the language and working with Phoenix.
The channels facilities really were a game changer in the way I think about real-time architecture and helped make me the massive fan of Feathers.js and its channels implementation. Speaking of which, the Feathers version 5 release candidate is awesome; the schemas addition has really refined the work on my current project.
This is a really awesome and in-depth guide on websockets. I didn't know you can use "consistent hashing" and a redistribution algo to rebalance connections. May I know more about the redistribution algorithm?
How does the algo handle the case where, after you have added server 2, either server 1 or server 2 goes down? I assume that is possible at scale.
Thanks for the reply! I definitely recommend watching the ByteByteGo explanation of this. You can also subscribe to my blog or follow me here, because I am planning to write a series about how you can implement a consistent hashing solution in Node.js.
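For readers wondering what happens when a server goes down, here is a minimal sketch of a consistent hash ring in TypeScript. It is not the implementation from the post: the HashRing class, the FNV-1a hash, the replica count and the server names are all assumptions chosen for illustration. The point it tries to show is that removing a server only remaps the keys that lived on that server, while every other key keeps its placement.

```typescript
// FNV-1a 32-bit hash. An assumption for this sketch; a real deployment
// may prefer a different hash function.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Hypothetical consistent hash ring with virtual nodes ("replicas").
class HashRing {
  private ring: { point: number; server: string }[] = [];
  private replicas: number;

  constructor(replicas: number = 100) {
    this.replicas = replicas;
  }

  // Place `replicas` points for the server on the ring and keep it sorted.
  addServer(server: string): void {
    for (let i = 0; i < this.replicas; i++) {
      this.ring.push({ point: fnv1a(server + "#" + i), server: server });
    }
    this.ring.sort((a, b) => a.point - b.point);
  }

  // Drop every point that belongs to the server (for example, when it crashes).
  removeServer(server: string): void {
    this.ring = this.ring.filter((entry) => entry.server !== server);
  }

  // Find the first point clockwise from the key's hash, wrapping around.
  lookup(key: string): string {
    if (this.ring.length === 0) throw new Error("ring is empty");
    const h = fnv1a(key);
    let lo = 0;
    let hi = this.ring.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.ring[mid].point < h) lo = mid + 1;
      else hi = mid;
    }
    return this.ring[lo % this.ring.length].server;
  }
}

// Usage: count how many users move when "server-2" disappears.
const demoRing = new HashRing();
["server-1", "server-2", "server-3"].forEach((s) => demoRing.addServer(s));
const placedBefore: string[] = [];
for (let i = 0; i < 1000; i++) placedBefore.push(demoRing.lookup("user-" + i));
demoRing.removeServer("server-2");
let moved = 0;
for (let i = 0; i < 1000; i++) {
  if (demoRing.lookup("user-" + i) !== placedBefore[i]) moved++;
}
console.log("users that had to reconnect elsewhere:", moved);
```

In this sketch only the users that were on the removed server get a new placement, so the remaining servers do not have to drop their existing connections. Adding a server works the same way in reverse: only the keys that now fall before one of the new server's points move to it.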
Building software is great but building scalable solutions is just something else.
Thanks a lot. This information is so valuable to me as I love real-time communication.
Thank you for your kind reply, I'm glad you find valuable information in this post.
Interesting read, thanks!
Am I getting it right, the server for connection is determined by user id? So if we have two servers, it's possible that all users with odd ids are offline, then one of the servers will be idle. And this technique doesn't guarantee that servers will be loaded equally, so it seems to be an inefficient solution. Maybe there is some way to keep track of active users on each server and to attach user to the least loaded one?
Also, it may be a good idea to keep a single channel within a specific server, so all users of this channel are served by a single server and there is no need for a message broker between all servers. When we write a message in Discord, it won't be broadcast across all Discord servers; that would be a very naive approach.
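The "attach the user to the least loaded server" idea from the comment above can be sketched roughly like this. It is a hypothetical illustration, not something taken from the post: the LeastLoadedAssigner class and its method names are assumptions. Note the trade-off it makes visible: unlike hashing by user id, the placement cannot be recomputed from the id alone, so a lookup table has to be kept (and shared between load balancers, if there are several).

```typescript
// Hypothetical least-connections assigner; all names are illustrative.
class LeastLoadedAssigner {
  private servers: string[];
  private load = new Map<string, number>();      // server -> active connections
  private placement = new Map<string, string>(); // userId -> server

  constructor(servers: string[]) {
    this.servers = servers.slice();
    for (const server of this.servers) this.load.set(server, 0);
  }

  // Pick the server with the fewest active connections (first one wins ties).
  connect(userId: string): string {
    const existing = this.placement.get(userId);
    if (existing !== undefined) return existing;
    if (this.servers.length === 0) throw new Error("no servers registered");
    let best = this.servers[0];
    let bestLoad = this.load.get(best) ?? 0;
    for (const server of this.servers) {
      const count = this.load.get(server) ?? 0;
      if (count < bestLoad) {
        best = server;
        bestLoad = count;
      }
    }
    this.load.set(best, bestLoad + 1);
    this.placement.set(userId, best);
    return best;
  }

  // Free the slot when the user's socket closes.
  disconnect(userId: string): void {
    const server = this.placement.get(userId);
    if (server === undefined) return;
    this.placement.delete(userId);
    this.load.set(server, (this.load.get(server) ?? 1) - 1);
  }

  // Routing a message to a user requires this stored placement.
  serverFor(userId: string): string | undefined {
    return this.placement.get(userId);
  }
}

// Usage: two servers, users are spread by current load, not by id.
const assigner = new LeastLoadedAssigner(["ws-a", "ws-b"]);
assigner.connect("user-1");
assigner.connect("user-3");
console.log(assigner.serverFor("user-3"));
```

This keeps servers evenly loaded even when, say, all users with odd ids are offline, at the cost of maintaining the placement table as shared state.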
Thanks for the great questions! I will try to answer these questions in the next post but here are some quick answers:
Great article
Thanks!
Very insightful, not only for knowing how to distribute Websocket load, but in general - thanks!
Thank you for the kind reply
Great article, very easy to understand. Thanks a lot!
Thanks for the reply!