Setting up Socket.io with node.js cluster

Node.js keeps growing in popularity, and so do WebSockets (real-time connections). Yet making WebSockets and a Node.js cluster work well together through socket.io is poorly documented and often treated as a taboo. Not anymore ☺.

In this article we will learn what WebSockets are, how the Node.js cluster module works, what problems arise when setting up WebSockets with a Node.js cluster, and how to solve them. The article assumes you have some knowledge of Node.js and how it works. If you are a newbie, don't worry, we have still got you covered: go through the blog post Understanding node.js and come back right after.

What are WebSockets:

      "Websockets are an advanced technology that makes it possible to open an interactive communication session between the user's browser and a server. With this API, you can send messages to a server and receive event-driven responses without having to poll the server for a reply." 

But why do we need WebSockets when we already have AJAX? WebSockets represent a standard for bi-directional real-time communication between servers and clients: first in web browsers, but ultimately between any server and any client. The standards-first approach means that, as developers, we can finally create functionality that works consistently across multiple platforms. Connection limitations are no longer a problem, since a WebSocket uses a single TCP socket connection. Cross-domain communication has been considered from day one and is dealt with within the connection handshake.

Now that we know what WebSockets are, let's dive into setting up a Node.js cluster.

Node.js cluster API:

A Node.js program runs in a single process. While that is still very fast in most cases, it does not take advantage of multiple processors when they are available. If you have an 8-core CPU and run a Node.js program via $ node app.js, it will run in a single process and leave the remaining cores idle. Fortunately, Node.js offers the "cluster" module, which lets you create a small network of separate processes that can share server ports; this gives your Node.js app access to the full power of your server.

That's enough talking. Let's look at a real example in which:

  • The master process retrieves the number of CPUs and forks a worker process for each CPU, and
  • Each child process prints a message to the console and exits.

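const cluster = require('cluster');
const numCPUs = require('os').cpus().length;

// cluster.isMaster is true only in the process started with `node app.js`
// (newer Node.js versions also expose it as cluster.isPrimary).
if (cluster.isMaster) {
  masterProcess();
} else {
  childProcess();
}

function masterProcess() {
  console.log(`Master ${process.pid} is running`);
  // Fork one worker per CPU core.
  for (let i = 0; i < numCPUs; i++) {
    console.log(`Forking process number ${i}...`);
    cluster.fork();
  }
  // No process.exit() here: the master exits by itself once all workers
  // have finished, so no worker is cut off before it prints its message.
}

function childProcess() {
  console.log(`Worker ${process.pid} started and finished`);
  process.exit();
}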
Save the code in a file named app.js and run it with $ node app.js. The output should look similar to:
$ node app.js

Master 8463 is running
Forking process number 0...
Forking process number 1...
Forking process number 2...
Forking process number 3...
Worker 8464 started and finished
Worker 8465 started and finished
Worker 8467 started and finished
Worker 8466 started and finished
Simple, isn't it? Indeed it is.


Making socket.io work with node.js cluster API:

So can we just start using WebSockets (with the socket.io library) on top of the Node.js cluster API? Not yet.
The problem lies in how socket connections are established. Before going further, let's understand the basic steps in establishing a socket connection.

Handshake
When creating a WebSocket connection, the first step is a handshake over TCP in which the client and server agree to use the WebSocket Protocol.
The handshake from the client looks like this:
GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Origin: http://example.com
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13
The handshake from the server:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Protocol: chat
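
The Sec-WebSocket-Accept value in the server response is derived from the client's Sec-WebSocket-Key. As a minimal sketch (the function name computeAccept is only illustrative; the GUID is the fixed value defined by RFC 6455), the server appends that GUID to the key, hashes the result with SHA-1, and base64-encodes the digest:

```javascript
const crypto = require('crypto');

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// computeAccept is an illustrative helper name, not part of any library.
function computeAccept(secWebSocketKey) {
  return crypto
    .createHash('sha1')
    .update(secWebSocketKey + WS_GUID)
    .digest('base64');
}

// Key taken from the sample client handshake above.
console.log(computeAccept('dGhlIHNhbXBsZSBub25jZQ=='));
// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

Libraries such as socket.io perform this step for you; the sketch only shows where the value in the sample response comes from.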

Create a WebSocket Connection
A WebSocket connection is established by upgrading from the HTTP protocol to the WebSocket Protocol during the initial handshake between the client and the server, "over the same underlying TCP connection". An Upgrade header is included in this request that informs the server that the client wishes to establish a WebSocket connection. 

Hence, if we plan to distribute the load of connections among different processes (i.e. a cluster), we have to make sure that all requests associated with a particular session id reach the process that originated them.
This is because certain transports, such as XHR polling or JSONP polling, fire several requests during the lifetime of the "socket".
Failing to enable sticky balancing will result in the dreaded:



Error during WebSocket handshake: Unexpected response code: 400

This means that the upgrade request was sent to a node (one of the available cluster workers) that did not know the given socket id, hence the HTTP 400 response.
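
To make the failure concrete, here is a toy simulation. It is not socket.io's actual internals: ToyWorker, simulate, the session id format and the character-sum hash are all made up for illustration. Each worker keeps its own in-memory set of session ids, so a follow-up request that lands on a different worker is rejected with 400, while sticky routing keeps it on the worker that created the session.

```javascript
// Toy model: every worker only knows the session ids it created itself.
class ToyWorker {
  constructor(id) {
    this.id = id;
    this.sessions = new Set();
  }
  // First request: create and remember a session id.
  open() {
    const sid = `sid-${this.id}-${this.sessions.size}`;
    this.sessions.add(sid);
    return sid;
  }
  // Follow-up request (polling or upgrade): 200 if the sid is known, else 400.
  handle(sid) {
    return this.sessions.has(sid) ? 200 : 400;
  }
}

const pool = [new ToyWorker(0), new ToyWorker(1)];

// Round-robin: every request goes to the next worker in turn.
let next = 0;
const roundRobin = () => pool[next++ % pool.length];

// Sticky: the same client address always maps to the same worker.
const sticky = (ip) => {
  let sum = 0;
  for (const ch of ip) sum += ch.charCodeAt(0);
  return pool[sum % pool.length];
};

// One client session: an opening request followed by one follow-up request.
function simulate(pickWorker, ip) {
  const sid = pickWorker(ip).open();
  return pickWorker(ip).handle(sid);
}

console.log(simulate(roundRobin, '10.0.0.7')); // 400
console.log(simulate(sticky, '10.0.0.7'));     // 200
```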


This can be solved with a simple trick: ensure that file descriptors (i.e. connections) are routed based on the originating remoteAddress (requests from a particular address always go to the same node) rather than in a round-robin fashion.

Let's do it :D
const cluster = require('cluster');
const http = require('http');
const net = require('net');
const cpus = require('os').cpus().length;
const socketio = require('socket.io');
const farmhash = require('farmhash');

const port = process.env.PORT || 3001;

if (cluster.isMaster) {
  const workers = [];
  console.log('Master started, process id', process.pid);

  for (let i = 0; i < cpus; i++) {
    workers.push(cluster.fork());
    console.log('worker started ' + workers[i].id);
    workers[i].on('disconnect', () => {
      console.log('worker ' + workers[i].id + ' died');
    });
  }

  // Get the worker index from the IP and the total number of workers, so that
  // connections from one address are always transferred to the same worker.
  const getWorker_index = (ip, len) => {
    return farmhash.fingerprint32(ip) % len;
  };

  // Create the TCP server that faces the outside world.
  const server = net.createServer({
    // With pauseOnConnect set to true, an incoming connection is paused
    // so that no data is read before it is handed to the worker.
    pauseOnConnect: true
  }, (connection) => {
    // We received a connection and need to pass it to the appropriate
    // worker. Get the worker for this connection's source IP using the
    // getWorker_index function defined above. remoteAddress is undefined
    // if the client has already disconnected, so fall back to ''.
    const worker = workers[getWorker_index(connection.remoteAddress || '', cpus)];
    // Send the connection to the worker, which resumes it with .resume().
    worker.send({
      cmd: 'sticky-session'
    }, connection);
  }).listen(port);
} else {
  // Assumption: Expserver is this worker's HTTP server (a plain http server
  // here; an Express app's server works the same way). It listens on port 0
  // (a random port) bound to localhost, so it is not exposed directly;
  // connections arrive through the master.
  const Expserver = http.createServer().listen(0, 'localhost');
  // Attach socket.io; register io.on('connection', ...) handlers here.
  const io = socketio(Expserver);

  // Listen for the message event sent by the master to catch the connection
  // and resume it.
  cluster.worker.on('message', (obj, data) => {
    switch (obj.cmd) {
      case 'sticky-session':
        // The handle is undefined if the socket closed while in transit.
        if (!data) return;
        // Hand the connection to this worker's server as if it had accepted it.
        Expserver.emit('connection', data);
        data.resume();
        break;
      default:
        return;
    }
  });
}
As you can see, requests originating from the same IP address go to the same node in the cluster, which makes the balancing sticky.
Please note that this might lead to unbalanced routing, depending on the hashing method we use.
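
To get a feel for how even or uneven the routing can be, the following sketch counts how many client addresses end up on each worker. It makes several assumptions: hash32 is a stand-in built on Node's crypto module (the first 4 bytes of an MD5 digest) rather than farmhash.fingerprint32, the helper names are illustrative, and the IP addresses are made up.

```javascript
const crypto = require('crypto');

// Stand-in for farmhash.fingerprint32: first 4 bytes of an MD5 digest
// read as an unsigned 32-bit integer.
function hash32(text) {
  return crypto.createHash('md5').update(text).digest().readUInt32BE(0);
}

// Same idea as getWorker_index: hash the address, take it modulo the worker count.
function workerIndex(ip, workerCount) {
  return hash32(ip) % workerCount;
}

// Count how many addresses are routed to each worker.
function countPerWorker(ips, workerCount) {
  const counts = new Array(workerCount).fill(0);
  for (const ip of ips) {
    counts[workerIndex(ip, workerCount)] += 1;
  }
  return counts;
}

// Hypothetical client addresses, for illustration only.
const ips = [];
for (let i = 1; i <= 200; i++) {
  ips.push(`192.168.1.${i}`);
}

// One count per worker; the counts are usually close to each other but not equal.
console.log(countPerWorker(ips, 4));
```

Keep in mind that even a well-distributed hash cannot help when many clients share one public address (for example behind a NAT or a proxy): all of them map to the same worker.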

There you go. This is how you make socket.io work with the Node.js cluster API :)

Here's the repo of sample chat application using Socket.io and node cluster API:
https://github.com/ANURAGVASI/socket.io-multiserver-chatApp

Comments

  1. Very informative thanks....but
    I would like to know how to ensure the session management in cluster API?

    Replies
    1. Thanks man! You may want to use redis for central management, working with web sockets is a little catchy and IPC(Inter process communication) is a tedious job to manage application, it is not preferred practice for handling complex applications.
