Node.js · Distributed Systems

Utility of server pooling

Mike Whitfield Sr. Software Engineer, EPAM, Google

September 23rd, 2015

I recently spoke with a company that swore by server pooling: it let them write code "lazily", accepting that it would break often but with minimal impact to service.  When a server goes down it reboots, and another instance in the pool is ready to handle requests in the meantime.

He claims the usual pool size is three servers, whereas he was using eight, which he says almost never happens in a Node environment.

While I'm busy implementing distributed pools for different reasons (plug-and-play operation of multiple running instances of different versions, using the network to manage accessibility), I'm wondering what other considerations there are when creating pools of server instances.  As a more product-oriented person, I'm mostly concerned with the ability to rapidly deploy new features and evolve my software without breaking the existing experience of our products.  That is best achieved by running a distributed set of nodes serving multiple versions simultaneously.

Brendan Gowing CTO at CENTURY Tech

September 23rd, 2015

I think that you should have asked your acquaintance what he meant by "server" - the hardware/VM kind that runs an OS, or the software kind that provides a service. It's typical to run Node.js in a load-balanced pool. As Mikko pointed out, Node.js is single threaded, and to make proper use of a modern (hardware/VM) server with multiple cores, you need to run an instance of Node.js (the software server) per core.

Node.js is notoriously unstable (e.g., memory leaks, sloppy programming), so having more than one instance of any given service is crucial, and having many instances is common. To then allow for hardware or virtual host failure/restart, you need more than one machine to run your Node.js instances - the minimum obviously being two. Thus, with two 4-core machines you have 8 Node.js servers at a minimum. I can imagine that most production systems with many users will have many more Node.js servers running.

If using containerisation systems such as Docker, I would not imagine that a single Node.js server would run per container - that would be a large overhead. It's much more likely that you would have multiple Node.js servers running per container. Node.js is particularly easy to run under upstart, with a supervisor like forever handling restarts on failure.
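For concreteness, supervising an app with `forever` looks something like this (the commands are forever's standard CLI; `app.js` is a hypothetical entry point):

```shell
# keep a (hypothetical) app.js alive across crashes with `forever`
npm install -g forever
forever start app.js   # daemonizes app.js and restarts it on failure
forever list           # show processes under forever's supervision
forever stop app.js    # stop supervising (and kill) the process
```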

HTH

Mike Whitfield Sr. Software Engineer, EPAM, Google

September 23rd, 2015

Expertly explained Brendan.  Thanks so much.

Mike Whitfield Sr. Software Engineer, EPAM, Google

September 23rd, 2015

Single-threaded Node was part of his justification, Mikko (to make the OS devote more resources to running Node, I think).

I'm skeptical when it comes to containers, since they seem widely popular with AWS and cloud-hosting users (i.e. I think it's less about engineering and more about technical marketing/tech culture).  I'd nonetheless be interested to hear a justification as to why the scripts needed to build the containers are either not time-intensive or worth the overhead.

Mikko Koppanen Senior Technologist

September 23rd, 2015

I haven't really heard of "server pooling" as described in your post before. It sounds a lot like having a hot spare, but I don't understand why you actually need to reboot the servers, or why they go down in the first place.

I guess in a Node environment you would run multiple processes due to the single-threaded nature of the system, especially if you need to scale to more clients. This would especially be the case with applications where the callbacks are "blocking", i.e. taking some time to finish and thus blocking the main event loop.
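To make the "blocking the main event loop" point concrete, here's a minimal sketch (not from the thread) of a CPU-bound synchronous callback delaying a timer that has long since expired:

```javascript
// Minimal sketch: a synchronous, CPU-bound callback stalls Node's
// single event loop, delaying every other pending callback.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {} // synchronous spin: nothing else can run
}

let fired = false;
setTimeout(() => { fired = true; }, 10); // due in ~10 ms

busyWait(200); // blocks the event loop for 200 ms

// Even though the timer expired long ago, its callback cannot run
// until this synchronous code returns control to the event loop.
console.log(fired);                     // still false here
setImmediate(() => console.log(fired)); // true once the loop turns
```

Spreading work across multiple processes means one blocked event loop only stalls a fraction of your capacity instead of all of it.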

If you are looking to run multiple versions of the software and rapidly deploy different versions, take a look at Docker, rkt or any other container system. Containers have a lot of benefits, such as isolation and easier dependency management.

Mikko Koppanen Senior Technologist

September 23rd, 2015

Hello Mike,

I read your initial post again and I still don't understand what the term "server pool" means. Could you clarify?

Without knowing anything about the software, the organisation building it, and the deployment target, it's very hard to give you anything but generic advice. In my view the main benefits of containerised systems are that they isolate each app and give you easily deployable units.

As an example, let's say you're running a webapp. You could have the software running in a couple of containers in your live environment with something like HAProxy in front of them. When you want to upgrade, you spin up new containers for the new version, update HAProxy to gracefully redirect traffic to them, and once all users have been migrated you shut down the old containers. Found a showstopper bug in the new version? Just do the same thing in reverse: spin up containers for the older version and update your HAProxy (this naturally assumes there were no destructive changes to the data store in between).
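A sketch of that switchover, assuming a recent HAProxy with the runtime socket enabled, a backend named `webapp` with a server entry `v1`, and hypothetical container names, ports and image tags:

```shell
# spin up containers for the new version (names and ports are hypothetical)
docker run -d --name webapp_v2 -p 3002:3000 myapp:v2

# gracefully drain traffic from the old version via HAProxy's runtime socket
echo "set server webapp/v1 state drain" | socat stdio /var/run/haproxy.sock

# once existing sessions have finished, retire the old containers
docker stop webapp_v1 && docker rm webapp_v1
```

Rolling back is the same dance in the other direction, with `v2` drained instead.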

Containers also work well in a micro-services architecture, where you have small services responsible for small tasks. Having these in containers allows you to flexibly change where the services run (for example, if you run out of capacity on a physical host). Deployments also become somewhat easier, because the same container you run locally will also run in other environments (no more "someone forgot to install this and that dependency on that box").

I guess I could write an essay as an answer, but in the end the decision of whether the benefits outweigh the cost of getting the system running can only be made by someone who understands the actual software and the organisation around it.