Building A Pub/Sub Service In-House Using Node.js And Redis

Building A Pub/Sub Service In-House Using Node.js And Redis

Building A Pub/Sub Service In-House Using Node.js And Redis

Dhimil Gosalia


Today’s world operates in real time. Whether it’s trading stock or ordering food, consumers today expect immediate results. Likewise, we all expect to know things immediately — whether it’s in news or sports. Zero, in other words, is the new hero.

This applies to software developers as well — arguably some of the most impatient people! Before diving into BrowserStack’s story, it would be remiss of me not to provide some background about Pub/Sub. For those of you who are familiar with the basics, feel free to skip the next two paragraphs.

Many applications today rely on real-time data transfer. Let’s look closer at an example: social networks. The likes of Facebook and Twitter generate relevant feeds, and you (via their app) consume it and spy on your friends. They accomplish this with a messaging feature, wherein if a user generates data, it will be posted for others to consume in nothing short of a blink. Any significant delays and users will complain, usage will drop, and if it persists, churn out. The stakes are high, and so are user expectations. So how do services like WhatsApp, Facebook, TD Ameritrade, Wall Street Journal and GrubHub support high volumes of real-time data transfers?

All of them use a similar software architecture at a high level called a “Publish-Subscribe” model, commonly referred to as Pub/Sub.

“In software architecture, publish–subscribe is a messaging pattern where senders of messages, called publishers, do not program the messages to be sent directly to specific receivers, called subscribers, but instead categorize published messages into classes without knowledge of which subscribers, if any, there may be. Similarly, subscribers express interest in one or more classes and only receive messages that are of interest, without knowledge of which publishers, if any, there are.“


Bored by the definition? Back to our story.

At BrowserStack, all of our products support (in one way or another) software with a substantial real-time dependency component — whether its automate tests logs, freshly baked browser screenshots, or 15fps mobile streaming.

In such cases, if a single message drops, a customer can lose information vital for preventing a bug. Therefore, we needed to scale for varied data size requirements. For example, with device logger services at a given point of time, there may be 50MB of data generated under a single message. Sizes like this could crash the browser. Not to mention that BrowserStack’s system would need to scale for additional products in the future.

As the size of data for each message differs from a few bytes to up to 100MB, we needed a scalable solution that could support a multitude of scenarios. In other words, we sought a sword that could cut all cakes. In this article, I will discuss the why, how, and results of building our Pub/Sub service in-house.

Through the lens of BrowserStack’s real-world problem, you will get a deeper understanding of the requirements and process of building your very own Pub/Sub.

Our Need For A Pub/Sub Service

BrowserStack has around 100M+ messages, each of which is somewhere between approximately 2 bytes and 100+ MB. These are passed around the world at any moment, all at different Internet speeds.

The largest generators of these messages, by message size, are our BrowserStack Automate products. Both have real-time dashboards displaying all requests and responses for each command of a user test. So, if someone runs a test with 100 requests where the average request-response size is 10 bytes, this transmits 1×100×10 = 1000 bytes.

Now let’s consider the larger picture as — of course — we don’t run just one test a day. More than approximately 850,000 BrowserStack Automate and App Automate tests are run with BrowserStack each and every day. And yes, we average around 235 request-response per test. Since users can take screenshots or ask for page sources in Selenium, our average request-response size is approximately 220 bytes.

So, going back to our calculator:

850,000×235×220 = 43,945,000,000 bytes (approx.) or only 43.945GB per day

Now let’s talk about BrowserStack Live and App Live. Surely we have Automate as our winner in form of size of data. However, Live products take the lead when it comes to the number of messages passed. For every live test, about 20 messages are passed each minute it turns. We run around 100,000 live tests, which each test averaging around 12 mins meaning:

100,000×12×20 = 24,000,000 messages per day

Now for the awesome and remarkable bit: We build, run, and maintain the application for this called pusher with 6 t1.micro instances of ec2. The cost of running the service? About $70 per month.

Choosing To Build vs. Buying

First things first: As a startup, like most others, we were always excited to build things in-house. But we still evaluated a few services out there. The primary requirements we had were:

  1. Reliability and stability,
  2. High performance, and
  3. Cost-effectiveness.

Let’s leave the cost-effectiveness criteria out, as I can’t think of any external services that cost under $70 a month (tweet me if know you one that does!). So our answer there is obvious.

In terms of reliability and stability, we found companies that provided Pub/Sub as a service with 99.9+ percent uptime SLA, but there were many T&C’s attached. The problem is not as simple as you think, especially when you consider the vast lands of the open Internet that lie between the system and client. Anyone familiar with Internet infrastructure knows stable connectivity is the biggest challenge. Additionally, the amount of data sent depends on traffic. For example, a data pipe that’s at zero for one minute may burst during the next. Services providing adequate reliability during such burst moments are rare (Google and Amazon).

Source: Smashing Magazine