Let’s say you have a browser application that needs to keep data in sync between multiple users in real time. Perhaps it’s a real-time data dashboard, or an online collaboration tool like Sensei, or a multiplayer game. Should you use a publish/subscribe data model or the peer-to-peer model of the WebRTC DataChannel?
Recently I have been speaking to user groups about building real-time web applications as part of the run-up to a book we are writing at AgilityFeat. My presentations explain the basic concepts of publish/subscribe patterns for data exchange, as well as the video, audio, and data channels of WebRTC. If we set aside video and audio and think only about data exchange between browsers, which model should you use? That question came up in one of my talks this week, so I’ll post the answer here.
First, let’s explain the two architectures briefly…
Publish / Subscribe
In a pubsub network, a central messaging server receives all messages sent between browsers and relays them to the users connected to your application. You could implement this directly on WebSockets or with one of a number of open-source solutions like Faye. We like to use a paid service called PubNub, because it scales easily and they manage the messaging server infrastructure for us.
In a publish / subscribe model, whenever one of the users in your application needs to send a piece of data to all other users, they publish that message to a specific channel on the messaging server. The name of this channel can be anything you want, and it should imply the purpose of the channel and who is going to consume the data. You control which applications or user types consume that data by deciding who learns the channel name. In other words, all instances of your application must use the same channel name in order to share data.
Anyone who previously subscribed to that channel (by sending the messaging server a request to subscribe to that same channel name) will then receive the published data through a callback method in their application.
This method is very simple and versatile, and allows you to connect lots of subscribers to the same channel without creating any extra load on the publisher. The messaging server is responsible for the burden of delivering messages to all subscribers, and therefore the number of subscribers or publishers creates no burden for anyone connected to the messaging channel.
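To make the flow above concrete, here is a minimal sketch of the publish / subscribe pattern, with an in-memory class standing in for the messaging server. The `MessageBroker` class, its method names, and the `dashboard-updates` channel name are all made up for illustration; this is not the API of PubNub, Faye, or any other real service, and a real messaging server would deliver messages over the network rather than through direct function calls.

```typescript
// Hypothetical in-memory stand-in for a messaging server (not a real service's API).
type Handler<T> = (message: T) => void;

class MessageBroker<T> {
  // One set of subscriber callbacks per channel name.
  private channels = new Map<string, Set<Handler<T>>>();

  // Register a callback on a channel; returns a function that unsubscribes it.
  subscribe(channel: string, handler: Handler<T>): () => void {
    const handlers = this.channels.get(channel) ?? new Set<Handler<T>>();
    this.channels.set(channel, handlers);
    handlers.add(handler);
    return () => {
      handlers.delete(handler);
    };
  }

  // Deliver a message to every current subscriber of the channel.
  // Returns the number of subscribers the channel has after delivery.
  publish(channel: string, message: T): number {
    const handlers = this.channels.get(channel);
    if (!handlers) return 0;
    handlers.forEach((handler) => handler(message));
    return handlers.size;
  }
}

// Usage: two subscribers share a channel name; the publisher calls publish() once.
const broker = new MessageBroker<string>();
const log: string[] = [];
broker.subscribe("dashboard-updates", (m) => { log.push("alice got " + m); });
broker.subscribe("dashboard-updates", (m) => { log.push("bob got " + m); });
broker.publish("dashboard-updates", "cpu=42%");
```

Notice that the publisher makes a single `publish` call no matter how many subscribers are listening. The fan-out happens inside the broker, which is the property that keeps the load off the publisher.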
WebRTC Data Channel
WebRTC is something very different. First of all, it supports video and audio channels in addition to a data channel. But for the purposes of this blog post, we are only discussing the DataChannel.
WebRTC is a Peer-to-Peer model, where two users of your application connect directly to each other. The data that you send does not pass through your application’s messaging server, and all data is automatically encrypted. When a direct route between the two browsers is not possible, the traffic may be relayed through a TURN server, but it remains encrypted between the two peers. This provides a nice built-in level of security that you don’t have by default in a pubsub architecture.
A central server is still needed at the beginning of the conversation for a handshaking process, called “Signaling” in the WebRTC world. You can do this with a service like PubNub, or you can roll your own signaling implementation. The purpose of this intermediary server is only for your two users’ browsers to learn enough about each other to know how to route messages. Once they have established the connection, the browsers speak to each other Peer-to-Peer. If the data you are sending is sensitive, this is comforting because, in theory, no intervening server gets a chance to examine your data on its way to the final destination.
However, because WebRTC is Peer-to-Peer, that means every user of your application must establish a separate connection to every other user of your application that they need to communicate with. This creates a scaling issue that pubsub networks don’t have, so you should keep this in mind if you are trying to send a lot of data messages around to a large number of users (let’s say more than 8 or 10 as a rule of thumb, but your performance results may vary).
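A bit of arithmetic shows why this becomes a scaling issue. The sketch below assumes a full mesh, where every user talks to every other user, and compares it with a pubsub setup where each user holds a single connection to the messaging server. The function names are my own, for illustration only.

```typescript
// Connections needed when every user connects directly to every other user (full mesh).
function meshConnections(users: number): { perUser: number; total: number } {
  if (!Number.isInteger(users) || users < 0) {
    throw new RangeError("users must be a non-negative integer");
  }
  const perUser = Math.max(users - 1, 0);
  // Each connection is shared by two users, so divide by two to avoid double counting.
  return { perUser, total: (users * perUser) / 2 };
}

// Connections needed when every user holds one connection to a messaging server.
function pubsubConnections(users: number): { perUser: number; total: number } {
  if (!Number.isInteger(users) || users < 0) {
    throw new RangeError("users must be a non-negative integer");
  }
  return { perUser: users > 0 ? 1 : 0, total: users };
}

meshConnections(10);    // { perUser: 9, total: 45 }
pubsubConnections(10);  // { perUser: 1, total: 10 }
meshConnections(100);   // { perUser: 99, total: 4950 }
```

The total in the mesh grows roughly with the square of the number of users, and each browser must also send its own copy of every message to each peer. That is where the rule of thumb of 8 or 10 users comes from.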
Which should you use?
Pubsub-style networks and the WebRTC DataChannel are both powerful tools in your arsenal. We regularly find applications for each, but you need to consider which use cases suit which tool. Just because you may be using WebRTC to build video/audio tools into your application does not automatically mean you should also use it for data messaging.
When to use PubSub real-time messaging
1. Applications with a large number of users who all need to share data/messages with each other
2. Applications where the complexity of the data is not large – your messages are short and do not contain large data sets in them
3. Applications that require cross-browser support. Publish/subscribe networks are much more mature than the WebRTC standard, and while you still need to be careful which frameworks you choose and test them in all required browsers, you can build a much more stable solution across all the major browsers. Many of the publish/subscribe frameworks will automatically fall back to other protocols for you if a WebSockets connection back to the messaging server is not possible.
When to use the WebRTC DataChannel
1. Applications where security is paramount. Because your data is encrypted and travels between peers rather than through a messaging server, you can use the WebRTC DataChannel to build messaging into health care, corporate enterprise, or other applications where regulations or security concerns make it less desirable to send data through a cloud service.
2. File sharing applications where you need to send large files to other users. You can use the DataChannel to set up a Peer-to-Peer file exchange with a limited number of users, who send files directly to each other.
3. Applications where you control the browser. Since WebRTC is currently only supported in Chrome, Firefox, and Opera, you need to have an application where your users are either already using those browsers, or where you control the deployment somehow. For instance, in a corporate enterprise environment, you can simply mandate that those who need to use your real-time dashboard must do it in Chrome so that they get the full benefits of the WebRTC channels. In the future, this will hopefully not be an issue as IE and Safari recognize the growing user demand for WebRTC-enabled browser applications.
4. Apps with fewer than a dozen simultaneous users. Because WebRTC is Peer-to-Peer (P2P), you must establish a separate DataChannel connection to each user you want to trade messages or data with. This can be onerous, so you should not use the WebRTC DataChannel as the basis of a large social media application with lots of users. But it might just be perfect for a corporate application, a data dashboard, a multiplayer game, online education or telemedicine applications, Machine-to-Machine (M2M) communication, or the Internet of Things (IoT).
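For the file sharing case in item 2, a large file is usually broken into small pieces before it goes over the DataChannel, and reassembled on the other side. Here is a rough sketch of that chunking step only. The 16 KiB chunk size is an assumption on my part (a commonly used conservative value, not a limit stated anywhere in this post), and the function names are hypothetical. In a browser, each chunk would then be handed to the data channel’s `send()` method, which is left out here so the sketch runs anywhere.

```typescript
// Assumed conservative message size; tune it for the browsers you target.
const CHUNK_SIZE = 16 * 1024;

// Split a file's bytes into chunks small enough to send one message at a time.
function splitIntoChunks(data: Uint8Array, chunkSize: number = CHUNK_SIZE): Uint8Array[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    // subarray() creates a view without copying and clamps the end to data.length.
    chunks.push(data.subarray(offset, offset + chunkSize));
  }
  return chunks;
}

// Reassemble the chunks on the receiving side, in the order they arrived.
function joinChunks(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}
```

A real file transfer also needs to tell the receiver the file name and total size, and to pause sending when the channel’s buffer fills up. Those parts are omitted from this sketch.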
What do you think?
There are undoubtedly many other use cases to consider when choosing what type of real-time messaging architecture you want to use. What are the decision trade-offs you need to make? I’d be happy to hear about them, and maybe your feedback will even make it into our upcoming book. And anytime you write a blog post or release a project using these technologies, let me know and maybe we’ll highlight it in an upcoming issue of RealTimeWeekly.com.