MQTT (the acronym that, apparently, shouldn’t be expanded to Message Queue Telemetry Transport anymore) is a communication protocol focused on energy efficiency, data-transfer minimization, and assurance of delivery. These three qualities make it a great choice for any Internet of Things project where an Internet connection is available but can be unreliable. Definitions aside, we can use MQTT for:
For a detailed overview of the technology, please read my previous post.
MQTT is an obvious choice when we are building a text-based messaging app. We can go one step further, though, and incorporate MQTT into an application that does much more. Let’s imagine an app that enables its users to perform video calls. Its basic functionalities would be:
MQTT cannot pass video and audio data – we’ll need to use another technology for this, such as Voice over IP (VoIP). That said, we can utilize MQTT to build the supporting architecture to ensure the stability and reliability of our application.
The first problem we will encounter is how to actually reach another user, for example to invite them to a call. The first and most obvious idea would be to use their phone number – it’s already in our contact book, and it uniquely identifies the user. Since the simplest ideas are usually the best, let’s stick to it.
We also want to personalize our profile info and display it in the app. We want our friends to see our latest photo, nickname, and, perhaps, some additional information such as an email address or an alternative phone number. This information cannot simply be deduced from the native contact book. It needs to be kept on some kind of server, which will run all the time and be able to provide the latest info to all involved parties. This time, the most obvious way to achieve this – i.e. keeping the info on a server which exposes it via an HTTP API – is not the best. The scenario would need to look like this:
The first disadvantage could be overcome by aggregating all changes in one request, so User B could ask for all changes since their last update. As a result, they would get one response with all the data they need. But this will not resolve the second problem. For that, they would need to perform periodic requests, for example one per minute, or even one per second.
The same applies to the connection status. We would like to display an indicator of the connection status for every contact in the app, and we want it to always reflect the current situation, without any delays. There are mechanisms for this that also inform the user as soon as the data is available. The server can send a push notification that tells the user’s app to perform a GET request. It sounds like overkill, though, since a push notification will launch the app even when it has been killed, e.g. when the user is not interested in this data at the moment. We want to be updated, but only while the user is using the app, to minimize unnecessary data and energy consumption.
Let’s use MQTT then. Its publish/subscribe architecture is exactly what we want. With MQTT, the scenario will look like this:
But what if User A updates the connection status to online, and after that their phone dies (or they kill the app)? The app will not be able to update the connection status topic to offline, which will leave other users with false information. Well, MQTT has a solution to this problem as well. It’s called Last Will and Testament. Last Will is a special setting sent to the broker by a client when it connects, which tells the broker to publish a predefined message to a given topic when the client disconnects ungracefully, that is, without a chance to properly end the connection and update the topics. In our case, the message, called a Testament, could be “false” sent to “user/{clientID}/isConnected”. Now, we have a complete solution for passing user information.
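To make the Last Will idea more tangible, here is a minimal sketch of the behaviour in plain Python. It is a toy, in-memory model and not a real broker or client library: `ToyBroker`, `connection_lost` and `simulate_phone_dying` are names made up for this illustration, and the topic “user/userA/isConnected” just follows the convention from the paragraph above.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple

Handler = Callable[[str, str], None]  # called with (topic, payload)


class ToyBroker:
    """Hypothetical in-memory stand-in for a broker (exact-match topics only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)
        self._wills: Dict[str, Tuple[str, str]] = {}

    def connect(self, client_id: str, will: Optional[Tuple[str, str]] = None) -> None:
        # The will (topic, payload) is handed to the broker at connect time.
        if will is not None:
            self._wills[client_id] = will

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: str) -> None:
        for handler in self._subscribers[topic]:
            handler(topic, payload)

    def disconnect(self, client_id: str) -> None:
        # Graceful disconnect: the will is dropped and nothing is published.
        self._wills.pop(client_id, None)

    def connection_lost(self, client_id: str) -> None:
        # Ungraceful disconnect: the broker publishes the will for the client.
        will = self._wills.pop(client_id, None)
        if will is not None:
            self.publish(*will)


def simulate_phone_dying() -> List[str]:
    """Return the status payloads User B sees while User A's phone dies."""
    broker = ToyBroker()
    status_topic = "user/userA/isConnected"
    seen: List[str] = []

    # User B watches User A's connection status.
    broker.subscribe(status_topic, lambda topic, payload: seen.append(payload))

    # User A connects with a will and announces being online.
    broker.connect("userA", will=(status_topic, "false"))
    broker.publish(status_topic, "true")

    # The phone dies, so User A never gets to publish "false" on its own.
    broker.connection_lost("userA")
    return seen
```

Calling `simulate_phone_dying()` returns `["true", "false"]`: the second message was never sent by User A, the broker published it on their behalf. With a real client library, the will would typically be configured once, before connecting.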
Let’s talk about the client ID, mentioned indirectly in the sections above. The ID needs to be unique for each user and inferable by all clients, which means every user needs to be able to generate or deduce a clientID for every contact they have in their contact book. We mentioned before that a phone number has both of these qualities. The problem with a phone number is that different users may save the same phone number in different styles, for example with or without the country prefix. This can be overcome if we normalize phone numbers to some standard, for example E.164 (where all phone numbers are prefixed and written without spaces, e.g. +48555555555). Although this would be enough for our case, let’s not forget that, unless the broker restricts access, topics are public, which means anyone can subscribe to them. We probably don’t want our users’ phone numbers to be openly passed to the broker. We need a mechanism to obfuscate them. Frankly, it’s pretty easy to do. We can use an MD5 hash of the phone number. Phone numbers will be used to create clientIDs, which can then be easily recreated by other users. Recovering a phone number from a clientID, on the other hand, takes deliberate effort, which gives us some basic protection. (Keep in mind that this is obfuscation rather than strong security: the space of valid phone numbers is small enough for the hashes to be brute-forced.) The mechanism of doing all that would look like this:
Another advantage of hashing the phone numbers into clientIDs is that no matter how long the original phone number is, the clientID will have the same length (depending on the hash function used). It makes reverse engineering harder, and, in addition, it gives us nice and predictable topic names.
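A rough sketch of that derivation in Python could look like the snippet below. The normalization is deliberately naive and only an assumption for illustration: it strips formatting characters, turns a leading “00” into “+”, and otherwise prepends a default country prefix (`DEFAULT_COUNTRY_CODE`, set here to +48 to match the example above). The function names are made up as well; a production app would more likely rely on a dedicated phone-number library.

```python
import hashlib
import re

# Assumption for this sketch: numbers saved without a prefix are local ones.
DEFAULT_COUNTRY_CODE = "+48"


def normalize_phone_number(raw: str, default_country_code: str = DEFAULT_COUNTRY_CODE) -> str:
    """Naive E.164-style normalization: a prefixed number with no separators."""
    cleaned = re.sub(r"[^\d+]", "", raw)  # drop spaces, dashes, brackets
    if cleaned.startswith("+"):
        return cleaned
    if cleaned.startswith("00"):
        # International dialing prefix written as 00 instead of +.
        return "+" + cleaned[2:]
    return default_country_code + cleaned


def client_id_for(raw_phone_number: str) -> str:
    """Derive a fixed-length clientID from a phone number using MD5."""
    normalized = normalize_phone_number(raw_phone_number)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def status_topic_for(raw_phone_number: str) -> str:
    """Build the connection status topic for a contact from the contact book."""
    return f"user/{client_id_for(raw_phone_number)}/isConnected"
```

With this in place, `client_id_for("555 555 555")`, `client_id_for("0048 555-555-555")` and `client_id_for("+48 555 555 555")` all produce the same 32-character hexadecimal string, so every user who has this contact ends up with the same topic names.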
With clientIDs, we can finally call each other. This can also be done via MQTT messages. To initialize the VoIP call, both sides must know the sessionID of the call being established. One side can send it easily by publishing an invite message on a special topic, “invite/{clientID}/{sessionID}”, where clientID is the ID of the recipient of the call. To make it work, the recipient needs to be subscribed to this topic to receive the info published on it. That does not mean a user needs to subscribe to every possible sessionID, which would add up to a million topics. They can use an MQTT feature called wildcard subscriptions. By subscribing to “invite/{clientID}/+” (or “invite/{clientID}/#” – learn more in the article mentioned at the beginning of this post), they will receive everything published to any topic that starts with “invite/{clientID}/” followed by any string (“+” matches exactly one topic level, while “#” matches any number of levels). Pretty cool, huh?
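If you are curious how such a filter is matched against a topic, here is a small Python sketch of the rule. In practice the broker does this for us; `topic_matches` is a hypothetical helper written only for illustration, and it assumes a well-formed filter (with “#”, if present, as the last level) and ignores special topics starting with “$”.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Check whether a topic matches a subscription filter with + or # wildcards."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    for index, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: matches everything from here on,
            # including the parent level itself.
            return True
        if index >= len(topic_levels):
            # The filter is longer than the topic.
            return False
        if level != "+" and level != topic_levels[index]:
            # A literal level has to match exactly; "+" accepts any single level.
            return False

    # Without "#", filter and topic must have the same number of levels.
    return len(filter_levels) == len(topic_levels)
```

For a made-up clientID “abc”, `topic_matches("invite/abc/+", "invite/abc/123456")` is `True`, `topic_matches("invite/abc/+", "invite/abc/123456/extra")` is `False`, and `topic_matches("invite/abc/#", "invite/abc/123456/extra")` is `True`. Since our invite topics have exactly one level after the clientID, “+” is the tighter fit.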
So we have a basic reliable video calling application. But let’s make it more interesting. Since MQTT is really popular in IoT solutions, let’s add more IoT to our app. Imagine that with our app you can call not only your human friends, but also robots. You heard me right: robots, machines, soulless creations. Let’s also assume a few things:
The scenario described above allows the owner of the robot to remotely control it from any place in the world via a video call. The video stream itself must be fast and reliable, and it must adjust itself to different network conditions. A VoIP protocol can do all these things for us, and we will not discuss them further here. But along with the video stream, we also want immediate, real-time feedback from the robot in response to our commands. This can be done with the use of, you guessed it, MQTT.
First of all, let’s discuss the first point. The clientID for the robot cannot be inferred by a user. Robots, in our case, do not have SIM cards – they are connected to the network in another way, e.g. Wi-Fi. We can resolve this issue by creating a pairing scenario. It can look like this:
As you can see, the pairing process is fast and simple. The calling scenario would be identical to the one for a call between regular users. The same applies to passing profile information. Still, we want to make our app interesting and allow the user to send commands to the robot. This can be done via MQTT topics.
Upon establishing a call between the user and the robot, both parties subscribe to command topics – the user subscribes to “session/{command}/response”, while the robot subscribes to “session/{command}/request”. During the call, the user wants the robot to go to another room (to check whether the apartment is empty and safe, or whether their dog is eating their shoes, I don’t know, the possibilities are endless...). The scenario will look as follows:
MQTT allowed us to create a simple and seamless mechanism for passing commands, without interfering with the VoIP call and without making costly HTTP requests. Moreover, it allowed us to inform the user in real time about the robot’s actions and any possible problems. For example, if the bedroom door was shut, the robot could publish { "destination" : "bedroom", "status" : "blocked" } and allow the user to change their command and respond properly.
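The robot’s side of this exchange can be sketched in a few lines of Python. Everything beyond the topic convention and the “blocked” payload is an assumption made for illustration: the command name “go”, the “reached” status, the `blocked_rooms` parameter and the function names are all hypothetical, and the actual publishing and subscribing is left to whichever client library you use.

```python
import json
from typing import Set


def request_topic(command: str) -> str:
    """Topic the robot subscribes to for a given command."""
    return f"session/{command}/request"


def response_topic(command: str) -> str:
    """Topic the user subscribes to for the robot's feedback."""
    return f"session/{command}/response"


def handle_go_request(payload: str, blocked_rooms: Set[str]) -> str:
    """Turn a request payload into the response payload the robot would publish.

    blocked_rooms stands in for the robot's sensors (hypothetical).
    """
    request = json.loads(payload)
    destination = request["destination"]
    # "reached" is a made-up status; only "blocked" appears in the text above.
    status = "blocked" if destination in blocked_rooms else "reached"
    return json.dumps({"destination": destination, "status": status})
```

In this sketch, the user would publish `{"destination": "bedroom"}` to `request_topic("go")`, that is “session/go/request”, and the robot would publish the result of `handle_go_request(...)` to `response_topic("go")`. With the bedroom door shut, the response decodes to the same destination and status as in the example above.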
This simple use case highlighted two of the biggest strengths of MQTT: real-time updates and the assurance of delivery. Obviously, our project would benefit from low energy and data usage as well – the batteries in our phones and robots are going to last longer. The presented scenario would be tricky if we had to rely on HTTP communication. With MQTT, it's almost seamless.
We need to emphasize one more thing: using MQTT is easy. The protocol is based on simplicity and reliability, so anyone can start using it in a matter of days, if not hours. Thanks to that, it's useful in the prototyping phase, where you can obtain the results faster than with other technologies. And even in the future, any modifications are dead simple. We can replace client libraries or brokers without the need to adjust the other side. In most cases, it feels like a plug’n’play solution.
So go on now and try to create something yourself! There are many free brokers and client libraries, which allow you to have fun with MQTT in the blink of an eye.