It was created in 1999 by Dr. Andy Stanford-Clark of IBM and Arlen Nipper of Arcom (now Eurotech). For such a young technology, it’s astonishing how much this project has grown. It’s been used by countless developers as well as by many of the big players on the market, including Facebook’s Messenger, Amazon IoT, Adafruit IO, Microsoft Azure IoT Hub and McAfee OpenDXL.
Previously known as “SCADA protocol”, “MQ Integrator SCADA Device Protocol” (MQIsdp) or “WebSphere MQTT” (WMQTT), the MQTT protocol has been standardized at OASIS (version 3.1.1 became an OASIS Standard in 2014). It’s also an ISO standard (ISO/IEC 20922).
How does it work
To really understand MQTT, it helps to compare it with a technology already familiar to most developers: HTTP. To get some information over HTTP, a party must send a request and wait for the response. So, for example, to make sure that the latest data available on the server is mirrored on the requester side, the requests must be repeated periodically. This is where MQTT shows its real strength. It uses a publish/subscribe architecture, which means that parties interested in certain updates subscribe to them at the MQTT broker. The MQTT broker is a service running on a server, which keeps track of all subscriptions. When someone publishes data on a certain topic, all subscribers to this topic instantly get a message with the new data. There is no need for re-requesting: as long as they are connected to the MQTT broker, they are up-to-date.
The MQTT broker is a central hub which dispatches all messages between publishers and subscribers. Only the broker knows about all subscribers and publishers; the clients are oblivious to each other and are only interested in the data. That enables developers to keep the architecture on their side simple and manageable. The only value known to all interested parties is the topic on which the data is being published. The topic itself is a plain-text string.
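The dispatching role described above can be illustrated with a minimal in-memory sketch. It is not a real MQTT broker (there is no network and no protocol involved), and the class name `TinyBroker` with its methods is purely hypothetical. It only shows that publishers and subscribers share nothing but the topic string.

```python
class TinyBroker:
    """Toy stand-in for an MQTT broker: exact-match topics, no networking."""

    def __init__(self):
        # topic string -> list of subscriber callbacks
        self._subscribers = {}

    def subscribe(self, topic, callback):
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, payload):
        # The publisher never learns who (if anyone) receives the message.
        for callback in list(self._subscribers.get(topic, [])):
            callback(topic, payload)


broker = TinyBroker()
received = []
broker.subscribe("temperature", lambda topic, payload: received.append((topic, payload)))

broker.publish("temperature", "21.5")  # delivered to the subscriber
broker.publish("humidity", "40")       # nobody subscribed, nothing happens
print(received)
```

The publisher and the subscriber never reference each other here. Both only talk to the broker object and agree on the string “temperature”.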
Enough theory, let’s show some examples
Let’s say that the MQTT protocol has been implemented in a temperature sensor. It takes a temperature reading every second. After launch, it connects to the MQTT broker using the broker’s URL. Then it starts to publish the readings on the “temperature” topic with the retain flag set. That means that each reading is kept by the broker until a new reading overwrites it. There are no subscribers at this point, so the broker keeps the data to itself.
A client interested in the current temperature launches his mobile application. It connects to the MQTT broker and subscribes to the “temperature” topic. The app instantly gets the latest reading because it was published as a retained message. From that point on, whenever the sensor publishes a reading, the app is notified via MQTT about the new value.
Another client launches her desktop application. The scenario repeats. Now both clients have the latest data.
The first client decides that he has learned what he wanted. He disconnects from the MQTT broker, which stops updates to his app. Now the broker only updates the second client. But then she also decides that she does not need this information anymore. She unsubscribes from the topic, which tells the broker that this particular client does not want to receive any new information on it. She stays connected to the broker itself, though. After that, she can decide to subscribe to a different topic, let’s say “air_pressure”. No sensor is publishing to this topic, so she does not get any updates from the broker. But as soon as some other sensor connects and does so, she will receive the information instantly.
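The whole scenario can be replayed with a small in-memory sketch. The `RetainingBroker` class below is a hypothetical toy, not a real broker or client library: it supports only exact topic names and treats a disconnect as dropping all of the client’s subscriptions.

```python
class RetainingBroker:
    """Toy broker: exact-match topics plus one retained message per topic."""

    def __init__(self):
        self.subscriptions = {}  # topic -> {client_id: callback}
        self.retained = {}       # topic -> last payload published with retain=True

    def subscribe(self, client_id, topic, callback):
        self.subscriptions.setdefault(topic, {})[client_id] = callback
        if topic in self.retained:
            # A new subscriber immediately receives the retained value.
            callback(topic, self.retained[topic])

    def unsubscribe(self, client_id, topic):
        self.subscriptions.get(topic, {}).pop(client_id, None)

    def disconnect(self, client_id):
        # Simplification: a disconnect drops every subscription of the client.
        for clients in self.subscriptions.values():
            clients.pop(client_id, None)

    def publish(self, topic, payload, retain=False):
        if retain:
            self.retained[topic] = payload
        for callback in list(self.subscriptions.get(topic, {}).values()):
            callback(topic, payload)


broker = RetainingBroker()
mobile, desktop = [], []

broker.publish("temperature", "21.0", retain=True)   # no subscribers yet
broker.subscribe("mobile", "temperature", lambda t, p: mobile.append(p))
broker.subscribe("desktop", "temperature", lambda t, p: desktop.append(p))
broker.publish("temperature", "21.4", retain=True)   # both apps are notified

broker.disconnect("mobile")                           # the first client leaves
broker.publish("temperature", "21.9", retain=True)   # only the desktop app is updated
broker.unsubscribe("desktop", "temperature")          # still connected, just not interested
broker.subscribe("desktop", "air_pressure", lambda t, p: desktop.append(p))
broker.publish("temperature", "22.3", retain=True)   # nobody is listening any more
broker.publish("air_pressure", "1013")                # the desktop app gets this one

print(mobile)   # ['21.0', '21.4']
print(desktop)  # ['21.0', '21.4', '21.9', '1013']
```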
The topics are completely independent: every subscriber can subscribe to any number of topics, just as any publisher can publish to as many topics as it wants. That gives a developer a powerful tool for creating large but lightweight, fast and reliable architectures.
Topics are identifiers used by publishers, subscribers and the broker to decide how to route the data being published. They are simple strings, but with hidden qualities.
First of all, they can be hierarchical, which means they can have levels. For example, the previously mentioned topics can be changed to follow a design like “sensor/temperature/reading” and “sensor/air_pressure/reading”. This structure, with levels separated by slashes, is recognized by the broker and parsed accordingly. That enables a subscriber to do an amazingly simple yet beautiful thing: to subscribe to wildcard topics. A wildcard topic does not point to a specific topic but informs the broker that the subscriber is interested in all topics which match the pattern. To create a single-level wildcard, the level which needs to be generic is replaced with a “+”. So if someone wants to get readings from all sensors, he can subscribe to “sensor/+/reading” and will get everything published to both of the previously mentioned topics.
The broker also recognizes multilevel wildcard subscriptions, which are created with a “#”. Let’s say that we have another topic, “sensor/temperature/specification”, which holds technical details of a certain sensor. To subscribe to all of these topics at once, the subscriber can use the “sensor/#” topic. And that’s it! A thing to remember is that “#” can be used only as the last level of the topic, so it cannot be used, for example, as “#/reading” to subscribe to all readings from any sensor, device or service. Subscribing to the topic “#” alone matches every topic on the broker, but it should not be used in client applications. It’s just a handy tool for debugging the broker on the server side.
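The matching rules for “+” and “#” fit into a few lines. The function below is a sketch of how a broker might compare a subscription against a published topic. The name `topic_matches` is hypothetical, and the sketch ignores the special handling of topics starting with “$” that real brokers apply.

```python
def topic_matches(topic_filter, topic):
    """Return True if the published `topic` matches the subscription `topic_filter`."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    for index, level in enumerate(filter_levels):
        if level == "#":
            # Multilevel wildcard: only valid as the last level of the filter.
            return index == len(filter_levels) - 1
        if index >= len(topic_levels):
            return False  # the filter is longer than the topic
        if level != "+" and level != topic_levels[index]:
            return False  # a literal level that differs
    # No "#" was found, so both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


print(topic_matches("sensor/+/reading", "sensor/temperature/reading"))   # True
print(topic_matches("sensor/+/reading", "sensor/temperature/specification"))  # False
print(topic_matches("sensor/#", "sensor/temperature/specification"))     # True
print(topic_matches("#/reading", "sensor/temperature/reading"))          # False
```

The last call returns False because “#” is not in the final position, so this sketch simply treats the filter as one that never matches.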
All messages have additional settings for ensuring their delivery and for creating different scenarios. One of them is the previously mentioned retain flag. If it is set to true, every new subscriber gets the last saved message right away, even if there have been no updates for quite some time. If it’s false, new subscribers only get messages published after they have subscribed.
Another setting is Quality of Service. It can take three values:
- 0 - at most once - best-effort delivery, which means that in most cases the subscriber will get the message published on the topic. However, in case of a network interruption, a reconnect or a poor network in general, some messages may not reach the subscriber. It provides the same guarantee as the underlying TCP protocol, a behavior often called “fire & forget”.
- 1 - at least once - guarantees that the message is delivered. The sender sends the message and waits some time for the acknowledgment. If there isn’t any, it sends the message again. That means that there are edge cases in which the message is received more than once, which must be handled properly by the receiver.
- 2 - exactly once - guarantees that a message is delivered exactly once. It’s the safest, but also the slowest of these options. It uses an additional exchange of messages to ensure that the message was received AND processed by the receiver. It’s worth mentioning that many brokers implement this option just like 1 - at least once. That does not conform to the MQTT specification, but it’s enough for most use cases.
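The “at least once” edge case is easy to simulate. In the sketch below the sender retries until an acknowledgment gets through, and the receiver filters out duplicates. All names are hypothetical, and `message_id` is an assumed application-level identifier carried with the message, not the MQTT packet identifier (which gets reused).

```python
class Receiver:
    """Receiver that tolerates duplicates caused by retransmissions."""

    def __init__(self):
        self.seen_ids = set()
        self.processed = []
        self.deliveries = 0

    def on_message(self, message_id, payload):
        self.deliveries += 1
        if message_id not in self.seen_ids:  # process each message only once
            self.seen_ids.add(message_id)
            self.processed.append(payload)
        return message_id  # stands in for the acknowledgment


def send_at_least_once(receiver, message_id, payload, lost_acks=0, max_attempts=5):
    """Resend until an acknowledgment gets through; return the number of attempts.

    `lost_acks` simulates how many acknowledgments vanish on the network.
    """
    for attempt in range(1, max_attempts + 1):
        ack = receiver.on_message(message_id, payload)
        if attempt <= lost_acks:
            continue  # the ack was lost, so the sender times out and retries
        if ack == message_id:
            return attempt
    raise TimeoutError("no acknowledgment received")


receiver = Receiver()
attempts = send_at_least_once(receiver, "reading-1", "21.5", lost_acks=1)
print(attempts, receiver.deliveries, receiver.processed)  # 2 2 ['21.5']
```

The message arrived twice, but thanks to the duplicate check it was processed only once.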
It’s important to know that QoS can differ between the publisher and the subscriber. In that case, the broker receives the message with the QoS chosen by the publisher and forwards it with the QoS chosen by the subscriber. That means that even if the publisher sends the message to the broker with QoS = 2, a subscriber who subscribed to this topic with QoS = 1 may still receive the message more than once.
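According to the MQTT 3.1.1 specification, the QoS used for delivery to a subscriber is the lower of the two values. A tiny sketch (the function name `effective_qos` is hypothetical):

```python
def effective_qos(published_qos, subscribed_qos):
    """QoS the broker uses when forwarding a message to one subscriber."""
    for qos in (published_qos, subscribed_qos):
        if qos not in (0, 1, 2):
            raise ValueError(f"invalid QoS level: {qos}")
    return min(published_qos, subscribed_qos)


print(effective_qos(2, 1))  # 1, so duplicates are possible on the subscriber side
print(effective_qos(0, 2))  # 0, the subscriber cannot upgrade a fire & forget message
```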
The last thing about topics is the possibility to invalidate the retained message stored for a certain topic. It can be done by publishing a retained message with a zero-byte payload on that topic. It tells the broker that it can safely remove the data stored for this topic.
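On the broker side, this rule can be sketched as a small update function over a dictionary of retained messages. The function name and the dictionary layout are assumptions made for illustration only.

```python
def apply_publish(retained, topic, payload, retain):
    """Update the retained-message store for one incoming publish."""
    if not retain:
        return  # non-retained messages never touch the store
    if payload == b"":
        retained.pop(topic, None)  # empty retained payload clears the topic
    else:
        retained[topic] = payload  # otherwise the newest value wins


retained = {}
apply_publish(retained, "temperature", b"21.5", retain=True)
print(retained)  # {'temperature': b'21.5'}
apply_publish(retained, "temperature", b"", retain=True)
print(retained)  # {}
```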
Publishers & subscribers
The only part of the whole MQTT infrastructure which must always be running is the MQTT broker. That means that publishers and subscribers don’t need to run at the same time. The broker stores and passes on the messages. A publish or receive event on the client side does not block the broker in any way, which makes the broker a good candidate for parallelization, so it can process a large number of requests at once.
The process of a client connecting to the broker is simple and straightforward. The client sends a CONNECT message, to which the broker responds with CONNACK. From then on, the broker keeps the connection open until the client disconnects.
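To show how small this handshake is on the wire, here is a sketch that builds a minimal MQTT 3.1.1 CONNECT packet and parses a CONNACK reply. It follows the packet layout of the 3.1.1 specification but leaves out the Last Will, username and password fields. The helper names are hypothetical, and no network connection is opened.

```python
import struct


def encode_remaining_length(length):
    """Encode MQTT's variable-length 'remaining length' field (7 bits per byte)."""
    encoded = bytearray()
    while True:
        length, digit = divmod(length, 128)
        if length > 0:
            digit |= 0x80  # continuation bit: more bytes follow
        encoded.append(digit)
        if length == 0:
            return bytes(encoded)


def encode_string(text):
    """MQTT strings are UTF-8 bytes prefixed with a 2-byte big-endian length."""
    data = text.encode("utf-8")
    return struct.pack("!H", len(data)) + data


def build_connect(client_id, keepalive=60, clean_session=True):
    """Build a minimal CONNECT packet (no will, no credentials)."""
    flags = 0x02 if clean_session else 0x00
    variable_header = (
        encode_string("MQTT")             # protocol name
        + bytes([0x04, flags])            # protocol level 4 = MQTT 3.1.1, connect flags
        + struct.pack("!H", keepalive)    # keep-alive interval in seconds
    )
    body = variable_header + encode_string(client_id)
    return bytes([0x10]) + encode_remaining_length(len(body)) + body


def parse_connack(packet):
    """Return (session_present, return_code) from a 4-byte CONNACK packet."""
    if len(packet) != 4 or packet[0] != 0x20 or packet[1] != 0x02:
        raise ValueError("not a CONNACK packet")
    return bool(packet[2] & 0x01), packet[3]


connect_packet = build_connect("abc")
print(len(connect_packet))                      # 17 bytes for the whole CONNECT
print(parse_connack(bytes([0x20, 0x02, 0, 0])))  # (False, 0): connection accepted
```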
The client can register a special message with the broker when connecting: the Last Will. It’s a message which the broker publishes on the client’s behalf, to all subscribers of the topic chosen for the will, when the client disconnects ungracefully, for example when the application is killed without a chance to disconnect properly. It’s often used together with the retain flag to store the connection status of a certain client.
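A simplified model of this behavior could look like the sketch below. `WillBroker` and the “status/sensor-1” topic are hypothetical. The broker stores the will at connect time and publishes it only when the connection is lost without a proper disconnect.

```python
class WillBroker:
    """Toy broker that keeps a Last Will per client and a retained store."""

    def __init__(self):
        self.wills = {}     # client_id -> (topic, payload, retain)
        self.retained = {}  # topic -> last retained payload
        self.published = [] # every message that went through the broker

    def connect(self, client_id, will=None):
        if will is not None:
            self.wills[client_id] = will

    def publish(self, topic, payload, retain=False):
        if retain:
            self.retained[topic] = payload
        self.published.append((topic, payload))

    def disconnect(self, client_id):
        # Graceful disconnect: the will is discarded, not published.
        self.wills.pop(client_id, None)

    def connection_lost(self, client_id):
        # Ungraceful loss: the broker publishes the will on the client's behalf.
        will = self.wills.pop(client_id, None)
        if will is not None:
            topic, payload, retain = will
            self.publish(topic, payload, retain)


broker = WillBroker()
broker.connect("sensor-1", will=("status/sensor-1", "offline", True))
broker.publish("status/sensor-1", "online", retain=True)

broker.connection_lost("sensor-1")          # e.g. the process was killed
print(broker.retained["status/sensor-1"])   # offline
```

Because both status messages are retained, any client subscribing to “status/sensor-1” later immediately learns whether the sensor is online.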
Clean vs Persistent Session
The client can connect to the broker with the clean session setting set to true or false. The first option means that when the client disconnects, all of its subscriptions are invalidated and it needs to subscribe again after reconnecting. The advantage is a lower number of messages kept on the broker side, because they can be invalidated as well. It can also be important for architectures in which a predictable and stable environment must be ensured at every launch of the application.
The other option, the persistent session, means that the broker keeps the subscriptions active and sends the messages for updated topics as soon as the client reconnects, without the need to resubscribe. It reduces the time needed to connect to the broker, especially when a client is subscribed to a large number of topics.
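The difference between the two settings can be sketched with a toy broker that tracks per-client session state. `SessionBroker` is hypothetical and simplified: it queues every message for offline clients with a persistent session, whereas real brokers generally do so only for QoS 1 and 2 messages.

```python
class SessionBroker:
    """Toy broker that tracks per-client sessions (exact-match topics only)."""

    def __init__(self):
        self.sessions = {}  # client_id -> {"topics": set, "pending": list, "clean": bool}
        self.inboxes = {}   # client_id -> messages delivered to a connected client

    def connect(self, client_id, clean_session):
        if clean_session:
            self.sessions.pop(client_id, None)  # start from scratch
        session = self.sessions.setdefault(client_id, {"topics": set(), "pending": []})
        session["clean"] = clean_session
        # Messages queued while the client was offline are delivered first.
        inbox = self.inboxes[client_id] = session["pending"]
        session["pending"] = []
        return inbox

    def subscribe(self, client_id, topic):
        self.sessions[client_id]["topics"].add(topic)

    def disconnect(self, client_id):
        del self.inboxes[client_id]
        if self.sessions[client_id]["clean"]:
            del self.sessions[client_id]  # subscriptions are forgotten

    def publish(self, topic, payload):
        for client_id, session in self.sessions.items():
            if topic not in session["topics"]:
                continue
            if client_id in self.inboxes:
                self.inboxes[client_id].append((topic, payload))
            else:
                session["pending"].append((topic, payload))


broker = SessionBroker()
for name, clean in (("persistent-app", False), ("clean-app", True)):
    broker.connect(name, clean_session=clean)
    broker.subscribe(name, "temperature")
    broker.disconnect(name)

broker.publish("temperature", "21.5")  # both clients are offline

persistent_inbox = broker.connect("persistent-app", clean_session=False)
clean_inbox = broker.connect("clean-app", clean_session=True)
print(persistent_inbox)  # [('temperature', '21.5')]
print(clean_inbox)       # []
```

The client with the clean session has to subscribe again before it receives anything, while the persistent one picks up where it left off.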
The MQTT broker
The broker is typically a service running on a server. There are many open source brokers available, which can differ in implementation and functionality. Although the golden rule is for a broker to conform fully to the MQTT standard, many of them are lacking in some parts. It’s crucial to first determine which qualities the project needs and then to find a broker which implements all of them. But even if the time comes to move to another broker, the migration process is mostly painless due to the nature of MQTT itself. The only communication between a client and a broker is via the messages sent on topics, and their structure is uniform across broker implementations.
How to choose the perfect broker? Let’s start with listing the features we want:
- Basics. All brokers make it possible to publish to a topic and to subscribe to it, and most of them support wildcard subscriptions.
- Bulk subscriptions. When a client subscribes to a large number of similar topics, for example to the online statuses of all contacts in a phone contact book, bulk subscriptions can be helpful. Instead of performing serial subscriptions, one after the other, a single bulk subscription can be created which contains all of the topics the client wants. It is then sent as one transaction, which significantly reduces network usage and processing time.
- SSL support. Although it can be important for certain solutions to have the whole communication encrypted, it’s worth remembering that doing so adds significant network overhead. A more lightweight solution is to encrypt and decrypt the data (like username and password) only on the publisher/subscriber side and to use an unencrypted MQTT connection. That way MQTT stays lightweight, which is its biggest quality.
- QoS. As mentioned before, some brokers do not fully implement QoS = 2, which means that even if the topic’s Quality of Service is set to “exactly once”, they can send the message more than once. It’s not a problem in most cases, but if it’s crucial for the solution being developed, it has to be tested.
- Clean session. The behavior of the broker differs greatly between the two possible settings of the clean session. Not all brokers process them as stated in the MQTT specification.
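As an illustration of the bulk subscriptions point from the list above, the sketch below builds MQTT 3.1.1 SUBSCRIBE packets and compares one bulk packet against several single-topic ones. The helper names and the “status/...” topics are made up for the example, and nothing is sent over the network.

```python
import struct


def encode_remaining_length(length):
    """Encode MQTT's variable-length 'remaining length' field (7 bits per byte)."""
    encoded = bytearray()
    while True:
        length, digit = divmod(length, 128)
        if length > 0:
            digit |= 0x80  # continuation bit: more bytes follow
        encoded.append(digit)
        if length == 0:
            return bytes(encoded)


def build_subscribe(packet_id, subscriptions):
    """Build one SUBSCRIBE packet from a list of (topic_filter, qos) pairs."""
    body = struct.pack("!H", packet_id)
    for topic_filter, qos in subscriptions:
        data = topic_filter.encode("utf-8")
        body += struct.pack("!H", len(data)) + data + bytes([qos])
    return bytes([0x82]) + encode_remaining_length(len(body)) + body


contacts = ["alice", "bob", "carol"]
filters = [(f"status/{name}", 1) for name in contacts]

bulk = build_subscribe(1, filters)
separate = [build_subscribe(i, [item]) for i, item in enumerate(filters, start=1)]

print(len(bulk))                        # 47 bytes in a single packet
print(sum(len(p) for p in separate))    # 55 bytes spread over three packets
```

The byte savings are modest, but each SUBSCRIBE packet is also answered by its own acknowledgment from the broker, so the bulk variant saves round trips as the number of topics grows.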
It’s important to first test the chosen broker against all possible scenarios. It’s also helpful to check its open and closed issues on GitHub to prevent future disappointments. Once the broker is chosen and tested, the last step is to implement the publisher and the subscriber. They can live in the same application: there is no problem with publishing to a certain topic and subscribing to it in the same scope!
There is much more to discuss when it comes to the MQTT protocol. The purpose of this post is just to give a quick overview of it, with special consideration for the differences between broker implementations. The possible solutions which can be delivered with the use of MQTT are endless. Have fun!