Deep Learning Frameworks Comparison – TensorFlow, Keras, MXNet and More

Updated Jun 15, 2022 • 18 min read

Many businesses want to use AI to scale up or to get their start-up off the ground. If you are one of them, it is crucial to realize one thing: the technology you choose must be paired with an adequate deep learning framework, because each framework serves a different purpose. Finding the right fit is essential for smooth and fast business development, as well as for successful deployment.

The following list of deep learning frameworks might come in handy during the process of selecting the right one for the particular challenges that you’re facing. Compare the pros and cons of different solutions, check their limitations, and learn about best use cases for each solution!

1. TensorFlow

Created by Google and written in C++ and Python, TensorFlow is widely regarded as one of the best open-source libraries for numerical computation. Its adoption by giants like DeepMind, Uber, Airbnb, and Dropbox speaks for its quality.

TensorFlow is good for advanced projects, such as creating multilayer neural networks. It’s used in voice/image recognition and text-based apps (like Google Translate).

Of course, experts have considered both its pros...

It has a lot of documentation and guidelines;
It offers monitoring and visualization of model training (TensorBoard);
It’s backed by a large community of devs and tech companies;
It provides model serving;
It supports distributed training;
TensorFlow Lite enables on-device inference with low latency for mobile devices;
TensorFlow.js enables deploying models in JavaScript environments, both in the frontend and in a Node.js backend. It also supports defining models in JavaScript and training them directly in the browser using a Keras-like API.

...and cons:

It performs poorly in speed benchmarks compared with, for example, CNTK and MXNet;
It has a higher entry threshold for beginners than PyTorch or Keras. Plain TensorFlow is pretty low-level and requires a lot of boilerplate code;
The default “define and run” mode of TensorFlow 1.x makes debugging very difficult.
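
Why “define and run” is harder to debug can be illustrated with a toy sketch in plain Python. The Placeholder and Op classes below are hypothetical and are not TensorFlow APIs; they only mimic the idea of describing a computation first and executing it later, next to an ordinary function that runs line by line.

```python
import operator


# Hypothetical toy classes, not TensorFlow APIs: they only mimic deferred execution.
class Placeholder:
    """A named input whose value is supplied later, at run time."""

    def __init__(self, name):
        self.name = name

    def run(self, feed):
        return feed[self.name]


class Op:
    """A recorded operation; building it computes nothing."""

    def __init__(self, fn, left, right):
        self.fn, self.left, self.right = fn, left, right

    def run(self, feed):
        # Values (and errors) only appear here, far from the line that built the node.
        return self.fn(self.left.run(feed), self.right.run(feed))


# "Define and run": first describe the whole computation, then execute it.
x, y = Placeholder("x"), Placeholder("y")
graph = Op(operator.truediv, Op(operator.add, x, y), y)  # no arithmetic happens yet
result_deferred = graph.run({"x": 6.0, "y": 2.0})


# "Define by run": each line executes immediately, so print() or pdb can inspect it.
def eager(x, y):
    total = x + y  # a real number exists right here
    return total / y


result_eager = eager(6.0, 2.0)
print(result_deferred, result_eager)
```

Both paths compute the same value, but in the deferred version a mistake such as dividing by zero only surfaces inside run(), not on the line where the division was written.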

There is also one significant limitation: the only fully supported language is Python.

Changes in TensorFlow 2.0

TensorFlow 2.0, the next major version of the framework, was released in September 2019.
It brings a bunch of exciting features, such as:

Support for the Keras framework: it is possible to use Keras inside TensorFlow, which makes it easy to build new machine learning models.
Easier debugging of your graphs and networks: TensorFlow 2.0 runs with eager execution by default for ease of use and smooth debugging.
Robust model deployment in production on any platform.
Powerful experimentation for research.
Simplifying the API by cleaning up deprecated APIs and reducing duplication.
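
A minimal sketch of the first two points, assuming TensorFlow 2.x is installed: operations return concrete values right away, and Keras is reachable as tf.keras. The tensor values and layer sizes are arbitrary choices made for illustration.

```python
import tensorflow as tf

# Eager execution: operations return concrete values immediately.
x = tf.constant([[1.0, 2.0]])
doubled = x * 2.0
print(doubled.numpy())  # a plain NumPy array, no session required

# Keras is available as tf.keras; the layer sizes here are arbitrary.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(1),
])
prediction = model(x)  # calling the model also runs eagerly
print(prediction.shape)
```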

2. PyTorch

PyTorch is the Python successor to the Torch library, which was written in Lua, and a big competitor to TensorFlow. It was developed by Facebook and is used by Twitter, Salesforce, the University of Oxford, and many others.

PyTorch is mainly used to train deep learning models quickly and effectively, so it’s the framework of choice for a large number of researchers.

It has some significant advantages:

The modeling process is simple and transparent thanks to the framework’s architectural style;
The default define-by-run mode is more like traditional programming, and you can use common debugging tools such as pdb, ipdb, or the PyCharm debugger;
It has declarative data parallelism;
It features a lot of pretrained models and modular parts that are ready and easy to combine;
It supports distributed training;
It has been production-ready since version 1.0.
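
The define-by-run point can be sketched in a few lines, assuming PyTorch is installed. The forward function, tensor shapes, and values below are made up for illustration; what matters is that an ordinary Python if statement shapes the computation on every call, and that a breakpoint could be placed on any line.

```python
import torch


def forward(x, w):
    h = x @ w
    # Ordinary Python control flow decides which operations run on this call.
    if h.sum() > 0:
        h = torch.relu(h)
    else:
        h = torch.tanh(h)
    return h.sum()


x = torch.ones(2, 3)
w = torch.full((3, 1), 0.5, requires_grad=True)

loss = forward(x, w)  # the graph is recorded while this line executes
loss.backward()       # gradients follow the branch that was actually taken
print(loss.item(), w.grad)
```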

The first stable version, 1.0, transforms PyTorch into a mature and production-ready tool.

New features and improvements

The release brings:

Model serving with three strategies: direct embedding, model microservices, and model servers;
Official support for TensorBoard;
Portable development improvements: JIT compiler tools and a C++ frontend.
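
As a rough sketch of what the JIT tooling looks like in practice, assuming PyTorch 1.0 or later: torch.jit.trace records the operations a model performs on an example input and produces a TorchScript module that can be serialized and reloaded without the original Python class. The tiny model and the in-memory buffer are illustrative choices, not part of any official recipe.

```python
import io

import torch

# A tiny model; the layer sizes are arbitrary.
model = torch.nn.Sequential(torch.nn.Linear(3, 2), torch.nn.ReLU())
model.eval()
example = torch.ones(1, 3)

# Trace the model: run it once and record the operations as TorchScript.
traced = torch.jit.trace(model, example)

# Serialize to an in-memory buffer (a file path works as well) and load it back.
buffer = io.BytesIO()
torch.jit.save(traced, buffer)
buffer.seek(0)
loaded = torch.jit.load(buffer)

print(loaded(example))
```

A module saved this way is the kind of artifact the C++ frontend is meant to consume, which is what makes the development workflow portable.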

3. Keras

This is a minimalistic Python-based library that can be run on top of TensorFlow, Theano, or CNTK. It was developed by a Google engineer, Francois Chollet, in order to facilitate rapid experimentation. It supports a wide range of neural network layers such as convolutional layers, recurrent layers, or dense layers.

One can make good use of it in areas of translation, image recognition, speech recognition, and so on.
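
To give a feel for how these layers combine, here is a minimal sketch that assumes the Keras bundled with TensorFlow 2.x. The 28x28 grayscale input and the 10 output classes are arbitrary choices for illustration.

```python
from tensorflow import keras

# A small image classifier built from convolutional, pooling, and dense layers.
model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    keras.layers.Conv2D(8, kernel_size=3, activation="relu"),
    keras.layers.MaxPooling2D(pool_size=2),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation="softmax"),
])

# One call wires up the optimizer, the loss, and the metrics.
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
model.summary()
```

From here, training would be a single model.fit call on your own data.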

The advantages...

Prototyping is really fast and easy;
It’s lightweight in terms of building DL models with a lot of layers;
It features fully-configurable modules;
It has a simple and intuitive interface – fantastic for newbies;
It has built-in support for training on multiple GPUs;
It can be turned into TensorFlow estimators and trained on clusters of GPUs on Google Cloud;
It can be run on Spark;
It supports NVIDIA GPUs, Google TPUs, and OpenCL-enabled GPUs such as AMD.

...easily outweigh its small disadvantages:

It might be too high-level and not always easy to customize;
It is constrained to TensorFlow, CNTK, and Theano backends.

It also doesn’t provide as many functionalities as TensorFlow, and gives you less control over the network, so these could be serious limitations if you plan to build a special type of DL model.

The Keras interface format has become a standard in the deep learning development world. That is why, as mentioned before, it is possible to use Keras as a module of TensorFlow. It makes development easier and reduces the differences between these two frameworks. It also combines the advantages of using each of them.

4. MXNet

This is a DL framework created by Apache, which supports a plethora of languages, like Python, Julia, C++, R, or JavaScript. It’s been adopted by Microsoft, Intel, and Amazon Web Services.

The MXNet framework is known for its great scalability, so it’s used by large companies mainly for speech and handwriting recognition, NLP, and forecasting.

Some of the main pros...

It’s quite fast, flexible, and efficient in terms of running DL algorithms;
It features advanced GPU support, including multiple GPU mode;
It can be run on any device;
It has a high-performance imperative API;
It offers easy model serving;
It’s highly scalable;
It provides rich support for many programming languages, such as Python, R, Scala, JavaScript, and C++, among others.

...and the cons of MXNet:

It has a much smaller community behind it compared with TensorFlow;
It’s not so popular among the research community.

So, MXNet is a good framework for big industrial projects, but since it is still pretty new, there’s a chance that you won’t receive support exactly when you need it – keep that in mind.

5. CNTK

This is now called The Microsoft Cognitive Toolkit – an open-source DL framework created to deal with big datasets and to support Python, C++, C#, and Java.

CNTK facilitates really efficient training for voice, handwriting, and image recognition, and supports both CNNs and RNNs. It is used in Skype, Xbox and Cortana.

As always, experts have considered both its advantages...

It delivers good performance and scalability;
It features a lot of highly optimized components;
It offers support for Apache Spark;
It’s very efficient in terms of resource usage;
It supports simple integration with Azure Cloud.

...and one disadvantage:

Limited community support.

The next framework is a great example of how much variance there is in the world of machine learning frameworks:

6. Caffe and Caffe2

Caffe is a framework implemented in C++ that has a useful Python interface. It supports CNNs and feedforward networks, and is good for training models (without writing any additional lines of code), image processing, and for perfecting existing networks. However… it’s sometimes poorly documented, and difficult to compile. There is no sign of any bigger company deploying Caffe right now.

But here comes Caffe2 – introduced by Facebook in 2017, a natural successor to the old Caffe, built for mobile and large-scale deployments in production environments. At Facebook, it’s known as “the production-ready platform, (...) shipping to more than 1 billion phones spanning eight generations of iPhones and six generations of Android CPU architectures.”

In May 2018, Caffe2 was merged into PyTorch for the 1.0 stable version. The two engines joined forces, so the pros below can now be considered part of PyTorch. Still, it is worth looking at what made Caffe2 attractive.

The framework is praised for several reasons:

It offers pre-trained models for building demo apps;
It’s fast, scalable, and lightweight;
It works well with other frameworks, like PyTorch, into which it has since been merged;
It has server optimized inference.

Those were the most famous deep learning frameworks. The ones below are less popular, but still worth considering.

7. Deeplearning4j

If your core programming language is Java, you should definitely take a closer look at DL4J. It’s a commercial-grade, open-source framework written mainly for Java and Scala, offering massive support for different types of neural networks (like CNN, RNN, RNTN, or LSTM).

It’s a great framework of choice, with a lot of potential in areas of image recognition, natural language processing, fraud detection, and text mining. Plus:

It’s robust, flexible and effective;
It can process huge amounts of data without sacrificing speed;
It works with Apache Hadoop and Spark, on top of distributed CPUs or GPUs;
The documentation is really good;
It has a community version and an enterprise version.

Surprisingly, when talking about DL4J, experts do not focus on any particular drawback of the framework as much as they do on the general cons of using Java for machine learning. Because Java is not very popular in machine learning projects, the framework itself cannot rely on growing codebases. As a result, development costs for your project may be much higher, significantly slowing down your business…

8. Chainer

Another Python-based DL framework, supported by giants like IBM, Intel, Nvidia, and AWS. It can be run on multiple GPUs with little effort.

Chainer is leveraged mainly for speech recognition, machine translation, and sentiment analysis. It supports various network architectures, like CNNs, feed-forward nets, and RNNs, and has some significant advantages over its competitors:

It’s much faster than other leading Python frameworks;
It’s super flexible and intuitive;
Existing networks can be modified at runtime.

On the other hand:

It’s more difficult to debug;
The community is relatively small.

As other Python-oriented frameworks are much more popular, you may not receive as much help with Chainer as you would with more popular frameworks, such as TF or PyTorch.

Wrapping it all up… Which deep learning framework to use?

Choosing the perfect framework for a DL project can be a tough nut to crack. So when deciding which deep learning framework is best, you have to take several factors into consideration: