Welcome to the 3rd edition of MLguru – a bi-weekly update on the hottest news from the Machine Learning world.
Revisiting ImageNet Pre-training
Facebook AI Research (FAIR) published a new paper rethinking ImageNet pre-training for convolutional neural networks. The researchers found no fundamental obstacles to training from scratch, provided that:
Normalization techniques are used appropriately for optimization,
The model is trained sufficiently long to compensate for the lack of pre-training.
Beyond that, the paper reports that:
Pre-training speeds up convergence, especially in the early phase, but training from scratch catches up later, and its total duration is comparable to the combined time of pre-training and fine-tuning,
Pre-training does not necessarily provide better regularization,
Pre-training shows no benefit when the target tasks/metrics are more sensitive to spatially well-localized predictions (e.g. high Jaccard overlap thresholds in detection).
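The "Jaccard overlap" mentioned above is the intersection-over-union (IoU) between a predicted box and a ground-truth box. A minimal sketch, assuming axis-aligned boxes given as `(x1, y1, x2, y2)` (the box format here is our illustrative assumption, not the paper's):

```python
def iou(box_a, box_b):
    """Jaccard overlap (IoU) of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Detection benchmarks that require a high IoU threshold (e.g. 0.75 instead of 0.5) reward precise localization, and this is exactly the regime where the paper finds pre-training stops helping.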
The authors also emphasize that they do not wish to deviate from the pursuit of universal representations. On the contrary, they want to encourage the community to rethink more carefully when reusing pre-trained features is actually necessary.
Training in a Coffee Break - ResNet-50 Trained in 224 Seconds
Researchers from Sony have trained ResNet-50 on the ImageNet dataset in 224 seconds without significant loss of accuracy, using 2,176 Tesla V100 GPUs. Scaling distributed training of deep neural networks to large GPU clusters is difficult because large mini-batch training is unstable. To tackle this problem, the team used the 2D-Torus all-reduce technique, which arranges GPUs in a logical 2D grid and performs a series of collective operations along its two axes. Their implementation was written in Neural Network Libraries (NNL).
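The data flow of a 2D-Torus all-reduce can be illustrated with a toy, single-process simulation: reduce-scatter along each row, all-reduce along each column, then all-gather along each row. This is only a sketch of the idea, not Sony's NNL implementation; the grid shape and chunking below are illustrative assumptions:

```python
import numpy as np

def two_d_torus_allreduce(grid):
    """Toy simulation of a 2D-Torus all-reduce on a rows x cols grid of "GPUs".

    grid[r][c] is the gradient vector held by the GPU at row r, column c.
    Returns a grid in which every GPU holds the global elementwise sum.
    """
    rows, cols = len(grid), len(grid[0])
    n = grid[0][0].shape[0]
    assert n % cols == 0, "vector length must split evenly across a row"
    chunk = n // cols

    # Step 1: reduce-scatter within each row -- afterwards GPU (r, c)
    # holds only chunk c of its row's elementwise sum.
    row_sums = [np.sum(grid[r], axis=0) for r in range(rows)]
    scattered = [[row_sums[r][c * chunk:(c + 1) * chunk] for c in range(cols)]
                 for r in range(rows)]

    # Step 2: all-reduce within each column -- chunk c is summed over all
    # rows, so every GPU in column c now holds chunk c of the *global* sum.
    col_sums = [np.sum([scattered[r][c] for r in range(rows)], axis=0)
                for c in range(cols)]

    # Step 3: all-gather within each row -- every GPU assembles the full
    # globally reduced vector from the per-column chunks.
    full = np.concatenate(col_sums)
    return [[full.copy() for _ in range(cols)] for _ in range(rows)]
```

On a real cluster each step maps to ring collectives over row or column sub-groups; the intent of splitting the reduction this way is to keep each communication ring short, rather than running one long ring over all 2,176 GPUs.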
Netguru at PyData 2018
Last week, our Machine Learning engineers participated in PyData 2018, one of the biggest ML conferences in the region. Netguru sponsored the event, and Matthew Opala, our Machine Learning Tech Lead, gave a talk titled “Can you trust neural networks?” on model interpretability and AI fairness. Robert Kostrzewski, Netguru’s Senior Ruby on Rails Developer, also spoke at PyData, about tackling imbalanced datasets.
Other interesting presentations:
"PyTorch 1.0: now and in the future" by Adam Paszke,
"Uncertainty estimation and Bayesian Neural Networks" by Marcin Możejko.
fastMRI Dataset - Largest Ever Repository of Raw MRI Data
The Center for Advanced Imaging Innovation and Research (CAI2R), in the Department of Radiology at NYU School of Medicine and NYU Langone Health, is partnering with Facebook AI Research (FAIR) on fastMRI – a collaborative research project to investigate the use of AI to make MRI scans up to 10X faster.
The collaboration aims to provide open-source AI models, baselines, and evaluation metrics.
Germany Wants to Close the AI Gap
Germany wants to close the gap in AI development by investing €3 billion by 2025. The figure is not particularly impressive, though, compared with Alphabet’s investment of almost $17 billion in 2017 alone.
Not Enough Data? Tencent Releases an ImageNet-like Dataset
The open-source project, dubbed Tencent ML-Images, published a multi-label dataset with almost 18 million training examples across almost 12k categories. It was released together with a ResNet-101 model that, after pre-training on ML-Images, achieves 80.73% top-1 accuracy on ImageNet.