Welcome to the 14th edition of MLguru, a bi-weekly Machine Learning newsletter.
In this edition you will read about:
Also, I’m happy to invite you to our next face-to-face workshops, which will take place in… Dublin. Click here for more details.
Google’s open-source ML library, TensorFlow 2.0, is now available to the general public! It comes with a number of changes made to improve ease of use. It also promises up to 3x faster training performance when using mixed precision on NVIDIA’s Volta and Turing GPUs.
Read more about TensorFlow 2.0 here.
Search engines for code are often frustrating and never fully understand what coders want, unlike regular web search engines. That is why the Microsoft Research Team and core contributors from GitHub decided to launch the CodeSearchNet challenge to evaluate the state of semantic code search. Does it sound interesting? Sure it does. Read more about the challenge here.
Facebook decided to create an AI system that proposes easy changes to a person’s outfit to make it more fashionable. Their Fashion++ system uses a deep image-generation neural network to recognize garments and offer suggestions on what to remove, add, or swap. It can also recommend ways to adjust a piece of clothing, such as tucking in a shirt or rolling up the sleeves. Read more.
Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks. However, at some point further increases in model size become harder due to GPU/TPU memory limitations, longer training times, and unexpected model degradation. That is where A Lite BERT (ALBERT) comes in. Learn more about it under this link.
Pre-trained deep neural network language models such as ELMo, GPT, BERT and XLNet have recently achieved state-of-the-art performance on a variety of language understanding tasks. Unfortunately, their size makes them impractical for a number of scenarios, especially on mobile and edge devices.
Sanqiang Zhao, Raghav Gupta, Yang Song, and Denny Zhou introduced a novel knowledge distillation technique for training a student model with a significantly smaller vocabulary as well as lower embedding and hidden state dimensions. Find out more about their solution here.
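For readers new to the idea, here is a minimal sketch of the generic soft-target distillation loss, in which a student is trained to match a teacher’s temperature-softened output distribution. It is not the authors’ method, which also reduces the vocabulary and embedding sizes; the function names, logits, and temperature value are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, softened by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions.

    Scaled by temperature**2, a common convention that keeps gradient
    magnitudes comparable across different temperatures.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits over three classes (made up for illustration).
teacher = [4.0, 1.0, 0.5]
good_student = [3.8, 1.1, 0.4]   # close to the teacher
poor_student = [0.5, 1.0, 4.0]   # ranks the classes the wrong way round

print(distillation_loss(teacher, good_student))
print(distillation_loss(teacher, poor_student))
```

The student that mimics the teacher gets the smaller loss. In practice this term is usually combined with the ordinary task loss on the true labels.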
Daniel Ziegler, Nisan Stiennon, Jeffrey Wu, Tom Brown, Dario Amodei, Alec Radford, Paul Christiano, and Geoffrey Irving have fine-tuned the 774M-parameter GPT-2 language model using human feedback for various tasks. The model successfully matched the preferences of the external human labelers, though those preferences did not always match the researchers’ own. Their motivation was to move safety techniques closer to the general task of “machines talking to humans,” which they believe is key to extracting information about human values. Read more about the process in their article on OpenAI.