Welcome to the 10th edition of MLguru - our machine learning newsletter. This time we give you a mix of news, online courses, and case studies to help you stay up to date with ML solutions. As always, I’m more than happy to hear your comments. Feel free to send me an email or a tweet (@matthewopala).
Facebook has just introduced a new approach that aims to reduce the memory footprint of neural network architectures by quantizing (or discretizing) their weights, while maintaining a short inference time thanks to its byte-aligned scheme. Its main goal is to help researchers in computer vision, who are continuously advancing the state of the art with models performing tasks ranging from image classification to instance detection. In the past, the memory required to store these high-performing neural networks and use them for inference was generally more than 100 MB, which prevented them from being used on embedded devices. Read more
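To get a feel for how weight quantization shrinks a model, here is a minimal sketch of generic post-training 8-bit uniform quantization in NumPy. This is an illustration of the basic idea only, not Facebook's actual byte-aligned scheme; the function names and the affine scale/zero-point formulation are my own choices for the example.

```python
import numpy as np

def quantize_weights(w, num_bits=8):
    """Uniformly quantize a float weight tensor to num_bits unsigned integers.

    Returns the integer codes plus the (scale, zero_point) pair needed to
    reconstruct approximate float weights at inference time.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / (qmax - qmin)
    zero_point = qmin - w_min / scale
    q = np.clip(np.round(w / scale + zero_point), qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize_weights(q, scale, zero_point):
    """Recover approximate float weights from the integer codes."""
    return (scale * (q.astype(np.float32) - zero_point)).astype(np.float32)

# Storing a float32 layer as uint8 codes cuts its memory footprint 4x,
# at the cost of a small, bounded rounding error per weight.
rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
q, scale, zp = quantize_weights(w)
w_hat = dequantize_weights(q, scale, zp)
```

With 8 bits the rounding error per weight is at most half a quantization step, which is why accuracy often survives quantization largely intact while storage drops by 4x versus float32.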
According to an OpenAI announcement, they intend to license some of their pre-AGI technologies, with Microsoft becoming a preferred partner. Does this mean that the non-profit is getting ready to turn for-profit in the near future? Follow @Smerity on Twitter to get more information.
Google AI Team: Naveen Arivazhagan, Ankur Bapna, Orhan Firat, Dmitry Lepikhin, Melvin Johnson, Maxim Krikun, Mia Xu Chen, Yuan Cao, George Foster, Colin Cherry, Wolfgang Macherey, Zhifeng Chen, and Yonghui Wu created a universal neural machine translation (NMT) system capable of translating between any language pair. But what does ‘any’ mean? They included 103 languages trained on over 25 billion examples. Their system demonstrates effective transfer learning ability, significantly improving translation quality for low-resource languages while keeping high-resource language translation quality on par with competitive bilingual baselines. Great job guys! Learn more
Do you want to learn the basics of NLP? Fast.ai has just created their code-first introduction to Natural Language Processing. Applications covered include topic modeling, classification, language modeling, and translation. The course was originally taught in the University of San Francisco MS in Data Science program during May-June 2019. It is fully available for free on their YouTube channel. The code can be found on GitHub. Read more
Ever since Apple introduced Siri a few years ago to rival Android’s voice assistant, speech-to-text has been a staple tool in Apple’s and Google’s mobile ecosystems. After the initial introduction, Apple opened up the API to developers, allowing them to write apps that make heavy use of speech-to-text transcription. As part of a project for one of our clients, we implemented a speech-to-text transcription feature that takes advantage of the Apple transcription API. To our client’s delight, we were able to successfully integrate transcription into the app. The only shortcoming was the lack of any punctuation marks in the transcriptions produced by Apple’s SFSpeechRecognizer. Read our story to see how we made it happen.
Have you heard about LightTag? They create tools to annotate data for natural language processing. Read the story of Jane, the director of NLP at Automatic Pizza, whose company wants to improve its efficiency by letting customers order pizza through a chat interface. The example goes through various annotation projects to describe the seven distinct stages of the annotation life cycle that Jane will go through. It’s a great read to learn a bit more about the data annotation process. Read more