The Relationship Between Perplexity And Entropy In NLP
Perplexity is a common metric for evaluating language models. For example, scikit-learn’s implementation of Latent Dirichlet Allocation (a topic-modeling algorithm) includes perplexity as a built-in metric. In this post, I will define perplexity and then discuss entropy, the relation between the two, and how it arises naturally in natural language …
Technical Guide
ECCV 2020: Some Highlights
The 2020 European Conference on Computer Vision took place online from 23 to 28 August and featured 1360 papers: 104 orals, 160 spotlights, and the remaining 1096 as posters, in addition to 45 workshops and 16 tutorials. As has been the case in recent years with ML and CV conferences, the huge number of papers can be overwhelming at times. Similar to …
Generating Synthetic Sequential Data Using GANs
Sequential data (data with a time dependency) is very common in business, ranging from credit card transactions to healthcare records to stock market prices. But privacy regulations limit and dramatically slow down access to useful data that is essential to research and development. This creates a demand for highly representative, yet fully private, synthetic sequential …
Programming Fairness in Algorithms
Being good is easy, what is difficult is being just. ― Victor Hugo
We need to defend the interests of those whom we’ve never met and never will. ― Jeffrey D. Sachs
Note: This article is intended for a general audience to try and elucidate the complicated nature of unfairness in machine learning algorithms. As such, I have tried to explain concepts in an accessible way …
Guide to Interpretable Machine Learning
If you can’t explain it simply, you don’t understand it well enough. ― Albert Einstein
Disclaimer: This article draws and expands upon material from (1) Christoph Molnar’s excellent book on Interpretable Machine Learning, which I definitely recommend to the curious reader, (2) a deep learning visualization workshop from Harvard ComputeFest 2020, as …
An AI Researcher’s Exploration of 200 Machine Learning Tools
To better understand the landscape of available tools for machine learning production, I decided to look up every AI/ML tool I could find. The resources I used include: Full stack deep learning; LF AI Foundation landscape; AI Data Landscape; Various lists of top AI startups by the media; Responses to my tweet and LinkedIn post; People (friends, strangers, VCs) share …
Handling Missing Data For Advanced Machine Learning
Over the course of this article, you will become good at spotting, understanding, and imputing missing data. We demonstrate various imputation techniques on a real-world logistic regression task using Python. Properly handling missing data improves inferences and predictions, and should not be ignored. The first part of this article presents the framework for …
Single Stage Instance Segmentation – A Review
Instance segmentation is a challenging computer vision task that requires the prediction of object instances and their per-pixel segmentation masks. This makes it a hybrid of semantic segmentation and object detection. Ever since Mask R-CNN was invented, the state-of-the-art method for instance segmentation has largely been Mask R-CNN and its variants …
Deep Transfer Learning for Image Classification
The following tutorial covers how to set up a state-of-the-art deep learning model for image classification. The approach is based on the machine learning frameworks “TensorFlow” and “Keras”, and includes all the code needed to replicate the results in this tutorial. The prerequisite for setting up the model is access to labelled data, and as an example case I have used …
The Best NLP Papers From ICLR 2020
I went through 687 papers that were accepted to the ICLR 2020 virtual conference (out of 2594 submitted, up 63% since 2019!) and identified 9 papers with the potential to advance the use of deep learning NLP models in everyday use cases. Here are the papers I found and why they matter.
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
Kevin …