Language Models

Generalized Language Models: BERT & OpenAI GPT-2

EDITOR'S NOTE: Generalized Language Models is an extensive four-part series by Lilian Weng of OpenAI.

Part 1: CoVe, ELMo & Cross-View Training
Part 2: ULMFiT & OpenAI GPT
Part 3: BERT & OpenAI GPT-2
Part 4: Common Tasks & Datasets

Do you find this in-depth technical education about language models and NLP applications to be useful? Subscribe below to … [Read more...] about Generalized Language Models: BERT & OpenAI GPT-2
Generalized Language Models: ULMFiT & OpenAI GPT
EDITOR'S NOTE: Generalized Language Models is an extensive four-part series by Lilian Weng of OpenAI.

Part 1: CoVe, ELMo & Cross-View Training
Part 2: ULMFiT & OpenAI GPT
Part 3: BERT & OpenAI GPT-2
Part 4: Common Tasks & Datasets

Do you find this in-depth technical education about language models and NLP applications to be useful? Subscribe below to … [Read more...] about Generalized Language Models: ULMFiT & OpenAI GPT
Generalized Language Models: CoVe, ELMo & Cross-View Training
EDITOR'S NOTE: Generalized Language Models is an extensive four-part series by Lilian Weng of OpenAI.

Part 1: CoVe, ELMo & Cross-View Training
Part 2: ULMFiT & OpenAI GPT
Part 3: BERT & OpenAI GPT-2
Part 4: Common Tasks & Datasets

Do you find this in-depth technical education about language models and NLP applications to be useful? Subscribe below to … [Read more...] about Generalized Language Models: CoVe, ELMo & Cross-View Training
OpenAI GPT-2: Understanding Language Generation through Visualization
Are you interested in receiving more in-depth technical education about language models and NLP applications? Subscribe below to receive relevant updates. In the eyes of most NLP researchers, 2018 was a year of great technological advancement, with new pre-trained NLP models shattering records on tasks ranging from sentiment analysis to … [Read more...] about OpenAI GPT-2: Understanding Language Generation through Visualization
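The excerpt is cut off, but the post itself examines how GPT-2 produces text one token at a time. As a minimal sketch of that generation loop, assuming the HuggingFace transformers library with the "gpt2" checkpoint and top-k sampling (illustrative choices, not the exact tooling the article uses):

```python
# Minimal sketch: sampling text from GPT-2 with HuggingFace `transformers`.
# The library and parameters are assumptions for illustration only.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

input_ids = tokenizer.encode("The meaning of life is", return_tensors="pt")
with torch.no_grad():
    # Generate a continuation token by token; top-k sampling restricts
    # each step to the 40 most probable next tokens.
    output_ids = model.generate(
        input_ids,
        max_length=30,
        do_sample=True,
        top_k=40,
        pad_token_id=tokenizer.eos_token_id,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```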
Deconstructing BERT, Part 2: Visualizing the Inner Workings of Attention
This is the second part of a two-part series on deconstructing BERT. In part 1, Deconstructing BERT: Distilling 6 Patterns from 100 Million Parameters, I described how BERT’s attention mechanism can take on many different forms. For example, one attention head focused nearly all of the attention on the next word in the sequence; another focused on the previous … [Read more...] about Deconstructing BERT, Part 2: Visualizing the Inner Workings of Attention
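Attention patterns like the two described above can be inspected directly. A minimal sketch, assuming the HuggingFace transformers library as a stand-in for the article's own visualization tool, that extracts BERT's per-head attention weights and prints where each token attends most strongly:

```python
# Minimal sketch: inspecting BERT's per-head attention weights with
# HuggingFace `transformers` (an assumed stand-in for the article's
# visualization tool).
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

inputs = tokenizer("The cat sat on the mat", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# `outputs.attentions` holds one tensor per layer, each of shape
# (batch, num_heads, seq_len, seq_len).
attn = outputs.attentions[2][0, 0]  # layer 2, head 0 (arbitrary choice)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for i, tok in enumerate(tokens):
    j = int(attn[i].argmax())  # position this token attends to most
    print(f"{tok:>8} -> {tokens[j]}")
```

A head that consistently points at the next or previous token, as the excerpt describes, shows up immediately in output like this.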
Deconstructing BERT: Distilling 6 Patterns from 100 Million Parameters
The year 2018 marked a turning point for the field of Natural Language Processing, with a series of deep-learning models achieving state-of-the-art results on NLP tasks ranging from question answering to sentiment classification. Most recently, Google’s BERT algorithm has emerged as a sort of “one model to rule them all,” based on its superior performance over a wide variety of … [Read more...] about Deconstructing BERT: Distilling 6 Patterns from 100 Million Parameters
What Every NLP Engineer Needs to Know About Pre-Trained Language Models
Practical applications of Natural Language Processing (NLP) have gotten significantly cheaper, faster, and easier due to the transfer learning capabilities enabled by pre-trained language models. Transfer learning enables engineers to pre-train an NLP model on one large dataset and then quickly fine-tune the model to adapt to other NLP tasks. This new approach enables NLP … [Read more...] about What Every NLP Engineer Needs to Know About Pre-Trained Language Models
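As a concrete illustration of the pre-train/fine-tune pattern the excerpt describes, here is a minimal sketch assuming the HuggingFace transformers library and a toy two-example sentiment dataset (both assumptions for illustration; the article does not prescribe this stack):

```python
# Minimal sketch: fine-tuning a pre-trained language model on a downstream
# task. HuggingFace `transformers` and the toy data are assumptions here.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# The pre-trained encoder weights are reused; only the small
# classification head on top starts from random initialization.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

texts = ["a great movie", "a dull, lifeless film"]
labels = torch.tensor([1, 0])  # tiny toy sentiment dataset
batch = tokenizer(texts, padding=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few gradient steps stand in for a full fine-tune
    outputs = model(**batch, labels=labels)
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    print(f"loss: {outputs.loss.item():.4f}")
```

Because the encoder already carries general language knowledge from pre-training, a small learning rate and a handful of epochs are typically enough to adapt it, which is exactly the cost savings the excerpt points to.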