This notebook is used to pretrain transformer models with Hugging Face on your own custom dataset. What do I mean by pretrain transformers? The definition of pretraining is to train in advance. That is exactly what I mean! Train a transformer model so that it can serve as a pretrained model, which can then be fine-tuned on a specific … [Read more...] about Pretrain Transformers Models in PyTorch Using Hugging Face Transformers
Language Models
Linguistics Wisdom of NLP Models
This article, authored by Keyur Faldu and Dr. Amit Sheth, elaborates on a niche aspect of the broader cover story on “Rise of Modern NLP and the Need of Interpretability!” At Embibe, we focus on developing interpretable and explainable Deep Learning systems, and we survey the current state-of-the-art techniques to answer … [Read more...] about Linguistics Wisdom of NLP Models
Discovering the Encoded Linguistic Knowledge in NLP Models
This article, authored by Keyur Faldu and Dr. Amit Sheth, elaborates on a niche aspect of the broader cover story on “Rise of Modern NLP and the Need of Interpretability!” At Embibe, we seek answers to the open questions while we build the NLP platform to solve numerous problems for academic content. Modern NLP models (BERT, GPT, … [Read more...] about Discovering the Encoded Linguistic Knowledge in NLP Models
The Curious Case of Developmental BERTology
This essay is written for machine learning researchers and neuroscientists (some jargon from both fields will be used). Though it is not intended to be a comprehensive review of the literature, we will take a tour through a selection of classic work and new results from a range of topics, in an attempt to develop the following thesis: just like the fruitful interaction between … [Read more...] about The Curious Case of Developmental BERTology
The Dark Secrets Of BERT
This blog post summarizes the EMNLP 2019 paper Revealing the Dark Secrets of BERT by researchers from the Text Machine Lab at UMass Lowell: Olga Kovaleva (LinkedIn), Alexey Romanov (LinkedIn), Anna Rogers (Twitter: @annargrs), and Anna Rumshisky (Twitter: @arumshisky). Here are the topics covered: a brief intro to … [Read more...] about The Dark Secrets Of BERT
Why Choosing a Heavier NLP Model Might Be a Good Choice?
From Google’s 43 rules of ML. Rule #4: Keep the first model simple and get the infrastructure right. With some opinions floating around in the market, I feel it’s a good time to spark a discussion about this topic. Otherwise, the opinions of the popular will just drown out other ideas. Note: I work in NLP, and these opinions are more focused on NLP applications. Cannot … [Read more...] about Why Choosing a Heavier NLP Model Might Be a Good Choice?
Dissecting The Transformer
Having seen how attention works and how it improved neural machine translation systems (see the previous blog post), we are going to unveil the secret behind the power of today’s most famous NLP models (a.k.a. BERT and friends): the transformer. In this second part, we are going to dive into the details of this architecture with the aim of getting a solid … [Read more...] about Dissecting The Transformer
Decoding NLP Attention Mechanisms
Arguably more famous today than Michael Bay’s Transformers, the transformer architecture and transformer-based models have been breaking all kinds of state-of-the-art records. They have (rightfully) been getting the attention of a big portion of the deep learning community and researchers in Natural Language Processing (NLP) since their introduction in 2017 by the … [Read more...] about Decoding NLP Attention Mechanisms
OpenAI’s GPT-2: Results, Hype, and Controversies
The nonprofit AI research company OpenAI recently released a new language model, called GPT-2, which is capable of generating realistic text in a wide range of styles. In fact, the company stated that the model is so good at automatic text generation that it could be used for nefarious purposes; therefore, it did not release the trained model. The dangerous-to-release … [Read more...] about OpenAI’s GPT-2: Results, Hype, and Controversies
Generalized Language Models: Common Tasks & Datasets
EDITOR'S NOTE: Generalized Language Models is an extensive four-part series by Lilian Weng of OpenAI. Part 1: CoVe, ELMo & Cross-View Training; Part 2: ULMFiT & OpenAI GPT; Part 3: BERT & OpenAI GPT-2; Part 4: Common Tasks & Datasets. … [Read more...] about Generalized Language Models: Common Tasks & Datasets