April 18, 2019 by Kate Koidan

Figure: Next-word attention pattern at Layer 2, Head 0 of the BERT-base pretrained model.
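The original figure is an attention heatmap. As a minimal sketch (not from the original post), the pattern at that layer and head can be reproduced with the Hugging Face transformers library; the example sentence and the argmax-based "next word" check below are illustrative assumptions.

```python
# Sketch: extract the attention map for Layer 2, Head 0 of pretrained BERT-base
# and check whether each token attends mostly to the token that follows it.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

inputs = tokenizer("The quick brown fox jumps over the lazy dog.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer, each of shape
# (batch, num_heads, seq_len, seq_len).
layer, head = 2, 0
attn = outputs.attentions[layer][0, head]  # (seq_len, seq_len) attention map

# For a "next-word" head, the strongest weight in each row tends to fall on
# the immediately following token.
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for i, tok in enumerate(tokens):
    j = attn[i].argmax().item()
    print(f"{tok:>12} -> {tokens[j]}")
```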