April 18, 2019, by Kate Koidan

Figure: next-word attention pattern at Layer 2, Head 0 of the pretrained BERT-base model.