Bidirectional Encoder Representations from Transformers (BERT)

 
* [http://medium.com/huggingface/distilbert-8cf3380435b5 Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT | Victor Sanh - Medium]
 
* [http://arxiv.org/abs/1909.10351 TinyBERT: Distilling BERT for Natural Language Understanding | X. Jiao, Y. Yin, L. Shang, X. Jiang, X. Chen, L. Li, F. Wang, and Q. Liu] researchers at Huawei produce a model called TinyBERT that is 7.5 times smaller and nearly 10 times faster than the original BERT, while reaching nearly the same language understanding performance.
* [http://towardsdatascience.com/understanding-bert-is-it-a-game-changer-in-nlp-7cca943cf3ad Understanding BERT: Is it a Game Changer in NLP? | Bharat S Raj - Towards Data Science]
 
* [[Google]]
 
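The DistilBERT and TinyBERT references above both rely on knowledge distillation: a small student model is trained to match the temperature-softened output distribution of the large BERT teacher. A minimal sketch of the core distillation loss (temperature-scaled softmax plus KL divergence, following the standard Hinton-style formulation; the function names and the temperature value are illustrative, not taken from either paper):

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax: a higher temperature yields a
    # softer distribution, exposing the teacher's "dark knowledge".
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL divergence between the softened teacher and student
    # distributions, scaled by T^2 so gradient magnitudes stay
    # comparable across temperatures.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2
```

In practice this term is combined with the ordinary cross-entropy loss on the hard labels; TinyBERT additionally distills intermediate hidden states and attention maps, which this sketch omits.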
<img src="http://miro.medium.com/max/916/1*8416XWqbuR2SDgCY61gFHw.png" width="500" height="200">
 
<img src="http://miro.medium.com/max/2070/1*IFVX74cEe8U5D1GveL1uZA.png" width="800" height="500">
 

Revision as of 22:27, 13 October 2019