Bidirectional Encoder Representations from Transformers (BERT)

** [http://github.com/pytorch/fairseq/tree/master/examples/roberta RoBERTa: A Robustly Optimized BERT Pretraining Approach | GitHub] - iterates on BERT's pretraining procedure, including training the model longer, with bigger batches over more data; removing the next sentence prediction objective; training on longer sequences; and dynamically changing the masking pattern applied to the training data (see the masking sketch after this list).

** [http://venturebeat.com/2019/07/29/facebook-ais-roberta-improves-googles-bert-pretraining-methods/ Facebook AI’s RoBERTa improves Google’s BERT pretraining methods | Khari Johnson - VentureBeat]

* Google's BERT - built on ideas from [[ULMFiT]], [[ELMo]], and [[OpenAI]]
 
* [[Attention]] Mechanism/[[Transformer]] Model
 
** [[Generative Pre-trained Transformer (GPT)]]2/3
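A minimal sketch of the dynamic masking idea mentioned in the RoBERTa entry above: instead of fixing one masking pattern per sequence during preprocessing (as in the original BERT setup), a fresh 15% of positions is selected every time a batch is drawn, so the model sees different masks across epochs. The token ids, [MASK] id, and vocabulary size below are illustrative assumptions, not values taken from the linked repositories.

<pre>
import torch

MASK_ID = 103        # assumed id of the [MASK] token (illustrative)
VOCAB_SIZE = 30522   # assumed vocabulary size (illustrative)

def dynamic_mask(token_ids, mask_prob=0.15):
    """BERT-style masked-language-model corruption, applied per batch.

    Because this runs each time a batch is sampled, every epoch sees a
    different masking pattern (dynamic masking) rather than the single
    pattern fixed once at preprocessing time (static masking).
    """
    token_ids = token_ids.clone()
    labels = token_ids.clone()

    # Select ~15% of positions as prediction targets.
    selected = torch.rand(token_ids.shape) < mask_prob
    labels[~selected] = -100  # conventional ignore index for the MLM loss

    # Of the selected positions: 80% -> [MASK], 10% -> random token, 10% left unchanged.
    to_mask = selected & (torch.rand(token_ids.shape) < 0.8)
    token_ids[to_mask] = MASK_ID

    to_randomize = selected & ~to_mask & (torch.rand(token_ids.shape) < 0.5)
    token_ids[to_randomize] = torch.randint(VOCAB_SIZE, token_ids.shape)[to_randomize]

    return token_ids, labels

# Toy usage: a fresh mask pattern is produced on every call.
batch = torch.randint(VOCAB_SIZE, (2, 16))
inputs, labels = dynamic_mask(batch)
</pre>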

BERT Research | Chris McCormick