Difference between revisions of "SMART - Multi-Task Deep Neural Networks (MT-DNN)"

 
* [[Natural Language Processing (NLP)]]
 
* [[Bidirectional Encoder Representations from Transformers (BERT)]]
 
* [[Deep Distributed Q Network Partial Observability]]
 
* [http://arxiv.org/pdf/1911.03437.pdf SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization | H. Jiang, P. He, W. Chen, X. Liu, J. Gao, and T. Zhao]
 
* [http://www.profillic.com/search?query=Pengcheng%20Gao A Hybrid Neural Network Model for Commonsense Reasoning | P. He, X. Liu, W. Chen and J. Gao - Profillic]
 

Revision as of 07:53, 6 July 2020


With our recently developed SMART technology, we jointly trained the tasks with Multi-Task Deep Neural Networks (MT-DNN) and hybrid neural network (HNN) models. Those models are initialized with the RoBERTa large model. Parameter description: all the tasks share the same model structure, while parameters are not shared across tasks.

SMART is a new computational framework for robust and efficient fine-tuning of pre-trained language models through regularized optimization techniques. Specifically, our framework consists of two important ingredients:

  1. Smoothness-inducing Adversarial Regularization, which can effectively manage the capacity of the pre-trained model.
  2. Bregman Proximal Point Optimization, which is a class of trust-region optimization methods and can prevent knowledge forgetting.
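
The two ingredients above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy linear softmax classifier in place of RoBERTa, perturbs raw input features rather than word embeddings, uses symmetric KL divergence as the smoothness measure, and estimates gradients by finite differences. All function names (`predict`, `sym_kl`, `adversarial_input`, `smart_objective`) and default hyperparameters are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    """Toy stand-in for the language model: one weight row per class."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])


def sym_kl(p, q, eps=1e-12):
    """Symmetric KL divergence: KL(p||q) + KL(q||p)."""
    return sum((pi - qi) * (math.log(pi + eps) - math.log(qi + eps))
               for pi, qi in zip(p, q))


def numeric_grad(fn, x, h=1e-5):
    """Central finite-difference gradient of fn at x (assumption: no autograd)."""
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + h] + x[i + 1:]
        down = x[:i] + [x[i] - h] + x[i + 1:]
        grad.append((fn(up) - fn(down)) / (2 * h))
    return grad


def adversarial_input(weights, x, epsilon=0.1, step=0.05, n_steps=1, seed=0):
    """Search the infinity-norm ball of radius epsilon around x for an input
    whose prediction differs most from the prediction on x."""
    rng = random.Random(seed)
    clean = predict(weights, x)
    # Start from a small random perturbation so the gradient is non-zero.
    x_adv = [xi + rng.uniform(-1e-3, 1e-3) for xi in x]
    for _ in range(n_steps):
        grad = numeric_grad(lambda v: sym_kl(predict(weights, v), clean), x_adv)
        norm = max(abs(g) for g in grad) or 1.0
        # Gradient ascent step, then projection back into the epsilon-ball.
        x_adv = [v + step * g / norm for v, g in zip(x_adv, grad)]
        x_adv = [min(max(v, xi - epsilon), xi + epsilon)
                 for v, xi in zip(x_adv, x)]
    return x_adv


def smart_objective(weights, prev_weights, x, label,
                    lam_s=1.0, mu=1.0, epsilon=0.1):
    """Task loss + smoothness regularizer + Bregman proximal term."""
    probs = predict(weights, x)
    task_loss = -math.log(probs[label])
    # 1. Smoothness-inducing adversarial regularization.
    x_adv = adversarial_input(weights, x, epsilon)
    smoothness = sym_kl(predict(weights, x_adv), probs)
    # 2. Bregman proximal term: stay close to the previous iterate's outputs.
    bregman = sym_kl(probs, predict(prev_weights, x))
    return task_loss + lam_s * smoothness + mu * bregman


if __name__ == "__main__":
    w_prev = [[1.0, -0.5], [-1.0, 0.5]]
    w_new = [[1.2, -0.4], [-0.9, 0.6]]
    print(smart_objective(w_new, w_prev, x=[0.3, 0.8], label=0))
```

In this sketch the smoothness term penalizes predictions that change sharply under small input perturbations, while the Bregman term penalizes updates whose outputs drift far from those of the previous parameters, which is how the trust-region behavior described in item 2 shows up in the loss.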