SMART - Multi-Task Deep Neural Networks (MT-DNN)

Revision as of 21:29, 21 December 2019 by BPeat (talk | contribs) (BPeat moved page MT-DNN-SMART to SMART - Multi-Task Deep Neural Networks (MT-DNN) without leaving a redirect)


SMART is a computational framework for robust and efficient fine-tuning of pre-trained language models through regularized optimization techniques. With SMART, the MT-DNN and HNN models are jointly trained across tasks; both models are initialized from the RoBERTa large model. All tasks share the same model structure, while parameters are not shared across tasks. Specifically, the framework consists of two important ingredients:

  1. Smoothness-inducing Adversarial Regularization, which can effectively manage the capacity of the pre-trained model. Popular examples include virtual adversarial training regularization proposed in Miyato et al. (2018), TRADES regularization in Zhang et al. (2019), and local linearity regularization in Qin et al. (2019).
  2. Bregman Proximal Point Optimization, which is a class of trust-region optimization methods and can prevent knowledge forgetting. Popular examples include the proximal point method proposed in Rockafellar (1976), the generalized proximal point method (Teboulle, 1997; Eckstein, 1993), accelerated proximal point methods, and other variants (Güler, 1991, 1992; Parikh et al., 2014).
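The first ingredient can be illustrated on a toy model. The sketch below (an assumption for illustration, not code from the SMART paper) approximates the smoothness-inducing adversarial regularizer R(x) = max over ||delta|| <= epsilon of the symmetric KL divergence between f(x) and f(x + delta), for a simple softmax classifier f(x) = softmax(W x). In the spirit of virtual adversarial training (Miyato et al., 2018), it starts from a small random perturbation, takes one gradient-ascent step on the divergence (here the gradient is estimated with finite differences, to keep the example free of any autograd library), projects onto the epsilon-ball, and evaluates the divergence there.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def sym_kl(p, q, tiny=1e-12):
    # Symmetrized KL divergence: KL(p||q) + KL(q||p)
    return float(np.sum(p * np.log((p + tiny) / (q + tiny))) +
                 np.sum(q * np.log((q + tiny) / (p + tiny))))

def smoothness_regularizer(W, x, epsilon=0.1, xi=1e-2, fd=1e-5, rng=None):
    """Approximate max_{||delta|| <= epsilon} KL_sym(f(x), f(x + delta))
    for f(x) = softmax(W x). One-step adversarial search: random start,
    finite-difference gradient ascent, projection onto the epsilon-ball."""
    rng = rng or np.random.default_rng(0)
    p = softmax(W @ x)
    g = lambda d: sym_kl(p, softmax(W @ (x + d)))
    d0 = rng.standard_normal(x.size)
    d0 = xi * d0 / np.linalg.norm(d0)          # small random starting point
    grad = np.zeros_like(x)
    for i in range(x.size):                    # finite-difference gradient at d0
        e_i = np.zeros_like(x)
        e_i[i] = fd
        grad[i] = (g(d0 + e_i) - g(d0 - e_i)) / (2 * fd)
    norm = np.linalg.norm(grad)
    delta = epsilon * grad / norm if norm > 0 else np.zeros_like(x)
    return g(delta)

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 5))
x = rng.standard_normal(5)
print(smoothness_regularizer(W, x))
```

During fine-tuning, this regularizer would be added to the task loss, penalizing models whose predictions change sharply within a small neighborhood of each input, which is how it effectively manages the capacity of the pre-trained model.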
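The second ingredient can likewise be sketched in a few lines. The example below (again an illustrative assumption, not the paper's implementation) performs proximal point updates with a squared-Euclidean Bregman divergence: each outer step minimizes the loss plus a penalty mu/2 * ||theta - theta_t||^2 that keeps the new iterate close to the previous one, which is the trust-region mechanism that helps prevent knowledge forgetting during fine-tuning.

```python
import numpy as np

def proximal_point_step(grad_f, theta_t, mu, n_inner=100, lr=0.05):
    """Approximately solve argmin_theta f(theta) + (mu/2)||theta - theta_t||^2
    by gradient descent on the regularized objective (squared-Euclidean
    Bregman divergence; SMART instead uses a symmetric-KL divergence
    over model outputs)."""
    theta = theta_t.copy()
    for _ in range(n_inner):
        # gradient of f plus gradient of the proximal (trust-region) term
        theta -= lr * (grad_f(theta) + mu * (theta - theta_t))
    return theta

# Toy objective: f(theta) = 0.5 * ||theta - target||^2
target = np.array([3.0, -1.0])
grad_f = lambda th: th - target

theta = np.zeros(2)
for _ in range(20):
    theta = proximal_point_step(grad_f, theta, mu=1.0)
print(theta)  # approaches target, one damped trust-region step at a time
```

With mu = 1 each outer step moves only halfway toward the unregularized minimizer, so the iterates drift slowly away from the initialization; a larger mu enforces a tighter trust region and stronger retention of the pre-trained weights.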