SMART - Multi-Task Deep Neural Networks (MT-DNN)

Revision as of 21:29, 21 December 2019 by BPeat (talk | contribs) (BPeat moved page MT-DNN-SMART to SMART - Multi-Task Deep Neural Networks (MT-DNN) without leaving a redirect)


SMART is a computational framework for robust and efficient fine-tuning of pre-trained language models through regularized optimization techniques. With SMART, the MT-DNN and HNN models are jointly trained across tasks; both models are initialized from the RoBERTa large model. All tasks share the same model structure, while parameters are not shared across tasks. Specifically, the framework consists of two important ingredients:

  1. Smoothness-inducing Adversarial Regularization, which can effectively manage the capacity of the pre-trained model. Popular examples include virtual adversarial training regularization proposed in Miyato et al. (2018), TRADES regularization in Zhang et al. (2019), and local linearity regularization in Qin et al. (2019).
  2. Bregman Proximal Point Optimization, which is a class of trust-region optimization methods and can prevent knowledge forgetting. Popular examples include the proximal point method proposed in Rockafellar (1976), the generalized proximal point method (Teboulle, 1997; Eckstein, 1993), accelerated proximal point methods, and other variants (Güler, 1991, 1992; Parikh et al., 2014).
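The first ingredient can be illustrated on a toy model. The sketch below (an assumption for illustration, not code from the SMART paper) approximates the smoothness-inducing adversarial regularizer R(x) = max over ||delta|| <= epsilon of the symmetric KL divergence between f(x) and f(x + delta), for a simple softmax classifier f(x) = softmax(W x). In the spirit of virtual adversarial training (Miyato et al., 2018), it starts from a small random perturbation, takes one gradient-ascent step on the divergence (here the gradient is estimated with finite differences, to keep the example free of any autograd library), projects onto the epsilon-ball, and evaluates the divergence there.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def sym_kl(p, q, tiny=1e-12):
    # Symmetrized KL divergence: KL(p||q) + KL(q||p)
    return float(np.sum(p * np.log((p + tiny) / (q + tiny))) +
                 np.sum(q * np.log((q + tiny) / (p + tiny))))

def smoothness_regularizer(W, x, epsilon=0.1, xi=1e-2, fd=1e-5, rng=None):
    """Approximate max_{||delta|| <= epsilon} KL_sym(f(x), f(x + delta))
    for f(x) = softmax(W x). One-step adversarial search: random start,
    finite-difference gradient ascent, projection onto the epsilon-ball."""
    rng = rng or np.random.default_rng(0)
    p = softmax(W @ x)
    g = lambda d: sym_kl(p, softmax(W @ (x + d)))
    d0 = rng.standard_normal(x.size)
    d0 = xi * d0 / np.linalg.norm(d0)          # small random starting point
    grad = np.zeros_like(x)
    for i in range(x.size):                    # finite-difference gradient at d0
        e_i = np.zeros_like(x)
        e_i[i] = fd
        grad[i] = (g(d0 + e_i) - g(d0 - e_i)) / (2 * fd)
    norm = np.linalg.norm(grad)
    delta = epsilon * grad / norm if norm > 0 else np.zeros_like(x)
    return g(delta)

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 5))
x = rng.standard_normal(5)
print(smoothness_regularizer(W, x))
```

During fine-tuning, this regularizer would be added to the task loss, penalizing models whose predictions change sharply within a small neighborhood of each input, which is how it effectively manages the capacity of the pre-trained model.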
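The second ingredient can likewise be sketched in a few lines. The example below (again an illustrative assumption, not the paper's implementation) performs proximal point updates with a squared-Euclidean Bregman divergence: each outer step minimizes the loss plus a penalty mu/2 * ||theta - theta_t||^2 that keeps the new iterate close to the previous one, which is the trust-region mechanism that helps prevent knowledge forgetting during fine-tuning.

```python
import numpy as np

def proximal_point_step(grad_f, theta_t, mu, n_inner=100, lr=0.05):
    """Approximately solve argmin_theta f(theta) + (mu/2)||theta - theta_t||^2
    by gradient descent on the regularized objective (squared-Euclidean
    Bregman divergence; SMART instead uses a symmetric-KL divergence
    over model outputs)."""
    theta = theta_t.copy()
    for _ in range(n_inner):
        # gradient of f plus gradient of the proximal (trust-region) term
        theta -= lr * (grad_f(theta) + mu * (theta - theta_t))
    return theta

# Toy objective: f(theta) = 0.5 * ||theta - target||^2
target = np.array([3.0, -1.0])
grad_f = lambda th: th - target

theta = np.zeros(2)
for _ in range(20):
    theta = proximal_point_step(grad_f, theta, mu=1.0)
print(theta)  # approaches target, one damped trust-region step at a time
```

With mu = 1 each outer step moves only halfway toward the unregularized minimizer, so the iterates drift slowly away from the initialization; a larger mu enforces a tighter trust region and stronger retention of the pre-trained weights.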