SMART - Multi-Task Deep Neural Networks (MT-DNN)

With our recently developed SMART technology, we jointly trained the tasks with Multi-Task Deep Neural Networks (MT-DNN) and hybrid neural network (HNN) models. Those models are initialized with the RoBERTa large model. Parameter description: all the tasks share the same model structure, while parameters are not shared across tasks.

SMART is a new computational framework for robust and efficient fine-tuning of pre-trained language models through regularized optimization techniques. Specifically, our framework consists of two important ingredients:

  1. Smoothness-inducing Adversarial Regularization, which can effectively manage the capacity of the pre-trained model.
  2. Bregman Proximal Point Optimization, which is a class of trust-region optimization methods and can prevent knowledge forgetting.
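
The two ingredients can be read as two penalty terms added to the ordinary task loss. The sketch below is only an illustration under stated assumptions, not the SMART implementation: it uses a toy linear softmax classifier in place of RoBERTa, measures output change with a symmetric KL divergence, and approximates the worst-case input perturbation by random search inside a small box, where a real implementation would typically use gradient ascent. All function names and the parameters `lambda_s`, `mu`, `radius`, and `trials` are hypothetical.

```python
import math
import random


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    # Toy stand-in for the model: one weight row per class, no bias.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return softmax(logits)


def sym_kl(p, q, eps=1e-12):
    # Symmetric KL divergence KL(p||q) + KL(q||p) between two distributions.
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        + qi * math.log((qi + eps) / (pi + eps))
        for pi, qi in zip(p, q)
    )


def smoothness_penalty(weights, x, radius=0.1, trials=16, rng=None):
    # Smoothness-inducing adversarial regularizer (approximation):
    # largest output change found among random perturbations of x
    # with max-norm at most `radius`. Random search is an assumption
    # made here to keep the sketch dependency-free.
    if rng is None:
        rng = random.Random(0)
    clean = predict(weights, x)
    worst = 0.0
    for _ in range(trials):
        x_tilde = [xi + rng.uniform(-radius, radius) for xi in x]
        worst = max(worst, sym_kl(predict(weights, x_tilde), clean))
    return worst


def bregman_penalty(weights, prev_weights, x):
    # Proximal term: penalizes predictions that drift away from those
    # of the previous iterate, which acts like a trust region.
    return sym_kl(predict(weights, x), predict(prev_weights, x))


def smart_objective(weights, prev_weights, data, lambda_s=1.0, mu=1.0):
    # Mean over examples of: task loss + lambda_s * smoothness + mu * proximal.
    total = 0.0
    for x, y in data:
        task_loss = -math.log(predict(weights, x)[y] + 1e-12)
        total += (
            task_loss
            + lambda_s * smoothness_penalty(weights, x)
            + mu * bregman_penalty(weights, prev_weights, x)
        )
    return total / len(data)
```

In this reading, the smoothness term discourages predictions that change sharply under small input perturbations, while the proximal term keeps each update close to the previous iterate in prediction space, which is how the trust-region behavior limits forgetting of pre-trained knowledge.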