Transfer Learning

Who is the predator? Who is the prey? Have you ever seen a live dinosaur? Being able to answer the predator/prey question without ever having seen a live dinosaur is 'transfer' learning.

Transfer learning (aka Knowledge Transfer, Learning to Learn) is a machine learning technique in which a model trained on one task is repurposed for a second, related task. Transfer learning is an optimization that allows rapid progress or improved performance when modeling the second task. A Gentle Introduction to Transfer Learning for Deep Learning | Jason Brownlee - Machine Learning Mastery

Transfer learning is a type of learning in which a model is first trained on one task, then some or all of the model is used as the starting point for a related task. It is a useful approach for problems where there is a task related to the main task of interest, and the related task has a large amount of data. It differs from multi-task learning: in transfer learning the tasks are learned sequentially, whereas multi-task learning seeks good performance on all considered tasks by a single model at the same time, in parallel.

An example is image classification, where a predictive model, such as an artificial neural network, can be trained on a large corpus of general images, and the weights of the model can be used as a starting point when training on a smaller, more specific dataset, such as dogs and cats. The features already learned by the model on the broader task, such as extracting lines and patterns, will be helpful on the new, related task.

As noted, transfer learning is particularly useful with models that can be trained incrementally, such as deep learning networks, because an existing model can be used as the starting point for continued training. 14 Different Types of Learning in Machine Learning | Jason Brownlee - Machine Learning Mastery
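
As a concrete illustration, here is a minimal sketch of the dogs-vs-cats scenario above, assuming TensorFlow/Keras and an ImageNet-pretrained MobileNetV2 backbone (the dataset variable is a hypothetical placeholder):

import tensorflow as tf

# Load a feature extractor pre-trained on a large general corpus (ImageNet),
# dropping its original classification head.
base = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the general-purpose features

# Attach a new head for the smaller, more specific task (dogs vs. cats).
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # 1 = dog, 0 = cat
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])

# model.fit(dogs_vs_cats_dataset, epochs=5)  # hypothetical dataset; trains only the new head

Freezing the pre-trained backbone keeps the features learned on the broader task intact; unfreezing some of its top layers afterwards ("fine-tuning") often yields additional accuracy.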



FLAN-T5 LLM

T5 stands for “Text-To-Text Transfer Transformer”. It is a model developed by Google Research that casts every language problem into a text-to-text format. T5 is an encoder-decoder model pre-trained on a multi-task mixture of unsupervised and supervised tasks, each of which is converted into a text-to-text format. It was presented in the Google paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer". T5 works well on a variety of tasks out-of-the-box by prepending a different prefix to the input corresponding to each task. Large Language Models (LLMs) are AI models that have been trained on large amounts of text data and can generate human-like text. - T5 | Hugging Face
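The prefix mechanism is easy to see in code. Below is a minimal sketch assuming the Hugging Face transformers library and the public "t5-small" checkpoint; "translate English to German:" is one of the task prefixes used during T5's training:

from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Prepending a different task prefix selects a different task:
# "translate English to German:", "summarize:", "cola sentence:", ...
inputs = tokenizer("translate English to German: The house is wonderful.",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Swapping the prefix to "summarize:" turns the same model into a summarizer, with no change to the architecture.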

FLAN-T5 is an improved version of T5 with some architectural tweaks. FLAN stands for “Fine-tuned LAnguage Net”. It was developed by Google Research and builds on the T5 v1.1 recipe, which is pre-trained on C4 only, without mixing in the supervised tasks; FLAN-T5 is then fine-tuned on a large collection of tasks phrased as natural-language instructions. FLAN-T5 is designed to be highly customizable, allowing developers to fine-tune it to meet their specific needs: the model's parameters can be further adjusted to better fit the data and task at hand, which can result in improved performance and accuracy on specific tasks. For example, a developer could fine-tune FLAN-T5 on a specific dataset to improve its performance on a particular language translation task. This flexibility makes FLAN-T5 a powerful tool for Natural Language Processing (NLP) tasks.
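
For reference, here is a minimal sketch of prompting FLAN-T5, assuming the Hugging Face transformers library and the publicly released "google/flan-t5-small" checkpoint (the larger variants, up to flan-t5-xxl, share the same interface):

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")

# Because FLAN-T5 is instruction-finetuned, a plain natural-language
# instruction works in place of T5's fixed task prefixes.
inputs = tokenizer("Translate to German: The house is wonderful.",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Further fine-tuning for a specific task follows the usual transformers training workflow (e.g. Seq2SeqTrainer) on the developer's own dataset.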