Transfer Learning

Who is the predator? Who is the prey? Have you ever seen a live dinosaur? Being able to answer the predator/prey question without ever having seen a live dinosaur is 'transfer' learning: knowledge gained in one setting is applied to a new, related one.

Transfer learning (aka Knowledge Transfer, Learning to Learn) is a machine learning technique where a model trained on one task is re-purposed on a second related task. Transfer learning is an optimization that allows rapid progress or improved performance when modeling the second task. A Gentle Introduction to Transfer Learning for Deep Learning | Jason Brownlee - Machine Learning Mastery

Transfer learning is a type of learning where a model is first trained on one task, then some or all of the model is used as the starting point for a related task. It is a useful approach on problems where there is a task related to the main task of interest and the related task has a large amount of data. It is different from multi-task learning: the tasks are learned sequentially in transfer learning, whereas multi-task learning seeks good performance on all considered tasks by a single model at the same time, in parallel.

An example is image classification, where a predictive model, such as an artificial neural network, can be trained on a large corpus of general images, and the weights of the model can be used as a starting point when training on a smaller, more specific dataset, such as dogs and cats. The features already learned by the model on the broader task, such as extracting lines and patterns, will be helpful on the new related task.

As noted, transfer learning is particularly useful with models that are incrementally trained, where an existing model can be used as a starting point for continued training, such as deep learning networks. 14 Different Types of Learning in Machine Learning | Jason Brownlee - Machine Learning Mastery
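As a concrete illustration of the image-classification example above, here is a minimal sketch using PyTorch and torchvision; the dataset folder path, the ResNet-18 backbone, and the hyperparameters are assumptions chosen only for illustration, not a prescribed recipe.

<syntaxhighlight lang="python">
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torchvision.models import resnet18, ResNet18_Weights

# Preprocess images the way the pretrained backbone expects.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Hypothetical folder layout: data/cats_vs_dogs/train/{cat,dog}/*.jpg
train_set = datasets.ImageFolder("data/cats_vs_dogs/train", transform=preprocess)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

# Start from weights learned on ImageNet (the large, general corpus).
model = resnet18(weights=ResNet18_Weights.DEFAULT)

# Freeze the backbone so its general features (edges, textures) are reused as-is.
for param in model.parameters():
    param.requires_grad = False

# Replace the final classifier with a new 2-class head (cat vs dog).
model.fc = nn.Linear(model.fc.in_features, 2)

# Only the new head's weights are updated during training.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
</syntaxhighlight>

Because only the small replacement head is trained, this works even when the dogs-and-cats dataset is far smaller than the corpus the backbone was originally trained on.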



T5 & FLAN-T5 LLM Models

T5 stands for “Text-To-Text Transfer Transformer”. It is a model developed by Google Research that converts every language problem into a text-to-text format. T5 is an encoder-decoder model pre-trained on a multi-task mixture of unsupervised and supervised tasks, each of which is converted into a text-to-text format. It was presented in a paper by Google called "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer". T5 works well on a variety of tasks out-of-the-box by prepending a different prefix to the input corresponding to each task. Large Language Models (LLMs) are AI models that have been trained on large amounts of text data and can generate human-like text.

T5 - Hugging Face: https://huggingface.co/docs/transformers/model_doc/t5
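To make the "one prefix per task" idea concrete, here is a minimal sketch using the Hugging Face transformers library and the publicly available t5-small checkpoint; the checkpoint choice and generation settings are assumptions for illustration.

<syntaxhighlight lang="python">
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load a small public T5 checkpoint (requires the sentencepiece package).
tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# The task is selected purely by the text prefix; the model itself is unchanged.
prompts = [
    "translate English to German: The house is wonderful.",
    "summarize: T5 casts every NLP problem as text-to-text, so one model can "
    "translate, summarize, or classify depending only on the prefix of its input.",
]

for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
</syntaxhighlight>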

FLAN-T5 is an improved version of T5. FLAN stands for “Fine-tuned LAnguage Net”. It was developed by Google Research: the underlying T5 model (with some architectural tweaks, pre-trained on C4 only, without mixing in the supervised tasks) is further fine-tuned on a large collection of tasks phrased as natural-language instructions. FLAN-T5 is designed to be highly customizable, allowing developers to fine-tune it to meet their specific needs: the model's parameters can be adjusted to better fit the data and task at hand, which can result in improved performance and accuracy on specific tasks. For example, a developer could fine-tune FLAN-T5 on a specific dataset to improve its performance on a particular language translation task.

  • To fine-tune FLAN-T5 on a specific dataset, a developer would first need to prepare the data in a format that is compatible with the model. This typically involves tokenizing the text data and converting it into numerical tensors that can be fed into the model.
  • Once the data is prepared, the developer can then load a pre-trained FLAN-T5 model and adjust its parameters to better fit the data. This is done by training the model on the specific dataset, using an optimization algorithm to update the model’s weights based on the data.
  • After fine-tuning, the model should have improved performance on the specific task for which it was fine-tuned. The developer can then use this fine-tuned model to make predictions or generate text on new data related to the specific task (see the sketch following this list).
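As a concrete illustration of the three steps above, here is a minimal sketch using the Hugging Face transformers library; the google/flan-t5-small checkpoint, the tiny in-line dataset, and the hyperparameters are assumptions chosen only to keep the example short.

<syntaxhighlight lang="python">
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/flan-t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)  # load the pre-trained model

# Step 1: prepare the data -- a tiny, made-up translation dataset for illustration.
pairs = [
    ("translate English to French: Good morning.", "Bonjour."),
    ("translate English to French: Thank you very much.", "Merci beaucoup."),
]

def collate(batch):
    """Tokenize source/target text into the numerical tensors the model expects."""
    sources, targets = zip(*batch)
    enc = tokenizer(list(sources), padding=True, return_tensors="pt")
    labels = tokenizer(list(targets), padding=True, return_tensors="pt").input_ids
    labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in the loss
    enc["labels"] = labels
    return enc

loader = DataLoader(pairs, batch_size=2, shuffle=True, collate_fn=collate)

# Step 2: update the pre-trained weights on the specific dataset with an optimizer.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for epoch in range(3):
    for batch in loader:
        loss = model(**batch).loss  # seq2seq cross-entropy computed from the labels
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# Step 3: use the fine-tuned model on new data for the same task.
model.eval()
inputs = tokenizer("translate English to French: Good night.", return_tensors="pt")
generated = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
</syntaxhighlight>

In practice the in-line list of pairs would be replaced by a real dataset, and the same three steps (prepare and tokenize the data, train the loaded model with an optimizer, then run inference) carry over unchanged.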


This flexibility makes FLAN-T5 a powerful tool for natural language processing tasks. With its advanced features and strong performance, it has become a widely used model in the field.