Fine-tuning
Revision as of 20:24, 26 September 2023



Fine-tuning is the process of retraining a pre-trained language model on a new dataset. It can improve the model's performance on a specific task, such as generating text, translating languages, or answering questions. Fine-tuning is a way to add new knowledge to an existing AI model: a relatively simple upgrade that allows the model to learn new information.

Here is some more detail on fine-tuning:

  • Fine-tuning is a relatively simple process. The first step is to select a pre-trained language model; many are available, such as GPT-3, RoBERTa, and XLNet. You then gather a dataset that is relevant to the task you want the model to perform. For example, to fine-tune a language model for question answering, you would gather a dataset of questions and answers.
  • The next step is to fine-tune the language model on that dataset. This is done with supervised learning: the model is given a set of labeled examples — in the question-answering case, the labels are the answers to the questions — and is trained to predict the correct output for each input.
  • Fine-tuning can be time-consuming, but it can significantly improve a model's performance on a specific task. For example, fine-tuning a language model on a dataset of questions and answers can improve its ability to answer new questions.
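The loop described above can be sketched in miniature. The toy below is not any particular library's API: it "fine-tunes" a tiny pre-trained linear model on a handful of new labeled examples with plain stochastic gradient descent — the same pattern a real fine-tuning run follows at vastly larger scale. The weights, data, and learning rate are invented for illustration.

```python
# Toy supervised fine-tuning: start from "pre-trained" parameters and
# continue training on a small task-specific labeled dataset.

def predict(w, b, x):
    return w * x + b

def fine_tune(w, b, data, lr=0.05, epochs=500):
    for _ in range(epochs):
        for x, y in data:            # labeled example: (input, target)
            err = predict(w, b, x) - y
            w -= lr * err * x        # gradient of squared error w.r.t. w
            b -= lr * err            # gradient of squared error w.r.t. b
    return w, b

# "Pre-trained" parameters (pretend these came from large-scale pretraining).
w0, b0 = 0.5, 0.0

# New task: the data actually follows y = 2x + 1.
task_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

w, b = fine_tune(w0, b0, task_data)  # parameters adapt to the new task
```

After training, the parameters have moved from their "pre-trained" values toward the ones that fit the new task data.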


Here are some examples of fine-tuning:

  • Fine-tuning OpenAI's base models such as Davinci, Curie, Babbage, and Ada to improve their performance on a variety of tasks, such as generating text, translating languages, and answering questions.
  • Fine-tuning a binary classifier to rate each completion for truthfulness based on expert-labeled examples.
  • Incorporating proprietary content into a language model to improve its ability to provide relevant answers to questions.

Fine-tuning is a powerful technique that can be used to improve the performance of language models on a variety of tasks. If you are looking to improve the performance of a language model on a specific task, fine-tuning is a good option to consider.


Large Language Model (LLM) Ecosystem Explained

The Large Language Model (LLM) ecosystem refers to the various commercial and open-source LLM providers, their offerings, and the tooling that helps accelerate their wide adoption. The functionality of LLMs can be segmented into five areas: Knowledge Answering, Translation, Text Generation, Response Generation, and Classification. There are many options to choose from for all types of language tasks.


LLM Ecosystem explained: Your ultimate Guide to AI | code_your_own_AI
Introduction to the world of LLM (Large Language Models) in April 2023. With detailed explanation of GPT-3.5, GPT-4, T5, Flan-T5 to LLama, Alpaca and KOALA LLM, plus dataset sources and configurations. Including ICL (in-context learning), adapter fine-tuning, PEFT LoRA and classical fine-tuning of LLM explained. When to choose what type of data set for what LLM job?

A comprehensive LLM/AI ecosystem is essential for the creation and implementation of sophisticated AI applications. It facilitates the efficient processing of large-scale data, the development of complex machine learning models, and the deployment of intelligent systems capable of performing complex tasks.

As the field of AI continues to evolve and expand, the importance of a well-integrated and cohesive AI ecosystem cannot be overstated.

A complete overview of today's LLMs and how you can train them for your needs.

Methods For Fine-tuning an LLM

* [https://dr-bruce-cottman.medium.com/part-1-eight-major-methods-for-finetuning-an-llm-6f746c7259ee Part 1: Eight Major Methods For FineTuning an LLM | Bruce Cottman - Medium] ... Gradient-based, LoRA, QLoRA, and four others as advanced variations of ULMFiT: selecting a small subset of the available parameters in a trained LLM
* [https://www.lakera.ai/insights/llm-fine-tuning-guide The Ultimate Guide to LLM Fine Tuning: Best Practices & Tools | Lakera]
* [https://www.simform.com/blog/completeguide-finetuning-llm/ A Complete Guide to Fine Tuning Large Language Models | Hiren Dhaduk]
* [https://www.analyticsvidhya.com/blog/2023/08/fine-tuning-large-language-models/ A Comprehensive Guide to Fine-Tuning Large Language Models | Babina Banjara]
* [https://research.aimultiple.com/llm-fine-tuning/ LLM Fine Tuning Guide for Enterprises in 2023 | Cem Dilmegani]
* [https://www.unite.ai/understanding-llm-fine-tuning-tailoring-large-language-models-to-your-unique-requirements/ Understanding LLM Fine-Tuning: Tailoring Large Language Models to Your Unique Requirements | Aayush Mittal]

Instruction Tuning

* [https://github.com/SinclairCoder/Instruction-Tuning-Papers Instruction-Tuning-Papers | GitHub]
* [https://self-supervised.cs.jhu.edu/sp2023/files/Instruction%20tuning%20of%20LLMs%20-%20Talk@JHU.pdf Instruction Tuning of Large Language Models | Yizhong Wang - Johns Hopkins University (JHU)]
* [https://arxiv.org/abs/2304.03277 Instruction Tuning with GPT-4 | B. Peng, C. Li, P. He, M. Galley, & J. Gao - arXiv]
* [https://smilegate.ai/en/2021/09/12/instruction-tuning-flan/ Instruction tuning – FLAN | Convergence Research Team Hongmae Shim - Smilegate AI]

Instruction tuning is a technique that aims to teach Large Language Models (LLMs) to follow natural language instructions, such as prompts, examples, and constraints, in order to perform better on various Natural Language Processing (NLP) tasks. Instruction tuning can improve the capabilities and controllability of LLMs across different tasks, domains, and modalities. It can also enable LLMs to generalize to unseen tasks by using instructions as a bridge between the pretraining objective and the user's objective.

Instruction tuning involves fine-tuning LLMs with instructional data, which consists of pairs of human-written instructions and desired outputs. For example, an instruction could be “Write a summary of the following article in three sentences” and an output could be “The article discusses the benefits of instruction tuning for large language models. It presents a survey paper that covers the fundamentals, challenges, and applications of this technique. It also introduces a method that leverages LLMs to generate instructional data for themselves.” Instructional data can be collected from various sources, such as existing NLP datasets, expert annotations, or even LLMs themselves.
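In practice, each instruction/output pair is serialized into a single training text. A minimal sketch, assuming an invented prompt template (the "### Instruction" format below is illustrative, not a standard):

```python
# Turn (instruction, input, output) triples into prompt/completion
# training strings for supervised fine-tuning.

TEMPLATE = "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"

def build_example(instruction, input_text, output):
    """Serialize one instruction-tuning example into training fields."""
    prompt = TEMPLATE.format(instruction=instruction, input=input_text)
    return {"prompt": prompt, "completion": output}

ex = build_example(
    "Write a summary of the following article in three sentences.",
    "The article discusses the benefits of instruction tuning ...",
    "Instruction tuning teaches LLMs to follow natural language instructions.",
)
```

During fine-tuning the model would be trained to continue each prompt with its completion.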

LoRA

* [https://arxiv.org/abs/2106.09685 LoRA: Low-Rank Adaptation of Large Language Models | E. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, & W. Chen]

Low-Rank Adaptation (LoRA) is a parameter-efficient technique for fine-tuning Large Language Models (LLMs). Instead of updating all of a model's weights, LoRA freezes the pre-trained weights and injects small trainable low-rank decomposition matrices into selected layers (typically the attention projections). Only these low-rank matrices are trained, which drastically reduces the number of trainable parameters and the memory required for fine-tuning, while approaching the quality of full fine-tuning on many tasks.

LoRA was proposed by a team of researchers from Microsoft in the paper “LoRA: Low-Rank Adaptation of Large Language Models”. For a pre-trained weight matrix W, the method learns an update of the form ΔW = BA, where B and A are low-rank matrices with far fewer parameters than W. Because the update is a plain matrix addition, it can be merged back into W after training, so LoRA adds no extra inference latency.
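LoRA's namesake low-rank update, W' = W + (α/r)·BA, can be shown numerically. A minimal sketch in plain Python (no tensor library; the matrices and the α scaling value are made up for the example):

```python
# LoRA merge: W' = W + (alpha / r) * (B @ A), where A is (r x d_in),
# B is (d_out x r), and the rank r is much smaller than d_in and d_out.

def matmul(X, Y):
    """Naive matrix multiply over nested lists."""
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def lora_merge(W, A, B, alpha):
    r = len(A)                       # rank of the adaptation
    scale = alpha / r
    delta = matmul(B, A)             # (d_out x d_in) low-rank update
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Frozen pre-trained weight (2 x 2) and a rank-1 adapter: only A and B
# (4 numbers here) would be trained, never W itself.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]                     # (r=1) x d_in
B = [[0.5], [1.0]]                   # d_out x (r=1)

W_merged = lora_merge(W, A, B, alpha=1.0)
```

At real scale the savings are dramatic: a rank-8 adapter on a 4096×4096 projection trains about 65k parameters instead of 16.8M.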

Methods that use an LLM to generate its own instruction-tuning data, in the spirit of Self-Instruct, typically follow a two-step process:

  • First, the LLM is prompted to generate instructions for various NLP tasks, such as summarization, sentiment analysis, question answering, etc. The instructions are natural language descriptions of what the model should do given an input text. For example, an instruction for summarization could be “Write a summary of the following article in three sentences”. A set of templates and heuristics guides the model toward diverse and valid instructions.
  • Second, the LLM is prompted to generate instances for each instruction, which are pairs of input texts and desired outputs. The input texts are sampled from a large corpus of text, such as Wikipedia or Common Crawl. The desired outputs are generated by the LLM itself, following the instruction. For example, an instance for summarization could be:

Input: The 2022 FIFA World Cup was the 22nd edition of the FIFA World Cup, the quadrennial international men's association football championship contested by the national teams of the member associations of FIFA. It took place in Qatar from 20 November to 18 December 2022. It was the first World Cup ever to be held in the Arab world and the first in a Muslim-majority country. It was the second World Cup held entirely in Asia after the 2002 tournament in South Korea and Japan. In addition, the tournament was the last to involve 32 teams, with an increase to 48 teams scheduled for the 2026 tournament in Canada, Mexico and the United States.

Output: The 2022 FIFA World Cup was a global football tournament held in Qatar from November to December 2022. It was the first World Cup in the Arab world and in a Muslim-majority country, and the second in Asia. The tournament featured 32 teams for the last time before the expansion to 48 teams in 2026.
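The two prompting steps can be sketched as simple string templates; the prompt wording below is invented for illustration, not taken from any published pipeline:

```python
# Step 1: ask the model for a new task instruction, seeded with examples
# of existing instructions to steer the format.
def instruction_prompt(seed_instructions):
    lines = [f"{i + 1}. {s}" for i, s in enumerate(seed_instructions)]
    return ("Come up with a new NLP task instruction.\n"
            + "\n".join(lines)
            + f"\n{len(seed_instructions) + 1}.")

# Step 2: ask the model to produce the desired output for one generated
# instruction applied to a sampled input text.
def instance_prompt(instruction, input_text):
    return f"Instruction: {instruction}\nInput: {input_text}\nOutput:"

p1 = instruction_prompt(["Write a summary of the article in three sentences.",
                         "Classify the sentiment of the review."])
p2 = instance_prompt("Write a summary of the article in three sentences.",
                     "The 2022 FIFA World Cup was held in Qatar ...")
```

Each prompt would be sent to the LLM, and its continuation becomes the new instruction (step 1) or the desired output (step 2).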

The quality of the generated instructions and instances can be ensured with several techniques, such as:

  • Filtering out invalid or duplicate instructions based on syntactic and semantic criteria, such as length, readability, specificity, and uniqueness.
  • Evaluating the quality of generated instances based on metrics such as fluency, coherence, relevance, and accuracy.
  • Comparing the generated instances with human-written outputs from existing NLP datasets or expert annotations, and selecting the ones that have high similarity or agreement.
  • Applying post-processing steps such as spelling correction, punctuation normalization, and capitalization to improve the readability and consistency of the generated instances.
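The filtering ideas above can be sketched as follows; the word-overlap similarity measure and the thresholds are arbitrary choices for the example:

```python
# Filter generated instructions: drop degenerate ones and near-duplicates.

def word_overlap(a, b):
    """Jaccard similarity over lowercase word sets."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def filter_instructions(candidates, min_words=3, max_sim=0.7):
    kept = []
    for cand in candidates:
        if len(cand.split()) < min_words:
            continue                  # too short to be a useful instruction
        if any(word_overlap(cand, k) > max_sim for k in kept):
            continue                  # near-duplicate of one already kept
        kept.append(cand)
    return kept

cands = [
    "Write a summary of the following article in three sentences.",
    "Write a summary of the following article in three sentences!",  # duplicate
    "Translate the sentence into French.",
    "Summarize.",                                                    # too short
]
kept = filter_instructions(cands)
```

A production pipeline would add the other checks from the list above (fluency scoring, agreement with reference outputs, post-processing) on top of this skeleton.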

QLoRA

ULMFiT

Gradient-based