In-Context Learning (ICL)



In-context learning (ICL) is a paradigm in Natural Language Processing (NLP) in which a Large Language Model (LLM) makes predictions based on a context augmented with just a few training examples. LLMs are able to extract patterns from the examples provided in the context and use them to perform many complex NLP tasks. - In-context Learning - A New Paradigm in NLP? | The Global NLP Lab



Why can LLMs learn to do something entirely new by merely being shown a few input-output examples?

Prompt: Given the criteria for determining whether something is true or false, based on whether or not it is found in the sky.
bird is true, plane is true, car is false, truck is false,
is train true or false?
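
As a sketch of how such a few-shot prompt can be assembled programmatically, the snippet below builds the prompt above from labeled examples. The query_llm function is a hypothetical placeholder, not a real client library; swap in whichever completion endpoint or LLM client you actually use.

```python
# Minimal sketch: building the few-shot "found in the sky" prompt shown above.
# query_llm is a hypothetical placeholder, not a real client library call.

labeled_examples = [
    ("bird", "true"),
    ("plane", "true"),
    ("car", "false"),
    ("truck", "false"),
]

def build_prompt(examples, query):
    """Concatenate the task description, the demonstrations, and the query."""
    lines = ["Given the criteria for determining whether something is true or "
             "false, based on whether or not it is found in the sky."]
    lines += [f"{item} is {label}," for item, label in examples]
    lines.append(f"is {query} true or false?")
    return "\n".join(lines)

def query_llm(prompt):
    """Placeholder: replace with a call to the LLM of your choice."""
    raise NotImplementedError

prompt = build_prompt(labeled_examples, "train")
print(prompt)                  # the exact text the model is shown
# answer = query_llm(prompt)   # expected completion: "train is true"
```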



Tips

  • Use relevant and informative examples. The examples you provide to the model should be relevant to the task you want it to learn, and they should be informative enough for the model to learn from.
  • Provide a variety of examples. The more varied the examples you provide, the better the model will be able to generalize to new data.
  • Use clear and concise prompts. The prompts you give to the model should be clear and concise, and they should accurately reflect the task you want the model to perform.
  • Experiment with different prompt formats. There is no one-size-fits-all prompt format for ICL. Experiment with different formats to see what works best for your task and model.
  • Use adequate compute. ICL can be computationally expensive, since prompts carrying demonstrations are long and the underlying models are large, so make sure you have sufficient resources to serve and query the model.
  • Use demonstrations that are similar to the target task (one simple way to select them is sketched after this list). This will help the model to learn the task more quickly and efficiently.
  • Use a small number of demonstrations. A small number of well-chosen demonstrations can be more effective than a large number of irrelevant demonstrations.
  • Use demonstrations that are representative of the target data distribution. This will help the model to generalize to new data more effectively.
  • Use a pre-trained model. Pre-trained models have already learned a lot about the world, which can give them a head start on learning new tasks using ICL.
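
One simple way to act on the tips about relevance and similarity is to retrieve the candidate demonstrations closest to the target input. The sketch below uses TF-IDF cosine similarity from scikit-learn; the candidate pool, query strings, and choice of similarity measure are illustrative assumptions, not a prescribed method.

```python
# Minimal sketch: selecting the k demonstrations most similar to the query.
# The candidate pool and query here are made up for illustration.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

candidate_pool = [
    "Translate 'good morning' to French.",
    "Summarize this paragraph about photosynthesis.",
    "Translate 'thank you very much' to French.",
    "Classify the sentiment of this movie review.",
]
query = "Translate 'see you tomorrow' to French."

# Vectorize candidates and query together so they share one vocabulary.
vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(candidate_pool + [query])

# Similarity of the query (last row) to every candidate (all other rows).
similarities = cosine_similarity(matrix[-1], matrix[:-1]).ravel()

# Keep the k most similar candidates as demonstrations for the prompt.
k = 2
top_indices = similarities.argsort()[::-1][:k]
demonstrations = [candidate_pool[i] for i in top_indices]
print(demonstrations)
```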

Why is ICL exciting?

In-context learning has enormous potential to unlock more flexible, general, and human-like intelligence in AI systems. Some reasons it is generating so much interest:

  • Versatility - With ICL, a single model can learn a wide variety of skills at the same time, instead of needing separate training for each one.
  • Generalization - ICL allows models to learn underlying rules and patterns from just a few examples, and generalize them to new situations.
  • Efficiency - No lengthy or costly re-training of models is needed. Skills can be acquired instantly.
  • Accessibility - ICL enables AI systems that can be taught by everyday users through simple demonstrations of the task.

In short, ICL enables LLMs to become powerful systems that can continually learn, reason, and adapt to new tasks.


Mystery: Why does ICL work so well?

There have been a few studies aiming to uncover this in the literature.

  • One factor that might play a role is the distribution of the training data. The ICL ability of LLMs seems to emerge when training on very large datasets in which examples appear in clusters (bursts) and a sufficiently large number of rare classes is present.
  • Another factor is that Transformer models might implicitly encode learning algorithms during training, due to the properties of their architecture. During inference, a transformer LLM might then be performing an implicit fine-tuning using the examples provided in the context.

In-context learning is also appealing in practice because it allows users to quickly build models for a new use case without worrying about fine-tuning and storing new parameters for each task. It typically requires very few training examples to get a prototype working, and the natural language interface is intuitive even for non-experts. - How does in-context learning work? A framework for understanding the differences from traditional supervised learning | Sang Michael Xie and Sewon Min - The Stanford AI Lab



Pretraining vs Fine-tuning vs In-context Learning (ICL) of LLMs



  • Pretraining ... training a model on a large dataset to learn general language patterns. This is usually done using an unsupervised or self-supervised objective, such as predicting the next word in a sequence.
  • Fine-tuning ... taking a pretrained model and further training it on a smaller dataset specific to a particular task. This allows the model to adapt to the task while still leveraging the general language patterns learned during pretraining.
  • In-context learning (ICL) ... using a pretrained model to make predictions based on contexts augmented with just a few training examples. The model extracts patterns from the examples provided in the context and uses them to perform many complex NLP tasks without any additional training.
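
To make the contrast concrete, here is a minimal sketch of the three regimes. Every function and data structure in it is an illustrative placeholder rather than a real training or inference API; the point is only that pretraining and fine-tuning change the model's parameters, while ICL leaves them frozen and changes only the prompt.

```python
# Minimal sketch contrasting pretraining, fine-tuning, and ICL.
# All names below are illustrative placeholders, not a real library API.

def pretrain(corpus):
    """Learn general language patterns from a large unlabeled corpus.
    Produces the model's parameters (weights)."""
    weights = {"layers": "...learned via next-word prediction..."}
    return weights

def finetune(weights, task_dataset):
    """Continue gradient training on a small task-specific dataset.
    The parameters are updated and must be stored per task."""
    return dict(weights, task_head="...updated on labeled examples...")

def run_frozen_model(weights, prompt):
    """Stand-in for decoding with the frozen pretrained model."""
    return "<model completion>"

def icl_predict(weights, demonstrations, query):
    """No parameter updates: demonstrations are simply concatenated with
    the query in the prompt, and the frozen model completes it."""
    prompt = "\n".join(f"{x} -> {y}" for x, y in demonstrations)
    prompt += f"\n{query} ->"
    return run_frozen_model(weights, prompt)

# Usage: the same pretrained weights, adapted two different ways.
w = pretrain("large web corpus")
w_task = finetune(w, [("great movie", "positive")])   # new weights per task
print(icl_predict(w, [("great movie", "positive"),
                      ("boring plot", "negative")],
                  "loved it"))                         # weights unchanged
```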


In-context Reinforcement Learning
