Few Shot Learning

* [[Multitask Learning]]
* [http://medium.com/quick-code/understanding-few-shot-learning-in-machine-learning-bede251a0f67 Understanding few-shot learning in machine learning | Michael J. Garbade]
* [http://towardsdatascience.com/advances-in-few-shot-learning-a-guided-tour-36bc10a68b77 Advances in few-shot learning: a guided tour | Oscar Knagg - Towards Data Science]
** [http://towardsdatascience.com/advances-in-few-shot-learning-reproducing-results-in-pytorch-aba70dee541d Advances in few-shot learning: reproducing results in PyTorch - Towards Data Science]

<youtube>Q-agrHm-ztU</youtube>
<youtube>LMfLRF9VKrc</youtube>
== [http://towardsdatascience.com/advances-in-few-shot-learning-a-guided-tour-36bc10a68b77 Advances in few-shot learning: a guided tour | Oscar Knagg] ==
* [http://towardsdatascience.com/advances-in-few-shot-learning-reproducing-results-in-pytorch-aba70dee541d Advances in few-shot learning: reproducing results in PyTorch | Oscar Knagg - Towards Data Science]
* [http://arxiv.org/pdf/1606.04080.pdf Matching Networks: A differentiable nearest-neighbors classifier]
* [http://arxiv.org/pdf/1703.05175.pdf Prototypical Networks: Learning prototypical representations]
 
=== N-shot, k-way classification tasks ===

The ability of an algorithm to perform few-shot learning is typically measured by its performance on n-shot, k-way tasks. These are run as follows (a minimal sketch of sampling one such task appears after the list):

# A model is given a query sample belonging to a new, previously unseen class.
# It is also given a support set, S, consisting of n examples each from k different unseen classes.
# The algorithm then has to determine which of the support-set classes the query sample belongs to.
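
A minimal sketch of how one such episode could be sampled, assuming a hypothetical <code>examples_by_class</code> dictionary that maps each held-out class label to its list of samples (plain Python; the names are illustrative, not from the article):

<pre>
import random

def sample_episode(examples_by_class, n_shot, k_way):
    """Sample one n-shot, k-way episode from held-out classes.

    examples_by_class: hypothetical dict {class label: [samples]} covering only unseen classes.
    Returns a support set S (n examples for each of k classes), one query sample,
    and the query's ground-truth class.
    """
    classes = random.sample(list(examples_by_class), k_way)        # choose k unseen classes
    query_class = random.choice(classes)                           # the query's true class
    picks = random.sample(examples_by_class[query_class], n_shot + 1)
    support = {c: random.sample(examples_by_class[c], n_shot)      # n examples per class
               for c in classes if c != query_class}
    support[query_class] = picks[:n_shot]
    query = picks[-1]                                               # held out from the support set
    return support, query, query_class
</pre>

Few-shot accuracy is then the fraction of such episodes in which the model, using only the support set, assigns the query to its true class.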

=== Matching Networks ===

Matching Networks combine both embedding and classification to form an end-to-end differentiable nearest-neighbors classifier, in two steps:

# Embed a high-dimensional sample into a low-dimensional space.
# Perform a generalized form of nearest-neighbors classification.

The prediction of the model, ŷ, is the weighted sum of the labels, y_i, of the support set, where the weights are given by a pairwise similarity function, a(x̂, x_i), between the query example, x̂, and the support-set samples, x_i. The labels y_i are one-hot encoded label vectors.
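
Written as an equation (this is just the weighted sum described above, with the index running over all n·k support samples):

<math>\hat{y} = \sum_{i=1}^{n \cdot k} a(\hat{x}, x_i)\, y_i</math>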

Matching Networks are end-to-end differentiable provided the attention function a(x̂, x_i) is differentiable.
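
A minimal PyTorch-style sketch of this prediction step (an illustration, not code from the article): the support and query samples are assumed to be already embedded by some network, and the attention a(x̂, x_i) is taken to be a softmax over cosine similarities, the choice used in the Matching Networks paper.

<pre>
import torch
import torch.nn.functional as F

def matching_net_predict(query_emb, support_emb, support_labels_onehot):
    """Compute the weighted sum of support labels, weighted by attention over the support set.

    query_emb:             (d,)      embedded query x̂
    support_emb:           (n*k, d)  embedded support samples x_i
    support_labels_onehot: (n*k, k)  one-hot labels y_i
    Returns a (k,) vector of class probabilities ŷ.
    """
    # a(x̂, x_i): cosine similarity between the query and each support sample,
    # normalised with a softmax so the weights sum to 1
    sims = F.cosine_similarity(query_emb.unsqueeze(0), support_emb, dim=1)  # (n*k,)
    attention = F.softmax(sims, dim=0)
    # weighted sum of one-hot labels gives a distribution over the k classes
    return attention @ support_labels_onehot
</pre>

Every operation here is differentiable, so the classification loss can be backpropagated through the attention into the embedding network, which is exactly the end-to-end property noted above.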
http://cdn-images-1.medium.com/max/800/1*OkiAPbdYq1utWUGlDGuBKw.png
=== Prototypical Networks ===
