Difference between revisions of "Transformer"

 
http://skymind.ai/images/wiki/attention_mechanism.png
 
  
<youtube>W2rWgXJBZhU</youtube>
 
<youtube>SysgYptB198</youtube>
 
<youtube>quoGRI-1l0A</youtube>
 
<youtube>omHLeV1aicw</youtube>
 
 
<youtube>IxQtK2SjWWM</youtube>
 
<youtube>XrZ_Y4koV5A</youtube>
 
<youtube>OYygPG4d9H0</youtube>
 
<youtube>QuvRWevJMZ4</youtube>

Revision as of 13:57, 29 June 2019


Transformer Model - The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an Autoencoder (AE) / Encoder-Decoder configuration. The best-performing models also connect the encoder and decoder through an attention mechanism. The authors propose a new, simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Source: Attention Is All You Need | A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, L. Kaiser, and I. Polosukhin
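
The summary above does not spell out what an attention mechanism computes. As a rough illustration, here is a minimal sketch of scaled dot-product attention, the core operation described in the cited paper: softmax(Q K^T / sqrt(d_k)) V. It uses plain Python lists rather than a tensor library, and the function names and toy inputs are illustrative assumptions, not taken from the paper or from any particular implementation.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of row vectors.

    queries: list of vectors of length d_k
    keys:    list of vectors of length d_k (one per value)
    values:  list of vectors of length d_v
    """
    d_k = len(keys[0])
    d_v = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(d_v)
        ])
    return outputs


# Toy inputs (hypothetical): 2 queries attending over 3 key/value pairs.
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
result = scaled_dot_product_attention(queries, keys, values)
```

Every query is processed independently of the others, with no recurrent state carried from one position to the next, which is what makes this kind of model easier to parallelize than a recurrent network. A full Transformer adds learned projections, multiple attention heads, positional encodings, and feed-forward layers, none of which are shown here.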
