Transformer-XL
{{#seo:
|title=PRIMO.ai
|titlemode=append
|keywords=artificial, intelligence, machine, learning, models, algorithms, data, singularity, moonshot, Tensorflow, Google, Nvidia, Microsoft, Azure, Amazon, AWS
|description=Helpful resources for your journey with artificial intelligence; videos, articles, techniques, courses, profiles, and tools
}}
[http://www.youtube.com/results?search_query=Transformer-XL+attention+model+ai+deep+learning+model YouTube search...]
[http://www.google.com/search?q=Transformer+XL+attention+model+deep+machine+learning+ML ...Google search]
- BERT
- A Light Introduction to Transformer-XL | Elvis - Medium
- Transformer-XL Explained: Combining Transformers and RNNs into a State-of-the-art Language Model | Rani Horev - Towards Data Science
- Transformer-XL: Language Modeling with Longer-Term Dependency | Z. Dai, Z. Yang, Y. Yang, W.W. Cohen, J. Carbonell, Quoc V. Le, and R. Salakhutdinov
- Natural Language Processing (NLP)
- Memory Networks
- Autoencoder (AE) / Encoder-Decoder
Transformer-XL combines the two leading architectures for language modeling:
- Recurrent architectures such as Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Recurrent Neural Network (RNN), which handle the input tokens (words or characters) one by one to learn the relationships between them
- Attention Mechanism/Model - Transformer Model, which receives a whole segment of tokens and learns the dependencies between them at once using an attention mechanism (see the sketch below)
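A minimal sketch of how these two ideas combine, assuming PyTorch; the class name, parameter names, and sizes are hypothetical, and relative positional encodings from the paper are omitted. It only illustrates segment-level recurrence: hidden states from the previous segment are cached and reused as extra attention context, so tokens can attend beyond the current segment boundary.

<pre>
import torch
import torch.nn as nn

class SegmentRecurrentAttention(nn.Module):
    """Illustrative sketch of segment-level recurrence with attention.
    Not the paper's exact implementation; names and sizes are hypothetical."""

    def __init__(self, d_model=64, n_heads=4, mem_len=32):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mem_len = mem_len

    def forward(self, segment, memory=None):
        # segment: (batch, seg_len, d_model) token representations for the current segment
        # memory:  (batch, mem_len, d_model) cached hidden states from the previous segment
        if memory is None:
            context = segment
        else:
            # Keys/values span the cached memory plus the current segment,
            # so attention can reach past the segment boundary.
            # detach() stops gradients from flowing into the cached states.
            context = torch.cat([memory.detach(), segment], dim=1)
        out, _ = self.attn(query=segment, key=context, value=context)
        # Cache the most recent hidden states as memory for the next segment.
        new_memory = context[:, -self.mem_len:, :]
        return out, new_memory

# Usage: process a long sequence one segment at a time, carrying memory across segments.
layer = SegmentRecurrentAttention()
memory = None
for seg in torch.randn(3, 2, 16, 64):   # 3 segments of 16 tokens each, batch of 2
    out, memory = layer(seg, memory)
</pre>

The recurrence here is over segments rather than individual tokens: each segment is processed in parallel by attention, while the cached memory plays the role the hidden state plays in an RNN, carrying information forward across segments.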