Recurrent Neural Network (RNN)
- State Space Model (SSM) ... Mamba ... Sequence to Sequence (Seq2Seq) ... Recurrent Neural Network (RNN) ... Convolutional Neural Network (CNN)
- Recurrent Neural Network (RNN) Variants:
- Long Short-Term Memory (LSTM)
- Gated Recurrent Unit (GRU)
- Bidirectional Long Short-Term Memory (BI-LSTM)
- Bidirectional Long Short-Term Memory (BI-LSTM) with Attention Mechanism
- Averaged Stochastic Gradient Descent (ASGD) Weight-Dropped LSTM (AWD-LSTM)
- Hopfield Network (HN)
- Attention Mechanism ... Transformer Model ... Generative Pre-trained Transformer (GPT)
- Multimodal Language Models ... Generative Pre-trained Transformer (GPT-4) ... GPT-5
- Sequence to Sequence (Seq2Seq)
- Reservoir Computing (RC) Architecture
- Bidirectional Encoder Representations from Transformers (BERT) ... a better model, but less investment than the larger OpenAI organization
- AI-Powered Search
- Memory ... Memory Networks ... Hierarchical Temporal Memory (HTM) ... Lifelong Learning
- Optimization Methods
- Embedding - projecting an input into another, more convenient representation space; e.g. a word represented by a vector (see the lookup sketch after this list)
- Embedding ... Fine-tuning ... RAG ... Search ... Clustering ... Recommendation ... Anomaly Detection ... Classification ... Dimensional Reduction ... find outliers
- Sentiment Analysis | Stanford’s Sentiment Analysis Demo using Recursive Neural Networks ... Sentiment Analysis
- Artificial Intelligence (AI) ... Generative AI ... Machine Learning (ML) ... Deep Learning ... Neural Network ... Reinforcement ... Learning Techniques
- Artificial General Intelligence (AGI) to Singularity ... Curious Reasoning ... Emergence ... Moonshots ... Explainable AI ... Automated Learning
- Gradient Descent Optimization & Challenges
- Neural Network Zoo | Fjodor Van Veen
- A Beginner's Guide to LSTMs and Recurrent Neural Networks | Chris Nicholson - Pathmind A.I. Wiki
- Handwriting generation demo | Alex Graves
- Animated RNN, LSTM and GRU | Raimi Karim - Towards Data Science
- Large Language Model (LLM) ... Multimodal ... Foundation Models (FM) ... Generative Pre-trained ... Transformer ... GPT-4 ... GPT-5 ... Attention ... GAN ... BERT
- Natural Language Processing (NLP) ... Generation (NLG) ... Classification (NLC) ... Understanding (NLU) ... Translation ... Summarization ... Sentiment ... Tools
- How Wikimedia is using machine learning to spot missing citations | Seth Colaner - VentureBeat
- The Unreasonable Effectiveness of Recurrent Neural Networks | Andrej Karpathy
- An Introduction to Recurrent Neural Networks for Beginners | Victor Zhou - Towards Data Science
- ChatGPT is everywhere. Here’s where it came from | Will Douglas Heaven - MIT Technology Review
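To make the embedding definition in the list above concrete, here is a minimal sketch of an embedding lookup in Python/NumPy. The toy vocabulary, embedding dimension, and randomly initialized table are illustrative assumptions, not part of any particular library or model.

```python
import numpy as np

# Toy vocabulary and embedding table; all names and sizes here are illustrative assumptions.
vocab = {"milk": 0, "cookies": 1, "and": 2}
embedding_dim = 4
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(len(vocab), embedding_dim))  # one row (vector) per word

def embed(tokens):
    """Project each token into the embedding space by a simple table lookup."""
    return np.stack([embedding_table[vocab[t]] for t in tokens])

print(embed(["milk", "and", "cookies"]).shape)  # (3, 4): three words, each a 4-d vector
```

In practice the table is learned during training rather than left random, so that words used in similar contexts end up with nearby vectors.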
Recurrent nets are a type of artificial Neural Network designed to recognize patterns in sequences of data, such as text, genomes, handwriting, the spoken word, or numerical time series data emanating from sensors, stock markets and government agencies. They are arguably the most powerful and useful type of neural network, applicable even to images, which can be decomposed into a series of patches and treated as a sequence. Since recurrent networks possess a certain type of memory, and memory is also part of the human condition, we’ll make repeated analogies to memory in the brain.

Recurrent Neural Networks (RNN) are feedforward neural networks (FFNNs) with a time twist: they are not stateless; they have connections between passes, connections through time. Neurons are fed information not just from the previous layer but also from themselves on the previous pass. This means that the order in which you feed the input and train the network matters: feeding it “milk” and then “cookies” may yield different results compared to feeding it “cookies” and then “milk”. One big problem with RNNs is the vanishing (or exploding) gradient problem, where, depending on the activation functions used, information rapidly gets lost over time, just as very deep FFNNs lose information with depth. Intuitively this wouldn’t seem to be much of a problem because these are just weights and not neuron states, but the weights through time are actually where the information from the past is stored; if a weight reaches a value of 0 or 1,000,000, the previous state won’t be very informative.

RNNs can in principle be used in many fields, as most forms of data that don’t actually have an intrinsic timeline (unlike sound or video) can still be represented as a sequence. A picture or a string of text can be fed one pixel or character at a time, so the time-dependent weights are used for what came before in the sequence, not for what happened x seconds before. In general, recurrent networks are a good choice for advancing or completing information, such as autocompletion. Elman, Jeffrey L. “Finding structure in time.” Cognitive Science 14.2 (1990): 179-211.
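As a rough illustration of the “connections through time” described above, the following is a minimal sketch of an Elman-style RNN forward pass in NumPy. The weight shapes, random initialization, and toy input sequence are assumptions chosen only for demonstration.

```python
import numpy as np

# Minimal Elman-style RNN step; weight shapes and values are illustrative assumptions.
input_size, hidden_size = 4, 8
rng = np.random.default_rng(1)
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input  -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden -> hidden: the connection through time
b_h = np.zeros(hidden_size)

def rnn_forward(inputs):
    """Feed a sequence step by step; the hidden state carries information between passes."""
    h = np.zeros(hidden_size)
    for x in inputs:                      # order matters: "milk" then "cookies" != "cookies" then "milk"
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
    return h                              # final state summarizes the whole sequence

sequence = rng.normal(size=(5, input_size))   # e.g. five embedded tokens
print(rnn_forward(sequence).shape)            # (8,)
```

Because the same recurrent weights are applied at every step, backpropagated gradients are repeatedly multiplied through them, which is the intuition behind the vanishing and exploding gradient problem mentioned above.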
A Bidirectional Recurrent Neural Network (BiRNN) looks exactly the same as its unidirectional counterpart. The difference is that the network is not just connected to the past, but also to the future. Schuster, Mike, and Kuldip K. Paliwal. “Bidirectional recurrent neural networks.” IEEE Transactions on Signal Processing 45.11 (1997): 2673-2681.
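A minimal sketch of the bidirectional idea, under the same illustrative assumptions as the RNN example above: one recurrent cell reads the sequence left-to-right, a second reads it right-to-left, and their hidden states are concatenated at each step so every position sees both past and future context.

```python
import numpy as np

# Bidirectional sketch: a forward-reading cell and a backward-reading cell; sizes are illustrative.
input_size, hidden_size = 4, 8
rng = np.random.default_rng(2)

def make_cell():
    """Return (W_xh, W_hh, b) for one simple recurrent cell."""
    return (rng.normal(scale=0.1, size=(hidden_size, input_size)),
            rng.normal(scale=0.1, size=(hidden_size, hidden_size)),
            np.zeros(hidden_size))

def run(cell, inputs):
    """Run one recurrent pass and return the hidden state at every step."""
    W_xh, W_hh, b = cell
    h, states = np.zeros(hidden_size), []
    for x in inputs:
        h = np.tanh(W_xh @ x + W_hh @ h + b)
        states.append(h)
    return states

fwd_cell, bwd_cell = make_cell(), make_cell()
seq = rng.normal(size=(5, input_size))
fwd = run(fwd_cell, seq)                     # reads the past up to each step
bwd = run(bwd_cell, seq[::-1])[::-1]         # reads the future back to each step, then re-aligned
bi_states = [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]
print(bi_states[0].shape)                    # (16,): each step combines both directions
```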
From RNN to Long Short-Term Memory (LSTM) & Gated Recurrent Unit (GRU)