Animated RNN, LSTM and GRU

By Raimi Bin Karim

Recurrent neural networks are a class of artificial neural networks that are often used with sequential data. The three most common types are the vanilla recurrent neural network (RNN), long short-term memory (LSTM) and the gated recurrent unit (GRU).

There are many illustrated diagrams for recurrent neural networks out there. My personal favourite is the one by Michael Nguyen in this article published in Towards Data Science, because it gives us intuition on these models and, more importantly, beautiful illustrations that make them easy to understand. The motivation behind my post, however, is to better visualise what happens inside these cells: how the nodes are shared and how they are transformed to give the output nodes. I was also inspired by Michael’s nice animations.

This article looks into vanilla RNN, LSTM and GRU cells. It is a short read, meant for those who have already read up on these topics. (I recommend reading Michael’s article before reading this post.) It is important to note that the animations below play step by step to guide the eye, but this does not reflect the order of operations in vectorised machine computation.

Here is the legend that I have used for the illustrations.

Fig. 0: Legend for animations

In my animations, I have used an input size of 3 (green) and 2 hidden units (red) with a batch size of 1.

Let’s begin!

Fig. 1: Animated RNN cell
  • t — time step
  • X — input
  • h — hidden state
  • length of X — size/dimension of input
  • length of h — number of hidden units. Different libraries name this differently, but the names mean the same thing:
    - Keras — state_size, units
    - PyTorch — hidden_size
    - TensorFlow — num_units
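
To make Fig. 1 concrete, here is a minimal NumPy sketch of one common formulation of the vanilla RNN step, h_t = tanh(W_x X_t + W_h h_{t-1} + b), using the sizes from my animations (input size 3, 2 hidden units, batch size 1). The names `rnn_cell`, `W_x`, `W_h` and `b`, the random weights and the 4-step sequence are assumptions made for illustration; they are not taken from any library.

```python
import numpy as np

def rnn_cell(x_t, h_prev, W_x, W_h, b):
    """One time step of a vanilla RNN cell (illustrative sketch)."""
    # Mix the current input with the previous hidden state, then squash with tanh.
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

input_size, hidden_size = 3, 2           # sizes used in the animations
rng = np.random.default_rng(0)           # fixed seed, arbitrary choice

# Hypothetical, randomly initialised parameters.
W_x = rng.normal(size=(hidden_size, input_size))
W_h = rng.normal(size=(hidden_size, hidden_size))
b = np.zeros(hidden_size)

xs = rng.normal(size=(4, input_size))    # a made-up sequence of 4 time steps
h = np.zeros(hidden_size)                # initial hidden state
for x_t in xs:
    h = rnn_cell(x_t, h, W_x, W_h, b)    # the same weights are reused at every step

print(h.shape)                           # length of h equals the number of hidden units
```

Note that the length of h is fixed by the shapes of the weights, which is exactly what `units`, `hidden_size` or `num_units` controls in the libraries above.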

Fig. 2: Animated LSTM cell

Note that the dimension of the cell state is the same as that of the hidden state.
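
As a rough companion to Fig. 2, the sketch below writes out one step of the standard LSTM equations in NumPy, again with input size 3 and 2 hidden units. The function name `lstm_cell`, the single stacked weight matrix `W`, and the gate order (forget, input, candidate, output) are assumptions for illustration; real libraries lay out their weights differently.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x_t, h_prev, c_prev, W, b):
    """One time step of an LSTM cell (illustrative sketch)."""
    n = h_prev.shape[0]
    # All four gates read the same concatenated vector [h_prev, x_t].
    z = W @ np.concatenate([h_prev, x_t]) + b
    f = sigmoid(z[0 * n:1 * n])      # forget gate
    i = sigmoid(z[1 * n:2 * n])      # input gate
    g = np.tanh(z[2 * n:3 * n])      # candidate cell values
    o = sigmoid(z[3 * n:4 * n])      # output gate
    c_t = f * c_prev + i * g         # new cell state
    h_t = o * np.tanh(c_t)           # new hidden state
    return h_t, c_t

input_size, hidden_size = 3, 2
rng = np.random.default_rng(0)

# Hypothetical stacked parameters: 4 gates, each of size hidden_size.
W = rng.normal(size=(4 * hidden_size, hidden_size + input_size))
b = np.zeros(4 * hidden_size)

h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
for x_t in rng.normal(size=(4, input_size)):   # made-up 4-step sequence
    h, c = lstm_cell(x_t, h, c, W, b)

print(h.shape, c.shape)
```

Because the gates multiply the cell state element-wise, c and h must have the same length, which is the point made above.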

Fig. 3: Animated GRU cell
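
For comparison with Fig. 3, here is a minimal NumPy sketch of one GRU step with the same sizes. The names `gru_cell`, `W_z`, `W_r`, `W_h` and the biases are assumptions for illustration. Note also that conventions differ on the final interpolation: some references and libraries swap the roles of z and 1 - z, so treat the last line as one possible choice rather than the definitive one.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_cell(x_t, h_prev, W_z, W_r, W_h, b_z, b_r, b_h):
    """One time step of a GRU cell (illustrative sketch)."""
    hx = np.concatenate([h_prev, x_t])
    z = sigmoid(W_z @ hx + b_z)      # update gate
    r = sigmoid(W_r @ hx + b_r)      # reset gate
    # The candidate state sees the previous hidden state scaled by the reset gate.
    h_cand = np.tanh(W_h @ np.concatenate([r * h_prev, x_t]) + b_h)
    # Interpolate between the old state and the candidate (one common convention).
    return (1.0 - z) * h_prev + z * h_cand

input_size, hidden_size = 3, 2
rng = np.random.default_rng(0)

# Hypothetical, randomly initialised parameters.
shape = (hidden_size, hidden_size + input_size)
W_z, W_r, W_h = (rng.normal(size=shape) for _ in range(3))
b_z = b_r = b_h = np.zeros(hidden_size)

h = np.zeros(hidden_size)
for x_t in rng.normal(size=(4, input_size)):   # made-up 4-step sequence
    h = gru_cell(x_t, h, W_z, W_r, W_h, b_z, b_r, b_h)

print(h.shape)
```

Unlike the LSTM, there is no separate cell state here: the hidden state h is the only thing carried from one time step to the next.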

Thanks to Derek and Ren Jie for ideas, suggestions and corrections to this article.

Follow me on Twitter @remykarem for digested articles and demos on AI and Deep Learning.