Understanding how Long Short-Term Memory networks (LSTMs) work
Let’s walk through the actual mechanism of LSTMs. We will first take an overall view of an LSTM cell and then discuss each of the operations taking place within it, along with a text generation example.
This article is an excerpt from the book Natural Language Processing with TensorFlow by Thushan Ganegedara. The book emphasizes both the theory and practice of natural language processing: it introduces the reader to existing TensorFlow functions, explains how to apply them when writing NLP algorithms, and uses specific examples to make the concepts and techniques concrete.
LSTMs are mainly composed of the following three gates:
Input gate: A gate that outputs values between 0 (the current input is not written to the cell state) and 1 (the current input is fully written to the cell state). A sigmoid activation squashes the output into the 0-to-1 range.
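As a minimal sketch of how such a gate is computed, the snippet below evaluates an input gate as a sigmoid over a weighted combination of the current input and the previous hidden state. The parameter names (W_i, U_i, b_i) and the toy dimensions are illustrative assumptions, not taken from the book, but the pattern matches the standard LSTM formulation i_t = sigmoid(W_i x_t + U_i h_{t-1} + b_i):

```python
import numpy as np

def sigmoid(x):
    # Squashes any real value into the (0, 1) range,
    # which is exactly what the input gate needs
    return 1.0 / (1.0 + np.exp(-x))

# Toy dimensions, chosen for illustration only
input_size, hidden_size = 3, 4
rng = np.random.default_rng(0)

# Hypothetical input-gate parameters: W_i acts on the current input x_t,
# U_i on the previous hidden state h_prev, and b_i is a bias vector
W_i = rng.standard_normal((hidden_size, input_size))
U_i = rng.standard_normal((hidden_size, hidden_size))
b_i = np.zeros(hidden_size)

x_t = rng.standard_normal(input_size)      # current input
h_prev = rng.standard_normal(hidden_size)  # previous hidden state

# Input gate: each element lies strictly between 0 (block this input
# dimension) and 1 (write it fully to the cell state)
i_t = sigmoid(W_i @ x_t + U_i @ h_prev + b_i)
```

Because the sigmoid never reaches exactly 0 or 1, the gate in practice scales each dimension of the candidate input rather than switching it fully on or off.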