Solving toxicity on the internet using AI

What is an LSTM?

An LSTM, or Long Short-Term Memory network, is an improvement on the recurrent neural network (RNN), which learns by passing a hidden state along with the input through each step of a sequence. LSTMs are useful when our problem requires us to remember both recent events and events from further in the past. For example, let’s say we have a dataset about forest organisms with a fox, a bear, and an image that could be either a wolf or a dog. We want our network to predict that the image is a wolf by taking a hint from the previous images.

Our example Credit: Udacity
A more likely version of events Credit: Udacity
How our RNN works Credit: Udacity
An overview of an LSTM Credit: Udacity

Basics of LSTMs

Let’s expand a bit more on how the LSTM works with long-term and short-term memory. Let an elephant represent the long-term memory, a fish the short-term memory, and the wolf/dog our event. Inside an LSTM there are four gates: the forget gate, the learn gate, the remember gate, and the use gate.

Our four gates Credit: Udacity
What our outputs are like after running them through an LSTM Credit: Udacity
Credit: Udacity

The architecture of LSTMs

Note: The next few sections will require some calculus and linear algebra knowledge.

The learn gate

Let’s go back to our example of forest organisms. Remember that the learn gate combines the event and the short-term memory, then discards part of the result. So how does it work mathematically?

What the equation looks like Credit: Udacity
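
As a rough sketch of the learn gate in code, assuming the usual formulation where the previous short-term memory and the event are concatenated, passed through a tanh layer to get the candidate information N_t, and then scaled by a sigmoid "ignore factor" i_t. The function and parameter names (`learn_gate`, `W_n`, `b_n`, `W_i`, `b_i`) are my own illustrative choices, and the weights would normally be learned during training rather than passed in by hand.

```python
import math


def sigmoid(x):
    """Squash a number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, x, b):
    """Compute W @ x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def learn_gate(stm_prev, event, W_n, b_n, W_i, b_i):
    """Combine short-term memory and event, then keep only part of the result."""
    joined = stm_prev + event  # concatenation [STM_{t-1}, E_t]
    new_info = [math.tanh(v) for v in affine(W_n, joined, b_n)]  # N_t
    ignore = [sigmoid(v) for v in affine(W_i, joined, b_i)]      # i_t
    return [n * i for n, i in zip(new_info, ignore)]             # N_t * i_t


# Toy values, chosen only for illustration.
learned = learn_gate([0.5], [1.0], [[1.0, 1.0]], [0.0], [[0.0, 0.0]], [0.0])
```

Because the ignore factor comes from a sigmoid, each entry of the candidate is multiplied by a number between 0 and 1, which is how the gate "forgets" part of what it just combined.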

Forget gate

Remember, the forget gate takes the long-term memory and decides what to keep and what to forget. Mathematically, it works much like the learn gate.

The equation for the forget gate Credit: Udacity
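
A minimal sketch of the forget gate, under the assumption that the "forget factor" f_t is computed by a sigmoid layer over the previous short-term memory and the event, and then multiplied elementwise into the previous long-term memory. The names `forget_gate`, `W_f`, and `b_f` are hypothetical.

```python
import math


def sigmoid(x):
    """Squash a number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, x, b):
    """Compute W @ x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def forget_gate(ltm_prev, stm_prev, event, W_f, b_f):
    """Scale each entry of the long-term memory by a factor between 0 and 1."""
    joined = stm_prev + event  # concatenation [STM_{t-1}, E_t]
    forget = [sigmoid(v) for v in affine(W_f, joined, b_f)]  # f_t
    return [l * f for l, f in zip(ltm_prev, forget)]         # LTM_{t-1} * f_t


# With all-zero weights the forget factor is exactly 0.5 for every entry.
kept = forget_gate([2.0, -4.0], [0.1], [0.2],
                   [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
```

A forget factor near 1 keeps that part of the long-term memory almost intact, while a factor near 0 wipes it out.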

The remember gate

The remember gate takes the outputs of the learn and forget gates and combines them into a new long-term memory. Mathematically, this is the easiest concept yet!

Remember gate equation Credit: Udacity
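
In code, the remember gate can be sketched as a plain elementwise sum, assuming it receives what the forget gate kept (LTM_{t-1} * f_t) and what the learn gate produced (N_t * i_t). The function name `remember_gate` is my own.

```python
def remember_gate(forget_out, learn_out):
    """New long-term memory: what was kept plus what was just learned."""
    return [f + l for f, l in zip(forget_out, learn_out)]


# Toy values, chosen only for illustration.
new_ltm = remember_gate([1.0, -2.0], [0.25, 0.5])
```

There are no weights here at all, which is why this gate is the simplest of the four.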

The use gate

Like the other gates, the use gate combines the long-term and short-term memory, but this time it produces a new short-term memory and an output. In terms of equations, this one is a bit more involved.

The use gate equation Credit: Udacity
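
Here is a hedged sketch of the use gate. I am assuming the formulation in which the long-term memory that survived the forget gate goes through a tanh layer (U_t), the short-term memory and event go through a sigmoid layer (V_t), and the two are multiplied to give the new short-term memory, which doubles as the output. Other presentations of the LSTM write this step slightly differently, and all names below (`use_gate`, `W_u`, `b_u`, `W_v`, `b_v`) are hypothetical.

```python
import math


def sigmoid(x):
    """Squash a number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, x, b):
    """Compute W @ x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def use_gate(forget_out, stm_prev, event, W_u, b_u, W_v, b_v):
    """Produce the new short-term memory, which is also the cell's output."""
    joined = stm_prev + event  # concatenation [STM_{t-1}, E_t]
    u = [math.tanh(x) for x in affine(W_u, forget_out, b_u)]  # U_t
    v = [sigmoid(x) for x in affine(W_v, joined, b_v)]        # V_t
    return [a * b for a, b in zip(u, v)]                      # STM_t = U_t * V_t


# Toy values, chosen only for illustration.
new_stm = use_gate([1.0], [0.0], [0.0], [[1.0]], [0.0], [[0.0, 0.0]], [0.0])
```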

LSTMs on Toxic Comments

Note: This section requires Python knowledge, and some NLP experience would be helpful, as I’m going to show screenshots of my work. I encourage you to recreate this code and play around with the parameters.

Building a vocabulary and dictionary
Labeling the words
Giving our sentences number values as discussed previously
Final steps before we generate the model
Our final AI
Calculating when to stop training our model
A pretty normal graph
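
If you want to recreate the preprocessing steps, here is a minimal standard-library sketch of the general idea: build a vocabulary from the comments, map each word to a number, and pad every comment to the same length. This is not the exact code from the screenshots. The helper names (`build_vocab`, `encode`), the whitespace tokenization, the `<pad>`/`<unk>` tokens, and the left-padding choice are all assumptions made for illustration.

```python
from collections import Counter

PAD, UNK = 0, 1  # reserved ids for padding and unknown words (assumed convention)


def build_vocab(comments, max_words=None):
    """Map each word to an integer id, most frequent words first."""
    counts = Counter(word for c in comments for word in c.lower().split())
    vocab = {"<pad>": PAD, "<unk>": UNK}
    for word, _ in counts.most_common(max_words):
        vocab[word] = len(vocab)
    return vocab


def encode(comment, vocab, seq_len):
    """Turn a comment into a fixed-length list of word ids."""
    ids = [vocab.get(w, UNK) for w in comment.lower().split()][:seq_len]
    return [PAD] * (seq_len - len(ids)) + ids  # left-pad short comments


# Toy comments, invented for this example.
comments = ["you are great", "you are awful awful"]
vocab = build_vocab(comments)
encoded = encode("you are nice", vocab, seq_len=5)
```

Words the vocabulary has never seen (like "nice" above) fall back to the unknown id, and every encoded comment has the same length, so a batch of them can be fed to the model as a single array.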


LSTMs are great for natural language processing, and in the end we were able to achieve 98.38% accuracy in classifying toxic and unwanted comments.


While I wrote the code for the LSTM myself, the original project idea was not mine. All credit goes to this GitHub project, which gave links to the dataset I used and helped out a lot with data preprocessing. The visuals used when explaining LSTMs were from this Udacity course.


