ABSTRACT
The beauty of the Recurrent Neural Network (RNN) lies in its diversity of applications: RNNs can handle a wide range of input and output types. A simple machine learning model, or an Artificial Neural Network, can learn to predict a stock price from features such as trading volume and opening value. But the price also depends on how the stock fared in previous days and weeks; for a trader, this historical data is a major deciding factor when making predictions. When faced with such time-sensitive data, we can lean on another concept: Recurrent Neural Networks. RNNs can also be used for sequence prediction, a problem that involves using historical sequence information to predict the next value or values in the sequence.
INTRODUCTION
Recurrent Neural Networks (RNNs) are a class of Artificial Neural Networks that process a sequence of inputs in deep learning while retaining state from one element of the sequence to the next. A traditional neural network, in contrast, processes an input and moves on to the next one without considering its order. Data such as time series have a sequential order that must be followed to be understood. Traditional feed-forward networks cannot comprehend this because each input is assumed to be independent of the others, whereas in a time series each input depends on the previous ones. This raises the need for RNNs.
RNNs are extremely popular in the deep learning space, which makes learning them all the more imperative. A few applications of RNNs include:
- Speech Recognition
- Machine translation
- Music Composition
- Handwriting recognition
- Grammar learning
PROBLEM STATEMENT
Using an RNN, we will focus on a sequence prediction problem. For this article we use a sine wave: it has a regular, predictable pattern and so is easy to model. The task: given a sequence of 50 numbers belonging to a sine wave, predict the 51st number of the series.
CODING
Preparing the data is the first step. Our model expects a single sequence of length 50 as input. Therefore, the input data will have the shape:
(no_of_records x len_of_sequence x type_of_sequence)
Since we have only one type of sequence, the sine wave, the value of type_of_sequence is 1. The output will also be a single value, i.e., the 51st value of the series.
First, import all the necessary libraries.
Using the sine function, we create a sine wave of length 200 and then visualize it. The wave starts at 0.
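A minimal sketch of the data generation (variable names such as `sin_wave` are assumptions, not necessarily those of the original code):

```python
import math

import numpy as np

# Generate 200 points of a sine wave; the wave starts at sin(0) = 0
sin_wave = np.array([math.sin(x) for x in np.arange(200)])

# Optional: visualize the first 50 points, e.g. with matplotlib
# import matplotlib.pyplot as plt
# plt.plot(sin_wave[:50]); plt.show()
```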
Two arrays, X and Y, are initialized. The sequence length is 50 and the number of records is 150. The model is trained on the first 100 records; the remaining 50 records are used for validation. Therefore, the shape of the X array is (100, 50) and the shape of the Y array is (100,).
Using np.expand_dims, the dimension of both arrays is increased by one. Therefore, array X has shape (100, 50, 1) and array Y has shape (100, 1). The expand_dims() function expands the shape of an array by inserting a new axis at the given position in the expanded array shape.
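One way to build the training arrays, under the variable names assumed above:

```python
import math

import numpy as np

sin_wave = np.array([math.sin(x) for x in np.arange(200)])

seq_len = 50
num_records = len(sin_wave) - seq_len  # 150 records in total

X, Y = [], []
# First 100 records: each input is 50 consecutive values, the target is the 51st
for i in range(num_records - 50):
    X.append(sin_wave[i:i + seq_len])
    Y.append(sin_wave[i + seq_len])

X = np.array(X)                # shape (100, 50)
Y = np.array(Y)                # shape (100,)

# Add a trailing axis for the single sequence type
X = np.expand_dims(X, axis=2)  # shape (100, 50, 1)
Y = np.expand_dims(Y, axis=1)  # shape (100, 1)
```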
Next, the remaining 50 records are prepared in the same way for validation, so both validation arrays have length 50.
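The validation split can be sketched the same way (again with assumed names, regenerating the wave so the snippet stands alone):

```python
import math

import numpy as np

sin_wave = np.array([math.sin(x) for x in np.arange(200)])
seq_len = 50
num_records = len(sin_wave) - seq_len  # 150

X_val, Y_val = [], []
# Records 100..149 are held out for validation
for i in range(num_records - 50, num_records):
    X_val.append(sin_wave[i:i + seq_len])
    Y_val.append(sin_wave[i + seq_len])

X_val = np.expand_dims(np.array(X_val), axis=2)  # shape (50, 50, 1)
Y_val = np.expand_dims(np.array(Y_val), axis=1)  # shape (50, 1)
```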
CREATE THE ARCHITECTURE FOR RNN MODEL
Next we create the RNN model. We define all the variables and functions needed by the model. The RNN will take in the input sequence, process it through a hidden layer of 100 units, and produce a single-valued output.
The number of epochs is 25 and the sequence length is 50.
We have defined the weights where,
- U is the weight matrix for the weights between the input and hidden layers.
- V is the weight matrix for the weights between the hidden and output layers.
- W is the weight matrix for the weights in the RNN layer (hidden layer).
We use the sigmoid function as the activation function in the hidden layer.
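A sketch of this setup, following a common from-scratch convention in which U maps a length-T input vector (one timestep filled in at a time) to the hidden layer; the exact initialization is an assumption:

```python
import numpy as np

learning_rate = 0.0001
nepoch = 25       # number of training epochs
T = 50            # length of the input sequence
hidden_dim = 100  # units in the hidden layer
output_dim = 1

# Weight matrices, initialized uniformly at random
U = np.random.uniform(0, 1, (hidden_dim, T))           # input  -> hidden
W = np.random.uniform(0, 1, (hidden_dim, hidden_dim))  # hidden -> hidden (recurrent)
V = np.random.uniform(0, 1, (output_dim, hidden_dim))  # hidden -> output

def sigmoid(z):
    """Activation function for the hidden layer."""
    return 1 / (1 + np.exp(-z))
```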
TRAIN THE MODEL
The training process is divided into smaller steps:
1. Check the loss on training data
    a. Forward Pass
    b. Calculate Error
2. Check the loss on validation data
    a. Forward Pass
    b. Calculate Error
3. Start actual training
    a. Forward Pass
    b. Backpropagate Error
    c. Update weights
We repeat these steps until convergence. If the model starts to overfit, stop, or simply pre-define the number of epochs.
Step 1) Check the loss on training and validation data
First, we do a forward pass through the RNN model and calculate the squared error of the predictions to get the loss value over all records. We calculate the loss on the validation data in the same loop.
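A self-contained sketch of the loss check (data and weights are regenerated here so the snippet runs on its own; names are assumptions):

```python
import math

import numpy as np

# Data and weights as set up earlier
sin_wave = np.array([math.sin(x) for x in np.arange(200)])
T, hidden_dim, output_dim = 50, 100, 1
X = np.expand_dims(np.array([sin_wave[i:i + T] for i in range(100)]), axis=2)
Y = np.expand_dims(np.array([sin_wave[i + T] for i in range(100)]), axis=1)
U = np.random.uniform(0, 1, (hidden_dim, T))
W = np.random.uniform(0, 1, (hidden_dim, hidden_dim))
V = np.random.uniform(0, 1, (output_dim, hidden_dim))

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

loss = 0.0
for i in range(Y.shape[0]):
    x, y = X[i], Y[i]
    prev_s = np.zeros((hidden_dim, 1))   # initial hidden state
    for t in range(T):
        new_input = np.zeros(x.shape)    # feed one timestep at a time
        new_input[t] = x[t]
        s = sigmoid(np.dot(U, new_input) + np.dot(W, prev_s))
        mulv = np.dot(V, s)              # linear output
        prev_s = s
    loss += ((y - mulv) ** 2 / 2).item() # squared error on the final output
loss = loss / float(Y.shape[0])
# The validation loss is computed the same way over X_val, Y_val
```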
We get the following output when we run the above code.
Step 2) Training Process
To calculate the error, we do a forward pass; to calculate the gradients, we do a backward pass and then update the weights.
In the forward pass:
- First, we multiply the input with the weights between the input and hidden layers (U).
- We add to this the product of the recurrent weights (W) and the hidden state from the previous timestep. This is how we capture the knowledge of the previous timestep.
- We pass the sum through the sigmoid activation function to get the new hidden state.
- We multiply the hidden state by the weights between the hidden and output layers (V).
- The output layer has a linear activation, so we do not explicitly pass the value through an activation function.
- The state at the current timestep, along with the state at the previous timestep, is saved in a dictionary.
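The forward pass for a single sequence can be sketched as follows, with the per-timestep states saved for use in backpropagation (the `layers` list of dictionaries is an assumed name):

```python
import math

import numpy as np

sin_wave = np.array([math.sin(x) for x in np.arange(200)])
T, hidden_dim, output_dim = 50, 100, 1
x = np.expand_dims(sin_wave[:T], axis=1)       # one input sequence, shape (50, 1)
U = np.random.uniform(0, 1, (hidden_dim, T))
W = np.random.uniform(0, 1, (hidden_dim, hidden_dim))
V = np.random.uniform(0, 1, (output_dim, hidden_dim))

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

layers = []                                    # per-timestep states for backprop
prev_s = np.zeros((hidden_dim, 1))
for t in range(T):
    new_input = np.zeros(x.shape)
    new_input[t] = x[t]
    mulu = np.dot(U, new_input)                # input multiplied by weights U
    mulw = np.dot(W, prev_s)                   # previous state times recurrent weights W
    s = sigmoid(mulu + mulw)                   # hidden state at timestep t
    mulv = np.dot(V, s)                        # linear output, no extra activation
    layers.append({'s': s, 'prev_s': prev_s})
    prev_s = s
```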
Backpropagate error
We calculate the gradients at each layer and backpropagate the errors through the network.
Update weights
After calculating the gradients of the weights, we update the weights.
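Putting the three sub-steps together, a self-contained sketch of the training loop with full backpropagation through time (this is one possible formulation; implementations of this kind often also clip gradients, which is omitted here for brevity):

```python
import math

import numpy as np

# Data and weights regenerated so the snippet runs on its own
sin_wave = np.array([math.sin(x) for x in np.arange(200)])
T, hidden_dim, output_dim = 50, 100, 1
X = np.expand_dims(np.array([sin_wave[i:i + T] for i in range(100)]), axis=2)
Y = np.expand_dims(np.array([sin_wave[i + T] for i in range(100)]), axis=1)
U = np.random.uniform(0, 1, (hidden_dim, T))
W = np.random.uniform(0, 1, (hidden_dim, hidden_dim))
V = np.random.uniform(0, 1, (output_dim, hidden_dim))
learning_rate = 0.0001

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

for epoch in range(2):  # 2 epochs for the sketch; the article uses 25
    for i in range(Y.shape[0]):
        x, y = X[i], Y[i]

        # Forward pass, saving the state at every timestep
        prev_s = np.zeros((hidden_dim, 1))
        states, inputs = [prev_s], []
        for t in range(T):
            new_input = np.zeros(x.shape)
            new_input[t] = x[t]
            inputs.append(new_input)
            prev_s = sigmoid(np.dot(U, new_input) + np.dot(W, prev_s))
            states.append(prev_s)
        mulv = np.dot(V, states[-1])

        # Backpropagate the error through time
        d_mulv = mulv - y                      # gradient of the squared error
        dV = np.dot(d_mulv, states[-1].T)
        ds = np.dot(V.T, d_mulv)               # gradient w.r.t. the final hidden state
        dU, dW = np.zeros_like(U), np.zeros_like(W)
        for t in range(T - 1, -1, -1):
            dz = ds * states[t + 1] * (1 - states[t + 1])  # through the sigmoid
            dU += np.dot(dz, inputs[t].T)
            dW += np.dot(dz, states[t].T)
            ds = np.dot(W.T, dz)               # pass gradient to the previous state

        # Update the weights
        U -= learning_rate * dU
        V -= learning_rate * dV
        W -= learning_rate * dW
```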
We get the following output when we run the code.
GET PREDICTIONS
To get our predictions, we do a forward pass using the trained weights, first on the training data.
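A sketch of the prediction loop (here with freshly initialized weights as stand-ins for the trained U, W, V, so the snippet runs on its own):

```python
import math

import numpy as np

sin_wave = np.array([math.sin(x) for x in np.arange(200)])
T, hidden_dim = 50, 100
X = np.expand_dims(np.array([sin_wave[i:i + T] for i in range(100)]), axis=2)
Y = np.expand_dims(np.array([sin_wave[i + T] for i in range(100)]), axis=1)
# Stand-ins for the trained weights
U = np.random.uniform(0, 1, (hidden_dim, T))
W = np.random.uniform(0, 1, (hidden_dim, hidden_dim))
V = np.random.uniform(0, 1, (1, hidden_dim))

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

preds = []
for i in range(Y.shape[0]):
    x = X[i]
    prev_s = np.zeros((hidden_dim, 1))
    for t in range(T):
        new_input = np.zeros(x.shape)
        new_input[t] = x[t]
        prev_s = sigmoid(np.dot(U, new_input) + np.dot(W, prev_s))
    preds.append(np.dot(V, prev_s))
preds = np.array(preds)                  # shape (100, 1, 1)

# Plot predictions against the ground truth, e.g. with matplotlib:
# plt.plot(preds[:, 0, 0], 'g'); plt.plot(Y[:, 0], 'r'); plt.show()
```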
Plotting these predictions, we get:
Now, we need to check the model for overfitting, so we use the validation set.
The output for the above code will be:
The RMSE (Root Mean Square Error) on the validation data is calculated with the following code:
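A sketch of the validation check; the RMSE is computed here with plain NumPy (the original code may use a library helper such as sklearn's mean_squared_error instead), and random weights again stand in for the trained ones:

```python
import math

import numpy as np

sin_wave = np.array([math.sin(x) for x in np.arange(200)])
T, hidden_dim = 50, 100
X_val = np.expand_dims(np.array([sin_wave[i:i + T] for i in range(100, 150)]), axis=2)
Y_val = np.expand_dims(np.array([sin_wave[i + T] for i in range(100, 150)]), axis=1)
U = np.random.uniform(0, 1, (hidden_dim, T))       # stand-ins for trained weights
W = np.random.uniform(0, 1, (hidden_dim, hidden_dim))
V = np.random.uniform(0, 1, (1, hidden_dim))

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

val_preds = []
for i in range(Y_val.shape[0]):
    x = X_val[i]
    prev_s = np.zeros((hidden_dim, 1))
    for t in range(T):
        new_input = np.zeros(x.shape)
        new_input[t] = x[t]
        prev_s = sigmoid(np.dot(U, new_input) + np.dot(W, prev_s))
    val_preds.append(np.dot(V, prev_s))
val_preds = np.array(val_preds)                    # shape (50, 1, 1)

# Root Mean Square Error on the validation set
rmse = math.sqrt(np.mean((Y_val[:, 0] - val_preds[:, 0, 0]) ** 2))
```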
CONCLUSION
RNNs are useful for sequence prediction and for working with sequence data in general. We created an RNN model from scratch, using a sine wave as the sequence: given a single sequence of length 50, the model predicts the 51st term of the series. The process is divided into four parts, namely, data preparation, creating the architecture for the RNN model, training the model, and getting the predictions. In the data preparation phase, a sine wave is generated to obtain the input values. The RNN model then takes in the input sequence and processes it through a hidden layer. The model is trained using a forward pass to calculate the loss over all records; with the error calculated, the weights are updated through backpropagation to produce the desired output.
CODE LINK : RNN Program Code
AUTHORS :
Dilnaz N
Merlin Nissi Babu
Aswathi