s_t = \tanh(W_{hx} x_t + W_{hh} s_{t-1} + b_t)

o_t = \mathrm{softmax}(W_{ho} s_t)

INTERNATIONAL JOURNAL OF MATHEMATICS AND COMPUTERS IN SIMULATION, Volume 11, 2017, ISSN: 1998-0159
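The forward pass above can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function name, the logit shift, and all matrix shapes are our own assumptions, not part of the paper.

```python
import numpy as np

def rnn_step(x_t, s_prev, W_hx, W_hh, W_ho, b):
    """One vanilla-RNN step: s_t = tanh(W_hx x_t + W_hh s_prev + b),
    followed by a softmax readout (illustrative sketch)."""
    s_t = np.tanh(W_hx @ x_t + W_hh @ s_prev + b)
    z = W_ho @ s_t
    z = z - z.max()                        # shift logits for numerical stability
    o_t = np.exp(z) / np.exp(z).sum()      # softmax output distribution
    return s_t, o_t
```

The returned o_t is a probability vector over the output classes, while s_t is carried to the next time step.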
forget its previous state. This latter feature allows the cell to capture more complex temporal patterns. The forward propagation equations characterizing the LSTM gates, including the candidate and forget state updates, read as follows: where x_t represents the input to the memory cell, W_i, W_f, W_c, W_o, U_i, U_f, U_c, U_o are the weight matrices, and b_i, b_f, b_c, b_o are the biases.

Figure 2: Long-short term memory cell diagram

IV. RNN ARCHITECTURES - C. GRU

In what follows we consider the Gated Recurrent Units (GRUs). Basically, GRUs are meant to solve the same training problem affecting plain RNN architectures. In particular, they use the same gating approach that defines LSTMs, but merge the input gate with the forget gate, and likewise merge the cell state with the hidden state. The result is a lighter model that is supposed to train faster, while also performing slightly better on some tasks, see [1]. The typical GRU cell diagram can be represented as in figure 3.

Figure 3: Gated recurrent unit network diagram

The forward propagation equations of the typical GRU gates read as follows: where x_t, h_t, z_t, r_t are, respectively, the input, the output, the update gate, and the reset gate vectors, W, U represent the parameter matrices, and b_z, b_r, b_h are the biases.

We would also like to mention the comparison between RNNs, LSTMs, GRUs and other variants of RNN architectures: the final result clearly shows that all three basically produce the same performance.

V. DATA PREPROCESSING

In what follows we focus our attention on the GOOGL stock prices, exploiting daily data for the last five years, i.e. 2012-2016, see figure 4. Our goal is to forecast the direction of the stock's movements on the basis of historical data. In particular, we consider a typical time window of 30 days of open price, high price, low price, close price and volume (OHLCV) data.
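A minimal sketch of how such 30-day OHLCV windows could be built from a daily series, assuming a stride-1 sliding window over a (days, 5) array; the function name and the stride are our assumptions:

```python
import numpy as np

def make_windows(ohlcv, window=30):
    """Slice a (T, 5) array of daily OHLCV rows into overlapping
    fixed-length windows; stride 1 is an assumption of this sketch."""
    T = ohlcv.shape[0]
    return np.stack([ohlcv[i:i + window] for i in range(T - window + 1)])
```

For T daily rows this yields T - 29 windows, each of shape (30, 5).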
The first step consists in rescaling our windows, e.g., by normalizing them, or by a Min-Max type scaling. To perform our analysis we have considered the interval [−1, 1]. This is because, as we will see later, NNs with hyperbolic tangent
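The Min-Max rescaling of a window into [−1, 1] described above can be sketched as follows; rescaling each OHLCV column independently is our assumption:

```python
import numpy as np

def minmax_scale(window, lo=-1.0, hi=1.0):
    """Rescale each feature column of a window linearly into [lo, hi].
    Assumes no column is constant (a constant column has zero range)."""
    mn = window.min(axis=0)
    mx = window.max(axis=0)
    return lo + (window - mn) * (hi - lo) / (mx - mn)
```

After scaling, every column spans exactly [−1, 1], matching the range of the hyperbolic tangent activation.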
i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)

c^{in}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)

f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)

o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)

c_t = f_t \cdot c_{t-1} + i_t \cdot c^{in}_t

h_t = o_t \cdot \tanh(c_t)
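The six LSTM equations above translate directly into code. The following NumPy sketch packs the weights into dicts for brevity; that packing, and all shapes, are our own conventions rather than the paper's:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step following the gate equations above; W, U, b are
    dicts keyed by gate name ('i', 'c', 'f', 'o')."""
    i_t   = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])   # input gate
    cin_t = np.tanh(W['c'] @ x_t + U['c'] @ h_prev + b['c'])   # candidate state
    f_t   = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])   # forget gate
    o_t   = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])   # output gate
    c_t = f_t * c_prev + i_t * cin_t     # elementwise cell-state update
    h_t = o_t * np.tanh(c_t)             # hidden state / cell output
    return h_t, c_t
```

Note that the gate products are elementwise, while the W and U terms are matrix-vector products.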
z_t = \sigma(W_z x_t + U_z h_{t-1} + b_z)
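Only the update-gate equation is reproduced above. The sketch below completes the cell with the standard GRU reset gate and candidate state; that completion is an assumption on our part, not a transcription of the paper, and the dict packing of weights is again our own convention:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, W, U, b):
    """One GRU step; z_t matches the update-gate equation above, while
    r_t and the candidate state follow the standard GRU formulation."""
    z_t = sigmoid(W['z'] @ x_t + U['z'] @ h_prev + b['z'])     # update gate
    r_t = sigmoid(W['r'] @ x_t + U['r'] @ h_prev + b['r'])     # reset gate
    h_cand = np.tanh(W['h'] @ x_t + U['h'] @ (r_t * h_prev) + b['h'])
    h_t = (1.0 - z_t) * h_prev + z_t * h_cand  # blend old state and candidate
    return h_t
```

With a single state vector h_t playing the role of both cell and hidden state, the GRU indeed carries fewer parameters than the LSTM cell above.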