Skip to content

Commit 40b45f9

Browse files
authored
Update README.rst
1 parent e9879a0 commit 40b45f9

1 file changed

Lines changed: 17 additions & 1 deletion

File tree

README.rst

Lines changed: 17 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1060,7 +1060,7 @@ Deep Learning
10601060
Deep Neural Networks
10611061
-----------------------------------------
10621062

1063-
Deep Neural Networks' architecture is designed to learn by multi connection of layers that each single layer only receives connection from previous and provides connections only to the next layer in hidden part. The input is a connection of feature space (As discussed in Section Feature_extraction with first hidden layer. For Deep Neural Networks (DNN), input layer could be tf-ifd, word embedding, or etc. as shown in standard DNN in Figure~\ref{fig:DNN}. The output layer is number of classes for multi-class classification and only one output for binary classification. But our main contribution of this paper is that we have many training DNN for different purposes. In our techniques, we have multi-classes DNNs which each learning models is generated randomly (number of nodes in each layer and also number of layers are completely random assigned). Our implementation of Deep Neural Networks (DNN) is discriminative trained model that uses standard back-propagation algorithm using sigmoid or ReLU as activation function. The output layer for multi-class classification, should use Softmax.
1063+
Deep Neural Networks' architecture is designed to learn by multi connection of layers that each single layer only receives connection from previous and provides connections only to the next layer in hidden part. The input is a connection of feature space (As discussed in Section Feature_extraction with first hidden layer. For Deep Neural Networks (DNN), input layer could be tf-ifd, word embedding, or etc. as shown in standard DNN in Figure. The output layer is number of classes for multi-class classification and only one output for binary classification. But our main contribution of this paper is that we have many training DNN for different purposes. In our techniques, we have multi-classes DNNs which each learning models is generated randomly (number of nodes in each layer and also number of layers are completely random assigned). Our implementation of Deep Neural Networks (DNN) is discriminative trained model that uses standard back-propagation algorithm using sigmoid or ReLU as activation function. The output layer for multi-class classification, should use Softmax.
10641064

10651065

10661066
.. image:: docs/pic/DNN.png
@@ -1245,8 +1245,24 @@ Recurrent Neural Networks (RNN)
12451245

12461246
.. image:: docs/pic/RNN.png
12471247

1248+
Another neural network architecture that addressed with researchers for text miming and classification is Recurrent Neural Networks (RNN). RNN assigns more weights to the previous data points of sequence. Therefore, this technique is a powerful method for text, string and sequential data classification. Moreover, this technique could be used for image classification as we did in this work. In RNN the neural net considers the information of previous nodes in a very sophisticated method which allows for better semantic analysis of structures of dataset.
1249+
1250+
1251+
Gated Recurrent Unit (GRU)
1252+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1253+
1254+
Gated Recurrent Unit (GRU) is a gating mechanism for RNN which was introduced by `J. Chung et al. <https://arxiv.org/abs/1412.3555>`__ and `K.Cho et al. <https://arxiv.org/abs/1406.1078>`__. GRU is a simplified variant of the LSTM architecture, but there are differences as follows: GRU contains two gates, a GRU does not possess internal memory (as shown in Figure; and finally, a second non-linearity is not applied (tanh in Figure).
1255+
12481256
.. image:: docs/pic/LSTM.png
12491257

1258+
Long Short-Term Memory (LSTM)
1259+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1260+
1261+
Long Short-Term Memory~(LSTM) was introduced by `S. Hochreiter and J. Schmidhuber <https://www.mitpressjournals.org/doi/abs/10.1162/neco.1997.9.8.1735>`__ and developed by many research scientists.
1262+
1263+
To deal with these problems Long Short-Term Memory (LSTM) is a special type of RNN that preserve long term dependency in a more effective way in comparison to the basic RNN. This is particularly useful to overcome vanishing gradient problem. Although LSTM has a chain-like structure similar to RNN, LSTM uses multiple gates to carefully regulate the amount of information that will be allowed into each node state. Figure shows the basic cell of a LSTM model.
1264+
1265+
12501266

12511267
import packages:
12521268

0 commit comments

Comments
 (0)