Sequence Model
Used when the input and/or output is a sequence. Applied to tasks such as speech recognition, music generation, sentiment classification, DNA sequence analysis, machine translation, video activity recognition, etc.
Notation:
$x^{\langle t \rangle}$: input. We use $x^{\langle t \rangle}$ to denote the $t$-th element of the input sequence.
$y^{\langle t \rangle}$: output. We use $y^{\langle t \rangle}$ to denote the $t$-th element of the output sequence.
$T$ can be used to denote length. $T_x$, $T_y$ denote the lengths of the input and the output.
$(x^{(i)}, y^{(i)})$ can be used to denote the $i$-th training/testing example. $T_x^{(i)}$ denotes its input length. For example, for the sentence "the cat sat", $T_x = 3$ and $x^{\langle 2 \rangle}$ is the word "cat".
Word representation: we can build a vocabulary (a vector of all words in the dictionary), and then represent each word as a one-hot vector over that vocabulary.
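A minimal sketch of this representation, using a hypothetical tiny vocabulary (real vocabularies typically have tens of thousands of words):

```python
import numpy as np

# Hypothetical tiny vocabulary for illustration only.
vocab = ["a", "and", "cat", "sat", "the"]
word_to_index = {w: i for i, w in enumerate(vocab)}

def one_hot(word, vocab_size):
    """Return a one-hot column vector: all zeros except a 1 at the word's index."""
    v = np.zeros((vocab_size, 1))
    v[word_to_index[word]] = 1.0
    return v

x = one_hot("cat", len(vocab))  # 1 at index 2, zeros elsewhere
```

Each word in a sentence then becomes one such vector $x^{\langle t \rangle}$, fed to the model one position at a time.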
RNN
Problems with a standard NN: inputs and outputs can have different lengths in different examples, and it doesn't share features learned across different positions of the text.
RNN:
The activation computed for the previous input (word) is fed into the next step, so the output at one position can affect subsequent positions. (Unidirectional RNN: only earlier inputs influence later outputs.)
Forward prop: Initialize $a^{\langle 0 \rangle} = \vec{0}$. Then $a^{\langle t \rangle} = g(W_{aa} a^{\langle t-1 \rangle} + W_{ax} x^{\langle t \rangle} + b_a)$ and $\hat{y}^{\langle t \rangle} = g(W_{ya} a^{\langle t \rangle} + b_y)$, where $g$ is an activation function, $W_{aa}$ is the weight matrix learned for computing the activation from the previous activation, $W_{ax}$ is the weight matrix learned for computing the activation from $x$, and so on.
Common activation for $a^{\langle t \rangle}$: tanh, sometimes ReLU. For $\hat{y}^{\langle t \rangle}$, it can be sigmoid or another function (e.g. softmax), depending on the problem.
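The forward prop above can be sketched in NumPy. This is a minimal illustration, not an optimized implementation; the dimensions and random weights are hypothetical, and sigmoid is assumed for the output:

```python
import numpy as np

def rnn_forward(x_seq, Waa, Wax, Wya, ba, by):
    """Forward prop through a unidirectional RNN.
    x_seq: list of input vectors x<t>, each of shape (n_x, 1).
    Returns the lists of activations a<t> and outputs y_hat<t>."""
    n_a = Waa.shape[0]
    a = np.zeros((n_a, 1))  # a<0> initialized to zeros
    activations, outputs = [], []
    for x in x_seq:
        # a<t> = tanh(Waa a<t-1> + Wax x<t> + ba)
        a = np.tanh(Waa @ a + Wax @ x + ba)
        # y_hat<t> = sigmoid(Wya a<t> + by)
        y_hat = 1.0 / (1.0 + np.exp(-(Wya @ a + by)))
        activations.append(a)
        outputs.append(y_hat)
    return activations, outputs

# Hypothetical tiny dimensions for illustration.
rng = np.random.default_rng(0)
n_x, n_a, n_y, T = 3, 4, 2, 5
Waa = rng.standard_normal((n_a, n_a)) * 0.1
Wax = rng.standard_normal((n_a, n_x)) * 0.1
Wya = rng.standard_normal((n_y, n_a)) * 0.1
ba, by = np.zeros((n_a, 1)), np.zeros((n_y, 1))
x_seq = [rng.standard_normal((n_x, 1)) for _ in range(T)]
a_list, y_list = rnn_forward(x_seq, Waa, Wax, Wya, ba, by)
```

Note how the same weights $W_{aa}$, $W_{ax}$, $W_{ya}$ are reused at every time step: this is what lets the RNN share features across positions.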