Lecture 7
Generative Models
Supervised learning: data x and labels y. Learn a function that maps x -> y.
Unsupervised learning: just data, no labels. Learn some underlying hidden structure of the data.
Discriminative vs Generative Models
Discriminative Model: learn a probability distribution $p(y \mid x)$
Generative Model: learn a probability distribution $p(x)$
Conditional Generative Model: learn $p(x \mid y)$
Density Function
$p(x)$ assigns a positive number to each possible $x$; higher numbers mean $x$ is more likely.
Normalized: $\int_X p(x)\,dx = 1$
Different values of $x$ compete for density.
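To make normalization and competition for density concrete, here is a minimal sketch on a discrete toy domain of four outcomes. The domain, the scores, and the helper name `normalize` are assumptions for illustration only; real image spaces are vastly larger and continuous.

```python
def normalize(scores):
    """Turn positive scores into a density that sums to 1."""
    total = sum(scores.values())
    return {x: s / total for x, s in scores.items()}

# Hypothetical unnormalized positive scores for four possible values of x.
scores = {"a": 4.0, "b": 2.0, "c": 1.0, "d": 1.0}
p = normalize(scores)

# Raising the score of one value lowers the density of every other value,
# because the total mass is fixed at 1.
scores_boosted = dict(scores, a=12.0)
p_boosted = normalize(scores_boosted)

print(p)
print(p_boosted)
```

Because the total is pinned to 1, increasing the density of "a" necessarily takes density away from "b", "c", and "d".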
Discriminative: the density $p(y \mid x)$ assigns a positive number to each possible label $y$ for a given input $x$; higher numbers mean that label is more likely. The possible labels for each input compete for probability mass, but there is no competition between images.
There is no way for the model to handle unreasonable inputs; it must give a label distribution for every image.
Generative Model: all possible images compete with each other for probability mass.
Requires deep image understanding. The model can reject unreasonable inputs by assigning them small density values.
Conditional Generative Model: each possible label induces a competition among all images.
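Recall Bayes' rule:
$$P(x \mid y) = \frac{P(y \mid x)}{P(y)} \, P(x)$$
So a conditional generative model $P(x \mid y)$ can be built from a discriminative model $P(y \mid x)$, a generative model $P(x)$, and a prior over labels $P(y)$.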
Discriminative -> Assign labels to data. Feature learning (with labels).
Generative -> Detect outliers. Feature learning (without labels). Sample to generate new data.
Conditional -> Assign labels, while rejecting outliers. Generate new data conditioned on input labels.
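The relationship between the three model types via Bayes' rule can be checked numerically. The sketch below is a toy example under stated assumptions: a made-up world with three possible "images" and two labels, with hypothetical probability tables standing in for trained models.

```python
# Hypothetical toy world: three possible "images" and two labels.
labels = ["cat", "dog"]

# Stand-in for a generative model p(x): all images compete for mass.
p_x = {"img0": 0.5, "img1": 0.3, "img2": 0.2}

# Stand-in for a discriminative model p(y | x): labels compete per image.
p_y_given_x = {
    "img0": {"cat": 0.9, "dog": 0.1},
    "img1": {"cat": 0.2, "dog": 0.8},
    "img2": {"cat": 0.5, "dog": 0.5},
}

# Label prior p(y) = sum over x of p(y | x) p(x).
p_y = {y: sum(p_y_given_x[x][y] * p_x[x] for x in p_x) for y in labels}

# Bayes' rule: p(x | y) = p(y | x) p(x) / p(y).
p_x_given_y = {
    y: {x: p_y_given_x[x][y] * p_x[x] / p_y[y] for x in p_x}
    for y in labels
}

for y in labels:
    print(y, p_x_given_y[y])
```

For each label, the resulting values over images sum to 1, which is the "each label induces a competition among all images" property of a conditional generative model.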
Taxonomy of Generative Models

Autoregressive Model
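Goal: write down an explicit function for $p(x) = f(x, W)$
Given dataset $x^{(1)}, x^{(2)}, \dots, x^{(N)}$, train the model by solving:
Maximize probability of training data: $W^* = \arg\max_W \prod_i p(x^{(i)})$
Log trick to exchange product for sum: $W^* = \arg\max_W \sum_i \log p(x^{(i)})$
This is the loss function, trained with gradient descent: $W^* = \arg\max_W \sum_i \log f(x^{(i)}, W)$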
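As a minimal sketch of this maximum-likelihood recipe, the example below fits a categorical distribution over three values by gradient ascent on the mean log-likelihood. The model (a softmax over logits `w`), the dataset, the learning rate, and the function names are all assumptions chosen for illustration, not the models discussed in the lecture.

```python
import math

def softmax(w):
    """Map logits to a normalized probability vector."""
    m = max(w)
    exps = [math.exp(v - m) for v in w]
    z = sum(exps)
    return [e / z for e in exps]

def log_likelihood(w, data):
    """Sum of log p(x) over the dataset (the log trick: product -> sum)."""
    probs = softmax(w)
    return sum(math.log(probs[x]) for x in data)

def fit(data, num_values, lr=0.1, steps=2000):
    """Gradient ascent on the mean log-likelihood of a softmax model."""
    w = [0.0] * num_values
    counts = [data.count(k) for k in range(num_values)]
    n = len(data)
    for _ in range(steps):
        probs = softmax(w)
        # d/dw_k of mean log-likelihood = empirical frequency - model probability
        grad = [counts[k] / n - probs[k] for k in range(num_values)]
        w = [w[k] + lr * grad[k] for k in range(num_values)]
    return w

# Hypothetical dataset of ten samples over the values {0, 1, 2}.
data = [0, 0, 0, 0, 0, 1, 1, 1, 2, 2]
w_star = fit(data, num_values=3)
print(softmax(w_star))
```

For this simple model the maximum-likelihood solution is the empirical frequency of each value, so the fitted probabilities should approach 0.5, 0.3, and 0.2.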
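Assume $x$ consists of multiple subparts: $x = (x_1, x_2, \dots, x_T)$
Break down the probability using the chain rule: $p(x) = p(x_1, x_2, \dots, x_T) = p(x_1)\, p(x_2 \mid x_1)\, p(x_3 \mid x_1, x_2) \cdots = \prod_{t=1}^{T} p(x_t \mid x_1, \dots, x_{t-1})$
Each factor is the probability of the next subpart given all previous subparts.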
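The chain-rule factorization can be sketched with a toy autoregressive model over binary sequences. The conditional used here (a sigmoid of the number of ones seen so far) is a made-up stand-in for a learned network, and the names `cond_prob_one` and `log_prob` are hypothetical.

```python
import itertools
import math

def cond_prob_one(prefix):
    """Hypothetical p(x_t = 1 | x_1..x_{t-1}); depends on the ones seen so far."""
    return 1.0 / (1.0 + math.exp(-(0.5 * sum(prefix) - 0.25)))

def log_prob(x):
    """log p(x) as a sum of log conditionals, one per subpart."""
    total = 0.0
    for t in range(len(x)):
        p1 = cond_prob_one(x[:t])
        total += math.log(p1 if x[t] == 1 else 1.0 - p1)
    return total

# Summing p(x) over every binary sequence of length T checks normalization.
T = 4
total_mass = sum(
    math.exp(log_prob(x)) for x in itertools.product([0, 1], repeat=T)
)
print(total_mass)
```

Because each conditional is itself a normalized distribution over the next subpart, the product is automatically a normalized distribution over whole sequences; no separate normalization step is needed.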
Pixel RNN
Generate image pixels one at a time, starting at upper left corner.
Compute a hidden state for each pixel that depends on the hidden states and RGB values of the pixels to the left and above.
At each pixel, predict red, then blue, then green: softmax over $[0, 1, \dots, 255]$
Each pixel depends implicitly on all pixels above and to the left.
Problem: very slow during both training and testing. Pixels on the same anti-diagonal can be processed in parallel, but an N x N image still requires 2N-1 sequential steps.
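The 2N-1 count follows from the dependency structure alone: a pixel needs its left and upper neighbors, so all pixels with the same row + column index can be processed together. The sketch below models only that schedule, not the recurrent network itself; `diagonal_schedule` is a hypothetical helper name.

```python
def diagonal_schedule(n):
    """Group pixel coordinates (row, col) of an n x n image by anti-diagonal.

    Pixels in the same group depend only on pixels from earlier groups,
    so each group corresponds to one sequential step.
    """
    steps = [[] for _ in range(2 * n - 1)]
    for row in range(n):
        for col in range(n):
            steps[row + col].append((row, col))
    return steps

schedule = diagonal_schedule(4)
for step, pixels in enumerate(schedule):
    print(step, pixels)
```

For a 4 x 4 image this gives 7 sequential steps covering all 16 pixels, starting from the upper left corner.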