# Lecture 9

## GANs

GANs give up on explicitly modeling p(x), but allow us to draw samples from p(x).

Assume we have data $$x\_i$$ drawn from a distribution $$p\_{data}(x)$$; we want to sample from $$p\_{data}$$.

Idea: Introduce a latent variable z with a simple prior p(z). Sample $$z \sim p(z)$$ and pass it to a generator network $$x = G(z)$$. Then x is a sample from the generator distribution $$p\_G$$. We want $$p\_G = p\_{data}$$.
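
A minimal sketch of this sampling step; the MLP architecture, latent dimension, and data dimension below are illustrative assumptions, not from the lecture:

```python
# Sample z from a simple prior and map it through a generator to get x = G(z).
# The small MLP and the sizes here are assumptions for illustration only.
import torch
import torch.nn as nn

latent_dim, data_dim = 64, 784  # assumed sizes, e.g. flattened 28x28 images

G = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, data_dim), nn.Tanh(),   # outputs scaled to [-1, 1]
)

z = torch.randn(16, latent_dim)  # z ~ p(z), here a standard Gaussian prior
x_fake = G(z)                    # samples from the generator distribution p_G
```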

<figure><img src="/files/hO6nTUW9rxamy2743ldo" alt=""><figcaption><p>GANs</p></figcaption></figure>

Training Objective

<figure><img src="/files/vKdqG0e3SYkdpw11vgNE" alt=""><figcaption><p>Objective</p></figcaption></figure>

The objective above is a minimax game, $$\min\_G \max\_D V(G,D)$$, trained using alternating gradient updates (a minimal training-loop sketch follows the update rules below).

For t in 1, ..., T:

1. update D. $$D = D + \alpha\_D \frac{\partial V}{\partial D}$$
2. update G. $$G = G - \alpha\_G \frac{\partial V}{\partial G}$$
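
A minimal PyTorch-style sketch of these alternating updates, assuming the generator `G` and `latent_dim` from the sketch above, a discriminator `D` that ends in a sigmoid, and a `data_loader` defined elsewhere; binary cross-entropy on D's output implements the $$\log D(x)$$ and $$\log(1-D(G(z)))$$ terms:

```python
# Alternating GAN updates: ascend V in D, descend V in G.
# G, D, latent_dim, and data_loader are assumed to be defined as above.
import torch
import torch.nn.functional as F

opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)

for x_real in data_loader:                      # one pass over the data per epoch
    z = torch.randn(x_real.size(0), latent_dim)

    # 1. Update D: maximize log D(x) + log(1 - D(G(z)))
    #    (equivalently, minimize the two BCE terms below)
    d_real = D(x_real)
    d_fake = D(G(z).detach())                   # do not backprop into G here
    loss_D = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) + \
             F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # 2. Update G: minimize log(1 - D(G(z)))
    #    BCE with target 0 equals -log(1 - D(G(z))), so negate it.
    d_fake = D(G(z))
    loss_G = -F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
```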

At the start of training, the generator is very bad and the discriminator can easily tell real from fake, so D(G(z)) is close to 0.

Problem: vanishing gradient for G.

Solution: In the original objective, G is trained to minimize log(1 - D(G(z))). Instead, train G to minimize -log(D(G(z))). Then G gets a strong gradient at the start of training.
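
A sketch of this swap in the same setting as the training loop above: the non-saturating generator loss is just binary cross-entropy against the "real" label.

```python
# Non-saturating generator update: minimize -log D(G(z)) instead of log(1 - D(G(z))).
# Variables follow the training-loop sketch above.
d_fake = D(G(z))
loss_G = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))  # = -log D(G(z))
opt_G.zero_grad(); loss_G.backward(); opt_G.step()
```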

The minimax game achieves its global minimum when $$p\_G = p\_{data}$$:

$$\min\_G \max\_D (E\_{x \sim p\_{data}}\[\log D(x)] + E\_{z \sim p(z)} \[\log (1-D(G(z)))])$$

$$=\min\_G \max\_D (E\_{x\sim p\_{data}}\[\log D(x)] + E\_{x \sim p\_G} \[\log (1-D(x))])$$ Change of variable on second term.

$$= \min\_G \max\_D \int\_x (p\_{data}(x) \log D(x) + p\_G (x) \log (1-D(x))) dx$$ Definition of Expectation.

$$= \min\_G \int\_x \max\_D(p\_{data}(x) \log D(x) + p\_G (x) \log (1-D(x))) dx$$ Push the $$\max\_D$$ inside the integral (valid because D(x) can be chosen independently for each x).

$$f(y) = a\log y + b \log (1-y)$$, where $$a = p\_{data}(x)$$, $$b = p\_G(x)$$, $$y = D(x)$$. (Side computation to find the maximizing D.)

$$f^\prime (y) = \frac{a}{y} - \frac{b}{1-y}$$, $$f^\prime (y) = 0 \iff y = \frac{a}{a+b}$$ (a maximum, since $$f$$ is concave on $$(0,1)$$)

Optimal Discriminator: $$D\_G^\* (x) = \frac{p\_{data}(x)}{p\_{data}(x) + p\_G(x)}$$

$$= \min\_G \int\_x (p\_{data}(x) \log D\_G^\* (x) + p\_G(x) \log (1-D\_G^\*(x)))dx$$

$$= \min\_G \int\_x (p\_{data}(x) \log \frac{p\_{data}(x)}{p\_{data}(x) + p\_G(x)} + p\_G(x) \log \frac{p\_G(x)}{p\_{data}(x) + p\_G(x)}) dx$$

$$= \min\_G (E\_{x \sim p\_{data}} \[\log \frac{p\_{data}(x)}{p\_{data}(x) + p\_G(x)}] + E\_{x\sim p\_G} \[\log \frac{p\_G(x)}{p\_{data}(x) + p\_G(x)}])$$ (definition of expectation)

$$= \min\_G (E\_{x \sim p\_{data}} \[\log \frac{2}{2} \frac{p\_{data}(x)}{p\_{data}(x) + p\_G(x)}] + E\_{x\sim p\_G} \[\log \frac{2}{2} \frac{p\_G(x)}{p\_{data}(x) + p\_G(x)}])$$ (Multiply by a constant)

$$= \min\_G (E\_{x \sim p\_{data}} \[\log \frac{2 \cdot p\_{data}(x)}{p\_{data}(x) + p\_G(x)}] + E\_{x\sim p\_G} \[\log  \frac{2 \cdot p\_G(x)}{p\_{data}(x) + p\_G(x)}] - \log 4)$$ (Split $$\log 2$$ out of each term; the two $$-\log 2$$'s combine into $$-\log 4$$.)

KL Divergence: $$KL(p,q) = E\_{x \sim p} \[\log \frac{p(x)}{q(x)}]$$

$$= \min\_G (KL(p\_{data}, \frac{p\_{data} + p\_G}{2}) + KL(p\_G, \frac{p\_{data} + p\_G}{2}) - \log 4)$$

Jensen-Shannon Divergence: $$JSD(p,q) = \frac{1}{2}KL(p, \frac{p+q}{2}) + \frac{1}{2} KL (q, \frac{p+q}{2})$$

$$=\min\_G (2\cdot JSD(p\_{data}, p\_G) - \log 4)$$
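
A quick numerical sanity check (illustrative, not from the lecture) that JSD is nonnegative and zero only when the two distributions are equal, so the minimax value bottoms out at $$-\log 4$$:

```python
# For discrete distributions, JSD(p, q) >= 0 with equality iff p == q,
# so 2 * JSD(p_data, p_G) - log 4 is minimized at -log 4 when p_G = p_data.
import numpy as np

def kl(p, q):
    return np.sum(p * np.log(p / q))

def jsd(p, q):
    m = (p + q) / 2
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p_data = np.array([0.2, 0.3, 0.5])
p_g    = np.array([0.4, 0.4, 0.2])

print(2 * jsd(p_data, p_g) - np.log(4))     # > -log 4 when p_G != p_data
print(2 * jsd(p_data, p_data) - np.log(4))  # == -log 4 at the global minimum
```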

JSD is always nonnegative and zero only when the two distributions are equal; thus $$p\_G = p\_{data}$$ is the global minimum.

Summary: the global optimum of the minimax game is attained when:

1. $$D\_G^\* (x) = \frac{p\_{data}(x)}{p\_{data}(x) + p\_G(x)}$$ (Optimal discriminator for any G)
2. $$p\_G(x) = p\_{data}(x)$$ (Optimal generator for optimal D)

Caveats:

1. G and D are neural nets with fixed architectures; we don't know whether they can actually represent the optimal D and G.
2. This argument says nothing about whether training actually converges to the optimal solution.

### Conditional GANs

Learn p(x|y) instead of p(x). Make the generator and discriminator both take the label y as an additional input.
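
One simple way to condition on y (a sketch under assumed sizes; concatenating a one-hot label is just one common conditioning choice, not necessarily the one in the figure below) is to append the label to the inputs of both networks:

```python
# Conditional GAN sketch: both G and D take a one-hot label y as extra input.
# latent_dim, data_dim, and num_classes are assumed sizes for illustration.
import torch
import torch.nn as nn

latent_dim, data_dim, num_classes = 64, 784, 10

class CondGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + num_classes, 256), nn.ReLU(),
            nn.Linear(256, data_dim), nn.Tanh(),
        )

    def forward(self, z, y_onehot):
        # condition by concatenating the label to the latent code
        return self.net(torch.cat([z, y_onehot], dim=1))

class CondDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(data_dim + num_classes, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1), nn.Sigmoid(),
        )

    def forward(self, x, y_onehot):
        # condition by concatenating the label to the input sample
        return self.net(torch.cat([x, y_onehot], dim=1))
```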

<figure><img src="/files/fTm9agVkT7Tt7Cf4vHWc" alt=""><figcaption><p>Batch Normalization</p></figcaption></figure>

