# Multivariate Distribution and Inference Applications

Summary: The paper introduces a family of distributions for multivariate count data with excess zeros, which is a multivariate extension of the univariate zero-inflated Bell distribution. Model parameters are estimated by traditional maximum likelihood. The authors also develop an EM algorithm that computes the ML estimates with closed-form expressions.

Univariate Bell family's probability mass function:

$$
f\_{Bell}(x; \alpha) := Pr(X = x) = \exp(1-e^{W(\alpha)})\frac{W(\alpha)^{x}B\_x}{x!}, x = 0,1,\dots
$$

where $$\alpha > 0$$, $$W(\cdot)$$ denotes the principal branch of the Lambert W function, and $$B\_x$$ are the Bell numbers. We write $$X \sim Bell(\alpha)$$ to denote this univariate distribution. Its properties:

* Belongs to one-parameter exponential family of distributions
* It is infinitely divisible
* Variance larger than the mean: $$E(X) = \alpha$$ and $$Var(X) = \alpha\[1+W(\alpha)]$$, so $$Var(X) > E(X)$$

The Bell distribution may therefore be suitable for data with overdispersion, unlike the Poisson.
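As a quick numerical sanity check, the PMF above can be evaluated directly — a minimal sketch assuming SciPy's `lambertw` and SymPy's Bell numbers, with $$\alpha = 2$$ as an arbitrary example value:

```python
import math

from scipy.special import lambertw
from sympy import bell


def bell_pmf(x: int, alpha: float) -> float:
    """PMF of Bell(alpha): exp(1 - e^{W(alpha)}) * W(alpha)^x * B_x / x!."""
    w = lambertw(alpha).real  # principal branch of the Lambert W function
    return math.exp(1 - math.exp(w)) * w**x * int(bell(x)) / math.factorial(x)


alpha = 2.0
w = lambertw(alpha).real
support = range(60)  # truncation; the tail mass beyond 60 is negligible here
probs = [bell_pmf(x, alpha) for x in support]

total = sum(probs)
mean = sum(x * p for x, p in zip(support, probs))
var = sum(x**2 * p for x, p in zip(support, probs)) - mean**2
# total ~ 1, mean ~ alpha, var ~ alpha * (1 + W(alpha)) > mean
```

The final comparison is exactly the overdispersion property listed above: the computed variance exceeds the computed mean.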

Univariate zero-inflated Bell (ZIB) distribution. Let $$D$$ denote a degenerate distribution at a constant $$c$$, i.e., $$Pr(D = c) = 1$$, written $$D \sim Degenerate(c)$$.

Let $$D \sim Degenerate(0)$$ and $$X \sim Bell(\alpha)$$, and assume $$D$$ and $$X$$ are independent. The PMF of the ZIB distribution can then be expressed as

$$
f\_{ZIB}(y; \alpha, \omega) = \begin{cases}
\omega + (1-\omega)e^{1-e^{W(\alpha)}} & y=0, \\
(1-\omega)e^{1-e^{W(\alpha)}}\frac{W(\alpha)^yB\_y}{y!} & y=1,2,\dots
\end{cases}
$$

$$
f\_{ZIB}(y; \alpha, \omega) = \[\omega+(1-\omega)e^{1-e^{W(\alpha)}}]\mathbb{I}(y=0)+(1-\omega)e^{1-e^{W(\alpha)}}\frac{W(\alpha)^yB\_y}{y!}\mathbb{I}(y\neq0)
$$


where $$\alpha > 0$$, $$\omega \in (0, 1)$$, and $$\mathbb{I}(\cdot)$$ denotes the indicator function. The ZIB distribution is thus a mixture of a distribution degenerate at zero and the Bell distribution.
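The two-case PMF above translates directly into code — a sketch under the same `lambertw`/`bell` assumptions as before, with arbitrary example values $$\alpha = 2$$, $$\omega = 0.3$$:

```python
import math

from scipy.special import lambertw
from sympy import bell


def zib_pmf(y: int, alpha: float, omega: float) -> float:
    """PMF of ZIB(alpha, omega) as a zero-inflated Bell mixture."""
    w = lambertw(alpha).real
    bell_term = math.exp(1 - math.exp(w)) * w**y * int(bell(y)) / math.factorial(y)
    if y == 0:
        return omega + (1 - omega) * bell_term  # extra point mass at zero
    return (1 - omega) * bell_term


alpha, omega = 2.0, 0.3
probs = [zib_pmf(y, alpha, omega) for y in range(60)]
p0_bell = math.exp(1 - math.exp(lambertw(alpha).real))  # Pr(X = 0) under Bell
```

The mixture inflates the zero probability from `p0_bell` to `omega + (1 - omega) * p0_bell` while leaving the relative shape of the PMF on $$y \geq 1$$ unchanged.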

$$Y \sim ZIB(\alpha, \omega)$$ admits the following stochastic representation:

$$
Y  \overset{D}= ZX = \begin{cases}
0 & \text{with probability }\omega\\
X & \text{with probability } (1-\omega)\\
\end{cases}
$$

where $$Z \sim Bernoulli(1-\omega)$$, $$X \sim Bell(\alpha)$$, and $$Z\perp X$$. The random variables on both sides of the equality ($$Y$$ and $$ZX$$) have the same distribution. From this it is immediate that $$E(Y) = E(Z)E(X)= (1-\omega)\alpha$$ and $$E(Y^2)=E(Z)E(X^2)=(1-\omega)\alpha\[1+\alpha+W(\alpha)]$$,

and $$Var(Y) = (1-\omega)\alpha\[1+\alpha \omega + W(\alpha)]$$.
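These moment formulas can be checked by Monte Carlo directly from the representation $$Y = ZX$$ — a sketch in which $$X$$ is drawn from a truncated PMF table (not the paper's sampler), with example values $$\alpha = 2$$, $$\omega = 0.3$$:

```python
import math

import numpy as np
from scipy.special import lambertw
from sympy import bell

alpha, omega = 2.0, 0.3
w = lambertw(alpha).real
support = np.arange(60)
pmf = np.array([math.exp(1 - math.exp(w)) * w**x * int(bell(int(x))) / math.factorial(int(x))
                for x in support])
pmf /= pmf.sum()  # renormalize the truncated table

rng = np.random.default_rng(0)
n = 200_000
z = rng.random(n) > omega               # Z ~ Bernoulli(1 - omega)
x = rng.choice(support, size=n, p=pmf)  # X ~ Bell(alpha), truncated
y = z * x

# Theory from the text above:
mean_theory = (1 - omega) * alpha
var_theory = (1 - omega) * alpha * (1 + alpha * omega + w)
```

With a seeded generator and 200k draws, the sample mean and variance of `y` should match `mean_theory` and `var_theory` to within Monte Carlo error.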

A Poisson model can be affected by overdispersion caused by zero inflation, so the Bell-based model may be more suitable.

### Multivariate ZIB distribution.

<figure><img src="/files/RcYyk8znMOVkCdUtdCBE" alt=""><figcaption><p>Model definition (see the paper for the full statement)</p></figcaption></figure>

**Remark 2** The PMF of $$Y \sim MZIB(\alpha, \omega)$$ can be expressed as $$f(y;\alpha , \omega ) = \omega Pr(D = y) + (1-\omega )Pr(X = y)$$, where $$D = (D\_1, \dots, D\_m)^T$$ and $$\{D\_r\}\_{r=1}^m \overset{iid}\sim Degenerate(0)$$.

**Distributional Properties**

From the stochastic representation, we immediately obtain

$$
E(Y) = (1-\omega)\alpha , \quad E(YY^T) = (1- \omega)\[\text{diag}(\alpha) + \text{diag}(M) + \alpha \alpha^T] \\
Var(Y) = (1-\omega)\[\text{diag}(\alpha) + \text{diag}(M) + \omega \alpha \alpha^T]
$$

where $$M = (\alpha\_1W(\alpha\_1), \dots , \alpha\_mW(\alpha\_m))^T$$ and, for $$\alpha = (\alpha\_1, \dots, \alpha\_m)^T$$, $$\text{diag}(\alpha)$$ denotes the $$m \times m$$ diagonal matrix with diagonal elements $$\alpha\_r$$ $$(r= 1, \dots, m)$$. We also have

$$
CORR(Y\_j, Y\_k) = \omega (\alpha\_j \alpha\_k)^{\frac{1}{2}} \[1 + W(\alpha\_j) + \omega \alpha\_j]^{-\frac{1}{2}} \[1 + W(\alpha\_k) + \omega \alpha\_k]^{-\frac{1}{2}}, \quad j\neq k
$$

If $$\alpha\_j = \alpha\_k = \alpha$$, we have $$CORR(Y\_j, Y\_k) = \frac{\omega \alpha}{1 + W(\alpha) + \omega \alpha} \text{ for } j \neq k$$. Also, $$\omega \rightarrow 0 \implies CORR(Y\_j, Y\_k) \rightarrow 0 \text{ for } j\neq k$$.
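A simulation check of this correlation formula, assuming the components share a single Bernoulli indicator $$Z$$ (the multivariate analogue of the stochastic representation above); the parameter values are arbitrary examples:

```python
import math

import numpy as np
from scipy.special import lambertw
from sympy import bell


def bell_table(a, kmax=60):
    """Truncated, renormalized Bell(a) PMF table."""
    w = lambertw(a).real
    p = np.array([math.exp(1 - math.exp(w)) * w**x * int(bell(x)) / math.factorial(x)
                  for x in range(kmax)])
    return p / p.sum()


a1, a2, omega = 2.0, 3.0, 0.3
w1, w2 = lambertw(a1).real, lambertw(a2).real
rng = np.random.default_rng(1)
n = 200_000
z = (rng.random(n) > omega).astype(int)  # shared zero-inflation indicator
y1 = z * rng.choice(60, size=n, p=bell_table(a1))
y2 = z * rng.choice(60, size=n, p=bell_table(a2))

corr_theory = (omega * math.sqrt(a1 * a2)
               / math.sqrt((1 + w1 + omega * a1) * (1 + w2 + omega * a2)))
corr_empirical = np.corrcoef(y1, y2)[0, 1]
```

The shared $$Z$$ is what induces the positive correlation, consistent with the limit $$\omega \rightarrow 0 \implies CORR \rightarrow 0$$.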

Let $$T\_n(\theta)$$ be the n-th Touchard polynomial (see vocabulary); it corresponds to the n-th moment of a Poisson distribution with parameter $$\theta > 0$$. Let $$B\_n(z\_1, \dots, z\_n)$$ denote the n-th complete Bell polynomial. The k-th moment of $$X \sim Bell(\alpha)$$ is given by $$E(X^k) = B\_k(\kappa\_1, \dots, \kappa\_k)$$, where $$\kappa\_j = e^{W(\alpha)}T\_j(W(\alpha))$$. Thus, the mixed moments of $$Y \sim MZIB(\alpha, \omega)$$ for $$k\_1, \dots, k\_m \in \mathbb{N}$$ are given by

$$
E(\prod\_{r=1}^m Y\_r^{k\_r}) = (1-\omega) \prod\_{r=1}^m B\_{k\_r}(\kappa\_{r,1}, \dots, \kappa\_{r,k\_r})
$$

where $$\kappa\_{r,j} = e^{W(\alpha\_r)}T\_j(W(\alpha\_r))$$. This yields the following propositions.
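For small $$k$$ this moment formula can be verified numerically — a sketch in which the Touchard value is computed as the k-th moment of a Poisson variable by direct summation of $$e^{W(\alpha)}T\_k(W(\alpha))$$, and the complete Bell polynomials for $$k = 1, 2$$ are expanded by hand ($$B\_1(z\_1) = z\_1$$, $$B\_2(z\_1, z\_2) = z\_1^2 + z\_2$$):

```python
import math

from scipy.special import lambertw
from sympy import bell


def poisson_moment(k, theta, kmax=120):
    """k-th moment of Poisson(theta), i.e. the Touchard value T_k(theta)."""
    return sum(j**k * math.exp(-theta) * theta**j / math.factorial(j)
               for j in range(kmax))


alpha = 2.0
w = lambertw(alpha).real
kappa1 = math.exp(w) * poisson_moment(1, w)  # should equal alpha
kappa2 = math.exp(w) * poisson_moment(2, w)

# Moments via complete Bell polynomials of the kappa arguments
m1 = kappa1
m2 = kappa1**2 + kappa2

# Direct moments from the Bell PMF for comparison
pmf = [math.exp(1 - math.exp(w)) * w**x * int(bell(x)) / math.factorial(x)
       for x in range(60)]
m1_direct = sum(x * p for x, p in enumerate(pmf))
m2_direct = sum(x**2 * p for x, p in enumerate(pmf))
```

For the mixed moments of $$Y$$, each component contributes its own factor of this form and the whole product is scaled by $$(1-\omega)$$.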

<figure><img src="/files/iiVjAFNarqGx46o6vRt6" alt=""><figcaption><p>Proposition 2, 3</p></figcaption></figure>

<figure><img src="/files/aZ6y6mu3oWQvuc0ZTH9U" alt=""><figcaption><p>Corr3, Prop 4</p></figcaption></figure>

I'm going to skip a few propositions (there are just too many; see the paper for all of them and their proofs).

### Parameter Estimation

Let $$Y\_1 = (Y\_{11}, \dots , Y\_{1m})^T, \dots , Y\_n = (Y\_{n1} , \dots, Y\_{nm})^T$$ be $$n$$ independent, identically distributed random vectors such that each $$Y\_i = (Y\_{i1}, \dots , Y\_{im})^T \sim MZIB(\alpha, \omega)$$ for $$i = 1, \dots, n$$. Let $$y\_i = (y\_{i1}, \dots , y\_{im})^T$$ be the realization of the random vector $$Y\_i$$ for $$i = 1, \dots, n$$, and let $$Y\_{data} = \[y\_1, \dots, y\_n]^T$$ be the $$n \times m$$ matrix that contains the observed multivariate data. Define $$I = \{i : y\_i = 0\_m, i= 1, \dots, n \}$$ and let $$n\_0 = \sum\_{i=1}^n \mathbb{I}(y\_i=0\_m)$$ be the number of elements in $$I$$, i.e., the number of rows of $$Y\_{data}$$ whose elements are all zero. Eliminating constants, the likelihood function can be expressed as:

$$
L(\alpha, \omega) = \[\omega + (1-\omega)e^{\xi\_+}]^{n\_0}(1-\omega)^{n-n\_0}e^{(n-n\_0)\xi\_+}\prod\_{r=1}^mW(\alpha\_r)^{N\_r}
$$

where $$N\_r = \sum\_{i \notin I}y\_{ir} = \sum\_{i=1}^n y\_{ir}$$ and $$\xi\_{+}:=\xi\_+(\alpha)$$. The corresponding log-likelihood function is:

$$
l(\alpha, \omega) = n\_0\ln (\omega + (1-\omega)e^{\xi\_+}) + (n-n\_0)\ln (1-\omega) + (n-n\_0)\xi\_+ + \sum\_{r=1}^m N\_r \ln (W(\alpha\_r))
$$
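This log-likelihood can be evaluated on toy data. One caveat: $$\xi\_+(\alpha)$$ is not spelled out on this page, so the sketch below assumes $$\xi\_+(\alpha) = \sum\_{r=1}^m (1 - e^{W(\alpha\_r)})$$, the sum of the per-component Bell log-normalizing constants, and checks that the constant-free $$l(\alpha, \omega)$$ differs from the full log-likelihood only by a quantity independent of the parameters:

```python
import math

import numpy as np
from scipy.special import lambertw
from sympy import bell


def loglik(alpha, omega, Y):
    """Constant-free l(alpha, omega) for an (n, m) count matrix Y."""
    alpha = np.asarray(alpha, dtype=float)
    w = lambertw(alpha).real
    xi_plus = float(np.sum(1 - np.exp(w)))  # assumed form of xi_+
    n = Y.shape[0]
    n0 = int(np.sum(Y.sum(axis=1) == 0))    # number of all-zero rows
    N = Y.sum(axis=0)                       # N_r = sum_i y_ir
    return (n0 * math.log(omega + (1 - omega) * math.exp(xi_plus))
            + (n - n0) * math.log(1 - omega)
            + (n - n0) * xi_plus
            + float(np.sum(N * np.log(w))))


def full_loglik(alpha, omega, Y):
    """Full MZIB log-likelihood, constants included."""
    w = [lambertw(a).real for a in alpha]
    total = 0.0
    for row in Y:
        prod = 1.0
        for yr, wr in zip(row, w):
            prod *= (math.exp(1 - math.exp(wr)) * wr**int(yr)
                     * int(bell(int(yr))) / math.factorial(int(yr)))
        if all(v == 0 for v in row):
            total += math.log(omega + (1 - omega) * prod)
        else:
            total += math.log((1 - omega) * prod)
    return total


Y = np.array([[0, 0], [1, 2], [0, 0], [3, 0], [2, 1]])
d1 = loglik([2.0, 1.5], 0.3, Y) - full_loglik([2.0, 1.5], 0.3, Y)
d2 = loglik([1.0, 2.5], 0.5, Y) - full_loglik([1.0, 2.5], 0.5, Y)
# d1 == d2: the dropped constants do not depend on (alpha, omega)
```

Because the difference is constant in the parameters, maximizing either version gives the same ML estimates.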

The paper makes some assumptions on the behavior of $$l(\alpha, \omega)$$ as $$n \rightarrow \infty$$, on the regularity of the first two derivatives with respect to the model parameters, and on the existence and uniqueness of the ML estimate of $$\alpha$$.

**Direct Maximization**

The ML estimates can be obtained from direct maximization of the log-likelihood function.

<figure><img src="/files/m1kd70Ts2ZFvamq3IdHa" alt=""><figcaption><p>First two  derivatives</p></figcaption></figure>

There is no closed form for the ML estimates, so a non-linear optimization algorithm is used.

<figure><img src="/files/Y6VGy08IZrHBMeQSF4IK" alt=""><figcaption><p>Optimization Process</p></figcaption></figure>

It is simpler to optimize $$l\_{profile}(\alpha)$$: the profile log-likelihood is m-dimensional, whereas the full log-likelihood $$l(\alpha, \omega)$$ is (m+1)-dimensional.

Fisher information matrix: the asymptotic variance-covariance matrix is given by its inverse. See the paper for details.

**EM Algorithm**

We partition the set $$I$$ defined earlier as $$I = I\_e \cup I\_s$$, where $$I\_e$$ contains the indices of the extra zero vectors coming from the degenerate distribution at zero, and $$I\_s$$ contains the indices of the structural zero vectors coming from the baseline Bell distribution.

Let $$V$$ be the latent variable counting the elements of $$I\_e$$; it splits $$n\_0$$ into $$V$$ and $$n\_0 - V$$, with $$V\in \{0, 1, \dots, n\_0\}$$.

<figure><img src="/files/dbMsUjqoejkgHEVPC1WF" alt=""><figcaption><p>Distribution</p></figcaption></figure>

The EM algorithm alternates two steps. E-step: compute the conditional expectation of the complete-data log-likelihood (the Q-function) given $$Y\_{data}$$.

M-step: maximize the Q-function obtained in the E-step.

<figure><img src="/files/XUiwUXjEuvTMKWF96p6v" alt=""><figcaption><p>EM</p></figcaption></figure>
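Under two loudly-flagged assumptions — the same $$\xi\_+(\alpha) = \sum\_r (1 - e^{W(\alpha\_r)})$$ form as before, and an M-step for $$\alpha\_r$$ that is a weighted sample mean (the closed-form maximizer, since $$E(X) = \alpha$$ for the Bell baseline) — the EM iteration can be sketched as follows. This is illustrative, not the paper's exact derivation:

```python
import math

import numpy as np
from scipy.special import lambertw
from sympy import bell


def em_mzib(Y, n_iter=300, omega0=0.5, tol=1e-10):
    """EM for MZIB(alpha, omega); returns (alpha_hat, omega_hat)."""
    Y = np.asarray(Y, dtype=float)
    n, m = Y.shape
    zero_row = Y.sum(axis=1) == 0
    alpha = Y[~zero_row].mean(axis=0)  # crude start from non-zero rows
    omega = omega0
    for _ in range(n_iter):
        w = lambertw(alpha).real
        xi_plus = np.sum(1 - np.exp(w))  # assumed form of xi_+
        # E-step: posterior prob. that an all-zero row is an "extra" zero
        tau0 = omega / (omega + (1 - omega) * np.exp(xi_plus))
        tau = np.where(zero_row, tau0, 0.0)
        # M-step: closed-form updates
        weight = 1.0 - tau  # expected Bell membership of each row
        new_omega = tau.sum() / n
        new_alpha = (weight[:, None] * Y).sum(axis=0) / weight.sum()
        done = (abs(new_omega - omega) < tol
                and np.allclose(new_alpha, alpha, atol=tol))
        omega, alpha = new_omega, new_alpha
        if done:
            break
    return alpha, omega


# Usage on simulated data (hypothetical true parameter values)
def bell_table(a, kmax=60):
    w = lambertw(a).real
    p = np.array([math.exp(1 - math.exp(w)) * w**x * int(bell(x)) / math.factorial(x)
                  for x in range(kmax)])
    return p / p.sum()


rng = np.random.default_rng(2)
n, true_omega, true_alpha = 20_000, 0.3, (2.0, 1.0)
z = (rng.random(n) > true_omega).astype(int)
Y = np.column_stack([z * rng.choice(60, size=n, p=bell_table(a)) for a in true_alpha])
alpha_hat, omega_hat = em_mzib(Y)
```

Each iteration cannot decrease the observed-data likelihood (the standard EM guarantee), and on this simulated sample the estimates should land close to the true values used to generate the data.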

