Summary of Statistical Distributions#
Useful functions#
Power function#
Real functions of the form \(f(x)=x^{a}\), where \(a\) is a constant.
Exponential function#
\(f(x) = a^{x}\) for a constant \(a > 0\); for example, \(f(x) = e^x\).
Logarithm function#
\(\log_b y = x\) if and only if \(b^x = y\); for example, \(\log_2 8 = 3\).
Gamma function#
\(\Gamma(\alpha) = \int^{\infty}_0 t^{\alpha - 1} e^{-t} dt\)
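As a quick numerical check (not from the original notes; SciPy is assumed available), the integral can be evaluated numerically and compared against `scipy.special.gamma` for a few arbitrary values of \(\alpha\):

```python
import numpy as np
from scipy import integrate, special

# Evaluate Gamma(alpha) = int_0^inf t^(alpha-1) e^(-t) dt numerically
# and compare with scipy's closed-form implementation.
for alpha in [0.5, 1.0, 2.5, 5.0]:
    value, _ = integrate.quad(lambda t: t**(alpha - 1) * np.exp(-t), 0, np.inf)
    print(f"alpha={alpha}: quad={value:.6f}  scipy={special.gamma(alpha):.6f}")
```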
Univariate distribution relationships#
Relationship between discrete and continuous distributions#
Table 1. Relationship abbreviations#
| Discrete | Continuous | Shorthand |
| --- | --- | --- |
| Binomial | Poisson | BP |
| Negative Binomial | Gamma | NG |
| Geometric | Exponential | GE |
Continuous distributions#
Exponential distribution#
The probability density function (pdf) of an exponential distribution is
\(f(x) = \lambda e^{-\lambda x}, \quad x \geq 0.\)
Or,
\(f(x) = \frac{1}{\beta} e^{-x/\beta}, \quad x \geq 0,\)
where \(\lambda\) is the rate parameter and \(\beta = 1/\lambda\) is the scale parameter.
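A minimal sketch checking that the two parameterizations agree, assuming SciPy's convention that `scipy.stats.expon` is parameterized by the scale \(\beta = 1/\lambda\) (the parameter values here are arbitrary):

```python
import numpy as np
from scipy import stats

lam = 2.0                                    # rate parameter lambda
beta = 1.0 / lam                             # scale parameter beta = 1/lambda
x = np.linspace(0.1, 3.0, 5)

pdf_rate = lam * np.exp(-lam * x)            # f(x) = lambda e^{-lambda x}
pdf_scale = (1 / beta) * np.exp(-x / beta)   # f(x) = (1/beta) e^{-x/beta}

assert np.allclose(pdf_rate, pdf_scale)
assert np.allclose(pdf_rate, stats.expon.pdf(x, scale=beta))
```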
Gamma distribution#
Let \(t = \beta x\) in the gamma function, so that
\(\Gamma(\alpha) = \int^{\infty}_0 (\beta x)^{\alpha - 1} e^{-\beta x} \beta \, dx = \beta^{\alpha} \int^{\infty}_0 x^{\alpha - 1} e^{-\beta x} dx.\)
Dividing both sides by \(\Gamma(\alpha)\) shows that the integrand integrates to one, so the probability density function is
\(f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\beta x}, \quad x > 0,\)
where \(\alpha\) is the shape parameter and \(\beta\) is the rate parameter.
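A similar sketch checking the derived density, noting SciPy's convention that `scipy.stats.gamma` uses shape \(a = \alpha\) and scale \(= 1/\beta\) (parameter values arbitrary):

```python
import numpy as np
from scipy import special, stats

alpha, beta = 3.0, 2.0               # shape and rate parameters
x = np.linspace(0.1, 5.0, 5)

# f(x) = beta^alpha / Gamma(alpha) * x^(alpha - 1) * e^(-beta x)
pdf_manual = beta**alpha / special.gamma(alpha) * x**(alpha - 1) * np.exp(-beta * x)

assert np.allclose(pdf_manual, stats.gamma.pdf(x, a=alpha, scale=1 / beta))
```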
Normal distribution#
The derivation of the normal distribution (Ref: Tim)#
Suppose I throw a dart into a dartboard. I aim at the centre of the board (0,0) but I’m not all that good with darts so the dart lands in a random position (X,Y) which has a joint density function \(f:\mathbb R^2\to\mathbb R^+\). Let’s make two assumptions about the way I play darts.
1. The density is rotationally invariant, so the distribution of where my dart lands depends only on the distance of the dart to the centre.
2. The random variables \(X\) and \(Y\) are independent; how much I miss left and right makes no difference to the distribution of how much I miss up and down.
So by assumption one and Pythagoras I must be able to express the density as
\(f(x,y) = g(x^2 + y^2)\)
for some function \(g\).
Now, as the random variables \(X\) and \(Y\) are independent and identically distributed, I must be able to express
\(f(x,y) = h(x) h(y)\)
for some function \(h\).
Combining these assumptions we get that for every pair \((x,y)\),
\(g(x^2 + y^2) = h(x) h(y).\)
Setting \(y = 0\) gives \(h(x) h(0) = g(x^2)\), and hence \(g(x^2 + y^2)\, g(0) = g(x^2)\, g(y^2)\).
This means that \(g\) must be an exponential function,
\(g(t) = A e^{Bt}.\)
So \(A\) will be some normalising constant, and \(B\) reflects the units I'm measuring in (since \(B\) multiplies a squared distance, if I measure the distance in cm then \(B\) will be 100 times as big as if I measured in mm). \(B\) must be negative because the density should be a decreasing function of distance (I'm not that bad at darts).
So to work out \(A\), I need to integrate \(f(\cdot,\cdot)\) over \(\mathbb{R}^2\). A quick change to polar coordinates gives
\(1 = \int_{\mathbb{R}^2} A e^{B(x^2+y^2)} \, dx \, dy = A \int^{2\pi}_0 \int^{\infty}_0 e^{Br^2} r \, dr \, d\theta = -\frac{\pi A}{B}.\)
So we should set \(A = -\frac{B}{\pi}\). It's convenient to choose \(B\) in terms of the standard deviation, so we set \(B = -\frac{1}{2\sigma^2}\) and hence \(A = \frac{1}{2\pi\sigma^2}\).
So if I set \(\tilde{f}(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{x^2}{2\sigma^2}}\), then \(f(x,y) = \tilde{f}(x)\, \tilde{f}(y)\).
The \(e\) comes from the fact that I wanted my \(X\) and \(Y\) coordinates to be independent, and the \(\pi\) comes from the fact that I wanted rotational invariance, so I'm integrating over a circle.
The interesting thing happens if I throw two darts. Suppose I throw my first dart aiming at \((0,0)\), and it lands at \((X_1, Y_1)\); I aim my next dart at the first dart, so this one lands at \((X_2, Y_2)\) with \(X_2 = X_1 + X\) and \(Y_2 = Y_1 + Y\).
So the position of the second dart is the sum of the two errors. But the sum is still rotationally invariant and the variables \(X_2\) and \(Y_2\) are still independent, so \((X_2, Y_2)\) satisfies my two assumptions.
That means that when I add independent normal distributions together I get another normal distribution.
It’s this property that makes the normal distribution so useful: if I take the average of a very long sequence of random variables, I should get something that’s the same shape no matter how long my sequence is, and taking a sequence twice as long is like adding the two sequences together.
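A quick simulation sketch of this closure property (the seed, sample size, and standard deviations below are arbitrary choices, not from the notes): the sum of two independent normals should again be normal, with variance equal to the sum of the variances.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x1 = rng.normal(0.0, 1.0, size=100_000)   # first error, sigma = 1.0
x2 = rng.normal(0.0, 1.5, size=100_000)   # second, independent error, sigma = 1.5
total = x1 + x2                           # sum of independent normals

# Theory: total ~ N(0, 1^2 + 1.5^2), so its std should be sqrt(3.25) ~ 1.803
print(total.std())

# A normality test should typically not reject (expect a large p-value)
print(stats.normaltest(total).pvalue)
```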
Other useful materials#
\(\chi^2\) distribution#
The \(\chi^2\) distribution is a special case of the gamma distribution: a \(\chi^2\) random variable with \(n\) degrees of freedom follows a gamma distribution with shape parameter \(\alpha = n/2\) and rate parameter \(\beta = 1/2\), namely \(\chi^2 \sim Gamma(n/2, 1/2)\), giving the density function
\(f(x) = \frac{(1/2)^{n/2}}{\Gamma(n/2)} x^{n/2 - 1} e^{-x/2}, \quad x > 0.\)
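The identity can be checked directly against SciPy, assuming its convention that `scipy.stats.gamma` takes scale \(= 1/\beta = 2\):

```python
import numpy as np
from scipy import stats

n = 5                                 # degrees of freedom (arbitrary)
x = np.linspace(0.1, 10.0, 7)

# chi-square with n dof == Gamma(shape = n/2, rate = 1/2), i.e. scale = 2
assert np.allclose(stats.chi2.pdf(x, df=n),
                   stats.gamma.pdf(x, a=n / 2, scale=2))
```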
The relationship between the normal and \(\chi^2\) distributions#
Firstly, we need to prove \(\Gamma(1/2) = \sqrt{\pi}\) (Ref: The Gamma Function).
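One standard way to see this (a sketch, not necessarily the argument in the cited reference) is the substitution \(t = u^2\), so \(dt = 2u \, du\), together with the Gaussian integral \(\int^{\infty}_0 e^{-u^2} du = \frac{\sqrt{\pi}}{2}\):
\(\Gamma(1/2) = \int^{\infty}_0 t^{-1/2} e^{-t} dt = \int^{\infty}_0 \frac{1}{u} e^{-u^2} \, 2u \, du = 2 \int^{\infty}_0 e^{-u^2} du = \sqrt{\pi}.\)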
Secondly, we need to find the pdf of \(X^2\) (Ref: Normal to Chi).
If \(X \sim \mathcal{N}(0,1)\), then the pdf of \(X\) is
\(\phi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}.\)
Let \(f\) be the pdf of \(X^2\). Then, for \(y > 0\),
\(P(X^2 \leq y) = P(-\sqrt{y} \leq X \leq \sqrt{y}) = 2\Phi(\sqrt{y}) - 1,\)
and differentiating with respect to \(y\) gives
\(f(y) = \phi(\sqrt{y})\, y^{-1/2} = \frac{1}{\sqrt{2\pi}} y^{-1/2} e^{-y/2} = \frac{(1/2)^{1/2}}{\Gamma(1/2)} y^{1/2 - 1} e^{-y/2},\)
which is the \(Gamma(1/2, 1/2)\) density, i.e. \(X^2 \sim \chi^2_1\).
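A simulation sketch of the result (seed and sample size arbitrary): squaring standard normal draws should produce \(\chi^2_1\) samples.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(size=100_000)          # X ~ N(0, 1)

# X^2 should follow a chi-square distribution with 1 degree of freedom;
# a Kolmogorov-Smirnov test should typically not reject (large p-value).
result = stats.kstest(x**2, stats.chi2(df=1).cdf)
print(result.pvalue)
```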
Noncentral \(\chi^2\) distribution#
Wishart distribution#
Discrete distributions#
Bernoulli distribution \(X \sim Bernoulli(p)\)#
The notation \(X \sim Bernoulli(p)\) indicates that the random variable \(X\) has the Bernoulli distribution with parameter \(p\), where \(0 < p < 1\). A Bernoulli random variable \(X\) with success probability \(p\) has probability mass function
\(f(x) = p^x (1-p)^{1-x}, \quad x \in \{0, 1\}.\)
The Bernoulli distribution is associated with the notion of a *Bernoulli trial*, which is an experiment with two outcomes, generically referred to as *success* (\(x = 1\)) and *failure* (\(x = 0\)).
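A minimal check of the pmf against `scipy.stats.bernoulli` (the value of \(p\) is arbitrary):

```python
import numpy as np
from scipy import stats

p = 0.3                                   # success probability (arbitrary)
for x in (0, 1):
    manual = p**x * (1 - p)**(1 - x)      # f(x) = p^x (1-p)^(1-x)
    assert np.isclose(manual, stats.bernoulli.pmf(x, p))
```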
Binomial distribution \(X \sim Binomial(n,p)\)#
The binomial distribution models the number of successes in \(n\) mutually independent Bernoulli trials, each with probability of success \(p\). The random variable \(X \sim Binomial(n,p)\) has probability mass function
\(f(x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, \ldots, n.\)
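The same kind of check for the binomial pmf, using `math.comb` for \(\binom{n}{x}\) (parameter values arbitrary):

```python
from math import comb

import numpy as np
from scipy import stats

n, p = 10, 0.3
x = np.arange(n + 1)

# f(x) = C(n, x) p^x (1-p)^(n-x)
manual = [comb(n, k) * p**k * (1 - p)**(n - k) for k in x]
assert np.allclose(manual, stats.binom.pmf(x, n, p))
```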
Poisson distribution#
A Poisson random variable \(X\) with mean parameter \(\mu\) has probability mass function
\(f(x) = \frac{e^{-\mu} \mu^x}{x!}, \quad x = 0, 1, 2, \ldots.\)
The Poisson distribution can be used to approximate the binomial distribution when \(n\) is large and \(p\) is small. It can also be used to model the number of events in an interval associated with a process that evolves randomly over space or time; applications include the number of typographical errors in a book and the number of customers arriving in an hour.
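A sketch of the approximation claim (the choices \(n = 1000\), \(p = 0.002\) are arbitrary): for large \(n\) and small \(p\), \(Binomial(n, p)\) is close to \(Poisson(np)\).

```python
import numpy as np
from scipy import stats

n, p = 1000, 0.002                    # large n, small p
mu = n * p                            # matching Poisson mean
x = np.arange(11)

# Pointwise pmf difference should be tiny (on the order of 1e-4 here)
diff = np.abs(stats.binom.pmf(x, n, p) - stats.poisson.pmf(x, mu))
print(diff.max())
```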