normal distribution

Normal distribution

Figure: probability density function of the normal distribution.
A normal (or Gaussian) distribution is a continuous probability distribution for a real-valued random variable. The general form of its probability density function (PDF) is \[f(x)=\frac{1}{\sqrt{2\pi\sigma^{2}}}e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}\]. The parameter \[\mu\] is the mean of the distribution and the parameter \[\sigma^{2}\] is the variance, so \[\sigma\] is the standard deviation.
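The density formula can be evaluated directly. The following is a minimal sketch using only the Python standard library; the function name `normal_pdf` and its argument names are illustrative choices, not part of the note.

```python
import math


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of N(mu, sigma^2) at x. Note: sigma is the standard deviation."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    variance = sigma ** 2
    # f(x) = exp(-(x - mu)^2 / (2 sigma^2)) / sqrt(2 pi sigma^2)
    return math.exp(-((x - mu) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)


# The density peaks at x = mu, where it equals 1 / sqrt(2 pi sigma^2).
print(normal_pdf(0.0))                     # standard normal at its mean
print(normal_pdf(1.0, mu=1.0, sigma=2.0))  # wider curve, lower peak
```

Doubling \[\sigma\] halves the height of the peak, since the total area under the curve must stay equal to 1.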

Standard normal distribution

The standard normal distribution is the special case where \[\mu=0\] and \[\sigma^{2}=1\]. Its PDF is \[\varphi(z)=\frac{e^{-\frac{z^{2}}{2}}}{\sqrt{2\pi}}\].

General normal distribution

We say a random variable \[X\] follows a \[\mathcal{N}(\mu,\sigma^{2})\] distribution if \[X=\sigma Z+\mu\], where the random variable \[Z\sim \mathcal{N}(0,1)\] follows a standard normal distribution. The notation \[X\sim\mathcal{N}(\mu,\sigma^{2})\] means \[X\] is normally distributed with mean \[\mu\] and variance \[\sigma^{2}\].

Since \[X=\sigma Z+\mu\implies Z=\frac{X-\mu}{\sigma}\], standardizing \[X\] recovers a standard normal variable:

\begin{align*} Z&=\frac{X-\mu}{\sigma}\\ &=\frac{1}{\sigma}X-\frac{\mu}{\sigma}\\ &=aX+b\quad\text{with } a=\frac{1}{\sigma},\ b=-\frac{\mu}{\sigma}\\ &\sim \mathcal{N}(a\mu+b,a^{2}\sigma^{2})\quad\text{linear transformation}\\ &\sim \mathcal{N} \left( \frac{\mu}{\sigma}-\frac{\mu}{\sigma},\frac{\sigma^{2}}{\sigma^{2}} \right)\\ &\sim \mathcal{N}(0,1) \end{align*}
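The standardization can be checked numerically. The sketch below is a simulation under assumed values (\[\mu=10\], \[\sigma=3\], a fixed seed, and a hypothetical helper named `standardize`): it draws samples of \[X\], applies \[Z=\frac{X-\mu}{\sigma}\], and checks that the result has mean close to 0 and standard deviation close to 1.

```python
import random
import statistics


def standardize(xs, mu, sigma):
    """Map each x to z = (x - mu) / sigma."""
    return [(x - mu) / sigma for x in xs]


rng = random.Random(0)   # fixed seed so the run is repeatable
mu, sigma = 10.0, 3.0    # illustrative parameters, not from the note

# Draw X ~ N(mu, sigma^2), then standardize.
x_samples = [rng.gauss(mu, sigma) for _ in range(100_000)]
z_samples = standardize(x_samples, mu, sigma)

z_mean = statistics.fmean(z_samples)
z_sd = statistics.pstdev(z_samples)
print(f"mean of Z: {z_mean:.3f}, s.d. of Z: {z_sd:.3f}")  # expect values near 0 and 1
```

A simulation only illustrates the result; the derivation above is what establishes that \[Z\] is exactly standard normal.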

An extremely common use of this transform is to express the cumulative distribution function (CDF) of \[X\], \[F_{X}(x)\], in terms of the CDF of \[Z\], \[F_{Z}(x)\], more commonly written \[\Phi(x)\].

\begin{align*} F_{X}(x)&=P(X\le x)\\ &=P \left( \frac{X-\mu}{\sigma}\le \frac{x-\mu}{\sigma} \right)\\ &=P \left( Z\le \frac{x-\mu}{\sigma} \right)\\ &=\Phi \left( \frac{x-\mu}{\sigma} \right)\\ &=\Phi(z),\quad\text{where } z=\frac{x-\mu}{\sigma} \end{align*}
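In code, this means a single implementation of \[\Phi\] is enough to evaluate the CDF of any normal distribution. The sketch below uses the standard identity \[\Phi(z)=\frac{1}{2}\left(1+\operatorname{erf}\left(\frac{z}{\sqrt{2}}\right)\right)\] and compares the result with `statistics.NormalDist` from the Python standard library; the function names and the example values \[\mu=10\], \[\sigma=3\], \[x=12\] are illustrative.

```python
import math
from statistics import NormalDist


def phi_cdf(z: float) -> float:
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """F_X(x) for X ~ N(mu, sigma^2), computed as Phi((x - mu) / sigma)."""
    z = (x - mu) / sigma
    return phi_cdf(z)


mu, sigma, x = 10.0, 3.0, 12.0             # illustrative values
via_phi = normal_cdf(x, mu, sigma)
via_library = NormalDist(mu, sigma).cdf(x)  # library computes the CDF directly
print(via_phi, via_library)                 # the two should agree closely
```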

Empirical rule

If \[X\] is a normally distributed random variable, then approximately

  • \[68\%\] of the \[x\] values lie within one standard deviation of the mean, between \[\mu-\sigma\] and \[\mu+\sigma\] (1 s.d.)
  • \[95\%\] of the \[x\] values lie within two standard deviations of the mean, between \[\mu-2\sigma\] and \[\mu+2\sigma\] (2 s.d.)
  • \[99.7\%\] of the \[x\] values lie within three standard deviations of the mean, between \[\mu-3\sigma\] and \[\mu+3\sigma\] (3 s.d.)
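These percentages follow from the standard normal CDF: the probability of falling within \[k\] standard deviations of the mean is \[\Phi(k)-\Phi(-k)\], whatever the values of \[\mu\] and \[\sigma\]. A minimal sketch using `statistics.NormalDist` (the helper name `coverage` is an illustrative choice):

```python
from statistics import NormalDist


def coverage(k: float) -> float:
    """P(mu - k*sigma <= X <= mu + k*sigma) for any normal X."""
    z = NormalDist()  # standard normal, mean 0 and s.d. 1
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3):
    # Prints roughly 68%, 95%, and 99.7%.
    print(f"within {k} s.d.: {coverage(k):.4%}")
```

The rule rounds these probabilities; the two-standard-deviation figure, for example, is slightly above 95%.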