probability density function

A probability density function (PDF), or the density of an absolutely continuous random variable, is a function used to specify the probability that the random variable falls within a particular range of values. The "density" in question is analogous to mass density in physics: here it is the probability per unit length. The probability of a range is given by the integral of the variable's PDF over that range.

Intuitively, suppose we have a probability distribution for the temperature in a room. Then, for a small \[\Delta T\], \[P(\text{temperature between $T$ and $T+\Delta T$})\approx\rho(T)\cdot \Delta T\]. While this is an oversimplification, it conveys the meaning of probability density as probability per unit length. It also resembles the construction of an integral: each term \[\rho(T)\cdot \Delta T\] is the area of a thin rectangle of width \[\Delta T\] and height \[\rho(T)\] under the density curve, and as these rectangles become infinitesimally thin the sum becomes, by the definition of the integral, \[\sum\rho(T)\,\Delta T\to\int\rho(T)\,dT=\int f_{X}(u)\,du\].
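The rectangle picture can be checked numerically. The sketch below assumes a hypothetical temperature density (a normal curve with mean 20 and standard deviation 2, chosen only for illustration) and sums \[\rho(T)\cdot\Delta T\] over thin rectangles; the names `rho` and `riemann_probability` are made up for this example.

```python
import math

def rho(t, mean=20.0, sd=2.0):
    """Hypothetical temperature density: a normal curve (an assumption, not from the article)."""
    z = (t - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def riemann_probability(density, a, b, n):
    """Approximate P(a <= T <= b) by summing density(T) * dT over n thin rectangles."""
    dt = (b - a) / n
    # Evaluate the density at the midpoint of each rectangle.
    return sum(density(a + (i + 0.5) * dt) * dt for i in range(n))

coarse = riemann_probability(rho, 18.0, 22.0, 4)     # a few wide rectangles
fine = riemann_probability(rho, 18.0, 22.0, 4000)    # many thin rectangles
# Closed form for a normal variable lying within one standard deviation of its mean.
exact = math.erf(1.0 / math.sqrt(2.0))

print(coarse, fine, exact)
```

As the rectangles get thinner, the sum approaches the value of the integral, which is the sense in which the sum turns into \[\int\rho(T)\,dT\].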

Definition

Consider a continuous random variable \[X\] with an absolutely continuous cumulative distribution function (CDF) \[F_{X}(x)\]. The probability density function \[f_{X}(x)\] (sometimes written as \[\rho(x)\], by analogy with density in physics) is then defined by \[f_{X}(x)=\frac{dF_{X}(x)}{dx}=F^{\prime}_{X}(x)\] wherever \[F_{X}(x)\] is differentiable at \[x\].

To show where this definition comes from, define the function \[f_{X}(x)\] as follows: \[f_{X}(x)=\lim_{\Delta\to 0^{+}}\frac{P(x<X\le x+\Delta)}{\Delta}\]. We use the half-open interval \[\left( x,x+\Delta \right]\] here purely so that the expression lines up exactly with the CDF; for a continuous \[X\] the choice does not matter. A single point has zero probability for a continuous random variable, so the probability of an open interval is the same as the probability of the corresponding closed interval. That is, \[P(a< X< b)=P(a\le X\le b)=P(a\le X<b)=P(a<X\le b)\].

The function \[f_{X}(x)\] gives the probability density at the point \[x\]. It is the limit of the probability of the interval \[\left( x,x+\Delta \right]\] divided by the length of the interval as that length goes to zero. This closely resembles the definition of a derivative. Recall that, by the definition of the CDF, \[P(x<X\le x+\Delta)=F_{X}(x+\Delta)-F_{X}(x)\]. Thus \[f_{X}(x)=\lim_{\Delta\to 0^{+}}\frac{F_{X}(x+\Delta)-F_{X}(x)}{\Delta}\], which, wherever \[F_{X}\] is differentiable at \[x\], equals \[\frac{dF_{X}(x)}{dx}=F^{\prime}_{X}(x)\].
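The limit can be observed numerically. The sketch below assumes an exponential distribution with a hypothetical rate of 1.5 (chosen only for illustration; it does not appear in the text above) and compares the difference quotient of its CDF with its known density as \[\Delta\] shrinks.

```python
import math

RATE = 1.5  # hypothetical rate parameter, assumed for this example

def cdf(x):
    """CDF of an exponential random variable with the assumed rate."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0

def pdf(x):
    """Known density of the same distribution, used as the reference value."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0

def difference_quotient(x, delta):
    """P(x < X <= x + delta) / delta, written in terms of the CDF."""
    return (cdf(x + delta) - cdf(x)) / delta

x = 0.8
deltas = (1e-1, 1e-2, 1e-3, 1e-4)
errors = [abs(difference_quotient(x, d) - pdf(x)) for d in deltas]

for d, err in zip(deltas, errors):
    print(f"delta={d:g}  |quotient - pdf| = {err:.3e}")
```

The gap between the quotient and the density shrinks roughly in proportion to \[\Delta\], which is what one expects from a one-sided difference.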

Absolutely continuous (single variable) probability distributions

A random variable \[X\] has probability density \[f_{X}\] if \[P(a\le X\le b)=\int_{a}^{b}f_{X}(x)\,dx\]. If \[F_{X}\] is the cumulative distribution function of \[X\], then \[F_{X}(x)=\int_{-\infty}^{x}f_{X}(u)\,du\]. Then, by the fundamental theorem of calculus, \[P(a\le X\le b)=F_{X}(b)-F_{X}(a)=\int_{a}^{b}f_{X}(u)\,du\].
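As a worked illustration of these relations, take an exponential density with rate \[\lambda>0\]. This distribution is an assumption made for the example and is not discussed elsewhere in this article.

```latex
\[
f_{X}(x)=\begin{cases}\lambda e^{-\lambda x},&\text{for $x\ge 0$}\\0,&\text{otherwise}\end{cases}
\qquad
F_{X}(x)=\int_{0}^{x}\lambda e^{-\lambda u}\,du=1-e^{-\lambda x}\quad\text{for $x\ge 0$}
\]
\[
P(a\le X\le b)=F_{X}(b)-F_{X}(a)=e^{-\lambda a}-e^{-\lambda b}\quad\text{for $0\le a\le b$}
\]
```

For instance, with \[\lambda=1\], \[a=1\] and \[b=2\], this gives \[P(1\le X\le 2)=e^{-1}-e^{-2}\approx 0.2325\].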

Piecewise definition of PDFs

Starting with a simple example, let the random variable \[X\] denote the time, in minutes, that a person waits for an elevator to arrive. Suppose the longest one would need to wait for the elevator is two minutes, so the possible values of \[X\] are given by the interval \[\left[ 0,2 \right]\]. A possible PDF for \[X\] is given by \[f_{X}(x)=\begin{cases}x,&\text{for $0\le x<1$}\\2-x,&\text{for $1\le x\le2$}\\0,&\text{otherwise}\end{cases}\]. This function is non-negative and the total area under it is 1, as required of a PDF. So, if we wish to calculate the probability that a person waits less than 0.5 minutes for the elevator to arrive, we simply calculate \[P(0\le X\le 0.5)=\int_{0}^{0.5}f_{X}(x)\,dx=\int_{0}^{0.5}x\,dx=0.125\].
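The elevator example can be reproduced with a short numerical sketch. The helper names `elevator_pdf` and `integrate` are made up for this illustration, and the midpoint rule used here is just one simple way to approximate the integral.

```python
def elevator_pdf(x):
    """Piecewise (triangular) density for the waiting time, in minutes."""
    if 0 <= x < 1:
        return x
    if 1 <= x <= 2:
        # The value at the single point x = 1 does not affect any probability.
        return 2 - x
    return 0.0

def integrate(f, a, b, n=1000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

total = integrate(elevator_pdf, 0.0, 2.0)    # should be close to 1 for a valid PDF
p_short = integrate(elevator_pdf, 0.0, 0.5)  # P(0 <= X <= 0.5)
p_mid = integrate(elevator_pdf, 0.5, 1.5)    # P(0.5 <= X <= 1.5), spans both pieces

print(total, p_short, p_mid)
```

The third quantity shows how a range that crosses \[x=1\] draws on both branches of the piecewise definition; by hand it is \[\int_{0.5}^{1}x\,dx+\int_{1}^{1.5}(2-x)\,dx=0.375+0.375=0.75\].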
