Law of the unconscious statistician
The law of the unconscious statistician is a theorem which expresses the expected value of a function \[g(X)\] of a random variable \[X\] in terms of \[g\] and the probability distribution of \[X\].
Discrete probability distribution
Let \[X\] be a discrete random variable with support \[\mathcal{X}\] and probability mass function \[f_{X}(x)\]. Then the expected value of \[g(X)\] is \[E(g(X))=\sum_{x\in \mathcal{X}}g(x)f_{X}(x)\].
Say we've computed \[E(X)\] for some distribution \[X\], \[E(X)=\sum_{x\in \mathcal{X}}xf_{X}(x)\]. Now we're looking to compute the variance, \[E(X^{2})-E(X)^{2}\]. Logically, we would need to compute \[E(X^{2})\], which is the expected value of a new random variable, \[Y=X^{2}\]. Well, the unconscious statistician often doesn't feel like computing another PMF, and instead just reasons by analogy: if \[E(X)=\sum_{x\in \mathcal{X}}xf_{X}(x)\], then surely we can simply replace the \[x\] with an \[x^{2}\], giving \[E(X^{2})=\sum_{x\in \mathcal{X}}x^{2}f_{X}(x)\]. This isn't a legitimate argument, but this laziness turns out to give the right answer in general, i.e. \[E(g(X))=\sum_{x\in\mathcal{X}}g(x)f_{X}(x)\]. This also explains the origin of the name: some statisticians unknowingly present this as the definition of expected value rather than as a theorem.
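As an illustration (the fair die here is my own choice of example, not from the text above), a short Python sketch comparing the lazy LOTUS sum with the "honest" route of first building the PMF of \[Y=X^{2}\]:

```python
from fractions import Fraction

# PMF of a fair six-sided die: f_X(x) = 1/6 for x in 1..6
pmf_x = {x: Fraction(1, 6) for x in range(1, 7)}

# LOTUS shortcut: E(X^2) = sum of x^2 * f_X(x), no PMF for Y = X^2 needed
e_x2_lotus = sum(x**2 * p for x, p in pmf_x.items())

# The "honest" route: first build the PMF of Y = X^2, then take E(Y)
pmf_y = {}
for x, p in pmf_x.items():
    pmf_y[x**2] = pmf_y.get(x**2, 0) + p
e_x2_direct = sum(y * p for y, p in pmf_y.items())

assert e_x2_lotus == e_x2_direct == Fraction(91, 6)
```

Both routes give \[E(X^{2})=(1+4+9+16+25+36)/6=91/6\], as the theorem promises.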
Now, we shall prove this theorem. Suppose \[g\] is invertible on \[\mathcal{X}\], so that \[g^{-1}\] is well defined (the general case follows by grouping terms over the preimages \[g^{-1}(\{y\})\]). The expected value of \[Y=g(X)\] is defined as \[E(Y)=\sum_{y\in \mathcal{Y}}yf_{Y}(y)\], where \[\mathcal{Y}=g(\mathcal{X})\] is the support of \[Y\]. Writing the PMF \[f_{Y}(y)\] in terms of \[y=g(x)\], we get \[E(g(X))=\sum_{y\in \mathcal{Y}}y\cdot P(g(X)=y)\], where \[P\] is the probability measure. Then, since \[g(X)=y\] exactly when \[X=g^{-1}(y)\], \[E(g(X))=\sum_{y\in \mathcal{Y}}g(g^{-1}(y))\cdot P(X=g^{-1}(y))=\sum_{y\in \mathcal{Y}}g(g^{-1}(y))f_{X}(g^{-1}(y))\].
Note that summing over \[y\in\mathcal{Y}\] with \[x=g^{-1}(y)\] is the same as summing over \[x\in\mathcal{X}\], so we conclude that \[E(g(X))=\sum_{x\in \mathcal{X}}g(x)f_{X}(x)\].
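The theorem also holds when \[g\] is not invertible, since terms with the same value of \[g(x)\] simply get grouped together in the PMF of \[Y\]. A small numerical check of this (the uniform distribution on \[\{-2,\dots,2\}\] is my own illustrative choice):

```python
from fractions import Fraction

# X uniform on {-2, -1, 0, 1, 2}; g(x) = x^2 is NOT injective here,
# yet LOTUS still holds: x and -x just get merged in the PMF of Y
support = [-2, -1, 0, 1, 2]
pmf_x = {x: Fraction(1, 5) for x in support}
g = lambda x: x * x

# Right-hand side of LOTUS: sum over the support of X
lotus = sum(g(x) * p for x, p in pmf_x.items())

# Definition of E(Y): build f_Y by summing f_X over the preimage of each y
pmf_y = {}
for x, p in pmf_x.items():
    pmf_y[g(x)] = pmf_y.get(g(x), 0) + p
definition = sum(y * p for y, p in pmf_y.items())

assert lotus == definition == Fraction(2, 1)
```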
Continuous probability distribution
Similarly, if \[X\] is continuous with probability density function \[f_{X}\], then the expected value of \[g(X)\] is \[E(g(X))=\int_{-\infty}^{\infty}g(x)f_{X}(x)\,dx\].
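As a sanity check of the continuous statement (the standard normal density and the crude midpoint-rule quadrature below are my own illustrative choices, not part of the text), we can approximate the LOTUS integral numerically:

```python
import math

# Density of the standard normal distribution (illustrative choice of f_X)
def f_X(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def lotus(g, f, lo=-10.0, hi=10.0, n=200_000):
    """Midpoint-rule approximation of the LOTUS integral of g(x) f(x) dx."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) * f(lo + (i + 0.5) * h) for i in range(n)) * h

# For X ~ N(0, 1): E(X^2) = Var(X) + E(X)^2 = 1, with no density of X^2 needed
e_x2 = lotus(lambda x: x * x, f_X)
assert abs(e_x2 - 1.0) < 1e-6
```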
Let \[y=g(x)\] and let \[F_{X}\] denote the cumulative distribution function of the random variable \[X\]; suppose \[g\] is differentiable and strictly increasing. By the inverse function rule, \[\frac{d}{dy}g^{-1}(y)=\frac{1}{g^{\prime}(g^{-1}(y))}\]. Substituting \[x=g^{-1}(y)\], \[dx=\frac{1}{g^{\prime}(g^{-1}(y))}dy\], we get \[E(g(X))=\int_{-\infty}^{\infty}g(g^{-1}(y))f_{X}(g^{-1}(y))\frac{1}{g^{\prime}(g^{-1}(y))}\,dy\]. Next, considering the relationship between the CDFs of \[Y\] and \[X\]: since \[g\] is increasing, \[F_{Y}(y)=P(Y\le y)=P(g(X)\le y)=P(X\le g^{-1}(y))=F_{X}(g^{-1}(y))\]. Then, differentiating, \[f_{Y}(y)=\frac{d}{dy}F_{X}(g^{-1}(y))=f_{X}(g^{-1}(y))\frac{1}{g^{\prime}(g^{-1}(y))}\].
With this result, we can conclude that \[E(g(X))=\int_{-\infty}^{\infty}g(g^{-1}(y))f_{X}(g^{-1}(y))\frac{1}{g^{\prime}(g^{-1}(y))}\,dy=\int_{-\infty}^{\infty}yf_{Y}(y)\,dy\]. Therefore, \[E(g(X))=\int_{-\infty}^{\infty}g(x)f_{X}(x)\,dx\].
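The change of variables above can also be checked numerically. In this sketch (my own example: \[g(x)=e^{x}\] applied to a standard normal \[X\], so \[Y\] is lognormal with \[f_{Y}(y)=f_{X}(\ln y)/y\]), both integrals agree with \[E(e^{X})=e^{1/2}\]:

```python
import math

f_X = lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi)  # standard normal density
g = lambda x: math.exp(x)                  # strictly increasing, so g^{-1}(y) = ln y
f_Y = lambda y: f_X(math.log(y)) / y       # f_X(g^{-1}(y)) / g'(g^{-1}(y))

def midpoint(fn, lo, hi, n=400_000):
    """Crude midpoint-rule quadrature over [lo, hi]."""
    h = (hi - lo) / n
    return sum(fn(lo + (i + 0.5) * h) for i in range(n)) * h

lhs = midpoint(lambda x: g(x) * f_X(x), -12.0, 12.0)   # integral of g(x) f_X(x) dx
rhs = midpoint(lambda y: y * f_Y(y), 1e-9, 2000.0)     # integral of y f_Y(y) dy

# Both approximate E(e^X) = e^{1/2} for X ~ N(0, 1)
assert abs(lhs - math.exp(0.5)) < 1e-4
assert abs(rhs - math.exp(0.5)) < 1e-4
```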