Demystification of Probability Notation

2020/04/26

Abuse of notation is quite common when working with probability distributions. For example, when two random variables $X$ and $Y$ are independent, one often sees $$ \begin{align} & p(x, y) = p(x)p(y). \end{align} $$

When presented with this notation it can be difficult to tell which symbol is the random variable ($X$), which is a realization of that random variable ($x$), and that $p(x)$ and $p(y)$ are two different functions. A more explicit way to write the equation above is $$ \begin{align} & p_{X,Y}(x, y) = p_X(x)p_Y(y), \end{align} $$

where it becomes clear that $p_{X,Y}$, $p_X$ and $p_Y$ are three different functions.
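
To make the distinction concrete, here is a minimal sketch with two discrete random variables. The specific distributions (a biased coin for $X$, a fair three-sided die for $Y$) and the helper `marginal` are assumptions chosen for illustration only. The joint $p_{X,Y}$ and the marginals $p_X$ and $p_Y$ are stored as three separate objects, which is exactly what the subscripts express.

```python
from fractions import Fraction
from itertools import product

# Hypothetical marginals: X is a biased coin, Y is a fair three-sided die.
p_X = {0: Fraction(1, 4), 1: Fraction(3, 4)}
p_Y = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# Joint pmf constructed under the independence assumption:
# p_XY(x, y) = p_X(x) * p_Y(y)
p_XY = {(x, y): p_X[x] * p_Y[y] for x, y in product(p_X, p_Y)}


def marginal(joint, axis):
    """Recover a marginal pmf by summing the joint over the other variable."""
    out = {}
    for pair, prob in joint.items():
        out[pair[axis]] = out.get(pair[axis], 0) + prob
    return out


# Evaluated at the same point 1, the two marginals give different values,
# so writing both of them as "p" hides which function is meant.
print(p_X[1], p_Y[1])  # 3/4 1/3
```

Summing the joint over $y$ with `marginal(p_XY, 0)` gives back `p_X`, and summing over $x$ with `marginal(p_XY, 1)` gives back `p_Y`.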

The notation gets more confusing when computing expectations. The expectation operator is applied to a function of a random variable. In many works one reads $$ \begin{align} &\mathbb{E}_{p(x)}\left[ f(x) \right] = \int p(x)f(x)\mathrm{d}x. \end{align} $$

Again we must assume that $p(x) \equiv p_X(x)$, and that $f(x)$ on the LHS is a function of the random variable rather than of a realization. An unambiguous way to write the above would be $$ \begin{align} &\mathbb{E}\left[ f(X) \right] = \int p_{X}(x)f(x)\mathrm{d}x. \end{align} $$
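
The two sides of $\mathbb{E}\left[ f(X) \right] = \int p_{X}(x)f(x)\mathrm{d}x$ can be compared numerically. The sketch below assumes, purely for illustration, that $X \sim \mathrm{Uniform}(0, 1)$ and $f(x) = x^2$, for which the exact value is $1/3$. The function names are hypothetical. The left-hand side is estimated by averaging $f$ over samples of the random variable $X$, and the right-hand side by integrating over realizations $x$ with a midpoint rule.

```python
import random


def p_X(x):
    """Density of X ~ Uniform(0, 1) (an assumption for this example)."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def f(x):
    """The function whose expectation we want (also an assumption)."""
    return x * x


def expectation_by_integral(n_bins=1000):
    """Right-hand side: midpoint-rule approximation of the integral
    of p_X(x) * f(x) over the realizations x in [0, 1]."""
    h = 1.0 / n_bins
    midpoints = ((i + 0.5) * h for i in range(n_bins))
    return sum(p_X(x) * f(x) * h for x in midpoints)


def expectation_by_sampling(n_samples=100_000, seed=0):
    """Left-hand side: Monte Carlo estimate of E[f(X)], obtained by
    drawing samples of the random variable X and averaging f over them."""
    rng = random.Random(seed)
    return sum(f(rng.random()) for _ in range(n_samples)) / n_samples


print(expectation_by_integral())
print(expectation_by_sampling())
```

Both numbers should be close to $1/3$. Note that `p_X` appears only in the integral: in the sampling version the density is implicit in how the samples of $X$ are drawn.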

Summary

It is OK to abuse notation when we know exactly which symbols are random variables and which are their realizations. When in doubt, it helps to write everything out explicitly at least once.
