The product of two gaussian pdfs is not a pdf, but it is gaussian (a.k.a. loving algebra)

09/11/2010  

(A brief note for the spanish readers: lo siento, tenía ganas desde hace mucho tiempo de añadir notas técnicas -dan menos disgustos que las entradas sobre irritaciones-, quería apuntar la solución a un problema que me ha surgido en más de una ocasión, y, además, resulta que los sitios de internet donde lo he encontrado ni se molestan en desarrollarlo -de ahí que lo ponga en inglés-. No sólo de tontás vive el ingeniero ;P )

Gaussian probability density functions (pdf), which have two parameters and this nice syntax -in the unidimensional case-:

p(x; \mu, \sigma) = \frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x-\mu)^2}{2 \sigma^2}}

and, of course, with an integral over x \in \left [ -\infty,+\infty \right ] that yields 1 (as any pdf), are the most common formalization for the shape of uncertainty in science. When we do not know why and how some measurement varies over time in spite of the underlying system staying invariant (apparently), and in addition have no idea about the internals of the process that is producing that measurement, and in addition we do know that measurements move around a central value with high simmetry, it is usual to bring the familiar (and beautiful) graph of a gaussian to light:

The main reason for the prevalence of this pdf is the formidable evidence that supports the Central Limit Theorem, by which, for example, the sum of many measurements taken from a given random process -independently-, divided by the number of them, is a new measurement which tends to have a gaussian shape, more and more gaussian as that number of measurements increases. Actually, the CLT works very well in so many disciplines and problems, even when the measurements do not come from identically distributed processes, that gaussians are taken as the default probabilistic model of uncertainty in countless situations.

Not to mention the important mathematical simplifications that are enabled when the gaussian assumption is made 😉

Well, sometimes you find in your calculations products of gaussian functions. Some of those times what you actually find are products of gaussian pdfs, which are a subset of all the possible gaussian functions; for example, when you assume independent, multiple measurements in the likelihood factor of the bayes rule (ok, in estimation theory that factor is evaluated pointwise and it is common not to need any closed form for it -it is just a number after all-, but at other times you need to know that form in order to figure out its influence on the prior; or maybe it is just you are curious 😉 ).

Anyway, the difference between a gaussian function and a gaussian pdf is that the former does not need to integrate to 1 in its whole support.

You can find easily in Internet that the product of two gaussian functions is a gaussian function. That is true, although it is rare -I could’nt- to find an algebraic demonstrarion of why. Moreover, in some places you can find even the value for the parameters of the resulting gaussian pdf of the product of two gaussian pdfs, but again an algebraical deduction is missing, and, more importantly…

… it turns out that the product of two gaussian pdfs is not a gaussian pdf! It is a gaussian function, or, in other words, a scaled gaussian pdf (which cannot be called a pdf at all!).

The purpose of this post is to show why is that so, thus let us go for the demonstration.

Let p1 and p2 be two gaussian pdfs with the same support:

p1(x; \mu_1, \sigma_1)=\frac{1}{\sqrt{2 \pi {\sigma_1}^2}} e^{-\frac{(x-\mu_1)^2}{2 {\sigma_1}^2}} \quad p2(x; \mu_2, \sigma_2)=\frac{1}{\sqrt{2 \pi {\sigma_2}^2}} e^{-\frac{(x-\mu_2)^2}{2 {\sigma_2}^2}}

Their product is:

p1(x; \mu_1, \sigma_1) p2(x; \mu_2, \sigma_2) =     \frac{1}{\sqrt{2 \pi {\sigma_1}^2}} e^{-\frac{(x-\mu_1)^2}{2 {\sigma_1}^2}}    \frac{1}{\sqrt{2 \pi {\sigma_2}^2}} e^{-\frac{(x-\mu_2)^2}{2 {\sigma_2}^2}} =

    \frac{1}{\sqrt{2 \pi {\sigma_1}^2 {\sigma_2}^2 2 \pi }}         e^{ -\frac{(x-\mu_1)^2}{2 {\sigma_1}^2} - \frac{(x-\mu_2)^2}{2 {\sigma_2}^2} } =    \frac{1}{\sqrt{2 \pi {\sigma_1}^2 {\sigma_2}^2 2 \pi }}         e^{ \frac{ - {\sigma_2}^2 (x-\mu_1)^2 - {\sigma_1}^2 (x-\mu_2)^2 } { 2 {\sigma_1}^2 {\sigma_2}^2 } }

The first trick: since { {\sigma_{1}}^{2} + {\sigma_{2}}^{2} } is greater than zero (otherwise we are talking about the product of two Dirac’s delta functions, which would be 0 unless \mu_1 = \mu_2 ), we can complicate the previous product a little bit without altering its value:

p1 p2 =    \frac{1}{\sqrt{2 \pi \frac { {\sigma_1}^2 {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} ( {\sigma_1}^2 + {\sigma_2}^2 ) 2 \pi }}         e^{ \frac{ - \frac { {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} (x-\mu_1)^2 - \frac{ {\sigma_1}^2 }{{\sigma_1}^2 + {\sigma_2}^2} (x-\mu_2)^2 } { 2 \frac { {\sigma_1}^2 {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} } }

Now we can separate a factor which resembles the scale factor of a gaussian pdf:

p1 p2 =    \frac{1}{\sqrt{2 \pi \frac { {\sigma_1}^2 {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} }}     \frac{1}{\sqrt{2 \pi ( {\sigma_1}^2 + {\sigma_2}^2 ) }}         e^{ \frac{ - \frac { {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} (x-\mu_1)^2 - \frac{ {\sigma_1}^2 }{{\sigma_1}^2 + {\sigma_2}^2} (x-\mu_2)^2 } { 2 \frac { {\sigma_1}^2 {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} } } =

    \frac{1}{\sqrt{2 \pi \frac { {\sigma_1}^2 {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} }}     e^{ ln { \frac{1}{\sqrt{2 \pi ( {\sigma_1}^2 + {\sigma_2}^2 ) }} }  }        e^{ \frac{ - \frac { {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} (x-\mu_1)^2 - \frac{ {\sigma_1}^2 }{{\sigma_1}^2 + {\sigma_2}^2} (x-\mu_2)^2 } { 2 \frac { {\sigma_1}^2 {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} } } =

    \frac{1}{\sqrt{2 \pi \frac { {\sigma_1}^2 {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} }}     e^{ ln { \frac{1}{\sqrt{2 \pi ( {\sigma_1}^2 + {\sigma_2}^2 ) }} }  +           \frac{ - \frac { {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} (x-\mu_1)^2 - \frac{ {\sigma_1}^2 }{{\sigma_1}^2 + {\sigma_2}^2} (x-\mu_2)^2 } { 2 \frac { {\sigma_1}^2 {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} } } =

    \frac{1}{\sqrt{2 \pi \frac { {\sigma_1}^2 {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} }}     e^{  - \frac{1}{2} ln ( 2 \pi ( {\sigma_1}^2 + {\sigma_2}^2 ) )  +           \frac{ - \frac { {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} (x-\mu_1)^2 - \frac{ {\sigma_1}^2 }{{\sigma_1}^2 + {\sigma_2}^2} (x-\mu_2)^2 } { 2 \frac { {\sigma_1}^2 {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} } }

Ok, it is time to focus only on the exponent of e. Firstly, we incorporate the previous scale factor into it:

    \frac        {             - ( \frac { {\sigma_1}^2 {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2}) ln ( 2 \pi ( {\sigma_1}^2 + {\sigma_2}^2 ) )              - \frac { {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} (x-\mu_1)^2 - \frac{ {\sigma_1}^2 }{{\sigma_1}^2 + {\sigma_2}^2} (x-\mu_2)^2          }          {             2 \frac { {\sigma_1}^2 {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2}          }

Then we expand the square terms that include x:

    \frac        {             - ( \frac { {\sigma_1}^2 {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2}) ln ( 2 \pi ( {\sigma_1}^2 + {\sigma_2}^2 ) )  +            - \frac { {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} (x^2 + {\mu_1}^2 -2x \mu_1)             - \frac{ {\sigma_1}^2 }{{\sigma_1}^2 + {\sigma_2}^2} (x^2 + {\mu_2}^2 -2x \mu_2)          }          {             2 \frac { {\sigma_1}^2 {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2}          }

And then we collect the numerator as a polynomial in x:

    \frac        {            x^2 ( \frac {- {\sigma_1}^2 - {\sigma_2}^2 }{ {\sigma_1}^2 + {\sigma_2}^2 } )            - 2x ( \frac {- \mu_1 {\sigma_2}^2 - \mu_2 {\sigma_1}^2 }{ {\sigma_1}^2 + {\sigma_2}^2} )            + ( \frac { - {\mu_1}^2 {\sigma_2}^2 - {\mu_2}^2 {\sigma_1}^2 - {\sigma_1}^2 {\sigma_2}^2 ln( 2 \pi ({\sigma_1}^2 + {\sigma_2}^2) ) }{ {\sigma_1}^2 + {\sigma_2}^2 } )         }         {             2 \frac { {\sigma_1}^2 {\sigma_2}^2 }{ {\sigma_1}^2 + {\sigma_2}^2 }          } =

    \frac        {            - x^2            + 2x ( \frac { \mu_1 {\sigma_2}^2 + \mu_2 {\sigma_1}^2 }{ {\sigma_1}^2 + {\sigma_2}^2} )            - ( \frac { {\mu_1}^2 {\sigma_2}^2 + {\mu_2}^2 {\sigma_1}^2 + {\sigma_1}^2 {\sigma_2}^2 ln( 2 \pi ({\sigma_1}^2 + {\sigma_2}^2) ) }{ {\sigma_1}^2 + {\sigma_2}^2 } )         }         {             2 \frac { {\sigma_1}^2 {\sigma_2}^2 }{ {\sigma_1}^2 + {\sigma_2}^2 }          }

The second trick: a polynomial of the form -x^2+2Ax-(A^2+C) turns out to be equivalent to -(x-A)^2-C, which is the form of the exponent of an unidimensional gaussian pdf plus a constant term. Casually, we had in the above expresion that polynomial if we do:

A=\frac { \mu_1 {\sigma_2}^2 + \mu_2 {\sigma_1}^2 }{ {\sigma_1}^2 + {\sigma_2}^2}

A^2+C=\frac { {\mu_1}^2 {\sigma_2}^2 + {\mu_2}^2 {\sigma_1}^2 + {\sigma_1}^2 {\sigma_2}^2 ln( 2 \pi ({\sigma_1}^2 + {\sigma_2}^2) ) }{ {\sigma_1}^2 + {\sigma_2}^2 }

We are just a few steps from knowing how close is our product of gaussian pdfs to yield another gaussian pdf! But first, let us take a short detour to obtain the value of C in our expression:

C=\frac { {\mu_1}^2 {\sigma_2}^2 + {\mu_2}^2 {\sigma_1}^2 + {\sigma_1}^2 {\sigma_2}^2 ln( 2 \pi ({\sigma_1}^2 + {\sigma_2}^2) ) }{ {\sigma_1}^2 + {\sigma_2}^2 } - A^2=

     \frac { {\mu_1}^2 {\sigma_2}^2 + {\mu_2}^2 {\sigma_1}^2 + {\sigma_1}^2 {\sigma_2}^2 ln( 2 \pi ({\sigma_1}^2 + {\sigma_2}^2) ) }{ {\sigma_1}^2 + {\sigma_2}^2 } -     ( \frac { \mu_1 {\sigma_2}^2 + \mu_2 {\sigma_1}^2 }{ {\sigma_1}^2 + {\sigma_2}^2} )^2 =

\frac  {      ( {\mu_1}^2 {\sigma_2}^2 + {\mu_2}^2 {\sigma_1}^2 + {\sigma_1}^2 {\sigma_2}^2 ln( 2 \pi ({\sigma_1}^2 + {\sigma_2}^2) ) ) ({\sigma_1}^2 + {\sigma_2}^2) -      (\mu_1 {\sigma_2}^2 + \mu_2 {\sigma_1}^2)^2  }  {      ({\sigma_1}^2 + {\sigma_2}^2)^2  }

Let \alpha be an abbreviation of that disturbing ln( 2 \pi ({\sigma_1}^2 + {\sigma_2}^2) ) and proceed with expanding terms:

C=\frac  {      ( {\mu_1}^2 {\sigma_2}^2 + {\mu_2}^2 {\sigma_1}^2 + {\sigma_1}^2 {\sigma_2}^2 \alpha ) ({\sigma_1}^2 + {\sigma_2}^2) -      (\mu_1 {\sigma_2}^2 + \mu_2 {\sigma_1}^2)^2  }  {      ({\sigma_1}^2 + {\sigma_2}^2)^2  } =

\frac  {      {\mu_1}^2 {\sigma_2}^4 + {\mu_2}^2 {\sigma_1}^2 {\sigma_2}^2 + {\sigma_1}^2 {\sigma_2}^4 \alpha + {\mu_1}^2 {\sigma_2}^2 {\sigma_1}^2 + {\mu_2}^2 {\sigma_1}^4 + {\sigma_1}^4 {\sigma_2}^2 \alpha - {\mu_1}^2 {\sigma_2}^4 - {\mu_2}^2 {\sigma_1}^4 - 2{\mu_1}{\mu_2} {\sigma_1}^2 {\sigma_2}^2  }  {      ({\sigma_1}^2 + {\sigma_2}^2)^2  } =

\frac  {      {\mu_2}^2 {\sigma_1}^2 {\sigma_2}^2 + {\sigma_1}^2 {\sigma_2}^4 \alpha + {\mu_1}^2 {\sigma_2}^2 {\sigma_1}^2 + {\sigma_1}^4 {\sigma_2}^2 \alpha - 2{\mu_1}{\mu_2} {\sigma_1}^2 {\sigma_2}^2  }  {      ({\sigma_1}^2 + {\sigma_2}^2)^2  } =

\frac  {      {\sigma_1}^2 {\sigma_2}^2 ( {\mu_2}^2 + {\sigma_2}^2 \alpha + {\mu_1}^2 + {\sigma_1}^2 \alpha - 2{\mu_1}{\mu_2} )  }   {      ({\sigma_1}^2 + {\sigma_2}^2)^2  } =

\frac  {      {\sigma_1}^2 {\sigma_2}^2 ( ( {\mu_1} - {\mu_2} )^2 + ({\sigma_1}^2 + {\sigma_2}^2) \alpha )  }   {      ({\sigma_1}^2 + {\sigma_2}^2)^2  } =

\frac { {\sigma_1}^2 {\sigma_2}^2 }{ ({\sigma_1}^2 + {\sigma_2}^2)^2 } ({\mu_1} - {\mu_2} )^2+\frac { {\sigma_1}^2 {\sigma_2}^2 }{ {\sigma_1}^2 + {\sigma_2}^2 } ln( 2 \pi ({\sigma_1}^2 + {\sigma_2}^2) )

Ook. We have already values for A and C. Using those letters, the resulting product of the two gaussian pdfs has become:

p1p2=    \frac{1}{\sqrt{2 \pi \frac { {\sigma_1}^2 {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} }}    e^{ \frac{-(x-A)^2-C}{2 \frac{{\sigma_1}^2 {\sigma_2}^2}{{\sigma_1}^2 + {\sigma_2}^2}} } =    \frac{1}{\sqrt{2 \pi \frac { {\sigma_1}^2 {\sigma_2}^2 }{{\sigma_1}^2 + {\sigma_2}^2} }}    e^{ \frac{-(x-A)^2}{2 \frac{{\sigma_1}^2 {\sigma_2}^2}{{\sigma_1}^2 + {\sigma_2}^2}} }   e^{- \frac {C}{2 \frac{{\sigma_1}^2 {\sigma_2}^2}{{\sigma_1}^2 + {\sigma_2}^2}} }

Notice that the first factor and the first exponential are a gaussian pdf. The whole result, however, is just a gaussian function, unless the second exponential equals 1…

Let us take a closer look to that second exponential, the reason why our result is not a gaussian pdf but a gaussian function:

   e^{- \frac {C}{2 \frac{{\sigma_1}^2 {\sigma_2}^2}{{\sigma_1}^2 + {\sigma_2}^2}} } =   e^{- \frac { \frac { {\sigma_1}^2 {\sigma_2}^2 }{ ({\sigma_1}^2 + {\sigma_2}^2)^2 } ({\mu_1} - {\mu_2} )^2+\frac { {\sigma_1}^2 {\sigma_2}^2 }{ {\sigma_1}^2 + {\sigma_2}^2 } ln( 2 \pi ({\sigma_1}^2 + {\sigma_2}^2) ) }        {2 \frac{{\sigma_1}^2 {\sigma_2}^2}{{\sigma_1}^2 + {\sigma_2}^2}} } =

   e^{- \frac { ({\mu_1} - {\mu_2} )^2 }{ 2 ({\sigma_1}^2 + {\sigma_2}^2) }         - \frac{1}{2} ln( 2 \pi ({\sigma_1}^2 + {\sigma_2}^2) )       } =   e^{- \frac { ({\mu_1} - {\mu_2} )^2 }{ 2( {\sigma_1}^2 + {\sigma_2}^2 )} }   e^{- \frac{1}{2} ln( 2 \pi ({\sigma_1}^2 + {\sigma_2}^2) ) } =

   \frac {1}{ \sqrt{2 \pi ({\sigma_1}^2 + {\sigma_2}^2)} }   e^{- \frac { ({\mu_1} - {\mu_2} )^2 }{ 2({\sigma_1}^2 + {\sigma_2}^2) } }

Surprisingly, this last expression is a gaussian pdf if we consider \mu_1 a variable, i.e., p(\mu_1; \mu_2, \sqrt{({\sigma_1}^2+{\sigma_2}^2)}) (we can also consider \mu_2 the variable). But what we are interested in is knowing under which conditions this expression equals 1, and, thus, the product of our original pdfs p1 and p2 is actually a gaussian pdf:

   \frac {1}{ \sqrt{2 \pi ({\sigma_1}^2 + {\sigma_2}^2)} }   e^{- \frac { ({\mu_1} - {\mu_2} )^2 }{ 2({\sigma_1}^2 + {\sigma_2}^2) } } = 1  \Leftrightarrow    - \frac { ({\mu_1} - {\mu_2} )^2 }{ 2({\sigma_1}^2 + {\sigma_2}^2) }   = ln ( \sqrt{2 \pi ({\sigma_1}^2 + {\sigma_2}^2)} )  \Leftrightarrow

   - \frac { ({\mu_1} - {\mu_2} )^2 }{ 2({\sigma_1}^2 + {\sigma_2}^2) }   = \frac {1}{2} ln ( 2 \pi ({\sigma_1}^2 + {\sigma_2}^2) )  \Leftrightarrow

   -  ({\mu_1} - {\mu_2} )^2    = ({\sigma_1}^2 + {\sigma_2}^2) ln ( 2 \pi ({\sigma_1}^2 + {\sigma_2}^2) )

Since { {\sigma_{1}}^{2} + {\sigma_{2}}^{2} } is greater than zero, and the expectations are independent from the variances, we will always find cases (infinite cases, actually) in which the equality does not hold. Thus, we will find infinite cases where the product of two gaussian pdfs is a gaussian function but not a gaussian pdf.

Nevertheless: the scaled gaussian function obtained as the result of the product has the following pdf-like parameters:

\mu_{1 x 2} = A = \frac { \mu_1 {\sigma_2}^2 + \mu_2 {\sigma_1}^2 }{ {\sigma_1}^2 + {\sigma_2}^2}

{\sigma_{1 x 2}}^2 = \frac{{\sigma_1}^2 {\sigma_2}^2}{{\sigma_1}^2 + {\sigma_2}^2}

which are the same you can find in the Internet… But remember: it is not a gaussian pdf even when we use these parameters to describe it!

facebooktwitterredditlinkedintumblrmail

Un comentario sobre “The product of two gaussian pdfs is not a pdf, but it is gaussian (a.k.a. loving algebra)

  1. Pingback: Lithographica » Blog Archive » Esto es un estadístico (gordo) que va con una Wii y…

Los comentarios están cerrados.