How to truncate a probability density function to a given interval while preserving its properties (well, some important properties at least)

Sometimes one needs a probability density function (pdf) that is “like” another one but confined to a given interval of the support of the variable. However, one may not be entirely confident that the intuitive solution is mathematically justified. For example: does it preserve the moments? If not, does it at least preserve some properties of the “behaviour” of the original pdf?

Well, there seem to be two straightforward ways of preserving (as much as we can imagine) the shape that the original pdf, call it f_1, has over the interval, call it [a,b] (1). That interval is a subset of the support of the r.v., with a,b \in \mathbb{R} and b > a.

And what could we expect from “preserving the shape” but only good things!

Both of these solutions treat every point of the interval in the same way:

a) to add a constant K > 0 to f_1 within the interval and set the final pdf, call it f_2, to zero outside it, i.e.:

f_2(x) = \begin{cases} K + f_1(x) \quad \text{if } x \in [a,b] \\ 0 \quad \text{otherwise} \end{cases}

The value of K is chosen so that f_2 integrates to 1 over [a,b], i.e.:

\int_a^b{f_2(x)dx} = 1 \implies \int_a^b{\big( K+f_1(x) \big) dx} = 1 \implies

\implies \int_a^b{ Kdx} + \int_a^b{ f_1(x)dx} = 1 \implies K \int_a^b{ dx } + \int_a^b{ f_1(x)dx} = 1 \implies

\implies K(b-a) + \int_a^b{ f_1(x)dx} = 1 \implies

\implies K = \frac{ 1-\int_a^b{ f_1(x)dx} }{b-a }

b) to scale f_1 by a constant K > 1 within the interval and set the final pdf, call it f_3, to zero outside it:

f_3(x) = \begin{cases} K f_1(x) \quad \text{if } x \in [a,b] \\ 0 \quad \text{otherwise} \end{cases}

Again, K is chosen so that f_3 integrates to 1 over the interval:

\int_a^b{f_3(x)dx} = 1 \implies \int_a^b{K f_1(x) dx} = 1 \implies

\implies K \int_a^b{f_1(x)dx} = 1 \implies

\implies K = \frac { 1 }{ \int_a^b{f_1(x)dx} }
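As a rough numerical sketch of both constants, take f_1 to be the Gaussian with mean 0.3 and standard deviation 0.1 that serves as the example in this post. The interval [0.2, 0.5] is an assumption made only for illustration (the post does not give numerical values for a and b), and the names `normal_cdf`, `k_add` and `k_scale` are hypothetical.

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a Gaussian N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

MU, SIGMA = 0.3, 0.1   # parameters of f_1 in the post's example
A, B = 0.2, 0.5        # ASSUMED interval [a,b], for illustration only

# Probability mass that f_1 places on [a,b]
mass = normal_cdf(B, MU, SIGMA) - normal_cdf(A, MU, SIGMA)

# Solution a): additive constant, K = (1 - mass) / (b - a)
k_add = (1.0 - mass) / (B - A)

# Solution b): scaling constant, K = 1 / mass
k_scale = 1.0 / mass

print(f"mass on [a,b] = {mass:.4f}")
print(f"additive K    = {k_add:.4f}")
print(f"scaling K     = {k_scale:.4f}")
```

With these assumed values f_1 leaves some mass outside the interval, so the additive K comes out positive and the scaling K comes out greater than 1, as expected.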

Both solutions are illustrated in the figure below with f_1 = \mathcal{N}(x; \mu = 0.3, \sigma^2 = 0.1^2), where the interval is, again, [a,b] (the other elements in the figure will enter the discussion soon).

[Figure: truncategaussian]

It can be observed in the figure, without much effort, that it is really difficult for the moments (expectation, variance…) to be preserved in either of these two solutions (the mode is indeed preserved as long as it lies within [a,b], but that is not a moment!). Therefore we have to look for other probabilistic guarantees that f_2 and/or f_3 may offer.

The key here is to understand that the following property is the most important thing to preserve in many applications: the probability of any random event relative to the probability of any other random event (both being exclusive) should be unchanged from f_1 to f_2 and/or f_3. If that guarantee exists, then the truncated pdf will have a (probabilistic) behaviour that is really similar to that of the original pdf when looking at different places (=values of the variable) within the interval.

Let us examine this assertion in more detail.

A random event is, informally speaking(2), a set of possible outcomes of the variable; for our purposes, we will refer only to random events that are intervals on the support of the variable (the most common ones). In the previous figure, [c,d] and [e,f] are random events.

Two (or more) random events are exclusive if they cannot occur simultaneously(3). In the previous example, [c,d] and [e,f] are exclusive: the variable will take either a value within the first interval, a value within the second interval, or a value outside both, but never a value that belongs to the two intervals at the same time. If [c,d] had intersected [e,f], then they would have been non-exclusive events (but still valid ones). [a,b] and [e,f], for instance, are non-exclusive.

Finally, the probability of a random event (i.e., of the random variable taking a value within the interval of that event) is exactly the area under the pdf over that interval.

Therefore, the probability of an event in relation to the probability of another one (technically called the odds of both events when one is the complement of the other) can be expressed mathematically as the ratio between the former and the latter probabilities. For the two random events defined in the figure, the relation in probability under f_1 is:

\frac{P_{f_1}\big[x \in [c,d] \big]}{P_{f_1}\big[ x \in [e,f] \big]}
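To make the "probability as area" idea concrete, here is a small sketch that approximates both areas with a midpoint rule and then forms the ratio above. The endpoints of [c,d] and [e,f] are assumed values chosen only for illustration (the post does not specify them), and `f1` and `area` are hypothetical helper names.

```python
import math

MU, SIGMA = 0.3, 0.1   # parameters of f_1 in the post's example
C, D = 0.25, 0.30      # ASSUMED event [c,d]
E, F = 0.40, 0.45      # ASSUMED event [e,f], disjoint from [c,d]

def f1(x):
    """Gaussian pdf N(x; MU, SIGMA^2)."""
    z = (x - MU) / SIGMA
    return math.exp(-0.5 * z * z) / (SIGMA * math.sqrt(2.0 * math.pi))

def area(pdf, lo, hi, n=10_000):
    """Midpoint-rule approximation of the integral of pdf over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(pdf(lo + (i + 0.5) * h) for i in range(n))

p_cd = area(f1, C, D)        # probability of the event [c,d] under f_1
p_ef = area(f1, E, F)        # probability of the event [e,f] under f_1
relation_f1 = p_cd / p_ef    # relation in probability under f_1

print(p_cd, p_ef, relation_f1)
```

With these assumed endpoints, [c,d] sits next to the mean and [e,f] sits in the tail, so the first event comes out roughly twice as probable as the second.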

In summary, our goal is that the final, truncated pdf preserves the relation in probability of exclusive random events that lie within [a,b] with respect to the original pdf, f_1.

a) Does solution f_2 preserve the relation in probability?

Since f_2(x) = f_1(x) + K, \forall x \in [a,b] , we have that the probability of any random event [g,h] \subseteq [a,b], h > g is:

P_{f_2} \big[ x \in [g,h] \big] = \int_g^h {f_2(x) dx} = \int_g^h {\big( K + f_1(x) \big)dx} = K(h-g) + \int_g^h { f_1(x) dx }

Therefore, the relation in probability of two exclusive random events [c,d] \subseteq [a,b] and [e,f] \subseteq [a,b] is:

\frac{P_{f_2}\big[x \in [c,d] \big]}{P_{f_2}\big[ x \in [e,f] \big]} = \frac{ K(d-c) + \int_c^d { f_1(x) dx} }{ K(f-e) + \int_e^f { f_1(x) dx} }

As is easily seen, in general this does not equal the same relation in probability under f_1 unless K = 0, which would require f_1 to integrate to 1 within [a,b] already, i.e., there would be nothing to truncate.

Therefore f_2 does not preserve the relation in probability nor, consequently, the probabilistic behaviour of f_1 in [a,b].
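A quick numerical check of this conclusion, offered as a sketch under assumed values: f_1 is the Gaussian of the post's example, while the interval [a,b] and the events [c,d] and [e,f] are hypothetical numbers picked for illustration. The helper names `cdf1`, `p1` and `p2` are also hypothetical.

```python
import math

MU, SIGMA = 0.3, 0.1   # parameters of f_1 in the post's example
A, B = 0.2, 0.5        # ASSUMED truncation interval [a,b]
C, D = 0.25, 0.30      # ASSUMED event [c,d] inside [a,b]
E, F = 0.40, 0.45      # ASSUMED event [e,f] inside [a,b]

def cdf1(x):
    """Cumulative distribution function of f_1."""
    return 0.5 * (1.0 + math.erf((x - MU) / (SIGMA * math.sqrt(2.0))))

def p1(lo, hi):
    """Probability of [lo, hi] under f_1."""
    return cdf1(hi) - cdf1(lo)

# Additive constant of solution a)
K = (1.0 - p1(A, B)) / (B - A)

def p2(lo, hi):
    """Probability of [lo, hi] under f_2; valid only for [lo, hi] within [A, B]."""
    return K * (hi - lo) + p1(lo, hi)

relation_f1 = p1(C, D) / p1(E, F)
relation_f2 = p2(C, D) / p2(E, F)

print(f"relation under f_1: {relation_f1:.4f}")
print(f"relation under f_2: {relation_f2:.4f}")
```

For these values the two ratios differ noticeably: adding the same constant to both events makes the less probable one gain relatively more, which pulls the ratio towards the ratio of the interval lengths.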

b) Does solution f_3 preserve the relation in probability?

Using the same reasoning, the probability of any random event [g,h] \subseteq [a,b] under f_3 is:

P_{f_3} \big[ x \in [g,h] \big] = \int_g^h {f_3(x) dx} = \int_g^h {K f_1(x)dx } = K \int_g^h { f_1(x)dx }

Therefore, the relation in probability of two random events [c,d] \subseteq [a,b] and [e,f] \subseteq [a,b] is:

\frac{P_{f_3}\big[x \in [c,d] \big]}{P_{f_3}\big[ x \in [e,f] \big]} = \frac{ K \int_c^d { f_1(x) dx} }{ K \int_e^f { f_1(x) dx} } =  \frac{ \int_c^d { f_1(x) dx} }{ \int_e^f { f_1(x) dx} }

which is exactly the relation in probability under f_1.

Therefore f_3 does preserve the relation in probability and, consequently, the probabilistic behaviour of f_1 in [a,b].
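The same kind of check can be sketched for f_3, this time integrating the truncated pdf numerically instead of relying on the algebra. As before, f_1 is the Gaussian of the post's example, and the interval and events are assumed values for illustration only. `f1`, `f3` and `area` are hypothetical helper names.

```python
import math

MU, SIGMA = 0.3, 0.1   # parameters of f_1 in the post's example
A, B = 0.2, 0.5        # ASSUMED truncation interval [a,b]
C, D = 0.25, 0.30      # ASSUMED event [c,d] inside [a,b]
E, F = 0.40, 0.45      # ASSUMED event [e,f] inside [a,b]

def f1(x):
    """Gaussian pdf N(x; MU, SIGMA^2)."""
    z = (x - MU) / SIGMA
    return math.exp(-0.5 * z * z) / (SIGMA * math.sqrt(2.0 * math.pi))

def area(pdf, lo, hi, n=10_000):
    """Midpoint-rule approximation of the integral of pdf over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(pdf(lo + (i + 0.5) * h) for i in range(n))

# Scaling constant of solution b)
K = 1.0 / area(f1, A, B)

def f3(x):
    """Truncated pdf: scaled f_1 inside [A, B], zero outside."""
    return K * f1(x) if A <= x <= B else 0.0

relation_f1 = area(f1, C, D) / area(f1, E, F)
relation_f3 = area(f3, C, D) / area(f3, E, F)

print(f"relation under f_1: {relation_f1:.6f}")
print(f"relation under f_3: {relation_f3:.6f}")
```

Up to floating-point rounding the two ratios coincide, because the factor K multiplies numerator and denominator alike.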


In conclusion: we can create a new pdf from a given one by truncating the latter to an interval, and the new pdf preserves most(4) of the probabilistic behaviour of the original one if we define it as a scaled version of the original pdf within the interval and zero outside it, the scaling factor being the one that makes the new pdf integrate to 1.

Alas, in general this solution will not preserve the moments of the original pdf, not even the first one (expectation)!

But the mode is preserved!
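Both closing remarks can be illustrated with a short numerical sketch. It uses the Gaussian f_1 of the post's example and an assumed interval [0.2, 0.5] that is not symmetric around the mean and that contains the mode (both choices are assumptions for illustration). The helper names are hypothetical, and the mode is located with a plain grid search.

```python
import math

MU, SIGMA = 0.3, 0.1   # parameters of f_1 in the post's example
A, B = 0.2, 0.5        # ASSUMED truncation interval [a,b], containing the mode

def f1(x):
    """Gaussian pdf N(x; MU, SIGMA^2)."""
    z = (x - MU) / SIGMA
    return math.exp(-0.5 * z * z) / (SIGMA * math.sqrt(2.0 * math.pi))

def area(fn, lo, hi, n=10_000):
    """Midpoint-rule approximation of the integral of fn over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(fn(lo + (i + 0.5) * h) for i in range(n))

K = 1.0 / area(f1, A, B)

def f3(x):
    """Truncated pdf: scaled f_1 inside [A, B], zero outside."""
    return K * f1(x) if A <= x <= B else 0.0

# First two moments of the truncated pdf
mean_f3 = area(lambda x: x * f3(x), A, B)
var_f3 = area(lambda x: (x - mean_f3) ** 2 * f3(x), A, B)

# Mode of the truncated pdf, located on a grid with step 0.0001
grid = [A + i * (B - A) / 3000 for i in range(3001)]
mode_f3 = max(grid, key=f3)

print(f"mean:     original {MU:.4f}, truncated {mean_f3:.4f}")
print(f"variance: original {SIGMA**2:.4f}, truncated {var_f3:.4f}")
print(f"mode:     original {MU:.4f}, truncated {mode_f3:.4f}")
```

With this interval the expectation shifts to the right and the variance shrinks, while the mode stays where it was. If the interval did not contain the original mode, the mode of the truncated pdf would necessarily move as well.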

(1) For the sake of simplicity we only deal with univariate pdfs in this post. By the way: the author recommends not to read any more footnotes until you have read the entire text.

(2) Formally speaking, a random event is an element of the σ-algebra of the sample space of a probability space of a given stochastic process, sample space that is mapped by the random variable into a subset of the real numbers (in most cases). [ . . . ] Oook. You can forget about that stuff. You are getting it right at this moment if you think that, for practical uses, a random event is equivalent to a set of values that can be taken by the r.v., just like I say in the main text. Especially if you are an engineer and not a mathematician.

(3) It amuses me how the exact meaning of “simultaneous” is actually left to the user of the theory of probability (3.1). In engineering, simultaneity becomes “when outcomes of the stochastic process, which correspond to something that occurs in the physical world, occur at times that are indistinguishable from one another (you cannot order them)”. To be even more rigorous, random events do not occur; what occur are the mentioned outcomes, i.e., elements of the sample space of the probability space of the underlying stochastic process, the sample space that I mentioned in the previous footnote. However, since it is difficult to think of more than one outcome occurring “simultaneously” in reality (stochastic processes usually only provide some result at a given time), the language is slightly stretched by assuming that if an outcome occurs, any random event that contains the outcome occurs too.

(3.1) I’m amused all the time by these tiny, apparently-no-one-caring-about details; they provide so much fun!

(4) Recall that, after all, we have only reasoned with random events that are intervals (ook! random events that are mapped to intervals . . . (4.1) ), not with any random event. Not to mention all other simplifications we have made for the sake of writing a minimally educationally efficient text. Extending this modest post to cover all these special cases and refinements ignored here is left to the reader who still has enough sanity points.

(4.1) Hey! You really understood the second footnote, didn’t you?