
Spring 1998 John Rust
Economics 551b 37 Hillhouse, Rm. 27

Empirical Process Proof of the Asymptotic Distribution of Sample Quantiles

Definition: Given $\alpha \in (0,1)$, the $\alpha^{th}$ quantile of a random variable $X$ with CDF $F$ is defined by:

$$F^{-1}(\alpha) = \inf\{x : F(x) \ge \alpha\}.$$
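As a small worked illustration, take the quantile to be the smallest $x$ at which the CDF reaches $\alpha$. The exponential distribution and its rate parameter $\lambda$ are assumptions chosen for this example and do not appear in the original notes. When the CDF is continuous and strictly increasing on its support, the quantile can be found by inverting it directly:

```latex
\begin{align*}
F(x) &= 1 - e^{-\lambda x}, \qquad x \ge 0, \\
F(x) \ge \alpha &\iff x \ge -\frac{\ln(1-\alpha)}{\lambda}, \\
F^{-1}(\alpha) &= \inf\{x : F(x) \ge \alpha\} = -\frac{\ln(1-\alpha)}{\lambda}.
\end{align*}
```

In particular the median of this distribution is $\ln 2 / \lambda$.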

Note that $F^{-1}(1/2)$ is the median, $F^{-1}(p/100)$ is the $p^{th}$ percentile, etc. Further, if we define the $\alpha = 0$ quantile as $F^{-1}(0) = \lim_{\alpha \downarrow 0} F^{-1}(\alpha)$ and define $F^{-1}(1)$ similarly, it is easy to see that these are the lower and upper points in the support of $X$ (i.e. the minimum and maximum possible values of $X$, which might be $-\infty$ and $+\infty$ if $X$ has unbounded support). Note also that if $F$ is strictly increasing in a neighborhood of $F^{-1}(\alpha)$, then $F^{-1}(\alpha)$ is the usual inverse of the CDF $F$. If $F$ happens to have ``flat'' sections, say an interval of points $x$ satisfying $F(x) = \alpha$, then $F^{-1}(\alpha)$ is the smallest $x$ in this interval. The following lemma, a slightly modified version of a lemma from R. J. Serfling (1980), Approximation Theorems of Mathematical Statistics, Wiley, New York, provides some basic properties of the quantile function $F^{-1}(\alpha)$:

Lemma 1: Let $F$ be a CDF. The quantile function $F^{-1}(\alpha)$, $\alpha \in (0,1)$, is non-decreasing and left continuous, and satisfies:

1. $F^{-1}(F(x)) \le x$.

2. $F(F^{-1}(\alpha)) \ge \alpha$.

3. If $F$ is continuous and strictly increasing in a neighborhood of $x = F^{-1}(\alpha)$, we have: $F^{-1}(F(x)) = x$ and $F(F^{-1}(\alpha)) = \alpha$.

4. $F(x) \ge \alpha$ if and only if $x \ge F^{-1}(\alpha)$.
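The inequalities in the lemma are easiest to see on a step-function CDF, where the quantile function is not an ordinary inverse. The sketch below is only an illustration: the three-point distribution and the names `POINTS`, `CUM`, `cdf`, `quantile` and `check_lemma1` are all hypothetical and not part of the original notes. It takes the quantile to be the smallest $x$ with $F(x) \ge \alpha$ and checks properties 1, 2 and 4 on a grid using exact rational arithmetic.

```python
from bisect import bisect_left, bisect_right
from fractions import Fraction

# Hypothetical three-point distribution (an assumption for illustration):
# P(X=0) = 1/4, P(X=1) = 1/2, P(X=2) = 1/4.
POINTS = [0, 1, 2]
CUM = [Fraction(1, 4), Fraction(3, 4), Fraction(1)]  # F at each support point


def cdf(x):
    """F(x) = P(X <= x), a right-continuous step function."""
    k = bisect_right(POINTS, x)  # number of support points <= x
    return CUM[k - 1] if k > 0 else Fraction(0)


def quantile(alpha):
    """Smallest x with F(x) >= alpha, for 0 < alpha <= 1."""
    # bisect_left returns the first index whose cumulative probability >= alpha.
    return POINTS[bisect_left(CUM, alpha)]


def check_lemma1():
    """Check properties 1, 2 and 4 of the lemma on a finite grid."""
    xs = [Fraction(k, 2) for k in range(-2, 7)]     # -1, -1/2, ..., 3
    alphas = [Fraction(k, 8) for k in range(1, 8)]  # 1/8, ..., 7/8
    for x in xs:
        # Property 1, checked only where F(x) > 0.
        if cdf(x) > 0 and quantile(cdf(x)) > x:
            return False
    for a in alphas:
        if cdf(quantile(a)) < a:                    # property 2
            return False
        for x in xs:
            if (cdf(x) >= a) != (x >= quantile(a)):  # property 4
                return False
    return True
```

For this distribution `quantile(Fraction(1, 2))` is 1 while `cdf(1)` is 3/4, so the inequality in property 2 can be strict when the CDF jumps over $\alpha$.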

Definition: Let $\{X_1,\ldots,X_N\}$ be a random sample of size $N$ from a CDF $F$. Then the sample quantile $F_N^{-1}(\alpha)$, $\alpha \in (0,1)$, is defined by:

$$F_N^{-1}(\alpha) = \inf\{x : F_N(x) \ge \alpha\},$$

where $F_N$ is the empirical CDF defined by:

$$F_N(x) = \frac{1}{N}\sum_{i=1}^N I\{X_i \le x\}.$$
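The empirical CDF and the sample quantile can be sketched in a few lines. This is a minimal illustration under the convention that the sample quantile is the smallest $x$ at which the empirical CDF reaches $\alpha$. Under that convention it equals the $\lceil N\alpha \rceil$-th order statistic. The function names and the data are made up for the example.

```python
import math
from fractions import Fraction


def empirical_cdf(sample, x):
    """F_N(x): fraction of observations that are <= x, as an exact Fraction."""
    return Fraction(sum(1 for xi in sample if xi <= x), len(sample))


def sample_quantile(sample, alpha):
    """Smallest x with F_N(x) >= alpha: the ceil(N*alpha)-th order statistic."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    ordered = sorted(sample)
    # Passing alpha as a Fraction avoids floating-point rounding in N*alpha.
    k = math.ceil(len(ordered) * alpha)
    return ordered[k - 1]


def sample_quantile_bruteforce(sample, alpha):
    """Same quantity computed directly from the definition."""
    return min(x for x in sample if empirical_cdf(sample, x) >= alpha)


data = [3.1, 0.4, 2.2, 5.0, 1.7]  # made-up observations
median = sample_quantile(data, Fraction(1, 2))
```

The brute-force version scans the observations themselves. That is enough because the empirical CDF only increases at sample points.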

Thus $F_N^{-1}(1/2)$ is the sample median, $\mathrm{med}(X_1,\ldots,X_N)$, $F_N^{-1}(0)$ is the sample minimum, $\min(X_1,\ldots,X_N)$, and $F_N^{-1}(1)$ is the sample maximum, $\max(X_1,\ldots,X_N)$. Since empirical CDFs have jumps of size $1/N$ (unless more than one of the $X_i$'s take the same value), we can bound the maximum difference between $F_N(F_N^{-1}(\alpha))$ and $\alpha$ in Lemma 1-2 as follows:

Lemma 2: Let $\{X_1,\ldots,X_N\}$ be a random sample from a CDF $F$ and suppose that in this sample each $X_i$ happens to be distinct, so that by reindexing we have $X_1 < X_2 < \cdots < X_N$. Then for all $\alpha \in (0,1)$ we have:

$$\left| F_N(F_N^{-1}(\alpha)) - \alpha \right| < \frac{1}{N}.$$
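A quick numerical check shows why distinctness matters. The sketch below computes the gap between the empirical CDF evaluated at the sample quantile and $\alpha$, using exact fractions. The helper name `gap` and both samples are made up for illustration, and the sample quantile is again taken to be the $\lceil N\alpha \rceil$-th order statistic.

```python
import math
from fractions import Fraction


def gap(sample, alpha):
    """Return F_N(F_N^{-1}(alpha)) - alpha as an exact Fraction.

    alpha should be a Fraction in (0, 1] so that no rounding occurs.
    """
    n = len(sample)
    ordered = sorted(sample)
    q = ordered[math.ceil(n * alpha) - 1]                   # sample quantile
    f_at_q = Fraction(sum(1 for x in sample if x <= q), n)  # empirical CDF at q
    return f_at_q - alpha


distinct = [2.5, -1.0, 0.3, 7.1, 4.4, 3.3, 9.9]  # made-up sample, no ties
tied = [1, 1, 1, 2]                              # made-up sample with ties

# With distinct observations the gap always lies in [0, 1/N).
gaps = [gap(distinct, Fraction(k, 20)) for k in range(1, 20)]
all_within_bound = all(0 <= g < Fraction(1, len(distinct)) for g in gaps)

# With ties the jump at the quantile can exceed 1/N.
tied_gap = gap(tied, Fraction(1, 4))
```

In the tied sample the empirical CDF jumps by 3/4 at the value 1, so the gap at $\alpha = 1/4$ is 1/2, which is larger than $1/N = 1/4$.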

The following theorem shows that the sample quantiles $F_N^{-1}(\alpha)$ for $\alpha \in (0,1)$ are asymptotically normally distributed. It is important to note that we exclude the two cases $\alpha = 0$ and $\alpha = 1$ in this theorem, since the asymptotic distributions of these extreme value statistics are very different and generally non-normal.

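Theorem: Let $\{X_1,\ldots,X_N\}$ be IID draws from a CDF $F$ with continuous density $f$. Then if $f(F^{-1}(\alpha)) > 0$, we have:

$$\sqrt{N}\left[F_N^{-1}(\alpha) - F^{-1}(\alpha)\right] \Longrightarrow N(0,\sigma^2),$$

where:

$$\sigma^2 = \frac{\alpha(1-\alpha)}{f(F^{-1}(\alpha))^2}.$$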
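To get a feel for the size of the asymptotic variance, it helps to evaluate the standard formula $\alpha(1-\alpha)/f(F^{-1}(\alpha))^2$ in a few cases. The three distributions below are assumptions chosen for illustration and are not discussed in the original notes:

```latex
\begin{align*}
\text{Uniform}(0,1):\quad
  & f \equiv 1,
  & \sigma^2 &= \alpha(1-\alpha), \\
\text{Exponential}(1):\quad
  & F^{-1}(\alpha) = -\ln(1-\alpha),\ \ f(F^{-1}(\alpha)) = 1-\alpha,
  & \sigma^2 &= \frac{\alpha}{1-\alpha}, \\
N(0,1),\ \alpha = \tfrac{1}{2}:\quad
  & F^{-1}(\tfrac{1}{2}) = 0,\ \ f(0) = \tfrac{1}{\sqrt{2\pi}},
  & \sigma^2 &= \frac{1/4}{1/(2\pi)} = \frac{\pi}{2} \approx 1.571.
\end{align*}
```

The last line shows that for normal data the sample median has a larger asymptotic variance than the sample mean, whose asymptotic variance is 1.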

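Proof: For any fixed $x$, $N F_N(x)$ is a sum of $N$ IID Bernoulli indicators $I\{X_i \le x\}$, each with mean $F(x)$ and variance $F(x)[1-F(x)]$. The Central Limit Theorem for IID random variables therefore implies that for any $x$ in the support of $F$ we have:

$$\sqrt{N}\left[F_N(x) - F(x)\right] \Longrightarrow N(0,\sigma^2(x)),$$

where $\sigma^2(x) = F(x)[1-F(x)]$. Letting $x = F^{-1}(\alpha)$ and using Lemma 1-3 we have:

$$\sqrt{N}\left[F_N(F^{-1}(\alpha)) - \alpha\right] \Longrightarrow N(0,\alpha(1-\alpha)).$$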

Furthermore, by the property of stochastic equicontinuity from the theory of empirical processes (see D. Andrews (1994), Handbook of Econometrics (vol. 4), for an accessible introduction and definition of stochastic equicontinuity), the result given above is unaffected if we replace $F^{-1}(\alpha)$ by a consistent estimate $F_N^{-1}(\alpha)$:

$$\sqrt{N}\left[F_N(F_N^{-1}(\alpha)) - F(F_N^{-1}(\alpha))\right] \Longrightarrow N(0,\alpha(1-\alpha)).$$

Now note that Lemma 1-2 implies that

$$F_N(F_N^{-1}(\alpha)) \ge \alpha.$$

However, since the true CDF $F$ has a density, the probability of observing duplicate $X_i$'s is zero, so Lemma 2 implies that with probability 1 we have:

$$\alpha \le F_N(F_N^{-1}(\alpha)) < \alpha + \frac{1}{N},$$

which implies that:

$$\sqrt{N}\left[F_N(F_N^{-1}(\alpha)) - \alpha\right] \longrightarrow 0.$$

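Now we apply the Delta theorem, i.e. we do a Taylor series expansion of $F(F_N^{-1}(\alpha))$ about the limiting point $F^{-1}(\alpha)$ to get:

$$F(F_N^{-1}(\alpha)) = F(F^{-1}(\alpha)) + f(\bar{x}_N)\left[F_N^{-1}(\alpha) - F^{-1}(\alpha)\right],$$

where $\bar{x}_N$ is a point on the line segment between $F_N^{-1}(\alpha)$ and $F^{-1}(\alpha)$. Using the result above and Lemma 1-3 we have:

$$\begin{aligned}
\sqrt{N}\left[F_N^{-1}(\alpha) - F^{-1}(\alpha)\right]
&= \frac{\sqrt{N}\left[F(F_N^{-1}(\alpha)) - \alpha\right]}{f(\bar{x}_N)} \\
&= \frac{-\sqrt{N}\left[F_N(F_N^{-1}(\alpha)) - F(F_N^{-1}(\alpha))\right] + \sqrt{N}\left[F_N(F_N^{-1}(\alpha)) - \alpha\right]}{f(\bar{x}_N)} \\
&\Longrightarrow N\left(0, \frac{\alpha(1-\alpha)}{f(F^{-1}(\alpha))^2}\right),
\end{aligned}$$

where we have used Slutsky's Theorem and the fact that $f(\bar{x}_N) \to f(F^{-1}(\alpha))$ since $F_N^{-1}(\alpha) \to F^{-1}(\alpha)$ with probability 1.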
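The limiting distribution can be checked by simulation. The following is a rough Monte Carlo sketch, not part of the original notes. The Exponential(1) distribution, the sample size, the number of replications and the seed are all assumptions made for the example, as are the function names. For Exponential(1) the quantile is $-\ln(1-\alpha)$ and the density there is $1-\alpha$, so the asymptotic variance $\alpha(1-\alpha)/f(F^{-1}(\alpha))^2$ reduces to $\alpha/(1-\alpha)$.

```python
import math
import random
import statistics


def sample_quantile(sample, alpha):
    """The ceil(N*alpha)-th order statistic of the sample."""
    ordered = sorted(sample)
    return ordered[math.ceil(len(ordered) * alpha) - 1]


def asymptotic_variance(alpha):
    """alpha(1-alpha)/f(q)^2 for Exponential(1), where f(q) = 1 - alpha."""
    return alpha / (1.0 - alpha)


def simulate(alpha=0.5, n=400, reps=2000, seed=0):
    """Return the mean and variance of sqrt(N) * (sample quantile - true quantile)."""
    rng = random.Random(seed)            # fixed seed for reproducibility
    true_q = -math.log(1.0 - alpha)      # Exponential(1) quantile
    zs = []
    for _ in range(reps):
        draws = [rng.expovariate(1.0) for _ in range(n)]
        zs.append(math.sqrt(n) * (sample_quantile(draws, alpha) - true_q))
    return statistics.fmean(zs), statistics.variance(zs)


mc_mean, mc_var = simulate()
```

With these settings the simulated mean should be close to 0 and the simulated variance close to `asymptotic_variance(0.5)`, which is 1. The match is approximate because both $N$ and the number of replications are finite.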




John Rust
Sat Nov 1 17:19:01 CST 1997