Spring 1998 John Rust
Economics 551b 37 Hillhouse, Rm. 27
Empirical Process Proof of the Asymptotic Distribution of Sample Quantiles
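Definition: Given $\alpha \in (0,1)$, the $\alpha$ quantile of a random variable $X$ with CDF $F$ is defined by:
\[ F^{-1}(\alpha) = \inf\{ x \mid F(x) \ge \alpha \}. \]
Note that $F^{-1}(.5)$ is the median, $F^{-1}(.25)$ is the 25th percentile, etc.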
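Further, if we define the $0$ quantile as
\[ F^{-1}(0) = \lim_{\alpha \downarrow 0} F^{-1}(\alpha) \]
and define $F^{-1}(1) = \lim_{\alpha \uparrow 1} F^{-1}(\alpha)$ similarly, it is easy to see that these are the lower and upper points in the support of $X$ (i.e. the minimum and maximum possible values of $X$, which might be $-\infty$ and $+\infty$ if $X$ has unbounded support).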
Note also that if $F$ is continuous and strictly increasing in a neighborhood of $F^{-1}(\alpha)$, then $F^{-1}(\alpha)$ is the usual inverse of the CDF $F$. If $F$ happens to have ``flat'' sections, say an interval of points $x$ satisfying $F(x) = \alpha$, then $F^{-1}(\alpha)$ is the smallest $x$ in this interval. The following lemma, a slightly modified version of a lemma from R. J. Serfling (1980), Approximation Theorems of Mathematical Statistics, Wiley, New York, provides some basic properties of the quantile function $F^{-1}$:
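Lemma 1: Let $F$ be a CDF. The quantile function $F^{-1}(\alpha)$, $\alpha \in (0,1)$, is non-decreasing and left continuous, and satisfies:
1. $F^{-1}(F(x)) \le x$ for all $x$ with $F(x) > 0$.
2. $F(F^{-1}(\alpha)) \ge \alpha$ for all $\alpha \in (0,1)$.
3. $F(F^{-1}(\alpha)) = \alpha$ if $F$ is continuous at $F^{-1}(\alpha)$.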
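The behavior of the quantile function on a CDF with flat sections can be checked numerically. The sketch below is only an illustration under stated assumptions: the three-point distribution and the names `SUPPORT`, `CUM_PROBS`, `cdf`, and `quantile` are hypothetical and do not come from the notes. It verifies the two standard inequalities $F^{-1}(F(x)) \le x$ (for $x$ with $F(x) > 0$) and $F(F^{-1}(\alpha)) \ge \alpha$, and shows that on the flat section where $F(x) = 0.75$ the quantile is the smallest such $x$.

```python
import bisect

# Hypothetical discrete distribution (illustrative values, not from the notes).
# The cumulative probabilities are exact binary fractions, so comparisons are exact.
SUPPORT = [1.0, 2.0, 4.0]
CUM_PROBS = [0.25, 0.75, 1.0]  # F(1), F(2), F(4)


def cdf(x):
    """F(x) = P(X <= x): a step function, flat between support points."""
    k = bisect.bisect_right(SUPPORT, x)  # number of support points <= x
    return CUM_PROBS[k - 1] if k > 0 else 0.0


def quantile(alpha):
    """F^{-1}(alpha) = inf{x : F(x) >= alpha} for alpha in (0, 1]."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    for x, c in zip(SUPPORT, CUM_PROBS):
        if c >= alpha:
            return x


# Points x with F(x) > 0, and a grid of alpha values in (0, 1).
xs = [1.0, 1.5, 2.0, 3.0, 4.0, 5.0]
alphas = [j / 20 for j in range(1, 20)]

property_1 = all(quantile(cdf(x)) <= x for x in xs)      # F^{-1}(F(x)) <= x
property_2 = all(cdf(quantile(a)) >= a for a in alphas)  # F(F^{-1}(alpha)) >= alpha
print(property_1, property_2, quantile(0.75))
```

Here $F(x) = 0.75$ for every $x$ in $[2, 4)$, and `quantile(0.75)` returns the left end point of that interval.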
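Definition: Let $\{X_1, \ldots, X_N\}$ be a random sample of size $N$ from a CDF $F$. Then the sample $\alpha$ quantile, $F_N^{-1}(\alpha)$, is defined by:
\[ F_N^{-1}(\alpha) = \inf\{ x \mid F_N(x) \ge \alpha \}, \]
where $F_N$ is the empirical CDF defined by:
\[ F_N(x) = \frac{1}{N} \sum_{i=1}^N I\{X_i \le x\}. \]
Thus $F_N^{-1}(.5)$ is the sample median, $\mathrm{med}(X_1, \ldots, X_N)$, and $F_N^{-1}(0)$ is the sample minimum, $\min(X_1, \ldots, X_N)$, and $F_N^{-1}(1)$ is the sample maximum, $\max(X_1, \ldots, X_N)$.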
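A minimal Python sketch of the empirical CDF and the sample quantile may make the definition concrete. It assumes the infimum form of the definition, $\inf\{x : F_N(x) \ge \alpha\}$; the function names `empirical_cdf` and `sample_quantile` and the five data values are hypothetical, chosen only for illustration.

```python
def empirical_cdf(sample, x):
    """F_N(x): the fraction of observations less than or equal to x."""
    return sum(1 for xi in sample if xi <= x) / len(sample)


def sample_quantile(sample, alpha):
    """F_N^{-1}(alpha) = inf{x : F_N(x) >= alpha} for alpha in (0, 1].

    F_N only jumps at observed values, so the infimum is attained at the
    k-th smallest observation, where k is the first index with k/N >= alpha.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    ordered = sorted(sample)
    n = len(ordered)
    for k in range(1, n + 1):
        if k / n >= alpha:
            return ordered[k - 1]


# Hypothetical data, for illustration only.
sample = [3.0, 1.0, 4.0, 1.5, 5.0]
print(sample_quantile(sample, 0.5))   # sample median
print(sample_quantile(sample, 1.0))   # sample maximum
print(sample_quantile(sample, 0.01))  # sample minimum, the limit as alpha decreases to 0
print(empirical_cdf(sample, 3.0))     # three of the five observations are <= 3.0
```

The scan over `k` is deliberately naive: it mirrors the infimum in the definition rather than computing the index directly, which avoids floating-point rounding in a product such as `N * alpha`.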
Since empirical CDFs have jumps of size $1/N$ (unless more than one of the $X_i$'s take the same value), we can bound the maximum difference between $F_N(F_N^{-1}(\alpha))$ and $\alpha$ in Lemma 1-2 as follows:
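Lemma 2: Let $\{X_1, \ldots, X_N\}$ be a random sample from a CDF $F$ and suppose that in this sample each $X_i$ happens to be distinct, so that by reindexing we have $X_1 < X_2 < \cdots < X_N$. Then for all $\alpha \in (0,1)$ we have:
\[ \left| F_N(F_N^{-1}(\alpha)) - \alpha \right| \le \frac{1}{N}. \]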
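As a numerical check of the bound $|F_N(F_N^{-1}(\alpha)) - \alpha| \le 1/N$ for samples with distinct values, the following sketch evaluates the largest gap over a grid of $\alpha$ values. The helper names, the seed, the sample size, and the choice of standard normal draws are all assumptions made for illustration. The last lines show that the bound can fail when the sample contains ties, which is why distinctness is assumed.

```python
import random


def empirical_cdf(sample, x):
    """F_N(x): the fraction of observations less than or equal to x."""
    return sum(1 for xi in sample if xi <= x) / len(sample)


def sample_quantile(sample, alpha):
    """F_N^{-1}(alpha) = inf{x : F_N(x) >= alpha} for alpha in (0, 1]."""
    ordered = sorted(sample)
    n = len(ordered)
    for k in range(1, n + 1):
        if k / n >= alpha:
            return ordered[k - 1]
    raise ValueError("alpha must lie in (0, 1]")


def max_gap(sample, alphas):
    """Largest value of |F_N(F_N^{-1}(alpha)) - alpha| over the given alphas."""
    return max(
        abs(empirical_cdf(sample, sample_quantile(sample, a)) - a) for a in alphas
    )


rng = random.Random(0)  # fixed seed so the run is reproducible
N = 200
distinct_sample = [rng.gauss(0.0, 1.0) for _ in range(N)]
alphas = [j / 1000 for j in range(1, 1000)]

gap_distinct = max_gap(distinct_sample, alphas)
print(len(set(distinct_sample)) == N, gap_distinct <= 1.0 / N)

# With ties the jump at the repeated value is 3/4, so the gap exceeds 1/N = 1/4.
tied_sample = [1.0, 1.0, 1.0, 2.0]
gap_tied = max_gap(tied_sample, [0.25])
print(gap_tied)
```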
The following theorem shows that the sample quantiles for $\alpha \in (0,1)$ are asymptotically normally distributed. It is important to note that we exclude the two cases $\alpha = 0$ and $\alpha = 1$ in this theorem, since the asymptotic distribution of these extreme value statistics is very different and generally non-normal.
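Theorem: Let $\{X_1, \ldots, X_N\}$ be IID draws from a CDF $F$ with continuous density $f$. Then if $\alpha \in (0,1)$ and $f(F^{-1}(\alpha)) > 0$, we have:
\[ \sqrt{N}\left[ F_N^{-1}(\alpha) - F^{-1}(\alpha) \right] \Longrightarrow N(0, \sigma^2), \]
where:
\[ \sigma^2 = \frac{\alpha(1-\alpha)}{\left[ f(F^{-1}(\alpha)) \right]^2}. \]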
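A small Monte Carlo experiment can illustrate the limiting variance $\alpha(1-\alpha)/[f(F^{-1}(\alpha))]^2$ of $\sqrt{N}[F_N^{-1}(\alpha) - F^{-1}(\alpha)]$. The sketch below is not part of the notes: it assumes an Exponential(1) population, for which $F^{-1}(\alpha) = -\log(1-\alpha)$ and $f(F^{-1}(\alpha)) = 1-\alpha$, and the sample size, replication count, seed, and function names are arbitrary illustrative choices.

```python
import math
import random
import statistics


def sample_quantile(sample, alpha):
    """F_N^{-1}(alpha) = inf{x : F_N(x) >= alpha} for alpha in (0, 1]."""
    ordered = sorted(sample)
    n = len(ordered)
    for k in range(1, n + 1):
        if k / n >= alpha:
            return ordered[k - 1]
    raise ValueError("alpha must lie in (0, 1]")


def asymptotic_variance(alpha, density_at_quantile):
    """alpha * (1 - alpha) / f(F^{-1}(alpha))^2."""
    return alpha * (1.0 - alpha) / density_at_quantile ** 2


def simulated_variance(alpha, n, reps, seed=0):
    """Sample variance of sqrt(N) * (F_N^{-1}(alpha) - F^{-1}(alpha)) across
    replications, with Exponential(1) data (an assumption of this sketch)."""
    rng = random.Random(seed)
    true_quantile = -math.log(1.0 - alpha)
    scaled_errors = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        error = sample_quantile(sample, alpha) - true_quantile
        scaled_errors.append(math.sqrt(n) * error)
    return statistics.variance(scaled_errors)


alpha = 0.5
# For Exponential(1), f(F^{-1}(alpha)) = 1 - alpha, so the variance is alpha / (1 - alpha).
theory = asymptotic_variance(alpha, 1.0 - alpha)
simulated = simulated_variance(alpha, n=400, reps=2000)
print(theory, simulated)
```

For the median of this population the theoretical value is exactly 1, and the simulated variance should be close to it, up to Monte Carlo noise and a small finite-sample discrepancy.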
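Proof: The Central Limit Theorem for IID random variables, applied to the indicators $I\{X_i \le x\}$ (which are IID Bernoulli with mean $F(x)$), implies that for any $x$ in the support of $F$ we have:
\[ \sqrt{N}\left[ F_N(x) - F(x) \right] \Longrightarrow N(0, \sigma^2(x)), \]
where $\sigma^2(x) = F(x)[1 - F(x)]$. Letting $x = F^{-1}(\alpha)$ and using Lemma 1-3 we have:
\[ \sqrt{N}\left[ F_N(F^{-1}(\alpha)) - \alpha \right] \Longrightarrow N(0, \alpha(1-\alpha)). \]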
Furthermore, by the property of stochastic equicontinuity from the theory of empirical processes (see D. Andrews, (1996) Handbook of Econometrics (vol. 4) for an accessible introduction and definition of stochastic equicontinuity), the result given above is unaffected if we replace $F^{-1}(\alpha)$ by a consistent estimate $F_N^{-1}(\alpha)$:
\[ \sqrt{N}\left[ F_N(F_N^{-1}(\alpha)) - F(F_N^{-1}(\alpha)) \right] \Longrightarrow N(0, \alpha(1-\alpha)). \]
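Now note that Lemma 1-2, applied to the empirical CDF $F_N$, implies that
\[ F_N(F_N^{-1}(\alpha)) \ge \alpha. \]
However, since the true CDF $F$ has a density, the probability of observing duplicate $X_i$'s is zero, so Lemma 2 implies that with probability 1 we have:
\[ \left| F_N(F_N^{-1}(\alpha)) - \alpha \right| \le \frac{1}{N}, \]
which implies that:
\[ \sqrt{N}\left[ \alpha - F(F_N^{-1}(\alpha)) \right] \Longrightarrow N(0, \alpha(1-\alpha)). \]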
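Now we apply the Delta theorem, i.e. we do a Taylor series expansion of $F(F_N^{-1}(\alpha))$ about the limiting point $F^{-1}(\alpha)$ to get:
\[ F(F_N^{-1}(\alpha)) = F(F^{-1}(\alpha)) + f(\tilde x_N)\left[ F_N^{-1}(\alpha) - F^{-1}(\alpha) \right], \]
where $\tilde x_N$ is a point on the line segment between $F_N^{-1}(\alpha)$ and $F^{-1}(\alpha)$. Using the result above and Lemma 1-3 we have:
\[ \sqrt{N}\left[ F_N^{-1}(\alpha) - F^{-1}(\alpha) \right] = \frac{\sqrt{N}\left[ F(F_N^{-1}(\alpha)) - \alpha \right]}{f(\tilde x_N)} \Longrightarrow N\left( 0, \frac{\alpha(1-\alpha)}{\left[ f(F^{-1}(\alpha)) \right]^2} \right), \]
where we have used Slutsky's Theorem, the symmetry of the normal distribution about zero, and the fact that $f(\tilde x_N) \to f(F^{-1}(\alpha))$ since $F_N^{-1}(\alpha) \to F^{-1}(\alpha)$ with probability 1.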
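The two ingredients of this last step can be examined by simulation: that $\sqrt{N}[\alpha - F(F_N^{-1}(\alpha))]$ has variance close to $\alpha(1-\alpha)$, and that the mean-value slope $[F(F_N^{-1}(\alpha)) - \alpha] / [F_N^{-1}(\alpha) - F^{-1}(\alpha)]$, which equals $f(\tilde x_N)$, stays close to $f(F^{-1}(\alpha))$. The sketch below is an illustration only, again assuming an Exponential(1) population with $\alpha = 0.5$; the names, seed, and simulation sizes are hypothetical choices, not taken from the notes.

```python
import math
import random
import statistics

ALPHA = 0.5
TRUE_QUANTILE = -math.log(1.0 - ALPHA)  # F^{-1}(alpha) for Exponential(1)
DENSITY_AT_QUANTILE = 1.0 - ALPHA       # f(F^{-1}(alpha)) = exp(-F^{-1}(alpha))


def cdf(x):
    """Exponential(1) CDF, the population assumed in this sketch."""
    return 1.0 - math.exp(-x)


def sample_quantile(sample, alpha):
    """F_N^{-1}(alpha) = inf{x : F_N(x) >= alpha} for alpha in (0, 1]."""
    ordered = sorted(sample)
    n = len(ordered)
    for k in range(1, n + 1):
        if k / n >= alpha:
            return ordered[k - 1]
    raise ValueError("alpha must lie in (0, 1]")


def proof_steps(n, reps, seed=0):
    """Return the variance of sqrt(N) * (alpha - F(q_hat)) across replications,
    together with the smallest and largest mean-value slopes f(x~_N)."""
    rng = random.Random(seed)
    scaled_gaps = []
    slopes = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        q_hat = sample_quantile(sample, ALPHA)
        scaled_gaps.append(math.sqrt(n) * (ALPHA - cdf(q_hat)))
        # Exact mean-value identity: F(q_hat) - alpha = f(x~_N) * (q_hat - q).
        slopes.append((cdf(q_hat) - ALPHA) / (q_hat - TRUE_QUANTILE))
    return statistics.variance(scaled_gaps), min(slopes), max(slopes)


gap_variance, slope_min, slope_max = proof_steps(n=500, reps=1000)
print(gap_variance, ALPHA * (1.0 - ALPHA))
print(slope_min, slope_max, DENSITY_AT_QUANTILE)
```

Dividing a quantity with variance near $\alpha(1-\alpha)$ by a slope near $f(F^{-1}(\alpha))$ is what produces the variance $\alpha(1-\alpha)/[f(F^{-1}(\alpha))]^2$ in the limit.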