The Lorentz (or Cauchy) distribution
\begin{equation}
\label{eq:def}
f(x) = \frac{1}{\pi \gamma}\frac{1}{1+ \left (\frac{x - x_0}{\gamma} \right )^2 }
\end{equation}
is pathological, in that it has no finite moments apart from the zeroth-order one (which ensures proper normalization of the density function). Many results we usually take for granted (e.g. the central limit theorem) do not apply and, for a finite sample, the sample mean and sample variance are not good estimators of the location parameter \(x_0\) and the scale parameter \(\gamma\). This should not come as a surprise, since the mean and variance of the distribution do not even exist. How can we then estimate these parameters?
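To make the failure of the sample mean concrete, here is a small numerical sketch (my own illustration, not part of the original argument; it assumes NumPy and uses \(x_0 = 0\), \(\gamma = 1\)): the running mean keeps jumping around no matter how many observations accumulate, while the running median settles near \(x_0\).
\begin{verbatim}
# Running sample mean vs. sample median for standard Cauchy draws
# (x0 = 0, gamma = 1): the mean never settles, the median does.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_cauchy(100_000)

for n in (100, 1_000, 10_000, 100_000):
    print(f"n = {n:6d}   mean = {np.mean(x[:n]):+9.3f}"
          f"   median = {np.median(x[:n]):+8.4f}")
\end{verbatim}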
Note that \(x_0\) is the median of the distribution and \(\gamma\) is its half-width at half-maximum (HWHM), as well as half the interquartile range; we can therefore hope that they are well described by the corresponding properties of the sample. This is indeed the case, and one can show that, for a sample of size \(n = 2k + 1\) (with \(k \geq 0\) an integer), the probability density function (pdf) of the sample median \(m\) is:
\begin{equation}
\label{eq:result}
g(x) = \frac{n!}{(k!)^2 \, \pi ^{n} \, \gamma} \left [ \frac{\pi ^2}{4} - \arctan ^2 \left (\frac{x - x_0}{\gamma} \right )\right ]^k \frac{1}{1+ \left (\frac{x - x_0}{\gamma} \right )^2 }
\end{equation}
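As a quick sanity check of \eqref{eq:result} (mine, not part of the original post), one can compare it against a histogram of simulated sample medians; the sketch below assumes NumPy and takes \(x_0 = 0\), \(\gamma = 1\) and \(k = 3\) (i.e. \(n = 7\)).
\begin{verbatim}
# Compare the claimed pdf g(x) of the sample median with a Monte Carlo
# histogram of medians of simulated Cauchy samples (x0 = 0, gamma = 1).
import numpy as np
from math import factorial, pi

def g(x, k, x0=0.0, gamma=1.0):
    n = 2 * k + 1
    u = np.arctan((x - x0) / gamma)
    norm = factorial(n) / (factorial(k) ** 2 * pi ** n * gamma)
    return norm * (pi ** 2 / 4 - u ** 2) ** k / (1 + ((x - x0) / gamma) ** 2)

k = 3                                            # sample size n = 7
rng = np.random.default_rng(1)
medians = np.median(rng.standard_cauchy((200_000, 2 * k + 1)), axis=1)

hist, edges = np.histogram(medians, bins=np.linspace(-3, 3, 61), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
for c, h in zip(centers[::15], hist[::15]):      # a few sample points
    print(f"x = {c:+.2f}   histogram = {h:.4f}   g(x) = {g(c, k):.4f}")
\end{verbatim}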
The proof of this result is not easy to find: Rider [1], for example, refers to Fisher [2], whose only justification is the annoying comment "It is easy to see that...". Let us work it out here.
We start by noting that the median is the \((k+1)\)-th order statistic, \(m = x_{(k+1)}\), i.e. the \((k+1)\)-th element in the ordered sequence of observations. Rather than trying to obtain the density function \(g(x)\) directly, it is easier to work with the cumulative distribution function (cdf) \(G(x) = P(x_{(k+1)} \leq x)\), because \(x_{(k+1)} \leq x\) holds exactly when at least \(k+1\) among the (unordered) observations \(x_j\) are below \(x\) [3, (2.1.3)]:
\begin{equation}
\label{eq:Gk}
G(x) = \sum_{p=k+1}^{n} C_n^p\, F^p(x) \left [ 1 - F(x)\right ]^{n-p}
\end{equation}
where
\begin{equation}
\label{eq:Fdef}
F(x) = \frac{1}{2}+\frac{1}{\pi}\arctan \left ( \frac{x-x_0}{\gamma} \right )
\end{equation}
is the cdf of the Lorentz distribution \eqref{eq:def} and \(C_n^p = \binom{n}{p}\) is the binomial coefficient, giving the number of ways one can choose \(p\) observations out of the total of \(n\). Thus, each term in \eqref{eq:Gk} gives the probability of having exactly \(p\) observations below \(x\), while the other \(n-p\) are above \(x\).
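Before differentiating, \eqref{eq:Gk} and \eqref{eq:Fdef} are easy to check numerically; the sketch below (mine, assuming NumPy; \(x_0 = 0\), \(\gamma = 1\), \(k = 2\)) compares the binomial sum with the empirical cdf of simulated sample medians.
\begin{verbatim}
# Check the order-statistics cdf G(x): the binomial sum built from the
# Lorentzian cdf F(x) should match the empirical cdf of sample medians.
import numpy as np
from math import atan, comb, pi

def F(x, x0=0.0, gamma=1.0):
    return 0.5 + atan((x - x0) / gamma) / pi

def G(x, k):
    n = 2 * k + 1
    Fx = F(x)
    return sum(comb(n, p) * Fx**p * (1 - Fx)**(n - p)
               for p in range(k + 1, n + 1))

k = 2                                            # sample size n = 5
rng = np.random.default_rng(2)
medians = np.median(rng.standard_cauchy((100_000, 2 * k + 1)), axis=1)

for x in (-1.0, 0.0, 0.5, 2.0):
    print(f"x = {x:+.1f}   G(x) = {G(x, k):.4f}"
          f"   empirical = {np.mean(medians <= x):.4f}")
\end{verbatim}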
The pdf \(g(x)\) is simply the derivative of \(G(x)\). Let us differentiate the rhs of \eqref{eq:Gk} term by term:
\begin{equation}
\label{eq:Gderiv}
g(x) = G'(x) = f(x) \sum_{p=k+1}^{n} C_n^p\, \left \lbrace p F^{p-1}(x) \left [ 1 - F(x)\right ]^{n-p} - (n-p) F^p(x) \left [ 1 - F(x)\right ]^{n-p-1} \,\right \rbrace
\end{equation}
and every negative term (of index \(p\)) on the rhs of \eqref{eq:Gderiv} cancels the positive term of index \(p+1\), since \((n-p)\, C_n^p = (p+1)\, C_n^{p+1}\). We are then left with only the first positive term (the one with \(p = k+1\)), since the last negative term (with \(p = n\)) vanishes:
\begin{equation}\label{eq:Gderivfin}
g(x) = f(x) \frac{n!}{(k!)^2} F^{k}(x) \left [ 1 - F(x)\right ]^{k}
\end{equation}
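For the skeptical reader, the telescoping can also be verified symbolically; the sketch below (mine, assuming SymPy is available) checks for \(k = 2\) (so \(n = 5\)) that the term-by-term derivative of \eqref{eq:Gk} indeed collapses to \eqref{eq:Gderivfin}.
\begin{verbatim}
# Symbolic check of the telescoping: d/dx of the binomial sum for G(x)
# should equal n!/(k!)^2 * f(x) * F(x)^k * (1 - F(x))^k, with f = F'.
import sympy as sp

x = sp.symbols('x')
F = sp.Function('F')(x)          # generic cdf, with pdf f(x) = F'(x)
k = 2
n = 2 * k + 1

G = sum(sp.binomial(n, p) * F**p * (1 - F)**(n - p)
        for p in range(k + 1, n + 1))
lhs = sp.diff(G, x)
rhs = sp.factorial(n) / sp.factorial(k)**2 * sp.diff(F, x) * F**k * (1 - F)**k

print(sp.simplify(lhs - rhs))    # prints 0
\end{verbatim}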
Plugging \eqref{eq:def} and \eqref{eq:Fdef} into \eqref{eq:Gderivfin} yields \eqref{eq:result}. For \(n=1\) (and hence \(k=0\)), \(g(x) = f(x)\), because the median of a sample of one is the unique observation. As \(n\) increases, \(g(x)\) becomes narrower and narrower (and \(m\) is better and better localized), as seen in the figure below for \(x_0 = 0\) and \(\gamma = 1\).
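Here is a short plotting sketch (mine, not the author's original figure; it assumes Matplotlib and reuses the expression \eqref{eq:result} for \(g(x)\)) that shows this narrowing for a few odd sample sizes.
\begin{verbatim}
# Plot g(x) for a few odd sample sizes n = 2k + 1 (x0 = 0, gamma = 1)
# to visualize how the pdf of the median narrows around x0 as n grows.
import numpy as np
import matplotlib.pyplot as plt
from math import factorial, pi

def g(x, k, x0=0.0, gamma=1.0):
    n = 2 * k + 1
    u = np.arctan((x - x0) / gamma)
    return (factorial(n) / (factorial(k)**2 * pi**n * gamma)
            * (pi**2 / 4 - u**2)**k / (1 + ((x - x0) / gamma)**2))

xs = np.linspace(-4, 4, 801)
for k in (0, 1, 3, 10):
    plt.plot(xs, g(xs, k), label=f"n = {2 * k + 1}")
plt.xlabel("x")
plt.ylabel("g(x)")
plt.legend()
plt.show()
\end{verbatim}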
We'll look at the behaviour of \(g(x)\) in a future post.
[1] P. R. Rider, Variance of the Median of Samples from a Cauchy Distribution. Journal of the American Statistical Association 55, 322–323 (1960).
[2] R. A. Fisher, Theory of Statistical Estimation. Mathematical Proceedings of the Cambridge Philosophical Society 22, 700–725 (1925).
[3] H. A. David and H. N. Nagaraja, Order Statistics (3rd edition), John Wiley & Sons (2003).