24 January 2016

Sample median for a Lorentz (Cauchy) distribution - part 2

In a previous post I derived the probability density of the median of a sample of size \(n = 2k +1\) (with \(k \geq 0\) integer) drawn from a Lorentz (or Cauchy) distribution with position parameter \(x_0\) and width \(\gamma\):
\begin{equation}
\label{eq:result}
g(x) = \frac{n!}{(k!)^2 \, \pi ^{n} \, \gamma} \left [ \frac{\pi ^2}{4} - \arctan ^2 \left (\frac{x - x_0}{\gamma} \right )\right ]^k \frac{1}{1+ \left (\frac{x - x_0}{\gamma} \right )^2 }
\end{equation}
I will now consider some of its properties.

The figure below shows \(g(x)\) with \(x_0 = 0\) and \(\gamma = 1\) for a few values of \(n\):
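The curves can be reproduced with a few lines of Python. This is only a minimal sketch using the standard library: the function names `median_pdf` and `normalization` are my own and not part of any package, and the midpoint rule with the substitution \(x = \tan t\) is just one convenient way to check that \(g(x)\) integrates to one.

```python
import math

def median_pdf(x, k, x0=0.0, gamma=1.0):
    """Density g(x) of the median of n = 2k + 1 Lorentzian samples."""
    n = 2 * k + 1
    u = (x - x0) / gamma
    a = math.atan(u)
    # n! / (k!)^2 equals n * C(2k, k), which stays an exact integer
    prefactor = n * math.comb(2 * k, k) / (math.pi ** n * gamma)
    # pi^2/4 - arctan^2(u), written as a product so it never goes negative
    bracket = (math.pi / 2 - a) * (math.pi / 2 + a)
    return prefactor * bracket ** k / (1.0 + u * u)

def normalization(k, steps=4000):
    """Midpoint-rule integral of g over the real line, using x = tan(t)."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = -math.pi / 2 + (i + 0.5) * h
        x = math.tan(t)
        total += median_pdf(x, k) * (1.0 + x * x)  # dx = (1 + x^2) dt
    return total * h

# Peak height g(x0) and normalization for a few sample sizes n = 2k + 1
for k in (0, 1, 2, 5, 10):
    print(2 * k + 1, median_pdf(0.0, k), normalization(k))
```

The peak height \(g(x_0)\) grows with \(n\) while the integral stays at one, which is the narrowing visible in the figure. For \(k = 0\) the function reduces to the Lorentzian itself.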

As the sample size increases, the distribution becomes more and more localized and its tails decay faster. The Lorentzian already behaves as \(x^{-2}\) at infinity, and the bracket in front of it contributes a further factor \(|x|^{-k}\): writing \(u = \frac{x - x_0}{\gamma}\), one of the two factors of \(\frac{\pi ^2}{4} - \arctan ^2 u = \left ( \frac{\pi}{2} - \arctan u \right ) \left ( \frac{\pi}{2} + \arctan u \right )\) vanishes as \(1/|u|\) when \(u \to \pm \infty\), so that \(g(x) \sim |x|^{-(k+2)}\). For \(k \geq 1\) the distribution has a first moment: \( \left \langle x \right \rangle = x_0\), so the median is an unbiased estimator of the position parameter \(x_0\). For \(k \geq 2\) it also has a variance \( V = \left \langle (x - x_0)^2 \right \rangle \), which quantifies the tightness of the estimation [1]. Following Rider, I'll use the substitution \(\phi = \text{arccot}\left ( \frac{x - x_0}{\gamma}\right)\), yielding:

\begin{eqnarray}
\label{eq:integ}
V &= \frac{n!}{(k!)^2 \, \pi ^{n} \gamma} \int _{0}^{\pi} \, \mathrm{d} \phi \, \gamma \, (1+ \cot ^2 \phi) \left [ \frac{\pi}{2} - \arctan \left (\frac{x - x_0}{\gamma} \right )\right ]^k \nonumber \\
& \left [ \frac{\pi}{2} + \arctan \left (\frac{x - x_0}{\gamma} \right )\right ]^k \frac{\gamma ^2 \cot ^2 \phi}{1+ \cot ^2 \phi} \nonumber \\
&= \gamma ^2 \frac{2 n!}{(k!)^2 \, \pi ^{n}} \int _{0}^{\pi /2} \, \mathrm{d} \phi \, \phi ^k (\pi - \phi)^k \cot ^2 \phi = \gamma ^2 I_k
\end{eqnarray}
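where \(I_k\) can be obtained by numerical integration. The values I computed are roughly \(I_k \simeq \frac{8}{3 n}\), so the standard deviation of the median, \(\sqrt{V}\), decreases as \(1/\sqrt{n}\), as expected from the asymptotic normality of the sample median.
TO DO: I found the behaviour of \(I_k\) by inspecting the list of numerical values; try to get a more rigorous result. In particular, compare with the large-\(n\) limit \(V \simeq \frac{1}{4 n f(x_0)^2} = \frac{\pi^2 \gamma ^2}{4 n}\), the standard asymptotic variance of a sample median (here \(f(x_0) = \frac{1}{\pi \gamma}\) is the Lorentzian density at its median), which gives \(I_k \to \frac{\pi^2}{4 n} \approx \frac{2.47}{n}\), slightly below \(\frac{8}{3 n} \approx \frac{2.67}{n}\).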
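
The numerical integration of \(I_k\) can be sketched as follows. This is a minimal illustration and not necessarily the procedure used for the values quoted above: the function name `variance_factor` is hypothetical, the midpoint rule is my own choice of quadrature, and the prefactor is evaluated in log space (via `math.lgamma`) only to keep large \(k\) from overflowing.

```python
import math

def variance_factor(k, steps=20000):
    """I_k = V / gamma^2 for the median of n = 2k + 1 samples (needs k >= 2)."""
    if k < 2:
        raise ValueError("the variance is finite only for k >= 2")
    n = 2 * k + 1
    # log of the prefactor 2 n! / ((k!)^2 pi^n)
    log_pref = (math.log(2.0) + math.lgamma(n + 1)
                - 2.0 * math.lgamma(k + 1) - n * math.log(math.pi))
    h = (math.pi / 2) / steps
    total = 0.0
    for i in range(steps):
        phi = (i + 0.5) * h  # midpoints avoid phi = 0, where cot diverges
        cot = math.cos(phi) / math.sin(phi)
        weight = math.exp(log_pref + k * math.log(phi * (math.pi - phi)))
        total += weight * cot * cot
    return total * h

# Compare n * I_k with the two candidate constants
print("8/3    =", 8 / 3)
print("pi^2/4 =", math.pi ** 2 / 4)
for k in (2, 5, 10, 20, 50, 100, 200):
    n = 2 * k + 1
    print(n, n * variance_factor(k))
```

Printing \(n I_k\) rather than \(I_k\) makes the comparison direct: if the asymptotic formula holds, the product should decrease slowly toward \(\pi^2/4 \approx 2.47\) as \(n\) grows, in which case \(8/3\) would describe intermediate sample sizes rather than the true limit.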


[1] P. R. Rider, Variance of the Median of Samples from a Cauchy Distribution. Journal of the American Statistical Association 55, 322–323 (1960).
