# Stein's Paradox
Given a single data point $x$ drawn from a distribution, it is intuitive that the best estimator of the distribution's mean $\mu$ is $x$ itself.
![[estimator_stein_example.png]]
However, when estimating the means of three or more distributions simultaneously, even when the distributions are independent, using each $x$ on its own is no longer the best strategy. In this case, the James-Stein estimator achieves lower total squared error.
## James-Stein Estimator (1961)
Take the following three independent Gaussian distributions. Given these, what is the best estimate for $\mu_1$ given a single data point $x_1$ from each?
![[stien_pdox.png]]
Given the data points:
$$
\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}
$$
The James-Stein estimate of the mean vector $(\mu_1, \mu_2, \mu_3)$, whose first component is the estimate for $\mu_1$, is
$\bigg(1 - \frac{1}{x_1^2 + x_2^2 + x_3^2}\bigg) \begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}$
Counterintuitively, the observations $x_2$ and $x_3$ improve the estimate of $\mu_1$ even though their distributions are independent of the first.
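As a minimal numeric sketch, assuming hypothetical observations $x = (2, 1, 1)$ from unit-variance Gaussians, the shrinkage factor and resulting estimates can be computed directly:

```python
import numpy as np

# Hypothetical observations, one from each of the three Gaussians
x = np.array([2.0, 1.0, 1.0])

# James-Stein shrinkage factor for three dimensions: 1 - 1 / ||x||^2
shrinkage = 1.0 - 1.0 / np.sum(x**2)

# Each observation is pulled toward the origin by the same factor
js_estimate = shrinkage * x
print(shrinkage)    # 1 - 1/6 = 5/6
print(js_estimate)  # (5/3, 5/6, 5/6)
```

Note that every component shrinks toward zero, and the amount of shrinkage depends on all three observations at once.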
For $p \geq 3$ dimensions this generalizes with numerator $p - 2$:
$\bigg(1 - \frac{p-2}{x_1^2 + \cdots + x_p^2}\bigg) \begin{pmatrix}x_1\\\vdots\\x_p\end{pmatrix}$
(For $p = 3$, the numerator $p - 2 = 1$ recovers the formula above.)
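The dominance of the James-Stein estimator can be checked by Monte Carlo simulation. The sketch below, with an arbitrarily chosen dimension $p = 5$ and true means all equal to 1, compares the average total squared error of the naive estimator ($x$ itself) against the James-Stein estimator:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 5                  # number of dimensions (the paradox requires p >= 3)
mu = np.ones(p)        # hypothetical true means, chosen for illustration
n_trials = 20_000

# One unit-variance Gaussian observation per mean, per trial
x = rng.normal(loc=mu, scale=1.0, size=(n_trials, p))

# Naive estimator: use each observation as the estimate of its own mean
naive_risk = np.mean(np.sum((x - mu) ** 2, axis=1))

# James-Stein: shrink toward the origin with factor 1 - (p-2)/||x||^2
shrink = 1.0 - (p - 2) / np.sum(x**2, axis=1)
js = shrink[:, None] * x
js_risk = np.mean(np.sum((js - mu) ** 2, axis=1))

print(naive_risk, js_risk)  # James-Stein risk is smaller
```

The naive estimator's risk is close to $p$ (the sum of the $p$ unit variances), while the James-Stein risk comes out strictly lower, illustrating the paradox.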
## Relations
- [[Point Estimation]]