# Stein's Paradox

Given a single data point $x$ drawn from a distribution, it is intuitive that the best estimator of the distribution's mean $\mu$ is $x$ itself.

![[estimator_stein_example.png]]

However, when estimating the means of three or more distributions at once, the raw observations no longer form the best estimator, even when the distributions are completely independent of one another. In this case, the James-Stein estimator achieves lower total squared-error risk.

## James-Stein Estimator (1961)

Take the following three independent Gaussian distributions, each assumed to have unit variance. Given a single data point from each, what is the best estimate of $\mu_1$?

![[stien_pdox.png]]

Given the data points:

$$
\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}
$$

a better joint estimate of $(\mu_1, \mu_2, \mu_3)$ than the raw observations is

$$
\bigg(1 - \frac{1}{x_1^2 + x_2^2 + x_3^2}\bigg) \begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}
$$

Its first component is the improved estimate of $\mu_1$: surprisingly, $x_2$ and $x_3$ sharpen the estimate of $\mu_1$ even though their distributions have nothing to do with it. For $p \geq 3$ dimensions (still with unit variance), this generalizes to

$$
\bigg(1 - \frac{p-2}{x_1^2 + \dots + x_p^2}\bigg) \begin{pmatrix}x_1\\\vdots\\x_p\end{pmatrix}
$$

where the shrinkage numerator $p - 2$ reduces to $1$ in the three-dimensional case above. A simulation sketch comparing the two estimators' risk appears at the end of this note.

## Relations

- [[Point Estimation]]
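
## Simulation Sketch

A minimal sketch of the risk comparison, assuming unit-variance Gaussians and a hypothetical true mean vector `mu`; it illustrates the formulas above rather than giving a definitive implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

p = 3                              # number of independent means (paradox needs p >= 3)
mu = np.array([1.0, 0.5, -0.3])    # hypothetical true means (assumed for illustration)
n_trials = 100_000

# One observation x_i ~ N(mu_i, 1) per coordinate, per trial.
x = rng.normal(loc=mu, scale=1.0, size=(n_trials, p))

# Naive estimator: the observations themselves.
naive = x

# James-Stein: shrink toward the origin by the factor 1 - (p - 2) / ||x||^2.
shrink = 1.0 - (p - 2) / np.sum(x**2, axis=1, keepdims=True)
js = shrink * x

# Average total squared error across trials estimates each estimator's risk.
risk_naive = np.mean(np.sum((naive - mu) ** 2, axis=1))
risk_js = np.mean(np.sum((js - mu) ** 2, axis=1))

print(f"naive risk:       {risk_naive:.4f}")  # close to p = 3
print(f"James-Stein risk: {risk_js:.4f}")     # strictly smaller
```

This is the plain James-Stein estimator: when $\|x\|^2 < p - 2$ the shrinkage factor turns negative, and clipping it at zero (the positive-part James-Stein estimator) performs better still.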