Consider the classical (unsupervised learning) problem of estimating the mean of an n-dimensional normally (with identity covariance matrix) or Poisson distributed vector under squared loss. The framework of empirical Bayes (EB), put forth by Robbins (1956), combines Bayesian and frequentist mindsets by postulating that the coordinates of the mean are sampled iid from an unknown prior, and aims to use a fully data-driven estimator to compete with the Bayesian oracle that knows the true prior. The figure of merit is the total excess risk (regret) over the Bayes risk in the worst case (over a class of priors, e.g., those with a given support). Although this paradigm was introduced more than 60 years ago, little was known about the asymptotic scaling of the optimal regret before our work established it to be $\Theta((\log n/\log\log n)^2)$ for the Poisson case. The same rate is shown to be a lower bound for the normal case, verifying and strengthening an old conjecture of $\omega(1)$ due to Singh (1979).
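To fix ideas, here is a minimal sketch of the setup in symbols (the notation and the choice of prior class below are ours, for illustration only):
\[
  \theta_1,\dots,\theta_n \overset{\text{iid}}{\sim} G, \qquad
  X_i \mid \theta_i \sim \mathrm{Poi}(\theta_i) \ \text{or} \ \mathcal{N}(\theta_i, 1),
\]
\[
  \mathrm{Regret}_n(\hat\theta) \;=\; \sup_{G \in \mathcal{G}}
  \Big( \mathbb{E}\,\|\hat\theta(X) - \theta\|_2^2 \;-\; n\,\mathrm{mmse}(G) \Big),
  \qquad
  \mathrm{mmse}(G) \;=\; \mathbb{E}\big[(\theta_1 - \mathbb{E}[\theta_1 \mid X_1])^2\big],
\]
where $\mathcal{G}$ is the class of priors (e.g., those supported on a fixed compact set) and $n\,\mathrm{mmse}(G)$ is the total risk of the Bayesian oracle that knows $G$. The result above says the minimax value of this regret scales as $\Theta((\log n/\log\log n)^2)$ in the Poisson case.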
We will also discuss the practical implementation of EB estimators. The most performant ones are obtained by first running the nonparametric maximum likelihood estimator (NPMLE) to estimate the unknown prior, and then computing (via Bayes' rule) the posterior mean with respect to the estimated prior; a sketch of this two-step recipe follows below. We will discuss empirical results on sports data and short-term time series forecasting.
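For concreteness, here is a minimal sketch of the NPMLE-then-posterior-mean recipe in the Poisson case, using a fixed support grid and an EM iteration for the mixing weights (the grid, iteration count, and simulated data are illustrative assumptions, not the pipeline behind the reported experiments):

import numpy as np
from scipy.stats import poisson

def npmle_em(x, grid, n_iter=500):
    # Fixed-grid NPMLE sketch: maximize the mixture likelihood over weights on `grid` via EM.
    L = poisson.pmf(x[:, None], grid[None, :])      # likelihood matrix, shape (n, m)
    w = np.full(grid.size, 1.0 / grid.size)         # start from uniform weights
    for _ in range(n_iter):
        post = L * w
        post /= post.sum(axis=1, keepdims=True)     # E-step: posterior over grid points
        w = post.mean(axis=0)                       # M-step: average responsibilities
    return w

def eb_posterior_mean(x, grid, w):
    # Bayes' rule under the estimated prior (grid, w): E[theta | X = x].
    L = poisson.pmf(x[:, None], grid[None, :])
    return (L * w * grid).sum(axis=1) / (L * w).sum(axis=1)

# Toy check: theta_i ~ Unif[0, 5], X_i ~ Poi(theta_i).
rng = np.random.default_rng(0)
theta = rng.uniform(0, 5, size=2000)
x = rng.poisson(theta)
grid = np.linspace(1e-3, x.max() + 1.0, 200)        # heuristic support grid
w = npmle_em(x, grid)
print("EB  squared error:", np.mean((eb_posterior_mean(x, grid, w) - theta) ** 2))
print("MLE squared error:", np.mean((x - theta) ** 2))

In a toy run like this the EB estimate typically improves substantially on the naive estimate $\hat\theta_i = X_i$, which is the practical motivation for the estimators discussed in the talk.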
Based on joint works with Yihong Wu, Soham Jana and Anzo Teh.