Bayesian models are rooted in Bayesian statistics and easily benefit from the vast literature in the field. In contrast, deep learning lacks a solid mathematical grounding. Instead, empirical developments in deep learning are often justified by metaphors, evading the unexplained principles at play. These two fields are perceived as fairly antipodal to each other in their respective communities. It is perhaps astonishing, then, that most modern deep learning models can be cast as performing approximate inference in a Bayesian setting. The implications of this are profound: we can use the rich Bayesian statistics literature with deep learning models, explain many of the curiosities surrounding ad hoc techniques, combine results from deep learning into Bayesian modelling, and much more.
In this talk I will discuss a new class of methods that can capture uncertainty with a single forward pass. In the process I will share some recent results shedding light on why standard softmax neural nets cannot normally capture epistemic uncertainty reliably. I will then show how these insights allow us to propose minimal changes to a single softmax neural net, with which we can beat the uncertainty estimates of a Deep Ensemble at the computational cost of a single standard neural net.