What functions do self-attention blocks prefer to represent?

Date: Monday, December 13, 2021, 16:00 - 18:00
Speaker: Surbhi Goel (Microsoft)
Location: Zoom, https://istaustria.zoom.us/j/94397239114?pwd=Q0JDSTg1bkpVUDc5TXlZWG1paWpUdz09 (Meeting ID: 943 9723 9114, Passcode: 621023)
Series: Seminar/Talk
Tags: ELLIS talk
Host: Marco Mondelli
Contact: Ksenja Harpprecht

Self-attention, an architectural motif designed to model long-range interactions in sequential data, has driven numerous recent breakthroughs in natural language processing and beyond. In this talk, we will study the inductive bias of self-attention blocks by rigorously establishing which functions and long-range dependencies they can statistically represent. Our main result shows that bounded-norm Transformer layers can represent sparse functions of the input sequence, with sample complexity scaling only logarithmically with the context length. Furthermore, we propose new experimental protocols to support this analysis, built around the large body of work on provably learning sparse Boolean functions.

Based on joint work with Benjamin L. Edelman, Sham Kakade and Cyril Zhang.
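To give a concrete sense of the kind of experimental protocol the abstract refers to, the following minimal sketch (not the speaker's code; the positions, model sizes, and training schedule are illustrative assumptions) trains a single PyTorch self-attention layer on a synthetic task whose label is a 3-sparse parity of a length-64 Boolean sequence, i.e. a sparse Boolean function depending on only a few hidden input positions. Note that sparse parity is a deliberately hard benchmark in this literature, so reaching high accuracy may require far more steps or a larger model; the point here is only to sketch the setup.

# Minimal sketch: a tiny self-attention model on a k-sparse Boolean task.
# All hyperparameters below are illustrative assumptions, not values from the talk.
import torch
import torch.nn as nn

T, k, d = 64, 3, 32                      # context length, sparsity, embedding dim
support = torch.tensor([5, 17, 42])      # hidden relevant positions (arbitrary choice)

def sample_batch(n):
    x = torch.randint(0, 2, (n, T))      # uniformly random Boolean sequences
    y = x[:, support].sum(dim=1) % 2     # label = parity of the k relevant bits
    return x.float(), y.float()

class TinyAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(1, d)                         # token value embedding
        self.pos = nn.Parameter(torch.randn(T, d) * 0.02)    # learned positional embedding
        self.attn = nn.MultiheadAttention(d, num_heads=1, batch_first=True)
        self.head = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, 1))

    def forward(self, x):                                    # x: (n, T)
        h = self.embed(x.unsqueeze(-1)) + self.pos           # (n, T, d)
        h, _ = self.attn(h, h, h)                            # single self-attention block
        return self.head(h.mean(dim=1)).squeeze(-1)          # pooled readout, logits (n,)

model = TinyAttention()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(2000):
    x, y = sample_batch(256)
    loss = loss_fn(model(x), y)
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 500 == 0:
        with torch.no_grad():
            acc = ((model(x) > 0).float() == y).float().mean().item()
        print(f"step {step}: loss {loss.item():.3f}, train-batch acc {acc:.2f}")

The design choice to keep the context length T much larger than the sparsity k mirrors the theoretical setting: the quantity of interest is how the sample (or step) budget needed to find the relevant positions grows with T, which the result above predicts should be only logarithmic for bounded-norm attention layers.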

