BEGIN:VCALENDAR
VERSION:2.0
PRODID:icalendar-ruby
CALSCALE:GREGORIAN
METHOD:PUBLISH
BEGIN:VTIMEZONE
TZID:Europe/Vienna
BEGIN:DAYLIGHT
DTSTART:20220327T020000
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=3
TZNAME:CEST
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:20211031T030000
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=10
TZNAME:CET
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260403T220429Z
UID:1639407600@ist.ac.at
DTSTART;TZID=Europe/Vienna:20211213T160000
DTEND;TZID=Europe/Vienna:20211213T180000
DESCRIPTION:Speaker: Surbhi Goel\nhosted by Marco Mondelli\nAbstract: Self-
 attention\, an architectural motif designed to model long-range interactio
 ns in sequential data\, has driven numerous recent breakthroughs in natura
 l language processing and beyond. In this talk\, we will focus on studyin
 g the inductive bias of self-attention blocks by rigorously establishing w
 hich functions and long-range dependencies they statistically represent. O
 ur main result shows that bounded-norm Transformer layers can represent sp
 arse functions of the input sequence\, with sample complexity scaling only
  logarithmically with the context length. Furthermore\, we propose new exp
 erimental protocols to support this analysis\, built around the large body
 of work on provably learning sparse Boolean functions. Based on joint
  work with Benjamin L. Edelman\, Sham Kakade and Cyril Zhang.
LOCATION:Zoom Link: https://istaustria.zoom.us/j/94397239114?pwd=Q0JDSTg1bk
 pVUDc5TXlZWG1paWpUdz09 Meeting ID: 943 9723 9114 Passcode: 621023\, ISTA
SUMMARY:Surbhi Goel: What functions do self-attention blocks prefer to repr
 esent?
URL:https://talks-calendar.ista.ac.at/events/3431
END:VEVENT
END:VCALENDAR
