VOILA! Season 5 episode 3: AI & Agents
Description
Seminar page: https://voila-seminars.tilda.ws/tpost/s5e3-ai-agents
Date and time: June 18, 2026, 4:00-5:30 PM CEST
Speaker: Dr. Natalie Shapira
Affiliation: Khoury College of Computer Sciences, Northeastern University
________________________________________________________________________
Title: Agents of Chaos and Genuine Alignment
Topic: AI & Agents
Abstract: Recent advances in AI have led to increasingly autonomous systems exhibiting what is often referred to as agentic behavior, capabilities that include goal-directed planning, adaptation of strategies, decision making, and interaction with complex environments. While such capabilities are promising, they also introduce potential risks, including misalignment and unintended emergent behaviors that are difficult to anticipate or control.
In this talk, I highlight how agentic models can exhibit failure modes that resemble “agents of chaos,” producing unpredictable, misaligned, or strategically opaque behavior.
I argue that such phenomena cannot be adequately addressed through behavioral evaluation alone, nor through existing training paradigms such as reinforcement learning from human feedback (RLHF). Instead, we require mechanistic accounts of how internal representations and computational circuits give rise to agentic behavior. I will survey recent progress in mechanistic interpretability, with a focus on efforts to reverse-engineer learned circuits associated with capabilities such as theory of mind, to develop predictive and causal models of model behavior.
I conclude by asking a broader question: to what extent is mechanistic interpretability necessary to tame agentic systems, and is it sufficient?
Speaker's bio: Natalie Shapira is a postdoctoral researcher in the Interpretation of Deep Networks lab at Northeastern University's Khoury College of Computer Sciences. In her PhD, she combined natural language processing, deep learning, and clinical psychology. With over ten years in industry, she most recently worked as a researcher at Amazon Science. Before that, she held a research position at IBM's research labs, where she served on the Patent Committee. Natalie also has entrepreneurial experience as a co-founder and CSO of projects funded by the Israel Innovation Authority.
Location
Unspecified