
VOILA! Season 5 episode 3: AI & Agents

Thursday, June 18, 2026, 16:00
EFELIA Côte d'Azur

Description

Seminar page: https://voila-seminars.tilda.ws/tpost/s5e3-ai-agents 

Date and time: June 18, 2026, 4:00-5:30 PM CEST

Speaker: Dr. Natalie Shapira

Affiliation: Khoury College of Computer Sciences, Northeastern University

________________________________________________________________________

Title: Agents of Chaos and Genuine Alignment

Topic: AI & Agents

Abstract: Recent advances in AI have led to increasingly autonomous systems exhibiting what is often referred to as agentic behavior, capabilities that include goal-directed planning, adaptation of strategies, decision making, and interaction with complex environments. While such capabilities are promising, they also introduce potential risks, including misalignment and unintended emergent behaviors that are difficult to anticipate or control.


In this talk, I highlight how agentic models can exhibit failure modes that resemble “agents of chaos,” producing unpredictable, misaligned, or strategically opaque behavior. 


I argue that such phenomena cannot be adequately addressed through behavioral evaluation alone, nor through existing training paradigms such as reinforcement learning from human feedback (RLHF). Instead, we require mechanistic accounts of how internal representations and computational circuits give rise to agentic behavior. I will survey recent progress in mechanistic interpretability, with a focus on efforts to reverse-engineer learned circuits associated with capabilities such as theory of mind, to develop predictive and causal models of model behavior. 


I conclude by asking a broader question: to what extent is mechanistic interpretability necessary to tame agentic systems, and is it sufficient?

Speaker's bio: Natalie Shapira is a postdoctoral researcher in the Interpretation of Deep Networks lab at Northeastern University's Khoury College of Computer Sciences. During her PhD, she combined natural language processing, deep learning, and clinical psychology. With over ten years in industry, she most recently worked as a researcher at Amazon Science. Before that, she held a research position at IBM's research labs, where she served on the Patent Committee. Natalie also has entrepreneurial experience as a co-founder and CSO in projects funded by the Israel Innovation Authority.

 

Calendar

Thursday, June 18, 2026, 16:00

Location

Not specified

Contact

EFELIA Côte d'Azur
Côte d'Azur
Centre Inria - 2004, route des Lucioles
06902
France