The Algorithmic Advantage: How Reinforcement Learning Generates Rich Communication
Publisher
Luiss University, University of Waterloo, University of Rome Tor Vergata
Abstract
We analyze strategic communication when advice is generated by a reinforcement-learning algorithm rather than by a fully rational sender. Building on the cheap-talk framework of Crawford and Sobel (1982), an advisor adapts its messages based on payoff feedback, while a decision maker best-responds. We provide a theoretical analysis of the long-run communication outcomes induced by such reward-driven adaptation. With aligned preferences, we establish that learning robustly leads to informative communication even from uninformative initial policies. With misaligned preferences, no stable outcome exists; instead, learning generates cycles that sustain highly informative communication and payoffs exceeding those of any static equilibrium.
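The setting described in the abstract can be illustrated with a small simulation. This is a minimal sketch and not the paper's algorithm: the abstract does not specify the learning rule or the payoffs, so the code assumes a finite state grid with a uniform prior, the quadratic-loss payoffs of the standard Crawford and Sobel example with a sender bias `bias`, and a tabular epsilon-greedy value learner for the advisor. All function names and parameters (`best_response`, `greedy_policy`, `simulate`, `lr`, `eps`) are hypothetical.

```python
import random


def best_response(policy, states):
    """Receiver's action for each message: the posterior mean of the state.

    Assumes quadratic loss and a uniform prior over `states`.
    policy[s][m] is the probability that the sender sends message m in state s.
    """
    n_messages = len(policy[0])
    prior_mean = sum(states) / len(states)
    actions = []
    for m in range(n_messages):
        weight = sum(policy[s][m] for s in range(len(states)))
        if weight > 0:
            mean = sum(policy[s][m] * states[s] for s in range(len(states))) / weight
            actions.append(mean)
        else:
            # Off-path message: fall back to the prior mean (an assumption).
            actions.append(prior_mean)
    return actions


def greedy_policy(q):
    """Deterministic policy sending the highest-valued message in each state.

    Ties are broken in favor of the lowest message index.
    """
    policy = []
    for row in q:
        best = max(range(len(row)), key=lambda m: row[m])
        policy.append([1.0 if m == best else 0.0 for m in range(len(row))])
    return policy


def simulate(n_states=5, n_messages=5, bias=0.0, steps=2000, lr=0.1, eps=0.1, seed=0):
    """Run the sender's learning loop against a best-responding receiver."""
    rng = random.Random(seed)
    states = [i / (n_states - 1) for i in range(n_states)]
    # Uninformative start: every message has the same value in every state.
    q = [[0.0] * n_messages for _ in range(n_states)]
    for _ in range(steps):
        # Receiver best-responds to the sender's current greedy policy.
        actions = best_response(greedy_policy(q), states)
        s = rng.randrange(n_states)
        if rng.random() < eps:
            m = rng.randrange(n_messages)  # explore
        else:
            m = max(range(n_messages), key=lambda k: q[s][k])  # exploit
        # Sender's payoff: quadratic loss around its preferred action, state + bias.
        reward = -(actions[m] - states[s] - bias) ** 2
        q[s][m] += lr * (reward - q[s][m])
    return q, best_response(greedy_policy(q), states)
```

Calling `simulate(bias=0.0)` corresponds to the aligned case and `simulate(bias=0.2)` to a misaligned one; the returned receiver actions show which messages the learned policy separates. Whether this toy version reproduces the convergence or cycling behavior reported in the abstract depends on the assumed learning rule and parameters.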