Algorithmic Collective Action under Differential Privacy
Advisor
Creager, Elliot
Publisher
University of Waterloo
Abstract
The rapid integration of AI into everyday life has generated considerable public attention and excitement, promising unprecedented gains in efficiency and personalization. At the same time, it raises concerns about automating algorithmic harms and re-entrenching existing social inequities. In response, the pursuit of "trustworthy AI" has become a critical goal for researchers, corporations, and governments alike. Achieving this objective is a complex challenge that can be approached in many ways, ranging from policy and regulation to technical solutions such as algorithmic design, systematic evaluation, and enhanced model transparency.
Contemporary AI systems, particularly those built on large-scale models, are trained on vast amounts of data, often sourced from social interactions and user-generated content. This dependence creates a grassroots mechanism for autonomy: the possibility for everyday users, citizens, or workers to directly steer AI behavior. This concept, known as Algorithmic Collective Action (ACA), involves a coordinated effort in which a group of individuals deliberately modifies the data they share with a platform, with the intention of driving the model’s learning process toward outcomes they regard as more favorable or equitable. We investigate the intersection between these bottom-up, user-driven efforts to influence AI and a growing class of methods that firms already implement to improve model trustworthiness, especially privacy protections. In particular, we focus on the setting in which an AI firm deploys a differentially private model, motivated by the growing regulatory focus on privacy and data protection.
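As a concrete, purely illustrative picture of the kind of data modification ACA involves, the sketch below shows a "signal planting" strategy in which an alpha-fraction of participants plants a simple feature trigger and relabels their examples to a target class before sharing data with the platform. The function name plant_signal, the trigger, and all parameter values are assumptions made for illustration, not the specific protocol studied in the thesis.

```python
# Illustrative sketch of a "signal planting" collective strategy (hypothetical
# details): an alpha-fraction of participants adds a small feature trigger and
# relabels their examples to a target class before sharing data.
import numpy as np

def plant_signal(X, y, alpha=0.05, target_label=0, trigger_value=1.0, seed=0):
    """Return copies of (X, y) where a random alpha-fraction of rows carries a
    feature trigger and the target label. All names and values are illustrative."""
    rng = np.random.default_rng(seed)
    X_mod, y_mod = X.copy(), y.copy()
    collective = rng.choice(len(X), size=int(alpha * len(X)), replace=False)
    X_mod[collective, -1] = trigger_value   # plant a simple trigger in the last feature
    y_mod[collective] = target_label        # steer the model toward the target label
    return X_mod, y_mod, collective

# Example: a 5% collective acting on a synthetic dataset
X = np.random.default_rng(1).normal(size=(10_000, 20))
y = np.random.default_rng(2).integers(0, 10, size=10_000)
X_mod, y_mod, collective = plant_signal(X, y, alpha=0.05, target_label=3)
```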
Differential Privacy (DP) is a formal, mathematical framework that provides provable guarantees about the privacy of the individuals whose data are included in a dataset. To operationalize these guarantees in deep learning, we employ Differentially Private Stochastic Gradient Descent (DP-SGD), the de facto DP training mechanism for deep models and therefore a natural choice for assessing the effectiveness of ACA under realistic conditions. Our findings reveal that while differential privacy offers substantive protection for individual data, it also introduces challenges for effective algorithmic collective action. Theoretically, we characterize formal lower bounds on the success of ACA when a model is trained with differential privacy, expressed as a function of two key variables: the size of the acting collective and the firm’s chosen privacy parameters, which dictate the level of privacy the firm intends to enforce. Empirically, we verify these theoretical trends through extensive experiments that simulate collective action during the training of a deep neural network classifier across several datasets. Finally, we offer additional insight into how ACA affects empirical privacy and include a socio-technical discussion of the wider implications for responsible AI.
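For readers unfamiliar with these tools: a randomized mechanism M is (ε, δ)-differentially private if, for any two datasets D and D′ differing in one individual's record and any set of outcomes S, Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D′) ∈ S] + δ. DP-SGD achieves such a guarantee by clipping each example's gradient to a fixed norm and adding calibrated Gaussian noise to the averaged update. The sketch below shows a single DP-SGD step for a logistic-regression model in NumPy; it is a minimal stand-in for the deep-network training pipeline used in the thesis, with hyperparameters chosen purely for illustration, and it omits the privacy accountant that converts the noise multiplier, sampling rate, and step count into (ε, δ) guarantees.

```python
# Minimal sketch of one DP-SGD step (per-example clipping + Gaussian noise)
# for logistic regression; a stand-in for the thesis's deep-network training.
# All hyperparameter values are illustrative assumptions.
import numpy as np

def dp_sgd_step(w, X_batch, y_batch, lr=0.1, clip_norm=1.0, noise_mult=1.0, rng=None):
    rng = rng or np.random.default_rng(0)
    grads = []
    for x, y in zip(X_batch, y_batch):               # per-example gradients
        p = 1.0 / (1.0 + np.exp(-x @ w))             # sigmoid prediction
        g = (p - y) * x                              # gradient of the logistic loss
        g = g / max(1.0, np.linalg.norm(g) / clip_norm)  # clip to norm <= clip_norm
        grads.append(g)
    noise = rng.normal(0.0, noise_mult * clip_norm, size=w.shape)
    g_avg = (np.sum(grads, axis=0) + noise) / len(X_batch)  # noisy averaged update
    return w - lr * g_avg

# Example usage on a synthetic binary-classification batch
rng = np.random.default_rng(1)
X_batch = rng.normal(size=(64, 20))
y_batch = rng.integers(0, 2, size=64)
w = dp_sgd_step(np.zeros(20), X_batch, y_batch, rng=rng)
```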