Self-fulfilling Bandits: Endogeneity Spillover and Dynamic Selection in Algorithmic Decision-making

Abstract

In this paper, we study endogeneity problems in algorithmic decision-making where data and actions are interdependent. When there are endogenous covariates in a contextual multi-armed bandit model, a novel bias (self-fulfilling bias) arises because the endogeneity of the covariates spills over to the actions. We propose a class of algorithms to correct for the bias by incorporating instrumental variables into leading online learning algorithms. These algorithms also attain regret levels that match the best known lower bound for the cases without endogeneity. To establish the theoretical properties, we develop a general technique that untangles the interdependence between data and actions.
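As a rough illustration of why instrumental variables help when a covariate is endogenous, the sketch below compares ordinary least squares with two-stage least squares on simulated data. It is not the algorithm from the paper: it covers only the static estimation step, with no bandit or action selection. The data-generating process, the function names (`ols`, `two_stage_least_squares`, `simulate`) and all parameter values are assumptions made for this example.

```python
import numpy as np


def ols(X, y):
    """Ordinary least squares coefficients for y ~ X (no intercept)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def two_stage_least_squares(X, Z, y):
    """Two-stage least squares with instruments Z for covariates X."""
    # Stage 1: project the endogenous covariates onto the instruments.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Stage 2: regress the outcome on the projected covariates.
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]


def simulate(n=20000, theta=1.0, seed=0):
    """Hypothetical data: u is unobserved and drives both x and y."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                  # instrument, independent of u
    u = rng.normal(size=n)                  # unobserved confounder
    x = z + u + rng.normal(size=n)          # endogenous covariate
    y = theta * x + u + rng.normal(size=n)  # observed reward
    return x[:, None], z[:, None], y


X, Z, y = simulate()
theta_ols = ols(X, y)[0]                         # biased by the confounder
theta_iv = two_stage_least_squares(X, Z, y)[0]   # uses only variation from z
print(f"OLS estimate: {theta_ols:.3f}, IV estimate: {theta_iv:.3f}")
```

In this simulated setup the OLS estimate is pulled away from the true coefficient because `x` is correlated with the unobserved term, while the IV estimate relies only on the variation in `x` explained by `z`. In a bandit setting the actions themselves depend on past estimates, which is the additional complication the abstract refers to and which this sketch does not address.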

Date
Nov 15, 2021 5:30 PM — 5:45 PM
Location
Online
Xiaowei Zhang
Associate Professor

My research focuses on methodological advances in stochastic simulation and optimization, decision analytics, and reinforcement learning, with applications in service operations management, FinTech, and the digital economy.
