Papers

Questioning the Survey Responses of Large Language Models

In this work, we examine what we can learn from language models' survey responses on the basis of the well-established American Community Survey (ACS). We systematically establish two dominant patterns. First, smaller models exhibit a significant position and labeling bias, for example, favoring survey responses labeled with the letter "A". Second, even after adjusting for this labeling bias through randomized answer ordering, models still do not trend toward US population statistics or those of any cognizable population. Rather, models across the board trend toward uniformly random aggregate statistics over survey responses. Our findings demonstrate that aggregate statistics of a language model's survey responses lack the signals found in human populations.
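The randomized-ordering adjustment mentioned above can be illustrated with a minimal sketch. This is not the paper's evaluation pipeline: the stub model, the `choose_letter` interface, and the option set are illustrative assumptions. The stub has an extreme position bias (it always answers "A"), and shuffling the answer order per query spreads that bias roughly uniformly over the underlying options.

```python
import random
from collections import Counter

def ask_with_randomized_order(options, choose_letter, rng):
    """Present the options in a random order, then map the chosen
    letter back to the underlying option it was assigned to."""
    letters = "ABCDEFGHIJ"[:len(options)]
    order = list(options)
    rng.shuffle(order)
    picked = choose_letter(list(zip(letters, order)))  # stand-in for a model query
    return order[letters.index(picked)]

# Stub "model" with a pure position bias: it always answers "A".
always_a = lambda labeled_options: "A"

rng = random.Random(0)
options = ["yes", "no", "maybe"]
counts = Counter(
    ask_with_randomized_order(options, always_a, rng) for _ in range(3000)
)
# Each underlying option is chosen roughly 1000 times out of 3000:
# the position bias is averaged out, leaving a uniform aggregate.
print(counts)
```

Under this adjustment, any remaining deviation from uniformity in a real model's aggregate responses reflects a preference over answer content rather than answer position.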

Ricardo Dominguez-Olmedo, Moritz Hardt, Celestine Mendler-Dünner

Preprint

arXiv | code

On Data Manifolds Entailed by Structural Causal Models

The geometric structure of data is an important inductive bias in machine learning. In this work, we characterize the data manifolds entailed by structural causal models. The strengths of the proposed framework are twofold: firstly, the geometric structure of the data manifolds is causally informed, and secondly, it enables causal reasoning about the data manifolds in an interventional and a counterfactual sense. We showcase the versatility of the proposed framework by applying it to generate causally grounded counterfactual explanations for machine learning classifiers, measuring distances along the data manifold in a manner principled by differential geometry.
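In generic notation (a sketch, not necessarily the paper's formulation): a reduced-form structural causal model with a smooth map $f$ from exogenous variables $u \in \mathbb{R}^d$ to observations $x$ entails a data manifold, and distances along it can be measured via the pullback metric, as is standard in differential geometry:

```latex
\mathcal{M} = f(\mathbb{R}^d), \qquad
G(u) = J_f(u)^{\top} J_f(u), \qquad
L[u] = \int_0^1 \sqrt{\dot u(t)^{\top} G(u(t))\, \dot u(t)}\; dt,
```

where $J_f$ is the Jacobian of $f$ and $L[u]$ is the length of a latent-space curve $u(t)$ measured in the ambient space, so that shortest paths (geodesics) stay on the manifold of causally plausible data.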

Ricardo Dominguez-Olmedo, Amir H. Karimi, Georgios Arvanitidis, Bernhard Schölkopf

ICML 2023

On the Adversarial Robustness of Causal Algorithmic Recourse

Recourse recommendations should ideally be robust to reasonably small uncertainty in the features of the individual seeking recourse. We formulate the adversarially robust recourse problem, show that recourse methods offering minimally costly recourse fail to be robust, and present methods for generating adversarially robust recourse. In order to shift part of the burden of robustness from the individual to the decision-maker, we propose a model regularizer for training classifiers such that the additional cost of seeking robust recourse is reduced.
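One way to write the adversarially robust recourse problem described above, in generic notation (the symbols here are assumptions; the paper's exact formulation may differ): find the least costly action $a$ whose outcome survives any small perturbation $\delta$ of the individual's features $x$ under classifier $h$,

```latex
\min_{a \in \mathcal{A}} \; \mathrm{cost}(a)
\quad \text{subject to} \quad
h(x + a + \delta) = 1 \;\; \forall\, \delta \ \text{with}\ \|\delta\| \le \epsilon .
```

Setting $\epsilon = 0$ recovers the standard minimum-cost recourse problem, which is why minimally costly recourse sits on the decision boundary and fails to be robust.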

Ricardo Dominguez-Olmedo, Amir H. Karimi, Bernhard Schölkopf

ICML 2022

arXiv | code