The Effect of Data Poisoning on Counterfactual Explanations

This repository contains the implementation of the experiments as proposed in the paper The Effect of Data Poisoning on Counterfactual Explanations by André Artelt, Shubham Sharma, Freddy Lecué, and Barbara Hammer.

Abstract

Counterfactual explanations are a widely used approach for examining the predictions of black-box systems. They can offer computational recourse by suggesting actionable changes to the input that lead to a different (i.e., more favorable) system output. However, recent studies have pointed out their susceptibility to various forms of manipulation.

This work studies the vulnerability of counterfactual explanations to data poisoning. We consider data poisoning aimed at increasing the cost of recourse on three different levels: locally for a single instance, for a sub-group of instances, or globally for all instances. We formally introduce and characterize such data poisonings, from which we derive and investigate a general data poisoning mechanism. We demonstrate the impact of such data poisoning in the critical real-world application of explaining event detections in water distribution networks. Additionally, we conduct an extensive empirical evaluation, demonstrating that state-of-the-art counterfactual generation methods and toolboxes are vulnerable to such data poisoning. Furthermore, we find that existing defense methods fail to detect those poisonous samples.

Details

Data

The data sets used in this work are stored in Implementation/data/. Many of the .csv files in this folder were downloaded from https://github.com/tailequy/fairness_dataset/tree/main/experiments/data.
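For a quick look at these data sets, they can be loaded with pandas in the usual way. The snippet below is only a small illustration; it assumes the repository has been checked out and pandas is installed, and it simply loads the first .csv file it finds:

```python
from pathlib import Path
import pandas as pd

# List the data sets shipped with the repository.
data_dir = Path("Implementation/data")
csv_files = sorted(data_dir.glob("*.csv"))
print([f.name for f in csv_files])

# Load the first data set found -- substitute any specific file you need.
df = pd.read_csv(csv_files[0])
print(df.shape)
print(df.head())
```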

The data sets for the case study on water distribution systems are generated by Implementation/wdn-casestudy.py.
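To regenerate this data, the script can be run from the repository root; note that any required arguments or additional dependencies are not documented here:

```sh
python Implementation/wdn-casestudy.py
```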

Experiments

Algorithm 1 for generating a poisoned training data set is implemented in Implementation/data_poisoning.py; the individual experiments from the paper are implemented in separate scripts in this repository.
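As a rough intuition for how poisoning the training data can increase the cost of recourse, consider the following toy sketch. It is not the paper's Algorithm 1: the model, the label-flipping strategy, and all parameters are illustrative assumptions, chosen only to show how a retrained model can push the cheapest counterfactual further away from a query instance.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


def cheapest_counterfactual_cost(model, x):
    """L2 distance from x to the decision boundary of a linear model,
    i.e., the cost of the cheapest counterfactual under an L2 cost."""
    w = model.coef_.ravel()
    b = model.intercept_[0]
    return abs(w @ x + b) / np.linalg.norm(w)


X, y = make_classification(n_samples=500, n_features=5, random_state=0)

# Train on the clean data and pick one instance with an unfavorable prediction.
clf_clean = LogisticRegression().fit(X, y)
x_query = X[clf_clean.predict(X) == 0][0]
cost_clean = cheapest_counterfactual_cost(clf_clean, x_query)

# Naive local poisoning: flip the labels of the favorably labeled (y = 1)
# training samples closest to the query, so that after retraining the
# favorable region tends to move away from it.
dist_to_query = np.linalg.norm(X - x_query, axis=1)
favorable = np.where(y == 1)[0]
flip = favorable[np.argsort(dist_to_query[favorable])[:30]]
y_poisoned = y.copy()
y_poisoned[flip] = 0

clf_poisoned = LogisticRegression().fit(X, y_poisoned)
cost_poisoned = cheapest_counterfactual_cost(clf_poisoned, x_query)

print(f"Cost of recourse (clean model):    {cost_clean:.3f}")
print(f"Cost of recourse (poisoned model): {cost_poisoned:.3f}")
```

The actual experiments operate on the real data sets described above and on state-of-the-art counterfactual generation methods and toolboxes; this sketch only conveys the quantity being affected.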

Requirements

License

MIT license - See LICENSE.

How to cite

You can cite the version on arXiv.
