Anti-Discrimination Learning: from Association to Causation - KDD 2018 Tutorial
08/19/2018
TL;DR
This tutorial introduces the causal modeling background, presents a causal modeling-based anti-discrimination framework, and covers the very latest research on causal modeling-based anti-discrimination learning.
Abstract
Anti-discrimination learning is an increasingly important task in the data mining and machine learning fields. Discrimination discovery is the problem of unveiling discriminatory practices by analyzing a dataset of historical decision records, and discrimination prevention aims to remove discrimination by modifying the biased data and/or the predictive algorithms. Discrimination is causal by nature: to prove discrimination, one needs to establish a causal relationship rather than a mere association. Although it is well known that association does not imply causation, the gap between the two has not received enough attention from researchers. This tutorial first surveys existing association-based approaches and points out their limitations. It then introduces the causal modeling background, presents a causal modeling-based anti-discrimination framework, and covers the very latest research on causal modeling-based anti-discrimination learning. Finally, the tutorial suggests several potential future research directions.
Tutors
- Dr. Lu Zhang is an Assistant Professor in the Department of Computer Science and Computer Engineering at the University of Arkansas. He received the BEng degree in computer science and engineering from the University of Science and Technology of China in 2008, and the PhD degree in computer science from Nanyang Technological University, Singapore, in 2013. He worked as a postdoctoral researcher at the University of Arkansas from 2015 to 2018. His research interests include data mining algorithms, discrimination-aware data mining, and causal inference.
- Yongkai Wu is a Ph.D. student in the Department of Computer Science and Computer Engineering at the University of Arkansas.
- Dr. Xintao Wu is a Professor in the Department of Computer Science and Computer Engineering at the University of Arkansas. His major research interests include data privacy, bioinformatics and discrimination-aware data mining.
Target Audience, Prerequisite and Goal
The tutorial targets researchers and practitioners interested in discovering and preventing, from the causal perspective, discrimination caused by data mining and machine learning algorithms. The audience is assumed to be familiar with fundamental concepts in data mining and machine learning, such as classification and predictive model training/testing.
The goal of this tutorial is to survey existing association-based approaches and point out their limitations, and to introduce a causal modeling-based framework along with the very latest research on causal modeling-based anti-discrimination learning. By attending the tutorial, attendees are expected to become acquainted with the anti-discrimination learning literature, gain a deeper understanding of discrimination in machine learning and data mining from the causal perspective, and be informed about future research directions in this field.
Outline
- Introduction
  - Context
  - Literature and resources
- Correlation-based Anti-Discrimination Learning
  - Measuring discrimination: fairness through unawareness, disparate impact, individual fairness, statistical parity, equality of opportunity, calibration, conditional discrimination, alpha discrimination, multi-factor interaction, belift, preference (a toy sketch of statistical parity and disparate impact follows the outline)
  - Removing discrimination: pre-processing (data modification, fair data representation, fair data generation), in-processing (regularization, explicit constraints), post-processing
  - From correlation to causation: discrimination as causal effect, literature, motivating examples
- Causal modeling background
  - From statistics to causal modeling
  - Structural causal model and causal graph: Markovian model, conditional independence, d-separation, factorization formula
  - Causal inference: intervention and the do-operator, truncated factorization formula, path-specific effects, identifiability and the "kite" structure, counterfactual analysis (the factorization and truncated factorization formulas are written out after the outline)
- Causal modeling-based anti-discrimination learning
  - Direct and indirect discrimination
  - Counterfactual fairness (its formal definition is sketched after the outline)
  - Data discrimination vs. model discrimination
  - Other works on causal modeling-based anti-discrimination learning
- Challenges and directions for future research
  - Summary of existing works
  - Challenges: non-identifiability of path-specific effects, causal modeling for mixed-type variables, semi-Markovian models and ADMGs (acyclic directed mixed graphs), uncertain causal models, group/individual-level indirect discrimination
  - Future directions: building discrimination-free predictors; discrimination in tasks beyond classification (ranking and recommendation); generative adversarial networks (GANs); dynamic data and time series; text and image data
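To make the association-based measures in the outline concrete, below is a minimal sketch (not part of the tutorial materials) of two of them, statistical parity difference and the disparate impact ratio, computed on a toy hiring table. The column names and data are hypothetical.

```python
# Minimal, hypothetical sketch of two association-based fairness measures.
import pandas as pd

# Toy decision records; "gender" is the protected attribute, "hired" the decision.
data = pd.DataFrame({
    "gender": ["F", "F", "F", "F", "M", "M", "M", "M"],
    "hired":  [1, 0, 0, 0, 1, 1, 0, 1],
})

# Group-wise positive-decision rates: P(hired = 1 | gender).
rates = data.groupby("gender")["hired"].mean()

# Statistical parity difference: P(Y = 1 | non-protected) - P(Y = 1 | protected).
spd = rates["M"] - rates["F"]

# Disparate impact ratio: P(Y = 1 | protected) / P(Y = 1 | non-protected);
# the common "80% rule" flags ratios below 0.8.
di = rates["F"] / rates["M"]

print(rates)
print(f"statistical parity difference: {spd:.2f}")  # 0.50 on this toy data
print(f"disparate impact ratio:        {di:.2f}")   # 0.33 on this toy data
```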
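For the causal modeling background, the two formulas named in the outline can be written down as follows, in standard structural causal model notation. This is only a quick reference, not the tutorial's own derivation.

```latex
% Markovian factorization over a causal graph, with pa_i the parents of V_i:
P(v_1, \dots, v_n) = \prod_{i=1}^{n} P(v_i \mid pa_i)

% Truncated factorization under the intervention do(X = x):
P(v \mid do(X = x)) = \prod_{i \,:\, V_i \notin X} P(v_i \mid pa_i)
\quad \text{for all } v \text{ consistent with } x, \text{ and } 0 \text{ otherwise.}
```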
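Similarly, counterfactual fairness (Kusner et al., NIPS'17; see the references) can be stated compactly: a predictor is counterfactually fair with respect to a protected attribute A if intervening on A in the counterfactual world leaves the prediction distribution unchanged in every observed context. A sketch of the formal definition:

```latex
% Counterfactual fairness: for every context X = x, A = a, every outcome y,
% and every counterfactual value a' of the protected attribute,
P\left(\hat{Y}_{A \leftarrow a}(U) = y \mid X = x, A = a\right)
  = P\left(\hat{Y}_{A \leftarrow a'}(U) = y \mid X = x, A = a\right)
```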
Related Tutorials
- Lu Zhang, Yongkai Wu and Xintao Wu. Anti-discrimination learning: From association to causation. Tutorial of IEEE BigData, Boston, MA, 2017.
- Lu Zhang, Yongkai Wu and Xintao Wu. Anti-discrimination learning: From association to causation. Tutorial of SBP-BRiMS, Washington D.C., 2017.
Difference: In this tutorial we include the latest research on anti-discrimination learning published in the past half year, and extend the coverage to new topics in the area such as fairness via adversarial learning.
References
- Tian, J., Pearl, J.: Probabilities of causation: Bounds and identification. In: UAI'00 (2000)
- Avin, C., Shpitser, I., Pearl, J.: Identifiability of path-specific effects. In: IJCAI'05 (2005)
- Madsen, A.L.: Belief update in CLG Bayesian networks with lazy propagation. Int. J. Approx. Reason. 49(2), 503-521 (2008)
- Shpitser, I., Pearl, J.: Complete identification methods for the causal hierarchy. J. Mach. Learn. Res. 9(Sep), 1941-1979 (2008)
- Kamiran, F., Calders, T.: Classifying without discriminating. In: IC4'09 (2009)
- Calders, T., Verwer, S.: Three naive bayes approaches for discrimination-free classification. Data Min. Knowl. Discov. 21(2), 277-292 (2010)
- Kamishima, T., Akaho, S., Sakuma, J.: Fairness-aware learning through regularization approach. In: ICDMW'11 (2011)
- Luong, B.T., Ruggieri, S., Turini, F.: k-NN as an implementation of situation testing for discrimination discovery and prevention. In: SIGKDD'11 (2011)
- Žliobaite, I., Kamiran, F., Calders, T.: Handling conditional discrimination. In: ICDM'11 (2011)
- Dwork, C., Hardt, M., Pitassi, T., Reingold, O., Zemel, R.: Fairness through awareness. In: ITCS'12 (2012)
- Kamiran, F., Calders, T.: Data preprocessing techniques for classification without discrimination. Knowl. Inf. Syst. 33(1), 1-33 (2012)
- Kamiran, F., Karim, A., Zhang, X.: Decision theory for discrimination-aware classification. In: ICDM'12 (2012)
- Hajian, S., Domingo-Ferrer, J.: A methodology for direct and indirect discrimination prevention in data mining. IEEE Trans. Knowl. Data Eng. 25(7), 1445-1459 (2013)
- Magnani, L., Board, E., Longo, G., Sinha, C., Thagard, P.: Discrimination and privacy in the information society. Springer (2013)
- Zemel, R. S., Wu, Y., Swersky, K., Pitassi, T., Dwork, C.: Learning fair representations. In: ICML'13 (2013)
- Mancuhan, K., Clifton, C.: Combating discrimination using Bayesian networks. Artif. Intell. Law 22(2), 211-238 (2014)
- Romei, A., Ruggieri, S.: A multidisciplinary survey on discrimination analysis. Knowl. Eng. Rev. 29(05), 582-638 (2014)
- Feldman, M., Friedler, S.A., Moeller, J., Scheidegger, C., Venkatasubramanian, S.: Certifying and removing disparate impact. In: SIGKDD'15 (2015)
- Hajian, S., Domingo-Ferrer, J., Monreale, A., Pedreschi, D., Giannotti, F.: Discrimination-and privacy-aware patterns. Data Min. Knowl. Discov. 29(6), 1733-1782 (2015)
- Edwards, H., Storkey, A.: Censoring representations with an adversary. In: ICLR'16 (2016)
- Hardt M., Price E., Srebro N.: Equality of opportunity in supervised learning. In: NIPS'16 (2016)
- Wu, Y., Wu, X.: Using loglinear model for discrimination discovery and prevention. In: DSAA'16 (2016)
- Zhang, L., Wu, Y., Wu, X.: On discrimination discovery using causal networks. In: SBP-BRiMS 2016 (2016)
- Zhang, L., Wu, Y., Wu, X.: Situation testing-based discrimination discovery: a causal inference approach. In: IJCAI'16 (2016)
- Barocas, S., Hardt, M.: Fairness in machine learning. Tutorial. In: NIPS'17 (2017)
- Bonchi, F., Hajian, S., Mishra, B., Ramazzotti, D.: Exposing the probabilistic causal structure of discrimination. Int. J. Data Sci. Anal. 3(1), 1-21 (2017)
- Corbett-Davies, S., Pierson, E., Feller, A., Goel, S., Huq, A.: Algorithmic decision making and the cost of fairness. In: SIGKDD'17 (2017)
- Kilbertus, N., Rojas-Carulla, M., Parascandolo, G., Hardt, M., Janzing, D., Schölkopf, B.: Avoiding discrimination through causal reasoning. In: NIPS'17 (2017)
- Kleinberg, J., Mullainathan, S., Raghavan, M.: Inherent trade-offs in the fair determination of risk scores. In: ITCS'17 (2017)
- Kusner, M.J., Loftus, J., Russell, C., Silva, R.: Counterfactual fairness. In: NIPS'17 (2017)
- Lin, X., Zhang, M., Zhang, Y., Gu, Z., Liu, Y., Ma, S.: Fairness-aware group recommendation with pareto-efficiency. In: RecSys '17 (2017)
- Pearl, J.: A linear “microscope” for interventions and counterfactuals. Journal of Causal Inference, 5(1). (2017)
- Russell, C., Kusner, M.J., Loftus, J., Silva, R.: When worlds collide: integrating different counterfactual assumptions in fairness. In: NIPS'17 (2017)
- Serbos, D., Qi, S., Mamoulis, N., Pitoura, E., Tsaparas, P.: Fairness in package-to-group recommendations. In: WWW'17 (2017)
- Yang, K., Stoyanovich, J.: Measuring fairness in ranked outputs. In: SSDBM '17 (2017)
- Yao, S., Huang, B.: Beyond Parity: Fairness objectives for collaborative filtering. In: NIPS'17 (2017)
- Zafar, M. B., Valera, I., Gomez Rodriguez, M., Gummadi, K. P.: Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. In: WWW'17 (2017)
- Zafar, M. B., Valera, I., Gomez Rodriguez, M., Gummadi, K. P.: Fairness constraints: Mechanisms for fair classification. In: AISTATS'17 (2017)
- Zafar, M. B., Valera, I., Rodriguez, M., Gummadi, K., Weller, A.: From parity to preference-based notions of fairness in classification. In: NIPS'17 (2017)
- Zehlike, M., Bonchi, F., Castillo, C., Hajian, S., Megahed, M., Baeza-Yates, R.: FA*IR: A fair top-k algorithm. In: CIKM '17 (2017)
- Zhang, L., Wu, X.: Anti-discrimination learning: a causal modeling-based framework. Int. J. Data Sci. Anal. 4(1), 1-16 (2017)
- Zhang, L., Wu, Y., Wu, X.: A causal framework for discovering and removing direct and indirect discrimination. In: IJCAI'17 (2017)
- Zhang, L., Wu, Y., Wu, X.: Achieving non-discrimination in data release. In: SIGKDD'17 (2017)
- Burke, R., Sonboli, N., Ordonez-Gauger, A.: Balanced neighborhoods for multi-sided fairness in recommendation. In: FAT*'18 (2018)
- Celis, L. E., Straszak, D., Vishnoi, N. K.: Ranking with fairness constraints. In: ICALP'18 (2018)
- Kocaoglu, M., Snyder, C., Dimakis, A.G., Vishwanath, S.: CausalGAN: Learning causal implicit generative models with adversarial training. In: ICLR'18 (2018)