Causality in Machine Learning
Posted by David Corfield
Back when we started the Café in 2006, I was working as a philosopher embedded with a machine learning group at the Max Planck Institute in Tübingen. Here I am reporting on my contribution to a NIPS workshop, held amongst the mountains of Whistler, on how one may still be able to learn when the distributions from which training and testing data are drawn differ. My proposal was that background knowledge, much of it causal, had to be deployed. It turns out that a video of the talk is still available – links to it and to the resulting book chapter, Projection and Projectability, are here.
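For readers unfamiliar with the setting, here is a minimal sketch of the purely statistical side of that problem: covariate-shift correction by importance weighting, a standard technique from that literature. The toy data and the domain-classifier estimator below are my own illustrative assumptions, not the method of the talk, which concerned the causal background knowledge such reweighting leaves implicit (for instance, that p(y|x) is invariant across domains).

```python
# A hedged sketch of covariate-shift correction, not the talk's method:
# estimate importance weights w(x) = p_test(x) / p_train(x) with a domain
# classifier, then fit a weighted predictive model on the training data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy data (assumed for illustration): training and test inputs drawn
# from different distributions, i.e. covariate shift.
X_train = rng.normal(loc=0.0, scale=1.0, size=(500, 1))
X_test = rng.normal(loc=1.0, scale=1.0, size=(500, 1))
y_train = (X_train[:, 0] + 0.3 * rng.normal(size=500) > 0).astype(int)

# Domain classifier: label training inputs 0, test inputs 1.
X_dom = np.vstack([X_train, X_test])
y_dom = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])
dom = LogisticRegression().fit(X_dom, y_dom)

# w(x) = p_test(x) / p_train(x) is proportional to
# P(domain = 1 | x) / P(domain = 0 | x).
p = dom.predict_proba(X_train)[:, 1]
weights = p / (1.0 - p)

# Fit the predictive model, upweighting training points that look test-like.
model = LogisticRegression().fit(X_train, y_train, sample_weight=weights)
```

The assumption doing the work here, that p(y|x) is the same in both domains, is itself a piece of causal background knowledge, which is exactly the kind of thing the talk argued must be made explicit.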
I was reminded of this work recently after seeing the strides the machine learning community has taken to integrate causal graphical models with its statistical techniques, as in Towards Causal Representation Learning and Causality for Machine Learning. Who knows? Perhaps my talk, which was after all addressed to some of these people, sowed a seed.
But another seed I was trying to sow around then was Category Theory in Machine Learning (see also my posts from that period on, e.g., kernels, infinite-dimensional exponential families, and probability theory). And I see things are happening on this front too, much of it summarised in Category Theory in Machine Learning.
I’m left wondering about the role of philosophy. Are we better advised to stick to the ‘making sense of what’s happened’ part of our job, often addressed to each other, or is there a place for a ‘you people might want to take a look at this’ approach, addressed to those outside the discipline?
Boryslaw
It seems to me that the central topic of this post is exactly what is known in causal inference as transportability; see, for example, Pearl, J., and Bareinboim, E. (2011), “Transportability of Causal and Statistical Relations: A Formal Approach”.
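To give the flavour of the simplest case treated there: when the source and target populations differ only in the distribution of a pre-treatment covariate $Z$ (age, in the paper’s running example), a causal effect estimated experimentally in the source transports to the target via

$$P^{*}(y \mid do(x)) = \sum_{z} P(y \mid do(x), z)\, P^{*}(z),$$

where $P$ is the source distribution and $P^{*}$ the target. Whether any such formula exists at all is decided by the causal diagram, i.e. precisely by the background causal knowledge the post is concerned with.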