Causation, Prediction, Explanation | Hendrik Erz


Sociology is in upheaval: after machine learning technologies brought about huge changes in engineering, the social sciences are catching up and introducing more and more machine learning methods into their research. But is machine learning really the next big thing, or rather an additional tool that you may or may not use? In this blog post I focus on the initial research questions that popped up during the first weeks of my PhD, and which will guide my work in the future.

As a PhD student, you begin your journey with a bag full of good intentions, but as we know from Karl Marx himself, the road to hell is paved with good intentions (2013). As such, expectations of your PhD need to be managed. If we can believe numerous accounts on social media, this can mean managing them for the worse, but it can also mean managing them for the better. My initial research proposal focused on methodological questions, and it set a very low bar. This is perfectly normal, I assume: as a beginning researcher you don’t have the knowledge a senior researcher has, so it’s only natural that you can’t dig up the “big questions” early on.

Two meetings with my supervisor later, the initial research proposal is pretty much superseded, but I am happy with this turn of events. After all, as he rightly stated, simply looking at what GANs might be able to do in the realm of bot detection on Twitter is somewhat low-key and most certainly not something that you can devote four years of your life to. At the same time, he gave me pretty strong hints as to which area is likely to be viable for further research. He highlighted that an interesting avenue will be to examine how well traditional statistics and machine learning can explain causation and mechanisms, and how well they do with regard to prediction.

I spent the last week producing little output and instead thinking about that problem and reading some important papers on it. The first interesting area is the comparison of traditional statistical methods with machine learning technologies. To a certain extent it is hard to differentiate the two, mainly because some social scientific subfields have made use of machine learning technologies for decades now (Evans and Aceves 2016), which puts natural language processing (NLP), a machine learning technology, closer to the notion of “tradition.” Furthermore, as Molina and Garip (2019) put it, several methods that I personally thought of as “traditional” statistical methods actually feature a form of machine learning. So how do you distinguish between these fields?

This is the first really interesting question to think about. The second is how the two compare in their ability to predict and to explain. Is machine learning basically always better than traditional statistical methods? Or is the interplay more complex? In a landmark study, Salganik et al. (2020) found that machine learning technologies are actually not superior to “manual” selection of features when it comes to prediction. What is more, none of the methods was able to predict the variables to a satisfying degree. Does this mean that we can basically trash machine learning again and return to regression models? Certainly not. But it highlights the importance of performing research in this area.

There is a lot to uncover here. Even though Leo Breiman formulated the “statistics vs. machine learning” problem almost two decades ago (2001), I am not completely convinced by his fundamental argument, namely that traditional statistics, which fits a pre-specified model to the data, makes more assumptions about the underlying mechanism than machine learning methods do. Machine learning does not specify in advance the function that maps onto the data, whereas traditional statistics always starts from the question of the correct model. Yet machine learning still makes strong assumptions about the data.
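To make the contrast a bit more tangible, here is a minimal sketch in plain Python. It is not taken from Breiman's paper: the simulated data, the function names (`fit_line`, `fit_knn`, `mse`, `simulate`) and the choice of k-nearest neighbours as the "algorithmic" stand-in are all my own assumptions for illustration. The linear fit commits to a functional form up front, while the neighbour-based predictor does not, but it quietly assumes numeric inputs, a distance metric, and a sensible value of k.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x: the functional form is assumed up front."""
    mx, my = mean(xs), mean(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda x: a + b * x

def fit_knn(xs, ys, k=5):
    """k-nearest-neighbour regression: no functional form, but it assumes
    a numeric feature, a distance metric (absolute difference) and a value of k."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return mean(y for _, y in nearest)
    return predict

def mse(model, xs, ys):
    """Mean squared prediction error of a fitted model on the given data."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))

rng = random.Random(0)  # fixed seed so the simulation is reproducible

def simulate(n):
    """Hypothetical data: a quadratic relationship plus Gaussian noise."""
    xs = [rng.uniform(-3, 3) for _ in range(n)]
    ys = [x ** 2 + rng.gauss(0, 0.5) for x in xs]
    return xs, ys

train_x, train_y = simulate(200)
test_x, test_y = simulate(200)

line = fit_line(train_x, train_y)
knn = fit_knn(train_x, train_y)

print("held-out MSE, linear model:", mse(line, test_x, test_y))
print("held-out MSE, k-NN:        ", mse(knn, test_x, test_y))
```

On this toy data the misspecified line predicts poorly and the neighbour-based predictor does better, but only because the data happen to suit its own built-in assumptions. Swap in categorical inputs or a badly scaled feature and the picture can change.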

Model-based classifiers, for example, commonly known as “neural networks” (for a discussion of the term “neural network” see Erz 2020), always need a precise specification of the input shape that is to be expected: images need to be cropped rigorously to fit a precise input space (e.g. 1024x1024 pixels), and text classifiers (RNNs, recurrent neural networks) trained on English text in the ASCII range won’t work with Chinese or Japanese Unicode characters, because those code points lie entirely outside the range the model saw during training.
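The text half of that argument can be sketched in a few lines of Python. This is a hypothetical toy encoder, not code from any particular library: the vocabulary (`string.printable`, which contains only ASCII characters), the sequence length of 16, and the names `encode` and `unknown_share` are assumptions made purely for illustration. It shows how a character-level model with a fixed vocabulary and a fixed input length treats text it was never built for.

```python
import string
from typing import List

# Assumed vocabulary: printable ASCII only. Index 0 is reserved for padding.
VOCAB = {ch: i + 1 for i, ch in enumerate(string.printable)}
UNK = len(VOCAB) + 1   # single catch-all id for every character outside the vocabulary
SEQ_LEN = 16           # assumed fixed input length expected by the model

def encode(text: str, seq_len: int = SEQ_LEN) -> List[int]:
    """Map text to a fixed-length list of ids: truncate, look up, then pad with zeros."""
    ids = [VOCAB.get(ch, UNK) for ch in text[:seq_len]]
    return ids + [0] * (seq_len - len(ids))

def unknown_share(text: str) -> float:
    """Fraction of characters that the vocabulary cannot represent."""
    if not text:
        return 0.0
    return sum(ch not in VOCAB for ch in text) / len(text)

english = encode("machine learning")
japanese = encode("機械学習")  # "machine learning" in Japanese

print(english)
print(japanese)
print(unknown_share("machine learning"), unknown_share("機械学習"))
```

Every Japanese character collapses into the same unknown id, so whatever the downstream model learned about English character sequences has nothing to work with. The fixed `SEQ_LEN` plays the same role for text that the cropped pixel grid plays for images: anything longer is cut off, anything shorter is padded.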

As we can see, there is a lot to uncover in the realm of explanation, and I am very happy to start digging into these problems as I continue my journey. While a lot is still unexplained, one thing becomes clearer every day: the reports on the failures of “big data” are likely true, as the success of “artificial intelligence” can largely be explained by biased training sets, very limited areas of application, and a lot of duct tape.


Breiman, Leo. 2001. “Statistical Modeling: The Two Cultures (with Comments and a Rejoinder by the Author).” Statistical Science 16 (3): 199–231.

Erz, Hendrik. 2020. “Künstliche Intelligenz und Daten: Eine Evaluation softwarebasierter militärischer Informationsgewinnung.” Research Report 4. Hamburg: IFSH.

Evans, James A., and Pedro Aceves. 2016. “Machine Translation: Mining Text for Social Theory.” Annual Review of Sociology 42 (1): 21–50.

Marx, Karl. 2013. Der Produktionsprozeß des Kapitals. 40. Auflage, unveränderter Nachdruck der 11. Auflage 1962. Das Kapital, Kritik der politischen Ökonomie / Karl Marx. Inst. für Marxismus-Leninismus beim ZK d. SED ; 1. Band. Berlin: Karl Dietz Verlag.

Molina, Mario, and Filiz Garip. 2019. “Machine Learning for Sociology,” 22.

Salganik, Matthew J., Ian Lundberg, Alexander T. Kindel, Caitlin E. Ahearn, Khaled Al-Ghoneim, Abdullah Almaatouq, Drew M. Altschul, et al. 2020. “Measuring the Predictability of Life Outcomes with a Scientific Mass Collaboration.” Proceedings of the National Academy of Sciences 117 (15): 8398–8403.
