Today, I want to share something special with you: for the first time in my life, I have the honor of chairing a session at a conference. Specifically, together with my friend and colleague, Sebastian Gießler (IZEW, University of Tübingen), I will host the session “Pretrained models and sociology: ethical, methodological and theoretical considerations” (session number: 12) at the 2024 conference of the Nordic Sociological Association (NSA).
Despite its name, the conference is open to participants from around the world, and the session organizers come from a wide range of countries.
I am especially pleased to be leading this session, as NSA 2024 will be hosted by my own institute, the Institute for Analytical Sociology (IAS) at Campus Norrköping of Linköping University. To me, it is a great honor to carry this responsibility and to prepare and lead this session.
This is where you come in: I know that quite a few readers of my blog work with deep (language) models and are sociologists, or at least somewhere in the vicinity. This means that you probably have ideas and opinions about how, where, and why pretrained models should find their place within sociological inquiry (or social scientific inquiry more generally).
In our session, we want to focus on three important pillars of the usage of pretrained models: ethics, methods, and theory.
For your convenience, here is the Call4Abstracts, copied from the conference website:
Novel digital technologies have brought society not just apps to interact with known and unknown others. They have also enabled us to publish content faster than ever. Just as data production has accelerated, so has data collection. Increasingly, the combined cultural output of humanity is refined into large data sets for training neural networks such as ChatGPT or Llama (text), and DALL-E or Midjourney (images).
Sociologists increasingly use models trained on these data to classify, map, and transform text and images. These models turn otherwise inaccessible data into variables that plug into well-understood statistical methods. In addition, they work in a “plug’n’play”-fashion. But what impact do these models have on established measurement strategies and theory?
Pretrained models raise questions regarding their usage for sociological inquiry that we want to address around three structural pillars: ethics, methods, and theory. We invite contributions that tackle one of these pillars, or any combination thereof.
Questions that submissions may ask include, but are not limited to:
- Ethics and Privacy
  - Are the ethical frameworks of sociology appropriate to address problems of pretrained models, like quantifying proper human labor conditions?
  - How can we manage privacy issues of data collection?
- Methods
  - How do model selection criteria change with the advent of pretrained models? Is “generative sufficiency” sufficient?
  - How does the selection of hyperparameters influence which questions we can ask?
- Theory
  - Do computational methods of data production and data processing have implications for sociological paradigms?
  - How can we ensure that pretrained models align with theoretically guided measurement strategies?
If you are interested in submitting an abstract to this session, please use the official submission system and submit your abstract to Open Session 12 [s12] by February 29th, 2024.
We are looking forward to receiving your contributions and to a fruitful discussion about the safe and proper use of pretrained language models in sociological research!