Web Design for Web Scrapers
Social scientists sometimes have to scrape data from websites because a corresponding dataset does not yet exist. In that case, they are often confronted with confusing HTML code and can struggle to extract the data they want. In this article, I want to provide a primer on web design, the profession of creating these websites. Understanding how web designers think will help disentangle the issues many of us face when looking at some HTML and trying to extract precious data from it.
Installing Linux: 18 years later
I installed Linux on my work computer. After having been unable to properly use it for the better part of the past three years, I asked IT whether I could wipe Windows off the machine. It turned out to be one of the best decisions I have made regarding that computer, and now I have not one, but two workable computers. In this article I compare this experience to my first one 18 years ago, and conclude with whether I can recommend Linux for productive use.
The Telemetry Fallacy
Should you collect telemetry data from your apps? The short answer is no, but the reason why is more complex. It has to do with users' intentions, data poverty, and the simple fact that one does not collect telemetry in an Open Source app. In this article, I report on my (possibly half-baked) experiences with what a good app really needs.
Dedication, Social Pressure, and Spite
Today something a bit more personal: Recently I asked myself the very existential question of why I even continue developing Zettlr, given that there are so many alternatives out there. The reason turns out to come back to three fundamental driving forces: Dedication, Social Pressure, and Spite. But especially spite.
Join us at NSA 2024: Pretrained Language Models for Sociology
On my own behalf, I want to invite you – the readers of this blog – to submit abstracts to a session that I will host at the NSA 2024. My colleague Sebastian Gießler and I want to discuss in an open session the ethical, methodological, and theoretical pitfalls in utilizing pretrained neural network models in sociological or general social scientific inquiry.
Updates.
Some people maintain that with age comes serenity. But this is not always true, as I have been slowly finding out over the past year. The entire Windows and gaming ecosystem is full of mandatory updates. Coupled with a crawlingly slow internet connection, this means that you are either never going to enjoy a casual gaming session again, or you'll have to plan your entire leisure time around this fact.
LocalChat: Chat with an AI Assistant On Your Computer
Today I want to introduce a new small app that I made. LocalChat allows you to run generative AI models that look and feel almost like ChatGPT locally on your computer. This means you can chat with a chatbot while your internet is down, without having to worry about leaking confidential information to some company, and more.
Academic Website VII: Dirty Tricks
Over the past weeks, I have shared many tips on how to set up your own, personal website. Today I want to finish up the series (for now) with a small and incomplete collection of "dirty tricks" that I find very appealing in the context of personal websites, from automating away chores to improving user interaction.
Academic Website VI: Keeping Your Website Up To Date, and Migrating
In this second-to-last article of the Academic Website series, I go over the maintenance routines that you'll need to keep tabs on when you create a personal academic website. It is important to ensure your website remains secure. Additionally, I share what you need to do if you want to switch out the website.
Academic Website V: Telling Machines Who You Are
In today's article, I want to look closer at the SEO (Search Engine Optimization) aspect of a website. Here I describe how you can make yourself and your publications machine-readable, and enrich your website with metadata so that search engines and other applications can make more use of your website.