[BA/MA/PA] Scaling human-coded data with supervised machine learning


Combining qualitative and quantitative research methods is a popular way to investigate phenomena rigorously and to draw on the benefits of both approaches. The use of machine learning in science is opening new ways of combining these methods, because machine learning can be used to scale qualitative datasets, which enables quantitative methods such as statistical analysis (Harrison et al., 2022). One widely used qualitative approach is the coding of human-generated text by a researcher (He, 2012). This process allows researchers to analyse complex constructs that are difficult to measure quantitatively.

Research gap/Problem statement

There are a variety of applications of machine learning in science. However, little code has been made publicly available and datasets are often unique. Before broadly applicable code can be created, new approaches have to be tested on different types of datasets and with different techniques. At the moment there are very few such tests that show the true value of machine learning for scaling qualitative data. In this thesis, you will investigate these possibilities.

Subject of the thesis and directions for research

The goal of this thesis is to investigate how machine learning can be used to scale qualitative datasets. The dataset used in this thesis contains a variety of publications that investigate general purpose technologies, a concept used to describe artificial intelligence. You will apply a method developed by Harrison et al. (2022), for which the original dataset is also available. The exact research question is derived while writing the initial exposé and depends on your individual interests.
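To give a rough idea of what scaling human-coded data means in practice, the sketch below trains a small classifier on a handful of texts that a researcher has already coded and then assigns codes to texts that have not been coded yet. It is only an illustration under stated assumptions: the example sentences, the code labels "governance" and "technology", and the function names are all hypothetical, and the simple Naive Bayes model is a stand-in, not the method of Harrison et al. (2022). In the thesis itself you would likely work with an established machine learning library and a far larger coded sample.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return text.lower().split()


def train_naive_bayes(coded_texts):
    """Fit a multinomial Naive Bayes model on (text, code) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, code in coded_texts:
        class_counts[code] += 1
        for word in tokenize(text):
            word_counts[code][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def predict(model, text):
    """Return the most probable code for an uncoded text."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_code, best_score = None, -math.inf
    for code, n_docs in class_counts.items():
        # Log prior plus Laplace-smoothed log likelihood of each known word.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[code].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            smoothed = (word_counts[code][word] + 1) / (total_words + len(vocabulary))
            score += math.log(smoothed)
        if score > best_score:
            best_code, best_score = code, score
    return best_code


# Hypothetical human-coded sample: (text, code assigned by a researcher).
human_coded = [
    ("the board appointed a new independent chair", "governance"),
    ("shareholders voted on the chair of the board", "governance"),
    ("the model was trained on labelled images", "technology"),
    ("neural networks improve image recognition", "technology"),
]

# Hypothetical uncoded texts that the trained model labels automatically.
uncoded = [
    "the board elected an independent chair",
    "the neural model recognises images",
]

model = train_naive_bayes(human_coded)
machine_coded = [(text, predict(model, text)) for text in uncoded]
for text, code in machine_coded:
    print(f"{code}: {text}")
```

The machine-assigned codes can then be counted or fed into statistical analysis like any other quantitative variable. In a real study, part of the human-coded sample would be held back to check how well the predicted codes agree with the researcher's codes.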


This thesis offering is in the field of information systems and requires business understanding, motivation to learn new applications and methods, and good Python skills. Experience with machine learning is not strictly required, but very helpful.


The thesis and application material can be submitted in English or German.

Call to action

Please apply to Julius Kirschbaum following the guidelines for thesis applications on our chair’s website.

  1. Apply for this thesis by sending an e-mail with a short motivational text, your CV and current transcript to julius.kirschbaum@fau.de
  2. Initial meeting to discuss the topic and get to know each other
  3. Drafting an exposé [2-4 weeks, registration of thesis after 2 weeks]
    1. Refine the problem statement
    2. Demonstrate the relevance
    3. Find your research question
    4. Build your research design and methodology
  4. Feedback meetings with supervisor during development
  5. Hand in your thesis


Harrison, J.S. et al. (2022) ‘Using supervised machine learning to scale human‐coded data: A method and dataset in the board leadership context’, Strategic Management Journal, (November), pp. 1–23. Available at: https://doi.org/10.1002/smj.3480.

He, H. (2012) Coding Interviews: Questions, Analysis & Solutions.