‘My pipeline can find new biomarkers for diseases’

  • Student project
For doctors, it can be really hard to determine which illness a patient has. Many symptoms are shared by several different diseases, so doctors have to dig deeper. One of the newer ways to do this is to look at biomarkers. For her research project in collaboration with the University Medical Centre Groningen and the University of Groningen, second-year Data Science for Life Science student Tessa Postma used data science to find new biomarkers for various diseases.

Tessa chose the project because she finds it important to contribute to society. 'The project sounded really interesting and also quite challenging, but if we could manage it, it would really help doctors and people. The sooner we can diagnose diseases or see them coming, the more we can do to prevent damage. So I wanted to give it a try!'

Bacterial RNA

For her research, Tessa used sequences of so-called 16S rRNA. 'This specific part of bacterial RNA acts like a barcode,' she explains. 'It allows us to identify exactly which bacterial species are present in a patient's gut, giving us a complete overview of their microbiome.'

Starting from the raw 16S rRNA data, she built a pipeline. 'I made an automated pipeline that processes this data from start to finish. First it filters and cleans the data, then it matches the sequences to known bacteria using AI classifiers, and finally it uses machine learning and statistics to output a list of specific bacteria that act as biomarkers for diseases.'
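To give a feel for the steps Tessa describes, here is a minimal sketch in Python. It is not her pipeline: the function names, the toy read counts, the `min_total` threshold, and the simple scoring rule (difference in mean relative abundance between two groups) are all assumptions made for illustration. The classification step that matches sequences to known bacteria is skipped, so the sketch starts from per-sample counts that are already labelled by taxon.

```python
from statistics import mean


def filter_rare_taxa(samples, min_total=10):
    """Drop taxa whose summed read count over all samples is below min_total.

    min_total is a hypothetical threshold chosen for this toy example.
    """
    totals = {}
    for counts in samples.values():
        for taxon, n in counts.items():
            totals[taxon] = totals.get(taxon, 0) + n
    keep = {taxon for taxon, n in totals.items() if n >= min_total}
    return {
        sample_id: {t: n for t, n in counts.items() if t in keep}
        for sample_id, counts in samples.items()
    }


def relative_abundance(counts):
    """Convert raw read counts of one sample into fractions that sum to 1."""
    total = sum(counts.values())
    return {t: n / total for t, n in counts.items()} if total else {}


def rank_candidate_taxa(samples, labels):
    """Rank taxa by the absolute difference in mean relative abundance
    between the "case" and "control" groups.

    This is a stand-in for the machine learning and statistics step;
    it assumes both groups contain at least one sample.
    """
    abundances = {sid: relative_abundance(c) for sid, c in samples.items()}
    taxa = sorted({t for a in abundances.values() for t in a})
    scores = {}
    for taxon in taxa:
        case = [abundances[s].get(taxon, 0.0) for s in abundances if labels[s] == "case"]
        control = [abundances[s].get(taxon, 0.0) for s in abundances if labels[s] == "control"]
        scores[taxon] = mean(case) - mean(control)
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up read counts per sample; not real patient data.
samples = {
    "s1": {"taxon_A": 80, "taxon_B": 20, "taxon_C": 1},
    "s2": {"taxon_A": 70, "taxon_B": 30},
    "s3": {"taxon_A": 20, "taxon_B": 80, "taxon_C": 2},
    "s4": {"taxon_A": 30, "taxon_B": 70},
}
labels = {"s1": "case", "s2": "case", "s3": "control", "s4": "control"}

filtered = filter_rare_taxa(samples)
ranked = rank_candidate_taxa(filtered, labels)
for taxon, score in ranked:
    print(f"{taxon}: {score:+.2f}")
```

A real analysis would replace the scoring rule with proper statistical tests and a trained model, and would need far more samples than this toy input.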

Because the raw output would likely be hard for doctors to interpret, Tessa also added a dashboard. 'On this dashboard, the results are shown in graphs that they can interpret at a glance. And to make the AI’s decisions transparent, I also added a feature that visualizes its reasoning. It generates graphs showing doctors exactly which specific microbes led to the prediction, rather than just giving them an unexplained answer.'

A real kick

The first version of the dashboard was tested with dummy data and showed great promise, so Tessa went on to test her pipeline on real data, with interesting results. 'I can’t share the exact results, but at least I can say it worked! It was quite a struggle, so I was glad I could make it work. To see the real biomarkers was really a kick.' And her work won’t end up in a drawer. 'My supervisor will use the workflow for his research, and maybe more groups will be able to find biomarkers with my pipeline in the future.'

Tessa herself is now graduating, and afterwards she hopes to find a job where she can use all her skills. 'I have a bachelor's degree in biotechnology, and I also really like doing research in the lab. It would be great if I could find a job where I could generate data in the lab, and also analyze it with the skills I picked up during the master's.'

Fields of interest

  • Exact and Information Sciences
  • Health and Sports