Research
I am a professor of Bio-Data Science at the Faculty of Mathematics and Computer Science at the University of Leipzig, Germany.
Currently, I head the Bio-Data Science Group in the Department of Computational Biology at the Helmholtz Centre for Environmental Research in Leipzig, Germany.
My research centers on advancing Data Science methodologies, including statistical learning, machine learning, deep learning, and data analysis and integration, to extract deeper insights from Big Data in human and environmental health. By developing and applying state-of-the-art computational techniques, I aim to generate novel hypotheses and predictive models, particularly in ecological and health research. A core focus of my work is strengthening the credibility of AI-driven analyses by embedding explainability and quantified uncertainty into each application. I also place strong emphasis on reproducible research, so that findings are transparent, consistent, and valuable to the scientific community.
Examples of methods and technologies I use include:
- High-performance computing clusters for large-scale data processing.
- GPU-accelerated training of AI models.
- Graph and other neural networks for complex, interconnected data structures (for supervised, unsupervised, and reinforcement learning tasks).
- Knowledge graphs and graph databases for data organization and semantic relationships.
- Large language models to enhance interpretability and support research applications.
- Programming, scripting, and query languages such as R, Python, shell, awk, Cypher, and SQL for versatile data manipulation and analysis.
In addition to my research, I am dedicated to teaching future data scientists and computer science students. I offer courses in statistical learning, R programming, and an interactive Data Science curriculum designed to prepare students comprehensively for the field. These courses include:
- Hands-on training in R and Python,
- Version control with Git,
- Agile project and self-management practices,
- Storytelling with data,
- Crafting compelling and representative visuals, and
- Developing strong presentation skills.
I aim to equip students with a robust, practical skill set that prepares them for success in real-world data science roles.