Claringbould Lab

Department: Internal Medicine

Principal investigator: Annique Claringbould

E-mail address: a.claringbould@erasmusmc.nl

Website: https://www.erasmusmc.nl/en/research/researchers/claringbould-annique

Testing the promise of AI in biology: re-analysing single-cell multiomics data in different disease contexts

Suitable as a BEP? No

Suitable as a MEP? Yes

Suitable as an Academic Research Project? No

Techniques:

Single cell analysis
RNA-seq
ATAC-seq
AI
Programming

Single-cell transcriptomics is rapidly advancing our understanding of cellular heterogeneity, while artificial intelligence (AI) is transforming data analysis. This project will evaluate CellVoyager (Alber et al., bioRxiv 2025), an autonomous AI agent that explores single-cell RNA-seq data and generates testable hypotheses. Using three multiomics single-cell datasets from our recently published SUM-seq study (Lobato-Moreno*, Yildiz*, Claringbould* et al., Nature Methods 2025), the student will assess CellVoyager’s performance on complex, multimodal data. The datasets span: (1) macrophage polarization over time, (2) CD4⁺ T-cell differentiation into T-helper subtypes, and (3) CRISPRi/a perturbations in hiPSCs across four differentiation stages. The aims are to determine whether CellVoyager can be effectively applied to multiomic data and to uncover previously missed transcriptional regulators or stronger perturbation effects.

Mapping transcriptional neighbors of pancreatic beta cells using SCimilarity

Suitable as a BEP? No

Suitable as a MEP? Yes

Suitable as an Academic Research Project? No

Techniques:

Single cell analysis
RNA-seq
AI
Programming

Single-cell reference atlases enable detailed characterization of cell identity, crucial for understanding diseases like diabetes. Yet, comparing similar cells across datasets remains difficult. This project applies SCimilarity (Heimberg et al., Nature 2025), a foundation model for fast comparison of single-cell gene expression profiles, to identify cells resembling pancreatic endocrine types (e.g., alpha and beta cells) defined in the hiPSC-derived pancreas dataset (Balboa et al., Nature Biotechnology 2022). Using these query profiles, the student will search large reference datasets to: (1) validate similar cell groups via differential and marker gene analysis, (2) test whether similar profiles occur beyond pancreatic tissues, and (3) compare diabetic and healthy pancreatic cells to identify disease-associated beta cell states. The project integrates foundation models and biological validation to probe cross-tissue transcriptional similarity.
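The idea of searching a reference for cells that resemble a query profile can be sketched with plain cosine similarity over embedding vectors. This is a generic nearest-neighbour illustration on made-up vectors; it does not use SCimilarity's own API, and the function name `top_k_similar` and all data are assumptions for illustration only.

```python
import numpy as np

def top_k_similar(query, reference, k=2):
    """Return indices and cosine similarities of the k reference rows
    most similar to the query vector (hypothetical helper)."""
    q = query / np.linalg.norm(query)
    r = reference / np.linalg.norm(reference, axis=1, keepdims=True)
    sims = r @ q                      # cosine similarity per reference cell
    order = np.argsort(-sims)[:k]     # highest similarity first
    return order, sims[order]

# Toy "embeddings": 4 reference cells in a 3-dimensional space (made up).
reference = np.array([[1.0, 0.0, 0.0],
                      [0.9, 0.1, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])
query = np.array([1.0, 0.05, 0.0])    # made-up query profile

idx, sims = top_k_similar(query, reference, k=2)
```

In a real analysis the vectors would come from the foundation model's embedding of each cell, and the retrieved neighbours would then be checked with marker gene and differential expression analysis as described above.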

Decoding regulatory mechanisms of cardiovascular risk using AlphaGenome predictions

Suitable as a BEP? No

Suitable as a MEP? Yes

Suitable as an Academic Research Project? No

Techniques:

Polygenic risk scores
Risk modeling
AI
Programming

Polygenic risk scores (PRS) capture inherited susceptibility to cardiovascular disease (CVD) but offer limited insight into underlying molecular mechanisms. AlphaGenome (DeepMind, bioRxiv 2025) predicts the regulatory consequences of genetic variants across molecular layers such as gene expression and chromatin accessibility. This project aims to partition CVD PRS by predicted regulatory impact, generating a regulatory impact score that integrates these multiomic effects. The goal is to determine whether weighting variants by their regulatory influence can reveal molecular processes that contribute to, or precede, the development of cardiovascular disease.
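As a rough illustration of what weighting variants by regulatory influence could look like, the sketch below computes a standard PRS as the sum of effect sizes times allele dosages, and a second score in which each variant's contribution is scaled by a regulatory impact weight. The function names, the toy numbers, and the weighting scheme are all assumptions for illustration; they are not taken from AlphaGenome or from the project's actual method.

```python
import numpy as np

def polygenic_score(dosages, betas):
    """Standard PRS: per-individual sum of effect size x allele dosage."""
    return dosages @ betas

def regulatory_weighted_score(dosages, betas, impact):
    """Hypothetical variant of the PRS in which each variant is scaled by
    a regulatory impact weight in [0, 1] (assumed to be derived from model
    predictions; this scaling scheme is illustrative only)."""
    return dosages @ (betas * impact)

# Toy data: 3 individuals x 4 variants; dosages are 0, 1 or 2 effect alleles.
dosages = np.array([[0, 1, 2, 0],
                    [1, 1, 0, 2],
                    [2, 0, 1, 1]], dtype=float)
betas = np.array([0.10, -0.05, 0.20, 0.15])   # made-up GWAS effect sizes
impact = np.array([1.0, 0.0, 0.5, 0.2])       # made-up regulatory weights

prs = polygenic_score(dosages, betas)
reg = regulatory_weighted_score(dosages, betas, impact)
```

Comparing how individuals rank under `prs` and `reg` is one simple way to see whether variants with high predicted regulatory impact drive the overall risk estimate.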

(Example) projects submitted by lab in past years

(2024-2025) Integrative genetic scoring for personalised risk assessment of Familial Hypercholesterolaemia

Supervisor: Annique Claringbould, a.claringbould@erasmusmc.nl

Familial hypercholesterolemia (FH) is a genetic disorder characterized by elevated blood cholesterol levels from birth, which increases the risk of cardiovascular disease (CVD). While FH is diagnosed by identifying mutations in the LDLR, APOB, and PCSK9 genes, recent genome-wide association studies (GWAS) have highlighted the role of common genetic variants in lipid metabolism. When combined, these common variants can substantially affect lipid levels.

In this project, we will develop a score that integrates rare mutations, common variants, and molecular variants to provide a comprehensive genetic risk assessment for FH patients. We will use data from the large-scale UK Biobank and a cohort of FH patients with early CVD. This approach aims to improve the prediction of clinical outcomes for FH patients and contribute to personalized treatment.

Techniques:

Programming in R and on a Linux-based server (no experience required)
Calculation of polygenic scores
Identifying structural variation and calling rare variants in whole exome data
