Claringbould Lab

Department                        Internal Medicine

Principal investigator      Annique Claringbould

E-mail address                   a.claringbould@erasmusmc.nl

Website                               https://www.erasmusmc.nl/en/research/researchers/claringbould-annique

 

Testing the promise of AI in biology: re-analysing single-cell multiomics data in different disease contexts

Suitable as a BEP? No

Suitable as a MEP? Yes

Suitable as an Academic Research Project? No

Techniques:

  • Single cell analysis
  • RNA-seq
  • ATAC-seq
  • AI
  • Programming

Single-cell transcriptomics is rapidly advancing our understanding of cellular heterogeneity, while artificial intelligence (AI) is transforming data analysis. This project will evaluate CellVoyager (Alber et al., bioRxiv 2025), an autonomous AI agent that explores single-cell RNA-seq data and generates testable hypotheses. Using three multiomics single-cell datasets from our recently published SUM-seq study (Lobato-Moreno*, Yildiz*, Claringbould* et al., Nature Methods 2025), the student will assess CellVoyager’s performance on complex, multimodal data. The datasets span: (1) macrophage polarization over time, (2) CD4⁺ T-cell differentiation into T-helper subtypes, and (3) CRISPRi/a perturbations in hiPSCs across four differentiation stages. The aims are to determine whether CellVoyager can be effectively applied to multiomic data and to uncover previously missed transcriptional regulators or stronger perturbation effects.

Further reading (click to link to article)

https://www.nature.com/articles/s41592-025-02700-8

Mapping transcriptional neighbors of pancreatic beta cells using SCimilarity

Suitable as a BEP? No

Suitable as a MEP? Yes

Suitable as an Academic Research Project? No

Techniques:

  • Single cell analysis
  • RNA-seq
  • AI
  • Programming

Single-cell reference atlases enable detailed characterization of cell identity, crucial for understanding diseases like diabetes. Yet, comparing similar cells across datasets remains difficult. This project applies SCimilarity (Heimberg et al., Nature 2025), a foundation model for fast comparison of single-cell gene expression profiles, to identify cells resembling pancreatic endocrine types (e.g., alpha and beta cells) defined in the hiPSC-derived pancreas dataset (Balboa et al., Nature Biotechnology 2022). Using these query profiles, the student will search large reference datasets to: (1) validate similar cell groups via differential and marker gene analysis, (2) test whether similar profiles occur beyond pancreatic tissues, and (3) compare diabetic and healthy pancreatic cells to identify disease-associated beta cell states. The project integrates foundation models and biological validation to probe cross-tissue transcriptional similarity.
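The idea of searching a reference for cells that resemble a query profile can be illustrated with a toy nearest-neighbour search. This is a minimal sketch and not SCimilarity's actual interface: it assumes each cell is already represented as a numeric embedding vector, and it uses cosine similarity as the comparison measure purely for illustration. The function names, cell identifiers, and numbers are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def most_similar(query, reference, k=2):
    """Return the k reference cells most similar to the query, best first."""
    scored = [
        (cell_id, cosine_similarity(query, vector))
        for cell_id, vector in reference.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings, invented for illustration only.
query_beta_cell = [0.9, 0.1, 0.0]
reference_cells = {
    "cell_A": [0.8, 0.2, 0.1],
    "cell_B": [0.0, 0.1, 0.9],
    "cell_C": [0.45, 0.05, 0.0],
}

hits = most_similar(query_beta_cell, reference_cells, k=2)
for cell_id, score in hits:
    print(cell_id, round(score, 3))
```

In practice the reference would contain millions of cells, so an approximate nearest-neighbour index would replace the exhaustive loop shown here.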

Further reading (click to link to article)

https://www.nature.com/articles/s41586-024-08411-y

Decoding regulatory mechanisms of cardiovascular risk using AlphaGenome predictions

Suitable as a BEP? No

Suitable as a MEP? Yes

Suitable as an Academic Research Project? No

Techniques:

  • Polygenic risk scores
  • Risk modeling
  • AI
  • Programming

Polygenic risk scores (PRS) capture inherited susceptibility to cardiovascular disease (CVD) but offer limited insight into underlying molecular mechanisms. AlphaGenome (DeepMind, bioRxiv 2025) predicts the regulatory consequences of genetic variants across molecular layers such as gene expression and chromatin accessibility. This project aims to partition CVD PRS by predicted regulatory impact, generating a regulatory impact score that integrates these multiomic effects. The goal is to determine whether weighting variants by their regulatory influence can reveal molecular processes that contribute to, or precede, the development of cardiovascular disease.
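As a rough illustration of what partitioning a PRS could look like, the sketch below computes the standard additive score (effect size times allele dosage, summed over variants) separately for each predicted regulatory category. It is only a minimal sketch under stated assumptions: the category labels stand in for whatever AlphaGenome-derived annotation the project ends up using, and all variant identifiers, weights, and dosages are invented.

```python
from collections import defaultdict


def partitioned_prs(dosages, weights, categories):
    """Sum weight * dosage per regulatory category for one individual.

    dosages:    variant id -> effect-allele dosage (0 to 2)
    weights:    variant id -> GWAS effect size (e.g. log odds ratio)
    categories: variant id -> hypothetical label of predicted regulatory impact
    """
    scores = defaultdict(float)
    for variant, dosage in dosages.items():
        if variant not in weights:
            continue  # variant is not part of the score
        label = categories.get(variant, "unannotated")
        scores[label] += weights[variant] * dosage
    return dict(scores)


# Toy values, invented for illustration only.
dosages = {"rs1": 2, "rs2": 1, "rs3": 0, "rs4": 1}
weights = {"rs1": 0.10, "rs2": -0.05, "rs3": 0.20, "rs4": 0.30}
categories = {"rs1": "expression", "rs2": "accessibility", "rs3": "expression"}

partitions = partitioned_prs(dosages, weights, categories)
total_prs = sum(partitions.values())  # equals the unpartitioned score

for label, score in sorted(partitions.items()):
    print(label, round(score, 3))
print("total", round(total_prs, 3))
```

Because the partitions sum to the unpartitioned score, each category's contribution to an individual's risk can be compared directly.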

Further reading (click to link to article)

https://www.biorxiv.org/content/10.1101/2025.06.25.661532v2

(Example) projects submitted by the lab in past years

(2024-2025) Integrative genetic scoring for personalised risk assessment of Familial Hypercholesterolaemia

Supervisor: Annique Claringbould, a.claringbould@erasmusmc.nl

Familial hypercholesterolemia (FH) is a genetic disorder characterized by elevated blood cholesterol levels from birth, which increases the risk of cardiovascular disease (CVD). While FH is diagnosed by finding mutations in the LDLR, APOB, and PCSK9 genes, recent genome-wide association studies (GWAS) have highlighted the role of common genetic variants in lipid metabolism. Individually these variants have small effects, but combined they have a substantial effect on lipid levels.

In this project, we will develop a score that integrates rare mutations, common variants, and molecular variants to provide a comprehensive genetic risk assessment for FH patients. We will use data from the large-scale UK Biobank and a cohort of FH patients with early CVD. This approach aims to improve the prediction of clinical outcomes for FH patients and contribute to personalized treatment.

Techniques

  • Programming in R and on a Linux-based server (no experience required)
  • Calculation of polygenic scores
  • Identifying structural variation and calling rare variants in whole exome data

Further reading

Võsa U*, Claringbould A*, Westra HJ, et al. Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression. Nat Genet. 2021;53(9):1300-1310. doi:10.1038/s41588-021-00913-z