Sebastiano Cultrera di Montesano
@sebacultrera
Postdoc at @Schmidt_Center within @Broadinstitute
Excited to share our latest preprint, introducing the hierarchical cross-entropy (HCE) loss — a simple change that consistently improves performance in atlas-scale cell type annotation models. doi.org/10.1101/2025.0…
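For intuition, here is what a hierarchical cross-entropy can look like in code: leaf-level probabilities are summed up the cell type ontology, and the negative log-likelihood is averaged over every ancestor of the true label, so coarse mistakes cost more than fine-grained ones. This is a minimal PyTorch sketch of one common formulation; the `ancestors` matrix and function names are illustrative assumptions, and the preprint's exact loss may differ.

```python
# Minimal sketch of a hierarchical cross-entropy (HCE) style loss.
# ASSUMPTION: one common formulation (marginalizing leaf probabilities
# up a label tree), not necessarily the preprint's exact definition.
import torch
import torch.nn.functional as F

def hierarchical_cross_entropy(logits, leaf_targets, ancestors):
    """
    logits:       (B, L) raw scores over L leaf cell types
    leaf_targets: (B,)   index of each sample's true leaf label
    ancestors:    (N, L) 0/1 float matrix; row i marks the leaves
                  descending from node i (leaves included as nodes)
    """
    leaf_probs = F.softmax(logits, dim=-1)            # (B, L)
    node_probs = leaf_probs @ ancestors.T             # (B, N): marginal prob of each node
    true_path = ancestors[:, leaf_targets].T          # (B, N): ancestors of the true leaf
    log_node = torch.log(node_probs.clamp_min(1e-12))
    # Average the NLL along the root-to-leaf path of the true label
    return -(true_path * log_node).sum(1).div(true_path.sum(1)).mean()

# Toy usage: 3 leaves, one internal node grouping leaves 0 and 1, plus root.
ancestors = torch.tensor([[1., 0., 0.],   # leaf 0
                          [0., 1., 0.],   # leaf 1
                          [0., 0., 1.],   # leaf 2
                          [1., 1., 0.],   # internal node {0, 1}
                          [1., 1., 1.]])  # root
loss = hierarchical_cross_entropy(torch.randn(4, 3),
                                  torch.tensor([0, 2, 1, 0]), ancestors)
```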

From time to time I like to record myself solving maths problems in real time. Since the IMO has just happened, I plan to try a few of this year's questions. Here I work through Question 1. More to come at some point. youtube.com/watch?v=1G1nyS…
Thank you @nalidoust for the kind shoutout! As single-cell datasets continue to grow, so does the need for tools that make it easier and faster to work with them, especially for training AI models at scale. We’re excited to see what models it will help the community build!
We love it when others build dev tools for building on Tahoe-100M. Today, we highlight scDataset in our @tahoe_ai blog. Developed by @davide_dascenzo & @sebacultrera, it makes it easier to train DL models on datasets at the daunting scale of Tahoe-100M (see the poster at ICML '25)
🎉 Congratulations to the @Schmidt_Center fellows whose research is being highlighted at #ICML2025 in Vancouver! Be sure to check out their work. 🔗More information: linkedin.com/posts/ewsccent… @broadinstitute #SchmidtCenter #ML
🎉 Congratulations to our postdoctoral fellow @sebacultrera on his latest preprint! Learn more about his work below.👇 @broadinstitute
How can deep learning models be trained efficiently on massive single-cell datasets—without full in-memory loading or format conversion? We built scDataset to make data loading fast, memory-efficient, and accessible to everyone. Check it out!
🚀Training deep learning models on massive single-cell datasets is now fast & easy!🧬 scDataset enables fast random sampling from disk—no memory blowup, no format conversion. On Tahoe 100M, up to 48× faster than AnnLoader! 🔗 github.com/Kidara/scDatas… 📄 arxiv.org/abs/2506.01883
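The trick behind fast disk-based loading, roughly: rather than fetching one random cell at a time (which turns each epoch into millions of tiny random reads), you shuffle contiguous blocks of rows, fetch several blocks per read, and shuffle within the fetched buffer. The sketch below illustrates that block-sampling idea with plain anndata + PyTorch; it is not scDataset's actual API (the class name and parameters are made up for illustration), so see the repo above for the real interface.

```python
# Illustrative block-shuffled loader for a disk-backed .h5ad file.
# ASSUMPTION: this mimics the general block-sampling technique; it is
# NOT scDataset's API -- see github.com/Kidara/scDataset for the real one.
import numpy as np
import scipy.sparse as sp
import anndata as ad
from torch.utils.data import IterableDataset

class BlockShuffledCells(IterableDataset):
    def __init__(self, h5ad_path, block_size=1024, fetch_blocks=8, seed=0):
        self.adata = ad.read_h5ad(h5ad_path, backed="r")   # X stays on disk
        self.block_size, self.fetch_blocks = block_size, fetch_blocks
        self.rng = np.random.default_rng(seed)

    def __iter__(self):
        n = self.adata.n_obs
        starts = self.rng.permutation(np.arange(0, n, self.block_size))
        for i in range(0, len(starts), self.fetch_blocks):
            # One large, mostly sequential read per group of blocks
            idx = np.sort(np.concatenate([
                np.arange(s, min(s + self.block_size, n))
                for s in starts[i:i + self.fetch_blocks]
            ]))
            X = self.adata[idx].X                          # materialize buffer
            X = X.toarray() if sp.issparse(X) else np.asarray(X)
            self.rng.shuffle(X)                            # shuffle inside buffer
            yield from X

# Usage: a DataLoader collates the streamed rows into minibatches, e.g.
# DataLoader(BlockShuffledCells("tahoe.h5ad"), batch_size=256)
# ("tahoe.h5ad" is a hypothetical path).
```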
⌨️ 😇 Drafting for NeurIPS? Submit to the #ICML2025 workshop on Scaling Up Intervention Models (SIM) too! Let’s enjoy some fun science in Vancouver this July. 🌞🌳 🗓️Workshop submissions due May 20 AOE 🔗More info: sites.google.com/view/sim-icml2…
@sebacultrera’s latest preprint introduces hierarchical cross-entropy (HCE) — a simple yet powerful loss function that improves cell type annotation performance by 12-15% across models, from linear classifiers to transformers: biorxiv.org/content/10.110…
🎉 Congratulations to @Schmidt_Center postdoctoral fellow Sebastiano Cultrera di Montesano and his team on their latest preprint! Learn more: doi.org/10.1101/2025.0…
🎉@Schmidt_Center's @sebastianocult3 & team's paper was accepted to the Symposium on Computational Geometry! They implement the banana tree data structure, designed to maintain the persistent homology of dynamically changing time series data: arxiv.org/html/2405.17920 @broadinstitute
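Some context on the problem: for a 1-D time series, 0-dimensional persistent homology pairs local minima with local maxima, and each pair's persistence is its height difference. The static computation is a classic union-find sweep, sketched below for intuition; the banana tree (per the paper) is what lets those pairs be maintained as the series is edited, instead of rerunning this sweep from scratch.

```python
# 0-dimensional (sublevel-set) persistence of a time series via a
# union-find sweep: add points from lowest value to highest; when two
# components merge, the one born later (at a higher minimum) dies,
# giving a (birth, death) persistence pair.
# ASSUMPTION: a standard static baseline for intuition, not the banana
# tree itself, which maintains these pairs under dynamic updates.

def persistence_pairs(values):
    n = len(values)
    parent = [-1] * n            # -1 = point not yet added
    birth = [None] * n           # index of each component's minimum

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    pairs = []
    for i in sorted(range(n), key=lambda k: values[k]):
        parent[i], birth[i] = i, i
        for j in (i - 1, i + 1):          # merge with already-added neighbors
            if 0 <= j < n and parent[j] != -1:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # The component with the higher (younger) minimum dies here
                old, young = sorted((ri, rj), key=lambda r: values[birth[r]])
                if values[birth[young]] < values[i]:   # skip zero-persistence pairs
                    pairs.append((values[birth[young]], values[i]))
                parent[young] = old
    return pairs                 # the global minimum's component never dies

# persistence_pairs([0, 3, 1, 4, 2, 5]) -> [(1, 3), (2, 4)]
```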