Carlos Miguel Patiño
@cmpatino_
🇨🇴 Post-Training @huggingface | MSc in AI Student @UvA_Amsterdam
Merry Christmas 🎁 GSPO is in TRL. Looking forward to see your reward curves 📈
Introducing SmolTalk2: the dataset behind SmolLM3's dual reasoning. - mid-training → 5M samples - SFT data → 3M samples - preferences for APO → 500k samples It combines open datasets with new ones curated for strong think and no_think performance. hf.co/datasets/Huggi…
My third week in the post-training team @huggingface coincided with the team’s offsite in Switzerland. Loved the opportunity to get to know the team while hiking and working together for upcoming releases. Thank you @_lewtun and @lvwerra for organizing it!

We’ll be in Singapore presenting this work at #ICLR! Join us if you want to chat about uncertainty quantification that also considers the prediction’s utility
Conformal prediction for high-stakes decisions: Exploration of value of leveraging downstream costs to identify prediction sets #ICLR2025, with Santiago Cortes-Gomez, @brwilder, Carlos Patino, Yewon Byun, and Steven Wu. tinyurl.com/hdy5b2fz
Thanks @fra_grisoni for a great talk at the University of Amsterdam! Learned a lot about how to apply AI effectively to molecular structure generation. Strong start to the #ChemAI events this week
It was a lot of fun to participate as a Language Ambassador in this launch. But most of all, it makes me proud to see the Colombian flag present at the launch of an AI model. Thanks @CohereForAI for driving this mission!
We create breakthroughs together. ✨ Aya Expanse Ambassadors represent 45 countries and 23 languages. Before the launch of Aya Expanse, we invited 110 ambassadors to join us to shape how Aya worked for communities all over the world. 🌍
My personal highlight of this project was seeing how our methods incorporated expert knowledge to make a model more useful for decision-makers in medical diagnosis. Another great work led by @sancortes_95 along with @yewonbyun_, @erichorvitz, and @brwilder
Excited to share our latest work, where we produce sets with both statistical coverage and high decision utility. Applied to dermatological diagnosis, our method yields sets with coherent diagnostic meaning 🏥. More details in the thread 🧵👇
Quantifying AI uncertainty & reasoning about sets of outcomes impt. for high-stakes decisions. Enjoyed collab. w/ @sancortes_95, @brwilder & colleagues on "backing the utility structure of diagnosis" into the learning process: Studies show significant reduction of decision loss.
Excited to share our latest work, where we produce sets with both statistical coverage and high decision utility. Applied to dermatological diagnosis, our method yields sets with coherent diagnostic meaning 🏥. More details in the thread 🧵👇
LLMs tend to be awesome in English but not so much in Spanish. It's great when you find LLMs like @CohereForAI's Aya Expanse that get the quirks of the language! #mysterybot

Couldn't attend #ICML2024, but @sancortes_95 is in Vienna presenting our work about overcoming selection biases & distribution shifts. Great discussions and work ahead to improve machine learning for societal decision-making! Link to the paper: openreview.net/forum?id=CiZN2…

We are presenting our work at Wednesday’s 1:30 pm poster session. Please, come by Hall C 4-9 #100 to discuss more about techniques to overcome selection bias/distribution shift when deploying ML systems for societal decision making!
Elated to announce our work on statistical inference under selection bias. (w @brwilder, @cmpatino_ and Mateo Dulce). Our work was motivated by the insidious problems present on datasets that are conditioned on the outcome, like for example healthcare data 🏥. 🧵
Had a great time working on this project led by @sancortes_95! He explains it more eloquently in the thread, but the work is useful in scenarios like healthcare when you only have data from people that go to the hospital but want to get an estimand for the whole population.
Elated to announce our work on statistical inference under selection bias. (w @brwilder, @cmpatino_ and Mateo Dulce). Our work was motivated by the insidious problems present on datasets that are conditioned on the outcome, like for example healthcare data 🏥. 🧵
After *months* of preparation - we're *extremely* excited to announce: The Private AI Series In partnership with: - @FHIOxford - @PyTorch - @FacebookAI - @UNBigData w/ guests: - Cynthia Dwork - @HNissenbaum - @dawnsongtweets -... Sign Up Here: courses.openmined.org
Universidad de los Andes in Colombia, is one the partner universities. #LATAM #AI
Our scholarship programme was established in an effort to help build a stronger and more inclusive AI community. This week, we welcomed our largest and most geographically diverse cohort yet. (1/3) bit.ly/3mOA750
The programme now supports scholars in countries currently underrepresented in AI, including Bulgaria, Colombia, Greece, Poland, Romania, South Africa & Turkey. We are also establishing new scholarships in Canada and France, and continuing our support in the UK & US. (2/3)