Arnav Goel
@_goel_arnav
MSML @mldcmu | Pre-training, alignment and memorization | ex - @MSFTResearch, @nlp_usc, @Mila_Quebec, IBM | CSAI @IIITDelhi '25
✈️I will be at @iclr_conf in Singapore🇸🇬 next week to present our work on attributing the cultural knowledge of an LLM to its memorization or generalization of its pre-training corpora. Looking forward to chatting with people 🙂 #ICLR2025 📜: arxiv.org/abs/2412.20760
The extreme classification team at MSR India (microsoft.com/en-us/research…) is looking for an undergraduate student interested in a 6-month internship beginning July 2025. Passionate about systems, SysML, and training optimizations? Apply here: forms.cloud.microsoft/r/zMThgrTcYB
Had some of the best months of my college life here. Highly recommend registering for this!!
Announcing the Microsoft Research India Academic Summit 2025! The Microsoft Research (MSR) India Academic Summit is an event aimed at strengthening ties between the Indian academic community and researchers at MSR India. 📅 Event Dates: June 24th & 25th
Proud of my student @huihan_li and intern Arnav presenting their #ICLR2025 work on attributing culture-conditioned generation to LLM’s training corpora. Fun time meeting many friends. Ping me if you want to chat about model security, interpretability and human-LM interaction!
Scouts from @yutori_ai is likely one of the best AI products I have used recently. It greatly simplifies tracking ("scouting") web items I want, with impressive precision. Interesting to watch how this paves the way for cooler interfaces for human-agent interaction.
Really cool work that paves the way for pre-training more trustworthy and controllable models!
1/ So much of privacy research is designing post-hoc methods to make models memorization-free. It’s time we turn that around with architectural changes. Excited to add Memorization Sinks to the transformer architecture this #ICML2025 to isolate memorization during LLM training🧵
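A minimal sketch of the general idea as I read it: reserve part of each MLP layer as per-sequence "sink" units that soak up sequence-specific signal during training, so they can be ablated afterward to strip memorized content. The module name, the hash-based gating scheme, and all sizes below are my assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

class MemSinkMLP(nn.Module):
    """Toy MLP block with designated 'sink' units (conceptual sketch,
    not the paper's architecture). A per-sequence gate routes part of
    the hidden layer into sink units that can later be ablated to
    remove memorized, sequence-specific signal."""

    def __init__(self, d_model=256, d_hidden=1024, n_sink=128):
        super().__init__()
        self.up = nn.Linear(d_model, d_hidden + n_sink)
        self.down = nn.Linear(d_hidden + n_sink, d_model)
        self.d_hidden, self.n_sink = d_hidden, n_sink

    def forward(self, x, seq_id=None, ablate_sinks=False):
        h = torch.relu(self.up(x))
        mask = torch.ones_like(h)
        if ablate_sinks:
            # Inference: drop all sink units, keeping only generalizing units.
            mask[..., self.d_hidden:] = 0.0
        elif seq_id is not None:
            # Hypothetical gating: each training sequence activates a fixed,
            # seed-derived subset of sinks, isolating its memorization there.
            gen = torch.Generator().manual_seed(seq_id)
            active = torch.randperm(self.n_sink, generator=gen)[: self.n_sink // 4]
            gate = torch.zeros(self.n_sink)
            gate[active] = 1.0
            mask[..., self.d_hidden:] = gate
        return self.down(h * mask)
```

Usage would look like `MemSinkMLP()(torch.randn(2, 16, 256), seq_id=42)` during training, then `ablate_sinks=True` at inference.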
I will be at #ICML2025 🇨🇦 from Wednesday through Saturday. My students have a lot of exciting papers - check them out and come talk to us! Especially thrilled to have received the Outstanding Paper Award🏆 for our work on creativity.
I just saw @_albertgu call the major AI labs "Big Token" and it has to be the most hilarious shit ever lol
Anyone attending ICML 2025 looking to share accommodation? Or have a place for another person?
Was lucky to have gotten an early peek at this and have been waiting for it to go public. Really cool work!
How can we unlock generalized reasoning? ⚡️Introducing Energy-Based Transformers (EBTs), an approach that out-scales (feed-forward) transformers and unlocks generalized reasoning/thinking on any modality/problem without rewards. TLDR: - EBTs are the first model to outscale the…
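For anyone curious what "thinking" means here mechanically, this is a toy sketch of the energy-based idea (my own illustration, not the EBT architecture): learn a scalar energy E(x, y), then refine a candidate output y at inference by gradient descent on that energy, spending more steps on harder problems.

```python
import torch
import torch.nn as nn

class ToyEnergyModel(nn.Module):
    """Minimal energy-based predictor (illustrative sketch only)."""

    def __init__(self, d_x=32, d_y=32, d_h=128):
        super().__init__()
        self.d_y = d_y
        self.net = nn.Sequential(
            nn.Linear(d_x + d_y, d_h), nn.SiLU(), nn.Linear(d_h, 1)
        )

    def energy(self, x, y):
        # Scalar compatibility score: low energy = good (x, y) pair.
        return self.net(torch.cat([x, y], dim=-1)).squeeze(-1)

    def predict(self, x, steps=16, lr=0.1):
        # 'Thinking' = iteratively refining y down the energy landscape;
        # more steps buy more computation per prediction.
        y = torch.randn(x.shape[0], self.d_y, requires_grad=True)
        for _ in range(steps):
            (grad,) = torch.autograd.grad(self.energy(x, y).sum(), y)
            y = (y - lr * grad).detach().requires_grad_(True)
        return y.detach()
```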
Damn nice
We’ve developed Gemini Diffusion: our state-of-the-art text diffusion model. Instead of predicting text directly, it learns to generate outputs by refining noise, step-by-step. This helps it excel at coding and math, where it can iterate over solutions quickly. #GoogleIO
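Mechanically, decoding in this family of models looks roughly like the loop below; this is my simplification of discrete text diffusion in general, not Gemini Diffusion's actual method, and the function and parameters are illustrative. Start from fully "noised" (masked) tokens and repeatedly commit the most confident predictions until everything is denoised.

```python
import torch

def toy_diffusion_decode(model, seq_len, steps=8, mask_id=0):
    """Illustrative parallel-refinement decoder. `model` is any callable
    mapping (1, seq_len) token ids to (1, seq_len, vocab) logits."""
    tokens = torch.full((1, seq_len), mask_id, dtype=torch.long)
    for step in range(steps):
        masked = tokens.eq(mask_id)
        n_masked = int(masked.sum())
        if n_masked == 0:
            break
        conf, pred = model(tokens).softmax(-1).max(-1)
        conf = conf.masked_fill(~masked, -1.0)  # never re-touch committed tokens
        k = -(-n_masked // (steps - step))      # ceil: finish within the step budget
        idx = conf.topk(k, dim=-1).indices
        tokens.scatter_(1, idx, pred.gather(1, idx))
    return tokens
```

Unlike left-to-right decoding, each step revisits the whole sequence in parallel, which is what makes iterating over full candidate solutions cheap.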
PywhyLLM: Creating an API for language models to interact with causal methods and vice versa. v0.1 out, welcome feedback. If you are at #iclr2025, come check out our poster today at 10am-12:30pm. github.com/py-why/pywhyllm
Starting now in Hall 3 (#255). Drop by if you want to chat about memorization, culture or just LLMs in general :)
What changes for causality research in the age of LLMs and what does not? Enjoyed this conversation with Alex Molak on how LLMs are accelerating causal discovery, how diverse environments can help causal agents learn, and how causality is critical for verifying AI actions. Link👇
I will be presenting our @iclr_conf paper on attributing culture-conditioned generations to memorization of pretraining data, on Fri, April 25, Hall 3 + Hall 2B #255! DM me if you want to chat about memorization, culture, or anything else! #ICLR2025 #iclr #ICLR25
Accepted at @iclr_conf 🤩 We build a pretraining-corpora attribution framework that determines whether an entity is associated with a culture through memorization or other driving factors, and analyze whether such associations are related to the pretraining data distribution. #ICLR2025
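The gist, as a toy sketch (my simplification for illustration; the function name and threshold are assumptions, not the paper's actual pipeline): an entity the model generates for a culture counts as "memorized" if the pair co-occurs in enough pretraining documents, and is otherwise attributed to generalization or other factors.

```python
def classify_association(entity, culture, corpus_docs, min_cooccur=3):
    """Toy memorization-vs-generalization attribution check (illustrative
    only): count pretraining docs where entity and culture co-occur."""
    e, c = entity.lower(), culture.lower()
    cooccur = 0
    for doc in corpus_docs:
        d = doc.lower()
        if e in d and c in d:
            cooccur += 1
    return "memorization" if cooccur >= min_cooccur else "generalization/other"
```

For example, `classify_association("kimono", "Japanese", docs)` would return "memorization" when the corpus itself backs the association.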