Ekin Akyürek
@akyurekekin
Research @OpenAI | MIT PhD '25
✨ Big life updates ✨ - @afeyzaakyurek and I welcomed our baby! - Successfully defended my PhD and graduated from MIT 🎓 - Joined @OpenAI 🍓 Excited for what's next!
A few years ago, the common-sense message was that you mainly need "scale" to achieve AGI; the curves were extrapolated without any argument. In retrospect, how silly that was. We probably couldn't have gotten any of these results with just pre-training, or with just vanilla RL. The…
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
Do you believe continual learning is what's missing? Our TTT paper is at ICML! Go talk with @AdamZweiger and @jyo_pari!
Come check out our ICML poster on combining Test-Time Training and In-Context Learning for on-the-fly adaptation to novel tasks like ARC-AGI puzzles. I will be presenting with @jyo_pari at E-2702, Tuesday 11-1:30!
There are three types of storage: activations (in-context), external memory, and model weights. If models are going to spend days on a task, they should be really good at compiling their in-context work into an external memory or into their weights! Here we try to learn weights…
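Loosely, a minimal sketch of what "compiling in-context work into weights" could look like: take the trace the model wrote in context and run a few language-modeling gradient steps on it, so the trace can then be dropped from the context window. Everything below (model choice, helper name, hyperparameters) is illustrative, not the paper's actual code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative sketch: distill a long in-context trace into the weights
# via a few LM-loss gradient steps (test-time training), so the model
# no longer needs to carry the trace in its context window.
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def compile_to_weights(trace_text: str, steps: int = 4):
    """Fine-tune on the model's own in-context work (hypothetical helper)."""
    batch = tokenizer(trace_text, return_tensors="pt", truncation=True)
    model.train()
    for _ in range(steps):
        out = model(**batch, labels=batch["input_ids"])
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()

compile_to_weights("Scratchpad: tried X, failed; Y works because ...")
```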
What if an LLM could update its own weights? Meet SEAL🦭: a framework where LLMs generate their own training data (self-edits) to update their weights in response to new inputs. Self-editing is learned via RL, using the updated model’s downstream performance as reward.
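A toy sketch of the loop described above, as I read the tweet: the model writes its own finetuning data, a throwaway copy is updated on it, and downstream performance becomes the reward. All classes here are stand-ins, and a best-of-n selection stands in for the actual RL update.

```python
import copy
import random
from dataclasses import dataclass

# Toy stand-ins so the sketch runs end to end; real SEAL uses an LLM that
# writes its own finetuning data and an RL update over the edit generator.
@dataclass
class ToyModel:
    bias: float = 0.0
    def generate_self_edit(self):      # LLM: "write your own training data"
        return random.uniform(-1, 1)
    def finetune(self, edit):          # apply the self-edit to the weights
        self.bias += edit
    def score(self):                   # downstream task performance
        return -abs(self.bias - 1.0)   # toy task: bias should reach 1.0

def seal_step(model: ToyModel, num_candidates: int = 8) -> float:
    """One illustrative SEAL-style iteration (all names are placeholders)."""
    scored = []
    for _ in range(num_candidates):
        edit = model.generate_self_edit()
        student = copy.deepcopy(model)           # throwaway copy
        student.finetune(edit)
        scored.append((student.score(), edit))   # reward = post-update performance
    best_reward, best_edit = max(scored)
    model.finetune(best_edit)          # keep the best edit (stand-in for RL)
    return best_reward

m = ToyModel()
for _ in range(10):
    seal_step(m)
print(round(m.bias, 2))  # drifts toward 1.0 as good self-edits are rewarded
```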
trained a nanoGPT? feeling behind before o4-mini? 🚨🚨i'm open-sourcing beyond-nanoGPT, an internal codebase to help people go from LLM basics to research-level understanding. 🚨🚨 it contains thousands of lines of from-scratch, annotated pytorch implementing advanced…
Past work has shown that world state is linearly decodable from LMs trained on text and games like Othello. But how do LMs *compute* these states? We investigate state tracking using permutation composition as a model problem, and discover interpretable, controllable procedures🧵
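For concreteness, here is one way the permutation-composition model problem could be set up (my construction; the paper's exact setup may differ): each token applies a permutation of a small set, the hidden world state after a prefix is the running composition, and a probe should be able to decode that state from the LM's activations.

```python
import itertools
import random

# Tokens name permutations of a small set; the "world state" after a
# prefix is the composition of all permutations seen so far.
ELEMS = 3
PERMS = list(itertools.permutations(range(ELEMS)))  # S_3 has 6 elements

def compose(p, q):
    """Apply p, then q (both tuples mapping index -> image)."""
    return tuple(q[p[i]] for i in range(ELEMS))

def make_example(length=8):
    tokens, state = [], tuple(range(ELEMS))  # start from the identity
    for _ in range(length):
        t = random.randrange(len(PERMS))
        tokens.append(t)
        state = compose(state, PERMS[t])
    return tokens, PERMS.index(state)        # sequence -> final-state label

seq, label = make_example()
print(seq, "->", label)
```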
Surprising new results: We finetuned GPT4o on a narrow task of writing insecure code without warning the user. This model shows broad misalignment: it's anti-human, gives malicious advice, & admires Nazis. This is *emergent misalignment* & we cannot fully explain it 🧵
Veri Tezgahı is back. We briefly talked about what we did over the past two years, what has happened in AI, and how robotics is being affected. As always, our podcast link: open.spotify.com/episode/3Fq5CH… And we're now on YouTube! youtu.be/WYo6WsWUM84?si…
deep research suddenly got bored working on my research and started researching wind turbines :)
Thrilled to share our latest findings on data contamination, from my internship at @Google! We trained almost 90 models at the 1B and 8B scales with various contamination types, using machine translation as our task, and analyzed the impact of contamination. arxiv.org/abs/2501.18771
imo this is the only correct take in the last couple of days.
Really neat implications. The most pressing one is that we need to code up verifiers for absolutely everything that can be verified.
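In that spirit, a verifier can be as little as a function from (candidate, tests) to pass/fail, which is exactly the reward signal RL training needs. A minimal illustrative sketch (unsandboxed; a real verifier must isolate execution):

```python
import subprocess
import sys
import tempfile

def verify_code(candidate_src: str, tests: str, timeout: float = 5.0) -> bool:
    """Illustrative verifier: run unit tests against a candidate solution
    in a subprocess; the boolean outcome can serve as an RL reward.
    (Sandboxing is omitted here on purpose.)"""
    program = candidate_src + "\n" + tests
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(program)
        path = f.name
    try:
        result = subprocess.run([sys.executable, path], timeout=timeout,
                                capture_output=True)
        return result.returncode == 0
    except subprocess.TimeoutExpired:
        return False

candidate = "def add(a, b):\n    return a + b\n"
tests = "assert add(2, 3) == 5\nassert add(-1, 1) == 0\n"
print(verify_code(candidate, tests))  # True -> reward 1
```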
New paper: We train LLMs on a particular behavior, e.g. always choosing risky options in economic decisions. They can *describe* their new behavior, despite no explicit mentions in the training data. So LLMs have a form of intuitive self-awareness 🧵
Devastatingly, we have lost a bright light in our field. Felix Hill was not only a deeply insightful thinker -- he was also a generous, thoughtful mentor to many researchers. He majorly changed my life, and I can't express how much I owe to him. Even now, Felix still has so much…