Yukyung Lee
@yukyunglee_
Postdoc at Boston University 🇺🇸 | PhD at Korea University 🇰🇷 | #nlproc | Prev: intern at NAVER, HUFS
Can coding agents autonomously implement AI research extensions? We introduce RExBench, a benchmark that tests whether a coding agent can implement a novel experiment based on existing research and code. Finding: most agents we tested had a low success rate, but there is promise!

📣 Calling #NeurIPS2024 participants 📣 While everyone enjoys the last day in beautiful Vancouver 🏔️🇨🇦, consider joining our initiative and contributing to building models with more inclusivity and diversity, and mitigating implicit and explicit bias. ‼️Cuz we are in the…
⭐️ In Feb 2025, we're launching Grassroots Science, an ambitious, year-long, massive-scale, fully open-source initiative aimed at developing multilingual LLMs aligned with diverse and inclusive human preferences. 🌐 Check our website: grassroots.science and be sure to…
Check out @yulu_qin and @dhevarghese's paper to see how VLMs are better at deploying taxonomic knowledge than their LM counterparts 🤖❤️🪄
Does vision training change how language is represented and used in meaningful ways?🤔 The answer is a nuanced yes! Comparing VLM-LM minimal pairs, we find that while the taxonomic organization of the lexicon is similar, VLMs are better at _deploying_ this knowledge. [1/9]
Lots of progress in mech interp (MI) lately! But how can we measure when new mech interp methods yield real improvements over prior work? We propose 😎 𝗠𝗜𝗕: a Mechanistic Interpretability Benchmark!
wow amazing demo 😲🎶
What role can AI play in unlocking musical creativity? At #CHI2025 (🥇 Best Paper Award 🥇), we present Amuse: a songwriting tool for musicians that turns photos, sounds, and stories into chords 🎶 📄 arxiv.org/abs/2412.18940 🎧 yewon-kim.com/amuse/
cooking something related but complementary... hopefully will be out very soon!
We’re releasing PaperBench, a benchmark evaluating the ability of AI agents to replicate state-of-the-art AI research, as part of our Preparedness Framework. Agents must replicate top ICML 2024 papers, including understanding the paper, writing code, and executing experiments.
🚨 Just Out Can LLMs extract experimental data about themselves from scientific literature to improve understanding of their behavior? We propose a semi-automated approach for large-scale, continuously updatable meta-analysis to uncover intriguing behaviors in frontier LLMs. 🧵
If you're working on an ACL paper and citing any papers with extremely long author lists, consider updating your acl_natbib.bst file, which truncates these lists at 20 authors. github.com/acl-org/acl-st…
RT appreciated! 🙏 ACL 2025 Ling theory & Cognitive modeling track is looking for emergency reviewers. The emergency review period is between 3/18-26, and these reviewers will be excluded from the ARR cycle. If you're interested, please sign up here! docs.google.com/forms/d/1fH7kD…
so proud of @HayleyRossLing for getting a best paper award at @GenBench this year!! 🎉🪅🎉 I'm sure @TeaAnd_OrCoffee would be too :) check out our paper and share if you think homemade cats are cats!
New paper with @najoungkim and @TeaAnd_OrCoffee testing if LLMs can draw adjective-noun inferences like humans! Turns out they often can, and even generalize to unseen combinations. But they're more optimistic about "artificial intelligence" than humans. arxiv.org/abs/2410.17482
It’s the season of “finding ur next adventure” and folks have been asking me about my experience doing a postdoc, so I did a blog post/video w/ @srush_nlp 's help! Should I do a postdoc? What is a postdoc anyway? youtu.be/O7psT1ftwqs Blog: homes.cs.washington.edu/~niloofar/blog…