Yining Lu
@Yining__Lu
First year CS PhD student @NotreDame | Intern: @amazon | Prev: @JHUCLSP 🦋: http://yininglu.bsky.social
Thrilled to share that I'll start my Ph.D. at @ND_CSE this fall, working with @Meng_CS. I am so grateful for the sincere guidance from my current advisor, @DanielKhashabi, and for the unconditional support I received from my family, friends, and collaborators over the past years!
Introducing 🔥torch-molecule🔥: a single line of code for molecular property prediction, generation & representation learning. 30+ deep learning methods and models with a sklearn-style API. Install with `pip install torch-molecule`. Code: github.com/liugangcode/to…
Exciting news at the @lucy_institute! The Foundation Models and Applications Lab has launched with co-directors @Meng_CS and @xiangliangzhang from @NotreDame's Department of Computer Science and Engineering. Learn more: lucyinstitute.nd.edu/news-events/20…
Now accepted by #ACL2025! Thrilled to see our paper also referenced in @lilianweng's latest blog post on reasoning in LLMs! Check it out: lilianweng.github.io/posts/2025-05-…
Process supervision for reasoning is 🔥! While previous approaches often relied on human annotation and struggled to generalize across different reasoning tasks, we're now asking: Can we improve this? Introducing 𝐑𝐀𝐓𝐈𝐎𝐍𝐀𝐋𝐘𝐒𝐓: a new model pre-trained on implicit…
Pleased to share that two papers were accepted to #ACL2025 main! Huge congratulations to all collaborators for the hard work and time we put in together! Both works study multi-model collaboration. I’ll leave it to @Dongwei__Jiang to share more about his first-author paper:…
📣 New Preprint 📣 Did you realize there is a hidden misalignment between decomposer and verifier in long-form text factuality evaluation—an NP-hard puzzle for current methods? 🤔 We tackle this with an online RL solution called Dynamic Decomposition 👇 huggingface.co/papers/2503.15…
Excited to be presenting our paper on training language models under heavily imbalanced data tomorrow at #NAACL2025! If you want to chat about data curation for both pre- and post-training, feel free to reach out! 📝 arxiv.org/abs/2410.04579 📅 11AM-12:30PM, Fri, May 2 📍 Hall 3
"Upsample or Upweight? Balanced Training on Heavily Imbalanced Datasets" arxiv.org/abs/2410.04579 TLDR—When pre-training on imbalanced data, "Upsampling" and loss "Upweighting" are often assumed equivalent. (1) We show they behave differently. (2) Using this, we propose…
Excited to present two papers today and tomorrow at #NAACL2025! Look out for our oral sessions: TurkingBench: arxiv.org/abs/2403.11905 📅 4-5:30pm, Thur, May 1 📍 Ballroom A (R&E.4) Verifiable by Design: arxiv.org/abs/2404.03862 📅 9-10:30am, Fri, May 2 📍 Ballroom A (HC.1)
Highlighting our #NAACL2025 papers 🧵🧵🧵
Quick reminder that our paper, Benchmarking Language Model Creativity: A Case Study on Code Generation, will be presented today! 📅 11AM-12:30PM, Fri, May 2 📍 Hall 3 📝 arxiv.org/abs/2407.09007 🎥 youtube.com/watch?v=v1cHyC…

Current copyright mitigation methods for LLMs typically focus on average-case risks, but overlook worst-case scenarios involving long verbatim copying ⚠️. We propose BloomScrub 🧽, a method providing certified mitigation of worst-case infringement while preserving utility.
I will be at #NAACL2025 to present our LLM creativity benchmark work. Feel free to drop by if interested (Poster Session 8, Fri, May 2)! I'd love to chat about RL and its interpretability, data influence for post-training, CogSci for LLM, and any other NLP-related topics. Feel…
"Benchmarking Language Model Creativity: A Case Study on Code Generation" arxiv.org/abs/2407.09007 TLDR— Proposed a framework for benchmarking LLMs' 𝒄𝒓𝒆𝒂𝒕𝒊𝒗𝒊𝒕𝒚.
A video teaser of @Yining__Lu 's paper: youtube.com/watch?v=v1cHyC…
Midwest Speech and Language Days is in full swing at @NotreDame! #NLProc #MSLD2025
Had a great time chatting with Prof. Xifeng Yan about some of my ongoing research ideas. I learned so much from his valuable insights. Thanks to Prof. @Meng_CS and @ND_CSE for organizing this!
We were so happy having Prof. Xifeng Yan (UCSB) at Notre Dame today. He chatted with our @ND_CSE graduate students, met many old and new friends, and gave a wonderful talk about his recent thoughts on language models. Thanks, Xifeng! 😃