Linlu Qiu
@linluqiu
PhD student @MIT_CSAIL | formerly AI resident @Google
Come check out our ICML poster on combining Test-Time Training and In-Context Learning for on-the-fly adaptation to novel tasks like ARC-AGI puzzles. I will be presenting with @jyo_pari at E-2702, Tuesday 11-1:30!
📢 (1/16) Introducing PaTH 🛣️ — a RoPE-free contextualized position encoding scheme, built for stronger state tracking, better extrapolation, and hardware-efficient training. PaTH outperforms RoPE across short- and long-context language modeling benchmarks. arxiv.org/abs/2505.16381
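For intuition, here is a naive sketch of what a contextualized, Householder-style position encoding can look like: each position carries a data-dependent transform, and a key is pushed through the accumulated transforms between it and the query before scoring. The parameterization (the `w`, `beta` inputs) is a hypothetical stand-in, and this quadratic-time loop is only illustrative; the paper's actual method uses a compact, hardware-efficient formulation.

```python
import torch

def path_like_logits(q, k, w, beta):
    """q, k, w: (T, d); beta: (T,). Returns (T, T) lower-triangular logits.

    Position t carries a transform H_t = I - beta_t * w_t w_t^T, and the
    logit between query i and key j applies the accumulated product
    H_i ... H_{j+1} to the key (identity when j == i).
    """
    T, d = q.shape
    logits = torch.full((T, T), float("-inf"))
    for i in range(T):
        h = torch.eye(d)
        for j in range(i, -1, -1):
            logits[i, j] = q[i] @ (h @ k[j])
            # grow the product to cover the next (earlier) key position
            h = h @ (torch.eye(d) - beta[j] * torch.outer(w[j], w[j]))
    return logits

# toy usage with random inputs
T, d = 6, 16
q, k, w = torch.randn(3, T, d).unbind(0)
w = torch.nn.functional.normalize(w, dim=-1)  # unit Householder directions
logits = path_like_logits(q, k, w, beta=torch.rand(T) * 2)
```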
Hello everyone! We are quite a bit late to the Twitter party, but welcome to the MIT NLP Group account! Follow along for the latest research from our labs as we dive deep into language, learning, and logic 🤖📚🧠
We created SuperBPE🚀, a *superword* tokenizer that includes tokens spanning multiple words. When pretraining at 8B scale, SuperBPE models consistently outperform the BPE baseline on 30 downstream tasks (+8% MMLU), while also being 27% more efficient at inference time.🧵
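A toy sketch of the superword idea, framed as two-stage BPE training: first learn merges that never cross whitespace (ordinary BPE behavior), then lift that restriction so frequent multi-word strings become single tokens. The corpus, character-level start, and merge budgets are all hypothetical; see the paper for the actual SuperBPE recipe.

```python
from collections import Counter

def learn_merges(seqs, num_merges, cross_whitespace):
    """Greedy BPE over `seqs` (lists of string symbols); mutates seqs in place."""
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for seq in seqs:
            for a, b in zip(seq, seq[1:]):
                # stage 1: never merge across (or with) whitespace
                if not cross_whitespace and (" " in a or " " in b):
                    continue
                pairs[(a, b)] += 1
        if not pairs:
            break
        (a, b), _ = pairs.most_common(1)[0]
        merges.append(a + b)
        for seq in seqs:  # apply the new merge everywhere
            i = 0
            while i < len(seq) - 1:
                if seq[i] == a and seq[i + 1] == b:
                    seq[i:i + 2] = [a + b]
                else:
                    i += 1
    return merges

corpus = ["by the way , i think that by the way it works"] * 8
seqs = [list(line) for line in corpus]
learn_merges(seqs, 40, cross_whitespace=False)              # stage 1: word-bounded BPE
superwords = learn_merges(seqs, 10, cross_whitespace=True)  # stage 2: superword merges
print(superwords)  # later merges span whitespace, e.g. "by the way"
```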
Robust reward models are critical for alignment/inference-time algos, auto eval, etc. (e.g. to prevent reward hacking which could render alignment ineffective). ⚠️ But we found that SOTA RMs are brittle 🫧 and easily flip predictions when the inputs are slightly transformed 🍃 🧵
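A minimal sketch of the kind of brittleness probe this implies: apply small, meaning-preserving transformations to a preference pair and count how often the reward model's ranking flips. `reward_model` and the transformation set are hypothetical stand-ins.

```python
def preference_flips(reward_model, prompt, chosen, rejected, transforms):
    """Return the names of transformations that flip the RM's ranking."""
    base = reward_model(prompt, chosen) > reward_model(prompt, rejected)
    flipped = []
    for name, t in transforms.items():
        after = reward_model(prompt, t(chosen)) > reward_model(prompt, t(rejected))
        if after != base:
            flipped.append(name)
    return flipped

# hypothetical meaning-preserving perturbations
transforms = {
    "trailing_whitespace": lambda s: s + "  ",
    "curly_quotes": lambda s: s.replace("'", "\u2019"),
    "polite_prefix": lambda s: "Sure! " + s,
}
```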
[1/x] Can we scale small, open LMs to o1 level? Using classical probabilistic inference methods, YES! Joint @MIT_CSAIL / @RedHat AI Innovation Team work introduces a particle filtering approach to scaling inference w/o any training! Check out …abilistic-inference-scaling.github.io
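A minimal sketch of particle filtering for inference-time scaling: keep N partial solutions, weight them each step with a reward model, and resample. `extend_one_step` (samples the next reasoning step from the LM) and `prm_score` (a process reward model) are hypothetical stand-ins.

```python
import math
import random

def particle_filter(prompt, extend_one_step, prm_score, n_particles=8, n_steps=10):
    particles = [prompt] * n_particles
    for _ in range(n_steps):
        # propagate: sample one more reasoning step per particle
        particles = [extend_one_step(p) for p in particles]
        # weight: softmax of the process-reward scores
        scores = [prm_score(p) for p in particles]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        # resample particles in proportion to their weights
        particles = random.choices(particles, weights=weights, k=n_particles)
    return max(particles, key=prm_score)
```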
(1/5)🚨LLMs can now self-improve to generate better citations✅
📝We design automatic rewards to assess citation quality
🤖Enable BoN/SimPO w/o external supervision
📈Perform close to “Claude Citations” API w/ only 8B model
📄arxiv.org/abs/2502.09604
🧑‍💻github.com/voidism/SelfCi…
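One plausible way to instantiate such an automatic citation reward is context ablation: a good citation should be necessary (removing the cited sentences hurts the answer's likelihood) and sufficient (keeping only them preserves it). A sketch, with `logprob` as a hypothetical LM call returning log p(answer | context):

```python
def citation_reward(answer, context_sents, cited_ids, logprob):
    """Reward = necessity + sufficiency, both measured by ablating context."""
    full = logprob(answer, context_sents)
    without_cited = logprob(
        answer, [s for i, s in enumerate(context_sents) if i not in cited_ids])
    only_cited = logprob(
        answer, [s for i, s in enumerate(context_sents) if i in cited_ids])
    necessity = full - without_cited   # likelihood drop when citations removed
    sufficiency = only_cited - full    # likelihood held with citations alone
    return necessity + sufficiency
```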
It was a great pleasure working on this project with amazing collaborators! Excited to see more opportunities opened up by scaling test-time compute!
Why do we treat train and test time so differently? Why is one “training” and the other “in-context learning”? Just take a few gradient steps at test time, a simple way to increase test-time compute, and get SoTA on the ARC public validation set: 61%, matching the average human score! @arcprize
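A minimal sketch of that core loop, assuming hypothetical `model`, `demos`, and `loss_fn`: take a few gradient steps on the task's own demonstration pairs before predicting the held-out test input. The full recipe in the paper involves more than this; this only shows the idea.

```python
import copy
import torch

def test_time_adapt(model, demos, loss_fn, steps=20, lr=1e-4):
    adapted = copy.deepcopy(model)  # leave the base weights untouched
    opt = torch.optim.AdamW(adapted.parameters(), lr=lr)
    for _ in range(steps):
        for x, y in demos:          # the task's own (input, output) pairs
            opt.zero_grad()
            loss_fn(adapted(x), y).backward()
            opt.step()
    return adapted                  # then predict the test input with this copy
```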
🚨Can we "internally" detect if LLMs are hallucinating facts not present in the input documents? 🤔 Our findings: - 👀Lookback ratio—the extent to which LLMs put attention weights on context versus their own generated tokens—plays a key role - 🔍We propose a hallucination…
LLMs can generate executable code to solve algorithmic tasks, but what about “softer” NL tasks such as commonsense reasoning? We present CoGEX, which trains LLMs to generate Python programs with undefined function calls, and then emulates their execution. arxiv.org/abs/2405.16337
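A minimal sketch of that generate-then-emulate loop, with `llm` as a hypothetical text-completion call: the model first writes a program that may call undefined "soft" helpers, then a second call plays the interpreter.

```python
def cogex_style_answer(task, llm):
    # 1) generate a program that may call undefined "soft" helper functions
    program = llm(
        "Write a Python program to solve the task below. You may call "
        f"helper functions you leave undefined.\n\nTask: {task}"
    )
    # 2) emulate: the LM acts as the interpreter, filling in the soft calls
    return llm(
        "Act as a Python interpreter. Execute this program, inventing "
        "reasonable behavior for any undefined functions, and report the "
        f"final answer:\n\n{program}"
    )
```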