Jane Pan @ ACL 2025
@JanePan_
CS PhD at @nyuniversity, @NSF GRFP, @Deepmind Fellowship, @SiebelScholars | @Princeton @Princeton_nlp '23 | @Columbia '21.
I really like the paper from Jane Pan (w @danqi_chen) abt this: arxiv.org/abs/2305.09731. ICL in big models is clearly a mix of task recognition and "real learning" (you're not learning to translate from 3 examples, but you're not getting an arbitrary label mapping from the prior)
Bored of seeing pristine, perfect posters? Come see me at Hall X5, Board 105 at 6pm to witness my masterpiece, featuring bonus Sharpie scribbles and a QR code that betrayed me at the last moment 😤
I'll be at ACL Vienna 🇦🇹 next week presenting this work! If you're around, come say hi on Monday (7/28) from 18:00–19:30 in Hall 4/5. Would love to chat about code model benchmarks 🧠, simulating user interactions 🤝, and human-centered NLP in general!
When benchmarks talk, do LLMs listen? Our new paper shows that evaluating code LLMs with interactive feedback significantly affects model performance compared to standard static benchmarks! Work w/ @RyanShar01, @jacob_pfau, @atalwalkar, @hhexiy, and @valeriechen_! [1/6]
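For anyone curious what "interactive feedback" looks like operationally, here is a minimal sketch, not the paper's actual harness: a static benchmark is the max_rounds=1 case, while the interactive setting lets a simulated user respond to failing code for a few rounds. `model_fn`, `feedback_fn`, and the problem format are placeholder assumptions.

```python
# Minimal sketch (not the paper's harness): compare static vs. interactive
# evaluation of a code model. `model_fn` and `feedback_fn` are hypothetical
# callables standing in for the code LLM and the simulated user.
from typing import Callable, List

def run_tests(code: str, tests: List[Callable[[str], bool]]) -> bool:
    """A problem counts as solved only if every test accepts the generated code."""
    return all(test(code) for test in tests)

def evaluate(
    problems: List[dict],                    # each: {"prompt": str, "tests": [callable, ...]}
    model_fn: Callable[[str], str],          # prompt -> generated code (placeholder)
    feedback_fn: Callable[[str, str], str],  # (original prompt, failing code) -> feedback (placeholder)
    max_rounds: int = 3,                     # max_rounds=1 reduces to the static benchmark
) -> float:
    """Fraction of problems solved within `max_rounds` rounds of interaction."""
    solved = 0
    for problem in problems:
        prompt = problem["prompt"]
        for _ in range(max_rounds):
            code = model_fn(prompt)
            if run_tests(code, problem["tests"]):
                solved += 1
                break
            # Simulated user reads the failing attempt and replies; the model retries.
            prompt += "\n\n# User feedback:\n" + feedback_fn(problem["prompt"], code)
    return solved / len(problems)

# static_score = evaluate(problems, model_fn, feedback_fn, max_rounds=1)
# interactive_score = evaluate(problems, model_fn, feedback_fn, max_rounds=3)
```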
What does it mean for #LLM output to be novel? In work w/ @jcyhc_ai, @JanePan_, @valeriechen_, @hhexiy we argue it needs to be both original and high quality. While prompting tricks trade one for the other, better models (scaling/post-training) can shift the novelty frontier 🧵
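A toy sketch of the "novelty frontier" framing (not the paper's metrics; the originality/quality scores below are placeholder values): score each generation on originality and quality, then keep the Pareto-optimal points. Prompting tricks move you along the frontier; a stronger model should contribute points that dominate it.

```python
# Toy sketch (not the paper's metrics): given per-generation originality and
# quality scores in [0, 1], keep the Pareto-optimal points -- the "novelty
# frontier". A better model should push this frontier outward on both axes.
from typing import List, Tuple

def novelty_frontier(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return the points not dominated on both (originality, quality)."""
    frontier = []
    for originality, quality in points:
        dominated = any(
            o >= originality and q >= quality and (o, q) != (originality, quality)
            for o, q in points
        )
        if not dominated:
            frontier.append((originality, quality))
    return sorted(frontier)

# Example: the last point is dominated and drops off the frontier.
print(novelty_frontier([(0.9, 0.3), (0.5, 0.5), (0.2, 0.9), (0.4, 0.4)]))
# -> [(0.2, 0.9), (0.5, 0.5), (0.9, 0.3)]
```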
We're excited by the wide attention from the community—thank you for your support! We've released the code, trained probes, and generated CoT data👇 github.com/AngelaZZZ-611/… Labeled answer data is on its way. Stay tuned!
Reasoning models overthink, generating multiple answers during reasoning. Is it because they can’t tell which ones are right? No! We find that while reasoning models encode strong correctness signals during chain-of-thought, they may not use them optimally. 🧵 below
Do reasoning models know when their answers are right?🤔 Really excited about this work led by Anqi and @YulinChen99. Check out this thread below!
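A hedged sketch of the probing setup behind the "correctness signals" claim (this is not the released probes; the hidden states and labels below are random placeholders): fit a linear probe that predicts whether an intermediate answer is correct from the model's hidden state at that point in the chain of thought.

```python
# Minimal sketch (not the released code): a linear probe over CoT hidden
# states. Features and labels are assumed to be precomputed; how they are
# extracted from a specific reasoning model is left abstract here.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder data: 1,000 hidden states (dim 4096), each taken at an
# intermediate answer, labeled by whether that answer was correct.
hidden_states = rng.normal(size=(1000, 4096)).astype(np.float32)
is_correct = rng.integers(0, 2, size=1000)

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, is_correct, test_size=0.2, random_state=0
)

probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)

# High held-out accuracy (on real data) would indicate the model already
# encodes a usable correctness signal mid-reasoning, even if it keeps going.
print("probe accuracy:", probe.score(X_test, y_test))
```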
How do you know if a method is better, or just has better hyperparameters? @hhexiy, @kchonyc, and I give a new tool to answer this in our #NAACL2024 paper: "Show Your Work with Confidence" arxiv.org/abs/2311.09480. Use it in your own work with just a "pip install opda"! 🧵 1/8
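A rough illustration of the tuning-curve idea behind the paper (this is not the opda API, and the simulated-search spread below stands in for the paper's confidence bands): estimate how the best validation score grows with the number of random hyperparameter trials, with an uncertainty range, so two methods can be compared at a matched tuning budget.

```python
# Sketch only (not opda): an empirical tuning curve with a simulated-search
# spread. For each budget k, resample k random trials many times and record
# the best score found, then summarize with the median and a 95% range.
import numpy as np

rng = np.random.default_rng(0)

# Placeholder data: validation scores from 50 random hyperparameter configs.
scores = rng.uniform(0.70, 0.85, size=50)

def tuning_curve(scores, max_k, n_sims=10_000):
    """Median and central 95% spread of the best score after k random trials."""
    rows = []
    for k in range(1, max_k + 1):
        best = rng.choice(scores, size=(n_sims, k), replace=True).max(axis=1)
        rows.append(np.percentile(best, [2.5, 50.0, 97.5]))
    return np.array(rows)  # shape (max_k, 3): lower, median, upper

curve = tuning_curve(scores, max_k=20)
for k, (lower, median, upper) in enumerate(curve, start=1):
    print(f"k={k:2d}  median best={median:.3f}  95% spread=({lower:.3f}, {upper:.3f})")
```

Plotting both methods' curves side by side answers "better, or just better tuned?" at whatever search budget you actually have.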