Amanda Bertsch
@abertsch72
PhD student @LTIatCMU / @SCSatCMU, researching long context + decoding | she/her | also @ abertsch on bsky or http://sigmoid.social or by email (http://cs.cmu.edu)
In-context learning provides an LLM with a few examples to improve accuracy. But with long-context LLMs, we can now use *thousands* of examples in-context. We find that this long-context ICL paradigm is surprisingly effective, and differs in behavior from short-context ICL! 🧵

Life update: I’m excited to share that I’ll be starting as faculty at the Max Planck Institute for Software Systems (@mpi_sws_) this Fall! 🎉 I’ll be recruiting PhD students in the upcoming cycle, as well as research interns throughout the year: lasharavichander.github.io/contact.html
📢📢📢 Releasing OpenThinker3-1.5B, the top-performing SFT-only model at the 1B scale! 🚀 OpenThinker3-1.5B is a smaller version of our previous 7B model, trained on the same OpenThoughts3-1.2M dataset.
Excited to talk about long-context models / eval at this panel on Saturday! I'm also looking for a postdoc / PhD students to work on related topics, happy to chat with anyone interested at #ICML2025!
💡 Curious about long-context foundation models (LCFM)? 🧠 We’re hosting a panel at the LCFM workshop at #ICML2025 on “How to evaluate long-context foundation models?” — We’d love to feature your question! Anything on long-context evaluation or modeling — drop it below / DM me 🎤
I’m beginning to share notes from my upcoming fall 2025 NLP class, Columbia COMS 4705. First up, some notes to help students brush up on math. Vectors, matrices, eigenstuff, probability distributions, entropy, divergences, matrix calculus cs.columbia.edu/~johnhew/coms4…
🔥 Excited to share our new work on reproducibility challenges in reasoning models caused by numerical precision. Ever run the same prompt twice and get completely different answers from your LLM under greedy decoding? You're not alone. Most LLMs today default to BF16 precision,…
🧵 New paper at Findings #ACL2025 @aclmeeting! Not all documents are processed equally well. Some consistently yield poor results across many models. But why? And can we predict that in advance? Work with Steven Koniaev and Jackie Cheung @Mila_Quebec @McGill_NLP #NLProc (1/n)
When it comes to text prediction, where does one LM outperform another? If you've ever worked on LM evals, you know this question is a lot more complex than it seems. In our new #acl2025 paper, we developed a method to find fine-grained differences between LMs: 🧵1/9
We know Attention and its linear-time variants, such as linear attention and State Space Models. But what lies in between? Introducing Log-Linear Attention with: - Log-linear time training - Logarithmic inference cost (in both time and memory) - Hardware-efficient Triton kernels
🤔 Can simple string-matching metrics like BLEU rival reward models for LLM alignment? 🔍 We show that given access to a reference, BLEU can match reward models in human preference agreement, and even train LLMs competitively with them using GRPO. 🫐 Introducing BLEUBERI:
🚀 News! Our 2nd Workshop on Long-Context Foundation Models (LCFM), to be held at ICML 2025 in Vancouver 🇨🇦! If you're working on long-context models, consider submitting your work! 🗓️ DDL: May 22, 2025 (AOE) 🌐 Web: longcontextfm.github.io 🔗 OpenReview: bit.ly/lcfmworkshop
Did I just accidentally order 35 sopapillas to my airbnb? Perhaps
Hanging around NAACL and presenting this Thurs, 4:15 @ ling theories oral session (ballroom 🅱️). Come say hi, will also be eating many a sopapilla
We're excited to round out the OLMo 2 family with its smallest member, OLMo 2 1B, surpassing peer models like Gemma 3 1B or Llama 3.2 1B. The 1B model should enable rapid iteration for researchers, more local development, and a more complete picture of how our recipe scales.
Super honored to win the Language Modeling SAC award! I'll be presenting this work Wednesday in the 2pm poster session in Hall 3; would love to chat with folks there or at the rest of the conference about long context data, ICL, inference time methods, New Mexican food, etc :)
I will be presenting my #ICLR2025 Spotlight work “Better Instruction-Following Through Minimum Bayes Risk” today (Sat) at 3pm! Swing by #205 in hall 3 to chat with me and @psanfernandes
Our Llama 4’s industry leading 10M+ multimodal context length (20+ hours of video) has been a wild ride. The iRoPE architecture I’d been working on helped a bit with the long-term infinite context goal toward AGI. Huge thanks to my incredible teammates! 🚀Llama 4 Scout 🔹17B…