EdinburghNLP
@EdinburghNLP
The Natural Language Processing Group at the University of Edinburgh. BFF with @Imperial_NLP
Join our PhD programme in Designing Responsible Natural Language Processing at the UKRI AI Centre for Doctoral Training, University of Edinburgh. Applications are now re-opened for Home fee status candidates (past candidates need not re-apply). responsiblenlp.org
🚨 New AI Threat Alert: Multilingual LLMs can secretly transfer backdoors from one language to many others. Spanish in, Chinese out, maliciously. Come see how at our poster: 🗓 Today (07/28), 18:00–19:30 📍 Hall 4/5 #ACL2025 #AIsecurity #LLMsafety
🚨 New Paper! (arxiv.org/abs/2404.19597)🚨 We uncover significant vulnerabilities in Multilingual LLMs (MLLMs) (e.g., BLOOM, Llama2, Llama3, Gemma, and GPT-3.5-turbo) to cross-lingual transferable backdoor attacks. #AIsafety #LLMs #backdoors
Inverse Scaling in Test-Time Compute: "We identify five distinct failure modes when models reason for longer: 1) Claude models become increasingly distracted by irrelevant information; 2) OpenAI o-series models resist distractors but overfit to problem framings; 3) models shift…
Anthropic just released a research paper: Inverse Scaling in Test-Time Compute. This study shows that longer reasoning in Large Reasoning Models (LRMs) can hurt performance, revealing a surprising inverse scaling between reasoning length and accuracy. According to this paper,…
#Anthropic’s new paper on inverse scaling at test time is a must-read! 👏 @aryopg @PMinervini @yanda_chen_ @EthanJPerez In our recent work, we found a twist (on a math task): it’s not strictly inverse: performance goes up ⬆️ then down ⬇️. Parallel thinking might fix it. Curious? Link 👇
Recent paper by #Anthropic @aryopg @PMinervini @yanda_chen_ @EthanJPerez Inverse Scaling in Test-Time Compute: arxiv.org/abs/2507.14417. It validates findings from our work published last month: Does test-time scaling always help? x.com/SOURADIPCHAKR1…
*The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs* by @p_nawrot @PontiEdoardo @cheeesio @seb_ruder They study sparse attention techniques at scale, comparing to small dense models at the same compute budget. arxiv.org/abs/2504.17768
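For readers new to the topic, here is a minimal top-k sparse attention sketch in PyTorch. It is just one common sparsification pattern, not the specific methods benchmarked in the paper, and all names are illustrative.

```python
# Minimal sketch of top-k sparse attention (illustrative only, not the paper's code).
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, k_keep=64):
    """q, k, v: (batch, heads, seq_len, head_dim). Keep only the k_keep
    highest-scoring keys per query and mask out the rest before softmax."""
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5    # (b, h, Lq, Lk)
    k_keep = min(k_keep, scores.shape[-1])
    topk_idx = scores.topk(k_keep, dim=-1).indices           # top-k keys per query
    mask = torch.full_like(scores, float("-inf"))
    mask.scatter_(-1, topk_idx, 0.0)                         # 0 where kept, -inf elsewhere
    return F.softmax(scores + mask, dim=-1) @ v

# Toy usage: the same inputs could be fed to dense attention for a quality/compute comparison.
b, h, L, d = 1, 8, 1024, 64
q, k, v = torch.randn(b, h, L, d), torch.randn(b, h, L, d), torch.randn(b, h, L, d)
sparse_out = topk_sparse_attention(q, k, v, k_keep=64)
```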
The amazing folks at @EdinburghNLP will be presenting a few papers at ACL 2025 (@aclmeeting); if you're in Vienna, touch base with them! Here are the papers in the main track 🧵
'Theorem Prover as a Judge for Synthetic Data Generation' has been accepted to ACL (Main) 🚀. Do check us out on July 30th (Wednesday), 11:00–12:30pm at Hall 4/5! A huge thank you to my amazing collaborators: Shay @GiwonHong413849 @WendaLi8 📝: aclanthology.org/2025.acl-long.…
New Anthropic Research: “Inverse Scaling in Test-Time Compute” We found cases where longer reasoning leads to lower accuracy. Our findings suggest that naïve scaling of test-time compute may inadvertently reinforce problematic reasoning patterns. 🧵
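A rough sketch of how such an inverse-scaling curve can be measured: evaluate the same tasks at several reasoning-token budgets and watch whether accuracy drops as the budget grows. `ask_model` and `TASKS` below are hypothetical placeholders, not Anthropic's evaluation harness.

```python
# Illustrative sketch of measuring accuracy vs. test-time reasoning budget.

def ask_model(question: str, max_reasoning_tokens: int) -> str:
    """Placeholder for a call to a reasoning model with a capped thinking budget."""
    return ""  # replace with a real model/API call

TASKS = [("What is 17 * 24?", "408")]  # toy evaluation item

def accuracy_at_budget(budget: int) -> float:
    correct = sum(ask_model(q, max_reasoning_tokens=budget).strip() == a for q, a in TASKS)
    return correct / len(TASKS)

# Inverse scaling shows up when accuracy *drops* as the budget grows.
for budget in (256, 1024, 4096, 16384):
    print(budget, accuracy_at_budget(budget))
```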
Why do AI assistants feel so generic? Our new #ACL2025 paper, PersonaLens🔎, tackles this head-on. We built a new benchmark to test personalization in ways that matter. I'll be presenting our work at the poster session in Vienna next week! 🧵[1/4]
🏆 Our @nvidia KV Cache Compression Leaderboard is now live! Compare state-of-the-art compression methods side-by-side with KVPress. See which techniques are leading in efficiency and performance. 🥇 huggingface.co/spaces/nvidia/…
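For context, a toy sketch of the general score-and-prune idea behind many KV-cache compression methods: drop cached key/value pairs that recent queries attend to least. This is not the KVPress API; all names are illustrative.

```python
# Toy KV-cache pruning sketch (illustrative, not KVPress).
import torch

def prune_kv_cache(keys, values, recent_queries, keep_ratio=0.5):
    """keys, values: (heads, cache_len, head_dim); recent_queries: (heads, q_len, head_dim).
    Keep the fraction of cache positions receiving the most attention mass."""
    scores = recent_queries @ keys.transpose(-2, -1)          # (heads, q_len, cache_len)
    importance = scores.softmax(dim=-1).sum(dim=(0, 1))       # total mass per cache position
    n_keep = max(1, int(keep_ratio * keys.shape[1]))
    keep = importance.topk(n_keep).indices.sort().values      # preserve original order
    return keys[:, keep], values[:, keep]

# Usage on random tensors
h, L, d = 8, 512, 64
keys, values, queries = torch.randn(h, L, d), torch.randn(h, L, d), torch.randn(h, 32, d)
small_k, small_v = prune_kv_cache(keys, values, queries, keep_ratio=0.25)
```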
Many thanks to the @ActInterp organisers for highlighting our work - and congratulations to Pedro, Alex and the other awardees! Sad not to have been there in person, it looked like a fantastic workshop. @AmsterdamNLP @EdinburghNLP
Big congrats to Alex McKenzie, Pedro Ferreira, and their collaborators on receiving Outstanding Paper Awards!👏👏 and thanks for the fantastic oral presentations! Check out the papers here 👇
I'll be hiring a couple of Ph.D. students at CMU (via LTI or MLD) in the upcoming cycle! If you are interested in joining my group, please read the FAQ before reaching out to me via email :) docs.google.com/document/d/12V…
Are you interested in the intersection of Mathematics and NLP? Consider submitting your paper to #MathNLP 2025: The 3rd Workshop on Mathematical NLP. #EMNLP2025. Submissions will open on June 25! Take a look here for more details sites.google.com/view/mathnlp20…
Transformers struggle with length generalization and long context. What can we do about it? Our new #TMLR paper with @rolandalong, @paul_smolensky and @JianfengGao0217 shows how to handle the issue using a new attention mechanism called TRA. Curious? Read the 🧵 for more 🤓
Thanks to everyone who stopped by our work! If you missed it and want to know more, just drop me a message! #ICML2025
Spotlight poster coming soon at #ICML2025 @icmlconf! 📌East Exhibition Hall A-B E-1806 🗓️Wed 16 Jul 4:30 p.m. PDT — 7 p.m. PDT 📜arxiv.org/pdf/2410.12537 Let’s chat! I’m always up for conversations about knowledge graphs, reasoning, neuro-symbolic AI, and benchmarking.
We blend imitation (SFT) and exploration (RLVR) in post-training with a simple idea: Sample a prefix of an SFT demonstration, let your policy model complete it, and mix it with other RLVR rollouts Intuitively, the model relies more on hints for problems currently out of reach
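A minimal sketch of that recipe, with placeholder helpers (`policy_complete`, `reward_fn`) standing in for the actual policy model and verifiable reward; this is not the paper's implementation, just the prefix-and-mix idea.

```python
# Sketch of mixing prefix-guided completions with plain RLVR rollouts.
import random

def sample_prefix_hint(solution: str) -> str:
    """Take a random-length prefix of an SFT demonstration to use as a hint."""
    cut = random.randint(0, len(solution))   # longer prefix = stronger hint
    return solution[:cut]

def make_mixed_batch(policy_complete, reward_fn, demos, prompts, prefix_fraction=0.25):
    """policy_complete(text) -> str and reward_fn(prompt, response) -> float
    are placeholders for the policy model and the verifiable reward."""
    batch = []
    for prompt, solution in demos:
        if random.random() < prefix_fraction:
            hint = sample_prefix_hint(solution)
            response = hint + policy_complete(prompt + hint)
            batch.append((prompt, response, reward_fn(prompt, response)))
    for prompt in prompts:                    # ordinary on-policy RLVR rollouts
        response = policy_complete(prompt)
        batch.append((prompt, response, reward_fn(prompt, response)))
    return batch

# Toy usage with dummy stand-ins for the policy and reward
batch = make_mixed_batch(lambda text: " ...", lambda p, r: 0.0,
                         demos=[("Prove 1+1=2.", "By the Peano axioms, ...")],
                         prompts=["Solve x^2 = 4."])
```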
🚀 Introducing Prefix-RFT to blend SFT and RFT! SFT can learn more complex problems by mimicking, but can have poor generalization. RFT has better overall performance but is limited by the initial policy. Our method, Prefix-RFT, makes the best of both worlds!
I hope somebody mentioned pixel-based models: arxiv.org/abs/2401.03321 @tetraduzione
most controversial statement so far from @alisawuffles: "tokenization research is not as cool" **very vocal disagreement from the crowd of tokenization nerds**
🚨New paper alert!🚨 "Scalpel vs. Hammer: GRPO Amplifies Existing Capabilities, SFT Replaces Them" @ActInterp ICML'25 @deepseek_ai popularised RLVR and distillation for 'reasoning training'! But how do they differ under the hood? Details in 🧵: (1/8)
🚨Is complex query answering really complex?🚨 unfortunately not! the current benchmarks boil down to link prediction 98% of the time... how to fix this??? 👇👇👇 📜arxiv.org/abs/2410.12537 with @c_gregucci @BoXiongs @loreloc_ @PMinervini @ststaab
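A toy illustration (not the paper's code) of what "boils down to link prediction" can mean: when one hop of a 2-hop query is already covered by a triple seen at training time, only a single missing link is actually left to predict.

```python
# Known at training time:
train_triples = {("alan_turing", "born_in", "london")}

# "Complex" 2-hop test query: in which country was Alan Turing born?
#   ?city    : born_in(alan_turing, ?city)
#   ?country : located_in(?city, ?country)

# Hop 1 is already answered by a memorised training triple...
city = next(c for (h, r, c) in train_triples if h == "alan_turing" and r == "born_in")

# ...so the "complex" query collapses to a single link-prediction call:
print("Effective task: predict", (city, "located_in", "?"))
```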