Lisa Alazraki
@LisaAlazraki
#ML & #NLProc PhD student @ImperialCollege. Research Scientist Intern @AIatMeta; prev. @Cohere, @GoogleAI. Interested in reasoning, planning, robustness. She/her
Thrilled to share our new preprint on Reinforcement Learning for Reverse Engineering (RLRE) 🚀 We demonstrate that human preferences can be reverse engineered effectively by pipelining LLMs to optimise upstream preambles via reinforcement learning 🧵⬇️
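The tweet describes a pipeline where an upstream LLM is trained with RL to produce preambles that steer a frozen downstream model toward outputs a reward model prefers. Below is a deliberately tiny, self-contained sketch of that structure only (not the paper's code): a toy policy learns, via REINFORCE, to emit "preamble" tokens that score well under a stand-in reward; the vocabulary, target token, and reward are all made up for illustration.

import torch
import torch.nn as nn

VOCAB, PREAMBLE_LEN = 16, 4

class UpstreamPolicy(nn.Module):
    # Tiny autoregressive policy that emits preamble tokens one at a time.
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, 32)
        self.rnn = nn.GRU(32, 32, batch_first=True)
        self.head = nn.Linear(32, VOCAB)

    def sample(self):
        tokens, logps, h = [], [], None
        inp = torch.zeros(1, 1, dtype=torch.long)  # BOS token id 0
        for _ in range(PREAMBLE_LEN):
            out, h = self.rnn(self.embed(inp), h)
            dist = torch.distributions.Categorical(logits=self.head(out[:, -1]))
            tok = dist.sample()
            tokens.append(tok)
            logps.append(dist.log_prob(tok))
            inp = tok.unsqueeze(0)
        return torch.stack(tokens, 1), torch.stack(logps, 1)

# Stand-in for the frozen downstream LLM plus reward model: the "reward" is just
# how many preamble tokens equal a target id, in place of a preference score.
TARGET = 7
def downstream_reward(preamble):
    return (preamble == TARGET).float().mean()

policy = UpstreamPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
for step in range(200):
    preamble, logps = policy.sample()
    reward = downstream_reward(preamble)
    loss = -(reward * logps.sum())  # REINFORCE: reinforce preambles that scored well
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 50 == 0:
        print(step, round(reward.item(), 2))

In the actual setup the reward would come from a learned reward model scoring the downstream model's response, and the policy would be an LLM; the toy only shows where the gradient signal enters.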

I’ll be at #acl2025 next week in Vienna🇦🇹 presenting our work. Looking forward to meeting old and new friends! If you’d like to discuss chatbots, code generation, multi-agent systems, robustness, hate speech detection, or just life😆, feel free to reach out, I’d love to chat!
🙌Happy to share that our paper “DiffuseDef: Improved Robustness to Adversarial Attacks via Iterative Denoising” has been accepted to #ACL2025! Many thanks to my co-authors Huichi Zhou, @MarekRei, @lspecia! Arxiv: arxiv.org/abs/2407.00248 GitHub: github.com/Nickeilf/Diffu…
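The title points at the core idea: apply a denoising module several times at inference to pull adversarially perturbed representations back toward clean ones before classifying. A minimal toy sketch of that loop, under my own assumptions (random untrained modules, made-up dimensions, nothing from the released code):

import torch
import torch.nn as nn

DIM = 64
# In the real setup this would be trained to remove noise from encoder hidden
# states; here it is just a random MLP to show where it sits in the pipeline.
denoiser = nn.Sequential(nn.Linear(DIM, DIM), nn.ReLU(), nn.Linear(DIM, DIM))
classifier = nn.Linear(DIM, 2)

def defend(hidden: torch.Tensor, steps: int = 5) -> torch.Tensor:
    # Apply the denoiser repeatedly before classifying; the intuition is that
    # iterative denoising pulls perturbed features back toward clean ones.
    for _ in range(steps):
        hidden = denoiser(hidden)
    return classifier(hidden)

clean = torch.randn(1, DIM)                    # stand-in for an encoder hidden state
attacked = clean + 0.3 * torch.randn(1, DIM)   # stand-in for an adversarial perturbation
print(defend(attacked))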
ACL is almost here 🎉 Our Imperial NLP community will be presenting several papers at the conference next week. We look forward to seeing everyone in Vienna!
If you are attending ICML 2025 and are interested in causal discovery, I will be presenting our work today at 11am! See you at the East Exhibition Hall, Poster 1303. #icml #icml2025
Excited to share that our paper "Causal Discovery from Conditionally Stationary Time Series" has been accepted to ICML 2025!🥳 Pre-print: arxiv.org/abs/2110.06257 Thank you very much to all my collaborators, persistence pays off! #icml #icml2025
A new paper with @aliceiannntn and @MatteoCasiragh3, analyzing Italian tweets on feminism and tracking toxicity and shifting connotations over time, is being presented today at the LLM for political science workshop at @europsa 📉📈
LLMs can be programmed by backprop 🔎 In our new preprint, we show they can act as fuzzy program interpreters and databases. After being ‘programmed’ with next-token prediction, they can retrieve, evaluate, and even *compose* programs at test time, without seeing I/O examples.
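To make the recipe concrete, here is a purely illustrative sketch (my own made-up function names and prompt formats, not the preprint's data): the model is "programmed" by next-token prediction on program source code alone, and only at test time is it asked to retrieve, evaluate, or compose those programs.

# Training documents contain only code, never input/output examples.
programs = {
    "add_two": "def add_two(x):\n    return x + 2",
    "double":  "def double(x):\n    return x * 2",
}
train_docs = [f"# program: {name}\n{src}\n" for name, src in programs.items()]

# Test-time prompts probe retrieval, evaluation, and composition.
test_prompts = [
    "Recall the source of program add_two:",   # retrieval
    "add_two(5) =",                            # evaluation
    "double(add_two(5)) =",                    # composition
]

for doc in train_docs:
    print(doc)
for p in test_prompts:
    print(p)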
🥳Thrilled to announce that I started as an Assistant Professor @imperialcollege! Grateful to all my mentors and collaborators from @ucl, @UniofOxford, and worldwide! I will continue working on 🌟Alignment and A(G)I Safety🌟 and 🚨 I have funding for PhD students 🚨. I look…
(1/9) LLMs can regurgitate memorized training data when prompted adversarially. But what if you *only* have access to synthetic data generated by an LLM? In our @icmlconf paper, we audit how much information synthetic data leaks about its private training data 🐦🌬️
At @RLDMDublin2025 this week to present our work on incorporating diverse prior knowledge in RL (sample efficiency, safety, interpretability,...) Poster #94 on Thursday Full paper here: arxiv.org/abs/2306.01158 #RLDM2025
How good can privacy attacks against LLM pretraining get if you assume a very strong attacker? Check it out in our preprint ⬇️
Are modern large language models (LLMs) vulnerable to privacy attacks that can determine whether given data was used for training? Models and datasets are quite large, so what should we even expect? Our new paper looks into exactly this question. 🧵 (1/10)
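For readers unfamiliar with this attack family, the weakest common baseline is a loss-based membership-inference test: score candidate texts by the model's per-token loss and flag unusually low-loss texts as likely training members. A minimal sketch with Hugging Face transformers (model choice, threshold, and example strings are my own illustrative assumptions, not the paper's attack, and it requires downloading GPT-2):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def per_token_loss(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)   # causal LM loss = mean NLL per token
    return out.loss.item()

candidates = [
    "The quick brown fox jumps over the lazy dog.",  # common string, likely low loss
    "zqxjv wpmtr flrgh ksdnb ouyce",                 # gibberish, likely high loss
]
THRESHOLD = 4.0                                      # illustrative cutoff only
for text in candidates:
    loss = per_token_loss(text)
    print(f"{loss:.2f}  member? {loss < THRESHOLD}  |  {text[:40]}")

Stronger attackers, as in the preprint, can do considerably better than this single-threshold baseline.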
🚨 New preprint alert – accepted at #ACL2025 Findings!🚨
If anyone has done any work improving the robustness of NLI models and we didn't cite you in our appendix, please share a link to your work - I would love to include it. Our appendix related-work section on NLI robustness is a bit of a monster 😅🧌 x.com/_joestacey_/st…
This work was really fun and a great last paper for my PhD. Check it out 🙂 arxiv.org/abs/2505.20209 P.S. if you know about a paper improving NLI model robustness not already in our related work appendix, I would love to hear about it🥰
If you're interested in NLI or model robustness, this is a nice and fun paper - check it out 🙂 x.com/_joestacey_/st…
We have a new paper up on arXiv! 🥳🪇 The paper tries to improve the robustness of closed-source LLMs fine-tuned on NLI, assuming a realistic training budget of 10k training examples. Here's a 60 second rundown of what we found!
Can LLMs be incentivised to generate token sequences (in this case preambles) that condition downstream models to improve performance when judged by reward models? Yes! ✅