Lei Li
@lileics
Generative AI for language and science. MT, LLM, GenAI Safety, Drug Discovery
The show is on. Welcome to 2025 Generative AI for Biology workshop. 7 invited talks + a panel with 5 panelists + 14 spotlight talks + 121 poster presentations! Huge thanks to the workshop sponsors: Genesis Therapeutics, Genbio AI, and Tencent! genbio-workshop.github.io/2025/

We have an excellent lineup of distinguished speakers at the Gen AI for Bio workshop! Join us in the East Exhibition Hall A on July 18, starting at 8:45am. #GenBio2025 #ICML2025
Hope to see you all tomorrow at the GenAI & Bio workshop!! #ICML2025 Schedule: genbio-workshop.github.io/2025/
We are presenting PPDiff for protein complex design at #ICML2025, West Exhibition Hall B2, #W-119, 11am-1:30pm today (7/17). Come visit! @ZhenqiaoSong Key idea: sequence-structure co-design + hybrid diffusion. Paper: arxiv.org/abs/2506.11420

DISCO paper website: avduarte333.github.io/projects/disco/
#ICML2025 Andre @avduarte3333 and I are presenting DISCO: a new method to discover copyrighted content in a VLM's training data (without access to it). Come visit our poster at Vancouver Convention Center, East Exhibition Hall A #900, at 3pm 7/16. arxiv.org/abs/2502.17358
Just delivered 4 lectures (50 minutes each, 3 hours 20 minutes total) in a row at the Advanced Course on Data Science and Machine Learning (acdl2025.icas.events). Wonderful to have conversations with the ACDL participants! Thanks to the directors, Giuseppe Nicosia and Panos Pardalos.




We are organizing the Generative AI for Biology workshop at #ICML2025. Welcome to submit any relevant work on AI for biomolecules, AI models for biological systems, AI and experiments, agents for bio discovery, new datasets and tools, etc. The deadline is May 25th. genbio-workshop.github.io/2025/
⏰ Deadline extended to May 25th for GenAI and Biology workshop, considering multiple requests & NeurIPS deadline! 🚀Recent submissions to NeurIPS & other conferences/journals are welcome! 🧬For amazing speakers and more details: genbio-workshop.github.io/2025/
Better than LoRA! You only need to train as few as 18 token embeddings of LLaMA to achieve superior translation performance on new languages. KS-Lottery provides a statistically sound method to find an extremely small number of LLM embedding parameters to fine-tune!
I will give a talk at 11:15am today in Ruidoso at #NAACL2025 about KS-Lottery: finding a small number of token embeddings in an LLM that are effective for fine-tuning. Surprising finding: 18 tokens are enough!
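For readers curious how such a selection could look: a minimal sketch, assuming KS-Lottery ranks token embeddings by the two-sample Kolmogorov-Smirnov distance between their pre- and post-fine-tuning value distributions. Function names and the top-k cutoff here are illustrative, not the paper's exact procedure.

```python
# Illustrative sketch, NOT the paper's exact procedure: rank token
# embeddings by the Kolmogorov-Smirnov distance between their pre- and
# post-fine-tuning value distributions, and keep only the top-k tokens
# as trainable embeddings.

def ks_statistic(a, b):
    """Two-sample KS statistic: max gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1
        elif a[i] > b[j]:
            j += 1
        else:            # tie: advance both pointers
            i += 1
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def select_lottery_tokens(emb_before, emb_after, k=18):
    """Pick the k tokens whose embedding value distributions shifted most."""
    scores = {tok: ks_statistic(emb_before[tok], emb_after[tok])
              for tok in emb_before}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Only the selected tokens' embedding rows would then be left trainable; everything else stays frozen.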
How to reduce latency for simultaneous (text) translation? Siqi proposes TAF: the key idea is to forecast source-side continuations of the utterance before the actual input arrives, and then use majority voting over the resulting candidate translations. arxiv.org/abs/2410.22499 #NAACL2025
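The forecast-then-vote idea can be sketched as one decision step, assuming hypothetical `forecast_fn` and `translate_fn` callables standing in for the paper's actual models:

```python
from collections import Counter

def taf_step(src_prefix, forecast_fn, translate_fn, n_samples=5, min_votes=3):
    """One illustrative TAF decision step: sample several possible
    continuations of the source, translate each completed source, and
    emit the next target token only when enough translations agree;
    otherwise return None, i.e. wait for more real input."""
    votes = Counter()
    for _ in range(n_samples):
        continuation = forecast_fn(src_prefix)         # e.g. an LLM sample
        target_tokens = translate_fn(src_prefix + continuation)
        if target_tokens:
            votes[target_tokens[0]] += 1
    if not votes:
        return None
    token, count = votes.most_common(1)[0]
    return token if count >= min_votes else None
```

When the forecasts disagree, the step declines to commit, which is how latency savings trade off against translation quality.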
Excited to be at #NAACL2025 in Albuquerque! We have two papers on simultaneous translation 🎉 1️⃣ Anticipating Future with Large Language Model for Simultaneous Machine Translation 🗓 Apr 30, 11:45–12:00 @ Ruidoso (Oral) 🔗 arxiv.org/abs/2410.22499 2️⃣ CA*: Addressing Evaluation…
Simultaneous translation aims to reduce latency while retaining translation quality, but measuring latency turns out to be non-trivial. Xi and Siqi's new work proposes a highly accurate method, CA*, to measure latency in simultaneous translation by taking actual inference time into account. #NAACL2025



Can AI text detectors identify LLM-generated code, paper reviews, abstracts, translations, and summaries? Brian is presenting a new study of existing AI text detectors on LLM-generated content at #NAACL2025. TL;DR: all existing detectors work poorly. arxiv.org/abs/2412.05139
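For context on how such detectors are typically scored, here is a generic evaluation harness (not the paper's protocol); `detect` is any hypothetical callable returning a score in [0, 1]:

```python
def detector_accuracy(detect, samples, threshold=0.5):
    """Generic scoring harness (illustrative): `detect` maps a text to a
    score in [0, 1]; `samples` is a list of (text, is_ai_generated)
    pairs. Returns classification accuracy at the given threshold."""
    correct = sum((detect(text) >= threshold) == label
                  for text, label in samples)
    return correct / len(samples)
```

Running a harness like this per domain (code, reviews, abstracts, translations, summaries) is what exposes how unevenly detectors generalize.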



Kexun is presenting OSCA (Optimal Sample Compute Allocation) at #NAACL2025 in Hall 3 (#50). The paper presents an optimization algorithm to find optimal configurations for LLM inference. arxiv.org/abs/2410.22480
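A hedged sketch of one way sample-compute allocation can be posed (illustrative only; OSCA's actual objective and algorithm may differ): assume each configuration covers its own slice of problems with a fixed per-sample success rate `p` and per-sample cost `c`, so `n` samples from it yield coverage `1 - (1 - p)**n`, and greedily buy the sample with the best marginal coverage gain per unit cost.

```python
def allocate_samples(configs, budget):
    """Greedy sketch of allocating a sample budget across inference
    configurations. configs maps name -> (p, c) where p is the
    per-sample success rate and c the per-sample cost; the (assumed)
    objective is the sum of per-config coverages 1 - (1 - p)**n."""
    alloc = {name: 0 for name in configs}
    p_all_fail = {name: 1.0 for name in configs}   # P(all samples so far fail)
    spent = 0.0
    while True:
        best, best_gain = None, 0.0
        for name, (p, c) in configs.items():
            if spent + c > budget:
                continue
            gain = p_all_fail[name] * p / c        # marginal coverage per cost
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:
            return alloc                           # budget exhausted
        p, c = configs[best]
        alloc[best] += 1
        p_all_fail[best] *= 1.0 - p
        spent += c
```

The diminishing-returns term `p_all_fail` is what keeps the greedy from dumping the whole budget into a single configuration once its coverage saturates.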



Excited to visit ABQ! We are presenting six papers at #NAACL2025 on simultaneous translation/speech translation, inference-time optimization, finding lottery tickets in LLMs, AI text detection, and language agents for task planning. I am here the full week. Feel free to DM.




This work was partially done with @lileics and @yuxiangw_cs during our time at @UCSB Poster attached for a better overview! 🎯
The 2nd Generative AI and Biology workshop will be co-located with ICML 2025 in Vancouver this year (July 18/19, 2025). CFP: genbio-workshop.github.io/2025/ We have a fantastic lineup of speakers. @MengdiWang10 @ericxing @marinkazitnik @StefanoErmon @MinkaiX @ZhenqiaoSong
Hi everyone, we are so back! Delighted to announce the 2nd Generative AI and Biology (GenBio) Workshop @icmlconf #icml2025! Join us in this exciting discourse on all aspects of the future of #GenerativeAI and Biology!! 🧬🚀 Website: genbio-workshop.github.io/2025/ 1/n
A newly minted Dr.! Congratulations to @WendaXu2 for successfully defending his PhD thesis "On Evaluation and Efficient Post-training for LLMs". Highly recommend his slides, covering RL training, better KD, LLM/text-gen evaluation, and bias in LLM-as-a-judge: docs.google.com/presentation/d…
[Life update] 🎉 I successfully defended my PhD thesis "On Evaluation and Efficient Post-training for LLMs" @ucsbNLP and am officially a PhD! Huge thanks to my advisors @WilliamWangNLP @lileics, my committee @markuseful & Simon Todd, and everyone who supported me during my PhD…
Excited to announce our 2025 keynote speakers: @cosmo_shirley, Nicholas Carlini, @LukeZettlemoyer, and Tom Griffiths!
Congratulations, Dr. Sun @EdwardSun0909! Zhiqing's PhD thesis on scalable alignment of LLMs is a must-read if you work on LLMs.
I successfully defended my PhD thesis today! 🎉 "Scalable Alignment of Large Language Models Towards Truth-Seeking, Complex Reasoning, and Human Values" Slides (Fact-RLHF, Lean-STaR, Easy-to-Hard Generalization, Self-Align, Instructable Reward Model): docs.google.com/presentation/d… A…
A new comprehensive multilingual (and multitask) evaluation suite for LLMs (covering 17 diverse languages), developed by @xuhuang87 and folks! Check out BenchMAX at github.com/CONE-MT/BenchM…
🤩Excited to announce our new work BenchMAX!🥳 BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models Paper: huggingface.co/papers/2502.07… Repo: github.com/CONE-MT/BenchM… Datasets: huggingface.co/collections/LL…