Genglin Liu
@genglin_liu
PhD student @UCLA, #NLP
Excited to share my first project at UCLA! We built MOSAIC — a social network simulator where LLM-powered agents behave like real users on social media. They post, share, flag, and debate the factuality of news content — all at scale. It’s open-source. 🧵 TL;DR 🌐 Realistic…
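A minimal sketch of the post / share / flag loop a MOSAIC-style simulator runs at each timestep. All class and function names below are hypothetical illustrations, not the actual MOSAIC API; in the real system an LLM call, not a coin flip, would decide each agent's action.

```python
# Hypothetical sketch of one step in a MOSAIC-style social simulation.
import random
from dataclasses import dataclass

@dataclass
class Post:
    author: str
    text: str
    flags: int = 0
    shares: int = 0

@dataclass
class Agent:
    name: str
    persona: str

    def act(self, feed: list[Post]) -> None:
        """One timestep: read the feed, then flag or share each post."""
        for post in feed:
            # In the real system an LLM judges factuality given the
            # agent's persona; random stands in for that judgment here.
            if random.random() < 0.1:
                post.flags += 1    # agent doubts the claim
            elif random.random() < 0.2:
                post.shares += 1   # agent amplifies the claim

def run_step(agents: list[Agent], feed: list[Post]) -> None:
    for agent in agents:
        agent.act(feed)

agents = [Agent(f"user{i}", persona="skeptical reader") for i in range(5)]
feed = [Post("newsbot", "Breaking: ...")]
run_step(agents, feed)
print(feed[0].flags, feed[0].shares)
```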

🌐 Are LLM agents prepared to navigate the rich diversity of cultural and social norms? 🏠 CASA tests them on real-world tasks like online shopping and social discussion forums, revealing that current agents show less than 10% awareness and over 40% norm violations. 🧠 We’re…
As an alternative to RLHF and adversarial training, we released short-circuiting. It makes models ~100x more robust. It works for LLMs, multimodal models, and agents. Unlike before, I now think robustly stopping models from generating harmful outputs may be highly tractable and…
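A minimal sketch of the rerouting objective behind short-circuiting (circuit breakers), as I understand the paper: on harmful data, push the fine-tuned model's hidden states to be orthogonal to the frozen original model's (a ReLU'd cosine similarity), while a retain loss keeps benign representations unchanged. Shapes and loss weighting here are illustrative, not the paper's exact recipe.

```python
# Sketch of a circuit-breaker-style loss over hidden states.
import torch
import torch.nn.functional as F

def circuit_breaker_loss(h_harm_new, h_harm_orig,
                         h_benign_new, h_benign_orig,
                         alpha: float = 1.0, beta: float = 1.0):
    # Reroute: penalize remaining alignment with the original model's
    # representations on harmful inputs.
    reroute = F.relu(F.cosine_similarity(h_harm_new, h_harm_orig, dim=-1)).mean()
    # Retain: keep benign representations close to the original model's.
    retain = (h_benign_new - h_benign_orig).norm(dim=-1).mean()
    return alpha * reroute + beta * retain

# Toy tensors standing in for hidden states of shape [batch, hidden_dim].
h = lambda: torch.randn(4, 16)
print(circuit_breaker_loss(h(), h(), h(), h()))
```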
🚨Thrilled to share our new work: AI debate combats misinformation better than single AI advisors! 🤔We tested if two AIs debating opposite sides helps biased humans judge controversial COVID-19 claims more accurately. Paper: arxiv.org/abs/2506.02175 🧵👇 #AI #Debate
🚨 Excited to share our new paper on 𝕏-Teaming! 🤖 Multi-agent system for multi-turn jailbreaking 🔍 96.2% attack success against Claude 3.7 (immune to single-turn attacks!) 💥 Up to 98.1% attack success on leading models 🛡️ Released a 30K-example safety dataset 🧵below #AI #LLMSafety
🔍 New findings on knowledge overshadowing! Why do LLMs hallucinate even when trained entirely on true data? 🤔 Can we predict hallucinations even before model training or inference? 🚀 Check out our new preprint: arxiv.org/pdf/2502.16143 The Law of Knowledge Overshadowing: Towards…
📱 Current mobile agents struggle with real-world tasks that align with human needs—like finding the best deal across 3 apps. 💸 Introducing Mobile-Agent-E: a novel mobile assistant designed for complex, long-horizon tasks and capable of self-evolving 🐣🐥 through experience. 🧵1/3
I'm thrilled to share that our Delphi paper is officially published today at @NatMachIntell after almost four years of hard work from all my amazing collaborators (quite an insane timeline considering how fast the AI world moves)! Special thanks to the unwavering support of my advisor,…
📢 A single line of code to thoroughly evaluate your LLM for Embodied Decision Making 📢 Please check out our new NeurIPS D&B Oral paper!! (Part 1 of my summer internship work @StanfordSVL)
[NeurIPS D&B Oral] Embodied Agent Interface: Benchmarking LLMs for Embodied Agents A single line of code to evaluate your model! 🌟Standardize Goal Specifications: LTL 🌟Standardize Modules and Interfaces: 4 modules, 438 tasks, 1475 goals 🌟Standardize Fine-grained Metrics: 18…
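A toy illustration of the LTL-style goal specifications the benchmark standardizes: a goal is a temporal formula checked against a trajectory of symbolic states. The checker below is a deliberately simplified stand-in (names and semantics made up for illustration), not the benchmark's actual one-line API; see the official repo for that.

```python
# Toy LTL-style goal checking over a discrete trajectory of states.
from typing import Callable, List, Set

State = Set[str]  # set of propositions true at one timestep

def eventually(p: str) -> Callable[[List[State]], bool]:
    """F p: proposition p holds at some point in the trajectory."""
    return lambda traj: any(p in s for s in traj)

def always(p: str) -> Callable[[List[State]], bool]:
    """G p: proposition p holds at every point in the trajectory."""
    return lambda traj: all(p in s for s in traj)

traj = [{"holding(apple)"},
        {"holding(apple)", "near(fridge)"},
        {"in(apple, fridge)"}]
goal = eventually("in(apple, fridge)")
print(goal(traj))  # True: the goal state is reached at the end
```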
If you're attending ICML 2024, join my 2-hour tutorial on Monday, July 22 to explore the Physics of Language Models - all 6 parts. Visit physics.allen-zhu.com; it will also be live-streamed on Zoom. BONUS: this is the premiere of Parts 2.1 + 2.2, don't miss out! #ICML2024 #MetaAI
Excited to share that our R-Tuning won an Outstanding Paper Award @NAACL 2024! Take a look at this paper to see how to align your LLMs to honesty: arxiv.org/abs/2311.09677 This work was done during my visit to UIUC. Thanks to Prof. Ji and Prof. Zhang for their supervision!
We have won two NAACL2024 Outstanding Paper Awards! Congratulations to Chi Han, Shizhe Diao, Yi Fung, Xingyao Wang, Yangyi Chen and all students and collaborators! Chi Han @Glaciohound will be on academic job market next year! arxiv.org/pdf/2308.16137 arxiv.org/pdf/2311.09677
🎖 Excited to receive an outstanding paper award at NAACL2024 for LM-Infinite "Zero-Shot Extreme Length Generalization for Large Language Models" work! We extend to 200M length with no parameter updates, with downstream improvements arxiv.org/abs/2308.16137 github.com/Glaciohound/LM…
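A minimal sketch of the Λ-shaped attention mask at the heart of LM-Infinite: each query attends only to the first few tokens (the global branch) plus a sliding window of recent tokens (the local branch). Window sizes below are illustrative, and this sketch omits the relative-distance ceiling the paper also applies.

```python
# Λ-shaped attention mask, the LM-Infinite length-generalization idea.
import torch

def lambda_mask(seq_len: int, n_global: int = 4, n_local: int = 1024) -> torch.Tensor:
    """Boolean mask[i, j] = True where query i may attend to key j."""
    i = torch.arange(seq_len).unsqueeze(1)
    j = torch.arange(seq_len).unsqueeze(0)
    causal = j <= i
    global_branch = j < n_global      # always keep the earliest tokens
    local_branch = (i - j) < n_local  # plus a sliding local window
    return causal & (global_branch | local_branch)

mask = lambda_mask(8, n_global=2, n_local=3)
print(mask.int())  # Λ shape: left columns + band along the diagonal
```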
From Claude 100K to Gemini 10M, we are in the era of long-context language models. Why, and how, can a language model utilize information at any location within a long context? We discover retrieval heads, a special type of attention head responsible for long-context factuality.
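A minimal sketch of the retrieval-score idea used to find such heads: when the model copies a token verbatim from the context (e.g., a planted "needle"), check whether a given head's strongest attention points at that token's source position; heads that do this consistently are retrieval heads. Tensor shapes and the helper name are illustrative.

```python
# Scoring one attention head's "retrieval" behavior.
import torch

def retrieval_score(attn: torch.Tensor,
                    copy_steps: list[tuple[int, int]]) -> float:
    """attn: [num_steps, seq_len], one head's attention at each decode step.
    copy_steps: (decode_step, source_position) pairs where the emitted
    token was copied verbatim from the context."""
    hits = sum(int(attn[t].argmax().item() == src) for t, src in copy_steps)
    return hits / max(len(copy_steps), 1)

# Toy example: at step 0 the head attends hardest to position 5,
# which is exactly where the copied token lives; at step 1 it misses.
attn = torch.zeros(2, 10)
attn[0, 5] = 1.0
attn[1, 2] = 1.0
print(retrieval_score(attn, [(0, 5), (1, 7)]))  # 0.5
```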
Large multimodal models often lack the precise low-level perception needed for high-level reasoning, even with simple vector graphics. We bridge this gap by proposing an intermediate symbolic representation that leverages LLMs for text-based reasoning. mikewangwzhl.github.io/VDLM 🧵1/4
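A toy illustration of the intermediate-symbolic-representation idea: convert low-level vector graphics (raw SVG) into textual primitives an LLM can reason over. The primitive schema here is made up for illustration; see the VDLM page for the actual representation.

```python
# Turn raw SVG shapes into a textual description for an LLM prompt.
import re

svg = '<svg><circle cx="10" cy="10" r="5"/><rect x="0" y="0" width="4" height="4"/></svg>'

primitives = []
for cx, cy, r in re.findall(r'<circle cx="([\d.]+)" cy="([\d.]+)" r="([\d.]+)"', svg):
    primitives.append(f"circle(center=({cx},{cy}), radius={r})")
for x, y, w, h in re.findall(r'<rect x="([\d.]+)" y="([\d.]+)" width="([\d.]+)" height="([\d.]+)"', svg):
    primitives.append(f"rectangle(top_left=({x},{y}), size=({w},{h}))")

# This text goes into the LLM prompt instead of pixels:
print("; ".join(primitives))
```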