Jessy Li
@jessyjli
Associate Professor @UT_Linguistics, computational linguistics and #NLProc
Excited to share that QUDsim has been accepted to #COLM2025!! 🎉🎉
Have that eerie feeling of déjà vu when reading model-generated text 👀, but can’t pinpoint the specific words or phrases? ✨We introduce QUDsim to quantify discourse similarities beyond lexical, syntactic, and content overlap.
Check out this new opinion piece from Sebastian and Lily! We have really powerful AI systems now, so what’s the bottleneck preventing the wider adoption of fact checking systems, in high stakes scenarios like medicine? It’s how we define the tasks 👇
Are we fact-checking medical claims the right way? 🩺🤔 Probably not. In our study, even experts struggled to verify Reddit health claims using end-to-end systems. We show why—and argue fact-checking should be a dialogue, with patients in the loop arxiv.org/abs/2506.20876 🧵1/
I am excited to present our study on information salience in LLMs today at #ACL2025NLP (x4/x5, Tue, 16:00--17:30). Please come by if you are interested! 📝 Behavioral Analysis of Information Salience in Large Language Models With @jschloetterer @jessyjli @SeifertChristin
Do you want to know what information LLMs prioritize in text synthesis tasks? Here's a short 🧵 about our new paper: an interpretable framework for salience analysis in LLMs. First of all, information salience is a fuzzy concept. So how can we even measure it?
Looking forward to attending #cogsci2025! I’m especially excited to meet students who will be applying to PhD programs in Computational Ling/CogSci in the coming cycle. Please reach out if you want to meet up and chat! Email is best, but DM also works if you must. Quick 🧵:
Tuesday at #ACL2025: @jantrienes will be presenting this from 4-5:30pm in x4/x5! Turns out LLMs’ content selection is highly consistent across models, but not so much with their own notion of importance or with humans’…
🇦🇹 I’m on my way to #ACL2025 to help present two papers (🧵s below) ➡️ MAT-Steer (07/30 at 11am), our method for steering LLMs w/ multiple attributes (e.g. truthfulness, bias reduction, and toxicity mitigation) simultaneously. ➡️ LAQuer (07/28 at 11am), a new task/framework for…
Extremely excited to announce that I will be joining @UTAustin @UTCompSci in August 2025 as an Assistant Professor! 🎉 I’m looking forward to continuing to develop AI agents that interact/communicate with people, each other, and the multimodal world. I’ll be recruiting PhD…
Heading to Vienna tomorrow for #ACL2025! Monday: Will Sheffield will be presenting his work on just the sneaky little discourse particle called JUST 😁 Just stop by Hall X4/X5 6-7:30pm CEST! Paper aclanthology.org/2025.findings-… w/ @kanishkamisra @valentina__py Ashwini Deo @kmahowald

New study on LMs and discourse sensitivity! We evaluate 25 LMs on their ability to prioritize discourse-relevant info, and find that (1) smaller & dialogue-trained models align more closely with human patterns, while (2) larger/instruction-tuned models overuse structural or discourse cues.
Welcome to UT, Jiaxin!!!! 🥳
Life Update: I will join @UTiSchool as an Assistant Professor in Fall 2026 and will continue my work on LLM, HCI, and Computational Social Science. I'm building a new lab on Human-Centered AI Systems and will be hiring PhD students in the coming cycle!
"Seeing" robins and sparrows may not necessarily make them birdier to LMs! Super excited about this paper -- massive shoutout to all my co-authors, especially @yulu_qin and @dhevarghese for leading the charge!
Does vision training change how language is represented and used in meaningful ways?🤔 The answer is a nuanced yes! Comparing VLM-LM minimal pairs, we find that while the taxonomic organization of the lexicon is similar, VLMs are better at _deploying_ this knowledge. [1/9]
What are patients saying about GLP-1 meds and cancer risk? We analyzed 400K+ Reddit posts using an AI-powered pipeline, revealing major communication gaps: cancer-risk discussions were rare overall, and of those only 19% mentioned talking to a doctor. 🧵 @JAMANetworkOpen
👇Happening this afternoon 4:30pm! Come meet @Yurochkin_M, @RayaHoresh, and me at East Exhibition Hall #1103. 📍I’m also on the industry job market this coming year! Let’s connect and chat about opportunities in the industry :)
I'll be at #icml2025 @icmlconf to present SPRI next week! Come by our poster on Tuesday, July 15, 4:30pm, and let’s catch up on LLM alignment! 😃 🚀TL;DR: We introduce Situated-PRInciples (SPRI), a framework that automatically generates input-specific principles to align…
Happy to share that EvalAgent has been accepted to #COLM2025 @COLM_conf 🎉🇨🇦 We introduce a framework to identify implicit and diverse evaluation criteria for various open-ended tasks! 📜 arxiv.org/pdf/2504.15219
Evaluating language model responses on open-ended tasks is hard! 🤔 We introduce EvalAgent, a framework that identifies nuanced and diverse criteria 📋✍️. EvalAgent finds 👩🏫🎓 expert advice on the web that implicitly addresses the user’s prompt 🧵👇
If you’ll be at #icml2025, check out Hongli’s work on context-specific principles!
We have very good frameworks for cooperative dialog… but how about the opposite? @Asher_Zheng00’s new paper takes a game-theoretic view and develops new metrics to quantify non-cooperative language ♟️ Turns out LLMs don’t have the pragmatic capabilities to perceive these…
Language is often strategic, but LLMs tend to play nice. How strategic are they really? Probing into that is key for future safety alignment.🛟 👉Introducing CoBRA🐍, a framework that assesses strategic language. Work with my amazing advisors @jessyjli and @David_Beaver! 🧵👇
CosmicAI collab: benchmarking the utility of LLMs in astronomy coding workflows & focusing on the key research capability of scientific visualization. @sebajoed @jessyjli @Murtazahusaintx @gregd_nlp @StephaJuneau @paultorrey9 Adam Bolton, Stella Offner, Juan Frias, Niall Gaffney
How good are LLMs at 🔭 scientific computing and visualization 🔭? AstroVisBench tests how well LLMs implement scientific workflows in astronomy and visualize results. SOTA models like Gemini 2.5 Pro & Claude 4 Opus only match ground truth scientific utility 16% of the time. 🧵
Is AI ready to play a real role in science? This work with @CosmicAI_Inst evaluates LLMs targeting the implementation of scientific workflows, and the scientific utility of visualizations from LLM-generated code -- and the answer is not yet, even with the best SOTA models 👇
Super thrilled that @kanishkamisra is going to join @UT_Linguistics as our newest computational linguistics faculty member -- looking forward to doing great research together! 🧑🎓Students: Kanishka is a GREAT mentor -- apply to be his PhD student in the upcoming cycle!!
News🗞️ I will return to UT Austin as an Assistant Professor of Linguistics this fall, and join its vibrant community of Computational Linguists, NLPers, and Cognitive Scientists!🤘 Excited to develop ideas about linguistic and conceptual generalization! Recruitment details soon