Caiqi Zhang
@caiqizh
PhD student at the University of Cambridge
🔥 We teach LLMs to say how confident they are on the fly during long-form generation. 🤩No sampling. No slow post-hoc methods. Not limited to short-form QA! ‼️Just output confidence in a single decoding pass. ✅Better calibration! 🚀 20× faster runtime. arXiv:2505.23912 👇
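A minimal sketch of the idea, assuming the model interleaves machine-readable confidence tags with its claims; the `<conf>` tag format below is an illustrative assumption, not the paper's actual output scheme (see arXiv:2505.23912 for the real method):

```python
import re

# Illustrative only: the <conf> tag scheme is an assumption, not the
# paper's actual format. The point is that ONE decoding pass yields
# both the text and per-claim confidence -- no sampling, no second call.
generation = (
    "Marie Curie won two Nobel Prizes <conf>0.95</conf>, "
    "and she was born in 1867 <conf>0.88</conf>."
)

# Recover (claim, confidence) pairs by parsing what the model emitted.
for claim, conf in re.findall(r"(.*?)\s*<conf>([0-9.]+)</conf>", generation):
    print(f"{float(conf):.2f}  {claim.strip(' ,')}")
```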


I just met Caiqi Zhang – one of the users of our Python framework for uncertainty quantification, LM-Polygraph. It’s incredibly rewarding to see our work helping other researchers achieve outstanding results! LM-Polygraph: github.com/IINemo/lm-poly… #EMNLP2024 #Uncertainty #NLP
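For readers who haven't tried it, a usage sketch modeled on the LM-Polygraph README; import paths, class names, and the chosen model/estimator are assumptions that may differ across versions, so check the repo:

```python
# Sketch after the LM-Polygraph README (github.com/IINemo/lm-polygraph);
# module paths and names may vary by version -- consult the repo docs.
from lm_polygraph.utils.model import WhiteboxModel
from lm_polygraph.estimators import MeanTokenEntropy
from lm_polygraph.utils.manager import estimate_uncertainty

model = WhiteboxModel.from_pretrained("bigscience/bloomz-560m")
estimator = MeanTokenEntropy()  # one of many built-in uncertainty estimators
print(estimate_uncertainty(model, estimator, input_text="Who is George Bush?"))
```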
🚨 [Call for Papers] SEA @ NeurIPS 2025 🚨 Scaling Environments for Agents (SEA) Workshop 📅 December 6, 2025 | 📍 San Diego, USA We're excited to invite submissions to the SEA Workshop at NeurIPS 2025! 🧵1/n
Thrilled to announce that our paper, “Conformity in Large Language Models,” has been accepted to the ACL 2025 Main Conference! 🎉 Looking forward to presenting our findings at ACL 2025 in Vienna this July! #AI #NLP #LLMs #Conformity #MachineLearning #AIResearch #Psychology #ACL
Inception Labs and Gemini Diffusion are hot these days. Just published a blog post on Diffusion Language Models! 🚀 Exploring how diffusion (yes, the image model kind) can be used for text generation. Check it out👇 spacehunterinf.github.io/blog/2025/diff… #NLP #LLMs #DiffusionModels
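As a taste of the idea (a toy sketch, not from the post): masked-diffusion text models start from all-masked tokens and unmask a few positions per step, rather than decoding left-to-right. A trained model would predict the revealed tokens; this stub substitutes a known target purely to visualize the schedule:

```python
import random

# Toy reverse diffusion over text: begin fully masked, then reveal a
# fraction of positions at each step instead of generating left-to-right.
target = "diffusion models generate text by iterative denoising".split()
tokens = ["[MASK]"] * len(target)

steps = 4
for step in range(steps):
    masked = [i for i, t in enumerate(tokens) if t == "[MASK]"]
    reveal = random.sample(masked, max(1, len(masked) // (steps - step)))
    for i in reveal:
        tokens[i] = target[i]  # a real model would *predict* this token
    print(f"step {step + 1}: {' '.join(tokens)}")
```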
🚀Let’s Think Only with Images. No language and No verbal thought.🤔 Let’s think through a sequence of images💭, like how humans picture steps in their minds🎨. We propose Visual Planning, a novel reasoning paradigm that enables models to reason purely through images.
🚨 New paper: "Supposedly Equivalent Facts That Aren’t? Entity Frequency in Pre-training Induces Asymmetry in LLMs" Insight: LLMs treat equivalent facts differently due to bias from pre-training data. 🔗 Arxiv: arxiv.org/abs/2503.22362 #NLP #LLMs #AI
🔥Are we ranking LLMs correctly?🔥 Large Language Models (LLMs) are widely used as automatic judges, but what if their rankings are unstable?😯Our latest study finds non-transitivity in LLM-as-a-judge evaluations—where A > B, B > C, but… C > A?! 🔄
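Concretely, the failure mode can be checked in a few lines; the verdicts below are made up to stand in for real judge calls:

```python
# Toy pairwise verdicts; in practice each entry would come from an LLM judge.
prefers = {("A", "B"): "A", ("B", "C"): "B", ("A", "C"): "C"}

def winner(x, y):
    """Preferred model between x and y, whichever order was judged."""
    return prefers.get((x, y)) or prefers.get((y, x))

# A ranking is only well-defined if preferences are transitive:
# A > B and B > C should imply A > C.
if winner("A", "B") == "A" and winner("B", "C") == "B" and winner("A", "C") == "C":
    print("non-transitive: A > B, B > C, but C > A -- no consistent ranking exists")
```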
Forget just thinking in words. 🚀 New Era of Multimodal Reasoning🚨 🔍 Imagine While Reasoning in Space with MVoT Multimodal Visualization-of-Thought (MVoT) revolutionizes reasoning by generating visual "thoughts" that transform how AI thinks, reasons, and explains itself.
🚨 New Paper Alert! 🚨 When using LLMs for judgements, ever wondered about the consistency of those judgments? 🤔 Check out our latest work, where we quantify, evaluate, and enhance the logical/preference consistency of LLMs. 📚 🔗 Read more: arxiv.org/abs/2410.02205
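One simple probe in this spirit (a hypothetical stand-in, not the paper's code): swap the order in which the two answers are shown to the judge and check whether the verdict survives the swap:

```python
def judge(first: str, second: str) -> str:
    """Stand-in for an LLM judge call; this stub has a deliberate
    position bias (it always prefers whichever answer is shown first)."""
    return first

ans_a, ans_b = "answer A", "answer B"

# A consistent judge returns the same preferred answer regardless of
# presentation order; a position-biased one does not.
v1, v2 = judge(ans_a, ans_b), judge(ans_b, ans_a)
print("order-consistent" if v1 == v2 else "inconsistent: verdict flips with order")
```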
Life update: 🎉 I'm excited to share that I will be joining @HKUSTGuangzhou as an Assistant Professor in Spring 2025! I'm looking for multiple PhD students and interns who are passionate about exploring research questions related to knowledge and reasoning in the context of LLMs. 🤖
I will present this work at EMNLP on Wednesday, Nov 13! See you there!
Happy to share that TopViewRS has been selected for an oral presentation at @emnlpmeeting #EMNLP2024! Though I'm attending virtually, @caiqizh will present our work on Nov 13, 11:45–12:00, at the Ashe Auditorium. If you are interested in multimodality/spatial reasoning, feel free to reach out!
Excited to introduce TopViewRS: VLMs as Top-View Spatial Reasoners🤖 TopViewRS assesses VLMs' spatial reasoning in top-view scenarios🏠, just like how you read maps🗺️ Spoiler🫢 GPT-4V and Gemini are neck and neck, each excelling in different setups, but neither comes close to human performance.