Ziniu Li
@ZiniuLi
Ph.D. student @ CUHK, Shenzhen. Intern @Bytedance-Seed, working on RL and LLMs. Prev: Intern @Tencent-AI Lab.
🚀RL algorithms are shaping the post-training of LLMs, but how do their objectives connect? In this blog, I explore their relationships and provide a unified perspective through the Policy Gradient Theorem—the backbone of policy gradient methods. Dive in: lancelqf.github.io/note/llm_post_…
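For readers who want the theorem in code: below is a minimal REINFORCE-style sketch of the policy-gradient surrogate loss for LLM post-training (my illustration, not from the blog; the tensor shapes and the `baseline` argument are assumptions).

```python
import torch

def policy_gradient_loss(logprobs: torch.Tensor, rewards: torch.Tensor,
                         baseline: float = 0.0) -> torch.Tensor:
    """Surrogate loss whose gradient is the Policy Gradient Theorem estimator:
    grad E[R] = E[(R - b) * grad log pi(y|x)].

    logprobs: (batch, seq) per-token log-probs of sampled responses under pi.
    rewards:  (batch,) scalar reward for each sampled response.
    """
    advantage = (rewards - baseline).detach()   # stop-grad: only log pi carries gradient
    seq_logprob = logprobs.sum(dim=-1)          # log pi(y|x) = sum of token log-probs
    return -(advantage * seq_logprob).mean()    # minimizing this ascends expected reward
```

Variants like PPO and GRPO mostly differ in how the advantage is estimated and whether the likelihood ratio is clipped; the gradient they take still traces back to this estimator.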
Amazing work by @RidgerZhu, with more resources for investigating the mechanisms behind hybrid linear attention. Resources: Paper: arxiv.org/pdf/2507.06457 Hugging Face checkpoints: huggingface.co/collections/m-…
Hybrid architectures mix linear & full attention in LLMs. But which linear attention is best? Until now, this choice has mostly been guesswork. In our new work, we stop guessing: we trained and open-sourced 72 MODELS (340M & 1.3B) to dissect what truly makes a hybrid model tick 🧶
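A toy sketch of the hybrid idea (my illustration, not the paper's recipe; the elu+1 feature map and the "full attention every 4th layer" ratio are assumptions):

```python
import torch
import torch.nn.functional as F

def causal_linear_attention(q, k, v, eps=1e-6):
    # q, k, v: (batch, heads, seq, dim). Kernel trick: phi(x) = elu(x) + 1.
    q, k = F.elu(q) + 1, F.elu(k) + 1
    kv = torch.einsum('bhsd,bhse->bhsde', k, v).cumsum(dim=2)  # prefix sums of phi(k) v^T
    z = k.cumsum(dim=2)                                        # prefix sums of phi(k)
    num = torch.einsum('bhsd,bhsde->bhse', q, kv)
    den = torch.einsum('bhsd,bhsd->bhs', q, z).unsqueeze(-1) + eps
    return num / den  # causal, linear in seq length: no seq x seq score matrix

def causal_full_attention(q, k, v):
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)

def layer_plan(n_layers, ratio=4):
    # One common hybrid pattern: full attention every `ratio`-th layer.
    return ['full' if (i + 1) % ratio == 0 else 'linear' for i in range(n_layers)]

print(layer_plan(12))  # ['linear', 'linear', 'linear', 'full', ...]
```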
🚀 Thrilled to announce that our paper "SCRIT: Self-Evolving LLM Critique without Human or Stronger Models" was accepted to #COLM2025! We enable LLMs to self-improve their critique abilities — zero human annotations, zero stronger models needed! 🔄✨ Looking forward to meeting…
🚀 Critique abilities are key for scaling LLMs, but current open-source models fall short. We introduce SCRIT: a framework with scalable oversight that enables LLMs to self-improve their critique skills✨ We’ve built a pipeline to generate high-quality synthetic critique data…
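One plausible shape of such a pipeline, as a hedged sketch (the callables `generate`, `critique`, and `self_validate` are hypothetical stand-ins, not SCRIT's actual API):

```python
def build_critique_dataset(problems, generate, critique, self_validate):
    """Collect (solution, critique) pairs that survive the model's own validation.
    No human labels, no stronger model: the same LLM plays all three roles."""
    dataset = []
    for problem in problems:
        solution = generate(problem)                  # model drafts a solution
        review = critique(problem, solution)          # model critiques its own draft
        if self_validate(problem, solution, review):  # keep only self-consistent critiques
            dataset.append({"problem": problem,
                            "solution": solution,
                            "critique": review})
    return dataset  # fine-tune on this, then iterate with the improved model
```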
We’re excited to share our new paper “CoRT: Code-integrated Reasoning within Thinking”! 🤖 A post-training framework that teaches Large Reasoning Models (LRMs) to better leverage Code Interpreters for enhanced mathematical reasoning. 🔍 Key Highlights: Strategic hint…
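As a rough picture of what code-integrated reasoning looks like at inference time (a hedged sketch; `model_step` and the `<code>` delimiter are my assumptions, and this is not CoRT's training code):

```python
import contextlib, io, re

CODE_RE = re.compile(r"<code>(.*?)</code>", re.DOTALL)

def run_snippet(code: str) -> str:
    """Execute a model-emitted snippet and capture stdout.
    Toy sandbox only; a real system would use an isolated interpreter."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue()

def reason_with_interpreter(model_step, prompt: str, max_turns: int = 4) -> str:
    """Alternate generation and execution: the model thinks, emits code when
    helpful, sees the interpreter output, and continues reasoning."""
    transcript = prompt
    for _ in range(max_turns):
        text = model_step(transcript)  # hypothetical: transcript -> next model text
        transcript += text
        match = CODE_RE.search(text)
        if match is None:
            break  # model finished without requesting execution
        transcript += "\n[interpreter output]\n" + run_snippet(match.group(1))
    return transcript
```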
D4RL is (almost) solved—many tasks now score >95. Yet real systems still crash, because the benchmarks are too clean. Meet NeoRL-2 🚀: 7 envs with stochastic delays, exogenous disturbances, hard safety constraints & data-scarce logs. Challenge accepted? 📄 arxiv.org/abs/2503.19267
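To make the delay challenge concrete, here is a toy Gymnasium wrapper (my illustration only, not the NeoRL-2 API) in which observations arrive a random number of steps late:

```python
import random
from collections import deque
import gymnasium as gym

class StochasticDelayWrapper(gym.Wrapper):
    """Delivers stale observations: the agent sees a state from 1..max_delay
    steps in the past, with the delay re-sampled every step."""
    def __init__(self, env, max_delay=3):
        super().__init__(env)
        self.max_delay = max_delay
        self.buffer = deque()

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        self.buffer = deque([obs])
        return obs, info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self.buffer.append(obs)
        delay = random.randint(1, self.max_delay)
        while len(self.buffer) > delay:   # drop states older than the sampled delay
            self.buffer.popleft()
        return self.buffer[0], reward, terminated, truncated, info
```

Policies tuned on clean, fully observed benchmarks can degrade sharply under even this mild perturbation.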
Thrilled to share our paper "ORLM: A Customizable Framework in Training Large Models for Automated Optimization Modeling" has been accepted by Operations Research! 🎉 This is the FIRST LLM paper in the 70+ year history of this prestigious journal. Our framework improves modeling…
New paper alert! We report that the Hessian of NNs has a very special structure: 1. it appears to be a "block-diagonal-block-circulant" matrix at initialization; 2. then it quickly evolves into a "near-block-diagonal" matrix along training. We then theoretically reveal two…
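For small networks, the block structure can be inspected directly; a minimal PyTorch sketch (toy MLP and data of my choosing), where block (i, j) of the Hessian pairs the parameters of layers i and j:

```python
import torch
from torch import nn
from torch.func import functional_call

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 3))
x, y = torch.randn(32, 4), torch.randint(0, 3, (32,))

names, shapes = zip(*[(n, p.shape) for n, p in model.named_parameters()])
sizes = [s.numel() for s in shapes]

def loss_from_flat(flat):
    # Rebuild the parameter dict from one flat vector so the full Hessian
    # over all parameters is a single (P, P) matrix.
    chunks = flat.split(sizes)
    params = {n: c.view(s) for n, c, s in zip(names, chunks, shapes)}
    logits = functional_call(model, params, (x,))
    return nn.functional.cross_entropy(logits, y)

flat0 = torch.cat([p.detach().reshape(-1) for p in model.parameters()])
H = torch.autograd.functional.hessian(loss_from_flat, flat0)
print(H.shape)  # (P, P); visualize |H| to see the (near-)block-diagonal pattern
```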
🚨 RL x LLM folks at #ICLR2025 — come join us during the Friday lunch break! If you haven’t RSVP’d on Whova, you can also register here: lu.ma/s8udv997?tk=B4… @Benjamin_eecs and I will scout for a chill spot (likely a corner at the venue) and share the location tomorrow.…
🎉 I'll be attending #ICLR2025 and the #Alignment Workshop in Singapore next week! If you're interested in LLMs, RL theory & algorithms, reasoning, optimization, and multimodal LLMs—let's connect! 🚀 I'll be presenting our latest research: 📌 Preserving Diversity in Supervised…
