Lingpeng Kong
@ikekong
Assistant Professor @ The University of Hong Kong; previously Research Scientist @ DeepMind
Xinyu Yang from CMU will be giving a talk titled "Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation" on Friday, July 25 at 11am HKT (Thursday, July 24 at 8pm PDT). Link to talk: hku.zoom.us/j/92651812689?…
📢 Update: Announcing Dream's next-phase development.
- Dream-Coder 7B: a fully open diffusion LLM for code, delivering strong performance and trained exclusively on public data.
- DreamOn: targeting the variable-length generation problem in dLLMs!
🚀 Thrilled to announce Dream-Coder 7B — the most powerful open diffusion code LLM to date.
We present DreamOn: a simple yet effective method for variable-length generation in diffusion language models. Our approach significantly boosts code-infilling performance and even matches oracle results.
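For a rough picture of what "variable-length generation" means in a dLLM, here is a minimal toy in which a masked slot can either commit a token, expand into two masks, or delete itself, so the infill span grows and shrinks during decoding. The action names and the random policy are illustrative assumptions, not the released DreamOn interface:

```python
import random

MASK, EXPAND, DELETE = "<m>", "<expand>", "<delete>"
random.seed(0)

def toy_policy(seq, i):
    # Stand-in for the model: for a masked slot, pick a concrete token
    # ("pass" here) or one of the two hypothetical length-edit actions.
    return random.choice(["pass", "pass", "pass", EXPAND, DELETE])

def infill(prefix, suffix, n_masks=3, max_steps=20):
    seq = prefix + [MASK] * n_masks + suffix
    for _ in range(max_steps):
        masked = [i for i, t in enumerate(seq) if t == MASK]
        if not masked:
            break
        i = masked[0]
        action = toy_policy(seq, i)
        if action == EXPAND:
            seq[i:i + 1] = [MASK, MASK]   # span grows by one slot
        elif action == DELETE:
            seq[i:i + 1] = []             # span shrinks by one slot
        else:
            seq[i] = action               # commit a concrete token
    return seq

print(infill(["def", "f", "("], [")", ":"]))
```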
🌳TreeSynth: Synthesizing large-scale diverse data from scratch! Struggling with repetition and space collapse in data synthesis? 🤔 Our latest research mitigates this challenge through innovative tree-guided subspace partitioning. ✨Introducing TreeSynth—a novel framework…
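For intuition on tree-guided subspace partitioning, a minimal sketch that treats the "tree" as a cross product of attribute axes and emits one synthesis prompt per leaf, so diverse coverage is enforced by construction rather than hoped for from sampling. The axes and prompt template are invented for illustration, not TreeSynth's actual taxonomy:

```python
from itertools import product

# Hypothetical attribute axes spanning a math-problem space.
AXES = {
    "topic": ["algebra", "geometry", "combinatorics"],
    "difficulty": ["intro", "olympiad"],
    "format": ["proof", "numeric-answer"],
}

def leaves(axes):
    # Here the "tree" is simply the full cross product of the axes; a real
    # system would grow splits adaptively and prune empty subspaces.
    keys = list(axes)
    for combo in product(*(axes[k] for k in keys)):
        yield dict(zip(keys, combo))

def prompt_for(leaf):
    # One synthesis prompt per leaf of the partition.
    return ("Write an {difficulty} {topic} problem with a {format} "
            "solution.".format(**leaf))

for leaf in leaves(AXES):
    print(prompt_for(leaf))
```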
🤖Can diffusion models write code competitively? Excited to share our latest 7B coding diffusion LLM!!💻 With DiffuCoder, we explore how they decode, why temperature🔥 matters, and how to improve them via coupled-GRPO that speaks diffusion!!📈 Code: github.com/apple/ml-diffu… 🧵
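For readers new to dLLM decoding, a minimal sketch of the common confidence-based unmasking loop and where temperature enters; the "model" below is a random stand-in, and none of this is the actual DiffuCoder code:

```python
import numpy as np

VOCAB, MASK = 100, -1
rng = np.random.default_rng(0)

def model_logits(tokens):
    # Stand-in for a masked diffusion LM: per-position logits over the vocab.
    return rng.normal(size=(len(tokens), VOCAB))

def decode(seq_len=16, steps=4, temperature=1.0):
    tokens = np.full(seq_len, MASK)
    per_step = seq_len // steps
    for _ in range(steps):
        logits = model_logits(tokens) / temperature
        probs = np.exp(logits - logits.max(-1, keepdims=True))
        probs /= probs.sum(-1, keepdims=True)
        # Sample a candidate token for every position, then score confidence.
        cand = np.array([rng.choice(VOCAB, p=p) for p in probs])
        conf = probs[np.arange(seq_len), cand]
        conf[tokens != MASK] = -np.inf   # never touch committed tokens
        # Commit only the most confident positions this step; temperature
        # reshapes probs, so it changes which positions look confident.
        for i in np.argsort(conf)[-per_step:]:
            tokens[i] = cand[i]
    return tokens

print(decode())
```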
Huge milestone from the team! A blazing-fast diffusion LLM built for chat, delivering real-time performance at commercial scale. If you liked Mercury Coder for code, you'll love this for conversation.
We’re excited to launch Mercury, the first commercial-scale diffusion LLM tailored for chat applications! Ultra-fast and efficient, Mercury brings real-time responsiveness to conversations, just like Mercury Coder did for code.
Thanks for sharing our work!!! 🙏 Code release is in progress 😺
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
Apple introduces DiffuCoder, a 7B diffusion LLM trained on 130B tokens of code. The authors also propose a diffusion-native RL training framework, coupled-GRPO. Decoding of dLLMs differs from…
MiMo-VL technical report, models, and evaluation suite are out! 🤗 Models: huggingface.co/XiaomiMiMo/MiM… (or RL) Report: arxiv.org/abs/2506.03569 Evaluation Suite: github.com/XiaomiMiMo/lmm… Looking back, it's incredible that we delivered such compact yet powerful vision-language…
The RL recipe from us, with everything fully open! hkunlp.github.io/blog/2025/Pola…
🚨 4B open-recipe model beats Claude-4-Opus
🔓 100% open data, recipe, model weights and code.
Introducing Polaris✨: a post-training recipe for scaling RL on advanced reasoning models.
🥳 Check out how we boost open-recipe reasoning models to incredible performance levels…
🔬 The HKU team presents ParallelComp: a training-free technique for efficient context length extrapolation in LLMs—from 8K up to 128K tokens—on a single A100 GPU, with minimal performance loss. 📄 Paper: arxiv.org/abs/2502.14317 💻 Code: github.com/menik1126/Para…
Constant-memory long CoT is here!
🔥 Meet PromptCoT-Mamba
The first reasoning model with constant-memory inference to beat Transformers on competition-level math & code
⚡ Efficient decoding: no attention, no KV cache
⚡ +16.0% / +7.1% / +16.6% vs. s1.1-7B on AIME 24 / 25 / LiveCodeBench
🚀 Up to 3.66× faster
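To see why state-space decoding is constant-memory, here is a toy linear recurrence: the entire history is folded into a fixed-size state, so nothing grows with sequence length, whereas a Transformer's KV cache gains one entry per generated token. This is illustrative only, not the actual PromptCoT-Mamba kernels:

```python
import numpy as np

d_state, d_model = 16, 8
rng = np.random.default_rng(0)
A = rng.normal(scale=0.05, size=(d_state, d_state))  # state transition
B = rng.normal(size=(d_state, d_model))              # input projection
C = rng.normal(size=(d_model, d_state))              # output readout

h = np.zeros(d_state)
for t in range(10_000):          # 10k "tokens", still O(d_state) memory
    x = rng.normal(size=d_model)  # stand-in token embedding
    h = A @ h + B @ x             # constant-size state update
    y = C @ h                     # next-token representation
print(h.shape, y.shape)           # (16,) (8,) regardless of length
```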
We’ve developed Gemini Diffusion: our state-of-the-art text diffusion model. Instead of predicting text directly, it learns to generate outputs by refining noise, step-by-step. This helps it excel at coding and math, where it can iterate over solutions quickly. #GoogleIO
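A toy picture of that refinement loop, with a fixed vector standing in for the model's prediction of the clean output; real text diffusion works over token embeddings or discrete masks, so this scalar demo only shows the shape of the iteration, not Gemini Diffusion's method:

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.array([1.0, -2.0, 0.5])   # stand-in for the "clean" output
x = rng.normal(size=3)                # start from pure noise

steps = 10
for t in range(steps, 0, -1):
    x0_hat = target                   # a trained model would predict this
    keep = (t - 1) / steps            # noise level shrinks every step
    x = keep * x + (1 - keep) * x0_hat
    print(f"step {steps - t + 1}: {np.round(x, 3)}")
# The final step has keep == 0, so x lands exactly on the estimate.
```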
I waited until like 1 am, but it turned out to be 5 am!!
Introducing Qwen3! We release Qwen3, our latest large language models with open weights, including 2 MoE models and 6 dense models ranging from 0.6B to 235B. Our flagship model, Qwen3-235B-A22B, achieves competitive results in benchmark evaluations of coding, math, general…
Excited to share our work at ICLR 2025 in 🇸🇬 @iclr_conf 🥳 Happy to chat about LLM reasoning & planning, agents, and AI4Science! 📍 Sat 26 Apr, 3:00–5:30 p.m. CST, Hall 3 + Hall 2B #554