Qingxiu Dong
@qx_dong
PhD student @PKU1898. Research Intern @MSFTResearch Asia.
Thanks to @omarsar0 for sharing our work!
Reinforcement Pre-Training: a new pre-training paradigm for LLMs just landed on arXiv! It incentivises effective next-token reasoning with RL, unlocking richer reasoning capabilities using only raw text and intrinsic RL signals. A must-read! Bookmark it! Here are my notes:
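A rough sketch of the core recipe as I read it (my own simplification, not the paper's code; `generate_with_reasoning` is a hypothetical helper): the model reasons before committing to a next-token prediction, and that prediction is scored against the actual corpus token, giving a verifiable reward with no human labels.

```python
# Sketch of the Reinforcement Pre-Training (RPT) reward signal.
# `generate_with_reasoning` is a hypothetical stand-in for
# "emit a chain of thought, then commit to a next-token prediction".

def rpt_reward(model, context_ids, gold_next_id):
    """Intrinsic reward for next-token reasoning: 1 if the model's
    final prediction matches the ground-truth corpus token, else 0."""
    thought, predicted_id = model.generate_with_reasoning(context_ids)
    return 1.0 if predicted_id == gold_next_id else 0.0

# Training sketch: sample several rollouts per context, score each with
# rpt_reward, and update the policy with an RL objective, so ordinary
# pre-training text becomes a huge pool of verifiable RL tasks.
```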
So happy to reunite with old and new friends at ICLR! Had an amazing time exploring Singapore too! 🌟🇸🇬 #ICLR2025
Excited to introduce BitNet b1.58 2B4T — the first large-scale, native 1-bit LLM 🚀🚀 BitNet achieves performance on par with leading full-precision LLMs — and it’s blazingly fast ⚡️⚡️ and uses much less memory 🎉 Everything is open-sourced — run it on a GPU or your MacBook 🖥️⚙️
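For context, "b1.58" means ternary weights in {-1, 0, +1}, i.e. log2(3) ≈ 1.58 bits per weight. A minimal sketch of absmean-style ternarization in the spirit of the BitNet b1.58 paper (my own illustration, not the released code):

```python
import numpy as np

def absmean_ternarize(W: np.ndarray, eps: float = 1e-6):
    """Quantize weights to {-1, 0, +1}: scale by the mean absolute
    value, round to the nearest integer, and clip to [-1, 1]."""
    gamma = np.abs(W).mean() + eps              # per-tensor scale
    W_q = np.clip(np.rint(W / gamma), -1, 1)    # ternary weights
    return W_q.astype(np.int8), gamma           # dequantize as W_q * gamma
```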
Proud to introduce our latest work “Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey” as a new year gift for the multimodal learning community! Paper: huggingface.co/papers/2412.18… GitHub: github.com/LMM101/Awesome…
🎆Survey of the Year: Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey. arXiv: arxiv.org/abs/2412.18619 Hugging Face: huggingface.co/papers/2412.18… GitHub: github.com/LMM101/Awesome…
About to arrive in #Miami 🌴 after a 30-hour flight for #EMNLP2024! Excited to see new and old friends :) I’d love to chat about data synthesis and deep reasoning for LLMs (or anything else), so feel free to reach out!
🚀Introducing MixEval-X, the first real-world, any-to-any benchmark. mixeval-x.github.io It extends all the benefits of MixEval to multimodal evaluation, including real-world query distributions, fast yet accurate model ranking, and high-standard evaluation across modalities!
🏇 Frontier players are racing to solve modality puzzles in the quest for AGI. But to get there, we need consistent, high-standard evaluations across all modalities! 🚀 Introducing MixEval-X, the first real-world, any-to-any benchmark. Inheriting the philosophy from MixEval,…
How to deploy a 100B model on your CPU? 🔥 Excited to introduce bitnet.cpp, our inference framework for BitNet b1.58 🚀🚀 github.com/microsoft/bitn…
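Why 1.58-bit inference suits CPUs: with weights in {-1, 0, +1}, every dot product collapses to additions and subtractions. A toy NumPy illustration of the idea (not bitnet.cpp's actual kernels, which are heavily optimized low-bit routines):

```python
import numpy as np

def ternary_matvec(W_q: np.ndarray, gamma: float, x: np.ndarray):
    """Matrix-vector product with ternary weights: add activations
    where w = +1, subtract where w = -1, skip where w = 0. No
    multiplications in the inner loop is what makes CPU inference cheap."""
    pos = (W_q == 1).astype(x.dtype)    # mask of +1 weights
    neg = (W_q == -1).astype(x.dtype)   # mask of -1 weights
    return gamma * (pos @ x - neg @ x)  # rescale by the quantization scale
```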
✨A Spark of Vision-Language Intelligence! We introduce DnD-Transformer, a new autoregressive image generation model that beats GPT/Llama baselines without extra cost. AR generation beats diffusion at joint VL modeling in a self-supervised way! GitHub: github.com/chenllliang/Dn… Paper: huggingface.co/papers/2410.01…
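The backbone idea, stripped to a toy loop (my illustration; `logits_fn` stands in for the transformer, and DnD-Transformer's specific multi-code-per-position design isn't shown): an image becomes a sequence of discrete VQ codes and is sampled token by token, exactly like language modeling.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_image_tokens(logits_fn, seq_len: int, vocab_size: int):
    """Autoregressive image generation over discrete codes: sample one
    VQ token at a time, conditioning on everything generated so far."""
    tokens: list[int] = []
    for _ in range(seq_len):
        logits = logits_fn(tokens)            # transformer forward (stub)
        probs = np.exp(logits - logits.max()) # softmax over the vocab
        probs /= probs.sum()
        tokens.append(int(rng.choice(vocab_size, p=probs)))
    return tokens  # a VQ decoder maps these codes back to pixels
```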
(Perhaps a bit late) Excited to announce that our survey on in-context learning (ICL) has been accepted to the #EMNLP2024 main conference and has been cited 1,000+ times! Thanks to all collaborators and contributors to this field! We've updated the survey: arxiv.org/abs/2301.00234. Excited to keep pushing boundaries!
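For anyone new to the topic, the setup the survey covers fits in a few lines (a toy example of mine, not code from the survey): condition a frozen LM on a handful of demonstrations and let it complete the query, with no gradient updates.

```python
def build_icl_prompt(demos, query):
    """In-context learning: k demonstrations concatenated before the
    test query steer the model purely through the prompt."""
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in demos)
    return f"{shots}\n\nQ: {query}\nA:"

prompt = build_icl_prompt([("2+2?", "4"), ("3+5?", "8")], "7+6?")
# Feed `prompt` to any LLM; the demos set both the task and the format.
```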
🤔 Are LLMs Ready for Real-World Data Science Challenges? 🚀 We’ve just open-sourced our #EMNLP2024 work DA-Code, a cutting-edge benchmark designed to push LLMs to their limits in real-world data science tasks. Get involved and challenge your models! da-code-bench.github.io
🤔 How much potential do LLMs have for self-acceleration through layer sparsity? 🚀 🚨 Excited to share our latest work: SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration. arXiv: arxiv.org/abs/2410.06916 🧵1/n
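The draft-then-verify loop in miniature (greedy variant; `draft_next` and `full_next` are hypothetical stand-ins for the same model with some layers skipped vs. the full model; SWIFT's contribution, choosing which layers to skip on the fly, isn't shown):

```python
def self_speculative_step(full_next, draft_next, ctx, k: int = 4):
    """One step of self-speculative decoding: draft k tokens with the
    layer-sparse model, then keep the longest prefix the full model
    agrees with, so outputs match standard greedy decoding exactly."""
    draft = []
    for _ in range(k):                       # cheap drafting pass
        draft.append(draft_next(ctx + draft))
    accepted = []
    for t in draft:                          # verification (done as one
        full_t = full_next(ctx + accepted)   # batched pass in practice)
        if full_t != t:
            accepted.append(full_t)          # fix first mismatch, stop
            break
        accepted.append(t)
    return accepted
```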