CZG1225
@chen_zigen56940
PhD student at National University of Singapore.
💡 Can LLMs learn when to think? We introduce Thinkless, an LLM that knows when to think. 🔑 Decoupled GRPO: learns when to think & how to answer 🔀 Cuts reasoning by 50–90% ❌ Stop overthinking 1 + 1 📎 Paper: arxiv.org/abs/2505.13379 💻 Code: github.com/VainF/Thinkless #LLM
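A minimal sketch of the decoupled policy-gradient idea behind Decoupled GRPO, assuming the first generated token is a <short>/<think> control token; the function name, weights, and shapes below are illustrative, not the paper's exact loss.

```python
# Illustrative decoupled GRPO-style loss (not the paper's exact formulation).
# Assumes the first sampled token is a control token (<short> vs. <think>)
# and the remaining tokens form the answer.
import torch

def decoupled_grpo_loss(logprobs, advantages, mode_weight=1.0, answer_weight=1.0):
    """
    logprobs:   (B, T) log-probabilities of the sampled tokens under the policy
    advantages: (B,)   group-normalized advantages (GRPO-style), one per sample
    The control-token term and the answer-token term get separate weights,
    so mode selection ("when to think") and answer quality ("how to answer")
    are optimized at decoupled rates.
    """
    adv = advantages.unsqueeze(1)                  # (B, 1)
    mode_term = -(adv * logprobs[:, :1]).mean()    # first token: <short>/<think>
    answer_term = -(adv * logprobs[:, 1:]).mean()  # remaining answer tokens
    return mode_weight * mode_term + answer_weight * answer_term

# toy usage
logprobs = torch.randn(4, 16, requires_grad=True)
advantages = torch.tensor([0.5, -0.2, 1.0, -1.0])
loss = decoupled_grpo_loss(logprobs, advantages)
loss.backward()
```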
🚀 Ultra-Resolution Adaptation with Ease 😊 Your Free Flux Pro Ultra URAE achieves 2K performance comparable to the SOTA closed-source model FLUX1.1 [Pro] Ultra using just 3K samples and 2K iterations, and sets new benchmarks for 4K generation. 🚀 Paper: arxiv.org/abs/2503.16322
🚀 Controlling CoT Length with a Magic Valve! What if we could adjust the reasoning-chain length for QwQ/DeepSeek-R1 based on task difficulty? 🤔 Our solution: CoT-Valve, a tuning strategy that elastically controls and compresses CoT length. Paper: arxiv.org/abs/2502.09601
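One way to picture the "valve": a low-rank parameter direction whose scaling factor is dialed at inference time to trade reasoning length for brevity. The class and attribute names below are assumptions for illustration, not CoT-Valve's actual API.

```python
# Sketch of a LoRA-style "valve": one learned parameter direction whose
# scaling factor controls how long the model's reasoning chain gets.
import torch
import torch.nn as nn

class ValvedLinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        out_f, in_f = base.weight.shape
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)  # low-rank direction
        self.B = nn.Parameter(torch.zeros(out_f, rank))
        self.alpha = 1.0  # the valve: scaled up or down to adjust CoT length

    def forward(self, x):
        return self.base(x) + self.alpha * (x @ self.A.t() @ self.B.t())

layer = ValvedLinear(nn.Linear(64, 64))
layer.alpha = 0.3                      # dial the valve at inference time
y = layer(torch.randn(2, 64))
```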
MaskLLM Learnable Semi-Structured Sparsity for Large Language Models discuss: huggingface.co/papers/2409.17… Large Language Models (LLMs) are distinguished by their massive parameter counts, which typically result in significant redundancy. This work introduces MaskLLM, a learnable…
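The core of learnable semi-structured (2:4) sparsity can be sketched as a differentiable choice among the six candidate 2-of-4 masks per group of weights, sampled with Gumbel-Softmax; the exact parameterization below is an assumption, not MaskLLM's implementation.

```python
# Learnable 2:4 sparsity via Gumbel-Softmax over the six candidate masks
# per group of four weights (illustrative sketch).
import itertools
import torch
import torch.nn.functional as F

# All 2-of-4 binary masks: shape (6, 4)
CANDIDATES = torch.tensor(
    [[1.0 if i in c else 0.0 for i in range(4)]
     for c in itertools.combinations(range(4), 2)])

def sample_masks(logits, tau=1.0):
    """
    logits: (num_groups, 6) learnable scores over candidate masks.
    Returns a differentiable (num_groups, 4) hard mask (straight-through).
    """
    probs = F.gumbel_softmax(logits, tau=tau, hard=True)  # (G, 6), one-hot
    return probs @ CANDIDATES                             # (G, 4) binary mask

weights = torch.randn(8, 4)                       # 8 groups of 4 weights
logits = torch.zeros(8, 6, requires_grad=True)    # learned end-to-end
mask = sample_masks(logits)
sparse_weights = weights * mask                   # enforces the 2:4 pattern
```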
Thanks @_akhaliq for sharing our work, AsyncDiff, a training-free distributed acceleration scheme for diffusion models. For SD v2.1, AsyncDiff achieves a 2.7x speedup with negligible degradation, and a 4.0x speedup with only a slight 0.38 reduction in CLIP Score, on 4 A5000s.
AsyncDiff Parallelizing Diffusion Models by Asynchronous Denoising Diffusion models have garnered significant interest from the community for their great generative ability across various applications. However, their typical multi-step sequential…
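A toy, single-process picture of the asynchronous-denoising idea: split the denoiser into sequential stages, and at step t let each stage consume its predecessor's output from step t-1, so in a real deployment all stages can run in parallel on different GPUs. The stage modules and shapes below are placeholders, not the actual UNet split.

```python
# Toy simulation of asynchronous denoising across pipeline stages.
import torch
import torch.nn as nn

stages = [nn.Linear(16, 16) for _ in range(4)]  # stand-ins for denoiser chunks
num_steps = 10
x = torch.randn(1, 16)

# cache[i] holds stage i's output from the previous denoising step
cache = [x.clone() for _ in range(len(stages))]

for t in range(num_steps):
    new_cache = []
    for i, stage in enumerate(stages):
        # stage 0 reads the current latent; stage i>0 reads the *stale*
        # output of stage i-1 from step t-1, removing intra-step dependencies
        inp = x if i == 0 else cache[i - 1]
        new_cache.append(stage(inp))
    cache = new_cache
    x = cache[-1]          # last stage's output becomes the next latent
```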
Restoration by Generation with Constrained Priors paper page: huggingface.co/papers/2312.17… The inherent generative power of denoising diffusion models makes them well-suited for image restoration tasks where the objective is to find the optimal high-quality image within the…
The data-efficient SAM compression method delivers strong performance at a significantly reduced training cost. Both the paper and the code are now available; we encourage you to explore and experiment with them.
Our new work: 0.1% Data Makes Segment Anything Slim Page: huggingface.co/papers/2312.05… Github: github.com/czg1225/SlimSAM Compared to SAM-H, SlimSAM achieves comparable performance while reducing the parameter count to merely 0.9% (5.7M) and MACs to 0.8% (21G), requiring only 0.1% of the data.
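A rough sketch of data-efficient compression in this spirit: rank channels with a Taylor-style importance score, drop the lowest-scoring ones, then distill the pruned model against the original. The importance criterion and one-shot schedule below are simplified assumptions, not SlimSAM's exact procedure.

```python
# Simplified channel pruning + distillation sketch for a ViT-style layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

def taylor_importance(layer: nn.Linear, inputs: torch.Tensor) -> torch.Tensor:
    out = layer(inputs)
    out.pow(2).sum().backward()
    # |weight * grad| summed per output channel (first-order Taylor proxy)
    return (layer.weight * layer.weight.grad).abs().sum(dim=1)

teacher = nn.Linear(64, 64)
student = nn.Linear(64, 64)
student.load_state_dict(teacher.state_dict())

x = torch.randn(32, 64)
scores = taylor_importance(student, x)
keep = scores.topk(48).indices                 # prune 25% of output channels
pruned = nn.Linear(64, 48)
pruned.weight.data = student.weight.data[keep]
pruned.bias.data = student.bias.data[keep]

# distill the kept channels toward the teacher's corresponding outputs
with torch.no_grad():
    target = teacher(x)[:, keep]
loss = F.mse_loss(pruned(x), target)
loss.backward()
```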