Haofei Yu
@haofeiyu44
CS PhD student @UofIllinois | previously CS undergrad @ZJU_China, MS @LTIatCMU | ex-intern @Apple @TencentGlobal @MITIBMLab
🚀 Excited about automatic research? What if we can combine graphs and LLMs to simulate our interconnected human research community? ✨ Check our latest paper ResearchTown: Simulator of Human Research Community (arxiv.org/pdf/2412.17767) #AI #LLM #AutoResearch #MultiAgent #Graph
Attending #ICML2025 next week! Excited to chat about auto-research, social learning, and any other exciting topics. Please reach out to me!
⚠️ Rogue AI scientists? 🛡️ SafeScientist rejects unsafe prompts for ethical discoveries. Check out paper ➡️ (arxiv.org/pdf/2505.23559) #AISafety #LLM #SafeAI #AI
💥Time-R1 is here! Can a 3B LLM truly grasp time? 🤔 YES! Excited to share our new work, Time-R1: Towards Comprehensive Temporal Reasoning in LLMs 🚀 Check it out: 📖 Paper: arxiv.org/abs/2505.13508 💻 Code: github.com/ulab-uiuc/Time… #TemporalReasoning #RL #LLMs
🤯NeurIPS 2025 might break records as the most submitted-to academic conference ever. One of our submission IDs is already ~23,000 — final count could hit 30,000. Absolute madness. #NeurIPS2025 #AI
Attended an #ICLR workshop on self-improving LLMs. Asked what might be my best audience question yet to renowned panelists Yoshua Bengio, Noah Goodman, and @ShunyuYao12 (even got some applause :) TLDR from Noah: the interpretability of self-improving LLMs could be a major next step toward AGI
🚀Congrats to U Lab member @Kunlun_Zhu for leading MultiAgentBench—the first comprehensive benchmark evaluating both collaboration and competition among LLM agents! Stay tuned for more agent works from U Lab. 🔗 arXiv: arxiv.org/abs/2503.01935 💻 GitHub: github.com/MultiagentBenc…
Tweet 1/5 🚀 Introducing MultiAgentBench: the first comprehensive benchmark for LLM agents evaluating both collaboration and competition! 🏞️ • arXiv: arxiv.org/abs/2503.01935 • GitHub: github.com/MultiagentBenc… • Hugging Face: huggingface.co/papers/2503.01…
Congrats @KiwiJWY
SGLang Powers lm-eval-harness, The Gold Standard in LLM Eval! lm-eval-harness is a fair and authoritative framework for LLM evaluation, and we've previously used it internally for calibration. Now, thanks to the efforts of Jinwei Yao, Jin Pan, Xiaotong Jiang, and Qiujiang Chen,…
🎉 Our GitHub repo (github.com/ulab-uiuc/rese…) has reached 100 stars! 🌟 🚀 Thanks for the support! Feel free to explore, use, and share your feedback to help us build a better platform for automatic research. Get started in one step: `pip install research-town`