Kunlun Zhu
@Kunlun_Zhu
Graduate Student @UofIllinois http://Kunlun-Zhu.github.io
Check out our recent paper on SafeScientist, a safe AI scientist.
⚠️ Rogue AI scientists? 🛡️ SafeScientist rejects unsafe prompts for ethical discoveries. Check out paper ➡️ (arxiv.org/pdf/2505.23559) #AISafety #LLM #SafeAI #AI
See you all in San Jose!
Have you registered for AMD Advancing AI 2025? @Kunlun_Zhu will lead a hands-on workshop on building AI agent apps with AMD Instinct MI300X, Qwen3, Llama 4, and the OpenManus framework. Join us June 12 → bit.ly/3SOgyf7
🧪 Want an AI-generated paper draft in just 1 minute? Or dreaming of building auto-research apps but frustrated with setups? Meet tiny-scientist, a minimal package to start AI-powered research: 👉 pip install tiny-scientist 🔗 github.com/ulab-uiuc/tiny… #AIAgent #pythonpackages
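For readers curious what such a package automates, here is a minimal, hypothetical sketch of the idea-to-draft loop; the function and class names below are illustrative assumptions, not the actual tiny-scientist API (see the GitHub repo for that).

```python
# Hypothetical sketch of an automated "idea -> draft" research loop.
# NOTE: these names are illustrative assumptions, NOT the tiny-scientist API.
# Install the real package with: pip install tiny-scientist
from dataclasses import dataclass
from typing import Dict, List


def call_llm(prompt: str) -> str:
    """Stand-in for an LLM call (e.g. an OpenAI/Qwen/Llama client).

    Replaced with a canned response so the sketch runs offline.
    """
    return f"[model output for prompt: {prompt[:60]}...]"


@dataclass
class Draft:
    topic: str
    idea: str
    related_work: List[str]
    sections: Dict[str, str]


def generate_draft(topic: str) -> Draft:
    # 1. Propose a concrete research idea for the topic.
    idea = call_llm(f"Propose one novel, testable research idea about: {topic}")

    # 2. Sketch related work (a real tool would query arXiv / Semantic Scholar).
    related = [call_llm(f"Summarize one prior work relevant to: {idea}") for _ in range(3)]

    # 3. Draft the core paper sections from the idea.
    sections = {
        name: call_llm(f"Write the {name} section for the idea: {idea}")
        for name in ("abstract", "method", "experiments", "conclusion")
    }
    return Draft(topic=topic, idea=idea, related_work=related, sections=sections)


if __name__ == "__main__":
    draft = generate_draft("LLM agents for automated scientific discovery")
    print(draft.sections["abstract"])
```

In practice you would swap call_llm for a real model client and add retrieval over paper databases; the package above wraps that setup behind a single install.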
The best researchers from Meta, Yale, Stanford, Google DeepMind, and Microsoft laid out all we know about agents in a 264-page paper [book]. Here are some of their key findings:
🎯After half a year of effort, our 20-team collaboration has formally defined the “Foundation Agent” 🤖 in a comprehensive 264-page paper! We’ve correlated agents with the human brain 🧠 • Huggingface: lnkd.in/gmzrW2Np • Arxiv: lnkd.in/gpYxBct4
🧠264 pages and 1416 references chart the future of Foundation Agents. Our latest survey dives deep into agents—covering brain-inspired cognition, self-evolution, multi-agents, and AI safety. Discover the #1 Paper of the Day on Hugging Face👇: huggingface.co/papers/2504.01… 1/3
🚀Excited to share our latest paper, *ResearchTown*, a framework for research community simulation using LLMs and graphs 🌟We hope this marks a new step toward automated research 🧠Paper: arxiv.org/abs/2412.17767 💻Code: github.com/ulab-uiuc/rese… 🤗Dataset: huggingface.co/datasets/ulab-…
🚀 Excited about automatic research? What if we can combine graphs and LLMs to simulate our interconnected human research community? ✨ Check our latest paper ResearchTown: Simulator of Human Research Community (arxiv.org/pdf/2412.17767) #AI #LLM #AutoResearch #MultiAgent #Graph
(1/n) The human research community is far from perfect. Frustrated with NeurIPS results? Research Town simulates the community as a graph of LLM agents and knowledge. It helps you find ideas, receive reviews, refine proposals, and get metareviews - essentially running "NeurIPS" with LLMs.
🚀 Excited about generating groundbreaking research ideas with LLMs but feeling overwhelmed by the engineering effort and costs? 😄 Meet Research Town developed by #ULab: github.com/ulab-uiuc/rese…. 💻 pip install research-town is all you need. #LLM #AI #AutoResearch #Papers
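As a rough illustration of the "community as a graph of LLM agents and knowledge" framing in the tweets above, here is a hedged sketch using networkx; the node kinds, edge relations, and review loop are assumptions for illustration, not ResearchTown's actual data model or API.

```python
# Illustrative sketch of a research community as an agent-data graph.
# NOTE: node/edge semantics here are assumptions, not ResearchTown's schema.
import networkx as nx

G = nx.DiGraph()

# Agent nodes: LLM-backed researchers with a profile/persona.
G.add_node("alice", kind="agent", expertise="graph learning")
G.add_node("bob", kind="agent", expertise="LLM evaluation")

# Data nodes: papers, proposals, and reviews produced during the simulation.
G.add_node("proposal_1", kind="data", text="Simulating peer review with LLM agents")

# Edges encode who writes or reviews what.
G.add_edge("alice", "proposal_1", relation="writes")
G.add_edge("bob", "proposal_1", relation="reviews")


def simulate_review_round(graph: nx.DiGraph, proposal: str) -> list:
    """One simulated 'conference' round: each reviewing agent drafts feedback.

    A real system would prompt an LLM with the agent's profile plus the
    proposal text; here we return placeholder review strings.
    """
    reviews = []
    for agent, _, data in graph.in_edges(proposal, data=True):
        if data.get("relation") == "reviews":
            reviews.append(f"{agent}: feedback on '{graph.nodes[proposal]['text']}'")
    return reviews


print(simulate_review_round(G, "proposal_1"))
```

Ideation, review, and metareview would then be successive message-passing rounds over this graph, with LLM calls at each agent node.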
3/ Enter RAGEval: a framework designed to address this gap by creating domain-specific evaluation datasets. This approach ensures that RAG models are tested on the real-world scenarios they’ll encounter, making evaluations more relevant and reliable. @Kunlun_Zhu @zibuyu9 paper…
RAGEval: Scenario-Specific RAG Evaluation Dataset Generation Framework. Presents a framework for generating domain-specific datasets to evaluate RAG systems. Focuses on vertical domains and introduces new metrics to assess LLMs' knowledge usage. 📝arxiv.org/abs/2408.01262
RAGEval proposes a simple framework to automatically generate evaluation datasets that assess the knowledge usage of different LLMs under different scenarios. It defines a schema from seed documents and then generates diverse documents, which lead to question-answering pairs. The QA…
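To make the schema → documents → QA-pairs pipeline described above concrete, here is a hedged, self-contained sketch; the function names and prompts are illustrative assumptions, not the RAGEval implementation.

```python
# Illustrative schema -> documents -> QA-pairs pipeline, in the spirit of the
# RAGEval summary above. Names and prompts are assumptions, not the actual code.
from typing import Dict, List, Tuple


def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM client; returns a canned string so the sketch runs."""
    return f"[generated text for: {prompt[:50]}...]"


def derive_schema(seed_documents: List[str]) -> Dict[str, str]:
    """Step 1: summarize the structure shared by the domain's seed documents."""
    return {
        "domain": call_llm("Name the domain of these documents:\n" + "\n".join(seed_documents)),
        "required_fields": call_llm("List the fields every document in this domain contains."),
    }


def generate_documents(schema: Dict[str, str], n: int = 3) -> List[str]:
    """Step 2: synthesize diverse documents that follow the schema."""
    return [call_llm(f"Write document {i} following schema: {schema}") for i in range(n)]


def generate_qa_pairs(document: str) -> List[Tuple[str, str]]:
    """Step 3: derive question-answer pairs grounded in each generated document."""
    question = call_llm(f"Ask one factual question answerable only from: {document}")
    answer = call_llm(f"Answer '{question}' using only: {document}")
    return [(question, answer)]


if __name__ == "__main__":
    seeds = ["A hospital discharge summary...", "A radiology report..."]
    schema = derive_schema(seeds)
    docs = generate_documents(schema)
    qa = [pair for doc in docs for pair in generate_qa_pairs(doc)]
    print(f"Generated {len(docs)} documents and {len(qa)} QA pairs.")
```

Grounding the QA pairs in freshly generated documents is what keeps the evaluation scenario-specific rather than dependent on what the LLM already memorized.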