Siwei Han
@lillianwei423
Senior at @FudanUni. Interested in the alignment and application of LLMs, VLMs, and multimodal models; currently interning in Prof. @HuaxiuYaoML’s team at @UNC.
Huge thanks to my advisor @HuaxiuYaoML for guiding me through this paper, and big appreciation to my senior @richardxp888 for all the help! 🙏 #AI #DocQA #LLM #Agent
🚀 Introducing MDocAgent! 🧐📄 📚 Ever struggled with AI that can’t handle complex documents filled with text, images, tables, and figures? 💡 Enter MDocAgent 🧠🤖—a next-gen multi-modal multi-agent framework that revolutionizes document understanding! #AI #DocQA #LLM #Agent
🚀 We introduce MMed-RAG, a powerful multimodal RAG system that boosts the factuality of Medical Vision-Language Models (Med-LVLMs) by up to 43.8%! 🩺💡 🔍 MMed-RAG enhances alignment across medical domains like radiology, pathology, and ophthalmology with a domain-aware…
📢Excited to share our new self-rewarding method called CREAM🌟! CREAM extends the Self-Rewarding Language Model (SRLM) to smaller LLMs (e.g., 7B-level), and mitigates potential rewarding bias issues through self-consistency regularization. Key Findings: 👉1. SRLMs with…
Introducing our 🌟NEW benchmark MMIE🌟 Project page: mmie-bench.github.io Thanks to Prof. Huaxiu Yao @HuaxiuYaoML for mentoring, and to @richardxp888 and @StephenQS0710 for the great collaboration!
🌟NEW Benchmark Release Alert🌟 We introduce 📚MMIE, a knowledge-intensive benchmark to evaluate interleaved multimodal comprehension and generation in LVLMs, comprising 20K+ examples across 12 fields and 102 subfields. 🔗 [Explore MMIE here](mmie-benchmark.github.io)