Guowei Xu
@Kevin_GuoweiXu
Undergraduate student at Yao Class (Tsinghua University), interested in Language Models and Reinforcement Learning
Unfortunately I cannot attend the conference in person this year, but our co-author @Kevin_GuoweiXu will be presenting the paper and answering all your questions! 📜Poster session: Time: Wed 16 Jul, 11 a.m.–1:30 p.m. PDT Location: West Exhibition Hall B2-B3 #W-607
🚀 Introducing MENTOR: Mixture-of-Experts Network with Task-Oriented Perturbation for Visual Reinforcement Learning! 🌟 We propose a strong model-free visual RL algorithm that can learn robust visuomotor policies from scratch – in the real world! 💪🤖 🌐 Check out the project…
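The mixture-of-experts backbone named in the title lends itself to a generic sketch. Below is a minimal MoE layer with a learned softmax gate; it illustrates the general MoE idea only, not MENTOR's actual architecture, and the dimensions and expert count are placeholders.

```python
# Generic mixture-of-experts layer with a learned softmax gate, sketched to
# illustrate the MoE backbone idea in MENTOR's title (not the paper's actual
# architecture; dimensions and expert count are placeholders).
import torch
import torch.nn as nn

class MoELayer(nn.Module):
    def __init__(self, dim: int, n_experts: int = 4, hidden: int = 256):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
            for _ in range(n_experts)
        )
        self.gate = nn.Linear(dim, n_experts)  # routes each input to experts

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.gate(x), dim=-1)             # (batch, n_experts)
        outs = torch.stack([e(x) for e in self.experts], dim=1)   # (batch, n_experts, dim)
        return (weights.unsqueeze(-1) * outs).sum(dim=1)          # gate-weighted combination

# e.g. features from a visual encoder could pass through the MoE before a policy head
layer = MoELayer(dim=64)
print(layer(torch.randn(8, 64)).shape)  # torch.Size([8, 64])
```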
📢New conference where AI is the primary author and reviewer! agents4science.stanford.edu Current venues don't allow AI-written papers, so it's hard to assess the +/- of such works🤔 #Agents4Science solicits papers where AI is the main author w/ human advisors. 💡Initial reviews by…
Thanks for bringing this to my attention. I honestly wasn’t aware of the situation until the recent posts started going viral. I would never encourage my students to do anything like this—if I were serving as an Area Chair, any paper with this kind of prompt would be…
The five-year-old algorithms SAC and TD3 are still used as backbone RL algorithms today. Want to try a new, well-performing backbone without too much pain? We introduce BAC, a simple but effective method that delivers a significant performance boost across a variety of tasks. 🧵👇
Thanks to @Kevin_GuoweiXu, this model-free RL method and a few other baselines have been added to an experimental/contributed folder in ManiSkill: github.com/haosulab/ManiS… We welcome people to try out some of the more difficult tasks in our benchmark and provide tuned+open-sourced…
🚀 DeepSeek-R1 is here! ⚡ Performance on par with OpenAI-o1 📖 Fully open-source model & technical report 🏆 MIT licensed: Distill & commercialize freely! 🌐 Website & API are live now! Try DeepThink at chat.deepseek.com today! 🐋 1/n
An interesting paper: they find that existing language models may essentially be seeking a more efficient way to approximate the Data-Tree. This suggests that the reasoning process in LLMs is more likely probabilistic pattern-matching than formal…
💡Excited to share our latest research on the explainability of GPT! 🔎 From a novel perspective, we flatten the language dataset and GPT models into Monte Carlo Language Trees and show their significant similarity. 📰 arxiv.org/pdf/2501.07641 📎 github.com/PKU-YuanGroup/…
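As I read the abstract, the "Monte Carlo Language Tree" (the Data-Tree of the previous post) flattens a corpus into a tree of token contexts with empirical next-token frequencies, against which a GPT model's predictions can be compared. A minimal sketch of that construction, based on my reading rather than the authors' released code:

```python
# Minimal sketch of building a "Data-Tree": a prefix tree mapping each token
# context to the empirical distribution over next tokens, which a language
# model can be compared against. (My reading of the idea, not the authors'
# implementation.)
from collections import defaultdict

def build_data_tree(corpus: list[list[str]], max_depth: int = 3):
    # tree[context][next_token] = count of next_token following context
    tree: dict[tuple, dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for tokens in corpus:
        for i in range(len(tokens)):
            for d in range(1, max_depth + 1):
                if i + d >= len(tokens):
                    break
                context = tuple(tokens[i:i + d])
                tree[context][tokens[i + d]] += 1
    return tree

def next_token_probs(tree, context: tuple):
    # Empirical next-token distribution for a given context.
    counts = tree.get(context, {})
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()} if total else {}

corpus = [["the", "cat", "sat"], ["the", "cat", "ran"], ["the", "dog", "sat"]]
tree = build_data_tree(corpus)
print(next_token_probs(tree, ("the", "cat")))  # {'sat': 0.5, 'ran': 0.5}
```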
SynthLabs + Stanford presents: Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Proposes Meta-CoT, which extends CoT by explicitly modeling the underlying reasoning required to arrive at a particular CoT
Video understanding is the next frontier, but not all videos are alike. Models now reason over YouTube clips and feature films, but what about the everyday spaces we—and our future AI assistants—navigate and experience? Introducing Thinking in Space, our latest study exploring…
Everything you love about generative models — now powered by real physics! Announcing the Genesis project — after a 24-month large-scale research collaboration involving over 20 research labs — a generative physics engine able to generate 4D dynamical worlds powered by a physics…
🚨 New reinforcement learning algorithms 🚨 Excited to announce MaxInfoRL, a class of model-free RL algorithms that solves complex continuous control tasks (including vision-based!) by steering exploration towards informative transitions. Details in the thread 👇
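A common recipe for steering exploration toward informative transitions is to reward epistemic uncertainty, for example disagreement across an ensemble of learned dynamics models. The sketch below illustrates that general recipe, not necessarily MaxInfoRL's exact objective; `make_model`, `DisagreementBonus`, and the coefficient `beta` are all placeholders of mine.

```python
# Illustrative sketch of information-directed exploration via ensemble
# disagreement (a common proxy for information gain; not necessarily the
# exact MaxInfoRL objective). Transitions where dynamics models disagree
# are treated as informative and receive an intrinsic bonus.
import torch
import torch.nn as nn

def make_model(obs_dim: int, act_dim: int) -> nn.Module:
    # Small forward-dynamics model: (s, a) -> predicted next state.
    return nn.Sequential(
        nn.Linear(obs_dim + act_dim, 128), nn.ReLU(),
        nn.Linear(128, obs_dim),
    )

class DisagreementBonus:
    def __init__(self, obs_dim: int, act_dim: int, n_models: int = 5, beta: float = 0.1):
        self.models = [make_model(obs_dim, act_dim) for _ in range(n_models)]
        self.beta = beta  # hypothetical weight on the intrinsic term

    def intrinsic(self, s: torch.Tensor, a: torch.Tensor) -> torch.Tensor:
        # Variance across ensemble predictions approximates epistemic
        # uncertainty about the transition, i.e. how informative it is.
        with torch.no_grad():
            preds = torch.stack([m(torch.cat([s, a], dim=-1)) for m in self.models])
        return self.beta * preds.var(dim=0).mean(dim=-1)

# Usage: augment the task reward before the critic update of any off-policy
# algorithm (e.g. SAC):  r_total = r_task + bonus.intrinsic(s, a)
```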
🚀 Introducing MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale We’re excited to open-source: - 12M MM instruction tuning dataset - MAmmoTH-VL-8B, a SoTA VL model (~10B size) on 20+ downstream tasks compared with fully open-source baselines such as…
🚀 LLaVA-CoT is now fully opensource! 🎉 Here’s how you can access everything: 1️⃣ Model: huggingface.co/Xkev/Llama-3.2… 2️⃣ Dataset: huggingface.co/datasets/Xkev/… 3️⃣ Code (Data Generation, Training, Inference): github.com/PKU-YuanGroup/… 4️⃣ Gradio APP: huggingface.co/spaces/Xkev/Ll… 💡 As an academic…
🚀 Introducing LLaVA-o1: The first visual language model capable of spontaneous, systematic reasoning, similar to GPT-o1! 🔍 🎯Our 11B model outperforms Gemini-1.5-pro, GPT-4o-mini, and Llama-3.2-90B-Vision-Instruct! 🔑The key is training on structured data and a novel inference…
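The "structured data" refers to responses decomposed into explicit reasoning stages (summary, caption, reasoning, conclusion, per the paper's four-stage design). As a small illustration, here is a sketch of parsing such stage-tagged output; the helper itself is mine, not the released inference code.

```python
# Minimal sketch of consuming stage-structured reasoning output of the kind
# the tweet describes. The four stage tags follow the paper's
# summary/caption/reasoning/conclusion decomposition; the parsing helper is
# illustrative, not the released inference code.
import re

STAGES = ("SUMMARY", "CAPTION", "REASONING", "CONCLUSION")

def parse_stages(response: str) -> dict[str, str]:
    """Extract each tagged reasoning stage from a model response."""
    out = {}
    for stage in STAGES:
        m = re.search(rf"<{stage}>(.*?)</{stage}>", response, re.DOTALL)
        if m:
            out[stage] = m.group(1).strip()
    return out

response = (
    "<SUMMARY>Identify what the question asks about.</SUMMARY>"
    "<CAPTION>The image shows three apples on a table.</CAPTION>"
    "<REASONING>The question asks for a count; three apples are visible.</REASONING>"
    "<CONCLUSION>3</CONCLUSION>"
)
print(parse_stages(response)["CONCLUSION"])  # 3
```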
LLaVA-o1 is the first visual language model capable of systematic reasoning similar to GPT-o1 🚀 But how does it perform on multimodal math reasoning questions? 🔎 New numbers from LLaVA-o1 on the MathVision Dataset from author @Kevin_GuoweiXu LLaVA-o1 (11B): 23.7%…
Would you believe that deep RL can work without replay buffers, target networks, or batch updates? Our recent work gets deep RL agents to learn from a continuous stream of data one sample at a time without storing any sample. Joint work with @Gautham529 and @rupammahmood.
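For intuition, here is a minimal sketch of the streaming regime the tweet describes: one TD update per incoming transition, with no replay buffer, no target network, and no batching. This illustrates the setting, not the authors' specific algorithm; the network, optimizer, and hyperparameters are placeholders.

```python
# Minimal sketch of one-sample streaming TD learning (illustrative, not the
# paper's exact algorithm): each transition triggers a single online update,
# and the sample is discarded immediately afterwards.
import torch
import torch.nn as nn
import gymnasium as gym

env = gym.make("CartPole-v1")
n_obs, n_act = env.observation_space.shape[0], env.action_space.n

q = nn.Sequential(nn.Linear(n_obs, 64), nn.ReLU(), nn.Linear(64, n_act))
opt = torch.optim.SGD(q.parameters(), lr=1e-3)
gamma, eps = 0.99, 0.1

obs, _ = env.reset()
for step in range(10_000):
    s = torch.as_tensor(obs, dtype=torch.float32)
    # Epsilon-greedy action from the current Q estimates.
    if torch.rand(()) < eps:
        a = env.action_space.sample()
    else:
        with torch.no_grad():
            a = int(q(s).argmax())
    obs2, r, terminated, truncated, _ = env.step(a)
    s2 = torch.as_tensor(obs2, dtype=torch.float32)
    # One-step TD target; bootstrap only if the episode continues.
    with torch.no_grad():
        target = r + (0.0 if terminated else gamma * q(s2).max().item())
    loss = (q(s)[a] - target) ** 2  # single-sample squared TD error
    opt.zero_grad()
    loss.backward()
    opt.step()  # the transition is never stored
    obs = obs2
    if terminated or truncated:
        obs, _ = env.reset()
```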
Updates about our new research: (1) After careful consideration, we have decided to rename LLaVA-o1 to LLaVA-CoT to make its name sound more like an academic research project. We are currently updating this change across platforms such as arXiv and GitHub, which is expected to…
🚀 DeepSeek-R1-Lite-Preview is now live: unleashing supercharged reasoning power! 🔍 o1-preview-level performance on AIME & MATH benchmarks. 💡 Transparent thought process in real-time. 🛠️ Open-source models & API coming soon! 🌐 Try it now at chat.deepseek.com #DeepSeek