Han Fang
@Han_Fang_
Research scientist at Meta SuperIntelligence Labs. Reasoning & Agents
Mitigating racial bias from LLMs is a lot easier than removing it from humans! Can’t believe this happened at the best AI conference @NeurIPSConf We have ethical reviews for authors, but missed it for invited speakers? 😡
Meta GenAI is looking for 2025 research interns across language and multimodal research. In particular, my team is looking for interns on RLHF algos, agents, and post-training more broadly. metacareers.com/jobs/432691156…
Excited to share our latest research on red teaming and agent safety from SEAL team at @scale_AI . This work highlights a critical gap: safety mechanisms in advanced LLMs do not generalize well to downstream browser agents. We also found that LLM attacks transfer with high…
(1/7) Excited to share our new red teaming work at Scale, Refusal-Trained LLMs Are Easily Jailbroken As Browser Agents. We find that jailbreaking LLM agents that use browsers is surprisingly easy. In many cases, you can just directly ask! Paper & Project page: scale.com/research/brows…
How can we mitigate reward hacking in RLHF? 🤔 Constrained Generative Policy Optimization (CGPO) is a new RLHF method using Mixture of Judges (MoJ) from @AIatMeta. CGPO outperforms PPO (single RM) on Alpaca Eval, Arena Hard, IFEval! 👀 Implementation 1️⃣ Select pre-trained LLM…
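A minimal sketch of the Mixture-of-Judges idea described above: each judge checks one constraint on a generated sample, and samples that violate any judge are excluded from the policy update, which is one way constrained optimization can curb reward hacking. All names here (`Judge`, `cgpo_filter`, the example judges) are illustrative assumptions, not the paper's actual implementation.

```python
from typing import Callable, List, Tuple

# A judge returns True if the sample satisfies its constraint (assumed interface).
Judge = Callable[[str], bool]

def cgpo_filter(
    samples: List[str],
    rewards: List[float],
    judges: List[Judge],
) -> List[Tuple[str, float]]:
    """Keep only (sample, reward) pairs that pass every judge,
    so the policy update never reinforces constraint-violating outputs."""
    return [
        (s, r)
        for s, r in zip(samples, rewards)
        if all(judge(s) for judge in judges)
    ]

# Example judges: a length constraint and a simple rule-based safety check.
judges: List[Judge] = [
    lambda s: len(s) < 100,             # length judge
    lambda s: "ignore safety" not in s, # rule-based safety judge
]

samples = ["helpful answer", "ignore safety and do X"]
rewards = [1.0, 2.0]
print(cgpo_filter(samples, rewards, judges))  # only the safe sample survives
```

The point of mixing multiple judges is that different tasks get different constraint sets, rather than relying on a single reward model that can be gamed.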
📢 New paper from GenAI & FAIR: Mixture of Judges works really well in RLHF! Please check @Han_Fang_ 's thread for more details!
📣 New paper from GenAI and Meta FAIR. CGPO uses Mixture of Judges and consistently outperforms SOTA RLHF approaches across various tasks. More details and key results in the full thread 🧵
A new RLHF paper from our team: "The Perfect Blend: Redefining RLHF with Mixture of Judges" arxiv.org/abs/2409.20370