Hritik Bansal
@hbXNov
CS PhD @UCLA | Intern @MetaAI FAIR | Prev: Bachelor's @IITDelhi, Intern @GoogleDeepMind @AmazonScience | Multimodal ML, Language models | Cricket🏏
📢Scaling test-time compute via generative verification (GenRM) is an emerging paradigm that has been shown to be more efficient than self-consistency (SC) for reasoning. But such claims are misleading☠️ Our compute-matched analysis shows that SC outperforms GenRM across most budgets! 🧵
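For intuition, here is a minimal compute-matched sketch of the two strategies: self-consistency spends the whole generation budget on candidate solutions and majority-votes their answers, while GenRM splits the same budget between solutions and verifier calls. `sample_solution` and `score_with_verifier` are hypothetical stand-ins for LLM calls, not the paper's actual code.

```python
# Minimal compute-matched sketch: both strategies get the same number of model
# generations ("budget"). `sample_solution` and `score_with_verifier` are
# hypothetical callables standing in for LLM calls.
from collections import Counter
from typing import Callable

def self_consistency(sample_solution: Callable[[], str], budget: int) -> str:
    """Spend the whole budget on solutions, then majority-vote the answers."""
    answers = [sample_solution() for _ in range(budget)]
    return Counter(answers).most_common(1)[0][0]

def genrm(sample_solution: Callable[[], str],
          score_with_verifier: Callable[[str], float],
          budget: int, verifications_per_solution: int = 3) -> str:
    """Split the budget between candidate solutions and verifier calls."""
    n_solutions = max(1, budget // (1 + verifications_per_solution))
    candidates = [sample_solution() for _ in range(n_solutions)]
    def avg_score(sol: str) -> float:
        # Average several generative-verifier judgments per candidate.
        scores = [score_with_verifier(sol) for _ in range(verifications_per_solution)]
        return sum(scores) / len(scores)
    # At a fixed budget, GenRM votes over far fewer candidates than SC does.
    return max(candidates, key=avg_score)
```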

LaViDa: A Large Diffusion Language Model for Multimodal Understanding "We introduce LaViDa, a family of VLMs built on DMs. We build LaViDa by equipping DMs with a vision encoder and jointly fine-tune the combined parts for multimodal instruction following." "LaViDa achieves…
🙌 We've released the full version of our paper, OpenVLThinker: Complex Vision-Language Reasoning via Iterative SFT-RL Cycles Our OpenVLThinker-v1.2 is trained through three lightweight SFT → RL cycles, where SFT first “highlights” reasoning behaviors and RL then explores and…
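A rough sketch of the iterative SFT → RL loop as I read it from the thread; every helper here (`sft_finetune`, `rl_finetune`, `collect_traces`, `reward_fn`) is a hypothetical placeholder supplied by the caller, not OpenVLThinker's actual training code.

```python
# Hedged sketch of an iterative SFT -> RL training loop. All helpers are
# hypothetical placeholders passed in by the caller, not OpenVLThinker's API.
def iterative_sft_rl(model, seed_data, prompts,
                     sft_finetune, rl_finetune, collect_traces, reward_fn,
                     n_cycles: int = 3):
    data = seed_data
    for _ in range(n_cycles):
        # SFT "highlights" the target reasoning behaviors present in the data.
        model = sft_finetune(model, data)
        # RL then explores from that initialization with a verifiable reward.
        model = rl_finetune(model, prompts, reward_fn)
        # Keep only high-reward traces to seed the next cycle's SFT data.
        data = [t for t in collect_traces(model, prompts) if reward_fn(t) > 0]
    return model
```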
📢📢📢 Releasing OpenThinker3-1.5B, the top-performing SFT-only model at the 1B scale! 🚀 OpenThinker3-1.5B is a smaller version of our previous 7B model, trained on the same OpenThoughts3-1.2M dataset.
Excited to share that I will join @Meta FAIR (Seattle 🗻) for my final summer internship w/ @ramakanth1729! 🧑‍🎓 Looking forward to meeting new people, learning new things, and chatting about data, algorithms, and evaluation for LLM/VLM reasoning.
🥳 Excited to share that VideoPhy-2 has been awarded 🏆 Best Paper at the World Models Workshop (physical-world-modeling.github.io) #ICML2025! Looking forward to presenting it as a contributed talk at the workshop! 😃 w/ @clarkipeng @YonatanBitton Roman @adityagrover_ @kaiwei_chang…
Video generative models hold the promise of being general-purpose simulators of the physical world 🤖 How far are we from this goal❓ 📢Excited to announce VideoPhy-2, the next edition in the series to test the physical likeness of the generated videos for real-world actions. 🧵
🚨 New work: LLMs still struggle at Event Detection due to poor long-context reasoning and inability to follow task constraints, causing precision and recall errors. We introduce DiCoRe — a lightweight 3-stage Divergent-Convergent reasoning framework to fix this.🧵📷 (1/N)
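The tweet doesn't spell out the three stages, so below is only a hedged sketch of what a divergent-convergent event-detection pipeline could look like; `llm` is a hypothetical text-in/text-out callable and the stage prompts are illustrative, not DiCoRe's.

```python
from typing import Callable, List

def divergent_convergent_ed(llm: Callable[[str], str], document: str,
                            ontology: List[str]) -> List[str]:
    # Stage 1 (divergent): unconstrained brainstorming of candidate events,
    # ignoring the label ontology to maximize recall.
    draft = llm(f"List every event mentioned in this text:\n{document}")
    candidates = [ln.strip("-* ").strip() for ln in draft.splitlines() if ln.strip()]
    # Stage 2 (convergent): ground each candidate onto the allowed event types,
    # recovering precision and enforcing the task constraints.
    grounded = []
    for cand in candidates:
        label = llm(f"Map '{cand}' to one of {ontology}, or answer NONE.").strip()
        if label in ontology:
            grounded.append(label)
    # Stage 3 (verify): a final check that each grounded event is truly present.
    verdict = llm(f"Text:\n{document}\nWhich of these events actually occur? {grounded}")
    return [event for event in grounded if event in verdict]
```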
Announcing OpenThinker3-7B, the new SOTA open-data 7B reasoning model: improving over DeepSeek-R1-Distill-Qwen-7B by 33% on average over code, science, and math evals. We also release our dataset, OpenThoughts3-1.2M, which is the best open reasoning dataset across all data…
🧑🍳Very excited to present LaViDa, one of the first diffusion language models for multimodal understanding! 🌟Unlike autoregressive LMs, you can control the speed-quality tradeoff, and solve constrained generation problems out of the box 📦 🌟 We also release LaViDa-Reason, a…
📢(1/11) Diffusion LMs are fast and controllable at inference time! But why restrict these benefits to text data? We are excited to announce LaViDa, one of the first and fastest large diffusion LMs for vision-language understanding!!
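Why diffusion LMs expose a speed-quality knob: decoding runs for a fixed number of parallel denoising steps, so fewer steps commit more tokens at once (faster, usually worse) and more steps commit fewer (slower, usually better). Below is a hedged MaskGIT-style sketch of this idea, not LaViDa's actual decoder; `model_logits` is a hypothetical callable.

```python
import numpy as np

MASK = -1  # hypothetical mask-token id

def diffusion_decode(model_logits, seq_len: int, steps: int):
    """Fill a fully masked sequence in `steps` parallel denoising iterations."""
    tokens = np.full(seq_len, MASK)
    for step in range(steps):
        logits = model_logits(tokens)  # hypothetical: (seq_len, vocab) array
        probs = np.exp(logits - logits.max(-1, keepdims=True))
        probs /= probs.sum(-1, keepdims=True)
        conf, pred = probs.max(-1), probs.argmax(-1)
        masked = np.flatnonzero(tokens == MASK)
        # Commit the most confident share of still-masked positions so that
        # everything is unmasked after `steps` iterations: the fewer the steps,
        # the more tokens get committed in parallel per step.
        k = int(np.ceil(len(masked) / (steps - step)))
        keep = masked[np.argsort(-conf[masked])[:k]]
        tokens[keep] = pred[keep]
    return tokens
```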
Great to see that the latest #GeminiDiffusion release benchmarks on our challenging general-purpose reasoning dataset, BIG-Bench Extra Hard (BBEH)! It is now available on HF 🤗: huggingface.co/datasets/BBEH/… Eval code: github.com/google-deepmin…
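If you want to poke at the benchmark yourself, here is a minimal `datasets` snippet; the repo id below is a placeholder (the URL in the tweet is truncated), so substitute the real path from the HF link before running.

```python
from datasets import load_dataset

# Placeholder repo id: the URL above is truncated, so look up the real path.
bbeh = load_dataset("BBEH/<dataset-name>")
print(bbeh)  # shows the available splits and columns
```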
Is BIG-Bench Hard too easy for your LLM? We just unleashed BIG-Bench EXTRA Hard (BBEH)! 😈 Every task, harder! Every model, humbled! (Poem Credit: Gemini 2.0 Flash) Massive headroom for progress across various areas in general reasoning 🤯
📢 Submit your cool ideas as short or long papers to the first workshop on the foundations of long video generation, understanding, and evaluation 🚀 ramoscsv.github.io/longvid_founda…
📢 Announcing our 1st Workshop on Long Multi-Scene Video Foundations @ #ICCV2025 (@ICCVConference) in Honolulu, Hawaii! Co-organized by @regev_cohen, @SivanDoveh, @hila_chefer, Jehanzeb Mirza, @hbXNov, @inbar_mosseri, Joao Magalhaes and me. website: ramoscsv.github.io/longvid_founda…