Patel Maitreya
@patelmaitreya
Research Intern @Adobe | PhD at @ApgAsu @ASU | Vision & Language | T2I Diffusion Modeling | Prev. @SonyAI_global @Adobe
🚀 Introducing FlowChef: "Steering Rectified Flow Models in the Vector Field for Controlled Image Generation"! 🌌✨
💡 Key Highlights:
- Perform image editing, solve inverse problems, and more.
- Achieves inversion-free, gradient-free, and training-free inference-time steering!…
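For intuition, here is a minimal sketch of what inference-time steering of a rectified-flow Euler sampler can look like. It assumes the interpolation x_t = (1 - t)·noise + t·data, a black-box velocity model, and a differentiable cost on the one-step clean estimate; `velocity_model`, `cost_fn`, and `guidance_scale` are hypothetical names, and this is an illustrative sketch, not FlowChef's exact algorithm.

```python
import torch

def steered_rectified_flow_sample(velocity_model, cost_fn, x, num_steps=50, guidance_scale=1.0):
    """Illustrative sketch: steer a rectified-flow Euler sampler at inference time.

    Assumes x_t = (1 - t) * noise + t * data, so the one-step clean estimate
    is x1_hat = x_t + (1 - t) * v. The cost gradient is taken w.r.t. the
    current sample through that linear estimate only (the model output is
    detached), so no gradients flow through the network itself.
    """
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        with torch.no_grad():
            v = velocity_model(x, t)                 # black-box velocity field
        x = x.detach().requires_grad_(True)
        x1_hat = x + (1.0 - t) * v                   # one-step clean estimate
        loss = cost_fn(x1_hat)                       # e.g., a masked-region editing loss
        grad = torch.autograd.grad(loss, x)[0]       # gradient w.r.t. the sample only
        with torch.no_grad():
            x = x + dt * v - guidance_scale * grad   # Euler step plus steering nudge
    return x
```

Detaching the model output is what keeps a scheme like this gradient-free with respect to the network: the cost gradient flows only through the linear clean-sample estimate, never through a backward pass of the model.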
First time I’m not feeling FOMO over the OpenAI agent — I’ve got early access to @yutori_ai and it’s way, way better 😁 The team has absolutely nailed the use case. Not surprised — they’re brilliant.
I mean, the Windsurf/ScaleAI saga was wild to watch… but honestly, it feels unfair to all the early employees who took the biggest risks. If this keeps happening, why would anyone join a startup early? 🚩 Something’s gotta change.
🚨BREAKING: GOOGLE TO PAY $2.4 BILLION FOR WINDSURF STAFF AND IP

we are so back.
Only in the Bay Area: Had a full convo on the context limitations of generative vision models with a random guy on Caltrain. No intros, no names — just shared frustration with 224x224 and short memories. 🤝 #GenAI #BayArea
Okay… diffusability is a real concern. 🙁 Any post-hoc solution without equivariance training?
I have been wondering about this for months now. This makes total sense. Gotta change my scripts. 🏃➡️
I'm glad to see that someone finally wrote a paper on this low-hanging fruit, an idea I shared on X back in December and also included in invited talks. Check it out: x.com/patelmaitreya/…
FlowMo Variance-Based Flow Guidance for Coherent Motion in Video Generation
Lately, so many incredible projects are being released—it’s inspiring and a little overwhelming. Some days I wake up excited to be part of this space. Other days, I wonder if what I’m building will still matter tomorrow. That back-and-forth is constant.
I have lost confidence in all papers that use only Qwen to demonstrate alignment improvements. How can we trust that those gains aren't driven by spurious biases or methodological quirks? Interesting work, though.
🤯 We cracked RLVR with... Random Rewards?!
Training Qwen2.5-Math-7B with our Spurious Rewards improved MATH-500 by:
- Random rewards: +21%
- Incorrect rewards: +25%
- (FYI) Ground-truth rewards: +28.8%
How could this even work⁉️ Here's why: 🧵
Blogpost: tinyurl.com/spurious-rewar…
I agree with this. There are still many great papers that predate the recent diffusion LLM hype that are being overlooked. Hint: At least check the #ICLR2025 papers.
Tbh, there are many good papers coming out of academic labs too, for example on diffusion LLMs, but they didn't really get attention until Google released its own work on the topic. Perhaps you should also tweet a bit about academic research :)
Veo 3 is here, and in addition to better visuals, it makes noises and speaks! This was a massive effort made possible by incredible passion from the whole Veo team and the many other teams that enabled it to launch today. Looking forward to seeing what others do with it! #veo3
Whoa… just got access to the model, and it's phenomenal. It solves many of the classic LLM problems right out of the gate. Time to put it through its paces. Wish I knew the parameter count for a fair comparison, but I'll benchmark it soon enough.
We’ve developed Gemini Diffusion: our state-of-the-art text diffusion model. Instead of predicting text directly, it learns to generate outputs by refining noise, step-by-step. This helps it excel at coding and math, where it can iterate over solutions quickly. #GoogleIO
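To make the "refine step-by-step" idea concrete, here is a toy sketch of diffusion-style text generation via iterative parallel refinement, in the spirit of masked/discrete diffusion decoders. Gemini Diffusion's actual architecture and sampler are unpublished, so `model`, `mask_id`, and the commit schedule below are all assumptions for illustration, not the real algorithm.

```python
import torch

def iterative_refinement_decode(model, seq_len, vocab_size, mask_id, num_steps=8):
    """Toy sketch of diffusion-style text generation by iterative refinement.

    Starts from a fully masked sequence and, at each step, commits the
    highest-confidence token predictions while re-masking the rest, so the
    whole sequence is revised in parallel over a few steps (assumes
    seq_len >= num_steps so at least one token is committed per step).
    """
    tokens = torch.full((1, seq_len), mask_id, dtype=torch.long)
    for step in range(num_steps):
        logits = model(tokens)                   # (1, seq_len, vocab_size)
        probs = logits.softmax(dim=-1)
        conf, pred = probs.max(dim=-1)           # per-position confidence and argmax
        k = int(seq_len * (step + 1) / num_steps)  # commit more positions each step
        keep = conf[0].topk(k).indices
        new_tokens = torch.full_like(tokens, mask_id)
        new_tokens[0, keep] = pred[0, keep]      # keep confident tokens, re-mask the rest
        tokens = new_tokens
    return tokens
```

Because every position is predicted in parallel and can be revised on later steps, this style of decoder can cheaply iterate over whole solutions, which is the property the announcement highlights for coding and math.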
I’m seriously getting FOMO.
Just had a Final Destination-type moment on my way to the Bay Area. Life really threw a plot twist today. Luckily, I didn't get hurt; I just ended up with a big logistical task on my to-do list. 😮‍💨
It just came to me that not having time conditioning won't be as bad as I thought. If it scales well, then maybe we could just use it as a post-processing method to further enhance generation quality on top of consistency models. Hmmm.... 🤔
This reminds me of several experiments I did years ago.... The correlation between x(t) and t likely explains why models can perform well without time conditioning. However, key factors like training stability, the number of function evaluations (NFEs), and trajectory…
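A quick toy check of that correlation, assuming the rectified-flow interpolation x_t = (1 - t)·x0 + t·ε: per-sample statistics of x_t alone already track t closely, which is consistent with a network inferring the timestep from its input rather than needing explicit time conditioning. The data distribution and dimensions below are arbitrary stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 10_000, 1024                              # toy sample count and dimension
x0 = rng.uniform(-1.0, 1.0, size=(n, d))         # stand-in "data" in [-1, 1]
t = rng.uniform(0.0, 1.0, size=(n, 1))
eps = rng.standard_normal((n, d))
x_t = (1.0 - t) * x0 + t * eps                   # rectified-flow interpolation

# Var[x_t] = (1 - t)^2 * Var[x0] + t^2, so the per-sample spread of x_t
# carries a strong signal about t without any explicit conditioning.
per_sample_std = x_t.std(axis=1)
corr = np.corrcoef(per_sample_std, t[:, 0])[0, 1]
print(f"corr(std(x_t), t) = {corr:.3f}")         # close to 1 in this toy setup
```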
This is the best part of the conference! 😆
This ICLR is the best conference ever. Attendees are extremely friendly and cuddly. …What do you mean, this is the wrong hall?