Daniel Israel
@danielmisrael
PhD Student Studying AI/ML @UCLA
“That’s one small [MASK] for [MASK], a giant [MASK] for mankind.” – [MASK] Armstrong Can autoregressive models predict the next [MASK]? It turns out yes, and quite easily… Introducing MARIA (Masked and Autoregressive Infilling Architecture) arxiv.org/abs/2502.06901
Had the pleasure of learning about TRACE by Gwen Yidou-Weng, Benjie Wang, and @guyvdb at ICML! It views alignment/controlled decoding through a Bayesian lens and derives a simple, principled, and effective new method. I highly recommend reading this paper!
🚀 Introducing PhysiX: One of the first large-scale foundation models for physics simulations! PhysiX is a 4.5B parameter model that unifies a wide range of physical systems, from fluid dynamics to reaction-diffusion, outperforming specialized, state-of-the-art models.
(1/6) Our work Reflect-DiT was accepted to #ICCV2025! Reflect-DiT allows the model to reflect on its past generations and textual feedback to self-correct and improve, extending reasoning to text-to-image generation.
The unreasonable effectiveness of model merging for cross-lingual transfer! Our preprint evaluates a number of *modular* approaches to fine-tuning LLMs that "assign" model params to either task or language. Surprisingly, merging experts beats all! 🧵1/4 arxiv.org/abs/2505.18356
📢(1/11) Diffusion LMs are fast and controllable at inference time! But why restrict these benefits to text data? We are excited to announce LaViDa, one of the first and fastest large diffusion LMs for vision-language understanding!!
📢Scaling test-time compute via generative verification (GenRM) is an emerging paradigm that has been shown to be more efficient than self-consistency (SC) for reasoning. But such claims are misleading ☠️ Our compute-matched analysis shows that SC outperforms GenRM across most budgets! 🧵
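A toy sketch of what "compute-matched" means here, assuming each sampled solution and each generative verification costs roughly one model call; the paper's exact accounting and budgets may differ.

# Toy compute accounting for SC vs. GenRM (illustrative assumption:
# cost is proportional to the number of LLM calls).
def sc_cost(num_solutions: int) -> int:
    # Self-consistency: sample N solutions, majority-vote the answer.
    return num_solutions

def genrm_cost(num_solutions: int, verifications_per_solution: int) -> int:
    # GenRM: sample S solutions, then generate V verifications per solution.
    return num_solutions * (1 + verifications_per_solution)

budget = 32  # total model calls
print("SC solutions at this budget:", budget)                      # 32 votes
print("GenRM solutions at this budget (V=3):", budget // (1 + 3))  # only 8 solutions left to verify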
What happens if we tokenize cat as [ca, t] rather than [cat]? LLMs are trained on just one tokenization per word, but they still understand alternative tokenizations. We show that this can be exploited to bypass safety filters without changing the text itself. #AI #LLMs #Token
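A minimal sketch of the underlying observation, using Hugging Face's GPT-2 tokenizer purely for illustration (the models and probing setup in the paper are not shown here): the same surface string admits many token sequences, but training only ever uses the canonical one.

# Illustrative only: per-character tokens are guaranteed to exist in GPT-2's
# byte-level vocabulary, so they form one valid alternative segmentation.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

text = "cat"
canonical_ids = tok.encode(text)                      # the tokenizer's usual segmentation
alt_ids = tok.convert_tokens_to_ids(["c", "a", "t"])  # an alternative segmentation of the same string

# Both id sequences decode back to the identical surface text,
# even though the model was trained on only the canonical segmentation.
assert tok.decode(canonical_ids) == tok.decode(alt_ids) == text
print(canonical_ids, alt_ids)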
Video generative models hold the promise of being general-purpose simulators of the physical world 🤖 How far are we from this goal❓ 📢Excited to announce VideoPhy-2, the next edition in the series to test the physical plausibility of generated videos of real-world actions. 🧵
A few months ago, we started Inception Labs, a new generative AI startup with a rockstar founding team. At Inception, we are challenging the status quo for language generation. Our first results bring blazing fast speeds at 1000+ tokens/sec while matching the quality of leading…
Excited to release PrefEval (ICLR '25 Oral), a benchmark for evaluating LLMs’ ability to infer, memorize, and adhere to user preferences in long-context conversations! ⚠️We find that cutting-edge LLMs struggle to follow user preferences—even in short contexts. This isn't just…
Enabling Autoregressive Models to Fill In Masked Tokens: a hybrid autoregressive and masked language model for infilling, built by training a linear decoder that takes their concatenated hidden states as input. Provides faster inference with KV caching. MARIA significantly outperforms…
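A minimal sketch of the hybrid head described above, with illustrative hidden sizes and random features standing in for the real AR/MLM backbones; this is an assumption-laden toy, not MARIA's actual implementation.

import torch
import torch.nn as nn

class HybridInfillingHead(nn.Module):
    """Predict masked tokens from the concatenated hidden states of an
    autoregressive model and a masked language model (both assumed frozen)."""

    def __init__(self, ar_dim: int, mlm_dim: int, vocab_size: int):
        super().__init__()
        # Single linear decoder over the concatenated representations.
        self.decoder = nn.Linear(ar_dim + mlm_dim, vocab_size)

    def forward(self, ar_hidden: torch.Tensor, mlm_hidden: torch.Tensor) -> torch.Tensor:
        # ar_hidden:  (batch, seq, ar_dim)  from the autoregressive model (KV-cacheable)
        # mlm_hidden: (batch, seq, mlm_dim) from the masked language model
        fused = torch.cat([ar_hidden, mlm_hidden], dim=-1)
        return self.decoder(fused)  # (batch, seq, vocab_size) logits over masked positions

# Toy usage with random features standing in for real model outputs.
head = HybridInfillingHead(ar_dim=768, mlm_dim=768, vocab_size=50_000)
logits = head(torch.randn(1, 16, 768), torch.randn(1, 16, 768))
print(logits.shape)  # torch.Size([1, 16, 50000])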
I really enjoyed contributing to this project and am excited to share what we have built!
Natively multimodal models unlock new possibilities for biomedical AI 🥼 assistants, from answering questions about images to generating them for decision-making. Thrilled to introduce MedMax, an open, state-of-the-art multimodal model designed for diverse biomedical tasks and domains🩻
You have some model/knowledge (e.g. Bayes Net, Probabilistic/Logic Program, DB) and some query (e.g. MAP, Causal Adjustment) you want to ask. When can you compute this efficiently? Find out @ NeurIPS today in Poster Session 6 East, #3801. Paper: arxiv.org/abs/2412.05481