Shivam Duggal
@ShivamDuggal4
PhD Student @MIT | Prev: Carnegie Mellon University @SCSatCMU | Research Scientist @UberATG
Compression is the heart of intelligence. From Occam to Kolmogorov, shorter programs mean smarter representations. Meet KARL: Kolmogorov-Approximating Representation Learning. Given an image, a token budget T, and a target quality 𝜖, KARL finds the smallest t ≤ T that reconstructs the image within 𝜖 🧵
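The objective above can be sketched in a few lines. This is a conceptual toy, not the KARL implementation: `recon_error` is a hypothetical stand-in whose error shrinks monotonically as more tokens are spent, and the search is the naive baseline that KARL's single-pass predictor replaces.

```python
# Toy stand-in (hypothetical): reconstruction error halves per extra token.
def recon_error(image, t):
    return 1.0 / (2 ** t)

def min_tokens(image, T, eps):
    """Smallest t <= T with recon_error(image, t) <= eps, else None."""
    for t in range(T + 1):
        if recon_error(image, t) <= eps:
            return t
    return None

print(min_tokens("img", T=8, eps=0.1))  # 1/2^t first drops to <= 0.1 at t=4
```

The point of the paper is that this per-image search is avoidable: a learned model can predict the budget directly.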

For @NeurIPSConf, we can't update the main PDF or upload a separate rebuttal PDF, so there's no way to include any new images or visual results. What if reviewers ask for more vision experiments? 🥲 Any suggestions or workarounds?
Great work from great people! @mihirp98 @pathak2206 AR aligns with compression theory (Kolmogorov complexity, MDL, arithmetic coding), but diffusion is MLE too. Can we interpret diffusion similarly? Curious how compression explains AR vs. diffusion scaling laws. (Ilya’s talk touches on this too.)
🚨 The era of infinite internet data is ending. So we ask: 👉 What’s the right generative modelling objective when data, not compute, is the bottleneck? TL;DR: ▶️ Compute-constrained? Train autoregressive models. ▶️ Data-constrained? Train diffusion models. Get ready for 🤿 1/n
Indeed! I find H-Net to be closely related to KARL, and even our earlier work ALIT (the recurrent tokenizer in the figure below) shares strong connections. Loved reading H-Net, like all of @_albertgu’s work. Congrats to @sukjun_hwang and team!
Single-pass Adaptive Image Tokenization for Minimum Program Search KARL is a single-pass adaptive image tokenizer that predicts how many tokens are needed based on Kolmogorov Complexity, without test-time search. It halts once enough information is captured, using token count as…
Our new work on adaptive image tokenization: Image → T tokens * variable T, based on image complexity * a single forward pass both infers T and tokenizes to T tokens * approximates a minimum description length encoding of the image
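The single-pass behavior described above can be sketched as follows. All names here are hypothetical illustrations, not the actual architecture: a `halt_score` head stands in for the learned prediction of whether enough information has been captured, so no test-time search over token counts is needed.

```python
# Hedged sketch of single-pass adaptive tokenization (names hypothetical):
# the encoder emits tokens one at a time, and a halting head decides when
# the image is sufficiently described, instead of searching over budgets.
def encode_adaptive(image, max_tokens, halt_score, threshold=0.9):
    tokens = []
    for i in range(max_tokens):
        tokens.append(f"tok{i}")            # stand-in for a learned token
        if halt_score(image, i + 1) >= threshold:
            break                           # halting head says "enough"
    return tokens

# Toy halting score (hypothetical): confidence grows with token count.
score = lambda img, t: t / 4
print(len(encode_adaptive("img", max_tokens=8, halt_score=score)))  # 4
```

A more complex image would yield lower halting scores per token, so more tokens get spent, which is the MDL-style behavior the tweet describes.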
Announcing Ambient Diffusion Omni — a framework that uses synthetic, low-quality, and out-of-distribution data to improve diffusion models. State-of-the-art ImageNet performance. Strong text-to-image results in just 2 days on 8 GPUs. Filtering ❌ Clever data use ✅
Your bimanual manipulators might need a Robot Neck 🤖🦒 Introducing Vision in Action: Learning Active Perception from Human Demonstrations ViA learns task-specific, active perceptual strategies—such as searching, tracking, and focusing—directly from human demos, enabling robust…