njkumarr

@njkumarr

research engineer

SF/NYC

Joined July 2015

485 Following

393 Followers

Pinned

njkumarr@njkumarr · Jul 23

New blogpost and demo! I optimized a Flappy Bird world model to run locally in my web browser (30 FPS) (demo and blog in replies)

6.0K

njkumarr@njkumarr · Jul 17

There are some cool insights here. I would love to see someone create an index on how usage patterns and behaviors look for different disciplines of software. How does Claude Code usage look for someone working in research vs product? Are they using these tools differently?

Quentin Anthony@QuentinAnthon15 · Jul 12

I was one of the 16 devs in this study. I wanted to speak on my opinions about the causes and mitigation strategies for dev slowdown. I'll say as a "why listen to you?" hook that I experienced a -38% AI-speedup on my assigned issues. I think transparency helps the community.

384

njkumarr@njkumarr · Jul 17

ControlNet for world models 🧙‍♂️ I can see Snapchat/Tiktok ML teams eventually adding something like this in the future

Decart@DecartAI · Jul 17

Introducing MirageLSD: The First Live-Stream Diffusion (LSD) AI Model. Input any video stream, from a camera or video chat to a computer screen or game, and transform it into any world you desire, in real-time (<40ms latency). Here’s how it works (w/ demo you can use!):

457

njkumarr@njkumarr · Jul 6

reading another AI written kernels optimization article

211

njkumarr@njkumarr · Jul 3

this sounds bad and I think gpu allocation isn't well provisioned at Meta, but these duct tape solutions are common in a lot of ML workflows. i've had to write similar scripts at my job as well even with a lot of GPU resources.

Zeyuan Allen-Zhu, Sc.D.@ZeyuanAllenZhu · Jul 2

1) Use cloud storage (AWS) to store checkpoints; 2) use wandb to communicate across clusters; 3) schedule identical jobs across clusters, if one starts, the rest auto-kill; 4) an infinite-loop job re-submitter to detect if a job has not finished, then resubmit, etc. A complex system.

313

njkumarr@njkumarr · Jun 27

interesting how transliteration affects some languages more than others, my hunch is that this might be a data distribution issue

Tessa Barton@tessybarton · Jun 23

GPT 4.1 is as good as GPT 4.5 for English and Spanish. But for Indic Languages I see a whole letter grade difference in basic math. Questions like "If I have 4 eggs and eat 2 how many are left?" Does distillation come at the cost of multilingual performance?

288

njkumarr@njkumarr · Jun 24

no new posts in r/LocalLLaMA for the past 2 days because the sole moderator left and set auto-moderator to delete every new post/comment. seems really dumb

370

njkumarr Retweeted

TBPN@tbpn · Jun 10

We asked @SaiVC_ from @Redpoint about the biggest opportunities in AI. "If you look at the application with the strongest PMF within AI, it's coding." "With moderate assumptions, it's like a $1.6T market in terms of spend." "It's like the mother of all markets."

8.0K

njkumarr@njkumarr · Jun 10

We asked @SaiVC_ from @Redpoint to break down Meta's $14B Scale deal. "I think it's a brilliant acquisition." "We're going to look back and it's going to be like the Instagram deal." "You need accurate classified data to feed these large language models."

The Information@theinformation · Jun 10

Scoop from @coryweinberg: Meta to pay more than $14 billion for stake in Scale AI and grab the startup’s founder in an unusual deal aimed at bolstering its AI effort. Read more 👇 theinformation.com/articles/meta-…

28.0K

njkumarr@njkumarr · Mar 20

karpathy uses bearblog

Andrej Karpathy@karpathy · Mar 19

Seeding my Bear ʕ•ᴥ•ʔ blog with more random posts, e.g. here's something I had on backlog for a while: # The append-and-review note An approach to note taking that I stumbled on and has worked for me quite well for many years. I find that it strikes a good balance of being…

836

njkumarr Retweeted

Dianadotlu@Dianadotlu · Feb 18

Hey yall! I am looking for a summer internship, please hit me up!!!!! Also made some new updates to my portfolio, check it out 😉 diana.lu

2.0K

1.0K

158.0K

njkumarr Retweeted

Zach Nussbaum@zach_nussbaum · Feb 13

many embedding models, especially multilingual ones, have been scaled up from BERT-base sized to 7B Mistral-sized models. but why haven't embeddings taken a page out of LLMs and leveraged Mixture of Experts? 🤔 enter nomic-embed-text-v2-moe!

4.0K