Vahid Kazemi
@VahidK
PhD in machine learning. KTH 14. Ex @OpenAI, @Apple, @Waymo, @Google.
Finally finished editing my video. Episode 02: Build your own language model: youtube.com/watch?v=7ny8Sr….
In my opinion we have already achieved AGI, and it's even clearer with O1. We have not achieved "better than any human at any task," but what we have is "better than most humans at most tasks." Some say LLMs only know how to follow a recipe. Firstly, no one can really explain…
I have seen many variations of "AI can't invent new things," but if you look at the theory of how machine learning models are trained, there's nothing that supports this claim. "Neural nets can't generalize" is just plain wrong. In terms of machine learning theory, there's no…
I shared a controversial take the other day at an event, and I decided to write it down in a longer format: I'm afraid AI won't give us a "compressed 21st century". The "compressed 21st century" comes from Dario's "Machines of Loving Grace", and if you haven't read it, you probably…
It's incredible that we live in a time when a paper casually uploaded to arXiv can wipe $1 trillion off the stock market. To me, it makes sense to see a correction in Nvidia's and some cloud providers' valuations, but cheaper AI means everyone else is a winner.
I was thinking about making a new video on AI. I made this one last year, which covers the basics of ML: youtube.com/watch?v=RZMpet… After that, I gained a great appreciation for YouTube content creators! It's really hard work, and a full-time job, even with mediocre production value.…
Went to watch the new Lion King in the theater earlier this week and ended up leaving 15 minutes in. Went back home and rewatched the first Pirates of the Caribbean instead, for the fifth time or so, and once again confirmed it's a masterpiece! What happened to Disney? From the quality…
One common theme I find in what works in ML right now is that, so far, the best-performing ideas are the simplest ones. Two notable examples are transformers and diffusion models, which are conceptually much simpler than the vast majority of work that came before them. So why didn't we…
Currently, AI video generation is 2-3 orders of magnitude slower than GPU-accelerated rendering in video games. As GPUs get faster and with algorithmic improvements to generative models, it may be feasible in a few years to have AI game consoles that just run neural net games.
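A rough back-of-the-envelope check on that gap, with illustrative numbers (the assumed per-frame generation time is a placeholder, not a measurement):

```python
import math

game_frame_s = 1 / 60  # ~16 ms per frame at 60 fps, a typical game target
gen_frame_s = 5.0      # assumed seconds per AI-generated frame (illustrative)

gap = gen_frame_s / game_frame_s
print(f"{gap:.0f}x slower, ~{math.log10(gap):.1f} orders of magnitude")
# -> 300x slower, ~2.5 orders of magnitude
```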
A near-magical experience for people who work with LLMs is when you finetune an LLM on a complex task with a handful of examples. The LLM often generalizes beyond your expectations. It's as if the model gets what you are trying to teach it with the examples and doesn't get bogged…
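For anyone who hasn't tried it, a minimal sketch of what "finetune on a handful of examples" looks like, here with Hugging Face transformers; the model choice, toy task, and hyperparameters are all illustrative assumptions:

```python
# Minimal few-example finetuning sketch (illustrative, not a recipe).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; any causal LM works the same way
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# A handful of examples for a toy string-reversal task.
examples = [
    "Q: reverse 'abc' A: cba",
    "Q: reverse 'hello' A: olleh",
    "Q: reverse 'gpt' A: tpg",
]

opt = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for _ in range(10):  # a few passes over the tiny dataset
    for text in examples:
        batch = tok(text, return_tensors="pt")
        # labels == input_ids gives the standard next-token LM loss
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        opt.step()
        opt.zero_grad()
```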
What nonsense. I worked closely with the team that built the transformer architecture at Google and followed the internal discussions pretty closely at the time. The key ingredients of the transformer architecture are: 1. Outer product attention: which can be traced back to…
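For context, the "outer product attention" in question is what's now usually written as scaled dot-product attention. A minimal single-head sketch (shapes and names are illustrative):

```python
import torch
import torch.nn.functional as F

def attention(q, k, v):
    # q, k, v: (seq_len, d); single head, no masking, for brevity
    scores = q @ k.T / k.shape[-1] ** 0.5  # all query-key dot products
    weights = F.softmax(scores, dim=-1)    # each row sums to 1
    return weights @ v                     # attention-weighted mix of values

q = k = v = torch.randn(4, 8)
out = attention(q, k, v)  # (4, 8)
```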
Wow! "Attention is All You Need" (i.e., Transformers) was inspired by the Alien's communication style in the movie Arrival.
The AI industry is shifting from training multiple models from scratch to continuously training a single unified model. This marks a new paradigm with many open research questions. How can we build a future-proof model that can be improved and expanded for years to come?
The Wild Robot is such a great movie. The best I’ve watched this year so far.
I used to be a fan of monorepos. Over time, I have come to believe the cons outweigh the benefits. My preference now is to make small, low-dependency libraries, which naturally encourages decoupled code. Also, once in a while, starting a new repo from scratch with no baggage can…
One of the software engineering malpractices I often see is prematurely writing general code. Almost always, it's better to start with something specific and minimal and generalize as real use cases grow, rather than designing for imaginary general use cases and locking yourself…
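A toy illustration of that point, with entirely hypothetical function names: the specific version serves the one real use case, while the "general" one commits you to an API nobody has asked for yet.

```python
import csv

# Specific and minimal: does exactly what today's caller needs.
def load_user_csv(path: str) -> list[dict]:
    with open(path) as f:
        return list(csv.DictReader(f))

# Prematurely general: every unused knob is an API you now have to
# support, document, and work around forever.
def load_records(path, fmt="csv", encoding="utf-8", schema=None,
                 validators=None, on_error="raise", plugins=None):
    ...
```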
I'm sure there are use cases for which RAG makes sense, but most applications I've seen (e.g., Bing's Copilot, Google's AI Overviews) seem like products no one asked for, worse than both regular search and vanilla LLM chatbots. LLMs have enormous memorization capability, if…
I've been mostly writing PyTorch code since I've been out of Google, but I'm revisiting Jax again. It seems like there are now more Jax frameworks than Jax engineers. I like that Jax has resisted the urge to impose a framework. Yet, I think Google should have standardized a single…
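What "no imposed framework" buys you in practice: plain functions plus jit and grad go a long way. A minimal sketch (the toy linear model is an illustrative assumption):

```python
import jax
import jax.numpy as jnp

def loss(params, x, y):
    w, b = params
    return jnp.mean((x @ w + b - y) ** 2)  # mean squared error

@jax.jit
def step(params, x, y, lr=0.1):
    grads = jax.grad(loss)(params, x, y)
    # params is just a pytree (here a plain tuple); no Module class needed
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (32, 3))
y = x @ jnp.ones(3) + 1.0              # synthetic targets
params = (jnp.zeros(3), 0.0)
for _ in range(100):
    params = step(params, x, y)
```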
Watched Dune Part II earlier today, and even though it was nearly 3 hours, I didn't want it to end. What a magnificent movie. Human creativity is a marvelous thing.