vishal
@vishal_learner
http://ColBERT.ai Maintainer. https://fast.ai community member. Will post about sports occasionally. #FlyEaglesFly
This is incredibly exciting and a tremendous opportunity for me. Studying ColBERT (both the papers and the repo) has been one of my greatest sources of joy this past year. Honored to have the opportunity to contribute to the community!
Welcome @vishal_learner as a new Maintainer of the ColBERT repo! I’ve long loved Vishal’s ColBERT and PLAID deepdives here and on YouTube. Thanks to the many incredible folks who DM’ed.
When you realize that open-source is at the frontier of AI despite:
- fewer GPUs
- less money
- less public and policy support
- no $100M salaries to attract talent
- with closed-source taking advantage and copying all the innovations of open-source without contributing back…
TIL about super().__getstate__() via this PyTorch PR: github.com/pytorch/pytorc…
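For context: super().__getstate__() relies on object.__getstate__(), which only exists as an explicit method since Python 3.11. A minimal sketch of the pattern (illustrative, not the PR's actual code; the PR link above is truncated):

```python
import copy

class Base:
    def __init__(self):
        self.data = [1, 2, 3]

class Child(Base):
    def __getstate__(self):
        # Python 3.11+ provides object.__getstate__(), so subclasses can
        # call super().__getstate__() instead of touching __dict__ directly.
        state = super().__getstate__().copy()  # copy so we don't mutate the live dict
        state.pop("cache", None)  # drop an attribute we don't want serialized
        return state

c = Child()
c.cache = "ephemeral"
c2 = copy.deepcopy(c)        # deepcopy also routes through __getstate__
print(hasattr(c2, "cache"))  # False
print(c2.data)               # [1, 2, 3]
```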
We’ve extended enrollment in our **last** live cohort on AI Evals until the end of this week! Here’s the syllabus (2 lessons per week):
Week 1: Fundamentals & Lifecycle of LLM Application Evaluation, Systematic Error Analysis
Week 2: Implementing Effective Evaluations, …
how much do I have to pay per month to get back emoji reactions during discord voice calls?
Summoning the wisdom of the crowd once again: At the moment, what’s the most GRPOable small (all definitions of small: 7B, sub-4B, sub-2B) model? Is it still the case that Qwen is always ready to learn while the others are more hit and miss?
Seeing ModernBERT and Ettin models being useful is heartwarming
🚀 Introducing GLiClass‑V3 – a leap forward in zero-shot classification! Matches or beats cross-encoder accuracy, while being up to 50× faster. Real-time inference is now possible on edge hardware. huggingface.co/collections/kn… #TextClassification #NLP #ZeroShot #GLiClass
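To illustrate the task being sped up (this uses the generic Hugging Face transformers zero-shot pipeline as a stand-in; it is not GLiClass's own API):

```python
from transformers import pipeline

# NLI cross-encoder baseline: the accuracy bar that GLiClass-style
# models aim to match while being far cheaper at inference time.
classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

result = classifier(
    "The new GPU drivers cut inference latency in half.",
    candidate_labels=["hardware", "sports", "cooking"],
)
print(result["labels"][0])  # highest-scoring label, e.g. "hardware"
```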
I made a simple tutorial on how to fine-tune LLMs using (almost) the same memory as needed for inference.
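The tweet doesn't say which technique the tutorial uses; one common way to get training memory close to inference memory is parameter-efficient fine-tuning, e.g. LoRA via the peft library. A sketch under that assumption (the model name is just an illustrative choice):

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model_name = "Qwen/Qwen2.5-0.5B"  # illustrative; any causal LM works
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# LoRA freezes the base weights and trains small adapter matrices, so
# gradients and optimizer state (the usual memory hogs) only cover a
# tiny fraction of the parameters.
config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, config)
model.print_trainable_parameters()  # e.g. trainable params well under 1%
```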
Been thinking about this, especially during the AI evals course.
Don't leave AI to the STEM folks. They are often far worse at getting AI to do stuff than those with a liberal arts or social science bent. LLMs are built from the vast corpus of human expression, and knowing the history & obscure corners of human works lets you do far more with AI
The PyTorch PR that changes "KB" to "KiB" in `torch.cuda.memory_summary()` because "we're talking powers of 2 not 10" github.com/pytorch/pytorc…
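The distinction the PR enforces, as a quick check:

```python
KB = 10**3    # kilobyte: SI prefix, powers of 10
KiB = 2**10   # kibibyte: IEC prefix, powers of 2 (what allocator stats report)

n_bytes = 8 * 1024 * 1024
print(n_bytes / KB)   # 8388.608 -- labeling this "KB" was misleading
print(n_bytes / KiB)  # 8192.0   -- exact, hence the rename to "KiB"
```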
Nice. Late interaction on the document side, at the granularity of chunks. Just add it on the query side and do MaxSim and voila!
Before: chunk overlaps, context summaries, metadata augmentation
Now: voyage-context-3 processes the full doc in one pass and generates a distinct embedding for each chunk. Each embedding encodes the chunk-level details AND full doc context, for more semantically aware…
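A minimal sketch of the ColBERT-style MaxSim scoring the reply above describes, with one document-side vector per chunk (shapes and names are illustrative, not voyage-context-3's actual interface):

```python
import torch
import torch.nn.functional as F

def maxsim_score(query_embs: torch.Tensor, doc_embs: torch.Tensor) -> torch.Tensor:
    """Late interaction: each query vector takes its best match among the
    document-side vectors (here, one per chunk); the matches are summed.

    query_embs: (num_query_tokens, dim), doc_embs: (num_chunks, dim),
    both L2-normalized so the dot product is cosine similarity.
    """
    sim = query_embs @ doc_embs.T       # (num_query_tokens, num_chunks)
    return sim.max(dim=1).values.sum()  # max over chunks, sum over query tokens

q = F.normalize(torch.randn(8, 128), dim=-1)
d = F.normalize(torch.randn(20, 128), dim=-1)
print(maxsim_score(q, d))
```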
sorry my verdict on Grok-4 is that it is not better than Opus for coding, and not better than o3 for reasoning. I don't think it has been trained on benchmarks, but I think its brain is deep fried into a problem-solution mindset that doesn't extend to real-world situations…
New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵
Cool to see the PyTorch PR that made printing a ModuleList much more concise/cleaner github.com/pytorch/pytorc…
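The behavior being praised (the PR link above is truncated): repeated identical submodules are grouped in the printed repr, roughly like this:

```python
import torch.nn as nn

layers = nn.ModuleList([nn.Linear(256, 256) for _ in range(6)])
print(layers)
# Recent PyTorch collapses the six identical lines into one, roughly:
# ModuleList(
#   (0-5): 6 x Linear(in_features=256, out_features=256, bias=True)
# )
```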
Really like this set of standout ideas. We say a million things in the course reader and I love hearing what sticks / what's practical
Just published a blog post where I highlight 10 ideas that stood out to me from the first lesson and first three chapters of the course reader from the AI evals course taught by @HamelHusain and @sh_reya. vishalbakshi.github.io/blog/posts/202…