vishal
@vishal_learner
http://ColBERT.ai Maintainer. https://fast.ai community member. Will post about sports occasionally. #FlyEaglesFly
This is incredibly exciting and a tremendous opportunity for me. Studying ColBERT (both the papers and the repo) has been one of my greatest sources of joy this past year. Honored to have the opportunity to contribute to the community!
Welcome @vishal_learner as a new Maintainer of the ColBERT repo! I’ve long loved Vishal’s ColBERT and PLAID deepdives here and on YouTube. Thanks to the many incredible folks who DM’ed.
When you realize that open-source is at the frontier of AI despite:
- fewer GPUs
- less money
- less public and policy support
- no $100M salaries to attract talent
- with closed-source taking advantage and copying all the innovations of open-source without contributing back…
TIL about super().__getstate__() via this PyTorch PR: github.com/pytorch/pytorc…
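For context: super().__getstate__() relies on object.__getstate__(), which only exists as an explicit method since Python 3.11. A minimal sketch of the pattern (illustrative, not the PR's actual code; the PR link above is truncated):

```python
import copy

class Base:
    def __init__(self):
        self.data = [1, 2, 3]

class Child(Base):
    def __getstate__(self):
        # Python 3.11+ provides object.__getstate__(), so subclasses can
        # call super().__getstate__() instead of touching __dict__ directly.
        state = super().__getstate__().copy()  # copy so we don't mutate the live dict
        state.pop("cache", None)  # drop an attribute we don't want serialized
        return state

c = Child()
c.cache = "ephemeral"
c2 = copy.deepcopy(c)        # deepcopy also routes through __getstate__
print(hasattr(c2, "cache"))  # False
print(c2.data)               # [1, 2, 3]
```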
We’ve extended enrollment in our **last** live cohort on AI Evals until the end of this week! Here’s the syllabus (2 lessons per week):
Week 1: Fundamentals & Lifecycle of LLM Application Evaluation, Systematic Error Analysis
Week 2: Implementing Effective Evaluations, …
how much do I have to pay per month to get back emoji reactions during discord voice calls?
Summoning the wisdom of the crowd once again: At the moment, what’s the most GRPOable small (all definitions of small: 7B, sub-4B, sub-2B) model? Is it still the case that Qwen is always ready to learn while the others are more hit and miss?
Seeing ModernBERT and Ettin models being useful is heartwarming
🚀 Introducing GLiClass‑V3 – a leap forward in zero-shot classification! Matches or beats cross-encoder accuracy, while being up to 50× faster. Real-time inference is now possible on edge hardware. huggingface.co/collections/kn… #TextClassification #NLP #ZeroShot #GLiClass
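To illustrate the task being sped up (this uses the generic Hugging Face transformers zero-shot pipeline as a stand-in; it is not GLiClass's own API):

```python
from transformers import pipeline

# NLI cross-encoder baseline: the accuracy bar that GLiClass-style
# models aim to match while being far cheaper at inference time.
classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

result = classifier(
    "The new GPU drivers cut inference latency in half.",
    candidate_labels=["hardware", "sports", "cooking"],
)
print(result["labels"][0])  # highest-scoring label, e.g. "hardware"
```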
I made a simple tutorial on how to fine-tune LLMs using (almost) the same memory as needed for inference.
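The tweet doesn't say which technique the tutorial uses; one common way to get training memory close to inference memory is parameter-efficient fine-tuning, e.g. LoRA via the peft library. A sketch under that assumption (the model name is just an illustrative choice):

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model_name = "Qwen/Qwen2.5-0.5B"  # illustrative; any causal LM works
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# LoRA freezes the base weights and trains small adapter matrices, so
# gradients and optimizer state (the usual memory hogs) only cover a
# tiny fraction of the parameters.
config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, config)
model.print_trainable_parameters()  # e.g. trainable params well under 1%
```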
Been thinking about this, especially during the AI evals course.
Don't leave AI to the STEM folks. They are often far worse at getting AI to do stuff than those with a liberal arts or social science bent. LLMs are built from the vast corpus of human expression, and knowing the history & obscure corners of human works lets you do far more with AI
The PyTorch PR that changes "KB" to "KiB" in `torch.cuda.memory_summary()` because "we're talking powers of 2 not 10" github.com/pytorch/pytorc…
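The distinction the PR enforces, as a quick check:

```python
KB = 10**3    # kilobyte: SI prefix, powers of 10
KiB = 2**10   # kibibyte: IEC prefix, powers of 2 (what allocator stats report)

n_bytes = 8 * 1024 * 1024
print(n_bytes / KB)   # 8388.608 -- labeling this "KB" was misleading
print(n_bytes / KiB)  # 8192.0   -- exact, hence the rename to "KiB"
```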
Nice. Late interaction on the document side, at the granularity of chunks. Just add it on the query side and do MaxSim and voila!
Before: chunk overlaps, context summaries, metadata augmentation
Now: voyage-context-3 processes the full doc in one pass and generates a distinct embedding for each chunk. Each embedding encodes the chunk-level details AND full doc context, for more semantically aware…
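A minimal sketch of the ColBERT-style MaxSim scoring the reply above describes, with one document-side vector per chunk (shapes and names are illustrative, not voyage-context-3's actual interface):

```python
import torch
import torch.nn.functional as F

def maxsim_score(query_embs: torch.Tensor, doc_embs: torch.Tensor) -> torch.Tensor:
    """Late interaction: each query vector takes its best match among the
    document-side vectors (here, one per chunk); the matches are summed.

    query_embs: (num_query_tokens, dim), doc_embs: (num_chunks, dim),
    both L2-normalized so the dot product is cosine similarity.
    """
    sim = query_embs @ doc_embs.T       # (num_query_tokens, num_chunks)
    return sim.max(dim=1).values.sum()  # max over chunks, sum over query tokens

q = F.normalize(torch.randn(8, 128), dim=-1)
d = F.normalize(torch.randn(20, 128), dim=-1)
print(maxsim_score(q, d))
```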
sorry my verdict on Grok-4 is that it is not better than Opus for coding, and not better than o3 for reasoning. I don't think it has been trained on benchmarks, but I think its brain is deep fried into a problem-solution mindset that doesn't extend to real-world situations…
New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵
Cool to see the PyTorch PR that made printing a ModuleList much more concise/cleaner github.com/pytorch/pytorc…
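The behavior being praised (the PR link above is truncated): repeated identical submodules are grouped in the printed repr, roughly like this:

```python
import torch.nn as nn

layers = nn.ModuleList([nn.Linear(256, 256) for _ in range(6)])
print(layers)
# Recent PyTorch collapses the six identical lines into one, roughly:
# ModuleList(
#   (0-5): 6 x Linear(in_features=256, out_features=256, bias=True)
# )
```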
Really like this set of standout ideas. We say a million things in the course reader and I love hearing what sticks / what's practical
Just published a blog post where I highlight 10 ideas that stood out to me from the first lesson and first three chapters of the course reader from the AI evals course taught by @HamelHusain and @sh_reya. vishalbakshi.github.io/blog/posts/202…