vespa.ai
@vespaengine
http://Vespa.ai - the open source platform for combining data and AI, online. Vectors/tensors, full-text, structured data; ML model inference at scale.
Announcing: The RAG Blueprint. Build RAG like the world's most successful applications. Start from our open source sample app, which contains all you need to achieve world-class quality at any scale. Sample app: github.com/vespa-engine/s… Blog post: blog.vespa.ai/the-rag-bluepr…
If you want to learn more about @vespaengine, you might find our playlists interesting. Lots of podcasts and conference presentations on various topics here: youtube.com/@vespaai/playl…
Recommended weekend listening: this podcast from AWS with our CEO.
My podcast with AWS is out: podcasts.apple.com/nl/podcast/jon… Some of what we talked about:
- Even a superintelligent LLM won't help if you can't give it the right data.
- This is called "relevance", and is Not Exactly a New Problem.
- With deep research, we're seeing query load exploding…
We've updated our ES comparison to cover Elastic 9. Congrats to Elastic: achieving latencies just 3x those of Vespa is no small feat!
TL;DR:
1. #Elasticsearch 9 is more efficient than 8; the gap to @vespaengine is reduced to ~3x with 16 clients.
2. Single-client latency is higher unless force-merged => a bigger gap (~1.7x for hybrid).
3. Pushing more load increases both gaps:
- ES 9 >> ES 8
- Vespa >> ES 9
Cool to learn @allen_ai are using @vespaengine! And binarized embeddings are amazing for cost/perf; this is a great model.
Just learned that AI2's Scholar QA is powered by mixedbread's embedding model (using binary quantization) and our v1 reranking model. Check out the report here: arxiv.org/abs/2504.10861
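A rough sketch of why binary quantization is such a cost/perf win. This is plain NumPy for illustration, not the mixedbread or Vespa implementation; all names, shapes, and dimensions here are made up:

```python
import numpy as np

# Binary quantization: keep only the sign of each embedding dimension,
# so each dimension carries 1 bit of information instead of a 32-bit float.
def binarize(embeddings: np.ndarray) -> np.ndarray:
    return (embeddings > 0).astype(np.uint8)

def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Distance between binarized vectors: the number of differing bits."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(0)
docs = rng.standard_normal((1000, 384))   # pretend document embeddings
query = rng.standard_normal(384)          # pretend query embedding

b_docs, b_query = binarize(docs), binarize(query)
# Rank by Hamming distance: storage drops ~32x versus float32 when the
# bits are packed, and the comparison is a cheap bitwise operation.
best = min(range(len(b_docs)), key=lambda i: hamming_distance(b_query, b_docs[i]))
```

In practice the bits would be packed into machine words and compared with XOR + popcount, which is what makes the retrieval side so fast.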
Spreading Vespa cheer in Prague today — wearing what we run. Vespa.ai powers the @searchplex stack and my holiday look today ;-) Thanks for the cool swag, @vespaengine 😎
We'll have time to explain later, but for those who understand, enjoy Vespa 8.530.

Another related new feature: proximity between chunks. If you index chunks, enabling this will instantly improve your quality.
This is a cool addition, and it's more than just chunking. Lexical search algorithms like BM25 and TF-IDF are tailored to a world of whole documents, while many modern embedding and semantic-retrieval approaches benefit from smaller text chunks (and maybe prefix/suffix…
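To make that distinction concrete, here's a sketch of chunk-level semantic scoring, where a document's score is its best chunk's similarity to the query. Plain NumPy, not Vespa's ranking code; the data and shapes are invented:

```python
import numpy as np

# Score a document by the best cosine similarity between the query
# embedding and any of the document's chunk embeddings (max over chunks).
def doc_score(query_emb: np.ndarray, chunk_embs: np.ndarray) -> float:
    q = query_emb / np.linalg.norm(query_emb)
    c = chunk_embs / np.linalg.norm(chunk_embs, axis=1, keepdims=True)
    return float(np.max(c @ q))

# A toy document with three chunk embeddings; the query matches
# the second chunk exactly, so the document scores a perfect 1.0.
chunks = np.eye(3)
print(doc_score(np.array([0.0, 1.0, 0.0]), chunks))  # -> 1.0
```

The point: a long document with one highly relevant passage scores well under chunk-level semantics, while whole-document lexical scoring would dilute that passage across everything else in the document.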
As most RAG applications have moved to multi-chunk documents, we thought we should make this easier. Introducing the chunk indexing function:
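The feature itself lives in the Vespa schema language; as a plain-Python illustration of the underlying idea only (fixed-size chunking with overlap; the sizes are arbitrary, not Vespa defaults):

```python
def chunk(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into fixed-size character windows with some overlap,
    so content cut at a chunk boundary still appears whole in a
    neighboring chunk. Each chunk can then be embedded and indexed
    alongside its parent document."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk("word " * 200)  # a 1000-character toy document
# -> 6 chunks of 200 characters, each overlapping its neighbor by 40
```

Doing this at indexing time, rather than in application code, keeps the original document and its chunks together in one schema.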

According to a recent Columbia Journalism Review report, Perplexity has the best AI search, and only ChatGPT is even in the same league. Perplexity uses the same models as the competition, but is the only one using Vespa.ai for search.
Hard problem indeed. Not everyone wraps Google or starts from scratch, though. Case in point: @perplexity_ai builds on @vespaengine vespa.ai/perplexity/
According to this Reuters article, OpenAI is years away from its own web search technology. I think people don't realize how hard it is to build search over the web from scratch, and that's why nearly everyone wraps Google. The challenge: filter trillions of webpages to the N…
Introduction to @vespaengine (with lots of references) for #Solr users: blog.vespa.ai/solr-vs-vespa/ As always, feedback is welcome!