Benjamin Trent
@benwtrent
Doer of things | Builder of things | software engineer @elastic.
I enjoyed Berlin Buzzwords very much, always a pleasure to catch up with everyone in person. Here is my talk with @jpountz on shipping Lucene 10: youtu.be/GRhzgCEL_ac?si…
Someone asked me for my opinion on the Vespa vs. Elasticsearch performance comparison today at Berlin Buzzwords, so I gave it a try: jpountz.github.io/2025/06/17/ana…
Awesome read on Lucene's implementation of ACORN-1🔥🔥 Filtered vector search is everywhere! Efficient, general-purpose (predicate-agnostic) indices that can support those use cases are super, super powerful!! Try it out & check out our original paper dl.acm.org/doi/10.1145/36…
Elasticsearch / Lucene adopts ACORN-1, which expands the exploration of nodes to ensure enough candidates that meet the filter. By @benwtrent: elastic.co/search-labs/bl…
This small change yielded a ~5% speedup on several queries of Lucene's nightly benchmarks (see last data point at benchmarks.mikemccandless.com/OrStopWords.ht…). Can you guess why?
Several weeks ago, I put the R in RAG with @HamelHusain by discussing hybrid search best practices. Next up, we put the F(ilter) in HNSW to build hybrid search. Which doesn't quite fit... but @benwtrent and I are not intimidated by such trivialities maven.com/p/430592/hybri…
Next in the series of building a search engine from scratch - we focus on hybrid retrieval with Benjamin Trent of Elastic. How do you add filtering to an HNSW index? maven.com/p/430592/hybri…
There has been a big regression in Lucene's nightly benchmarks recently after a kernel upgrade. @mikemccand and @rcmuir found that it was caused by a change in the Linux scheduler configuration. github.com/apache/lucene/…
I wanted to share what I learned from Tantivy's "Search Benchmark, the Game", so I set up GitHub pages and wrote two blogs, on general observations on the benchmark jpountz.github.io/2025/05/12/ana… and how it helped drive performance improvements in Lucene jpountz.github.io/2025/04/12/why…
Now live! Elastic 9.0/8.18 includes faster quantization, Elastic Distributions of OTel and LLM observability, the GA of Attack Discovery and Automatic Import, major enhancements in ES|QL like JOIN, and more. Learn more → go.es.io/42tjpyA
🚀 Unlock richer insights in Elasticsearch 8.18/9.0! ES|QL's new LOOKUP JOIN enables powerful query-time joins across indices / datastreams for instant data enrichment elastic.co/search-labs/bl…
It's time to redo benchmarks! #Lucene 10.2 was just released, with
- huge speedups to non-scoring boolean queries, range queries and filtered vector search,
- better merging defaults for faster search,
- much faster merging of vectors
And more... lucene.apache.org/core/corenews.…
#Apache #Lucene will soon have a faster and smaller terms index! This is a complex part of Lucene, and a major hotspot for terms heavy use cases like (primary) key/value store (~34% speedup, but results are preliminary!). Lucene's pluggable Codec API makes experimentation like…
Guo Feng contributed a 2.5x (!) speedup to #Lucene's numeric range queries by using vectorization. HZ sped up query evaluation, ID sped up decoding data from the index. Lots of great performance improvements coming in Lucene 10.2.
Two good speedups on queries sorted by field on #Lucene's nightly benchmarks last night. This is due to a recent optimization to conjunctive queries being generalized to intersecting a query and a collector. github.com/apache/lucene/…