Connor Shorten
@CShorten30
AI and Databases @weaviate_io
Hey everyone! I am super excited to share that our new research report is live on arXiv! Querying Databases with Function Calling! Thread with more details! 🧵 (1/8)

Hot Take: @lateinteraction, MIT Professor & @DSPyOSS Creator & Project Lead: "The level of AI we're approaching is a rebirth of computing." From assembly to C to Python, we used to see big leaps in how we built software. Then… not much. Now AI is accelerating things again…
TREC RAG 2025 official retrieval baselines are available now! Time to start generating those answers and submit them to eval base before August 17th! Let the games begin! You have less than a month left to submit.
The official baselines and validation scripts for TREC RAG 2025 are now available! These include both retrieval results (for the AG task) and the corresponding end-to-end augmented generation outputs. Access the baselines and necessary scripts here: trec-rag.github.io/annoucements/2…
Curious to understand system prompts from sota LLM systems, like the one that allowed Gemini 2.5 Pro to solve IMO 2025 problems? @AnkitaSundar analyzed ~100 leaked system prompts systematically using LLM-powered data analysis with LOTUS. Check out her awesome collab demo…
Olympiad math + AI: We ran Google's Gemini 2.5 Pro on the fresh IMO 2025 problems. With careful prompting and pipeline design, it solved 5 out of 6, remarkable for tasks demanding deep insight and creativity. The model could win gold! #AI #Math #LLMs #IMO2025
> multi-agent LLM system deployed in the real world to improve doctor-patient communication > optimized prompts using DSPy > even with only 100 annotated examples, we got strong performance across empathy, completeness, and correctness
New paper! We present Dr.Copilot, a multi-agent LLM system deployed in the real world to improve doctor-patient communication in Romanian 🇷🇴. One of the first production deployments of LLMs in Romanian telemedicine. arxiv.org/abs/2507.11299
Vibe coding vector search with @weaviate_io works pretty well with Gemini in Google Colab. What are your tips for working with coding assistants and Jupyter Notebooks?
Get started with Gemini Embedding, which supports over 100 languages and flexible dimensions for performance and storage needs, using @weaviate_io. Check out the notebook. github.com/weaviate/recip…
How do you feed a 500,000-word novel to an AI? You can't. Not all at once, anyway. This is where chunking comes in. What is Chunking? Chunking is the pre-processing step of splitting texts into smaller pieces of text, i.e. "chunks". Each chunk becomes the…
Heard of @DSPyOSS but find it intimidating? I wrote a beginner's playbook that breaks it down into simple, practical steps. Learn the fundamentals of building AI systems that don't constantly need fixing. This is part of chapter 3 from our upcoming book on AI Agents. Link:…
I just published a new playbook which is part of my upcoming book on AI Agents. The response to the last playbook was incredible, thank you! To build on that, I'm sharing the next logical step: the exact 4-part template I use to prompt reliable agents. The full playbook is…
My talk for @aiDotEngineer on what I think every person working with language models needs to know about GPUs is now available! - Latency lags bandwidth. - GPUs embrace bandwidth. - Don't be scared of N squared. - Use the Tensor Cores, Luke! youtube.com/watch?v=y-UGrY…
Vector database memory usage drops with new rotational quantization. With Weaviate 1.32, we're introducing memory footprint reduction, seamless collection migrations with aliases, and accelerated cost-aware sorting. Key…
Agents aren't reliable. They don't learn from experience. At @composiohq, we provide skills that evolve with your agents. @lightspeedvp gave us $25M to make agents usable.
Vector embeddings all the way down
What really runs LLM models.. IYKYK @weaviate_io
Thanks @lateinteraction ! Every time I think about the gazillion prompt / systems engineering tweaks that also go into making an AI system work I think about how early you were with @DSPyOSS :) Shared theme: find the key human input and make it programmatic.
Every time I think about what it takes to systematically organize the gazillion training tasks that together make a great foundation model, my appreciation for how early @SnorkelAI was increases.
I read @helloiamleonie's excellent post "37 Things I Learned About Information Retrieval..." and thought, damn, this is stuff I never want to forget. So I created a bunch of flashcards to add to my collection. Maybe it will be useful to you too. gist.github.com/intellectronic…
AegisLLM leverages DSPy's MIPROv2 optimizer in a totally unexpected way: to evolve its prompts based on the attacks it sees in real time. Some really large gains!
If you are interested in building agentic workflows, AegisLLM is a nice instantiation in safety/security domain! Thanks @furongh for sharing it with me. Agentic workflows must be designed and optimized as systems, as @lateinteraction keeps repeating.
Thanks for sharing! Building and optimizing agentic systems with DSPy has been a great experience!
Gartner predicts that by 2027, over 40% of agents will fail. That number will be much higher unless we act now. This is why we've launched our Agent Reliability platform today. Our customers and partners have been testing it out, and we've been receiving great feedback…
Big update: Introducing BFCL V4 Agentic, and BFCL published at ICML 2025! Some BFCL lore... back in 2022, as researchers we couldn't find good open-source models that could handle zero-shot function calling, so we decided to train our own. Sounds simple, right? It was!…