llm-d
@_llm_d_
llm-d: a Kubernetes-native, high-performance distributed LLM inference framework
llm-d organizes through 7 specialized teams (SIGs): 🔀 Inference Scheduler 📊 Benchmarking ⚡ PD-Disaggregation 🗄️ KV-Disaggregation 🚀 Installation 📈 Autoscaling 👀 Observability Weekly meetings, public docs, active Slack channels. Join today! llm-d.ai/docs/community…
It’s hard to believe the @_llm_d_ project launched just 2 months ago — and it's already gaining serious traction across the industry!
🚀 Community win! Learn how to deploy llm-d on @digitalocean Kubernetes ✅ Step-by-step DOKS setup ✅ Core component deployment ✅ @vllm_project integration for high-perf serving Real practitioners sharing real deployment knowledge 🔥 👉 se.isfusion.cloud/blog/llm-d-dok… #llmd #vLLM
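If you follow a walkthrough like the one above, a quick way to confirm the deployment works is to port-forward the gateway and send it a completion request. Here is a minimal smoke-test sketch in Python; the service name, namespace, port, and model ID are illustrative assumptions, not values from the post:

```python
# Minimal smoke test for an llm-d deployment (sketch; names are assumptions).
# Assumes the inference gateway has been port-forwarded first, e.g.:
#   kubectl port-forward -n llm-d svc/llm-d-inference-gateway 8000:80
# Service name, namespace, port, and model ID are placeholders, not canonical.
import requests

BASE_URL = "http://localhost:8000"          # port-forwarded gateway (assumption)
MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # whichever model you deployed

resp = requests.post(
    f"{BASE_URL}/v1/completions",           # vLLM serves an OpenAI-compatible API
    json={
        "model": MODEL,
        "prompt": "Kubernetes is",
        "max_tokens": 32,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["text"])
```

If the request returns generated text, the gateway, scheduler, and vLLM workers are all wired up correctly.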
🚨 Last chance to shape the llm-d roadmap! 🚨 Our survey on LLM challenges, hardware & model strategy closes July 11th. Your input is critical! We'll review the anonymous results live at our community meeting on July 16th. Take 5 mins to help: red.ht/llm-d-user-sur…
Developer Day Milan happening now! Here's Roberto Carratalà and Marco Caimi showing Llama Stack, @vllm_project, @_llm_d_, and MCP servers in action @RedHat_AI @openshift @rhdevelopers
Want to influence the future of llm-d? Our 5-min survey on real-world LLM use cases is open until July 11. We're reviewing the results live at our community meeting on July 16th, so your voice will be heard immediately. Make an impact: red.ht/llm-d-user-sur… #AI #MLOps #vllm
We're defining the next phase of llm-d, and we need your expert input. Share your insights on LLM serving, from hardware to performance, in our quick 5-minute survey. Your feedback will directly shape our roadmap. Take the survey here: red.ht/llm-d-user-sur… Thank you!
Big news from the llm-d project! Your input on our 5-min survey will define our future roadmap. Plus, we've just launched our YouTube channel with meeting recordings & tutorials. Subscribe and help us build the future of LLM serving! llm-d.ai/blog/llm-d-com…

Are you serving LLMs in production? We need your input for the llm-d project! Take our 5-min anonymous survey to help guide our roadmap by sharing your model strategy, hardware use, and biggest challenges. Let's build a better project together. red.ht/llm-d-user-sur…
Two new ways to get involved with the llm-d project! ✅ Help shape our roadmap by taking our 5-min survey on your LLM use cases. ✅ Subscribe to our new YouTube channel for tutorials & SIG meetings! Details in our latest community update: llm-d.ai/blog/llm-d-com…

Join the @vllm_project office hours this week with @RedHat_AI for an intro to llm-d 📅 Thursday, June 12 at 11AM PDT / 2PM EDT 🔗 Sign up to get a Google Calendar invite w/ Google Meet link: red.ht/office-hours 📷 Or watch live on YouTube: youtube.com/watch?v=ZbOWWi…
.@_llm_d_ is a Kubernetes-native distributed LLM inference framework designed for fast, scalable serving across hardware accelerators with strong performance per dollar. Learn how it works (and see a live demo) at this week’s @vllm_project office hours. Join details below 👇
🚀 2 weeks since launch: 1,000 ⭐️'s on GitHub! The llm-d project is moving fast with active design discussions on inference APIs, workload scheduling & model serving on Kubernetes. Check out what we are currently working on: llm-d.ai/blog/llm-d-wee… #llm #kubernetes #ai
Want to contribute to llm-d? Here's how to get started! - Attend weekly office hours, Wednesdays at 12:30 PM Eastern (red.ht/llm-d-public-c…) - Join the Google group (red.ht/llm-d-google-g…) - Join Slack (inviter.co/llm-d-slack)
llm-d is a new open-source distributed inference serving framework, but what does that mean? In this lightboard video, I dig into @_llm_d_ and how it works. Let me know if this is helpful!! llm-d.ai
AMD Integrates llm-d on AMD Instinct MI300X Cluster For Distributed LLM Serving rocm.blogs.amd.com/artificial-int…
Red Hat announces the launch of llm-d, a new open source project that answers the most crucial need of generative AI’s (gen AI) future: inference at scale. red.ht/4krtt2u
LLM inference is too slow, too expensive, and too hard to scale. 🚨 Introducing llm-d, a Kubernetes-native distributed inference framework, to change that—using vLLM (@vllm_project), smart scheduling, and disaggregated compute. Here’s how it works—and how you can use it today:
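For a concrete sense of "use it today": because llm-d fronts its vLLM workers behind an OpenAI-compatible endpoint, a stock OpenAI client can talk to it once it's deployed. A sketch under that assumption; the gateway URL and model name below are placeholders, not part of the announcement:

```python
# Sketch: talking to an llm-d endpoint with the standard OpenAI client.
# The vLLM workers behind llm-d speak the OpenAI-compatible API, so only
# base_url changes versus a hosted service. URL and model name are
# placeholders for whatever you deployed.
from openai import OpenAI

client = OpenAI(
    base_url="http://llm-d.example.com/v1",  # your gateway (placeholder)
    api_key="unused",                        # many self-hosted setups ignore this
)

chat = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # the model you deployed
    messages=[
        {"role": "user", "content": "Explain prefill/decode disaggregation in one sentence."}
    ],
    max_tokens=64,
)
print(chat.choices[0].message.content)
```

The smart scheduling and disaggregated compute stay behind the gateway, so clients see a drop-in serving API while llm-d decides where each request runs.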
Read how llm-d is a k8s-native, high-performance distributed LLM inference framework: a well-lit path for anyone to serve at scale, with the fastest time-to-value and competitive performance per dollar for most models across most hardware accelerators. llm-d.ai/blog