Red Hat AI
@RedHat_AI
Deliver AI value with the resources you have, the insights you own, and the freedom you need.
LLM inference is too slow, too expensive, and too hard to scale. 🚨 Introducing llm-d, a Kubernetes-native distributed inference framework, to change that—using vLLM (@vllm_project), smart scheduling, and disaggregated compute. Here’s how it works—and how you can use it today:
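For a rough picture of what "disaggregated compute" means here, below is a toy Python sketch of splitting prefill and decode across separate worker pools. This is illustrative only, not llm-d's actual API; every class and name is hypothetical.

```python
# Conceptual sketch of prefill/decode disaggregation (NOT llm-d's API; names are hypothetical).
# Idea: route the compute-bound prefill and the memory-bound decode to separate worker pools,
# handing the KV cache off between them, so each pool can be sized and scheduled independently.
from dataclasses import dataclass

@dataclass
class Request:
    prompt: str
    max_new_tokens: int

class PrefillPool:
    def run(self, req: Request) -> dict:
        # Compute-bound stage: process the whole prompt once and build its KV cache.
        return {"kv_cache": f"<kv for {len(req.prompt)} chars>", "first_token": "llm-d"}

class DecodePool:
    def run(self, req: Request, state: dict) -> str:
        # Memory-bound stage: generate tokens one at a time, reusing the handed-off KV cache.
        return state["first_token"] + " ..." * (req.max_new_tokens - 1)

def serve(req: Request) -> str:
    prefill, decode = PrefillPool(), DecodePool()
    state = prefill.run(req)       # stage 1 on the prefill worker pool
    return decode.run(req, state)  # stage 2 on the decode pool, after the KV handoff

print(serve(Request(prompt="Explain llm-d in one line.", max_new_tokens=4)))
```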
Happening next week! Hear the @vllm_project update and learn how to scale MoE with @_llm_d_. Register: red.ht/office-hours

Serving LLMs at scale is tough. Slow response times, poor GPU utilization, and high costs get in the way. In this video, @mgoin_ explains how @vllm_project tackles these challenges with up to 24x higher throughput and efficient batching. See how it works and more: youtube.com/watch?v=lxjWiV…
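For context, a minimal vLLM offline-inference example (the model id is chosen for illustration); passing several prompts at once lets vLLM batch them for you.

```python
from vllm import LLM, SamplingParams

# vLLM schedules these prompts together (continuous batching),
# keeping the GPU busy instead of serving requests one by one.
prompts = [
    "What is PagedAttention?",
    "Summarize continuous batching in one sentence.",
]
sampling = SamplingParams(temperature=0.8, max_tokens=64)

llm = LLM(model="facebook/opt-125m")  # small model for illustration; swap in your own
for output in llm.generate(prompts, sampling):
    print(output.outputs[0].text)
```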
Thrilled to see @Meta joining @thealliance_ai! We're excited to continue our work with Meta and all AI Alliance members as we collectively drive an open future for AI. 🤝
🚨 New open-source drop: The AI Alliance is now supporting Llama Stack, a modular AI application framework developed by Meta. Built for portability, developer choice, and real-world deployment. Details ⬇️ 🔗 thealliance.ai/blog/ai-allian…
llm-d organizes through 7 specialized teams (SIGs): 🔀 Inference Scheduler 📊 Benchmarking ⚡ PD-Disaggregation 🗄️ KV-Disaggregation 🚀 Installation 📈 Autoscaling 👀 Observability Weekly meetings, public docs, active Slack channels. Join today! llm-d.ai/docs/community…
.@vllm_project office hours return next week! Alongside project updates from @mgoin_, vLLM committers and HPC experts @robertshaw21 + @tms_jr will share how to scale MoE models with llm-d and lessons from real-world multi-node deployments. Register: red.ht/office-hours

vLLM is finally addressing a long-standing problem: startup time. Cutting CUDA graph capture from 35s to 2s is a great reduction!
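One way to see the capture share of startup on your own hardware: vLLM's `enforce_eager=True` skips CUDA graph capture entirely (at some decode-throughput cost). A small timing sketch, with an illustrative model id:

```python
import sys
import time
from vllm import LLM

# Run this script twice, once with the "eager" argument and once without, to compare
# engine startup with and without CUDA graph capture. enforce_eager=True disables
# capture entirely, which costs some decode throughput but removes the capture time.
eager = len(sys.argv) > 1 and sys.argv[1] == "eager"

start = time.perf_counter()
LLM(model="facebook/opt-125m", enforce_eager=eager)  # small model for illustration
print(f"enforce_eager={eager}: startup took {time.perf_counter() - start:.1f}s")
```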
🎉 Huge congrats to @luka_govedic from @RedHat who’s now a core committer to @vllm_project! He’s led the torch.compile integration and custom passes, the vLLM startup time reduction initiative, AMD enablement, and more. A well-earned milestone 👏 github.com/ProExpertProg
Random Samples: Grounding Feedback is All You Need: Aligning Small Vision-Language Models x.com/i/broadcasts/1…
Llama 4 quantization support just landed in llm-compressor! ✅ W4A16 quantization ✅ FP4 quantization ✅ Support for Llama 4 tokenizer + model loading This sets the stage for fast, community-optimized Llama 4 models. Jump in to try, test, contribute: github.com/vllm-project/l…
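For a sense of what a W4A16 run looks like, here is a rough sketch following llm-compressor's oneshot flow. Exact import paths and arguments can differ between versions, and the model and dataset ids are illustrative; check the repo's examples before running.

```python
# Rough sketch of a W4A16 GPTQ run with llm-compressor (verify the API for your version;
# the model id and calibration dataset below are illustrative choices).
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier

recipe = GPTQModifier(
    targets="Linear",      # quantize the Linear layers...
    scheme="W4A16",        # ...to 4-bit weights with 16-bit activations
    ignore=["lm_head"],    # keep the output head in higher precision
)

oneshot(
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",  # illustrative model id
    dataset="open_platypus",                            # illustrative calibration set
    recipe=recipe,
    output_dir="Llama-4-Scout-W4A16",
    max_seq_length=2048,
    num_calibration_samples=512,
)
```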
Red Hat AI Inference Server allows you to run any LLM, on any accelerator, in any cloud environment. And it's all open source! Hear from Red Hat CEO @matthicksj on what this means for you and your business.
Red Hat AI Inference Server delivers our vision of running any gen AI model on any AI accelerator in any cloud environment. See how we're empowering our customers with @RedHat_AI: red.ht/4lyPlKd
How do you solve AI's biggest performance hurdles? On Technically Speaking, @kernelcdub & Nick Hill dive into vLLM, exploring how techniques like PagedAttention solve memory bottlenecks & accelerate inference: red.ht/4lDjJ5P.
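To make the PagedAttention idea concrete, a toy sketch of block-based KV cache management. This is a conceptual illustration only, not vLLM's internals:

```python
# Toy illustration of the PagedAttention idea (not vLLM's internals): the KV cache lives in
# fixed-size blocks, and each sequence keeps a "block table" mapping logical token positions
# to physical blocks, so memory is allocated on demand with no big contiguous reservation.
BLOCK_SIZE = 16  # tokens per block

class PagedKVCache:
    def __init__(self, num_blocks: int):
        self.free_blocks = list(range(num_blocks))
        self.block_tables: dict[int, list[int]] = {}  # seq_id -> physical block ids

    def append_token(self, seq_id: int, position: int) -> int:
        table = self.block_tables.setdefault(seq_id, [])
        if position % BLOCK_SIZE == 0:             # current block is full (or first token)
            table.append(self.free_blocks.pop())   # grab a physical block on demand
        return table[position // BLOCK_SIZE]       # physical block holding this token

    def free(self, seq_id: int) -> None:
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))

cache = PagedKVCache(num_blocks=8)
for pos in range(20):             # a 20-token sequence needs only 2 of the 8 blocks
    cache.append_token(seq_id=0, position=pos)
print(cache.block_tables[0])      # [7, 6]: two non-contiguous physical blocks
cache.free(seq_id=0)              # blocks return to the pool for other sequences
```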
Random Samples, our weekly seminar series that bridges the gap between cutting-edge AI research and real-world application, continues this Friday, July 18! Title: Grounding Feedback is All You Need: Aligning Small Vision-Language Models Abstract: While recent vision-language…

MiniMax M1 is one of the SOTA open-weight models from @MiniMax__AI. Check out how it is efficiently implemented in vLLM, directly from the team! blog.vllm.ai/2025/06/30/min…
🔥 Another strong open model with Apache 2.0 license, this one from @MiniMax_AI - places in the top 15. MiniMax-M1 is now live on the Text Arena leaderboard landing at #12. This puts it at equal ranking with Deepseek V3/R1 and Qwen 3! See thread to learn more about its…
If you're curious where @RedHat fits into this whole AI thing, watch this quick interview with @matthicksj on @theCUBE: (spoiler: the answer is @RedHat_AI) youtu.be/dIe3-sfZfKc?si…
FP4 models and inference kernels ready for Blackwell GPUs! GPTQ and Hadamard for accuracy, and fused Hadamard for runtime. Check out more details about our work in the thread below 👇
Announcing our early work on FP4 inference for LLMs! - QuTLASS: low-precision kernel support for Blackwell GPUs - FP-Quant: a flexible quantization harness for Llama/Qwen We reach 4x speedup vs BF16, with good accuracy through MXFP4 microscaling + fused Hadamard rotations.
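For intuition, a minimal numerical sketch of the microscaling idea behind MXFP4: small groups of weights share one power-of-two scale, and each element is stored in a tiny low-bit format. This is not the QuTLASS/FP-Quant code; it fakes the 4-bit format with integer levels and skips the Hadamard rotation, just to show the group-scale mechanics.

```python
# Minimal sketch of MX-style microscaling quantization (illustrative, not QuTLASS/FP-Quant).
import numpy as np

GROUP = 32   # elements sharing one scale (the MX block size)
LEVELS = 7   # crude stand-in for FP4's limited per-element range

def quantize_mx(w: np.ndarray):
    w = w.reshape(-1, GROUP)
    # One shared power-of-two scale per group, sized so the group's max fits in range.
    scale = 2.0 ** np.ceil(np.log2(np.abs(w).max(axis=1, keepdims=True) / LEVELS + 1e-12))
    q = np.clip(np.round(w / scale), -LEVELS, LEVELS)  # low-bit value per element
    return q, scale

def dequantize_mx(q, scale):
    return (q * scale).reshape(-1)

w = np.random.randn(1024).astype(np.float32)
q, s = quantize_mx(w)
err = np.abs(dequantize_mx(q, s) - w).mean()
print(f"mean abs error: {err:.4f}")  # stays small because each group gets its own scale
```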
Random Samples: On scalable RL in the era of agentic LLMs x.com/i/broadcasts/1…
Nick Hill digs into the details of vLLM with me on Technically Speaking. It's helpful for understanding why vLLM is so important for high-performance, open source AI inferencing.
How do you solve AI's biggest performance hurdles? On Technically Speaking, @kernelcdub & Nick Hill dive into vLLM, exploring how techniques like PagedAttention solve memory bottlenecks & accelerate inference: red.ht/4lDjJ5P.
Want to influence the future of llm-d? Our 5-min survey on real-world LLM use cases is open until July 11. We're reviewing the results live at our community meeting on July 16th, so your voice will be heard immediately. Make an impact: red.ht/llm-d-user-sur… #AI #MLOps #vllm
Red Hat and @AMD are bringing together the power of @RedHat_AI with AMD’s portfolio of high-performance computing architectures to support optimized, cost-efficient, and production-ready environments for AI-enabled workloads. Check it out. #RHSummit sprou.tt/10b87MeYUwM
Red Hat + @NVIDIA = a new wave of agentic AI innovation 💡 See how we're supporting NVIDIA Blackwell AI factories across @RedHat_AI and the hybrid cloud. #RHSummit sprou.tt/1ypiGRkRnJ5