Ramya Ravi
@Ramya_ravi19
Product Marketing Engineer @Intel | MSU Spartan | Nature lover | Opinions are my own
Excited to share my latest @towards_AI article! In this article, I break down ⚙️ What LLM serving frameworks are 🛠️ Why they matter for hosting large language models 📌 Key frameworks you should know 👉 Read here: medium.com/towards-artifi… #AI #LLMs #GenAI #DeepLearning
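For a taste of what a serving framework handles, here's a minimal offline-inference sketch with vLLM, one widely used serving framework; the model and sampling settings are placeholders, and I'm not assuming it's one of the frameworks the article covers:

```python
# Minimal sketch: offline batched inference with vLLM, one popular LLM serving framework.
# Model name and sampling settings are illustrative, not taken from the article.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # small model for a quick local test
params = SamplingParams(temperature=0.8, max_tokens=64)

outputs = llm.generate(["What does an LLM serving framework do?"], params)
for out in outputs:
    print(out.outputs[0].text)
```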
Building #GenAI apps with @OPEAdev? This article covers orchestration, microservices, and real-time data using Amazon Bedrock and OpenSearch: intel.ly/3HNb7Lc
What makes OPEA work for scalable GenAI apps? With Amazon Bedrock and OpenSearch, it brings orchestration, RAG, and microservices into one integrated stack: intel.com/content/www/us…
LLMs generate. Agents act. But the real magic? When they work together. In this article, I break down: 🔗 How they complement each other 🧠 When to use which — or both 👉 Read more: medium.com/@ramyaravi19/l… #AI #LLMs #AIAgents #ArtificialIntelligence #GenAI
Explore how OPEA, AWS Bedrock, and #OpenSearch simplify building #RAG pipelines, agents & more. Built for #developers who want to move from prototype to production with confidence. Read more: intel.com/content/www/us… #AI #GenAI #AIAgents #AWSBedrock @OpenSearchProj @OPEAdev
Confused about when to fine-tune a model, use adapters like LoRA, or build with RAG? If you're working with foundation models and want to customize smarter, not harder, this guide is for you. Read here: medium.com/@ramyaravi19/l… #FineTuning #AI #MachineLearning #DeepLearning
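As a quick illustration of the adapter route, here's a minimal LoRA sketch with Hugging Face PEFT; the base model and hyperparameters are placeholders, not recommendations from the guide:

```python
# Minimal sketch: attaching a LoRA adapter with Hugging Face PEFT instead of full fine-tuning.
# Base model and hyperparameters are illustrative placeholders.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")
lora_cfg = LoraConfig(
    r=8,                        # low-rank dimension of the adapter matrices
    lora_alpha=16,              # scaling factor applied to the adapter output
    target_modules=["c_attn"],  # attention projection in GPT-2
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only a small fraction of weights are trainable
```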
🚀 Smarter, Faster #AI with Mixture of Experts (#MoE) 🤖 MoE lets AI models scale efficiently by activating only what’s needed. In my Medium post, I cover: 🔹 What MoE is 🔹 How it works 🔹 Why it matters for #GenAI & #LLMs 📖 Dive in: medium.com/@ramyaravi19/w… #DeepLearning
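Here's a toy top-k routing sketch of the "activate only what's needed" idea; it's a simplified illustration, not code from the post, and real MoE layers add load balancing, capacity limits, and batched expert dispatch:

```python
# Toy Mixture-of-Experts routing: a gate picks the top-k experts per token,
# so only a fraction of the parameters run for each input.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    def __init__(self, dim=64, num_experts=8, top_k=2):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_experts)])
        self.top_k = top_k

    def forward(self, x):                        # x: (tokens, dim)
        scores = self.gate(x)                    # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):           # only the selected experts run
            for e in idx[:, slot].unique().tolist():
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

moe = TinyMoE()
print(moe(torch.randn(4, 64)).shape)  # torch.Size([4, 64])
```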
Quantization using torch.export graph mode is easier and more efficient than eager mode. And in @PyTorch 2.7 it's available on Intel GPUs. Learn how to get started: youtube.com/watch?v=nek7u5…
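A rough sketch of the torch.export (PT2E) graph-mode quantization flow; the Intel GPU (XPU) quantizer import path is my assumption based on the PyTorch 2.7 release notes and the x86 analogue, so check the video for the exact API:

```python
import torch
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
# Assumption: PyTorch 2.7 exposes an Intel-GPU (XPU) quantizer mirroring the x86 one;
# verify the exact import path against the release notes / video.
from torch.ao.quantization.quantizer.xpu_inductor_quantizer import (
    XPUInductorQuantizer,
    get_default_xpu_inductor_quantization_config,
)

model = torch.nn.Sequential(torch.nn.Linear(128, 128), torch.nn.ReLU()).eval().to("xpu")
example = (torch.randn(8, 128, device="xpu"),)

# 1. Capture the graph with torch.export (graph mode, no eager-mode module swapping)
exported = torch.export.export_for_training(model, example).module()

# 2. Insert observers, calibrate, then convert to a quantized graph
quantizer = XPUInductorQuantizer()
quantizer.set_global(get_default_xpu_inductor_quantization_config())
prepared = prepare_pt2e(exported, quantizer)
prepared(*example)                 # calibration pass(es)
quantized = convert_pt2e(prepared)

# 3. Compile the quantized graph for the Intel GPU backend
compiled = torch.compile(quantized)
print(compiled(*example).shape)
```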
Ever tried searching for “more ladybugs than flowers”? We did. And the AI nailed it. Fine-tuned LLMs can really deliver when trained on the right datasets. This demo shows what happens when #Qwen3 models are optimized and deployed on Intel hardware. Read the full article:…
Still prompt-and-hope? Agentic LLMs plan, adapt, and act—turning AI into an active problem solver. This blog breaks down how devs are using AutoGen, multi-agent design, and low-code tools to build smarter apps—featuring insights from Microsoft’s Daron Yöndem at Intel AI…
On devices with Intel® Core™ Ultra processors, you can unlock full acceleration using WebNN—thanks to integrated NPUs. Full dev guide → intel.ly/3Gc5yoV
Frameworks like WebLLM & Transformers.js are designed to run LLMs inside the browser using WebGPU, WebNN, and WebAssembly. No server required. No cloud roundtrips. Just fast, local inference. • WebGPU = GPU compute • WebNN = NPU/CPU/GPU inference • Wasm = near-native speed
LLMs in the browser? It’s not sci-fi. You can now run chatbots, summarizers, and other AI tools entirely in-browser—with JavaScript frameworks and hardware-accelerated APIs. Here's how it works
Automated prompt engineering on-device—no fine-tuning, no RAG. This new guide shows how to use #DSPy with Intel #oneAPI and llama.cpp to boost task accuracy from 📉 35% → 📈 78% Run LLMs locally, optimize efficiently. Read the guide → intel.ly/4lBzYRw
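A minimal sketch of what that kind of setup could look like, with DSPy pointed at a local llama.cpp OpenAI-compatible server; the endpoint, model name, examples, and metric are placeholders rather than the guide's actual configuration:

```python
# Minimal sketch: automated prompt optimization with DSPy against a local llama.cpp server.
# Assumes llama.cpp is running its OpenAI-compatible server (e.g. `llama-server -m model.gguf`).
# Endpoint, model name, dataset, and metric are placeholders, not the guide's actual setup.
import dspy

lm = dspy.LM(
    "openai/local-llama",                 # routed through the OpenAI-compatible API
    api_base="http://localhost:8080/v1",  # llama.cpp server address (assumption)
    api_key="not-needed",
)
dspy.configure(lm=lm)

qa = dspy.ChainOfThought("question -> answer")

# A tiny train set and an exact-match style metric drive the prompt search.
trainset = [
    dspy.Example(question="2 + 2 = ?", answer="4").with_inputs("question"),
    dspy.Example(question="Capital of France?", answer="Paris").with_inputs("question"),
]
metric = lambda gold, pred, trace=None: gold.answer.lower() in pred.answer.lower()

optimizer = dspy.BootstrapFewShot(metric=metric)
optimized_qa = optimizer.compile(qa, trainset=trainset)
print(optimized_qa(question="3 + 5 = ?").answer)
```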
Big step forward for long-context #LLM benchmarks. We’re excited to share HELMET — a benchmark co-developed by Intel and @Princeton University to evaluate models across real-world, long-context tasks. Evaluate any LLM. Extensible context lengths. Built for scale on Intel…
With insights from Intel AI DevSummit 2025, Ramya Ravi shares key considerations for fine-tuning and self-hosting large language models. Read the blog: intel.ly/4jvVJAw
Learn how to get started running models from the Hugging Face Hub on Intel Gaudi AI accelerators: youtube.com/watch?v=ibpsVj… #IamIntel
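A minimal sketch of loading a Hub model on a Gaudi device; it assumes the Intel Gaudi PyTorch bridge is installed, the model name is a placeholder, and the video may use the Optimum-Habana library instead:

```python
# Minimal sketch: running a Hugging Face Hub model on an Intel Gaudi accelerator.
# Assumes the Intel Gaudi software stack (habana_frameworks PyTorch bridge) is installed;
# the model name is a placeholder and the video may use the Optimum-Habana library instead.
import torch
import habana_frameworks.torch.core as htcore  # registers the "hpu" device
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).to("hpu").eval()

inputs = tokenizer("Intel Gaudi accelerators are", return_tensors="pt").to("hpu")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```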