Pruna AI
@PrunaAI
The AI optimisation framework
🧑‍🏫 AI Efficiency Fundamentals - Week 4: Quantization We see quantization everywhere, but do you know the difference between static and dynamic quantization? Even if you do, these slides are great for you. At Pruna, we want to educate about efficient AI, so our lead researcher…
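For readers who want to see the distinction in code, here is a minimal PyTorch sketch (not Pruna's implementation): dynamic quantization needs no calibration data, while static quantization calibrates activation ranges first. The toy TinyNet, input sizes, and the fbgemm backend choice are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch.ao import quantization as tq

# Toy stand-in for any Linear-heavy model (placeholder architecture).
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()      # converts fp32 activations to int8 (static path)
        self.fc1 = nn.Linear(128, 256)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(256, 10)
        self.dequant = tq.DeQuantStub()  # converts back to fp32 at the output

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return self.dequant(x)

example = torch.randn(4, 128)

# Dynamic quantization: int8 weights, activations quantized on the fly.
# No calibration data needed; well suited to Linear/LSTM-heavy CPU workloads.
dynamic_model = tq.quantize_dynamic(TinyNet().eval(), {nn.Linear}, dtype=torch.qint8)
print(dynamic_model(example).shape)

# Static quantization: int8 weights AND activations with fixed scales,
# which requires a calibration pass over representative inputs.
static_model = TinyNet().eval()
static_model.qconfig = tq.get_default_qconfig("fbgemm")  # x86 backend; use "qnnpack" on ARM
prepared = tq.prepare(static_model)
for _ in range(16):
    prepared(torch.randn(4, 128))        # observers record activation ranges
quantized = tq.convert(prepared)
print(quantized(example).shape)
```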

🔥 Our team just optimized GLiNER for a major cloud monitoring platform processing millions of logs/second for PII detection. Key Highlights: • 35ms → 19ms inference time (nearly 2x speedup!) • 50% memory reduction • Zero quality degradation • €28K-€58K annual savings…
Pruna v0.2.7: Major Breakthroughs in AI Optimization! ⚡️ @deepseek_ai Janus Support + Quantization Combo • Autoregressive Image Generation gets massive speed boost • Memory impact dramatically reduced with lightning-fast latency • Quantization + torch.compile working…
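For context, a rough sketch of what stacking a quantizer with torch.compile could look like with the open-source pruna library; the config keys and algorithm names ("quantizer", "hqq", "compiler", "torch_compile") are assumptions to check against the docs, and a small GPT-2 stands in for Janus, whose loading code is omitted.

```python
import torch
from transformers import AutoModelForCausalLM
from pruna import SmashConfig, smash

# GPT-2 as a placeholder model; swap in your own checkpoint.
model = AutoModelForCausalLM.from_pretrained("gpt2", torch_dtype=torch.float16).to("cuda")

smash_config = SmashConfig()
smash_config["quantizer"] = "hqq"            # weight-only quantization (assumed name)
smash_config["compiler"] = "torch_compile"   # compile the forward pass (assumed name)

# Combine both optimizations in a single smash() call.
smashed_model = smash(model=model, smash_config=smash_config)
```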

🔥 Deploy custom AI models with Pruna optimization speed + @lightningai LitServe serving engine! Lightning-Fast AI Deployments! What makes this awesome: • ⚡️ FastAPI-powered serving • 🎯 Built-in batching • Define and serve any model (vision, audio, text) • Easy…
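A hedged sketch of what pairing the two could look like: a LitServe endpoint that compresses a model in setup() and serves it. The LitAPI hooks follow litserve's documented interface; the pruna calls, the GPT-2 stand-in, and the "prompt"/"text" field names are illustrative assumptions.

```python
import litserve as ls
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from pruna import SmashConfig, smash


class SmashedLLMAPI(ls.LitAPI):
    def setup(self, device):
        # Load and compress the model once per worker.
        self.tokenizer = AutoTokenizer.from_pretrained("gpt2")
        model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)
        config = SmashConfig()
        config["compiler"] = "torch_compile"   # assumed algorithm name
        self.model = smash(model=model, smash_config=config)
        self.device = device

    def decode_request(self, request):
        return request["prompt"]               # assumed request field

    def predict(self, prompt):
        inputs = self.tokenizer(prompt, return_tensors="pt").to(self.device)
        with torch.no_grad():
            out = self.model.generate(**inputs, max_new_tokens=64)
        return self.tokenizer.decode(out[0], skip_special_tokens=True)

    def encode_response(self, output):
        return {"text": output}


if __name__ == "__main__":
    server = ls.LitServer(SmashedLLMAPI(), accelerator="auto")
    server.run(port=8000)
```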

Yesterday we launched wan-image on @replicate and it can generate amazing cinematic animal pictures
📷 Introducing Wan Image: the fastest endpoint for generating beautiful 2K images! From Wan Video, we built Wan Image, which generates stunning 2K images in just 3.4 seconds on a single H100 📷 Try it on @replicate: replicate.com/prunaai/wan-im… Read our blog for details, examples,…
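One way to try it from Python with the replicate client; the model slug and the "prompt" input field are inferred from these posts and may differ from the actual schema on the model page.

```python
# Requires the REPLICATE_API_TOKEN environment variable to be set.
import replicate

output = replicate.run(
    "prunaai/wan-image",  # slug assumed from the post above
    input={"prompt": "a cinematic portrait of a snow leopard at dusk, 2K"},
)
print(output)  # typically a URL or file-like handle for the generated image
```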
🧑‍🏫 AI Efficiency Fundamentals - Week 3: Evaluation Do you know which evaluation metrics measure efficiency rather than just quality? Even if you do, these slides are great for you. At Pruna, we want to educate about efficient AI, so our lead researcher and…
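As a taste of efficiency-focused evaluation, here is a minimal latency/throughput measurement sketch; the toy model, batch size, and iteration counts are placeholders, and the CUDA synchronize call matters only when the model actually runs on a GPU.

```python
import time
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512)).eval()
batch = torch.randn(32, 512)

with torch.no_grad():
    for _ in range(10):                  # warmup: exclude one-time setup costs
        model(batch)

    n_iters = 100
    start = time.perf_counter()
    for _ in range(n_iters):
        model(batch)
        if torch.cuda.is_available():
            torch.cuda.synchronize()     # wait for GPU kernels before stopping the clock
    elapsed = time.perf_counter() - start

latency_ms = 1000 * elapsed / n_iters
throughput = n_iters * batch.shape[0] / elapsed
print(f"latency: {latency_ms:.2f} ms/batch, throughput: {throughput:.0f} samples/s")
```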

🎨 New Tutorial: Complete Image Generation Model Optimization - From Stable Diffusion to Production We just dropped a comprehensive guide showing how to optimize image generation models with zero quality loss and massive performance gains. What we did: 2x faster…
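A condensed, hedged sketch of the load → smash → generate flow for a diffusers pipeline; the checkpoint id and the algorithm names ("deepcache", "torch_compile") are assumptions and may differ from the tutorial's actual settings.

```python
import torch
from diffusers import StableDiffusionPipeline
from pruna import SmashConfig, smash

# Stable Diffusion v1.5 as an example checkpoint (assumed; any diffusers pipeline works similarly).
pipe = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

config = SmashConfig()
config["cacher"] = "deepcache"           # skip redundant UNet steps (assumed name)
config["compiler"] = "torch_compile"     # compile the denoising loop (assumed name)

smashed_pipe = smash(model=pipe, smash_config=config)
image = smashed_pipe("a watercolor lighthouse at sunrise").images[0]
image.save("lighthouse.png")
```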

Scaling has fueled the latest breakthroughs in language, image, and video models. As model sizes increase, so do the computational and energy expenses of running them. But we can do something about it! In this talk at @munichnlp, our very own Nils Fleischmann explores…
Pruna x @gokoyeb Partnership Update! 🔥 Early adopters are reporting great results from our lightning-fast inference platform: Performance Breakthrough: • ⚡️ Much faster models • 💰 Cost reduction • 🎯 Minimal quality degradation Let's talk…

🌱 Do you know the ML.Energy (ml.energy) initiative? Even if you do, make sure to watch this webinar we hosted with @jaewon_chung_cs. He explains the daily challenges he and his colleagues face in their goal to measure, understand, optimize, and…
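For a flavor of what measuring looks like in practice, a small sketch using Zeus, the ML.Energy group's measurement library; the toy workload is a placeholder, and the API calls should be checked against the Zeus version you install.

```python
import torch
from zeus.monitor import ZeusMonitor

# Track energy on GPU 0 for a named measurement window.
monitor = ZeusMonitor(gpu_indices=[0])

model = torch.nn.Linear(4096, 4096).cuda().eval()  # placeholder workload
batch = torch.randn(64, 4096, device="cuda")

monitor.begin_window("inference")
with torch.no_grad():
    for _ in range(100):
        model(batch)
torch.cuda.synchronize()
result = monitor.end_window("inference")

print(f"energy: {result.total_energy:.1f} J over {result.time:.2f} s")
```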

New Tutorial: Complete LLM Optimization Workflow We just released a comprehensive guide showing exactly how to compress and evaluate large language models using our open-source library. The pipeline is simple: Load → Configure → Compress → Evaluate → Deploy!…
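A condensed sketch of that pipeline with the open-source pruna library; the model id, the quantizer name ("hqq"), and the save call are assumptions, and the evaluation step here is only a latency spot-check rather than the tutorial's full metric suite.

```python
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from pruna import SmashConfig, smash

# 1. Load (small, non-gated model chosen purely for illustration)
model_id = "Qwen/Qwen2.5-0.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

# 2. Configure + 3. Compress
config = SmashConfig()
config["quantizer"] = "hqq"              # assumed algorithm name
smashed = smash(model=model, smash_config=config)

# 4. Evaluate (quick latency check; swap in real quality metrics in practice)
inputs = tokenizer("Efficient AI means", return_tensors="pt").to("cuda")
start = time.perf_counter()
out = smashed.generate(**inputs, max_new_tokens=64)
torch.cuda.synchronize()
print(f"latency: {time.perf_counter() - start:.2f} s")
print(tokenizer.decode(out[0], skip_special_tokens=True))

# 5. Deploy: persist the compressed model for your serving stack
smashed.save_pretrained("smashed-qwen")  # assumed PrunaModel method
```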

🧃 Juicy updates from the Pruna team! We've just dropped some major improvements that'll make your model optimizations run smoother than ever: ⚡ GPU Distribution Made Easy: Pruna now supports accelerate for models distributed across multiple GPUs.…
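A minimal sketch of what this could look like in practice: Accelerate's device_map="auto" shards a large model across the visible GPUs at load time, and the sharded model is then passed to smash(). The model id and quantizer name are placeholders/assumptions.

```python
import torch
from transformers import AutoModelForCausalLM
from pruna import SmashConfig, smash

# device_map="auto" (backed by accelerate) splits layers across all visible GPUs.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-14B-Instruct",         # placeholder: any model too big for one GPU
    torch_dtype=torch.float16,
    device_map="auto",
)

config = SmashConfig()
config["quantizer"] = "hqq"              # assumed algorithm name
smashed = smash(model=model, smash_config=config)
```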

🧑‍🏫 AI Efficiency Fundamentals - Week 2: Compression Do you know how to maximize compute utilization in your GPUs? Even if you do, these slides are great for you. At Pruna, we want to educate about efficient AI, so our lead researcher and founder @bertrand_charp prepared a…

Say hello to Sara Han, the newest member of our Developer Advocacy team! With a laptop and her puppy on her lap, she'll help build connections between Pruna and developers to make models faster, cheaper, smaller and greener. Her time at @argilla_io and @huggingface, combined…

🌱 Compressing a single AI model endpoint can save 2t CO2e per year! In comparison, a single EU person emits ~10t CO2 per year. Last week, our compressed Flux-Schnell endpoint on @replicate has run 2M times on H100 over 2 weeks. For each run, the model…
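For anyone who wants to sanity-check numbers like these, a back-of-the-envelope sketch of the arithmetic; every input below (seconds saved per run, GPU power draw, grid carbon intensity, run volume) is an illustrative assumption, not Pruna's actual data.

```python
# Rough estimate of annual CO2e savings from a compressed inference endpoint.
runs_per_year = 2_000_000 * 26          # e.g. ~2M runs every 2 weeks (assumption)
seconds_saved_per_run = 1.0             # compressed vs. baseline inference time (assumption)
gpu_power_kw = 0.7                      # rough H100 board power (assumption)
grid_kg_co2_per_kwh = 0.25              # varies widely by region (assumption)

kwh_saved = runs_per_year * seconds_saved_per_run / 3600 * gpu_power_kw
tonnes_co2e_saved = kwh_saved * grid_kg_co2_per_kwh / 1000
print(f"~{kwh_saved:,.0f} kWh and ~{tonnes_co2e_saved:.1f} t CO2e saved per year")
```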

FLUX.1 Kontext [dev] dropped just hours ago and the community is already hacking. Our friends @PrunaAI made it 5x faster in just a few hours. This is what open-source is all about: remix, build, share. We love to see it! Run it here: replicate.com/prunaai/flux-k…
Black Forest Labs have released their much-anticipated open-source version of Kontext. FLUX.1 Kontext [dev] is now available on Replicate: replicate.com/black-forest-l… We love open source, and we can't wait to see what the community does with this.