samagra14
@samagra_sharma
The AI Infrastructure Guy - Founder @tensorfuse (YC W24) On a mission to make #AI accessible.
~3s container start times with @tensorfuse !
Today, we’re launching Tensorfuse serverless GPU runtime in private beta! Deploy custom ML models on serverless GPUs using your AWS account. It’s 100x better than SageMaker and more affordable than most managed serverless platforms. Join the waitlist: forms.gle/BTtyJukx7YCxG7…
Shipping today: Tensorkube Volumes With a single command, you can now create and attach volumes to your deployments. Attaching volumes gives your deployments persistent storage, ensuring data survives pod restarts and redeployments. If you are building AI…
We're filling up fast and only have ~20% of the seats remaining for our upcoming workshop in SF If you’re building AI apps, Voice Agents, LLM gateways, etc. with customised models, this workshop will help you: - Set up infra to fine-tune and deploy custom models directly in…
Custom models will be the new superpower. And @tensorfuse is the reactor core to build them.
Customer Spotlight: How Vaero AI scaled to 1000s of custom LLMs with @tensorfuse ✅ 40% higher TPS, zero rate-limits ✅ 1000+ LoRA adapters deployed on a single GPU node ✅ 70% cost reduction by provisioning entire infra in their own AWS Full case study 👇
This one event can knock two zeros off your GenAI infra bill.
We're hosting a live, hands-on session focused on customising and serving open source LLMs in your own AWS cloud in partnership with @AWSstartups This is a builders only event where we will: - provision a completely serverless fine-tuning and inference infra in your own AWS…
We're cooking something big in partnership with @AWSstartups Will share more details by this week Stay tuned!
Today we're launching a new feature: GPU Node Health Alerts. GPU nodes can sometimes go offline due to overheating or inference servers getting stuck in crash loops. Tensorfuse already tracks node health and auto-removes unhealthy nodes. Now, we’ve added email alerts if nodes…
My co-founder @samagra_sharma is live blogging our progress on the Voice AI blog! - 10% spelling mistakes - 30% grammar errors - 100% nerd content on inference servers Tune in now: samagra.me/philosophy/202…
For most Voice AI apps you need a websocket interface. @nvidia Triton does not support a websocket interface. There is gRPC streaming inference, but I'm unsure how easy it would be to extend @pipecat_ai or @livekit to use it.
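One way around this is a thin bridge: accept websocket frames from the client and fan them into a streaming backend. Here's a minimal asyncio sketch of the idea — every name below is an illustrative stand-in (the `fake_grpc_stream` stub simulates something like Triton's gRPC streaming call; it is not the real tritonclient, pipecat, or livekit API):

```python
import asyncio

async def fake_grpc_stream(chunks):
    """Stand-in for a gRPC streaming inference call: yields partial results
    as they become available, the way Triton's stream API would."""
    for chunk in chunks:
        await asyncio.sleep(0)   # simulate network/inference latency
        yield chunk.upper()      # pretend "inference" on each chunk

async def websocket_bridge(incoming, send):
    """Read frames from a websocket-like source and forward each streamed
    inference result back to the client via `send`."""
    async for result in fake_grpc_stream(incoming):
        await send(result)

async def main():
    received = []
    async def collect(item):     # plays the role of websocket.send()
        received.append(item)
    await websocket_bridge(["hello", "world"], collect)
    return received

print(asyncio.run(main()))  # ['HELLO', 'WORLD']
```

In a real deployment the bridge would sit between the websocket server and the Triton gRPC client, so frameworks that only speak websockets never touch gRPC directly.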
Optimising Chatterbox from @resembleai on @nvidia Triton server. Wish me luck.
SSH Isn't Enough: Why Production-Grade AI Deployment Demands More! Running a model by SSHing into a remote server and executing scripts is not production deployment. It will probably work fine for your first 5-10 customers. But if you care about: ✅ Uptime, ✅ Reliability,…
AI agents ftw!
🤖 Your AI agents just got better, faster, and more cost-effective. With our latest integration with @crewAIInc , a popular open source AI agent framework, you can now build AI agents with self-hosted models in your own AWS. Follow the full tutorial below and join our slack community…
Tutorial: youtu.be/A42DI_e3DyA?si… Get Started with Tensorfuse: prod.tensorfuse.io/login Tensorfuse Slack community: join.slack.com/t/tensorfuseco…
Morning texts like these make me feel things. PC @tensorfuse @HubSpot

Recently, one of our users was fine-tuning LoRA adapters via Axolotl. They ran into an issue where occasional training jobs would run extremely slowly and eventually crash with a “Watchdog timeout” error. So we dug deep into the NVIDIA NCCL rabbit hole, fixed the…
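The thread cuts off before the fix, but for anyone chasing similar NCCL watchdog timeouts, a common first diagnostic step (my suggestion, not the specific fix from this thread) is to turn on NCCL's own logging before launching the training job:

```shell
# Standard NCCL debugging env vars -- a starting point for diagnosis,
# not the specific fix described in the thread above.
export NCCL_DEBUG=INFO           # print NCCL init and collective-call logs
export NCCL_DEBUG_SUBSYS=ALL     # include all NCCL subsystems in the logs
echo "NCCL_DEBUG=$NCCL_DEBUG"
```

The resulting logs usually show which rank stalls and during which collective, which narrows the timeout down to a network, topology, or hung-process cause.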