How Krnl Scaled to Millions of Users—and Cut Infra Costs by 65% With RunPod
When Krnl’s AI tools went viral, they outgrew AWS fast. Discover how switching to RunPod’s serverless 4090s helped them scale effortlessly, eliminate idle costs, and cut infrastructure spend by 65%.

When your AI product goes viral overnight, the last thing you want is to be bottlenecked by infrastructure. That’s exactly the situation Giacomo and the team at Krnl found themselves in—not once, but several times—as their innovative AI tools caught fire on social media and racked up waitlists in the thousands.
The Tipping Point: Growth Meets GPU Scarcity
Before RunPod, Krnl was running on AWS and using A100s. It worked—until it didn’t.
“There just weren’t enough GPUs available when we needed them,” said Giacomo. “We’d hit a viral moment, and suddenly our queue would spike to 6,000 users, and we couldn’t scale fast enough.”
Even worse, they were paying for infrastructure that sat idle between those bursts of demand. “We were getting crushed on cost,” he explained. “We needed something that could scale with us—and scale down when we weren’t using it.”
Setting up and managing infrastructure on AWS was also eating into their dev time. “It wasn’t just about compute—it was the setup, the networking, the constant tuning. We were spending more time managing infrastructure than building our product.”
The Switch: From A100s to 4090s—and From Always-On to Serverless
Giacomo first came across RunPod while exploring alternatives for cheaper inference. Out of curiosity, he spun up his own RTX 4090 endpoint on RunPod Serverless and ran some quick benchmarks. What he found was surprising: the 4090s performed nearly as well as the A100s—at a fraction of the price.
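The article doesn't show Giacomo's actual benchmark, but a quick latency comparison like his can be sketched with a small timing harness. In the sketch below, the endpoint ID, API key, and request payload are all placeholders, and the commented-out request targets RunPod's serverless `runsync` API; the harness itself works with any callable.

```python
import time
import statistics


def benchmark(call, n=20):
    """Time n invocations of `call` and return latency percentiles in ms."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        call()
        latencies.append((time.perf_counter() - start) * 1000)
    latencies.sort()
    return {
        "p50_ms": statistics.median(latencies),
        "p95_ms": latencies[int(0.95 * (len(latencies) - 1))],
    }


# Against a live endpoint, `call` would POST to the RunPod serverless API
# (endpoint ID, API key, and input payload below are placeholders):
#
#   import requests
#   def call():
#       requests.post(
#           "https://api.runpod.ai/v2/<ENDPOINT_ID>/runsync",
#           headers={"Authorization": "Bearer <API_KEY>"},
#           json={"input": {"prompt": "hello"}},
#           timeout=60,
#       )

if __name__ == "__main__":
    # Stand-in workload so the sketch runs anywhere without credentials.
    stats = benchmark(lambda: time.sleep(0.001), n=10)
    print(stats)
```

Running the same harness against an A100 endpoint and a 4090 endpoint gives a like-for-like p50/p95 comparison, which is the kind of quick check that led Krnl to switch.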
“That’s when it clicked. We could get the performance we needed without breaking the bank. And because it was serverless, we only paid for what we actually used.”
The Krnl team migrated to RunPod and immediately saw results. With pay-per-use billing, they eliminated the problem of idle costs. With RunPod’s multi-region support, they could serve users with low latency around the world. And with seamless auto-scaling, they were ready for their next viral spike.
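The economics of that switch come down to simple arithmetic: an always-on instance bills for every hour, while a serverless endpoint bills only for active seconds. The rates and utilization below are hypothetical placeholders for illustration; Krnl's actual 65% saving depends on their real pricing and traffic pattern.

```python
def monthly_cost_always_on(hourly_rate, hours_per_month=730):
    """Cost of a GPU instance that runs 24/7, busy or idle."""
    return hourly_rate * hours_per_month


def monthly_cost_serverless(per_second_rate, busy_seconds):
    """Cost of a serverless endpoint billed only for active time."""
    return per_second_rate * busy_seconds


# Hypothetical numbers: an always-on instance at $2.00/hr versus a
# serverless worker at $0.00031/s that is busy ~150 hours per month.
always_on = monthly_cost_always_on(2.00)                   # $1,460/month
serverless = monthly_cost_serverless(0.00031, 150 * 3600)  # ~$167/month

savings = 1 - serverless / always_on
```

The bursty-traffic pattern Krnl describes, with long idle stretches between viral spikes, is exactly the shape of workload where the gap between the two models is widest.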
The Results: Millions of Users, Zero Downtime, 65% Lower Costs
Krnl’s first post-migration viral moment was a stress test—and RunPod delivered.
“We had millions of users, thousands in the queue at once, and the system held strong,” Giacomo recalled. “No crashes. No scrambling. It just worked.”
Compared to their previous AWS setup, Krnl saw a 65% reduction in infrastructure costs—without compromising on performance.
“The best part? We could stop worrying about infrastructure and go back to building. That’s the real win.”
Looking Ahead: Building the Future on RunPod
With their backend solid and their costs under control, Krnl is now focused on expanding their AI offerings. And as they scale, they know RunPod can scale with them.
“RunPod’s not just our infra provider. They’re a partner in what we’re building,” Giacomo said. “They’ve given us the flexibility to grow fast without sacrificing performance or burning cash.”
As Krnl continues pushing boundaries in AI, RunPod remains behind the scenes—powering their models, supporting their scale, and keeping their costs predictable in an unpredictable world.