How Coframe Used RunPod Serverless to Scale During Their Viral Product Hunt Launch
Coframe uses RunPod Serverless to scale inference from 0 GPUs to hundreds in minutes. With RunPod, Coframe launched their generative UI tool on Product Hunt to thousands of users in a single day without having to worry about their infrastructure failing.
In under a week, Coframe deployed their custom diffusion model on Serverless GPUs. Equipped with autoscaling and sub-250ms cold starts, they were ready to handle a rapid surge in demand, all without having to hire a team of engineers to build and maintain infrastructure.
About Coframe
Coframe builds software that uses generative AI to continuously improve and optimize digital interfaces, like landing pages and web apps. They solve a critical problem that UX researchers, frontend engineers, and designers face: figuring out the best version of your app through A/B testing is incredibly tedious.
It’s also nearly impossible to create a custom experience for each user who interacts with your product. With Coframe, you can create tailored UI experiences for each of your users without any engineering overhead. Simply connect your application and let Coframe handle the rest.
Avoiding the infrastructure time sink
Prior to using RunPod Serverless, Josh Payne, CEO and Founder of Coframe, considered renting a bare-metal cluster from a large cloud provider. He quickly realized that going with a bare-metal solution would require bringing on dedicated infrastructure engineers to handle scaling, redundancy, and load balancing.
As a startup, Coframe is focusing on what makes their beer taste better: their generative UI product. They just couldn’t afford to spend time building infrastructure. Diverting engineering resources away from their core product and delaying time-to-market in a rapidly changing industry would set them too far behind.
Choosing the best Serverless GPU
RunPod offers 10+ different GPU models to choose from on Serverless. Once Coframe was onboarded, they spent the first couple of days running benchmarks across different GPUs to find the most cost-effective option for their diffusion model workload.
They found that RTX 4090s offered the best cost-to-performance ratio: nearly as fast as A100s in their benchmarks, but at roughly one-third of the cost.
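To illustrate the kind of comparison involved (the prices and latencies below are hypothetical placeholders, not Coframe's measured numbers or current RunPod rates), cost per image falls out of the GPU's per-second price multiplied by per-image latency:

```python
# Hypothetical cost-to-performance comparison for a diffusion workload.
# All figures are illustrative placeholders, not Coframe's benchmark
# results or current RunPod pricing.

gpus = {
    # name:        (USD per second, seconds per generated image)
    "RTX 4090":  (0.00031, 2.1),
    "A100 80GB": (0.00097, 1.9),
}

for name, (usd_per_sec, sec_per_image) in gpus.items():
    cost_per_image = usd_per_sec * sec_per_image
    print(f"{name}: ${cost_per_image:.5f}/image at {sec_per_image}s/image")
```

With numbers in that ballpark, a slightly slower GPU at a much lower hourly rate wins on cost per image, which matches the trade-off Coframe landed on.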
Getting production-ready in less than a week
In the following days, Coframe built a Docker container image for their diffusion model and ran tests to see how quickly their endpoint could scale to handle hundreds of concurrent requests. They were surprised by how fast workers went from inactive to warm and ready for inference with FlashBoot.
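For context on what that deployment looks like: a RunPod Serverless worker wraps the model in a small Python handler that the platform invokes per request. The sketch below is a minimal, hypothetical version; `generate_image` is a stand-in for Coframe's actual diffusion pipeline, which isn't detailed here.

```python
import runpod

# Stand-in for the real diffusion pipeline. In practice the model is
# loaded once at container start, so warm workers skip loading entirely.
def generate_image(prompt: str) -> str:
    # ... run the diffusion model here ...
    return "https://example.com/generated.png"  # placeholder output

def handler(job):
    # RunPod delivers the request payload under the "input" key.
    prompt = job["input"].get("prompt", "")
    return {"image_url": generate_image(prompt)}

# Hand the handler to the RunPod Serverless worker loop.
runpod.serverless.start({"handler": handler})
```

A scale test along the lines described could be as simple as firing concurrent requests at the endpoint's synchronous route and counting successes; the endpoint ID and API key below are placeholders:

```python
import concurrent.futures
import requests

# Placeholders; substitute a real endpoint ID and API key.
URL = "https://api.runpod.ai/v2/ENDPOINT_ID/runsync"
HEADERS = {"Authorization": "Bearer API_KEY"}

def one_request(i: int) -> int:
    resp = requests.post(
        URL,
        headers=HEADERS,
        json={"input": {"prompt": f"load test {i}"}},
        timeout=120,
    )
    return resp.status_code

# Fan out a few hundred requests to watch workers scale from zero.
with concurrent.futures.ThreadPoolExecutor(max_workers=100) as pool:
    statuses = list(pool.map(one_request, range(200)))

print(f"{statuses.count(200)}/{len(statuses)} requests returned 200")
```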
"The setup process was great—very quick and easy. RunPod had the exact GPUs we needed for AI inference and the pricing was very fair based on what I saw out on the market."
— Josh Payne, CEO
Successful Launch
Last month, Coframe quickly rose to the top of Product Hunt with their latest launch: “Living Images,” an ML model that generates images that optimize themselves. Gaining thousands of users in just one day, they finished at #1 with over 800 upvotes.
“The main value proposition for us was the flexibility RunPod offered. We were able to scale up effortlessly to meet the demand at launch.”
— Josh Payne, CEO
RunPod looks forward to supporting Coframe and more generative AI companies as they scale their machine learning inference. If that sounds like you, try out RunPod Serverless here or book a call to learn more.