RunPod Blog
From API to Autonomy: Why More Builders Are Self-Hosting Their Models

Outgrowing the APIs? Learn when it’s time to switch from API access to running your own AI model. We’ll break down the tools, the stack, and why more builders are going open source.
12 May 2025 3 min read
How a Solo Dev Built an AI for Dads—No GPU, No Team, Just $5
Built on RunPod

A solo developer fine-tuned an emotional support AI for dads using Mistral 7B, QLoRA, and RunPod—with no GPU, no team, and under $5 in training costs.
09 May 2025 4 min read
How Civitai Scaled to 800K Monthly LoRAs on RunPod
Stable Diffusion

Discover how Civitai used RunPod to train over 868,000 LoRA models in one month—fueling a growing creator community and powering millions of AI generations.
08 May 2025 2 min read
From Pods to Serverless: When to Switch and Why It Matters

You’ve just finished fine-tuning your model in a pod. Now it’s time to deploy it—and you’re staring at two buttons: Serverless or Pod. Which one’s right for running inference? If you’ve been using Pods to train, test, or experiment on RunPod, Serverless might be the better fit for deployment.
07 May 2025 3 min read
RunPod Just Got Native in Your AI IDE
RunPod Platform

RunPod’s new MCP server brings first-class GPU access to any AI IDE—Cursor, Claude Desktop, Windsurf, and more. Launch pods, deploy endpoints, and manage infrastructure directly from your editor using Model Context Protocol.
05 May 2025 3 min read
Qwen3 Released: How Does It Stack Up?

The Qwen Team has released Qwen3, their latest generation of large language models, bringing groundbreaking advancements to the open-source AI community. This comprehensive suite ranges from lightweight 0.6B-parameter versions to massive 235B-parameter Mixture-of-Experts (MoE) architectures, all designed with a unique "thinking mode."
30 Apr 2025 6 min read
GPU Clusters: Powering High-Performance AI Computing (When You Need It)

AI infrastructure isn't one-size-fits-all. Different stages of the AI development lifecycle call for different types of compute—and choosing the right tool for the job can make all the difference in performance, efficiency, and cost. At RunPod, we're building infrastructure that fits the way modern AI is built.
28 Apr 2025 2 min read
How Krnl Scaled to Millions of Users—and Cut Infra Costs by 65% With RunPod

When Krnl’s AI tools went viral, they outgrew AWS fast. Discover how switching to RunPod’s serverless 4090s helped them scale effortlessly, eliminate idle costs, and cut infrastructure spend by 65%.
24 Apr 2025 2 min read
Mixture of Experts (MoE): A Scalable Architecture for Efficient AI Training

Mixture of Experts (MoE) models scale efficiently by activating only a subset of parameters per input. Learn how MoE works, where it shines, and why RunPod is built to support MoE training and inference.
23 Apr 2025 3 min read
Global Networking Expansion: Now Available in 14 Additional Data Centers

RunPod is excited to announce a major expansion of our Global Networking feature, which now supports 14 additional data centers. Following the successful launch in December 2024, we've seen tremendous adoption of this capability, which enables seamless cross-data-center communication between pods. This expansion significantly increases our global coverage.
22 Apr 2025 3 min read
How to Fine-Tune LLMs with Axolotl on RunPod
Fine-Tuning

Learn how to fine-tune large language models (LLMs) using Axolotl on RunPod. This step-by-step guide covers setup, configuration, and training with LoRA, 8-bit quantization, and DeepSpeed—all on scalable GPU infrastructure.
21 Apr 2025 3 min read
RTX 5090 LLM Benchmarks for AI: Is It the Best GPU for ML?

AI workloads demand ever-increasing performance, especially for large language model (LLM) inference. Today, we're excited to showcase how the NVIDIA RTX 5090 is reshaping what's possible in AI compute, with breakthrough performance that outpaces even specialized data center hardware.
17 Apr 2025 4 min read
The Complete Guide to Training Video LoRAs: From Concept to Creation
LoRAs

Learn how to train custom video LoRAs for models like Wan, Hunyuan Video, and LTX Video. This guide covers hyperparameters, dataset prep, and best practices to help you fine-tune high-quality, motion-aware video outputs.
16 Apr 2025 10 min read
The RTX 5090 Is Here: Serve 65,000+ Tokens per Second on RunPod

RunPod customers can now access the NVIDIA RTX 5090—the latest powerful GPU for real-time LLM inference. With impressive throughput and large memory capacity, the 5090 enables serving small and mid-sized AI models at scale. Whether you’re deploying high-concurrency chatbots, inference APIs, or multi-model backends, this next-gen GPU delivers.
15 Apr 2025 2 min read
Cost-effective Computing with Autoscaling on RunPod
RunPod Platform

Learn how RunPod helps you autoscale AI workloads for both training and inference. Explore Pods vs. Serverless, cost-saving strategies, and real-world examples of dynamic resource management for efficient, high-performance compute.
14 Apr 2025 3 min read
The Future of AI Training: Are GPUs Enough for the Next Generation of AI?
AI Development

AI workloads are evolving fast. GPUs still dominate training in 2025, but emerging hardware and hybrid infrastructure are reshaping the future. Here’s what GTC 2025 reveals—and how RunPod fits in.
10 Apr 2025 4 min read
Llama-4 Scout and Maverick Are Here—How Do They Shape Up?

Meta has been one of the kings of open-source, open-weight large language models. Their first foray with Llama-1 in 2023, while limited in its application and licensing, was a clear signal to the community that there was an alternative to large closed-off models. Later in 2023 came Llama 2.
09 Apr 2025 5 min read
Built on RunPod: How Cogito Trained High-Performance Open Models on the Path to ASI

At RunPod, we're proud to power the next generation of AI breakthroughs—and this one is big. San Francisco-based Deep Cogito has just released Cogito v1, a family of open-source models ranging from 3B to 70B parameters, each outperforming leading alternatives from LLaMA, DeepSeek, and Qwen on standard benchmarks.
08 Apr 2025 3 min read
How AI Helped Win a Nobel Prize - Protein Folding and AI
AI Development

AlphaFold just won the Nobel Prize—and proved AI can solve problems once thought impossible. This post explores what it means for science, compute, and how RunPod is helping make the next breakthrough accessible to everyone.
07 Apr 2025 3 min read
No-Code AI: How I Ran My First Language Model Without Coding
No-Code AI

I wanted to run an open-source AI model myself—no code, just curiosity. Here’s how I deployed Mistral 7B on a cloud GPU and what I learned.
03 Apr 2025 8 min read
Bare Metal vs. Instant Clusters: Which Is Right for Your AI Workload?
Bare Metal

Instant Clusters are here. RunPod’s newest deployment option lets you spin up multi-node environments in minutes—no contracts, no config files. Learn how they compare to Bare Metal and when to use each for your AI workloads.
02 Apr 2025 3 min read
Introducing Instant Clusters: Multi-Node AI Compute, On Demand

Until now, RunPod users could generally scale up to 8 GPUs in a single pod. For most use cases—like running inference on Llama 70B or fine-tuning FLUX—that was plenty. But some workloads need more compute than a single server. They need to scale across multiple machines. Today, we’re introducing Instant Clusters.
31 Mar 2025 3 min read
Machine Learning Basics for People Who Don't Code
No-Code AI

You don’t need to know code to understand machine learning. This post breaks down how AI models learn—and how you can start exploring them without a technical background.
28 Mar 2025 4 min read
RunPod Expands in Asia-Pacific with Launch of AP-JP-1 in Fukushima

We're excited to announce the launch of AP-JP-1, RunPod's first data center in Japan—now live in Fukushima. This marks a major step forward in our global infrastructure strategy and opens the door to dramatically better performance for users across the Asia-Pacific region.
27 Mar 2025 1 min read
Supporting the Future of AGI: RunPod Partners with ARC Prize 2025

The race toward artificial general intelligence isn't just happening behind closed doors at trillion-dollar tech companies. It's also unfolding in the open—in research labs, Discord servers, GitHub repos, and competitions like the ARC Prize. This year, the ARC Prize Foundation is back with ARC-AGI-2, a harder successor to the original benchmark.
26 Mar 2025 2 min read