RTX 5090 LLM Benchmarks for AI: Is It the Best GPU for ML? The AI landscape demands ever-increasing performance, especially for large language model (LLM) inference. Today, we're excited to showcase how the NVIDIA RTX 5090 is reshaping what's possible in AI compute, with breakthrough performance that outpaces even specialized data center hardware. Benchmark Showdown: RTX
The Complete Guide to Training Video LoRAs: From Concept to Creation Learn how to train custom video LoRAs for models like Wan, Hunyuan Video, and LTX Video. This guide covers hyperparameters, dataset prep, and best practices to help you fine-tune high-quality, motion-aware video outputs.
Llama-4 Scout and Maverick Are Here—How Do They Shape Up? Meta has been one of the kings of open source, open weight large language models. Their first foray with Llama-1 in 2023, while limited in its application and licensing, sent a clear signal to the community that there was an alternative to large closed-off models. Later in 2023 we got
Introducing Easy LLM Fine-Tuning on RunPod: Axolotl Made Simple At RunPod, we're constantly looking for ways to make AI development more accessible. Today, we're excited to announce our newest feature: a pre-configured Axolotl environment for LLM fine-tuning that dramatically simplifies the process of customizing models to your specific needs. Why Fine-Tuning Matters Fine-tuning large language
Open Source Video and LLM: New Model Roundup Remember when generating decent-looking videos with AI seemed like something only the big tech companies could pull off? Those days are officially over. 2024 brought us a wave of seriously impressive open-source video generation models that anyone can download and start playing with. And here's the kicker -
Streamline GPU Cloud Management with RunPod's New REST API Managing GPU resources has always been a bit of a pain point, with most of the time spent clicking through interfaces doing repetitive manual configuration. Our new API lets you control everything through code instead, which is great news for those who'd rather automate repetitive tasks and focus
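As a rough sketch of what that automation can look like (the endpoint paths and payload fields below are assumptions for illustration; check the REST API docs for the authoritative schema), listing and creating pods from Python might look like this:

```python
import os
import requests

API_BASE = "https://rest.runpod.io/v1"  # assumed base URL; confirm in the API docs
HEADERS = {"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"}

# List existing pods (endpoint path assumed for this sketch).
pods = requests.get(f"{API_BASE}/pods", headers=HEADERS, timeout=30)
pods.raise_for_status()
for pod in pods.json():
    print(pod)

# Create a pod; the field names here are illustrative, not authoritative.
new_pod = requests.post(
    f"{API_BASE}/pods",
    headers=HEADERS,
    json={
        "name": "automated-pod",
        "imageName": "runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04",
        "gpuCount": 1,
    },
    timeout=30,
)
new_pod.raise_for_status()
print(new_pod.json())
```

Because every step is an HTTP call, the same pattern drops straight into CI pipelines or scheduled jobs, with no clicking required.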
Unveiling Enhanced CPU Pods: Docker Runtime and Network Volume Support We're excited to announce two significant upgrades to our CPU pods that will streamline your development workflow and expand storage options. Our CPU pods now feature Docker runtime (replacing Kata Containers) and support for network volumes—previously exclusive to our GPU pods.
Run DeepSeek R1 On Just 480GB of VRAM Even with the recent successes of closed models like Grok and Claude 3.7 Sonnet, DeepSeek R1 is still considered a heavyweight in the LLM arena as a whole, and remains the uncontested open-source LLM champion (at least until DeepSeek R2 launches, anyway). We've written before about the concerns of
GitHub Integration Now In GA - Build Images from GitHub Repos Even Faster RunPod is pleased to announce that our GitHub integration is officially out of beta and ready for production use! This feature enables you to iterate more quickly by building images to deploy on RunPod serverless directly from a GitHub repo, removing all of the friction involved in creating
Introduction to Websocket Streaming with RunPod Serverless In this follow-up to our 'Hello World' tutorial, we'll create a serverless endpoint that processes base64-encoded files and streams back the results. This demonstrates how you can work with file input/output in our serverless environment by encoding the file as data within a JSON
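To make the pattern concrete, here is a minimal sketch of a streaming handler in the shape the tutorial describes: it decodes a base64 payload from the JSON input and yields the result back chunk by chunk (the "file" input key is an assumed name for this sketch, not a required field):

```python
import base64
import runpod

def handler(job):
    # The input file arrives as base64 text inside the JSON payload
    # (the "file" key name is an assumption for illustration).
    raw = base64.b64decode(job["input"]["file"])

    # A generator handler yields partial results instead of returning
    # one final value, which is what enables streaming.
    chunk_size = 1024
    for i in range(0, len(raw), chunk_size):
        chunk = raw[i : i + chunk_size]
        yield {"chunk": base64.b64encode(chunk).decode("utf-8")}

runpod.serverless.start({
    "handler": handler,
    "return_aggregate_stream": True,  # also expose the combined result on /run
})
```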
How To Run a "Hello World" in RunPod Serverless If you're new to serverless computing and Docker, this guide will walk you through creating your first RunPod serverless endpoint from scratch. We'll build a simple "Hello World" application that demonstrates the basic concepts of serverless deployment on RunPod's platform. You'
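The core of that tutorial is a single handler function. As a minimal sketch, mirroring the pattern from RunPod's serverless docs:

```python
import runpod

def handler(job):
    # Each request's JSON "input" field arrives here as a dict.
    name = job["input"].get("name", "World")
    return f"Hello, {name}!"

# Register the handler and start listening for jobs.
runpod.serverless.start({"handler": handler})
```

Packaged into a Docker image, this is all the code a working endpoint needs; everything else is deployment plumbing that the guide walks through.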
Mistral Small 3 Eschews Synthetic Data - What Does This Mean? Mistral AI has released Mistral Small 3, which it claims was trained without synthetic data in its training pipeline. Weighing in at 24B parameters with 32k context, this is a lightweight model that can be run at full weights on an A40, and on nearly any GPU spec that we offer when quantized.
The Complete Guide to GPU Requirements for LLM Fine-tuning When deciding on a GPU spec to train or fine-tune a model, you're likely going to need to hold onto the pod for hours or even days for your training run. Even a difference of a few cents per hour easily adds up, especially if you have a
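To see how quickly those cents add up, here is a back-of-the-envelope calculation (the hourly rates are hypothetical placeholders, not quoted prices):

```python
# Hypothetical hourly rates for two comparable GPU specs.
rate_a = 0.44  # $/hr
rate_b = 0.39  # $/hr

hours = 72  # a three-day fine-tuning run
print(f"Spec A: ${rate_a * hours:.2f}")             # Spec A: $31.68
print(f"Spec B: ${rate_b * hours:.2f}")             # Spec B: $28.08
print(f"Saved:  ${(rate_a - rate_b) * hours:.2f}")  # Saved:  $3.60
```

A five-cent gap over one long run is a few dollars; across dozens of experiments it becomes real money, which is why spec selection is worth thinking through up front.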
DeepSeek R1 - What's the Hype? DeepSeek R1 is a recently released model that has been topping benchmarks in several key areas. Here are some of the leaderboards it's shot to the top of:
* LiveBench: Second place, with only OpenAI's o1-2024-12-17 surpassing it as of this writing.
* Aider: Ditto.
* Artificial Analysis: Fifth place,
5090s Are Almost Here: How Do They Shape Up Against the 4090? Another year has come, and another new card generation from NVIDIA is on the way. 5090s are due to become widely available this January, and RunPod is going to be extremely eager to support them once they do. Along with the new Blackwell architecture, the 5090 is set to launch
How Do I Transfer Data Into My Pod? We've gotten a number of questions recently about how to transfer data into pods, and while we do cover some transfer methods in our docs, we'd like to go into a bit more detail on how to upload files into your pod. In general, for large
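As one hedged example of pulling a large file into a pod from inside the pod itself (assuming the file is already hosted at an HTTP URL you control; the URL and paths are placeholders):

```python
import requests

# Stream a large remote file onto the pod's persistent /workspace volume
# so it survives pod restarts (URL and filename are placeholders).
url = "https://example.com/datasets/train.tar.gz"
dest = "/workspace/train.tar.gz"

with requests.get(url, stream=True, timeout=60) as resp:
    resp.raise_for_status()
    with open(dest, "wb") as f:
        for chunk in resp.iter_content(chunk_size=1 << 20):  # 1 MiB chunks
            f.write(chunk)

print(f"Downloaded {url} to {dest}")
```

Streaming in chunks keeps memory use flat regardless of file size, which matters when datasets run into the tens of gigabytes.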
What's New for Serverless LLM Usage in RunPod in 2025? Of all the use cases for our serverless architecture, LLMs are one of the best examples. Because so much of LLM use depends on the human on the other end reading, digesting, and typing a response, you save so much on GPU spend by ensuring
H200 Tensor Core GPUs Now Available on RunPod We're pleased to announce that the H200 is now available on RunPod at a price point of $3.99/hr in Secure Cloud. This GPU spec boosts the available VRAM for NVIDIA-based applications up to 141GB in a single unit, along with increased memory bandwidth. Here's how the
RunPod Sponsors CivitAI's Project Odyssey 2024 Competition RunPod is proud to sponsor Season 2 of Project Odyssey 2024 from CivitAI, the world's largest AI filmmaking competition. We've written in the past about prominent open source packages like LTX, Mochi, and Hunyuan Video – here's your chance to show off your skills and
Train Your Own Video LoRAs with diffusion-pipe You can now train your own LoRAs for Flux, Hunyuan Video, and LTX Video with tdrussell's diffusion-pipe, a training script for video diffusion models. Let's run through an example of how this is done with Hunyuan Video. Start Up a Pod First, start up a pod with
A Leap into the Unknown: Why I Joined RunPod This entry has been contributed by Jean-Michael Desrosiers, Head of Enterprise at RunPod. I take shots—sometimes far too many, and in wildly different directions. I always have, and it’s been a part of my DNA for as long as I can remember. Picture an overly enthusiastic explorer darting
Deploy Repos Straight to RunPod with GitHub Integration RunPod is pleased to announce its latest feature aimed at making the lives of developers easier: GitHub integration! Previously, Docker images were the primary method of deploying endpoints, and while this approach is still functional and useful, it requires a number of intermediary steps. Now, with GitHub integration, you can deploy directly
Lightricks LTXVideo: Sleeper Hit Open Source Video Generation With new packages like Mochi and Hunyuan Video now out, some other video packages have slipped under the radar that definitely deserve some more love. LTXVideo by Lightricks appears to be slept on despite coming out with an out of the
Community Spotlight: How AnonAI Scales Its Chatbot Agents Through RunPod RunPod is pleased to share the story of one of our valued clients, Autonomous. We at RunPod believe very strongly in the power of free speech and privacy - our pods are run in secure environments with optional encryption and we stand by our promise that we do not inspect
Announcing Global Networking For Cross-Data Center Communication RunPod is pleased to announce the launch of our Global Networking feature, which allows for cross-data center communication between pods. When a pod is deployed with the feature enabled, your pods can communicate with each other over a virtual internal network facilitated by RunPod. This means that you can have pods
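As a sketch of what that can look like in practice (the ".runpod.internal" hostname format below is an assumption for illustration; check the Global Networking docs for the actual DNS scheme), one pod can serve an API that a pod in another data center consumes over the private network:

```python
import requests

# On pod B, reach a service that pod A exposes on the internal network.
# The hostname format is assumed for this sketch; the pod ID is a placeholder.
POD_A_HOST = "abc123xyz.runpod.internal"
resp = requests.get(f"http://{POD_A_HOST}:8000/health", timeout=10)
print(resp.status_code, resp.text)
```

The appeal is that the service never has to be exposed to the public internet; traffic stays on the virtual network RunPod facilitates between your pods.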