RunPod Blog
RunPod Sponsors CivitAI's Project Odyssey 2024 Competition

RunPod is proud to sponsor Season 2 of Project Odyssey 2024 from CivitAI, the world's largest AI filmmaking competition. We've written in the past about prominent open source packages like LTX, Mochi, and Hunyuan Video – here's your chance to show off your skills and
25 Dec 2024 2 min read
Train Your Own Video LoRAs with diffusion-pipe

You can now train your own LoRAs for Flux, Hunyuan Video, and LTX Video with tdrussells' diffusion-pipe, a training script for video diffusion models. Let's run through an example of how this is done with Hunyuan Video. Start Up a Pod First, start up a pod with
23 Dec 2024 4 min read
Serverless for Artificial Intelligence and Machine Learning Workloads

The need to scale, reduce operational overhead, and achieve cost efficiency makes serverless computing a natural fit for AI/ML workloads. With traditional infrastructure, scaling often leads to expensive cost management and hardware maintenance burdens that quickly become unsustainable. RunPod dynamically allocates resources in these instances to work seamlessly with modern AI workflows. This
20 Dec 2024 4 min read
A Leap into the Unknown: Why I Joined RunPod

This entry has been contributed by Jean-Michael Desrosiers, Head of Enterprise at RunPod. I take shots—sometimes far too many, and in wildly different directions. I always have, and it’s been a part of my DNA for as long as I can remember. Picture an overly enthusiastic explorer darting
13 Dec 2024 7 min read
Deploy Repos Straight to RunPod with GitHub Integration

RunPod is pleased to announce its latest feature aimed at making the lives of developers easier: GitHub integration! Previously, Docker images were the primary method of deploying endpoints, and while this is still functional and useful, it requires a number of intermediary steps. Now, with GitHub integration you can deploy directly
11 Dec 2024 3 min read
Lightricks LTXVideo: Sleeper Hit Open Source Video Generation

With new packages like Mochi and Hunyuan Video now out, some other video packages have slipped under the radar and definitely deserve some more love. LTXVideo by Lightricks appears to be slept on despite coming out with an out of the
10 Dec 2024 4 min read
Building an OCR System Using RunPod Serverless

Learn how to build an Optical Character Recognition (OCR) system using RunPod Serverless and pre-trained models from Hugging Face to automate the processing of receipts and invoices. Introduction Processing receipts and invoices manually is both time-consuming and prone to errors. Optical Character Recognition (OCR) systems can automate this task by
05 Dec 2024 4 min read
Community Spotlight: How AnonAI Scales Its Chatbot Agents Through RunPod

RunPod is pleased to share the story of one of our valued clients, Autonomous. We at RunPod believe very strongly in the power of free speech and privacy - our pods are run in secure environments with optional encryption and we stand by our promise that we do not inspect
03 Dec 2024 3 min read
Announcing Global Networking For Cross-Data Center Communication

RunPod is pleased to announce its launch of our Global Networking feature, which allows for cross-data center communication between pods. When a pod with the feature is deployed, your pods can communicate with each other over a virtual internal network facilitated by RunPod. This means that you can have pods
02 Dec 2024 5 min read
How Much Can a GPU Cloud Save You, Really?

Machine learning, AI, and data science workloads rely on powerful GPUs to run effectively, so organizations must decide whether to invest in on-prem GPU clusters or use cloud-based GPU solutions like RunPod. This article examines infrastructure requirements and compares cost and performance to help you choose
22 Nov 2024 6 min read
Scoped API Keys Now Available on RunPod

We've released an expansion to our handling of API keys on RunPod. Previously, you were able to create API keys with read or read and write permissions, but now you can scope keys by endpoint and have more fine-grained control over what your keys allow access to.
18 Nov 2024 2 min read
When to Use (Or Not Use) RunPod's Proxy

RunPod uses a proxy system to give you easy access to your pods without needing to make any configuration changes. This proxy utilizes Cloudflare for ease of both implementation and access, which comes with several benefits and drawbacks. Let's go into a little explainer about specifically
13 Nov 2024 3 min read
Comparing Different Quantization Methods: Speed Versus Quality Tradeoffs

Introduction Quantization is a key technique in machine learning that is used to reduce the model size and speed up inference, especially when deploying models on hardware with resource constraints. Nevertheless, achieving a good quantization setup means balancing the model performance against the computational efficiency required by the deployment environment.
12 Nov 2024 5 min read
Community Spotlight: How to Build and Deploy an AI Chatbot from Scratch on RunPod

In an extremely generous contribution to the RunPod community, our friends at Code in a Jiffy recently shared their journey of building a complete coffee shop application enhanced with artificial intelligence. This comprehensive project showcases how AI can transform everyday commerce applications into intelligent, interactive experiences. The video is 12
06 Nov 2024 3 min read
Classifier Free Guidance in LLMs - How Does It Work?

Classifier-Free Guidance (CFG) has emerged as a powerful technique for improving the quality and controllability of language model outputs. While initially developed for image generation models, CFG has found successful applications in text generation. Let's dive deep into how this technique works and why it's becoming
04 Nov 2024 9 min read
Mochi 1 Text-To-Video Represents New SOTA In Open Source Video Gen

Text-to-video generation is a space where open source has lagged behind for some time, due to the difficulty and cost involved in training and evaluating video as opposed to text and images. Offerings such as Sora, while impressive, beg for open-source alternatives where you can create videos of any kind
28 Oct 2024 4 min read
Stability.ai Releases Stable Diffusion 3.5 - What's New in the Latest Generation?

On October 22, Stability.AI released its latest version of Stable Diffusion, SD3.5. There are currently two versions out (Large and Large Turbo), with the former geared towards quality and the latter favoring efficiency. Next week, Medium will be released, aimed at smaller GPU specs. You can quickly and easily
24 Oct 2024 4 min read
NVidia's Llama 3.1 Nemotron 70b Instruct: Can It Handle My Unsolved LLM Problem?

Earlier this month, NVidia released Llama 3.1 Nemotron Instruct, a 70b model that has taken some notably high spots on various leaderboards, seeming to punch far above its weight. As of October 14th, it is not only beating high-end closed source models that far outweigh it like Claude 3
18 Oct 2024 11 min read
How to Code Directly With Stable Diffusion Within Python On RunPod

While there are many useful front ends for prompting Stable Diffusion, in some ways it can be easier to simply run it directly within Jupyter Notebook, which comes pre-installed in many RunPod templates. Once you spin up a pod, you get instant access to Jupyter as well, allowing you to directly
14 Oct 2024 7 min read
Why LLMs Can't Spell 'Strawberry' And Other Odd Use Cases

Picture this: You've got an AI language model - let's call it Bahama-3-70b - who can write sonnets, explain quantum physics, and even crack jokes. But ask it to count the r's in "strawberry," and suddenly it's like a toddler
01 Oct 2024 3 min read
How to Easily Work with GGUF Quantizations In KoboldCPP
Text Generation

Everyone wants more bang for their buck when it comes to their business expenditures, and we want to ensure you have as many options as possible. Although you could certainly load full-weight fp16 models, it turns out that you may not actually need that level of precision, and it may
25 Sep 2024 6 min read
Introducing Better Launcher: Spin Up New Stable Diffusion Pods Quicker Than Before
Image Generation

Our very own Madiator2011 has done it again with the release of Better Forge, a streamlined template that lets you spin up an instance with a minimum of fuss. One fairly consistent piece of feedback brought up by RunPod users is how long it takes to start up an image
20 Sep 2024 5 min read
Use RunPod Serverless To Run Very Large Language Models Securely and Privately

As discussed previously, a human interacting with a chatbot is one of the prime use cases for RunPod serverless functions. Because most of the elapsed time is on the human's end, where they are reading, processing, and responding, the GPU sits idle for the vast majority
18 Sep 2024 5 min read
Evaluate Multiple LLMs Simultaneously in a Flash with ollama

Imagine you are a studio manager tasked with serving up a creative writing assistant to your users, and are directed to select only a few best candidates to run on endpoints to keep the project maintainable and within scope. As of the writing of this article, there are more than
13 Sep 2024 15 min read
Optimize Your vLLM Deployments on RunPod with GuideLLM

As a RunPod user, you're already leveraging the power of GPU cloud computing for your machine learning projects. But are you getting the most out of your vLLM deployments? Enter GuideLLM, a powerful tool that can help you evaluate and optimize your Large Language Model (LLM) deployments for
10 Sep 2024 2 min read