RTX 5090 LLM Benchmarks for AI: Is It the Best GPU for ML? The AI landscape demands ever-increasing performance, especially for large language model (LLM) inference. Today, we're excited to showcase how the NVIDIA RTX 5090 is reshaping what's possible in AI compute, with breakthrough performance that outpaces even specialized data center hardware. Benchmark Showdown: RTX
The Complete Guide to Training Video LoRAs: From Concept to Creation Learn how to train custom video LoRAs for models like Wan, Hunyuan Video, and LTX Video. This guide covers hyperparameters, dataset prep, and best practices to help you fine-tune high-quality, motion-aware video outputs.
Llama-4 Scout and Maverick Are Here—How Do They Shape Up? Meta has been one of the kings of open source, open weight large language models. Their first foray with Llama-1 in 2023, while limited in its application and licensing, sent a clear signal to the community that there was an alternative to large closed-off models. Later in 2023 we got
Introducing Easy LLM Fine-Tuning on RunPod: Axolotl Made Simple At RunPod, we're constantly looking for ways to make AI development more accessible. Today, we're excited to announce our newest feature: a pre-configured Axolotl environment for LLM fine-tuning that dramatically simplifies the process of customizing models to your specific needs. Why Fine-Tuning Matters Fine-tuning large language
Open Source Video and LLM: New Model Roundup Remember when generating decent-looking videos with AI seemed like something only the big tech companies could pull off? Those days are officially over. 2024 brought us a wave of seriously impressive open-source video generation models that anyone can download and start playing with. And here's the kicker -
Streamline GPU Cloud Management with RunPod's New REST API Managing GPU resources has always been a bit of a pain point, with most of the time spent clicking through interfaces doing repetitive manual configuration. Our new API lets you control everything through code instead, which is great news for those who'd rather automate repetitive tasks and focus
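As a rough sketch of what that automation can look like (the endpoint paths and payload fields below are assumptions for illustration; check the REST API docs for the authoritative schema), listing and creating pods from Python might look like this:

```python
import os
import requests

API_BASE = "https://rest.runpod.io/v1"  # assumed base URL; confirm in the API docs
HEADERS = {"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"}

# List existing pods (endpoint path assumed for this sketch).
pods = requests.get(f"{API_BASE}/pods", headers=HEADERS, timeout=30)
pods.raise_for_status()
for pod in pods.json():
    print(pod)

# Create a pod; the field names here are illustrative, not authoritative.
new_pod = requests.post(
    f"{API_BASE}/pods",
    headers=HEADERS,
    json={
        "name": "automated-pod",
        "imageName": "runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04",
        "gpuCount": 1,
    },
    timeout=30,
)
new_pod.raise_for_status()
print(new_pod.json())
```

Because every step is an HTTP call, the same pattern drops straight into CI pipelines or scheduled jobs, with no clicking required.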
Unveiling Enhanced CPU Pods: Docker Runtime and Network Volume Support We're excited to announce two significant upgrades to our CPU pods that will streamline your development workflow and expand storage options. Our CPU pods now feature Docker runtime (replacing Kata Containers) and support for network volumes—previously exclusive to our GPU pods.
Run DeepSeek R1 On Just 480GB of VRAM Even with the recent successes of closed models like Grok and Claude 3.7 Sonnet, DeepSeek R1 is still considered a heavyweight in the LLM arena as a whole, and remains the uncontested open-source LLM champion (at least until DeepSeek R2 launches, anyway). We've written before about the concerns of
GitHub Integration Now In GA - Build Images from GitHub Repos Even Faster RunPod is pleased to announce that our GitHub integration is officially out of beta and ready for production use! This feature enables you to iterate more quickly by building images to deploy on RunPod serverless directly from a GitHub repo, removing all of the friction involved in creating
Introduction to Websocket Streaming with RunPod Serverless In this follow-up to our 'Hello World' tutorial, we'll create a serverless endpoint that processes base64-encoded files and streams back the results. This demonstrates how you can work with file input/output in our serverless environment by encoding the file as data within a JSON
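To make the pattern concrete, here is a minimal sketch of a streaming handler in the shape the tutorial describes: it decodes a base64 payload from the JSON input and yields the result back chunk by chunk (the "file" input key is an assumed name for this sketch, not a required field):

```python
import base64
import runpod

def handler(job):
    # The input file arrives as base64 text inside the JSON payload
    # (the "file" key name is an assumption for illustration).
    raw = base64.b64decode(job["input"]["file"])

    # A generator handler yields partial results instead of returning
    # one final value, which is what enables streaming.
    chunk_size = 1024
    for i in range(0, len(raw), chunk_size):
        chunk = raw[i : i + chunk_size]
        yield {"chunk": base64.b64encode(chunk).decode("utf-8")}

runpod.serverless.start({
    "handler": handler,
    "return_aggregate_stream": True,  # also expose the combined result on /run
})
```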
How To Run a "Hello World" in RunPod Serverless If you're new to serverless computing and Docker, this guide will walk you through creating your first RunPod serverless endpoint from scratch. We'll build a simple "Hello World" application that demonstrates the basic concepts of serverless deployment on RunPod's platform. You'
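The core of that tutorial is a single handler function. As a minimal sketch, mirroring the pattern from RunPod's serverless docs:

```python
import runpod

def handler(job):
    # Each request's JSON "input" field arrives here as a dict.
    name = job["input"].get("name", "World")
    return f"Hello, {name}!"

# Register the handler and start listening for jobs.
runpod.serverless.start({"handler": handler})
```

Packaged into a Docker image, this is all the code a working endpoint needs; everything else is deployment plumbing that the guide walks through.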
Mistral Small 3 Eschews Synthetic Data - What Does This Mean? Mistral AI has released Mistral Small 3, which it claims was trained without synthetic data in its training pipeline. Weighing in at 24B parameters with 32k context, this is a lightweight model that can be run at full weights on an A40, and on nearly any GPU spec that we offer when quantized.
The Complete Guide to GPU Requirements for LLM Fine-tuning When deciding on a GPU spec to train or fine-tune a model, you're likely going to need to hold onto the pod for hours or even days for your training run. Even a difference of a few cents per hour easily adds up, especially if you have a
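To see how quickly those cents add up, here is a back-of-the-envelope calculation (the hourly rates are hypothetical placeholders, not quoted prices):

```python
# Hypothetical hourly rates for two comparable GPU specs.
rate_a = 0.44  # $/hr
rate_b = 0.39  # $/hr

hours = 72  # a three-day fine-tuning run
print(f"Spec A: ${rate_a * hours:.2f}")             # Spec A: $31.68
print(f"Spec B: ${rate_b * hours:.2f}")             # Spec B: $28.08
print(f"Saved:  ${(rate_a - rate_b) * hours:.2f}")  # Saved:  $3.60
```

A five-cent gap over one long run is a few dollars; across dozens of experiments it becomes real money, which is why spec selection is worth thinking through up front.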
DeepSeek R1 - What's the Hype? DeepSeek R1 is a recently released model that has been topping benchmarks in several key areas. Here are some of the leaderboards it's shot to the top of:
* LiveBench: Second place, with only OpenAI's o1-2024-12-17 surpassing it as of this writing.
* Aider: Ditto.
* Artificial Analysis: Fifth place,
5090s Are Almost Here: How Do They Shape Up Against the 4090? Another year has come, and another new card generation from NVIDIA is on the way. 5090s are due to become widely available this January, and RunPod is going to be extremely eager to support them once they do. Along with the new Blackwell architecture, the 5090 is set to launch
How Do I Transfer Data Into My Pod? We've gotten a number of questions recently about how to transfer data into pods, and while we do cover some transfer methods in our docs, we'd like to go into a bit more detail on how to upload files into your pod. In general, for large
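As one hedged example of pulling a large file into a pod from inside the pod itself (assuming the file is already hosted at an HTTP URL you control; the URL and paths are placeholders):

```python
import requests

# Stream a large remote file onto the pod's persistent /workspace volume
# so it survives pod restarts (URL and filename are placeholders).
url = "https://example.com/datasets/train.tar.gz"
dest = "/workspace/train.tar.gz"

with requests.get(url, stream=True, timeout=60) as resp:
    resp.raise_for_status()
    with open(dest, "wb") as f:
        for chunk in resp.iter_content(chunk_size=1 << 20):  # 1 MiB chunks
            f.write(chunk)

print(f"Downloaded {url} to {dest}")
```

Streaming in chunks keeps memory use flat regardless of file size, which matters when datasets run into the tens of gigabytes.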
What's New for Serverless LLM Usage in RunPod in 2025? Of all the use cases for our serverless architecture, LLMs are one of the best examples. Because so much of LLM use depends on the human on the other end reading, digesting, and typing a response, you save so much on GPU spend by ensuring
H200 Tensor Core GPUs Now Available on RunPod We're pleased to announce that the H200 is now available on RunPod at a price point of $3.99/hr in Secure Cloud. This GPU spec boosts the available VRAM for NVIDIA-based applications up to 141GB in a single unit, along with increased memory bandwidth. Here's how the
RunPod Sponsors CivitAI's Project Odyssey 2024 Competition RunPod is proud to sponsor Season 2 of Project Odyssey 2024 from CivitAI, the world's largest AI filmmaking competition. We've written in the past about prominent open source packages like LTX, Mochi, and Hunyuan Video – here's your chance to show off your skills and
Train Your Own Video LoRAs with diffusion-pipe You can now train your own LoRAs for Flux, Hunyuan Video, and LTX Video with tdrussell's diffusion-pipe, a training script for video diffusion models. Let's run through an example of how this is done with Hunyuan Video. Start Up a Pod First, start up a pod with
A Leap into the Unknown: Why I Joined RunPod This entry has been contributed by Jean-Michael Desrosiers, Head of Enterprise at RunPod. I take shots—sometimes far too many, and in wildly different directions. I always have, and it’s been a part of my DNA for as long as I can remember. Picture an overly enthusiastic explorer darting
Deploy Repos Straight to RunPod with GitHub Integration RunPod is pleased to announce its latest feature aimed at making the lives of developers easier: GitHub integration! Previously, Docker images were the primary method of deploying endpoints, and while this approach is still functional and useful, it requires a number of intermediary steps. Now, with GitHub integration, you can deploy directly
Lightricks LTXVideo: Sleeper Hit Open Source Video Generation With new packages like Mochi and Hunyuan Video now out, some other video packages have slipped under the radar that definitely deserve some more love. LTXVideo by Lightricks appears to be slept on despite coming out with an out of the
Community Spotlight: How AnonAI Scales Its Chatbot Agents Through RunPod RunPod is pleased to share the story of one of our valued clients, Autonomous. We at RunPod believe very strongly in the power of free speech and privacy - our pods are run in secure environments with optional encryption and we stand by our promise that we do not inspect
Announcing Global Networking For Cross-Data Center Communication RunPod is pleased to announce the launch of our Global Networking feature, which allows for cross-data center communication between pods. When a pod is deployed with the feature enabled, your pods can communicate with each other over a virtual internal network facilitated by RunPod. This means that you can have pods
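As a sketch of what that can look like in practice (the ".runpod.internal" hostname format below is an assumption for illustration; check the Global Networking docs for the actual DNS scheme), one pod can serve an API that a pod in another data center consumes over the private network:

```python
import requests

# On pod B, reach a service that pod A exposes on the internal network.
# The hostname format is assumed for this sketch; the pod ID is a placeholder.
POD_A_HOST = "abc123xyz.runpod.internal"
resp = requests.get(f"http://{POD_A_HOST}:8000/health", timeout=10)
print(resp.status_code, resp.text)
```

The appeal is that the service never has to be exposed to the public internet; traffic stays on the virtual network RunPod facilitates between your pods.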