RunPod Blog
Stability.ai Releases Stable Diffusion 3.5 - What's New in the Latest Generation?

On October 22, Stability.AI released its latest version of Stable Diffusion, SD3.5. There are currently two versions out (Large and Large Turbo), with the former geared toward quality and the latter toward efficiency. Next week, Medium will release, aimed at smaller GPU specs. You can quickly and easily
24 Oct 2024 4 min read
NVidia's Llama 3.1 Nemotron 70b Instruct: Can It Handle My Unsolved LLM Problem?

Earlier this month, NVidia released Llama 3.1 Nemotron Instruct, a 70b model that has taken some notably high spots on various leaderboards, seeming to punch far above its weight. As of October 14th, it is not only beating high-end closed source models that far outweigh it like Claude 3
18 Oct 2024 11 min read
How to Code Directly With Stable Diffusion Within Python On RunPod

While there are many useful front ends for prompting Stable Diffusion, in some ways it can be easier to simply run it directly within Jupyter Notebook, which comes pre-installed in many RunPod templates. Once you spin up a pod you get instant access to Jupyter as well, allowing you to directly
14 Oct 2024 7 min read
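A minimal sketch of that workflow, assuming the Hugging Face `diffusers` package is installed on the pod; the checkpoint ID and generation settings below are illustrative placeholders, not taken from the post:

```python
import os

def build_generation_kwargs(prompt: str, steps: int = 30, guidance: float = 7.5) -> dict:
    """Collect the generation settings in one dict so they're easy to tweak per cell."""
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": guidance,
    }

# The GPU-heavy part is gated behind an env var so the sketch stays runnable
# on machines without a GPU or the model weights.
if os.environ.get("RUN_SD_DEMO") == "1":
    import torch
    from diffusers import StableDiffusionPipeline

    # Placeholder checkpoint; substitute whichever model your pod uses.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    result = pipe(**build_generation_kwargs("a watercolor fox in a misty forest"))
    result.images[0].save("fox.png")
```

In a notebook you would typically run the pipeline setup once in its own cell, then iterate on prompts in later cells.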
Why LLMs Can't Spell 'Strawberry' And Other Odd Use Cases

Picture this: You've got an AI language model - let's call it Bahama-3-70b - who can write sonnets, explain quantum physics, and even crack jokes. But ask it to count the r's in "strawberry," and suddenly it's like a toddler
01 Oct 2024 3 min read
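For contrast, plain code counts the letters trivially; the puzzle the post unpacks is that an LLM never sees individual characters in the first place:

```python
# Character-level counting is trivial in ordinary code; an LLM sees subword
# tokens (e.g. "straw" + "berry"), so individual letters are invisible to it.
word = "strawberry"
r_count = word.count("r")
print(r_count)  # → 3
```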
How to Easily Work with GGUF Quantizations In KoboldCPP
Text Generation

Everyone wants more bang for their buck when it comes to their business expenditures, and we want to ensure you have as many options as possible. Although you could certainly load full-weight fp16 models, it turns out that you may not actually need that level of precision, and it may
25 Sep 2024 6 min read
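Back-of-the-envelope arithmetic shows why dropping below fp16 matters; the figures below cover weights only, ignoring KV cache and runtime overhead:

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB: parameter count x bits per weight / 8 bits per byte."""
    return params_billions * bits_per_weight / 8

print(weight_memory_gb(70, 16))   # fp16 70b: ~140 GB
print(weight_memory_gb(70, 4.5))  # ~4-bit GGUF quant: ~39 GB
```

That difference is roughly what lets a 70b model fit on one or two 48GB cards instead of four.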
Introducing Better Launcher: Spin Up New Stable Diffusion Pods Quicker Than Before
Image Generation

Our very own Madiator2011 has done it again with the release of Better Forge, a streamlined template that lets you spin up an instance with a minimum of fuss. One fairly consistent piece of feedback brought up by RunPod users is how long it takes to start up an image
20 Sep 2024 5 min read
Use RunPod Serverless To Run Very Large Language Models Securely and Privately

As discussed previously, a human interacting with a chatbot is one of the prime use cases for RunPod serverless functions. Because most of the elapsed time is on the human's end, where they are reading, processing, and responding, the GPU sits idle for the vast majority
18 Sep 2024 5 min read
Evaluate Multiple LLMs Simultaneously in a Flash with ollama

Imagine you are a studio manager tasked with serving up a creative writing assistant to your users, and are directed to select only a few best candidates to run on endpoints to keep the project maintainable and within scope. As of the writing of this article, there are more than
13 Sep 2024 15 min read
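Fanning one prompt out to several candidate models can be sketched against ollama's REST API (`POST /api/generate` on its default local port); the model tags below are placeholders for whatever you have pulled:

```python
import json
import os

OLLAMA_URL = "http://localhost:11434/api/generate"  # ollama's default local endpoint

def build_requests(prompt: str, models: list[str]) -> list[dict]:
    """One non-streaming /api/generate payload per candidate model."""
    return [{"model": m, "prompt": prompt, "stream": False} for m in models]

# Placeholder model tags; use whatever you've fetched with `ollama pull`.
payloads = build_requests(
    "Write the opening line of a mystery novel.",
    ["llama3.1", "mistral-nemo", "gemma2"],
)

# Actually sending the requests needs a running ollama server, so it's gated here.
if os.environ.get("OLLAMA_DEMO") == "1":
    import urllib.request
    for p in payloads:
        req = urllib.request.Request(
            OLLAMA_URL,
            data=json.dumps(p).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            print(p["model"], json.loads(resp.read())["response"][:80])
```

Running the same prompt through each model side by side makes the "pick a few best candidates" comparison concrete.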
Optimize Your vLLM Deployments on RunPod with GuideLLM

As a RunPod user, you're already leveraging the power of GPU cloud computing for your machine learning projects. But are you getting the most out of your vLLM deployments? Enter GuideLLM, a powerful tool that can help you evaluate and optimize your Large Language Model (LLM) deployments for
10 Sep 2024 2 min read
RunPod Weekly #17 - Pricing Updates, SGLang Worker (Beta), Blogs
RunPod Weekly

Welcome to another round of RunPod Weekly! This week, we are excited to share the following: 📈 Pricing Updates We've been running a temporary promotion for A40 48GB GPUs, known for their exceptional combination of vRAM, performance, and pricing. We've been thrilled to see the amazing products
30 Aug 2024 3 min read
Run Gemma 7b with vLLM on RunPod Serverless

In this blog, you'll learn: * About RunPod's latest vLLM worker for the newest models * Why vLLM is an excellent choice for running Google’s Gemma 7B * A step-by-step guide to get Google Gemma 7B up and running on RunPod Serverless with the quick deploy vLLM worker.
22 Aug 2024 5 min read
Run Llama 3.1 with vLLM on RunPod Serverless

In this blog, you'll learn: * About RunPod's latest vLLM worker for the newest models * Why vLLM is an excellent choice for running Meta's Llama 3.1 * A step-by-step guide to get Meta Llama 3.1's 8b-instruct version up and running on RunPod
20 Aug 2024 7 min read
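Once a vLLM worker is deployed, requests typically go through RunPod's serverless HTTP API; a minimal sketch, where the endpoint ID, API key, and sampling parameters are placeholders:

```python
import json
import os

def build_request(endpoint_id: str, api_key: str, prompt: str):
    """Assemble the URL, headers, and body for a synchronous serverless run."""
    url = f"https://api.runpod.ai/v2/{endpoint_id}/runsync"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {"input": {"prompt": prompt, "sampling_params": {"max_tokens": 200}}}
    return url, headers, body

url, headers, body = build_request(
    "YOUR_ENDPOINT_ID",  # placeholder: your deployed vLLM endpoint's ID
    "YOUR_API_KEY",      # placeholder: a RunPod API key
    "Summarize what makes Llama 3.1 notable in one sentence.",
)

# The actual call needs a live endpoint, so it's gated behind an env var.
if os.environ.get("RUNPOD_DEMO") == "1":
    import urllib.request
    req = urllib.request.Request(url, data=json.dumps(body).encode(), headers=headers)
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read()))
```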
RunPod Weekly #16 - Serverless Improvements, Llama 3.1 on vLLM, Better Rag Support, Blogs
RunPod Weekly

Welcome to another round of RunPod Weekly! This week, we are excited to share the following: ✨ Serverless Improvements Our workers view has been revamped to give a more in-depth overview of each worker, where it's located, and its current state. You can now also expose HTTP
16 Aug 2024 2 min read
Supercharge Your LLMs Using SGLang For Inference: Why Speed and Efficiency Matter More Than Ever

RunPod is proud to partner with LMSys once again to put a spotlight on its inference engine SGLang. LMSys has a storied history within the realm of language models with prior contributions such as the Chatbot Arena which compares outputs from competing models, Vicuna, an open source competitor to ChatGPT,
15 Aug 2024 6 min read
How to Run Flux Image Generator with ComfyUI

What is Flux? Flux is an innovative text-to-image AI model developed by Black Forest Labs that has quickly gained popularity among generative AI enthusiasts and digital artists. Its ability to generate high-quality images from simple text prompts sets it apart. The Flux 1 family includes three versions of their image
13 Aug 2024 5 min read
How to run Flux image generator with RunPod

What is Flux? Flux is a new and exciting text-to-image AI model developed by Black Forest Labs. This innovative model family has quickly captured the attention of generative AI enthusiasts and digital artists alike, thanks to its remarkable ability to generate high-quality images from simple text prompts. The Flux 1 family
08 Aug 2024 6 min read
RunPod Weekly #15 - New Referral Program, Community Changelog, Blogs
RunPod Weekly

Welcome to another round of RunPod Weekly! This week, we are excited to share the following: 🤝 New Referral Program We've reworked our referral program to make it easier (and more lucrative) for anyone to get started. These changes include higher reward rates, a new serverless referral program, no
02 Aug 2024 3 min read
How to run SAM 2 on a cloud GPU with RunPod

What is SAM 2? Meta has unveiled Segment Anything Model 2 (SAM 2), a revolutionary advancement in object segmentation. Building on the success of its predecessor, SAM 2 integrates real-time, promptable object segmentation for both images and videos, enhancing accuracy and speed. Its ability to operate across previously unseen visual
02 Aug 2024 6 min read
Run Llama 3.1 405B with Ollama: A Step-by-Step Guide

Meta’s recent release of the Llama 3.1 405B model has made waves in the AI community. This groundbreaking open-source model not only matches but even surpasses the performance of leading closed-source models. With impressive scores on reasoning tasks (96.9 on ARC Challenge and 96.8 on GSM8K)
29 Jul 2024 5 min read
Master the Art of Serverless Scaling: Optimize Performance and Costs on RunPod

In many sports – golf, baseball, tennis, among others – there is a "sweet spot" to aim for which results in the maximum amount of lift or distance for the ball given an equivalent amount of kinetic energy in the swing. While you'll still get somewhere with an
25 Jul 2024 7 min read
Introducing RunPod’s New and Improved Referral Program

Referring friends to RunPod just got much easier. From now until the end of the year (December 31st, 2024), we've removed all eligibility requirements for the referral program and increased the referral commission from 2% to 3% on GPU Pods and from 0% to 5% on Serverless. No
23 Jul 2024 2 min read
RunPod Weekly #14 - Pricing Changes, Community Changelog, Blogs
RunPod Weekly

Welcome to another round of RunPod Weekly! This week, we are excited to share the following: 💸 Pricing Changes RunPod pricing is dropping by up to 40% on Serverless and up to 18% on Secure Cloud. Why We're Doing This GPUs aren't cheap, nor is the infrastructure
19 Jul 2024 3 min read
How to run vLLM with RunPod Serverless

In this blog you’ll learn: 1. When to choose between closed source LLMs like ChatGPT and open source LLMs like Llama-7b 2. How to deploy an open source LLM with vLLM If you're not familiar, vLLM is a powerful LLM inference engine that boosts performance (up to
18 Jul 2024 5 min read
RunPod Slashes GPU Prices: Powering Your AI Applications for Less

RunPod is dropping prices across our Serverless and Secure Cloud services. Why? Because we believe in giving you the firepower you need to build applications without breaking the bank. The Lowdown on Our New Pricing Let's cut to the chase. Here's what's changing: Serverless:
12 Jul 2024 3 min read
RAG vs. Fine-Tuning: Which Method is Best for Large Language Models (LLMs)?

Large Language Models (LLMs) have changed the way we interact with technology, powering everything from chatbots to content-generation tools. But these models often struggle with handling domain-specific prompts and new information that isn't included in their training data. So, how can we make these powerful models more adaptable?
11 Jul 2024 8 min read
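The retrieval half of that comparison can be illustrated with a toy retriever that scores documents by word overlap before stuffing the winner into the prompt; real RAG systems use embedding similarity, but the flow is the same:

```python
def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by how many lowercase words they share with the query (toy scoring)."""
    q_words = set(query.lower().split())
    return sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )[:k]

docs = [
    "Fine-tuning updates a model's weights on domain data.",
    "RAG retrieves relevant documents and adds them to the prompt.",
    "Serverless GPUs bill only while a request is running.",
]

question = "how does RAG add documents to the prompt"
context = retrieve(question, docs)[0]
prompt = f"Context: {context}\n\nQuestion: {question}"
print(context)
```

Fine-tuning, by contrast, bakes the new knowledge into the weights themselves, so there is no retrieval step at inference time.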
Page 4 of 9