RunPod Blog
How Much VRAM Does Your LLM Need? A Guide to GPU Memory Requirements
GPU Power

Discover how to determine the right VRAM for your Large Language Model (LLM). Learn about GPU memory requirements, model parameters, and tools to optimize your AI deployments.
08 Jul 2024 5 min read
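The sizing rule the guide walks through can be sketched as a quick back-of-the-envelope calculation. The helper below is illustrative, not code from the post; the 1.2 overhead multiplier for KV cache and activations is an assumption for the sketch:

```python
def estimate_vram_gb(params_billions: float,
                     bytes_per_param: float = 2.0,
                     overhead: float = 1.2) -> float:
    """Rough VRAM estimate for LLM inference.

    params_billions: model size, e.g. 70 for Llama-3 70B
    bytes_per_param: 2.0 for FP16/BF16, 1.0 for 8-bit, 0.5 for 4-bit
    overhead: illustrative multiplier for KV cache and activations
    """
    return params_billions * bytes_per_param * overhead

# A 70B model in FP16 lands around 168 GB before quantization,
# while 4-bit quantization brings it down to roughly 42 GB.
print(estimate_vram_gb(70))                       # 168.0
print(estimate_vram_gb(70, bytes_per_param=0.5))  # 42.0
```

The same arithmetic explains why a 7B model in FP16 fits comfortably on a single 24 GB card while a 70B model needs multiple 80 GB GPUs or aggressive quantization.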
Benchmarking LLMs: A Deep Dive into Local Deployment and Performance Optimization
Community Contribution

I just love the idea of running an LLM locally. It has huge implications for data security and the ability to use AI on private datasets. Get your company’s DevOps teams some real GPU servers as soon as possible. Benchmarking LLM performance has been a blast, and I’ve
04 Jul 2024 5 min read
AMD MI300X vs. Nvidia H100 SXM: Performance Comparison on Mixtral 8x7B Inference

There’s no denying Nvidia's historical dominance when it comes to AI training and inference. Nearly all production AI workloads run on their graphics cards. However, there’s been some optimism recently around AMD, seeing as the MI300X, their intended competitor to Nvidia's H100, is strictly
01 Jul 2024 7 min read
Partnering with Defined AI to bridge the data wealth gap

RunPod is dedicated to democratizing access to AI development and bridging the data wealth gap. Alongside Defined.ai, the world’s largest ethical AI training data marketplace, RunPod launched a pilot program to give startups access to enterprise-grade datasets for training SOTA models. The Genesis of Collaboration To build SOTA
17 Jun 2024 3 min read
Run Larger LLMs on RunPod Serverless Than Ever Before - Llama-3 70B (and beyond!)
Language Models

Up until now, RunPod has only supported using a single GPU in Serverless, with the exception of using two 48GB cards (which honestly didn't help, given the overhead involved in multi-GPU setups for LLMs.) You were effectively limited to what you could fit in 80GB, so you would
06 Jun 2024 3 min read
Introduction to vLLM and PagedAttention

What is vLLM? vLLM is an open-source LLM inference and serving engine that utilizes a novel memory allocation algorithm called PagedAttention. It can run your models with up to 24x higher throughput than HuggingFace Transformers (HF) and up to 3.5x higher throughput than HuggingFace Text Generation Inference (TGI). How
31 May 2024 11 min read
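The full post digs into how PagedAttention achieves those throughput gains. As a toy illustration (not vLLM's actual implementation), the core idea is to allocate the KV cache in fixed-size blocks on demand, instead of reserving one contiguous, worst-case-sized region per sequence:

```python
BLOCK_SIZE = 16  # tokens per KV-cache block; illustrative choice

class ToyBlockAllocator:
    """Minimal sketch of paged KV-cache bookkeeping, for illustration only."""

    def __init__(self, num_blocks: int):
        self.free_blocks = list(range(num_blocks))  # pool of physical block ids
        self.block_tables = {}  # seq_id -> list of physical block ids
        self.lengths = {}       # seq_id -> tokens stored so far

    def append_token(self, seq_id: int) -> None:
        n = self.lengths.get(seq_id, 0)
        if n % BLOCK_SIZE == 0:  # current block is full (or this is the first token)
            self.block_tables.setdefault(seq_id, []).append(self.free_blocks.pop())
        self.lengths[seq_id] = n + 1

    def release(self, seq_id: int) -> None:
        # A finished sequence returns its blocks to the pool immediately,
        # so other requests can reuse them.
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)

alloc = ToyBlockAllocator(num_blocks=64)
for _ in range(20):
    alloc.append_token(seq_id=0)
# 20 tokens occupy ceil(20/16) = 2 blocks; internal waste is
# bounded by one partially filled block per sequence.
print(len(alloc.block_tables[0]))  # 2
```

Because memory is handed out block by block, many sequences of unpredictable length can share one GPU's cache without the large pre-reservations that limit batch size in naive serving.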
Announcing RunPod's New Serverless CPU Feature

We are thrilled to introduce the latest addition to the RunPod platform: Serverless CPU. This feature allows you to create high-performance VM containers with up to 3.75 GHz dedicated cores, DDR5 memory, and NVMe SSD storage. With Serverless CPU, you have the flexibility to choose between Compute-Optimized or General
28 May 2024 2 min read
Enable SSH Password Authentication on a RunPod Pod

When connecting to a RunPod Pod, a common issue is that SSH doesn't work out of the box. In this tutorial, we will examine a method of using a username and password to access a RunPod Pod through SSH. By the end of this guide, you'll
16 May 2024 2 min read
RunPod's $20MM Milestone: Fueling Our Vision, Empowering Our Team
Featured

Exciting news! RunPod has raised $20MM led by Intel Capital and Dell Technologies Capital. This boost will further our mission to revolutionize AI/ML cloud computing.
08 May 2024 4 min read
How Coframe used RunPod Serverless to Scale During their Viral Product Hunt Launch

Coframe uses RunPod Serverless to scale inference from 0 GPUs to hundreds in minutes. With RunPod, Coframe launched their generative UI tool on Product Hunt to thousands of users in a single day without having to worry about their infrastructure failing. In under a week, Coframe was able to deploy
07 May 2024 2 min read
How KRNL AI Scaled to 10,000+ Concurrent Users while Cutting Infrastructure Costs by 65% with RunPod Serverless

When Giacomo, Founder and CPO of KRNL, reached out to RunPod in May 2023, we weren’t actually sure if we could support his use case. They needed a provider that could cost-effectively scale up to handle hundreds of thousands of users, and scale back down to zero in minutes.
07 May 2024 3 min read
Refocusing on Core Strengths: The Shift from Managed AI APIs to Serverless Flexibility
AI Integration

RunPod is transitioning from Managed AI APIs to focusing on Serverless solutions, offering users more control and customization with comprehensive guidance.
29 Apr 2024 2 min read
Configurable Endpoints for Deploying Large Language Models

RunPod introduces Configurable Templates, a powerful feature that allows users to easily deploy and run any large language model. With this feature, users can provide the Hugging Face model name and customize various template parameters to create tailored endpoints for their specific needs. Why Use Configurable Templates? Configurable Templates offer
15 Apr 2024 2 min read
Orchestrating RunPod's Workloads Using dstack

Today, we're announcing the integration between RunPod and dstack, an open-source orchestration engine that aims to simplify the development, training, and deployment of AI models while leveraging the open-source ecosystem. What is dstack? While dstack shares a number of similarities with Kubernetes, it is more lightweight and focuses
12 Apr 2024 2 min read
Revolutionizing Real Estate: Virtual Staging AI's Success Story with RunPod

Virtual Staging AI, an innovative startup from the Harvard Innovation Lab, is transforming the real estate industry by leveraging cutting-edge AI technology and RunPod's powerful GPU infrastructure. Their state-of-the-art solution enables realtors to virtually stage properties in just 30 seconds at a fraction of the cost of traditional
10 Apr 2024 2 min read
Generate Images with Stable Diffusion on RunPod

💡 RunPod is hosting an AI art contest; find out more on our Discord in the #art-contest channel. In this tutorial, you will learn how to generate images using Stable Diffusion, a powerful text-to-image model, on the RunPod platform. By following the step-by-step instructions, you'll set up the prerequisites,
27 Mar 2024 3 min read
Announcing RunPod’s Integration with SkyPilot

RunPod is excited to announce its latest integration with SkyPilot, an open-source framework for running LLMs, AI, and batch jobs on any cloud. This collaboration is designed to significantly enhance the efficiency and cost-effectiveness of your development process, particularly for training, fine-tuning, and deploying models. What is SkyPilot? SkyPilot is
13 Mar 2024 3 min read
Elevating Veterinary Care: A Customer Success Story with ScribbleVet and RunPod
Customer Success

Discover how ScribbleVet transformed veterinary care with RunPod's AI technology, showcasing our commitment to empowering businesses and enhancing service quality.
12 Mar 2024 2 min read
Introducing the A40 GPUs: Revolutionize Machine Learning with Unmatched Efficiency

In the rapidly evolving world of artificial intelligence and machine learning, the need for powerful, cost-effective hardware has never been more critical. The launch of the A40 GPUs marks a significant milestone in this journey, offering unparalleled performance and affordability. These GPUs are designed to cater to the needs of
11 Mar 2024 3 min read
RunPod's Latest Innovation: Dockerless CLI for Streamlined AI Development
Dockerless CLI

Discover the future of AI development with RunPod's Dockerless CLI tool. Experience seamless deployment, enhanced performance, and intuitive design, revolutionizing how you bring AI projects from concept to reality.
02 Feb 2024 4 min read
Embracing New Beginnings: Welcoming Banana.dev Community to RunPod
Runpod Platform

RunPod extends a warm welcome to the Banana.dev community, offering a supportive transition to our platform. Honoring the path paved by Banana.dev, we commit to empowering developers with innovative serverless solutions.
02 Feb 2024 3 min read
Maximizing AI Efficiency on a Budget: The Unbeatable Value of NVIDIA A40 and A6000 GPUs for Fine-Tuning LLMs
Featured

Harnessing Power and Economy in AI Hardware In the dynamic world of AI, the balance between cutting-edge performance and cost-effectiveness is a crucial consideration for those fine-tuning large language models (LLMs). While the allure of NVIDIA's flagship H100 and A100 GPUs is undeniable, the focus of this exploration
01 Feb 2024 3 min read
RunPod's Infrastructure: Powering Real-Time Image Generation and Beyond
Cloud Computing

Discover how RunPod's infrastructure powers real-time AI image generation on our unique 404 page, using SDXL Turbo AI model. A blend of creativity and high-speed tech!
30 Jan 2024 3 min read
A Fresh Chapter in RunPod's Documentation Saga: Embracing Docusaurus for Enhanced User Experience
Documentation Overhaul

Discover RunPod's revamped documentation, now more intuitive and user-friendly. Our recent overhaul with Docusaurus offers a seamless, engaging experience, ensuring easy access to our comprehensive GPU computing resources. Explore at docs.runpod.io
28 Jan 2024 1 min read
New Navigational Changes To RunPod UI
Runpod Platform

Today, we are releasing a brand new look to the RunPod control panel, resulting in saved clicks and faster navigation through the platform. A few key changes will need some attention as you get acclimated. Here's a quick rundown of what has changed! GPU Cloud Secure and Community
17 Jan 2024 2 min read
Page 5 of 9
RunPod Blog © 2025