RunPod Partners With RandomSeed to Provide Accessible, User-Friendly Stable Diffusion API Access
RunPod is delighted to collaborate with RandomSeed by providing the serverless compute power behind their generative AI art API. RandomSeed's mission is to help developers build AI image generators by offering a hosted AUTOMATIC1111 API that creates images on demand through API calls, sparing developers the burden of hosting and managing their own Stable Diffusion infrastructure.
The Benefits of Serverless Pricing
RandomSeed leverages RunPod's serverless capacity to provide a user-friendly way to interact with Stable Diffusion through API calls, with none of the technical expertise normally required to build and maintain such a setup. All you have to do is pass your parameters to RandomSeed in an API request and your image is served wherever it needs to go. The end result is that you only pay for the server time you actually use, which averages out to a cent (or even less) per image. This means your costs scale with your needs, rather than the static per-hour price of renting a pod.
RandomSeed even has a playground that you can use to test out the service and see if it's a good fit for you.
AUTOMATIC1111 API documentation
RandomSeed has graciously contributed the following documentation on how to pass requests to the Stable Diffusion API.
AUTOMATIC1111 has emerged as a leading tool for image generation with Stable Diffusion, boasting a comprehensive API that enables a multitude of functions. However, in-depth documentation explaining what each parameter does is nearly impossible to find. In this article, we share our findings from running our own cloud-based AUTOMATIC1111 API at RandomSeed.
txt2img
prompt: (string, required) | A text string that describes the image you want to generate. |
negative_prompt: (string) | A text description of what you don't want to see in the image. |
steps: (integer, defaults to 50) | The number of denoising steps used by the AI model when generating an image from the text prompt. A higher number results in longer generation time, and does not necessarily guarantee higher image quality. |
seed: (integer, defaults to -1) | Value that seeds the random number generator; -1 means a random seed. Providing the same seed and generation settings yields an almost identical image. |
sampler_name: (string, required) | Name of the sampler method you will be using to generate an image. Image output varies slightly depending on the method. Samplers available include the following:
Euler a, Euler, LMS, Heun, DPM2, DPM2 a, DPM++ 2S a, DPM++ 2M, DPM++ SDE, DPM fast, DPM adaptive, LMS Karras, DPM2 a Karras, DPM++ 2S a Karras, DPM++ 2M Karras, DPM++ SDE Karras, DDIM, PLMS, UniPC |
n_iter: (integer, defaults to 1) | Specifies the number of batches to generate sequentially (iterations). |
batch_size: (integer, defaults to 1) | Specifies the number of images generated in parallel within each batch. You can combine this with n_iter for bulk generation: n_iter = 10 and batch_size = 10 generates 100 images in 10 iterations. |
width: (integer, defaults to 512) | Width of the image you want to generate; should be a multiple of 8. |
height: (integer, defaults to 512) | Height of the image you want to generate; should be a multiple of 8. |
cfg_scale: (float, defaults to 7.0) | Governs how closely the image follows the prompt; 7 or 7.5 usually works best. You can increase the value if the generated image doesn't match your prompt. Low values (0-6 or so) tell SD it can ignore your prompt; mid-range values (6-10 or so) ask it to follow the prompt while allowing a bit of variation; high values (10+) often result in a really bad image in our experience. |
restore_faces: (boolean) | Runs face restoration on the output when set to true, which can improve faces but may also make them look slightly different. |
enable_hr: (boolean, defaults to false) | Set this to true if you want to do highres upscaling. Be careful: it can cause the AUTOMATIC1111 process to be killed if the host machine does not have enough VRAM. |
hr_upscaler: (string) | Method used for highres upscaling. Some of the popular methods: Latent, Latent (antialiased), Latent (bicubic), Latent (bicubic antialiased), Latent (nearest), Latent (nearest-exact), None, Lanczos, Nearest, ESRGAN_4x, R-ESRGAN 4x+, R-ESRGAN 4x+ Anime6B |
denoising_strength: (float, defaults to 0.75) | Controls how much the image can change during highres upscaling. A value between 0.5 and 1 is optimal for maintaining image consistency. |
alwayson_scripts: (object) | You can specify various extension scripts here. For instance, you can use the ControlNet extension script here to generate different kinds of images. Make sure the ControlNet extension is cloned into the stable-diffusion-webui/extensions directory of your AUTOMATIC1111 installation. The shape of the controlnet object is shown in the sketch after the mapping table below. |
ControlNet Model to Module Mapping
model | module |
control_v11p_sd15_canny | canny |
control_v11p_sd15_mlsd | mlsd
control_v11p_sd15_depth | depth, depth_leres, depth_leres_boost |
control_v11p_sd15_normalbae | normal_bae
control_v11p_sd15_seg | segmentation |
control_v11p_sd15_inpaint | inpaint, inpaint_only, inpaint_only+lama |
control_v11p_sd15_lineart | lineart, lineart_coarse, lineart_standard |
control_v11p_sd15s2_lineart_anime | lineart_anime |
control_v11p_sd15_openpose | openpose, openpose_face, openpose_faceonly, openpose_full, openpose_hand |
control_v11p_sd15_scribble | scribble_pidinet, scribble_hed, scribble_xdog, fake_scribble |
control_v11p_sd15_softedge | softedge_pidinet, softedge_hed
control_v11p_sd15_tile | tile_resample, tile_colorfix, tile_colorfix+sharp |
control_v11e_sd15_shuffle | shuffle |
control_v11e_sd15_ip2p (aka Instruct Pix2Pix) | null
control_v1p_sd15_qrcode_monster | null
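Based on the alwayson_scripts parameter described above, here is a minimal sketch of how a model/module pair from this table plugs into a txt2img payload. The canny row is used as the assumed example, and the exact argument schema may vary with your ControlNet extension version:

{
"prompt": "cute cat",
"steps": 25,
"alwayson_scripts": {
  "controlnet": {
    "args": [
      {
        "input_image": "<base64-encoded control image>",
        "module": "canny",
        "model": "control_v11p_sd15_canny"
      }
    ]
  }
}
}

Each entry in args is one ControlNet unit; pick the module that matches the model according to the table above.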
img2img
prompt: (string, required) | A text string that describes the image you want to generate. |
negative_prompt: (string, defaults to empty string) | A text description of what you don't want to see in the image. |
steps: (integer, defaults to 50) | The number of denoising steps used by the AI model when generating an image from the text prompt. A higher number results in longer generation time, and does not necessarily guarantee higher image quality. |
sampler_name: (string, required) | Name of the sampler method you will be using to generate an image. Image output varies slightly depending on the method. Samplers available include the following: Euler a, Euler, LMS, Heun, DPM2, DPM2 a, DPM++ 2S a, DPM++ 2M, DPM++ SDE, DPM fast, DPM adaptive, LMS Karras, DPM2 a Karras, DPM++ 2S a Karras, DPM++ 2M Karras, DPM++ SDE Karras, DDIM, PLMS, UniPC |
n_iter: (integer, defaults to 1) | Specifies the number of batches to generate sequentially (iterations). |
batch_size: (integer, defaults to 1) | Specifies the number of images generated in parallel within each batch. You can combine this with n_iter for bulk generation: n_iter = 10 and batch_size = 10 generates 100 images in 10 iterations. |
width (integer, defaults to 512) | Width of the image you want to generate. |
height (integer, defaults to 512) | Height of the image that you want to generate. |
cfg_scale: (float, defaults to 7.0) | Governs how closely the image follows the prompt; 7 or 7.5 usually works best. You can increase the value if the generated image doesn't match your prompt. Low values (0-6 or so) tell SD it can ignore your prompt; mid-range values (6-10 or so) ask it to follow the prompt while allowing a bit of variation; high values (10+) often result in a really bad image in our experience. |
include_init_images (boolean) | Set to true if you plan to include starting image(s) in the payload, or if you want to do inpainting on the image. |
denoising_strength: (float, defaults to 0.75) | Determines how much the generated image can deviate from the provided image; values closer to 0 stay closer to the original. |
init_images: (string[]) | List of base64-encoded images that will be used as the starting image(s) for generation. |
mask: (string, defaults to null) | Base64-encoded image that will be used as the inpainting mask; it specifies the region of the starting image that you want modified. |
inpainting_mask_invert: (integer, defaults to 0) | Determines whether to inpaint inside or outside the masked region. 0 inpaints the masked region; 1 inpaints the region outside the mask. This parameter is required for inpainting. |
inpainting_fill: (integer, defaults to 0) | Determines how the masked content is initialized before inpainting. It can be 0, 1, 2, or 3: 0 -> "fill", 1 -> "original", 2 -> "latent noise", 3 -> "latent nothing". This parameter is required for inpainting. |
resize_mode: (integer, defaults to 0) | Determines how the starting image is resized to the target width and height. It can be 0, 1, 2, or 3: 0 -> "just resize", 1 -> "crop and resize", 2 -> "resize and fill", 3 -> "just resize (latent upscale)". This parameter is required for inpainting. |
inpaint_full_res_padding: (integer, defaults to 0) | Padding, in pixels, around the masked region when inpainting at full resolution. This parameter is required for inpainting. |
inpaint_full_res: (boolean, defaults to true) | Controls the inpaint area. If set to true, only the masked region (plus padding) is inpainted at full resolution and pasted back into the image. If set to false, the whole picture is processed at the target resolution. This parameter is required for inpainting. |
image_cfg_scale: (float) | Determines how much the result resembles your starting image; a lower value means a stronger effect, the opposite of how cfg_scale behaves. Primarily used with Instruct Pix2Pix models. |
Example Calls
We're going to show you how you can use the API to do some basic image generations. Make sure that you have cloned the AUTOMATIC1111 repo from https://github.com/AUTOMATIC1111/stable-diffusion-webui and have it running locally on your PC with the API enabled.
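The /sdapi routes used below are only served when the web UI is launched with the --api flag. A minimal sketch for a Linux/macOS shell (on Windows, add --api to COMMANDLINE_ARGS in webui-user.bat instead):

# clone the web UI and launch it with the HTTP API enabled
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
./webui.sh --api   # serves on http://127.0.0.1:7860 by default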
txt2img generation:
curl http://127.0.0.1:7860/sdapi/v1/txt2img \
-H "Content-Type: application/json" \
-d '{
"prompt": "Cute cat",
"negative_prompt": "ugly, out of frame",
"width": 1024,
"height": 1024,
"cfg_scale": 7,
"sampler_name": "Euler a",
"steps": 25,
"n_iter": 1,
"seed": 3315716384
}'
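The response is a JSON object whose images field is an array of base64-encoded strings. A minimal sketch for saving the first image to disk, assuming jq and GNU base64 are installed (use base64 -D on macOS):

# request an image and decode the first result to cat.png
curl -s http://127.0.0.1:7860/sdapi/v1/txt2img \
-H "Content-Type: application/json" \
-d '{"prompt": "Cute cat", "steps": 25}' \
| jq -r '.images[0]' | base64 -d > cat.png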
img2img generation (using the base64-encoded image from the txt2img call above as init_images):
curl http://127.0.0.1:7860/sdapi/v1/img2img \
-H "Content-Type: application/json" \
-d '{
"prompt": "cute dog",
"negative_prompt": "ugly, out of frame",
"width": 1024,
"height": 1024,
"cfg_scale": 7,
"sampler_name": "Euler a",
"steps": 25,
"n_iter": 1,
"seed": -1,
"include_init_images": true,
"init_images": ["...."],
"denoising_strength": 0.6
}'
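If your starting image lives on disk, you can build the init_images entry by base64-encoding the file. A sketch assuming dog.png exists and GNU base64 and jq are available (on macOS, use base64 -i dog.png in place of base64 -w 0 dog.png):

# encode a local file and inject it into the payload with jq
IMG=$(base64 -w 0 dog.png)
curl http://127.0.0.1:7860/sdapi/v1/img2img \
-H "Content-Type: application/json" \
-d "$(jq -n --arg img "$IMG" \
'{prompt: "cute dog", include_init_images: true, init_images: [$img], denoising_strength: 0.6}')"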
Inpainting example:
We're going to inpaint over the dog's ears for this example, using the base64-encoded image from the img2img call above as init_images and a base64-encoded mask that covers the dog's ears.
curl http://127.0.0.1:7860/sdapi/v1/img2img \
-H "Content-Type: application/json" \
-d '{
"prompt": "cute dog, (smaller ears:1.4)",
"negative_prompt": "ugly, out of frame",
"width": 1024,
"height": 1024,
"cfg_scale": 7,
"sampler_name": "Euler a",
"steps": 25,
"n_iter": 1,
"seed": -1,
"include_init_images": true,
"init_images": ["...."],
"mask": "...",
"denoising_strength": 0.6,
"image_cfg_scale": 0.7,
"inpainting_fill": 1,
"inpaint_full_res": false,
"inpainting_mask_invert": 0,
"inpaint_full_res_padding": 32
}'
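If you need to build the mask itself, one option is ImageMagick; with inpainting_mask_invert set to 0, the white areas of the mask are the ones that get repainted. A sketch (the rectangle coordinates are hypothetical stand-ins for the dog's ears; on ImageMagick 7, use magick instead of convert):

# white rectangle on a black canvas = region to inpaint
convert -size 1024x1024 xc:black -fill white \
-draw "rectangle 380,120 640,320" mask.png
MASK=$(base64 -w 0 mask.png)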
QR Monster ControlNet (Illusion Diffusion)
QR Monster ControlNet is taking the internet by storm. If you want to generate these illusion-style images, you can make a request like the one below, passing the ControlNet unit through alwayson_scripts as described earlier (for input_image, you can base64-encode this pattern: https://i.imgur.com/9NRTUvK.png):
curl http://127.0.0.1:7860/sdapi/v1/img2img \
-H "Content-Type: application/json" \
-d '{
"prompt": "cute dog",
"negative_prompt": "ugly, out of frame",
"width": 1024,
"height": 1024,
"cfg_scale": 7,
"sampler_name": "Euler a",
"steps": 25,
"n_iter": 1,
"seed": -1,
"include_init_images": true,
"init_images": ["...."],
"denoising_strength": 0.7,
"alwayson_scripts": {
  "controlnet": {
    "args": [
      {
        "input_image": "...",
        "module": null,
        "model": "control_v1p_sd15_qrcode_monster"
      }
    ]
  }
}
}'
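To produce the input_image string from the linked pattern, one option is to download and encode it in a single step; a sketch assuming GNU base64 (on macOS, drop -w 0 and pipe through tr -d '\n'):

# fetch the control pattern and base64-encode it for input_image
PATTERN=$(curl -s https://i.imgur.com/9NRTUvK.png | base64 -w 0)

You can then substitute $PATTERN into the payload, for example with jq as in the img2img sketch above.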
Conclusion
We hope this documentation gives you a good starting point for generating images with the AUTOMATIC1111 API. As you work with the API, don't be afraid to tweak the settings and observe their impact. Refer to the documentation and examples as needed. With some practice, you'll be leveraging AUTOMATIC1111 to create amazing AI artwork. Let us know if you make something cool!