Tost AI Prototypes

26 Services Available

MaterialMVP

3D

MaterialMVP takes a 3D mesh (.glb) and a reference image, applies multi-view PBR diffusion to generate consistent material maps (albedo, normal, roughness, metallic), and outputs an updated GLB with embedded PBR materials for real-time rendering.

Parameters: 3 configurable options

Links: Code, Project Page

ReconViaGen

3D

ReconViaGen improves multi-view 3D reconstruction by combining generative priors with reconstruction guidance. Unlike prior methods that suffer from holes, noise, and inconsistency, it ensures more complete, accurate, and view-consistent 3D models.

Parameters: 11 configurable options

Links: Code, Project Page

Puppeteer

3D

The Puppeteer API processes a 3D mesh file and performs automated character rigging, returning a fully rigged mesh ready for animation along with a generated thumbnail.

Parameters: 3 configurable options

Links: Code, Project Page, Research Paper

Audiowaveform

AUDIO

The Audio Waveform Processing API allows you to generate visual waveforms from audio files.

Parameters: 3 configurable options
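As a rough illustration of what generating a visual waveform involves, the sketch below reduces a list of audio samples to one (min, max) pair per horizontal bucket, which is a common input for drawing a waveform. The function name `waveform_peaks` and the bucketing scheme are assumptions made for this example; they are not taken from the service.

```python
from typing import List, Sequence, Tuple


def waveform_peaks(samples: Sequence[float], buckets: int) -> List[Tuple[float, float]]:
    """Reduce samples to (min, max) pairs, one pair per bucket (illustrative only)."""
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    if not samples:
        return []
    # Never create more buckets than samples, so every bucket is non-empty.
    buckets = min(buckets, len(samples))
    peaks = []
    for i in range(buckets):
        # Integer arithmetic keeps bucket boundaries contiguous.
        start = i * len(samples) // buckets
        end = (i + 1) * len(samples) // buckets
        chunk = samples[start:end]
        peaks.append((min(chunk), max(chunk)))
    return peaks
```

Each pair can then be drawn as a vertical line from the minimum to the maximum value at that horizontal position.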

Wan2.2-FLF-to-Video

VIDEO

This API generates videos by transforming a start image into an end image based on user prompts. Users can customize resolution, frame rate, steps, and other settings. Results are sent to a webhook with a download link and execution details.

Parameters: 21 configurable options

Links: Code, Project Page, Research Paper
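The description above says results arrive at a webhook with a download link and execution details. A receiving endpoint might validate the body along the following lines. This is only a sketch: the payload schema is not documented here, so the keys `download_url` and `execution` are hypothetical placeholders, as is the function name.

```python
import json
from typing import Any, Dict


def parse_webhook_result(body: bytes) -> Dict[str, Any]:
    """Pull the download link and execution details out of a webhook body.

    The keys "download_url" and "execution" are hypothetical; inspect a real
    payload from the service before relying on them.
    """
    payload = json.loads(body.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    url = payload.get("download_url")  # hypothetical key
    if not isinstance(url, str) or not url.startswith(("http://", "https://")):
        raise ValueError("missing or invalid download link")
    # Execution details are passed through untouched; default to an empty dict.
    return {"download_url": url, "execution": payload.get("execution", {})}
```

Validating the link before downloading keeps a malformed or unexpected callback from being treated as a finished job.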

Wan2.2-Orbit

VIDEO

This API enables orbit-style video generation from a single input image. It applies smooth 360° rotation around the subject.

Parameters: 20 configurable options

Links: Code, Project Page, Research Paper

GLTF to FBX

CONVERSION

The GLTF-to-FBX Converter is a service that converts 3D models from the GLTF/GLB format to FBX format.

Parameters: 2 configurable options
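Before submitting a file for conversion, a client could check that it really is a binary glTF. The sketch below reads the 12-byte GLB header defined by the glTF 2.0 specification (ASCII magic `glTF`, then a little-endian version and total length). The function name is an assumption for this example, and text-based `.gltf` files, which are JSON, are not covered.

```python
import struct
from typing import Tuple

GLB_MAGIC = b"glTF"      # first four bytes of every GLB file
GLB_HEADER_SIZE = 12     # magic + uint32 version + uint32 total length


def read_glb_header(data: bytes) -> Tuple[int, int]:
    """Return (version, declared_length) from a binary glTF (.glb) header."""
    if len(data) < GLB_HEADER_SIZE:
        raise ValueError("too short to be a GLB file")
    # "<4sII": little-endian, 4 raw bytes followed by two unsigned 32-bit ints.
    magic, version, length = struct.unpack("<4sII", data[:GLB_HEADER_SIZE])
    if magic != GLB_MAGIC:
        raise ValueError("not a GLB file (bad magic)")
    return version, length
```

Comparing the declared length with the actual file size is a cheap way to catch truncated uploads.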

Image to Normal

IMAGE

The Image-to-Normal process converts a standard image into a normal map representation. Normal maps encode surface orientation information in RGB colors, which can be used in 3D rendering to simulate detailed surface textures without increasing polygon count.

Parameters: 9 configurable options

Links: Code, Project Page, Research Paper
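To make the RGB encoding mentioned above concrete, the sketch below maps a single surface normal to an 8-bit colour using the common convention that each component in [-1, 1] is rescaled to [0, 255]. This illustrates the encoding only, not how the service estimates normals; the function name is an assumption, and some pipelines flip the green channel.

```python
import math
from typing import Tuple


def normal_to_rgb(nx: float, ny: float, nz: float) -> Tuple[int, int, int]:
    """Encode a surface normal as 8-bit RGB (x -> R, y -> G, z -> B)."""
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("zero-length normal")
    # Normalise, then shift each component from [-1, 1] into [0, 255].
    r, g, b = (round((c / length * 0.5 + 0.5) * 255) for c in (nx, ny, nz))
    return r, g, b
```

A normal pointing straight out of the surface, `normal_to_rgb(0, 0, 1)`, gives `(128, 128, 255)`, which is why flat regions of a normal map look light blue.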

Seamless

IMAGE

The Seamless Texture Generator is a service that creates tileable, high-resolution textures based on user-defined prompts. These textures can be used in 3D modeling, digital art, game development, textile design, and other creative applications. The generator uses AI to ensure the resulting texture can repeat seamlessly in all directions.

Parameters: 13 configurable options

Links: Code, Project Page, Research Paper
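One simple way to sanity-check that a texture tiles is to measure the jump between the pixels that become neighbours when it repeats. The sketch below does this for a grayscale texture stored as rows of floats. The function name and the metric are assumptions for illustration, not part of the service.

```python
from typing import List

Texture = List[List[float]]  # rows of grayscale values


def seam_error(tex: Texture) -> float:
    """Mean absolute jump across the wrap-around edges of a texture.

    Compares the last column with the first and the last row with the first,
    i.e. the pixels that touch when the texture is tiled.
    """
    height, width = len(tex), len(tex[0])
    horizontal = sum(abs(row[-1] - row[0]) for row in tex) / height
    vertical = sum(abs(tex[-1][x] - tex[0][x]) for x in range(width)) / width
    return (horizontal + vertical) / 2
```

A value close to the typical difference between adjacent interior pixels suggests no visible seam; a much larger value suggests a hard edge where tiles meet.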

Inpainting

IMAGE

Inpainting lets you modify or fill specific areas of an image, remove objects, restore damage, or blend new elements seamlessly.

Parameters: 12 configurable options

Links: Code, Project Page
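Conceptually, inpainting keeps the original pixels outside a mask and uses newly generated pixels inside it. The sketch below shows that final blending step on grayscale images stored as rows of floats. It is a simplification under stated assumptions: the function name is hypothetical, and the actual service may blend differently, for example inside the model rather than on pixels.

```python
from typing import List

Image = List[List[float]]  # rows of grayscale values


def composite(original: Image, generated: Image, mask: Image) -> Image:
    """Blend per pixel: mask 1.0 takes the generated value, 0.0 keeps the original."""
    return [
        [o * (1.0 - m) + g * m for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]
```

Fractional mask values give a soft transition at the edge of the edited region, which helps new elements blend with their surroundings.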

Sketch-to-Image

IMAGE

The Sketch-to-Image API turns images into stylized outputs using JuggernautXL, SDXL Lightning 4step LoRA, and ControlNet Union SDXL 1.0. Customize prompts, guidance, and strength, with results sent asynchronously via webhook.

Parameters: 12 configurable options

Links: Code, Project Page, Research Paper

ACE-Step

AUDIO

The ACE-Step Audio Generation API creates high-quality audio from text prompts, supporting both instrumental and lyric-based tracks. It offers adjustable parameters for style, tone, and structure, and delivers results asynchronously via a webhook.

Parameters: 10 configurable options

Links: Code, Project Page, Research Paper

UniRig

3D

The UniRig API processes a 3D mesh file (.glb) and performs automated character rigging, returning a fully rigged mesh ready for animation. This system leverages advanced neural rigging techniques to handle a wide variety of meshes with minimal user input.

Parameters: 3 configurable options

Links: Code, Project Page, Research Paper

PartCrafter

3D

PartCrafter generates 3D models from input images using advanced neural network techniques. It can create detailed 3D representations with configurable parameters for mesh generation, inference steps, and output quality.

Parameters: 13 configurable options

Links: Code, Project Page, Research Paper

Hunyuan3D-2.1

3D

Hunyuan3D-2.1 is an advanced generative 3D modeling system developed by Tencent, capable of producing high-fidelity 3D models from input images. It integrates sophisticated neural rendering and geometry synthesis techniques for realistic, production-ready assets with customizable thumbnail generation.

Parameters: 15 configurable options

Links: Code, Project Page, Research Paper

Panorama

IMAGE

Panorama generates immersive 360° panoramic environments and cubemap textures from input images. It creates high-quality panoramic views with customizable parameters for resolution, guidance, and processing options.

Parameters: 10 configurable options

Links: Code, Project Page, Research Paper

Lucy Edit Dev

VIDEO

Lucy Edit is a video editing model that performs instruction-guided edits on videos using free-text prompts. It supports a variety of edits, such as clothing and accessory changes, character changes, object insertions, and scene replacements, while preserving the motion and composition of the source video.

Parameters: 9 configurable options

Qwen-Image-Edit

IMAGE

Qwen-Image-Edit-2509 introduces multi-image editing support for combinations such as person+person, person+product, and person+scene. It also improves single-image editing consistency, enhancing person, product, and text edits while preserving identity and style. With native ControlNet support for depth maps, edge maps, keypoints, and more, the model enables precise and flexible image editing.

Parameters: 20 configurable options

Links: Code, Project Page, Research Paper

FLUX.1 [dev]

IMAGE

FLUX.1 [dev] is a 12-billion-parameter rectified flow transformer capable of generating images from text descriptions.

Parameters: 15 configurable options

Links: Code, Project Page

FLUX.1 Wireframe [dev]

IMAGE

FLUX.1 Wireframe [dev] LoRA is a LoRA for FLUX.1 [dev], capable of generating an image based on a text description while following the structure of the given wireframe image.

Parameters: 9 configurable options

Links: Code, Project Page

EVF-SAM2

IMAGE

EVF-SAM2 (Early Vision-Language Fusion for Text-Prompted Segment Anything Model) segments the regions of an image described by a text prompt.

Parameters: 4 configurable options

Links: Code, Project Page, Research Paper

Qwen3 Omni 30B A3B Instruct

LLM

Qwen3-Omni is a natively end-to-end, multilingual, omni-modal foundation model. It is designed to process diverse inputs including text, images, audio, and video, while delivering real-time streaming responses in both text and natural speech.

Parameters: 15 configurable options

Links: Code, Project Page, Research Paper

MoGe-2

3D

MoGe-2: Accurate Monocular Geometry with Metric Scale and Sharp Details

Parameters: 6 configurable options

Links: Code, Project Page, Research Paper

Qwen-Image-Consistence

IMAGE

Qwen-Image is an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing.

Parameters: 21 configurable options

Links: Code, Project Page, Research Paper

Index-TTS2

AUDIO

IndexTTS2: A Breakthrough in Emotionally Expressive and Duration-Controlled Auto-Regressive Zero-Shot Text-to-Speech.

Parameters: 24 configurable options

Links: Code, Project Page, Research Paper

Qwen-Image-Lora

IMAGE

Qwen-Image is an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing.

Parameters: 22 configurable options

Links: Code, Project Page, Research Paper

Services by category:

Image: 11
Video: 3
3D: 7
Audio: 3
Conversion: 1
LLM: 1