Tost AI Prototypes

26 Services Available

MaterialMVP
3D
MaterialMVP takes a 3D mesh (.glb) and a reference image, applies multi-view PBR diffusion to generate consistent material maps (albedo, normal, roughness, metallic), and outputs an updated GLB with embedded PBR materials for real-time rendering.

Parameters:

3 configurable options

ReconViaGen
3D
ReconViaGen improves multi-view 3D reconstruction by combining generative priors with reconstruction guidance. Unlike prior methods that suffer from holes, noise, and inconsistency, it ensures more complete, accurate, and view-consistent 3D models.

Parameters:

11 configurable options

Puppeteer
3D
The Puppeteer API processes a 3D mesh file and performs automated character rigging, returning a fully rigged mesh ready for animation along with a generated thumbnail.

Parameters:

3 configurable options

Audiowaveform
AUDIO
The Audio Waveform Processing API allows you to generate visual waveforms from audio files.

Parameters:

3 configurable options
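
As a rough illustration of what generating a visual waveform from audio involves, the sketch below reduces a list of audio samples to one (min, max) pair per horizontal pixel, which is a common way to draw waveforms. It is not taken from the Audiowaveform service; the function name `waveform_peaks` and its parameters are hypothetical.

```python
def waveform_peaks(samples, buckets):
    """Reduce raw samples to one (min, max) pair per bucket.

    Hypothetical helper: each bucket corresponds to one horizontal pixel
    of the rendered waveform. Not the service's actual implementation.
    """
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    n = len(samples)
    peaks = []
    for i in range(buckets):
        # Integer arithmetic spreads the samples evenly across buckets.
        start = i * n // buckets
        end = (i + 1) * n // buckets
        chunk = samples[start:end]
        if chunk:
            peaks.append((min(chunk), max(chunk)))
        else:
            # More buckets than samples: draw a flat line for this pixel.
            peaks.append((0, 0))
    return peaks


if __name__ == "__main__":
    print(waveform_peaks([0, 3, -2, 5, 1, -4], 3))
```

A renderer would then draw a vertical line from min to max at each pixel column.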

Wan2.2-FLF-to-Video
VIDEO
This API generates videos by transforming a start image into an end image based on user prompts. Users can customize resolution, frame rate, steps, and other settings. Results are sent to a webhook with a download link and execution details.

Parameters:

21 configurable options

Wan2.2-Orbit
VIDEO
This API enables orbit-style video generation from a single input image. It applies smooth 360° rotation around the subject.

Parameters:

20 configurable options

GLTF to FBX
CONVERSION
The GLTF-to-FBX Converter is a service that converts 3D models from the GLTF/GLB format to FBX format.

Parameters:

2 configurable options
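
Before sending a file to a converter like this one, it can help to confirm that it really is a binary glTF (GLB) file. The sketch below reads the 12-byte GLB header defined by the glTF 2.0 specification (magic `glTF`, version, total length). The function name `parse_glb_header` is hypothetical and is not part of the service.

```python
import struct


def parse_glb_header(data: bytes):
    """Return (version, total_length) from the first 12 bytes of a GLB file.

    Hypothetical pre-flight check, independent of the converter service.
    """
    if len(data) < 12:
        raise ValueError("GLB header requires at least 12 bytes")
    # Header layout: 4-byte magic, uint32 version, uint32 length (little-endian).
    magic, version, length = struct.unpack("<4sII", data[:12])
    if magic != b"glTF":
        raise ValueError("not a GLB file: bad magic")
    return version, length


if __name__ == "__main__":
    sample = b"glTF" + struct.pack("<II", 2, 12)
    print(parse_glb_header(sample))
```

Text-based `.gltf` files are JSON and do not carry this header, so this check applies only to `.glb` inputs.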

Image to Normal
IMAGE
The Image-to-Normal process converts a standard image into a normal map representation. Normal maps encode surface orientation information in RGB colors, which can be used in 3D rendering to simulate detailed surface textures without increasing polygon count.

Parameters:

9 configurable options
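
To make the RGB encoding of surface orientation concrete, the sketch below maps a unit normal vector to an 8-bit color using the widely used convention of remapping each component from [-1, 1] to [0, 255]. This is an assumption about the convention, not a description of this service's output format; the function names are hypothetical.

```python
import math


def normal_to_rgb(nx, ny, nz):
    """Encode a surface normal as an 8-bit RGB triple ([-1, 1] -> [0, 255])."""
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0:
        raise ValueError("zero-length normal")
    # Normalize first so every component lies in [-1, 1].
    comps = (nx / length, ny / length, nz / length)
    return tuple(int(round((c * 0.5 + 0.5) * 255)) for c in comps)


def rgb_to_normal(r, g, b):
    """Decode an 8-bit RGB triple back to an approximate unit normal."""
    comps = [c / 255 * 2 - 1 for c in (r, g, b)]
    length = math.sqrt(sum(c * c for c in comps))
    return tuple(c / length for c in comps)


if __name__ == "__main__":
    # A normal pointing straight out of the surface.
    print(normal_to_rgb(0, 0, 1))
```

Under this convention a flat surface facing the viewer encodes to a light blue, which is why tangent-space normal maps tend to look predominantly blue.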

Seamless
IMAGE
The Seamless Texture Generator is a service that creates tileable, high-resolution textures based on user-defined prompts. These textures can be used in 3D modeling, digital art, game development, textile design, and other creative applications. The generator uses AI to ensure the resulting texture can repeat seamlessly in all directions.

Parameters:

13 configurable options

Inpainting
IMAGE
Inpainting lets you modify or fill specific areas of an image, remove objects, restore damage, or blend new elements seamlessly.

Parameters:

12 configurable options

Sketch-to-Image
IMAGE
The Sketch-to-Image API turns images into stylized outputs using JuggernautXL, SDXL Lightning 4step LoRA, and ControlNet Union SDXL 1.0. Customize prompts, guidance, and strength, with results sent asynchronously via webhook.

Parameters:

12 configurable options

ACE-Step
AUDIO
The Ace-Step Audio Generation API creates high-quality audio from text prompts, supporting both instrumental and lyric-based tracks. It offers adjustable parameters for style, tone, and structure, and delivers results asynchronously via a webhook.

Parameters:

10 configurable options

UniRig
3D
The UniRig API processes a 3D mesh file (.glb) and performs automated character rigging, returning a fully rigged mesh ready for animation. This system leverages advanced neural rigging techniques to handle a wide variety of meshes with minimal user input.

Parameters:

3 configurable options

PartCrafter
3D
PartCrafter generates 3D models from input images using advanced neural network techniques. It can create detailed 3D representations with configurable parameters for mesh generation, inference steps, and output quality.

Parameters:

13 configurable options

Hunyuan3D-2.1
3D
Hunyuan3D-2.1 is an advanced generative 3D modeling system developed by Tencent, capable of producing high-fidelity 3D models from input images. It integrates sophisticated neural rendering and geometry synthesis techniques for realistic, production-ready assets with customizable thumbnail generation.

Parameters:

15 configurable options

Panorama
IMAGE
Panorama generates immersive 360° panoramic environments and cubemap textures from input images. It creates high-quality panoramic views with customizable parameters for resolution, guidance, and processing options.

Parameters:

10 configurable options

Lucy Edit Dev
VIDEO
Lucy Edit is a video editing model that performs instruction-guided edits on videos using free-text prompts. It supports a variety of edits, such as clothing and accessory changes, character changes, object insertions, and scene replacements, while preserving the original motion and composition.

Parameters:

9 configurable options

Qwen-Image-Edit
IMAGE
Qwen-Image-Edit-2509 introduces multi-image editing support for combinations such as person+person, person+product, and person+scene. It also improves single-image editing consistency, enhancing person, product, and text edits while preserving identity and style. With native ControlNet support for depth maps, edge maps, keypoints, and more, the model enables precise and flexible image editing.

Parameters:

20 configurable options

FLUX.1 [dev]
IMAGE
FLUX.1 [dev] is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions.

Parameters:

15 configurable options

FLUX.1 Wireframe [dev]
IMAGE
FLUX.1 Wireframe [dev] LoRA is a LoRA for FLUX.1 [dev], capable of generating an image based on a text description while following the structure of the given wireframe image.

Parameters:

9 configurable options

EVF-SAM2
IMAGE
EVF-SAM2 (Early Vision-Language Fusion for Text-Prompted Segment Anything Model) segments the image region described by a text prompt.

Parameters:

4 configurable options

Qwen3 Omni 30B A3B Instruct
LLM
Qwen3-Omni is a family of natively end-to-end multilingual omni-modal foundation models. It is designed to process diverse inputs including text, images, audio, and video, while delivering real-time streaming responses in both text and natural speech.

Parameters:

15 configurable options

MoGe-2
3D
MoGe-2: Accurate Monocular Geometry with Metric Scale and Sharp Details

Parameters:

6 configurable options

Qwen-Image-Consistence
IMAGE
Qwen-Image is an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing.

Parameters:

21 configurable options

Index-TTS2
AUDIO
IndexTTS2: A Breakthrough in Emotionally Expressive and Duration-Controlled Auto-Regressive Zero-Shot Text-to-Speech.

Parameters:

24 configurable options

Qwen-Image-Lora
IMAGE
Qwen-Image is an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing.

Parameters:

22 configurable options

Image: 11
Video: 3
3D: 7
Audio: 3
Conversion: 1
LLM: 1