Seedance 2.0 Multimodal Video Support


1. Seedance 2.0 Model Access

This model is currently available for Enterprise users only.

Seedance 2.0 multimodal reference-to-video accepts reference images (0–9), reference videos (0–3), reference audio clips (0–3), and an optional text prompt.

2. Prompt Guidelines

Prompts can be written in Chinese or English. For best results, keep Chinese prompts under 500 characters and English prompts under 1,000 words. Overly long prompts can dilute key information, causing the model to miss details or follow only part of the instructions.
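The length guideline above can be checked before a prompt is submitted. The following is a minimal sketch, not part of any Seedance tooling: the function name is hypothetical, and it assumes that a prompt containing any Chinese character is counted by characters while all other prompts are counted by whitespace-separated words.

```python
import re

MAX_CHINESE_CHARS = 500   # recommended limit from the guideline above
MAX_ENGLISH_WORDS = 1000  # recommended limit from the guideline above

# Basic CJK Unified Ideographs block; a simplification for this sketch.
_CJK = re.compile(r"[\u4e00-\u9fff]")


def prompt_within_guideline(prompt: str) -> bool:
    """Return True if the prompt stays under the recommended length."""
    if _CJK.search(prompt):
        # Assumption: treat the prompt as Chinese and count every
        # non-whitespace character.
        length = sum(1 for ch in prompt if not ch.isspace())
        return length < MAX_CHINESE_CHARS
    # Otherwise count whitespace-separated words.
    return len(prompt.split()) < MAX_ENGLISH_WORDS
```

Because the limits are recommendations rather than hard caps, a caller might log a warning instead of rejecting the prompt.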

3. Image Input Requirements

Supported formats: JPEG, PNG, WEBP, BMP, TIFF, and GIF.

Aspect ratio: 0.4–2.5.

Width and height: 300–6,000 px.

File size: each image must be under 30 MB. The total request body must not exceed 64 MB. Avoid Base64 encoding for large files, because it increases the payload size by roughly one third.

Image count: 1 image for first-frame image-to-video, 2 images for first-and-last-frame image-to-video, and 1–9 images for Seedance 2.0 multimodal reference-to-video.
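The per-image limits above can be pre-checked locally. The sketch below works on metadata that has already been read by some other means. The ImageInfo class and image_problems function are hypothetical names, and 1 MB is assumed to mean 2^20 bytes because the requirements do not define it. Aspect ratio is taken as width divided by height; since 0.4 and 2.5 are reciprocals, the opposite convention gives the same result.

```python
from dataclasses import dataclass

MB = 1024 * 1024  # assumption: MB is not defined in the requirements
IMAGE_FORMATS = {"JPEG", "PNG", "WEBP", "BMP", "TIFF", "GIF"}


@dataclass
class ImageInfo:
    """Hypothetical container for metadata read elsewhere."""
    fmt: str
    width: int
    height: int
    size_bytes: int


def image_problems(img: ImageInfo) -> list[str]:
    """Return a list of violated requirements (empty if none)."""
    problems = []
    if img.fmt.upper() not in IMAGE_FORMATS:
        problems.append(f"unsupported format: {img.fmt}")
    if not (300 <= img.width <= 6000 and 300 <= img.height <= 6000):
        problems.append("width or height outside 300-6000 px")
    if img.height > 0 and not 0.4 <= img.width / img.height <= 2.5:
        problems.append("aspect ratio outside 0.4-2.5")
    if img.size_bytes >= 30 * MB:
        problems.append("file is not under 30 MB")
    return problems
```

For example, image_problems(ImageInfo("png", 1280, 720, 2 * MB)) returns an empty list.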

4. Video Input Requirements

Supported formats: MP4 and MOV.

Resolution: 480p or 720p.

Duration: each video must be 2–15 seconds. Up to 3 reference videos can be used, with a total duration of no more than 15 seconds.

Aspect ratio: 0.4–2.5.

Width and height: 300–6,000 px.

Frame pixels (width × height): 409,600–927,408. For example, 640 × 640 is exactly the minimum, and 834 × 1,112 is exactly the maximum.

File size: each video must be under 50 MB.

Frame rate: 24–60 FPS.
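The video limits above interact: a clip can satisfy the width and height range and still fail the frame-pixel range. The sketch below checks both the per-video limits and the combined limits for a set of reference videos. All names are hypothetical, 1 MB is assumed to mean 2^20 bytes, and the 480p/720p requirement is not checked because the requirements do not say how it maps to dimensions.

```python
from dataclasses import dataclass

MB = 1024 * 1024  # assumption: MB is not defined in the requirements
VIDEO_FORMATS = {"MP4", "MOV"}
MIN_FRAME_PIXELS = 409_600  # 640 x 640
MAX_FRAME_PIXELS = 927_408  # 834 x 1112


@dataclass
class VideoInfo:
    """Hypothetical container for metadata read elsewhere."""
    fmt: str
    width: int
    height: int
    duration_s: float
    fps: float
    size_bytes: int


def video_problems(v: VideoInfo) -> list[str]:
    """Return the per-video requirements that this clip violates."""
    problems = []
    if v.fmt.upper() not in VIDEO_FORMATS:
        problems.append(f"unsupported format: {v.fmt}")
    if not 2 <= v.duration_s <= 15:
        problems.append("duration outside 2-15 s")
    if not (300 <= v.width <= 6000 and 300 <= v.height <= 6000):
        problems.append("width or height outside 300-6000 px")
    elif not 0.4 <= v.width / v.height <= 2.5:
        problems.append("aspect ratio outside 0.4-2.5")
    if not MIN_FRAME_PIXELS <= v.width * v.height <= MAX_FRAME_PIXELS:
        problems.append("frame pixels outside 409,600-927,408")
    if v.size_bytes >= 50 * MB:
        problems.append("file is not under 50 MB")
    if not 24 <= v.fps <= 60:
        problems.append("frame rate outside 24-60 FPS")
    return problems


def reference_videos_problems(videos: list[VideoInfo]) -> list[str]:
    """Check the combined limits, then each video individually."""
    problems = []
    if len(videos) > 3:
        problems.append("more than 3 reference videos")
    if sum(v.duration_s for v in videos) > 15:
        problems.append("total duration exceeds 15 s")
    for i, v in enumerate(videos):
        problems.extend(f"video {i}: {p}" for p in video_problems(v))
    return problems
```

Under these checks a 1280 × 720 clip passes the frame-pixel range (921,600 pixels), while a 640 × 480 clip fails it (307,200 pixels) even though both of its dimensions are within 300–6,000 px.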

5. Audio Input Requirements

Supported formats: WAV and MP3.

Duration: each audio clip must be 2–15 seconds. Up to 3 reference audio clips can be used, with a total duration of no more than 15 seconds.

File size: each audio clip must be under 15 MB. The total request body must not exceed 64 MB.
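The 64 MB request-body limit is easier to hit than the per-file limits suggest if files are embedded as Base64, since every 3 raw bytes become 4 encoded characters. The sketch below checks the audio limits and estimates the encoded body size. The function names are hypothetical, 1 MB is assumed to mean 2^20 bytes, and the estimate assumes every file is Base64-embedded in the body, which may not match how a given request is actually built.

```python
import base64

MB = 1024 * 1024  # assumption: MB is not defined in the requirements
MAX_BODY_BYTES = 64 * MB
AUDIO_FORMATS = {"WAV", "MP3"}


def base64_len(raw_bytes: int) -> int:
    """Length of the padded Base64 encoding of raw_bytes bytes."""
    return 4 * ((raw_bytes + 2) // 3)


def audio_problems(fmt: str, duration_s: float, size_bytes: int) -> list[str]:
    """Return the per-clip audio requirements that are violated."""
    problems = []
    if fmt.upper() not in AUDIO_FORMATS:
        problems.append(f"unsupported format: {fmt}")
    if not 2 <= duration_s <= 15:
        problems.append("duration outside 2-15 s")
    if size_bytes >= 15 * MB:
        problems.append("file is not under 15 MB")
    return problems


def fits_request_body(raw_sizes: list[int], overhead_bytes: int = 0) -> bool:
    """Estimate whether Base64-embedded files fit in the 64 MB body.

    overhead_bytes is a hypothetical allowance for the prompt and any
    other fields in the request.
    """
    encoded = sum(base64_len(n) for n in raw_sizes)
    return encoded + overhead_bytes <= MAX_BODY_BYTES


# Sanity check of the size formula against the standard library.
assert base64_len(10) == len(base64.b64encode(b"x" * 10))
```

Two 29 MB images total 58 MB raw, which is under the limit, but fits_request_body([29 * MB, 29 * MB]) returns False because the encoded size is about 77 MB.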

6. Check Compliance Before Generation

If any input image, video, or audio clip features people, run Compliance Check before generation.

7. Generate After Passing Compliance

A green checkmark means the asset has passed detection and is ready for generation.

8. Batch Asset Compliance Check

For large sets of people-related images, videos, or audio clips, run batch asset detection in advance to streamline the compliance workflow.

9. Access the Asset Library

10. Create Asset Group

11. Upload Assets & Check the Stored Status

The Stored status means the asset has passed detection and is ready to use. Uploaded assets can include images, videos, and audio.

Image requirements: JPEG, PNG, WEBP, BMP, TIFF, GIF, or HEIC/HEIF; aspect ratio 0.4–2.5; width and height 300–6,000 px; under 30 MB per image.

Video requirements: MP4 or MOV; 480p or 720p; 2–15 seconds per video; aspect ratio 0.4–2.5; width and height 300–6,000 px; frame pixels (width × height) 409,600–927,408; under 50 MB per video; 24–60 FPS.

Audio requirements: WAV or MP3; 2–15 seconds per clip; under 15 MB per audio clip.

12. Open Assets in Video Node

13. Select Asset Group

14. Pick a Virtual Portrait

15. Use Approved Image in Video Node

The video node will display compliant assets that are ready to use.