Wan2.1 Text to Video
Wan2.1 is a suite of video foundation models for text-to-video (T2V) generation. It consistently outperforms existing open-source and commercial solutions across multiple benchmarks.
Key Features of Wan2.1 Text to Video
- SOTA Performance: Outperforms existing open-source and commercial models across multiple benchmarks.
- Powerful Video VAE: Wan-VAE efficiently encodes and decodes 1080P videos of any length while preserving temporal information.
- Architecture: Built on the mainstream diffusion transformer paradigm, with innovations such as a novel spatio-temporal variational autoencoder (VAE).
- T2V-14B: Supports both 480P and 720P resolutions and sets a new SOTA for performance.
- T2V-1.3B: Supports 480P resolution. It can also generate videos at 720P, but 480P gives more stable results.
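The resolution guidance for the two variants can be sketched as a small lookup. This is only an illustration built from the feature list above: the `SUPPORTED` table and the `pick_resolution` helper are hypothetical names and are not part of any Wan2.1 API.

```python
# Hypothetical lookup table derived from the feature list above;
# it is not part of any official Wan2.1 API.
SUPPORTED = {
    "T2V-14B": ("480P", "720P"),
    "T2V-1.3B": ("480P",),  # 720P can be generated but is less stable
}


def pick_resolution(model: str, requested: str) -> str:
    """Return `requested` if the variant supports it, otherwise fall back
    to the variant's first (most stable) listed resolution."""
    try:
        supported = SUPPORTED[model]
    except KeyError:
        raise ValueError(f"unknown model variant: {model!r}") from None
    return requested if requested in supported else supported[0]
```

With this sketch, asking T2V-1.3B for 720P falls back to 480P, while T2V-14B accepts either resolution.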
Additional Information
- The models are released under the Apache 2.0 License, which permits free use provided the license terms are followed.
- Extensive manual evaluations confirm that Wan2.1 outperforms both closed-source and open-source models.
Other Popular Models
storydiffusion
Story Diffusion turns your written narratives into stunning image sequences.

fooocus
Fooocus enables high-quality image generation effortlessly, combining the best of Stable Diffusion and Midjourney.

instantid
InstantID generates customized images in various poses or styles from a single reference ID image while preserving high fidelity.

codeformer
CodeFormer is a robust face restoration algorithm for old photos or AI-generated faces.
