Midjourney-Style Video Generation
Turn simple text prompts into stunning Midjourney-style videos with AI — fully automated, beautifully upscaled, and ready for creators and developers to unleash their ideas.
What this Workflow Does
This workflow takes a simple text description (a basic image prompt) and turns it into a high-quality AI-generated video clip. It does so by:
- Improving the prompt to match "Midjourney"-style aesthetics.
- Generating a beautiful image.
- Animating that image into a short video.
- Upscaling the video to full HD quality.
In short:
Text ➔ Beautiful Midjourney-style Prompt ➔ Image ➔ Short Animated Video ➔ Full HD Video
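The chain above can be pictured as a list of stages that each add one artifact to a shared state. The sketch below is only an illustration: the four stage functions are hypothetical stand-ins that return labelled strings, not real calls to GPT-4o, Flux, Veo 2, or the upscaler.

```python
from typing import Callable, Dict, List

State = Dict[str, str]
Stage = Callable[[State], State]


# Hypothetical stand-ins: each stage only records what it would produce.
def enhance_prompt(state: State) -> State:
    return {**state, "prompt": f"enhanced({state['text']})"}


def generate_image(state: State) -> State:
    return {**state, "image": f"image({state['prompt']})"}


def animate_image(state: State) -> State:
    # The video stage receives both the enhanced prompt and the image.
    return {**state, "video": f"video({state['prompt']}, {state['image']})"}


def upscale_video(state: State) -> State:
    return {**state, "final": f"fhd({state['video']})"}


PIPELINE: List[Stage] = [enhance_prompt, generate_image, animate_image, upscale_video]


def run_pipeline(text: str, stages: List[Stage] = PIPELINE) -> State:
    """Run every stage in order, threading the state through."""
    state: State = {"text": text}
    for stage in stages:
        state = stage(state)
    return state


result = run_pipeline("astronaut")
print(result["final"])
```

Keeping the stages in a list makes it easy to swap one model for another or to insert an extra step without touching the rest of the chain.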
Step-by-Step Breakdown
1. Text Input
- Block: Text
- You start with a simple input like:
"Photograph of an astronaut meditating and levitating in the air in the middle of a field of yellow flowers..."
- This is a basic user-written prompt, not yet optimized for AI image generation.
2. Prompt Enhancement (GPT-4o)
- Block: GPT-4o
- The prompt is sent to GPT-4o with a system instruction saying:
"You are an expert in generating Midjourney themed images. Please convert the prompt to a Midjourney-like style."
- GPT-4o rewrites the simple description into a detailed, aesthetic prompt closer to what high-end models like Midjourney would understand.
- For example, it might add:
- Specific details (lighting, atmosphere, artistic style, camera settings).
- Stylistic flourishes (cinematic feel, ultra-detailed textures).
3. Image Generation (Flux-1.1 Pro Ultra)
- Block: Flux-1.1 Pro Ultra
- The enhanced prompt is fed into Flux-1.1 Pro Ultra, a very high-quality image generation model.
- Output: A realistic or artistic image of the astronaut meditating in the flower field.
4. Video Generation (Google Veo 2)
- Block: Google Veo 2
- The prompt and the image are passed into Google Veo 2, a video generation model.
- It generates a 5-second video clip based on the image:
- Duration: 5 seconds
- Aspect ratio: 16:9 (perfect for widescreen)
- It uses a random seed for slight variations.
- The video shows a short animated scene based on the astronaut image (for example, flowers waving or slight camera motion).
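The settings listed for this step could be captured in a small config object like the one below. This is a sketch only: the field names and the `frame_size` helper are illustrative assumptions, not the block's real parameter names.

```python
import random
from dataclasses import dataclass, field
from typing import Tuple


def _random_seed() -> int:
    # A fresh seed per run gives slightly different results each time.
    return random.randint(0, 2**32 - 1)


@dataclass(frozen=True)
class VideoSettings:
    """Hypothetical container for the video step's settings."""
    duration_seconds: int = 5
    aspect_ratio: str = "16:9"
    seed: int = field(default_factory=_random_seed)

    def frame_size(self, height: int) -> Tuple[int, int]:
        """Return (width, height) matching the aspect ratio for a given height."""
        ratio_w, ratio_h = (int(part) for part in self.aspect_ratio.split(":"))
        return (height * ratio_w // ratio_h, height)


settings = VideoSettings()
print(settings.duration_seconds, settings.aspect_ratio, settings.frame_size(1080))
```

Fixing the seed (for example `VideoSettings(seed=42)`) is the usual way to make a run repeatable, assuming the underlying model honors the seed.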
5. Video Upscaling (ESRGAN Video Upscaler)
- Block: ESRGAN Video Upscaler
- The generated video (which might be lower-res) is passed through a video upscaler:
- Model used: RealESRGAN_x4plus
- Resolution: FHD (Full HD 1920x1080)
- This step sharpens and enhances the video, reducing noise and blurriness.
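To make the resolution arithmetic concrete: RealESRGAN_x4plus is a 4x model, so its raw output is usually larger than Full HD and has to be resized to fit. The sketch below works through that geometry. How the upscaler block actually reaches 1920x1080 is not described here, so the resize-after-upscale step and the `plan_upscale` helper are assumptions.

```python
from typing import Dict, Tuple

MODEL_SCALE = 4            # RealESRGAN_x4plus enlarges each dimension 4x
FHD: Tuple[int, int] = (1920, 1080)


def plan_upscale(
    width: int,
    height: int,
    target: Tuple[int, int] = FHD,
    model_scale: int = MODEL_SCALE,
) -> Dict[str, object]:
    """Compute the model output size and the resize needed to fit the target."""
    up_w, up_h = width * model_scale, height * model_scale
    # Largest factor that keeps the frame inside the target, preserving aspect ratio.
    fit = min(target[0] / up_w, target[1] / up_h)
    final = (round(up_w * fit), round(up_h * fit))
    return {"model_output": (up_w, up_h), "resize_factor": fit, "final": final}


plan = plan_upscale(1280, 720)
print(plan)
```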
6. Final Output
- The upscaled, Full HD 5-second video is the final result, ready for download or further use (like posting on social media, adding to a portfolio, or using it as part of a bigger project).
Why This is Useful
- Takes a basic idea and automates the process of creating a professional-level animated visual.
- Saves hours of manual work: no need to prompt Midjourney by hand, edit images in Photoshop, or animate them separately.
- Great for:
- Marketing content
- Concept art
- Storyboards
- Short animations for reels, posts, or videos
Ways to Improve and Customize
More Control over Animation Styles
Current Situation: Google Veo 2 generates the video from a single image plus the prompt, so you rely on Veo’s internal animation logic (it decides camera moves, object motion, etc.).
How to Improve: Add a separate "Motion Prompt" (e.g., “gentle slow zoom-in on astronaut, flowers swaying slightly, soft wind movement”) and pass it as an extra control input if Veo (or a future video model) supports fine-grained motion prompts.
Result: You can generate different types of animations: slow zoom, parallax effect, timelapse, pan, etc.
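If the video model only accepts a single text prompt, one simple fallback is to append the motion description to the scene prompt. The sketch below shows that idea; the helper name, the presets, and the "Camera and motion:" label are all illustrative assumptions, and whether a given model responds to such wording is not guaranteed.

```python
# Hypothetical presets for the animation types mentioned above.
MOTION_PRESETS = {
    "slow_zoom": "gentle slow zoom-in on the subject",
    "parallax": "subtle parallax between foreground and background",
    "timelapse": "timelapse of clouds and light changing",
    "pan": "slow horizontal pan across the scene",
}


def build_video_prompt(scene_prompt: str, motion_prompt: str = "") -> str:
    """Combine a scene description with an optional motion description."""
    scene = scene_prompt.strip().rstrip(".")
    motion = motion_prompt.strip().rstrip(".")
    if not scene:
        raise ValueError("scene_prompt must not be empty")
    if not motion:
        return f"{scene}."
    return f"{scene}. Camera and motion: {motion}."


print(build_video_prompt(
    "Astronaut meditating in a field of yellow flowers",
    MOTION_PRESETS["slow_zoom"],
))
```

Keeping the motion text separate from the scene text lets you reuse one image with several animation styles.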
Models Used in the Pixelflow
veo-2-image2video
Discover Google Veo 2, an AI-powered image-to-video model with 4K resolution, realistic motion, and cinematic effects for creators and developers.

flux-1.1-pro-ultra
Create stunning visuals effortlessly with Flux 1.1 Pro Ultra. Experience unparalleled image quality and speed.

esrgan-video-upscaler
ESRGAN Video Upscaler: Experience sharper, clearer 4K videos. This AI-powered video upscaler boosts resolution and reduces artifacts, making your video content look its best. A Topaz alternative.

gpt-4o
GPT-4o (“o” for “omni”) is a multimodal OpenAI model (accepting text or image inputs and outputting text). OpenAI describes it as having the same high intelligence as GPT-4 Turbo while being much more efficient: it generates text 2x faster and is 50% cheaper. According to OpenAI, GPT-4o also has stronger vision and non-English language performance than its earlier models. GPT-4o is available in the OpenAI API to paying customers.
