4 Generative AI Video Concepts

It's easy to feel overwhelmed by all the AI video options available today, so let's break them down to better understand each type and feel simply whelmed. We'll look at four AI video formats: Text-to-Video, Video-to-Video, Image-to-Video, and Deepfakes. There are multiple tools for each, but this article focuses primarily on the concepts behind them.


Text-to-Video

Text-to-video can be likened to a typewriter that produces movies. To create a video, you write a prompt, a visual description of what you want to see. Some tools also accept a negative prompt, a description of what you don't want in the video. A third input, the seed, is a number that initializes the random noise the generation starts from.

Using the same prompt and seed results in a consistent video, while slight modifications to the prompt can alter the output slightly but retain a similar overall style. When writing video prompts, think more like a screenwriter and less like a Director of Photography. Words like “cinematic” or “shallow depth of field” tend to impact the video more than technical details like camera models or lens focal lengths.
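That prompt-plus-seed behavior can be sketched in a few lines. This is a toy stand-in, not any real tool's API: `generate_frames` is a hypothetical function whose "frames" are just pseudo-random numbers, but its seeding mirrors how these generators stay reproducible.

```python
import random

def generate_frames(prompt: str, seed: int, n_frames: int = 4) -> list[float]:
    """Toy stand-in for a text-to-video model: the 'frames' are just
    pseudo-random numbers, but the seeding behavior mirrors real tools."""
    # The seed (combined with the prompt) fixes the starting noise,
    # so the same inputs always produce the same output.
    rng = random.Random(f"{prompt}|{seed}")
    return [round(rng.random(), 3) for _ in range(n_frames)]

# Same prompt + same seed: identical video, every run
a = generate_frames("cinematic desert at dusk", seed=42)
b = generate_frames("cinematic desert at dusk", seed=42)
print(a == b)  # True

# Same seed, slightly tweaked prompt: a different but related result
c = generate_frames("cinematic desert at dawn", seed=42)
print(a == c)  # False
```

In real tools the seed fixes the initial noise tensor rather than a Python RNG, but the consequence is the same: record your seeds if you ever want to regenerate or iterate on a shot.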

Similar prompt, same seed

Popular text-to-video tools for filmmakers include Runway Gen-2, Zeroscope, FullJourney, and Pika. These platforms typically produce videos of roughly 3 to 16 seconds. Another method, using Stable Diffusion with an extension called Deforum, allows for longer videos and lets the prompt change dynamically over time.
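Deforum's time-varying prompts work as a schedule mapping frame numbers to prompt text. The sketch below illustrates the concept only; the frame numbers and prompts are made up, and `prompt_for_frame` is a hypothetical helper, not Deforum's actual API.

```python
# Deforum-style animation prompts: frame number -> prompt text.
# The generator transitions from one prompt to the next as frames advance.
animation_prompts = {
    0: "a quiet desert at dusk, cinematic, shallow depth of field",
    60: "a dune buggy racing over sand dunes, cinematic",
    120: "a horse galloping at dawn, cinematic",
}

def prompt_for_frame(frame: int) -> str:
    """Pick the latest prompt whose start frame is at or before `frame`."""
    start = max(k for k in animation_prompts if k <= frame)
    return animation_prompts[start]

print(prompt_for_frame(75))  # "a dune buggy racing over sand dunes, cinematic"
```

Because the schedule is keyed by frame number, a single render can drift through several scenes, which is how Deforum animations escape the few-second limit of single-prompt tools.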


Video-to-Video

Video-to-video takes an existing video and applies a new prompt, either text or an image, to alter its style. The technique was popularized by tools like Runway Gen-1: the motion stays the same while the style of the video changes drastically.

In Stable Diffusion, an extension called ControlNet enables similar types of animations, allowing for creative transformations of video content.


Image-to-Video

Image-to-video is excellent for those seeking more control over a generated video's appearance. Starting from an image rather than a text description gives a clearer initial direction.

Tools like Pika Labs allow you to add a text prompt to introduce elements such as a horse or a dune buggy into the frame. Others, like Runway, produce stunning videos but may offer less control or might not faithfully replicate the source image.

Stable Diffusion offers the option of using guided images, where you can change the reference image at intervals to direct the video’s development, or use an initial image for a consistent starting point.


Deepfakes

Deepfakes bring characters to life, capable of talking and interacting in surprisingly realistic ways. Despite their potential for misuse, many positive applications exist, such as digital presenters or AI avatars.

AI avatars

Tools like Synthesia and HeyGen use deepfake technology for beneficial purposes. Video-to-video deepfakes have also been possible since 2020 with tools like Wav2Lip, which, although trained on low-resolution video, paved the way for more advanced image-based deepfakes.

As we continue to explore the possibilities of AI in video production, the only limits will be those of imagination, creativity, and originality. So, take inspiration from novels, art, and your personal experiences. Jot down your ideas, because in the world of AI video, human creativity remains the most crucial skill.
