Seven most common mistakes people make with text-to-video

In the dynamic landscape of digital media, the power of video content is taking center stage. As an increasingly dominant form of communication, video offers an engaging, interactive, and visually appealing way to convey information. One particular area of innovation that's garnering attention is text-to-video technology. This emerging field offers a broad spectrum of exciting use cases, from education and training to entertainment and marketing.

The concept of text-to-video is simple yet powerful: transforming written content into visually stunning videos. This has the potential to revolutionize how we consume and share information, making it more accessible, engaging, and impactful.

The current status of text-to-video technology is rapidly evolving. Cutting-edge algorithms are being developed to improve the quality and realism of the generated videos, while also making the process more intuitive and user-friendly. Open-source libraries play a crucial role in this landscape, offering a collaborative platform for innovation and development.

One such library that's making waves in this domain is deforum. This open-source project is designed to democratize the text-to-video space, providing a powerful toolkit for developers and content creators alike. Whether you're looking to generate promotional videos from product descriptions or educational content from textbooks, deforum could be the key to unlocking the potential of text-to-video technology.

neural frames offers access to the most cutting-edge advances in text-to-video, building on the foundations laid by Deforum. In this blog post, we revisit the seven most common mistakes people make with neural frames.

1) Six models to choose from

neural frames offers a variety of models to choose from. While there are many, many more models out there in the wild, we hand-selected the six that we find most astonishing to date: three all-rounder models (OpenJourney, Deliberate, DreamShaper) and three specialist models (Realistic Vision, Analog Diffusion, Anything).

The specialists excel at photorealistic, analog-photography, or comic/manga styles, but can only produce those styles. The all-rounders are good for anything; Deliberate and DreamShaper in particular can produce mind-blowing results.

2) The creepy faces from afar

When faces are small in the image, they typically look pretty bad

The way these models are set up, they currently cannot reliably render nice faces of people far away. Make sure to keep faces relatively large in the image, either by adding something like "portrait" or "close-up" to the prompt or simply by selecting an image where the faces look OK.

3) Custom models

The AI models know basically all the vocabulary on the internet, so when you train a custom model on yourself or some other object, you are left with a question: how do I name this object so that the model knows I am referring to the new thing I am showing it?

The solution people have come up with is to use cryptic tokens such as "sks person" or "sks object". In neural frames, we use that phrase to refer to the subject of a custom model, and it can produce astonishing visuals of practically anything you want.

sks person as a pixelated gangster rapper
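The rare-token trick can be illustrated with a tiny sketch. The idea (popularized by DreamBooth-style fine-tuning) is that the same uncommon token appears both in the training captions and in your later prompts, so the model binds it to your subject. The token "sks" comes from the article; everything else here is illustrative, not neural frames' internals:

```python
# Rare-token sketch: an uncommon token like "sks" carries no prior meaning,
# so the fine-tuned model learns to associate it with the new subject.
RARE_TOKEN = "sks"

# Caption used (conceptually) when fine-tuning on your own photos:
training_caption = f"a photo of {RARE_TOKEN} person"

# Prompt used later to generate with the custom model:
generation_prompt = f"{RARE_TOKEN} person as a pixelated gangster rapper"

print(training_caption)
print(generation_prompt)
```

Because the token is rare, it does not collide with concepts the base model already knows, which is exactly why a made-up string works better than, say, "John".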

4) 16:9 issues

The models are not trained on 16:9 images, so results in 16:9 or 9:16 can sometimes look a bit weird: objects may be duplicated or oddly enlarged. If you run into issues like that, try the 4:3 format instead, which typically works much better. You can also add terms such as "two, double" to the negative prompt to nudge the model toward a single subject, but that doesn't always work.

5) Camera movement 1: Keep object in frame

Practice your camera movement. With only one movement setting such as "Chill" or "Loco", the camera will oftentimes move so far that the subject of your video drifts out of frame. Simply add another box in the timeline with reversed settings and you are good to go!
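To make the "reversed settings" idea concrete, here is a minimal sketch. The timeline in neural frames is a GUI, and parameter names like `zoom` and `pan_x` are assumptions for illustration only; the point is that the second box undoes the first box's per-frame motion so the subject drifts back into frame:

```python
# First box: camera zooms in and pans right for 15 timeline units.
# (Hypothetical parameter names, not neural frames' real settings.)
first_box = {"start": 0, "end": 15, "zoom": 1.02, "pan_x": 0.5}

# Reversing box: same duration, each movement negated (or inverted, for
# multiplicative parameters like zoom), so the motion retraces itself.
duration = first_box["end"] - first_box["start"]
reverse_box = {
    "start": first_box["end"],
    "end": first_box["end"] + duration,
    "zoom": 1 / first_box["zoom"],   # undo the zoom-in with a zoom-out
    "pan_x": -first_box["pan_x"],    # pan back the other way
}
```

Note the asymmetry: additive motions (panning) are negated, while multiplicative ones (zoom) are inverted, so that applying both boxes in sequence lands you roughly back where you started.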

Control over the camera is what separates pros from beginners on neural frames.

6) Camera movement 2: Smooth movement

If you want to change direction, as in the example above, it is usually better to add another box to the timeline instead of editing the box you are on. Oftentimes it works best to shorten the current box and add another one behind it, with a prompt fade window in between. The prompt fade window interpolates between the parameters of the two boxes and therefore produces a smooth transition between the settings.
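Conceptually, the prompt fade window is a linear interpolation between two boxes' settings over the fade frames. A minimal sketch, with parameter names chosen purely for illustration (not the app's real API):

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t

# Two boxes with hypothetical settings; the fade blends between them.
box_a = {"zoom": 1.02, "strength": 0.65}
box_b = {"zoom": 0.98, "strength": 0.45}

fade_frames = 10
for frame in range(fade_frames + 1):
    t = frame / fade_frames  # 0.0 at the start of the fade, 1.0 at the end
    blended = {k: lerp(box_a[k], box_b[k], t) for k in box_a}
    # At t=0 the settings equal box_a; at t=1 they equal box_b,
    # with a smooth ramp in between instead of a hard jump.
```

This is why the transition looks smooth: instead of the camera settings snapping from one box to the next, each parameter glides between the two values over the width of the fade window.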

7) Video duration

You can change the video duration by editing the last time value on the timeline (in the beginning it's 30). Currently, the maximum allowed duration is 5 minutes!

Happy rendering!

Built by a physicist during sleepless nights. If you see anything go sideways or have suggestions for improvement, PLEASE let me know: contact(at) This website is greatly inspired by Deforum, but doesn't actually use it. For inspiration on prompts, I recommend Lexica.