From the course: Midjourney: Tips and Techniques for Creating Images

How text-to-image prompts work

- [Instructor] In this video, we're going to look at the basics of how text-to-image prompts work. Our story starts with diffusion models, which represent the north star of generative AI capabilities today when it comes to still and even video imagery. Decades of experimentation have led to these diffusion models and to the exponential growth we're seeing now. Diffusion models fall in the generative model class, and generative AI models are a class of machine learning models that can generate new data based on training data. In Midjourney's case, that data is imagery.

So here are the basics of the diffusion model process at a glance. You take an image and progressively add random noise on top of it until you arrive at a fully random noise pattern. Then you run the reverse process: you take that fully random noise pattern, you de-noise it step by step, and from that you attempt to recreate the original image, with variations introduced by the noise.

Okay, so that's one part of the text-to-image prompt. Text-to-image models are a combination of two parts. On one end you have a language model, which takes a natural language description as input. On the other end you have a generative image model, in this case a diffusion model, which produces an image conditioned on the language model's representation of that description. When you think language model, think none other than ChatGPT, because this is essentially how you're going to write text prompts with Midjourney V5 and beyond.

So that's some basic information when it comes to text-to-image prompting. Keep this in mind as we begin our journey with creating our first prompts.
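The forward-noising process described above can be sketched in a few lines of Python. This is only a toy illustration of the idea, not Midjourney's actual implementation: the noise schedule, the tiny "image," and the step count are all made-up values, and a real diffusion model replaces the simple blending below with a trained neural network that learns to reverse the noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy forward diffusion: progressively blend a clean "image" with
# Gaussian noise. Schedule values are illustrative, not from any
# real model.
T = 10                                # number of diffusion steps (assumed)
betas = np.linspace(1e-2, 0.5, T)     # per-step noise amounts (toy schedule)
alpha_bars = np.cumprod(1.0 - betas)  # cumulative signal-retention factor

def forward_diffuse(x0, t, rng):
    """Sample a noised version of x0 at step t using the closed form
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    noise = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise

# A tiny 32x32 "image": a bright square on a dark background.
x0 = np.zeros((32, 32))
x0[8:24, 8:24] = 1.0

x_mid = forward_diffuse(x0, t=4, rng=rng)      # partially noised
x_final = forward_diffuse(x0, t=T - 1, rng=rng)  # close to pure noise

def corr(a, b):
    """Correlation between two images, as a rough similarity measure."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

# The more noise steps applied, the less the result resembles x0.
print(round(corr(x0, x_mid), 2), round(corr(x0, x_final), 2))
```

Going the other direction, from pure noise back to an image, is where the trained model comes in: at each reverse step it predicts and subtracts a little of the noise, conditioned (in a text-to-image system) on the language model's representation of your prompt.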