What is Midjourney AI and how does it work?

  • May 15, 2023
star wars wes anderson style ai image

Have you ever wished you could conjure a picture straight out of your imagination? You now can, within a matter of minutes, thanks to image generators like Midjourney. It doesn’t matter if you lack artistic skills or have never held a paintbrush in your life. Artificial intelligence can do all of the heavy lifting; all you need is a bit of text that describes the image you have in mind. But where did Midjourney come from all of a sudden, and how does it work? Here’s everything you need to know.

What is Midjourney?

Midjourney Community Showcase

Matt Horne / Android Authority

Midjourney is an example of generative AI that can convert natural language prompts into images. It’s only one of many machine learning-based image generators that have emerged of late. Despite that, it has risen to become one of the biggest names in AI alongside DALL-E and Stable Diffusion.

With Midjourney, you can create high-quality images from simple text-based prompts. You don’t need any specialized hardware or software to use it either, as it works entirely through the Discord chat app. The only downside? You’ll have to subscribe to a Midjourney plan before you can start generating images. That’s unlike much of the competition, which generally provides at least a few image generations for free.

Still, the barrier to entry with Midjourney is extremely low, and anyone can use it to generate realistic images within a matter of minutes. The results can range from uncanny to visually stunning, depending on the prompt.

Midjourney can generate stunning images that look extremely convincing.

In some cases, images from Midjourney have even deceived experts in photography and other domains. Likewise, you may have seen some extremely convincing AI-generated images on social media. Examples range from Pope Francis dressed in a puffer jacket to fabricated images of Trump getting arrested that circulated days before the actual event. But we’ve also seen some creative generations like a Star Wars scene in the style of Wes Anderson (pictured above).

Unlike DALL-E, which is developed by ChatGPT’s creator OpenAI, Midjourney describes itself as a self-funded and independent project that hasn’t received any external funding to date. OpenAI, on the other hand, has raised as much as $10 billion from Microsoft and a handful of other investors. So given Midjourney’s humble roots, its results are quite impressive.

How does Midjourney work?

midjourney example prompt

Calvin Wankhede / Android Authority

We don’t know everything about Midjourney’s inner workings because it’s a closed-source, proprietary tool. That said, we know enough about the underlying technology to offer a general explanation.

Midjourney relies on two relatively new machine learning technologies, namely large language models and diffusion models. You may already be familiar with the former if you’ve used AI chatbots like ChatGPT. A large language model first helps Midjourney understand the meaning of whatever you type into your prompts. The prompt is then converted into what is known as a vector, which you can imagine as a numerical version of the prompt. Finally, the vector guides another complex process known as diffusion.
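To make the idea of a “numerical version of the prompt” more concrete, here is a deliberately simplified sketch. It is not how Midjourney or any real language model encodes text: the function `toy_prompt_vector`, the 8-slot vector size, and the word-hashing trick are all assumptions made purely for illustration. A real text encoder captures meaning, whereas this toy version only counts words.

```python
import hashlib
import math

def toy_prompt_vector(prompt, dims=8):
    """Map a prompt to a fixed-length unit vector (toy stand-in for a text encoder)."""
    vec = [0.0] * dims
    for word in prompt.lower().split():
        # Hash each word so the same word always lands in the same slot.
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vec[digest[0] % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    # Normalize so prompts of different lengths are comparable.
    return [v / norm for v in vec] if norm else vec

def cosine_similarity(a, b):
    """Dot product of two unit vectors: 1.0 means identical direction."""
    return sum(x * y for x, y in zip(a, b))

prompt = "white cats set in a post-apocalyptic Times Square"
vector = toy_prompt_vector(prompt)
print(vector)
print(cosine_similarity(vector, toy_prompt_vector(prompt)))
```

The point is only that any prompt, long or short, ends up as a list of numbers of the same length, which is the kind of input a diffusion model can be conditioned on.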

Midjourney uses a diffusion model to turn random noise into beautiful art.

Diffusion has only become popular within the past decade or so, which explains the sudden onslaught of AI image generators. When training a diffusion model, random noise is gradually added to the images in its training dataset. Over time, the model learns how to recover the original image by reversing the noise. With enough training, the model can then generate brand-new images by denoising a random image.
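The noising half of that process can be sketched in a few lines. The blend below follows a formulation commonly used in published diffusion models, where a noisy sample is `sqrt(alpha_bar) * image + sqrt(1 - alpha_bar) * noise`. Whether Midjourney uses exactly this is not public, and the 4-pixel “image”, the `add_noise` helper, and the chosen `alpha_bar` values are assumptions for illustration.

```python
import math
import random

def add_noise(pixels, alpha_bar, rng):
    """Blend an image with Gaussian noise.

    alpha_bar = 1.0 keeps the image untouched; alpha_bar = 0.0 gives pure noise.
    """
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [signal * p + noise * rng.gauss(0.0, 1.0) for p in pixels]

rng = random.Random(0)            # fixed seed so runs are repeatable
image = [0.0, 0.5, 1.0, 0.5]      # hypothetical 4-pixel grayscale "image"
for alpha_bar in (1.0, 0.5, 0.0):
    print(alpha_bar, add_noise(image, alpha_bar, rng))
```

During training, a model sees many such noisy versions alongside the originals and learns to predict the noise that was added, which is what later lets it run the process in reverse.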

So what does it look like from the perspective of an AI image generator? When you enter a text prompt like “white cats set in a post-apocalyptic Times Square,” it starts off with a field of visual noise. You can think of this first step as equivalent to television static. The image doesn’t look like anything at this point. However, a trained AI model can use latent diffusion to subtract the noise in steps. And eventually, it will yield a picture that resembles objects and ideas in the real world.

As a side note, this is also why you typically need to wait a minute or two for an AI-generated image to fully develop. If you stop the process earlier, you’ll get a noisy image that hasn’t gone through enough denoising steps.
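A toy loop shows why cutting the process short leaves a noisy result. This is only a sketch under strong assumptions: the `denoise` function already knows the `target` image, whereas a real trained model has to predict the noise from the prompt and the current sample. The step counts and pixel values are made up.

```python
import random

def denoise(target, total_steps, steps_run, seed=0):
    """Start from random noise and remove a share of the remaining noise per step.

    `target` plays the role of the image a trained model would steer toward.
    """
    rng = random.Random(seed)
    sample = [rng.gauss(0.0, 1.0) for _ in target]
    for step in range(min(steps_run, total_steps)):
        remaining = total_steps - step
        # Close 1/remaining of the gap, so the gap reaches zero on the last step.
        sample = [s + (t - s) / remaining for s, t in zip(sample, target)]
    return sample

def mean_abs_error(a, b):
    """Average per-pixel distance between two images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

target = [0.2, 0.8, 0.5]          # hypothetical 3-pixel "finished" image
for steps in (0, 5, 10):
    result = denoise(target, total_steps=10, steps_run=steps)
    print(steps, mean_abs_error(result, target))
```

Running fewer steps than planned leaves a measurable gap between the sample and the finished image, which is the numerical counterpart of the half-developed picture you see if generation is interrupted.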

How much does Midjourney cost?

barack obama ai image

While we’ve seen chatbots like ChatGPT and Bing Chat offer nearly unlimited usage for free, the same cannot be said for image generators. Virtually all of them have some limits in place, with Midjourney not even offering a free trial. This is because each image generation task requires a lot of computing power, specifically from graphics processing units (GPUs). Furthermore, each GPU has a finite amount of video memory, and the denoising process uses a large share of it.

So with that in mind, it’s not surprising that a state-of-the-art AI image generator will cost you some money. We have a dedicated guide on Midjourney’s pricing and subscription tiers, but you’ll have to pay a minimum of $10 per month. That nets you 3.3 hours of GPU time, good for roughly 200 image generations.
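For a rough sense of what those figures imply per image, the arithmetic can be worked out directly. The three inputs come from the paragraph above; the per-image numbers are only averages, since actual GPU time will vary from one generation to the next.

```python
monthly_price_usd = 10.0   # cheapest plan, from the article
gpu_hours = 3.3            # GPU time included, from the article
approx_images = 200        # rough image count, from the article

gpu_minutes = gpu_hours * 60
minutes_per_image = gpu_minutes / approx_images
cost_per_image = monthly_price_usd / approx_images

print(f"{gpu_minutes:.0f} GPU minutes per month")                 # 198 GPU minutes per month
print(f"about {minutes_per_image:.2f} GPU minutes per image")     # about 0.99 GPU minutes per image
print(f"about ${cost_per_image:.2f} per image")                   # about $0.05 per image
```

In other words, the entry-level plan works out to roughly one minute of GPU time and about five cents per image.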

Midjourney’s higher-end plans grant you unlimited images in Relaxed mode, but you’ll have to wait as long as 10 minutes for each generation. If you don’t need the absolute best quality, we recommend checking out alternative AI image generators instead. While most free options haven’t caught up to Midjourney yet, they’re still plenty of fun to use.


FAQs

Midjourney was trained on existing image samples, including art from various sources, to generate brand-new pictures. Some artists believe that AI image generators have infringed on their copyright by using their work for training. However, proponents of AI image generators argue that the training process falls under the category of fair use.

Midjourney cannot create a full video. But if you only want a video of Midjourney’s image generation process, you can add the --video parameter to the end of your prompts.

Midjourney uses a machine learning technique known as diffusion, but it’s unclear if it’s based on the open-source Stable Diffusion model.

Midjourney is not open source. It is a closed-source, proprietary tool developed by a San Francisco-based research startup that aims to become profitable.

Midjourney is owned by an independent research firm with the same name. The image generator was founded in San Francisco by David Holz, who also co-founded the hand-tracking company Leap Motion a decade prior.