If a reference image has an overpowering element (like bright green eyeshadow or bubblegum), it can hijack the generation. Instead of complex negative prompts, simply crop the distracting element out of the reference image and re-upload it to guide the AI toward your intended focus.

Related Insights

Instead of writing prompts from scratch, upload visual references (like a mood board) to ChatGPT. Ask it to describe the visual qualities and language of the images, then use that output as a detailed prompt for AI image generators to replicate the desired style.

Avoid writing long, paragraph-style prompts from the start as they are difficult to troubleshoot. Instead, begin with a condensed, 'boiled down' prompt containing only core elements. This establishes a working baseline, making it easier to iterate and add details incrementally.

Instead of relying on complex text prompts, use a curated mood board as a direct visual input. Generative models like Midjourney can interpret the aesthetic, color, and style from images more effectively than from descriptive words, acting as a powerful communication shortcut.

To generate more aesthetic and less 'uncanny' images, include specific camera, lens, and film stock metadata in prompts (e.g., 'Leica, 50mm f1.2, Kodak Tri-X'). This acts as a filter, forcing the model to reference its training data associated with professional photography, yielding higher-quality results.

Instead of random prompting, break down any desired photo into its fundamental components like shot type, lighting, camera, and lens. Controlling these variables gives you precise, repeatable results and makes iteration faster, as you know exactly which element to adjust.
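As a minimal sketch of this component-based approach (the component names and values below are illustrative assumptions, not Midjourney syntax), a prompt can be held as named fields so exactly one variable changes per iteration:

```python
# Sketch: represent a photo prompt as named components so a single
# variable can be swapped per iteration while the rest stay fixed.
from typing import Dict


def build_prompt(components: Dict[str, str]) -> str:
    """Join non-empty component values into one comma-separated prompt."""
    return ", ".join(v for v in components.values() if v)


base = {
    "subject": "portrait of a street musician",
    "shot_type": "medium close-up",
    "lighting": "golden hour backlight",
    "camera": "Leica",
    "lens": "50mm f1.2",
    "film": "Kodak Tri-X",
}

print(build_prompt(base))

# To iterate, change exactly one component and regenerate, so you know
# precisely which element caused the difference in output.
variant = {**base, "lighting": "overcast soft light"}
print(build_prompt(variant))
```

Because each component has a name, a failed generation points directly at the field to adjust rather than at an undifferentiated paragraph of prose.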

The initial phase of prompting shouldn't aim for a perfect image. Instead, the goal is to generate quickly and analyze the results to understand how the AI is interpreting your inputs (mood board, prompts, s-refs). This diagnostic step is crucial for efficient iteration.

Midjourney's mood board feature can average out the aesthetics of multiple images, leading to generic results. For more precise control, use individual images as style references (`s-refs`). This allows the model to pull more distinct and impactful stylistic elements.
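Midjourney exposes style references through the `--sref` parameter (with an optional `--sw` style weight). A small helper like the one below, with hypothetical reference URLs, shows how individual images can be attached to a prompt instead of being averaged in a mood board:

```python
# Sketch: append individual style-reference URLs to a Midjourney prompt
# using the --sref parameter (and the optional --sw style weight).
from typing import Optional, Sequence


def with_style_refs(
    prompt: str,
    refs: Sequence[str],
    style_weight: Optional[int] = None,
) -> str:
    """Return the prompt with --sref URLs (and optional --sw) appended."""
    parts = [prompt, "--sref", *refs]
    if style_weight is not None:
        parts += ["--sw", str(style_weight)]
    return " ".join(parts)


# Hypothetical reference URLs; each image contributes its own distinct
# stylistic elements rather than being blended into an average.
print(with_style_refs(
    "portrait of a street musician",
    ["https://example.com/ref1.png", "https://example.com/ref2.png"],
    style_weight=200,
))
```

Swapping a single reference URL in or out is then a one-line change, which makes it easy to see what each image contributes to the result.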

To get superior results from image generators like Midjourney, structure prompts around three core elements: the subject (what it is), the setting (where it is, including lighting), and the style. Defining style with technical photographic terms yields better outcomes than using simple adjectives.

Instead of accepting an AI's first output, request multiple variations of the content. Then, ask the AI to identify the best option. This forces the model to re-evaluate its own work against the project's goals and target audience, leading to a more refined final product.

For tasks like fixing hands, adding specific objects (e.g., a MacBook), or upscaling, use reasoning models like Nano Banana. Think of it as a conversational Photoshop: it avoids complex prompting in Midjourney for fine-grained edits and allows more precise control over final image details.

Crop Dominant Features from Reference Images to Refine Midjourney's Focus | RiffOn