LocalGhost

Just4Fun. 😊 Feel free to remix all my images. 🤩👌👍😁👉Follow my Instagram: localghost2000 ( @onnaicla )
🎬 Guide to Writing Multi-Shot Text-to-Video Prompts for Seedance 2.0

Hello everyone! 👋 If you're just getting started with Seedance 2.0 or want to improve the quality of your AI-generated videos, this guide is for you. One of the most effective ways to create cinematic, consistent, and engaging videos is by using multi-shot prompts. Instead of describing everything in a single scene, you'll learn how to structure your prompts like a filmmaker, breaking your story into clear shots that flow naturally from beginning to end.

Let's dive in! 🎬

Why Use Multi-Shot Prompts?

Many beginners write prompts like this:

"A girl sits in a cafe drinking coffee, the camera slowly moves closer."

The result can be unpredictable. The camera may move oddly, character consistency may break, and the video often feels flat.

The secret is simple: think like a film director, not an image generator user. Seedance 2.0 performs much better when scenes are organized into clear, sequential shots.

🎥 Think Like a Director

Before writing your prompt, answer these three questions:

1. Who is the main character?
Examples:
- A Japanese schoolgirl
- An office worker
- An astronaut

2. What is happening?
Examples:
- Waiting for someone
- Running late for school
- Driving a vintage car

3. What is the sequence of events?
Example:
- Walking down the street
- Checking the time
- Panicking and running

This sequence becomes your multi-shot structure.

Basic Multi-Shot Structure

The simplest format:

Shot 1: [Scene description]
Shot 2: [Scene description]
Shot 3: [Scene description]

Avoid cramming everything into one scene. Let the story flow naturally from shot to shot.

The Simple Seedance Formula

Every shot should ideally contain:
- Location
- Character
- Action
- Camera Movement
- Mood

Example:

Shot 1: A young Japanese schoolgirl stands beside a quiet railway crossing on a bright morning. She gently adjusts her school bag while looking into the distance. Medium shot. Slow camera push-in. Peaceful atmosphere.

The Most Reliable Shot Structure

Shot 1 = Establishing Shot
Introduce the location and character.

Example:

Shot 1: A beautiful Japanese schoolgirl stands beside a quiet railway crossing on a sunny morning. The wind gently moves her long hair. Wide shot. Slow cinematic push-in.

Purpose:
✅ Establish the setting
✅ Introduce the character
✅ Define the mood

Shot 2 = Action Shot
The character does something.

Example:

Shot 2: The girl hears a distant train horn and turns her head. She smiles softly and takes a step forward. Medium shot. The camera slowly orbits around her.

Purpose:
✅ Add movement
✅ Increase engagement

Shot 3 = Payoff Shot
The key moment or ending.

Example:

Shot 3: The train passes behind her. She looks toward the camera and smiles warmly. Close-up shot. Gentle lens compression. Bright cinematic atmosphere.

Purpose:
✅ Deliver the climax
✅ Leave a memorable impression

Common Camera Movements

Push In
The camera moves closer to the subject.
Example: Slow camera push-in.
Best for:
- Emotional moments
- Drama
- Romance

Pull Back
The camera moves away from the subject.
Example: Camera slowly pulls back.
Best for:
- Endings
- Revealing environments

Orbit
The camera circles around the subject.
Example: Camera slowly orbits around the subject.
Best for:
- Character-focused scenes
- Fashion shots
- Emotional moments

Tracking Shot
The camera follows the character.
Example: Tracking shot following her movement.
Best for:
- Walking
- Running
- Driving

Crane Up
The camera rises upward.
Example: Camera cranes upward.
Best for:
- Cinematic endings
- Landscape reveals

The 15-Second Story Formula

This structure works exceptionally well for short videos:

Shot 1 = Setup
Shot 2 = Development
Shot 3 = Payoff

Example:
- Setup: The character discovers something.
- Development: The character reacts.
- Payoff: A surprise or conclusion occurs.

Example Theme: Running Late for School

Shot 1: A beautiful Japanese schoolgirl runs along a sunny residential street while looking at her wristwatch. Wide tracking shot. Bright morning atmosphere.
Shot 2: She notices the time is already 9:00 AM. Her eyes widen in panic. Medium shot. The camera quickly pushes in.
Shot 3: She finally reaches the school gate, breathing heavily. Then she notices a sign that says "Sunday." She freezes in confusion. Close-up shot. Comedic atmosphere.

This type of structured storytelling is usually much more engaging than a single disconnected scene.

Common Beginner Mistakes

❌ Too Many Actions in One Shot

Bad:
She runs, jumps, laughs, turns around, opens a door, sits down, drinks coffee, and reads a book.

The model is trying to do too much at once.

Better:
Shot 1: Running
Shot 2: Opening the door
Shot 3: Drinking coffee

❌ Random Location Changes

Bad:
Shot 1: Beach
Shot 2: Space station
Shot 3: Medieval castle

Sudden environment changes often cause character inconsistency.

❌ Excessive Unimportant Details

Bad:
The chair is made of imported oak wood from...

Seedance cares much more about:
- Character
- Action
- Camera
- Visual environment

The Golden Rules of Seedance 2.0 ✨

Use 3 to 5 Shots
This range is usually the most stable for short videos.

Repeat Key Character Descriptions
Don't be afraid to mention the main character repeatedly in each shot. This helps maintain consistency.

Keep Camera Instructions Simple
Choose one primary movement:
- Push-in
- Orbit
- Tracking
- Pull-back
Avoid stacking multiple camera movements in the same shot.

Focus on One Story
A 15-second video is not a two-hour movie. One strong idea almost always performs better than ten ideas squeezed together.

🎵 Audio in Seedance 2.0

Besides visuals and camera movements, Seedance 2.0 can also interpret audio descriptions within your prompt. Adding audio helps create a more immersive and cinematic experience by telling the model not only what the audience should see, but also what they should hear.

Basic Audio Structure

A simple format looks like this:

Shot 1: [Visual Description]
Audio: [Sound Description]

Example:

Shot 1: A young woman stands beside a railway crossing on a sunny afternoon. The wind gently moves her hair. Wide shot. Slow camera push-in.
Audio: Gentle wind blowing, distant train crossing bell, soft ambient city sounds.

Types of Audio You Can Use

Ambient Sounds
Ambient sounds establish the atmosphere of the scene.
Examples:
Audio: Birds chirping, gentle wind, distant traffic.
Audio: Ocean waves, seagulls, light sea breeze.
Audio: Rainfall, distant thunder, dripping water.

Sound Effects (SFX)
These sounds are directly related to actions or objects in the scene.
Examples:
Audio: Footsteps on wet pavement.
Audio: Car engine idling softly.
Audio: Door creaking open.
Audio: Paper rustling.

Dialogue
Dialogue can be included when a character speaks.
Examples:
Audio: She softly says, "What are you looking at?"
Audio: He whispers, "I finally found it."

Background Music
Music helps reinforce the mood and emotion of a scene.
Examples:
Audio: Soft emotional piano music.
Audio: Upbeat pop music.
Audio: Epic orchestral soundtrack.
Audio: Lo-fi chill background music.

Recommended Multi-Shot Format

For modern video generation workflows, a complete prompt often follows this structure:

Shot 1: [Visual Description]
Shot 2: [Visual Description]
Shot 3: [Visual Description]
Audio: [Audio Description]

Complete Example

Shot 1: Wide establishing shot of an empty train platform at dusk after rainfall, wet ground scattered with shallow puddles reflecting dim station lights, soft blue-gray evening sky with lingering clouds, faint mist drifting in the air, subtle ambient motion from dripping water and gentle wind, atmosphere calm and slightly melancholic.
Shot 2: Medium shot: A Japanese schoolgirl with long straight hair and full bangs sits alone on a bench under a dim station light, posture slightly slouched, loosely holding her phone without using it, her gaze unfocused toward the tracks, distant train headlights begin to emerge behind her, soft wind brushing her hair and uniform.
Shot 3: Close-up: Her phone screen lights up, displaying a simple message: “Did you get home safely?” The soft glow illuminates her face in the dim environment, her thumb hovers above the screen, she pauses briefly, her eyes soften as a subtle emotional shift begins.
Shot 4: Cutaway: A train rushes past the platform with a low rumble, wind flows through the station, her hair and skirt gently sway, reflections ripple across the puddles, streaks of moving light glide across her face, the moment feels quietly transitional.
Shot 5: Close-up: She exhales slowly, her shoulders relax, a faint and fragile smile appears, she begins typing a reply, warmth subtly replaces the earlier emptiness.
Shot 6: Wide shot from behind: She lowers her phone, holding that small smile, she gazes ahead for a brief moment as if gathering strength, then stands up, adjusts her bag, and walks away along the platform, her figure gradually fading into the distance.
Audio: Distant train rumble, soft and low. Gentle evening wind and subtle station ambience. Light water dripping in the background. Soft notification sound when the message appears. Very light emotional piano entering toward the latter half, carrying through the ending. No dialogue.

Audio Tips for Seedance 2.0 ✨

Keep Audio Descriptions Simple

Good:
Audio: Gentle rain, distant thunder.

Less Effective:
Audio: The rain should sound as if it is falling at approximately...

Short and clear descriptions are usually interpreted more reliably.

Match Audio to the Action

If a character is running:
Audio: Fast footsteps.

If a character is driving:
Audio: Engine hum, road noise.

If a character is speaking:
Audio: Dialogue.

The audio should naturally support what is happening on screen.

Layer Sounds Naturally

A good cinematic soundscape often combines:

Ambient + SFX + Music

Example:
Audio: Ocean waves, seagulls, soft piano music.

This combination creates a richer and more immersive viewing experience.

That's all for this guide! 🎥

Hopefully this guide gives you a solid foundation for creating better multi-shot prompts in Seedance 2.0. Don't be afraid to experiment with different shot sequences, camera movements, and storytelling styles. The more you practice, the more natural it becomes.

Thank you for reading, and happy prompting! ✨
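
If you assemble prompts in a script, the shot formula from this guide (Character, Action, Location, Camera, Mood, plus one Audio line) can be sketched as a small text builder. This is only a minimal sketch: the names `Shot`, `build_prompt`, and `shot_count_ok` are hypothetical helpers made up for illustration, not part of Seedance 2.0 or any official tool, and the output is plain text that you would paste into the prompt box yourself.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Shot:
    """One shot: character, action, location, camera, mood (hypothetical helper)."""
    character: str
    action: str
    location: str
    camera: str
    mood: str

    def render(self, index: int) -> str:
        # The character description is written out in every shot,
        # following the "repeat key character descriptions" rule.
        return (
            f"Shot {index}: {self.character} {self.action} {self.location}. "
            f"{self.camera}. {self.mood}."
        )


def shot_count_ok(shots: List[Shot]) -> bool:
    """True when the shot count is in the 3 to 5 range the guide calls most stable."""
    return 3 <= len(shots) <= 5


def build_prompt(shots: List[Shot], audio: Optional[str] = None) -> str:
    """Join numbered shots, then append a single Audio line if one is given."""
    if not shots:
        raise ValueError("at least one shot is required")
    lines = [shot.render(i) for i, shot in enumerate(shots, start=1)]
    if audio:
        lines.append(f"Audio: {audio}")
    return "\n".join(lines)


girl = "A young Japanese schoolgirl"
shots = [
    Shot(girl, "stands", "beside a quiet railway crossing on a sunny morning",
         "Wide shot, slow camera push-in", "Peaceful atmosphere"),
    Shot(girl, "turns her head toward a distant train horn", "at the crossing",
         "Medium shot, camera slowly orbits around her", "Gentle anticipation"),
    Shot(girl, "smiles warmly toward the camera", "as the train passes behind her",
         "Close-up shot", "Bright cinematic atmosphere"),
]
prompt = build_prompt(shots, audio="Gentle wind, distant train crossing bell")
print(prompt)
```

The 3 to 5 check is kept as a separate function rather than an error, since the complete example in this guide uses six shots and still follows the same structure.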
🎬 Mastering Multi-Character Prompts in Text-to-Image Generation

Hello everyone, in this article we’re going to explore how to properly write text-to-image prompts for generating two or more characters in a single frame. This is a common challenge for many creators, and if not handled correctly, it often leads to mixed attributes, unclear identities, or messy compositions. Let’s break it down step by step so you can create cleaner, more consistent multi-character results.

A practical guide to keeping your characters from blending into chaos

Creating a single character in a text-to-image prompt is already an art. Creating two or more characters in one frame? That’s where things turn into a delicate choreography.

Without proper structure, models tend to:
- Merge characters into one
- Swap attributes randomly
- Ignore who’s doing what

This guide will help you avoid that and consistently generate clean, well-defined multi-character scenes.

🧠 The Core Question

Is it enough to describe each character, or should we give them names?

Short answer: you should use both. Descriptions alone are not enough. Names alone are not enough.

The strongest prompts combine:
- Unique identifiers (names/labels)
- Clear individual descriptions
- Spatial positioning
- Interaction cues

Think of it like directing actors on a stage. Without names and direction, everyone improvises into confusion.

🎭 Why Descriptions Alone Often Fail

Example:
“Two girls, one with long hair, one with short hair”

This often leads to:
- Mixed features (both characters end up with medium hair)
- Clothing confusion
- Identity blending

To the model, this still feels like one loosely defined concept, not two separate individuals.

🏷️ Why Naming Characters Helps

Giving each character a name or label creates separate identity anchors.

Example:
Aiko → Character 1
Mina → Character 2

Now the model can track:
- Who has long hair
- Who is smiling
- Who is interacting

It’s like assigning roles in a script instead of saying “someone does something.”

✨ Best Practices for Multi-Character Prompts

1. Use Clear Identifiers

You don’t have to use real names. Any consistent label works:
- Names: Aiko, Mina
- Neutral labels: Girl A, Girl B
- Positional labels: Left girl, Right girl

👉 The key is consistency throughout the prompt.

2. Describe Each Character Separately

Avoid mixing descriptions in one sentence.

✅ Good Structure
Aiko: long black hair with bangs, wearing a white dress, smiling gently
Mina: short brown bob hair with bangs, wearing a denim jacket, looking at Aiko

❌ Problematic Structure
Two girls with long and short hair wearing dresses and jackets

👉 Separation prevents attribute bleeding.

3. Define Spatial Positioning (Critical)

Without positioning, characters may:
- Overlap unnaturally
- Switch places
- Appear disconnected

Examples:
“Aiko on the left, Mina on the right”
“Mina sitting, Aiko standing behind her”
“Standing side by side”

👉 This anchors characters in the scene like coordinates on a map.

4. Add Interaction Between Characters

Without interaction, characters feel like “two unrelated NPCs randomly spawned in the same frame.”

Add relational cues:
“Looking at each other”
“Holding hands”
“Laughing together”
“Aiko touching Mina’s shoulder”

👉 Interaction reinforces who is who and creates narrative cohesion.

🔥 Side-by-Side Comparison

❌ Weak Prompt
Two girls, one with long hair and one with short hair, smiling in a park

Likely issues:
- Mixed features
- No clear identity
- Flat composition

✅ Strong Prompt
Aiko: a 20-year-old Japanese woman with long black hair and bangs, wearing a white summer dress, standing on the left
Mina: a 19-year-old Japanese woman with short brown bob hair and bangs, wearing a denim jacket and skirt, standing on the right, looking at Aiko
Both smiling, soft sunlight, park background

Result:
- Clear separation
- Stable identities
- Natural composition

🧩 Advanced Tips (For More Control)

🔁 Reinforce Names During Actions
Repeat identifiers when describing actions:
“Mina looks at Aiko while Aiko smiles back”
This reduces confusion in complex scenes.

✂️ Avoid Overly Long Blended Sentences
Long sentences with multiple subjects increase ambiguity. Break them into structured segments instead.

🧬 Keep Attributes Localized
Don’t stack attributes across characters in one phrase.
Bad: “Aiko and Mina wearing white and black dresses”
Good: “Aiko wearing a white dress, Mina wearing a black dress”

🎯 Use Consistent Structure
A clean pattern improves model understanding.
Recommended flow:
1. Character A (full description)
2. Character B (full description)
3. Positioning
4. Interaction
5. Environment & lighting

🧠 Mental Model: Treat It Like a Script

Think of your prompt as:
🎬 A casting sheet (who are the characters?)
📍 A stage layout (where are they?)
🎭 A direction note (what are they doing?)

When you do this, the model stops guessing and starts following.

👉 Best formula:
Name + Description + Position + Interaction = Consistent Results

If you push this structure further, you can handle:
- 3+ characters
- Complex interactions
- Even multiple versions of the same character in one frame

And suddenly, your prompts stop feeling like chaos… and start feeling like direction. 🎬✨

🎯 Conclusion

In multi-character prompt writing, relying on just descriptions is not enough, and using only names without clear details is also insufficient. The most reliable approach is to combine both, supported by clear positioning and interaction between characters.

By assigning each character a distinct identity, describing them separately, defining where they are in the scene, and specifying how they interact, you give the model a structured “map” to follow. This greatly reduces confusion and prevents common issues like attribute mixing or character merging.

In short, the key formula is simple but powerful:
Name + Description + Position + Interaction = Consistent Results

Thank you for reading, and I hope this article helps you improve your prompt-writing skills and achieve more stable, high-quality results in your creations. Feel free to experiment with these techniques and adapt them to your own style. Happy creating, and see you in the next exploration! ✨
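
For anyone who builds prompts programmatically, the formula Name + Description + Position + Interaction can be sketched as a tiny text builder. This is a rough sketch under stated assumptions: `Character` and `build_scene_prompt` are hypothetical names invented for this illustration, and the two checks (unique identifiers, every interaction naming a character) are simply one way to turn the advice above into code. No image model or library API is called.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Character:
    name: str         # consistent identifier, e.g. "Aiko" or "Girl A"
    description: str  # attributes that belong to this character only
    position: str     # where the character is, e.g. "standing on the left"


def build_scene_prompt(
    characters: List[Character],
    interactions: List[str],
    environment: str,
) -> str:
    """Character blocks first, then interactions, then environment and lighting."""
    names = [c.name for c in characters]
    if len(set(names)) != len(names):
        raise ValueError("each character needs a unique identifier")
    for line in interactions:
        # Reinforce names during actions: reject lines that name nobody.
        if not any(name in line for name in names):
            raise ValueError(f"interaction does not name a character: {line!r}")
    blocks = [f"{c.name}: {c.description}, {c.position}" for c in characters]
    return "\n".join(blocks + interactions + [environment])


scene = build_scene_prompt(
    characters=[
        Character("Aiko", "long black hair with bangs, wearing a white dress",
                  "standing on the left"),
        Character("Mina", "short brown bob hair with bangs, wearing a denim jacket",
                  "standing on the right"),
    ],
    interactions=["Mina looks at Aiko while Aiko smiles back"],
    environment="soft sunlight, park background",
)
print(scene)
```

Keeping one character per line means attributes stay localized by construction: there is no place in the structure to write "white and black dresses" for two people at once.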
👉 Example I made (Z-Image & Flux)

best quality, ultra high res, photorealistic, realistic photography, documentary style realism, cinematic candid moment, Japanese high school classroom, warm afternoon sunlight through classroom windows, soft cinematic lighting, slice of life atmosphere, authentic youthful energy, dynamic group interaction, classroom blackboard, desks slightly moved aside, school bags and notebooks scattered naturally, four beautiful Japanese school girls standing together before a group photo session, overlapping body language, natural laughter, frozen candid motion, highly detailed facial expressions, realistic skin texture, subtle natural makeup, full body shot, eye-level camera angle, natural composition,

Yui Tanaka, 18-year-old Japanese girl, beautiful elegant face, very long straight black hair, full bangs, slim hourglass figure, navy sailor school uniform, black knee socks, brown loafers, gently reaching out to fix Aoi’s crooked ribbon, body slightly leaning toward the group, soft laughing expression

Aoi Nakamura, 18-year-old Japanese girl, adorable shy facial expression, twin braids, soft straight bangs, curvy figure, classic sailor uniform, red ribbon tie, black shoes, embarrassed smile, lightly covering her mouth with one hand, shoulders turning away bashfully while laughing

Mika Sato, 18-year-old Japanese girl, sporty beauty face, short bob haircut, full bangs, energetic smile, proportional athletic figure, modern blazer-style school uniform, plaid skirt, white sneakers, dramatically leaning forward in a fake serious model pose, one hand on her hip, other hand pointing playfully toward imaginary camera

Rin Kobayashi, 19-year-old Japanese girl, stylish beautiful face, wavy medium-length dark brown ombre hair, side-parted full bangs, long sidelocks, hair over shoulder, confident expression, cream cardigan over school uniform, black tights, loafers, curvy figure, holding smartphone up like an unofficial photo director, laughing while giving playful pose instructions to Mika, other hand lightly touching Mika’s shoulder

natural interaction, friendship chemistry, realistic classroom environment, subtle motion blur feeling, candid pre-photo atmosphere, cinematic depth of field, realistic shadows, soft warm color grading, immersive storytelling, lifelike proportions, highly detailed eyes and facial features, modern Japanese school aesthetic

👉 Result: [image]
🎥 Mastering Camera Angles & Perspective in Text-to-Image

Hi everyone,

In text-to-image generation, the camera acts as an invisible storyteller. It determines what the viewer notices, how the subject feels, and why an image carries emotion. By choosing the right camera position and point of view, even a simple prompt can evolve into a cinematic, powerful, or iconic visual.

In this article, I will walk you through the most commonly used camera angles and perspectives in text-to-image prompting, explained in a simple, practical, and easy-to-apply way, complete with prompt examples.

🧭 Overview: Common Camera Angles & Points of View

Before diving into detailed explanations, here is a quick overview of the camera angles discussed in this guide:

👁️ Eye-Level Shot – neutral, natural perspective
⬆️ High-Angle Shot – looking down on the subject
⬇️ Low-Angle Shot – looking up at the subject
🦅 Bird’s-Eye View – straight top-down perspective
🐜 Worm’s-Eye View – extreme low angle from the ground
🎭 Over-the-Shoulder Shot (OTS) – narrative, viewpoint-based framing
🔍 Close-Up / Extreme Close-Up – focus on details or emotion

Each angle carries a distinct visual meaning. The sections below explain how and when to use them effectively in text-to-image prompts.

👁️ 1. Eye-Level Shot

📌 Description
The camera is positioned at the same height as the subject’s eyes.

🎯 Visual Impression
- Natural and realistic
- Neutral and balanced
- Creates a sense of equality

🧩 Best Used For
Portraits, lifestyle scenes, fashion catalogs, everyday moments.

📝 Prompt Example
A young woman standing on a city sidewalk, eye-level camera angle, natural proportions, realistic lighting, casual urban atmosphere

⬆️ 2. High-Angle Shot

📌 Description
The camera is placed above the subject and tilted downward.

🎯 Visual Impression
- Subject appears smaller
- Vulnerable, gentle, or isolated mood

🧩 Best Used For
Emotional scenes, loneliness, observational perspectives.

📝 Prompt Example
A lone girl sitting on a bench, high-angle shot, camera looking down, quiet mood, soft shadows, cinematic composition

⬇️ 3. Low-Angle Shot

📌 Description
The camera is positioned below the subject and tilted upward.

🎯 Visual Impression
- Powerful and dominant
- Heroic or authoritative presence

🧩 Best Used For
Main characters, fashion editorials, strong or confident figures.

📝 Prompt Example
A confident female model wearing a red blazer, low-angle camera view, dramatic lighting, powerful presence

🦅 4. Bird’s-Eye View (Top-Down View)

📌 Description
The camera is placed directly above the subject, facing straight down.

🎯 Visual Impression
- Graphic and structured
- Emphasizes patterns, shapes, and layout
- Feels abstract or conceptual

🧩 Best Used For
Street scenes, conceptual design, spatial exploration.

📝 Prompt Example
Bird’s-eye view of a woman crossing a red street, top-down perspective, strong geometric composition, minimal shadows

🐜 5. Worm’s-Eye View

📌 Description
The camera is placed extremely low, almost touching the ground, looking upward.

🎯 Visual Impression
- Dramatic and monumental
- Exaggerated scale and depth

🧩 Best Used For
Architecture, avant-garde fashion, extreme cinematic shots.

📝 Prompt Example
Worm’s-eye view of a fashion model walking between tall buildings, exaggerated perspective, cinematic scale

🎭 6. Over-the-Shoulder Shot (OTS)

📌 Description
The camera is positioned behind one subject’s shoulder, framing another subject in front.

🎯 Visual Impression
- Narrative and immersive
- Viewer feels present in the scene

🧩 Best Used For
Dialogue scenes, storytelling, cinematic compositions.

📝 Prompt Example
Over-the-shoulder shot from behind a man, focusing on a woman’s face, shallow depth of field, intimate cinematic mood

🔍 7. Close-Up & Extreme Close-Up

📌 Description
The camera tightly frames the subject’s face or a specific detail.

🎯 Visual Impression
- Intimate and emotional
- Strong psychological focus

🧩 Best Used For
Facial expressions, beauty shots, intense emotional moments.

📝 Prompt Example
Extreme close-up portrait of a woman’s eyes, eye-level camera, soft lighting, ultra-high detail, emotional expression

🛠️ Practical Tips for Writing Camera Prompts

✅ Mention the camera angle early in the prompt
✅ Combine angle with shot distance (wide shot, medium shot, close-up)
✅ Match the angle with the intended emotion or story
✅ Avoid stacking too many camera angles in one prompt

📝 Simple Combined Example
Low-angle medium shot of a stylish woman in a red-and-black outfit, cinematic lighting, confident mood

🌐 Model Compatibility: Is This Universal?

Yes, the use of camera position and point of view is universal across text-to-image models. However, how accurately each model interprets and applies camera angles can vary.

Think of camera angles as a shared visual language. Every model understands the vocabulary, but each has its own accent.

🧠 How Different Models Interpret Camera Angles

🔹 FLUX
🟢 Highly accurate and literal
- Strong adherence to camera position and perspective
- Prompt order matters significantly
- Excellent for cinematic, editorial, and realistic compositions
📌 Best practice: Place the camera angle early in the prompt.

🔹 SDXL
🟢 Stable and reliable
- Understands standard camera terminology well
- Extreme angles may be softened
- Performs best when angle and framing are combined
📌 Best practice: Combine camera angle with shot distance.

🔹 Stable Diffusion (SD 1.5 & derivatives)
🟡 Conceptually correct, but more interpretive
- Camera angles influence composition more than literal positioning
- Extreme perspectives can be inconsistent
📌 Best practice: Reinforce angles with descriptive cues.

🔹 Z-Image
🟡🟢 Aesthetic-driven interpretation
- Camera angles are recognized but may be stylized
- Visual mood sometimes takes priority over technical accuracy
📌 Best practice: Pair angles with mood or artistic descriptors.

🔹 Pony & Anime-Based Models
🟡 Stylistic rather than technical
- Camera angles are interpreted as visual feeling
- Perspective is often exaggerated or illustrated
📌 Best practice: Combine camera terms with illustration context.

📌 Universal Prompt Formula (Safe for All Models)

[Camera Angle] + [Shot Distance] + [Subject] + [Environment] + [Mood / Style]

📝 Example:
Low-angle medium shot of a woman in a red blazer, urban background, dramatic lighting, confident mood

⚠️ Important Note: About Prompt Accuracy & Model Interpretation

While camera angles and prompt structures provide strong guidance, no text-to-image model can guarantee 100% literal compliance with every instruction.

This is not a flaw in the prompt, nor an error in understanding camera concepts. It is a natural characteristic of how generative models work.

Why Aren't Prompts Followed 100%?

🔹 Probabilistic Generation
Text-to-image models generate images based on probability, patterns, and learned visual associations, not strict rules. Even clear instructions may be interpreted creatively.

🔹 Concept Prioritization
Models often prioritize dominant elements such as subject, style, or mood over technical details like precise camera placement.

🔹 Training Bias & Dataset Influence
Some camera angles are more common in training data than others. Rare or extreme perspectives may be softened or approximated.

🔹 Model-Specific Interpretation
Each model translates camera terminology differently, ranging from literal (FLUX, SDXL) to stylistic (anime-based models).

What This Means for Creators

✅ Camera prompts should be treated as directional guidance, not rigid commands
✅ Iteration, refinement, and rerolling are normal parts of the creative process
✅ Small adjustments in wording can significantly affect results

📌 In practice, a well-written camera prompt dramatically increases the likelihood of achieving the intended perspective, even if it is not perfectly replicated every time.

🎬 Final Thoughts

Camera position and point of view form a shared visual language across all major text-to-image models. While results may vary, understanding and applying these concepts gives creators far more control than relying on vague descriptions alone.

Think of prompts not as strict instructions, but as directorial cues. The clearer the direction, the closer the model gets to your vision.

Thank you for reading. I hope this article proves useful and helps you gain better control over camera angles and perspective, so you can create images that not only look good, but truly match the vision you have in mind. Happy experimenting, and may your prompts always lead to satisfying results. 🤩✨
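
The universal formula from this article, [Camera Angle] + [Shot Distance] + [Subject] + [Environment] + [Mood / Style], can be sketched as a one-line string builder. This is a minimal sketch, not a tool from any model vendor: `camera_prompt` and `CAMERA_ANGLES` are hypothetical names made up for illustration, and the angle list is just the set covered in this guide written in adjective form.

```python
# Angles covered in this guide, written so they read naturally before a shot distance.
CAMERA_ANGLES = {
    "eye-level",
    "high-angle",
    "low-angle",
    "bird's-eye",
    "worm's-eye",
    "over-the-shoulder",
}


def camera_prompt(angle: str, distance: str, subject: str,
                  environment: str, mood: str) -> str:
    """Build '[angle] [distance] of [subject], [environment], [mood]'.

    Exactly one angle is accepted per prompt, which mirrors the tip about
    not stacking several camera angles, and it is placed first.
    """
    angle = angle.strip().lower()
    if angle not in CAMERA_ANGLES:
        raise ValueError(f"unknown camera angle: {angle!r}")
    return f"{angle.capitalize()} {distance} of {subject}, {environment}, {mood}"


example = camera_prompt(
    angle="low-angle",
    distance="medium shot",
    subject="a woman in a red blazer",
    environment="urban background",
    mood="dramatic lighting, confident mood",
)
print(example)
```

Putting the angle at the front of the string follows the practical tip to mention the camera angle early; whether a given model honors it is still subject to the interpretation differences described above.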
A Quick Guide to the Order of LoRAs

Hello everyone, in this guide I will give a brief explanation of how to use LoRA, especially in terms of the sequence/order of placement.Question: I want to generate a text2image image. In addition to using the Base Model, I will also use several LoRAs. The question is, will the order of the LoRAs affect the image? For example: the first LoRA is Extra Detailer, the second LoRA is Enhanced Lightning, and the third LoRA is Realistic Skin.The short answer: yes, the order of LoRA can have an effect, but it's not always dramatic. Think of LoRA as a transparent layer layered on top of a base painting. The order determines who gets the last word in the overlapping areas. 🎨Let's analyze it carefully.1️⃣ How LoRA works (brief but precise)Most LoRA models work by adding biases or weight modifications to the base model on the same layer.If two LoRA models touch the same area (e.g., facial details, lighting, or skin texture), then:LoRAs loaded later tend to be more dominantEspecially if the weight is similar or quite highIf LoRA touches different areas, the sequence is barely noticeable.2️⃣ Analyze the LoRA example you are asking aboutOrder:Extra DetailerEnhanced LightningRealistic SkinWhat happened?Extra DetailerAdding micro-details, textures, and sharpness. 
This is global in nature.Enhanced LightningChange contrast, highlights, shadows, sometimes color tones.Realistic Skin (last)Usually focuses on the face and skin: pores, softness, color blending.📌 Since Realistic Skin was loaded last:He can override the rough detail effects of Extra Detailer.The skin becomes more natural, not too "crispy"Lighting is still felt, but adjusted to the skin➡️ This is a pretty reasonable order for realistic results.3️⃣ Example if the order is reversedFor example:Realistic SkinExtra DetailerEnhanced LightningThe results can be:The skin is initially realisticThen “sharpened again” by Extra Detailer → pores can be too strongThe final lighting can make the skin look plastic or over-contrast.It's not wrong, but the characters are different.4️⃣ Rules of thumb (that practitioners actually use)Use this principle:Global LoRA first → specific LoRA laterIdeal order:🔹 Style / Detail / Texture LoRA🔹 Lighting / Mood / Color LoRA🔹 Anatomy / Face / Skin / Character LoRAIn other words:The ones you most want to “maintain” the effect of, put last.5️⃣ Important notes that are often forgottenWeight is more influential than orderLoRA weight 1.2 at the beginning can still override LoRA weight 0.6 at the endClip vs UNetIf LoRA targets are different (e.g. one is dominant in CLIP, one in UNet), the order becomes less significant.Sampler & CFGHigh CFG can make LoRA conflicts more visibleConclusion 🎯✔️ The LoRA order does matter, especially if:The effects overlap each otherSimilar weightTouching the face / lighting / details togetherQuestion: If I use the following LoRA settings globally: Face, Detailer, Lighting, Realistic, Correction (hand/finger), Background Detail. And all the weights are the same, say they all use 0.8 (the default weight). What would be a good LoRA order?Answer: Imagine you're putting together a movie scene on a mini-stage. There's a set crew, a lighting designer, a makeup artist, and a final editor. 
They all have the same volume (weight 0.8), so whoever comes in last will have the most say in the final touches. 🎬

Assuming all weights are the same (0.8) and that the LoRAs may touch overlapping areas, this is the most stable and “sane” sequence overall:

🔑 Main principles (in summary)

General → specific → corrective → finishing touches

The LoRAs that shape the world go first; the ones that correct mistakes come later.

✅ Recommended LoRA order

1️⃣ Background details
Reason:
- Most global and least sensitive
- Determines the texture of the environment, architecture, foliage, etc.
- Hardly any conflict with the face or anatomy
It's like building a stage set first.

2️⃣ Lighting
Reason:
- Lighting affects the perception of all objects
- It is better for the lighting to be “accepted” by the face and skin than to be imposed on them afterwards.
The lights are installed before the actor is made up.

3️⃣ Detailer
Reason:
- Increases global sharpness
- If placed too late, it can damage the skin or hands.
This is basic sharpening before the touch of realism.

4️⃣ Realistic
Reason:
- Adjusts natural texture: skin blending, tone, micro-imperfections
- Tames the effects of earlier over-detailing
This is the make-up stage, so the image doesn't look digital.

5️⃣ Face
Reason:
- Very sensitive
- Often adjusts the proportions, eyes, nose, expression
- Must come after lighting & realism for consistency
The main actors are made up after the stage and lights are ready.

6️⃣ Correction (hand/finger)
Reason:
- Corrective in nature
- Often “overrides” other LoRAs
- The safest place is at the end.
The editor comes last, correcting anything strange.

🧾 Final order (summarized)
Background Detail → Lighting → Detailer → Realistic → Face → Hand/Finger Correction

⚠️ Important note (small but valuable gold)
With equal weights (0.8), the effect of this order is clearly noticeable.
- If the hands still don't look right: hand correction can be increased to 0.9–1.0 without messing up the face
- If the face becomes too “cartoonish”: lower the face LoRA to 0.7, don't change the order yet

🎯 Conclusion
✔️ The LoRA order is not about right or wrong, but who has the final say.

Question: I use Flux and SDXL

Answer: For Flux and SDXL, the rules are philosophically similar, but how strongly the LoRA order is felt differs. Think of Flux as fast-flowing, adaptive water, while SDXL is like thick oil paint that leaves brush marks. Therefore, the order needs to be adjusted slightly. 🎨

Below I give a separate optimal sequence for each, assuming:
- All LoRA weights = 0.8
- LoRA types: Face, Detailer, Lighting, Realistic, Hand/finger correction, Background details
- Target: realistic and stable

🔷 FLUX (all variants)
Flux is more “context-respecting,” so the LoRA that comes in last is less brutal, but still impactful.

✅ Best order for FLUX
Background Detail → Lighting → Detailer → Face → Realistic → Hand/Finger Correction

Why is it different?
- Realistic comes after the face: Flux blends more smoothly, so realism can “reset” the face without destroying identity.
- The Detailer should not come too late: if it's too far back, Flux tends to make the skin too rough.

📌 Flux Notes:
- Hand correction in the last position is almost always the safest.
- Flux is relatively tolerant, so this sequence is stable even for full body

🔷 SDXL
SDXL is much more sensitive to the “last word.” The LoRA sequence in SDXL has a more pronounced effect.

✅ Best order for SDXL
Background Detail → Lighting → Detailer → Realistic → Face → Hand/Finger Correction

Why is the face later?
Face LoRAs in SDXL often change:
- the structure
- the expression
- the eye & nose shape
If placed before realism, the result can feel “polished” and lose character.

📌 SDXL Notes:
- Realistic works as a balancer for detail
- The face should “lock identity” almost at the end

🧠 Quick summary
Model: Key sequence
Flux: background → lighting → detailer → face → realistic → correction
SDXL: background → lighting → detailer → realistic → face → correction

⚠️ Hassle-free tuning tips
- Broken hand? 👉 increase the correction LoRA to 0.9–1.0
- Skin too sharp? 👉 lower the Detailer to 0.6–0.7
- Does the face feel “less alive”? 👉 increase the face LoRA to 0.85 without changing the order

🎯 Conclusion
✔️ LoRA order still matters in Flux and SDXL, but:
- Flux = more flexible
- SDXL = more sensitive
✔️ Weight overrides all sequences/orders 👉 Higher weight beats smaller weight for similar or identical LoRA

That's all, folks. I created this guide based on my experience in generating images from text. Since I mostly use the Flux and SDXL/Pony base models, I think it won't be much different for other base models, so feel free to try it yourself.

Final Note: This is just a guide, so it's not absolute; it all comes down to your creativity and imagination.
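If you keep your LoRAs in a list, the two recommended orders can be applied mechanically. Below is a minimal Python sketch of that idea. The category keys, the `order_loras` helper and the LoRA names in the usage example are all hypothetical and only for illustration; how your UI or pipeline actually applies the resulting list is not covered here.

```python
# Recommended category order per base model, taken from the quick summary above.
# The category keys are hypothetical labels chosen for this sketch.
LORA_ORDER = {
    "flux": ["background", "lighting", "detailer", "face", "realistic", "correction"],
    "sdxl": ["background", "lighting", "detailer", "realistic", "face", "correction"],
}


def order_loras(loras, model, weights=None, default_weight=0.8):
    """Return [(lora_name, weight), ...] sorted into the recommended order.

    loras   -- dict mapping a category key to the name of your LoRA
    model   -- "flux" or "sdxl"
    weights -- optional dict of per-category overrides (e.g. {"correction": 1.0})
    """
    order = LORA_ORDER[model.lower()]
    rank = {category: position for position, category in enumerate(order)}

    unknown = [category for category in loras if category not in rank]
    if unknown:
        raise ValueError(f"unknown LoRA categories: {unknown}")

    weights = weights or {}
    return [
        (loras[category], weights.get(category, default_weight))
        for category in sorted(loras, key=rank.__getitem__)
    ]


# Usage example with made-up LoRA names.
my_loras = {
    "face": "my_face_lora",
    "correction": "hand_fix",
    "background": "bg_detail",
    "realistic": "real_skin",
}
sdxl_stack = order_loras(my_loras, "sdxl")
flux_stack = order_loras(my_loras, "flux", weights={"correction": 1.0})
```

Keeping the order in one table makes it easy to swap Face and Realistic when you switch base models, while the weight override covers the "broken hand" tuning tip without touching the order.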
Guide to creating full body images using Flux

Hi everyone, in this guide I will try to explain the correct way to create a full body image using Flux (other models will most likely be able to do this too). You may have experienced the difficulty of creating full-body images, as the results are usually cropped, typically at a 2:3 image ratio (768x1152) and similar variations. One effective way is to change the image ratio to be larger in height and smaller in width (e.g., 800x2048). But actually, that's unnecessary; the real cause is that Flux is very "stubborn."

Why is Flux so “stubborn” about full body?

1️⃣ Data bias: Flux “likes face & torso”
The Flux model is heavily biased towards the face, chest and waist because:
- The training dataset is full of portraits, half-body shots and fashion crops
- Face = high detail = high aesthetic value according to the model
When you write:
    full body shot, wide angle shot
The model reads it, but its internal priority remains the face. So the camera “zooms in” on its own.

2️⃣ The term “wide angle” ≠ camera distance
This is a classic trap.
Prompt → What the model understands
wide angle → lens distortion
full body → intention, not a guarantee
cinematic → lighting & mood
❌ It doesn't mean: the camera is far away
✔️ The model can still use a wide lens but shoot close up

3️⃣ Human = main object → auto crop
For human subjects, Flux automatically:
- Zooms in on the subject
- Sacrifices the legs first
- Focuses on expression & upper-body clothing
That's why the background is rarely "used".

The RIGHT way to force full body in Flux
This isn't one trick.
It's a combination of techniques.

🧠 Main principles
Don't just say "full body". Force it with physical context and framing.

✅ Technique 1: Use a physical anchor (THIS IS IMPORTANT)
The model adheres more to object relations than to camera terms.
❌ Bad:
    full body shot, wide angle
✅ Good:
    standing on the ground, feet visible, head to toe visible, shoes touching the floor
The model cannot easily crop the feet if the prompt says "feet touching the ground".

✅ Technique 2: Use “distance language”
Replace camera language with physical distance.
Effective example:
    camera placed far away, subject small in frame, full height visible
Or:
    long distance shot, entire body visible within the frame

✅ Technique 3: Use “environment dominance”
Make the background more important than the human.
    large environment, vast background, subject occupying small portion of the frame
Flux will “move the camera away”.

✅ Technique 4: Add anti-crop instruction
Flux is quite responsive to explicit prohibitions.
    no cropped body, no half body, no close up
This isn't an official negative prompt, but it's still influential.

🔥 Examples of prompts that are PROVEN to be more compliant
Simple example:
    a young woman standing on a street, full height visible from head to toe, feet clearly visible touching the ground, camera placed far away, subject small in frame, wide environment background, entire body inside the frame, no cropped body, natural daylight
Or a more “Flux-friendly” version:
    long distance shot of a woman, standing upright, head to toe visible, shoes visible on the ground, wide environment, camera far away, subject centered but small in frame

⚠️ Things to AVOID
🚫 Don't just rely on:
- full body shot
- wide angle
- cinematic
- fashion photography
Those are cosmetic prompts, not structural instructions.

The summary 🎯
- Flux naturally crops the human body
- “full body” is not enough
- Object + camera distance + environment relation is key
- Force the model with physical logic, not photographic terms.

👩 FULL BODY TEMPLATE IMAGE OF A WOMAN (Or Man if you want to change it)
Flux is more aggressive in zooming in on women, so this template is tighter on the legs & framing.
Example:
    a young adult woman, standing upright on the ground, full height visible from head to toe, legs fully visible, feet clearly visible touching the ground, entire body fully inside the frame, no cropped body, no half body, no close up, camera placed far away, long distance shot, subject small in frame, wide environment surrounding the subject, wearing a summer dress and flat sandals, full outfit visible including footwear, located in a wide park with trees and open space, background dominates the scene, large scale background, natural body proportions, realistic female anatomy, balanced natural standing pose, soft daylight, high detail, realistic, flux dev style

🧠 Why this template works
- “feet touching the ground” = Flux doesn't dare to cut
- “subject small in frame” = camera moves away slightly
- Environment dominance = background priority > face
- Explicit crop ban = suppresses portrait bias
This isn't magic. It's stubborn negotiation with a model.
😄

🔧 Additional tips (optional but powerful)
If it's still cut off → add:
    full body framed with extra space above head and below feet
For an establishing backdrop:
    vast environment, large scale scene
For safe poses:
    standing straight, arms relaxed at sides

Will "camera placed far away" make the human subject in the photo look too far away / zoomed-out?
Short answer: Maybe yes, maybe not. It depends on the combination, not the phrase alone.
Now let's dissect it with the Flux version of camera logic.

Why does “camera placed far away” feel scary?
Because if it stands alone, Flux reads it as:
- small subject
- dominant environment
- extreme distance
The result could be something like:
🧍‍♂️ ➝ a human dot in an ocean of background

But… in the template above, why is it safe?
Because the phrase is CLAMPED by other instructions.
Safe combination:
    camera placed far away, subject centered, full height visible, natural body proportions
It means:
- the camera backs away ➝ so that the feet are in
- but the subject remains the center of the frame
- it is not an extreme wide establishing shot

How to set a “safe distance” (sweet spot)

🔧 Opt 1: Safe version (RECOMMENDED)
Change far away ➝ moderate distance
    camera placed at a moderate distance, full body visible, subject clearly visible
This is the most stable for Flux.

🔧 Opt 2: Stay “far away” but locked
    camera placed far away, subject clearly visible, subject occupying medium portion of the frame
The phrase "medium portion" holds back excessive zoom-out.

🔧 Opt 3: Framing language, not distance
Flux is more compliant with this:
    full body framed with space above the head and below the feet
Without mentioning the distance at all.

Quick table: distance word effects in Flux
camera placed far away ➝ safe for feet, risk of subject too small
moderate distance ➝ most balanced
long distance shot ➝ dominant environment
small subject in frame ➝ very far away
medium subject in frame ➝ clear full body

My final recommendation 🎯
For 90% of human full body cases:
    camera placed at a moderate distance, full body framed from head to toe, subject centered and clearly visible
It's far enough for feet, close enough for details.

Conclusion:
❌ “far away” is not poison
⚠️ But it must be clamped
✅ “moderate distance + framing language” = most stable
🧠 Think of Flux like a stubborn photographer, not a passive camera.

Below are 3 camera distance templates. I wrote the core template and then a ready-to-use example so it's easy to copy.

🧍‍♀️ CLOSE FULL BODY
Impression: dominant subject, clear clothing details, secure feet
    full body framed tightly from head to toe, subject occupying large portion of the frame, camera placed at a close but full-body distance, space visible above the head and below the feet, entire body inside the frame, feet clearly visible touching the ground, no cropped body, no close up

🧍 MEDIUM FULL BODY
Impression: most balanced, safe for 80% of use cases
    full body clearly visible from head to toe, subject occupying medium portion of the frame, camera placed at a moderate distance, balanced framing, wide environment visible but not dominant, feet clearly visible touching the ground, no cropped body

🌆 WIDE / ESTABLISHING SHOT (BUT FULL BODY IS SAFE)
Impression: strong location, small but intact human
    establishing shot, wide environment dominating the scene, subject occupying small portion of the frame, full body visible from head to toe, entire body inside the frame, feet visible touching the ground, subject clearly identifiable

Important for Flux:
- Don't mix "subject small in frame" with close full body shots.
- Always repeat head to toe + feet touching ground
- For women, repeating the legs 2× is normal

Should the framing prompt go at the beginning of the prompt, before the subject and details, or at the end of the prompt, after the subject and other details?
Answer: Place it at the BEGINNING of the prompt. NOT at the end. NOT in the middle. IN FRONT.

Now a really useful explanation for Flux

How Flux “reads” prompts (practical version)
Flux doesn't read like a human reads a sentence.
It works like this:
- Beginning = framing & composition
- Middle = subject & pose
- End = cosmetics (style, lighting, mood)
It means: What you write in the first 25–30% of the prompt determines the camera. If “full body” and “feet visible” appear later, Flux has often already “decided” on the portrait framing.

IDEAL sequence for full body (must follow)
[1] Framing & Camera (FRONT)
[2] Main subject (human)
[3] Pose & physical
[4] Clothing & visual details
[5] Environment & background
[6] Lighting & style

Hard rules (brief but important)
✅ MANDATORY at the beginning
- full body / head to toe
- feet touching ground
- framing (close / medium / wide)
- subject portion in frame
- no cropped body
⚠️ CAN be in the middle
- pose
- gender
- age
- expression
🎨 AT THE END
- cinematic
- fashion
- street
- lighting
- mood
- realism

Why does it often fail if it is placed at the end?
Because Flux:
- Determines the crop early
- Doesn't “redo” the framing unless you force it.
- Trusts initial instructions more than final revisions

Conclusion 🎯
- Framing = front
- Detail = back
- Full body fails → almost always because the framing comes too late

Final note: This is just a guide; you don't need to copy my prompts 100%. You can modify them to your own taste as long as you stick to the prompt placement rules. ✌👍👌

That's all, folks. Hopefully, this guide helps you create text2image with Flux, especially full-body images, with satisfying results. 😁✨🤩
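To make the placement rules easy to reuse, here is a minimal Python sketch that assembles a prompt in the six-slot order above and puts one of the three camera distance templates in front. The slot names, the `build_prompt` and `first_position` helpers, and measuring "the first 25–30%" by character count are all assumptions made for this illustration, not anything Flux itself defines.

```python
# Slot order from the "IDEAL sequence" above: framing first, style last.
# The slot names are hypothetical labels chosen for this sketch.
SLOT_ORDER = ["framing", "subject", "pose", "clothing", "environment", "style"]

# The three camera distance templates from this guide, copied as written.
FRAMING = {
    "close": (
        "full body framed tightly from head to toe, subject occupying large portion "
        "of the frame, camera placed at a close but full-body distance, space visible "
        "above the head and below the feet, entire body inside the frame, feet clearly "
        "visible touching the ground, no cropped body, no close up"
    ),
    "medium": (
        "full body clearly visible from head to toe, subject occupying medium portion "
        "of the frame, camera placed at a moderate distance, balanced framing, wide "
        "environment visible but not dominant, feet clearly visible touching the "
        "ground, no cropped body"
    ),
    "wide": (
        "establishing shot, wide environment dominating the scene, subject occupying "
        "small portion of the frame, full body visible from head to toe, entire body "
        "inside the frame, feet visible touching the ground, subject clearly identifiable"
    ),
}


def build_prompt(distance, **slots):
    """Join the slots in SLOT_ORDER, with the framing template always in front."""
    parts = {"framing": FRAMING[distance], **slots}
    unknown = set(parts) - set(SLOT_ORDER)
    if unknown:
        raise ValueError(f"unknown slots: {sorted(unknown)}")
    return ", ".join(parts[slot] for slot in SLOT_ORDER if parts.get(slot))


def first_position(prompt, phrase="head to toe"):
    """Relative position (0.0 to 1.0) where the phrase first appears, or None."""
    index = prompt.find(phrase)
    return None if index < 0 else index / len(prompt)


# Usage example: keyword arguments can be given in any order.
prompt = build_prompt(
    "medium",
    style="soft daylight, realistic",
    subject="a young adult woman",
    pose="standing straight, arms relaxed at sides",
    clothing="wearing a summer dress and flat sandals",
    environment="located in a wide park with trees and open space",
)
```

Because the order lives in `SLOT_ORDER`, the framing always ends up in front no matter how the arguments are passed, and `first_position` gives a quick way to check that "head to toe" still falls in the early part of the prompt after you edit it.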