Tensor.Art

Creation: Get started with Stable Diffusion!

ComfyFlow: ComfyUI's amazing experience!

Host My Model: Share my models, get more attention!

Online Training: Make LoRA training easier!
Flux dev x Tensor - 281K
FLUX.1 TXT2IMG - Advanced - 170K
Family Guy Vibe - 11K
FREE Clay Filter v2.0 - 1.8K
EK-Ink Art Maker [SD3] - 172
2 images Mixer (IPAdapter) - 4.7K
Ai Logo Fast Designer - 5.2K
(SDXL) From Sketch and Reference Image - 867
🔄Interior Style Change Design🛋️🛏️🛁🪑🚪🪟 - 153
🌺Flux1-dev+Upscale (Ver. kei)🌺 - 4.6K
EK Realistic Auto Generator [XL] - 122
LineArt Anything: Tattoo AI Tool - 857
Photo To Disney Style - 1.4K
Sintetico Cityscape 2.0 - 849
HHM Styler (XL) - 323
Your Face on Sticker! - Easy 3D Sticker Maker + Face Swap (High Quality) - 1.9K
Wabisabi Interior Design from Architech1904 - 896
Comic style Wallpaper 2.0 - 2K
EK Art-Full Illustration [SD3] - 272
In-between Animation Frame Generation (中割り動画生成) - 152
Fantasy Vision SD3 - 326
FLUX 2D Platform Game Texture - 189
Your pet🐶🐱 in the animation (Don't hesitate to get your pet tattooed on your body) - 920



Articles

Flux Ultimate's Custom Txt 2 Vid Tensor Workflow

Welcome to Dream Diffusion FLUX ULTIMATE, TXT 2 VID, with its own custom workflow made for Tensor.Art's Comfy workspace. The workflow can be downloaded on this page. Enjoy!

This is a second-stage trained checkpoint, the successor to FLUX HYPER. Just when you think you had it nailed in the last version, you notice a 10% margin that could still be trained; well, that is what happened. This version has even more font styles, better prompt adherence, sharper image clarity, and a better grasp of anime, water painting, and similar styles. It uses the same setting parameters as Flux Hyper.

Prompt example: Logo in neon lights, 3D, colorful, modern, glossy, neon background, with a huge explosion of fire with epic effects, the text reads "FLUX ULTIMATE, GAME CHANGER".

Steps: 20
Sampler: DPM++ 2M or Euler gives the best results
Scheduler: Simple
Denoise: 1.00
Image size: 576 x 1024 or 1024 x 576. You can choose any size, but this model is optimized for faster rendering at those sizes.

Download the links below and save the files to your Comfy folders:
Comfy workflow: https://openart.ai/workflows/maitruclam/comfyui-workflow-for-flux-simple/iuRdGnfzmTbOOzONIiVV
VAE: download it to your VAE folder inside your models folder, from https://huggingface.co/black-forest-labs/FLUX.1-schnell/tree/main/vae
CLIP: download clip_l.safetensors and t5xxl_fp8_e4m3fn.safetensors and save them to your clip folder inside your models folder, from https://huggingface.co/comfyanonymous/flux_text_encoders/tree/main

If you have any questions or issues, feel free to drop a comment below and I will get back to you as soon as I can. Enjoy - DICE
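For readers who want to try the recommended settings outside ComfyUI, here is a minimal sketch using the diffusers FluxPipeline. It is not the author's workflow: the repo id below is the base FLUX.1-dev model (the Flux Ultimate checkpoint itself is hosted on Tensor.Art), and the guidance value is an assumption, since the article does not state one.

```python
import torch
from diffusers import FluxPipeline

# Base FLUX.1-dev weights as a stand-in for the "Flux Ultimate" checkpoint.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

image = pipe(
    prompt=('Logo in neon lights, 3D, colorful, modern, glossy, neon background, '
            'with a huge explosion of fire with epic effects, '
            'the text reads "FLUX ULTIMATE, GAME CHANGER"'),
    num_inference_steps=20,   # the article recommends 20 steps
    width=576, height=1024,   # one of the two sizes the model is tuned for
    guidance_scale=3.5,       # assumption: a typical FLUX.1-dev value, not from the article
).images[0]
image.save("flux_ultimate_logo.png")
```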
An examination of the effect of Denoise on Flux Img2Img with LoRA: a journey from Boat to Campervan

I made an AI Tool yesterday (FLUX IMG2IMG + LORAS + UPSCALE + CHOICE | ComfyUI Workflow | Tensor.Art) that lets you combine up to three LoRAs and upscale. It has model switching so you can enable 0, 1, 2, or 3 of the available LoRA inputs, set the weighting one by one, and swap the base Flux model and all the LoRAs for your own preferences. I implemented radio-button prompting so the main trigger words for the LoRAs I use most often are already behind the buttons, and you can use "Custom" to add your own prompt or triggers into the mix.

For this test I used a 6K Adobe Stock licensed image of a boat on the beach, with the model switcher set to "2" to prevent any bleed from other LoRAs in the tool. Everything is upscaled by 4x-UltraSharp at a factor of 2 (the tool sizes your longest edge to 1024 as it processes, so you end up with a 2048-pixel final image ready for Facebook's servers). The original input image was downsized for this article.

The first test was simply putting it through the AI Tool on the base Flux model: no denoise, no LoRA at all. Then I added the LoRA "TQ - Flux Frozen" by @TracQuoc at 0.9 weight with 0.25 denoise.

0.50: subtle changes; a signature has appeared, the boat is starting to change in places, and writing appears on its side.
0.60: the boat is adapting more and the beach is changing a lot.
0.65: dramatic changes as the boat starts to develop wheels; it is almost as if the AI has a plan for this one.
0.70: the second boat has disappeared altogether, the whole boat is on a trailer, and the beach is turning into grassland.

From here I stepped in 0.01 increments, because with Flux all the drama normally happens between 0.7 and 0.8:

0.71 and 0.72: the boat is definitely changing its shape, and you can start to see snow.
0.73: it is becoming a land-based vehicle.
0.74: it feels like a towing caravan/trailer.
0.75: more detail in the towing section.
0.76: everything changes and suddenly we have some kind of safari land cruiser.
0.77: now it is a camper van with a pop-up roof.
0.78: some more camper-style detailing, but nothing dramatic.
0.79: almost no resemblance to the original scene except the sky and colours.
0.80: I can't see much change here.

Back up in 0.05 increments:

0.85: the Frozen world has taken over, although it still keeps some of the style and colour feel of the original.
0.90: it is all gone (the tool ignored inputs over 0.9 and clamped them back to 0.9).

I hope you found this a useful experiment that saves you time and coins when playing with img2img and denoise. You can check out all my AI Tools and LoRAs on my profile: KurtC PhotoEd. Let me know if you enjoyed this and I might make some more (this was my first one).
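If you want to reproduce this kind of sweep outside the AI Tool, here is a rough sketch using the diffusers img2img API, assuming a diffusers release recent enough to route FLUX checkpoints to an img2img pipeline. The model id, LoRA path, prompt, and strength list are placeholders, not the author's exact setup; in diffusers the article's "denoise" corresponds to the img2img strength parameter.

```python
import torch
from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image

pipe = AutoPipelineForImage2Image.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
# pipe.load_lora_weights("path/to/frozen_style_lora.safetensors")  # hypothetical LoRA file

source = load_image("boat_on_beach.jpg").resize((1024, 768))

for strength in [0.25, 0.5, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9]:
    generator = torch.Generator("cuda").manual_seed(42)  # fixed seed so only denoise varies
    out = pipe(prompt="frozen winter scene", image=source,
               strength=strength, generator=generator).images[0]
    out.save(f"denoise_{strength:.2f}.png")
```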
Radio Buttons are awesome in AI Tools: [How to set-up guide]

Dear Tensorians,

Thanks to the newly implemented radio buttons for AI Tools, the tools are now much more fun to use. Since I am the one who insisted on this feature, and more importantly on the button settings import/export, here is an easy tutorial for beginners. 🤗

https://tensor.art/template/765762196352358016

This is an example AI Tool using radio buttons; you can see them on the right. The cool thing about a radio-button GUI is that you no longer have to remember or re-type all those crazy prompt words. You can store them behind the buttons and just click. Especially if you use a wide range of prompting styles, as most users do, you cannot remember them all; I bet you already keep a backup memo file for your special prompts. With this kind of AI Tool, the prompts are stored online, next to you, all the time. You can click the buttons and generate various images whenever you want, even while driving (just kidding, don't ever do that). Of course you can add extra prompt text on top of the buttons: click the "custom" button and you can always type more.

To create radio buttons, click Edit in the "..." menu to open the EDIT page of the AI Tool. In the middle of the page you will see the user-configurable settings. Clicking "Add" lets you choose your AI interface, and clicking "edit" in the prompt text box opens the radio-button options page. From scratch, you can choose the pre-defined groups and buttons, and you can also add your own: give each button a name and a content string, where the content is the piece of prompt to be inserted in that button's place. When you are done with the button settings, click "confirm" and then "publish" your AI Tool, and you will see the radio buttons in it.

(Note that certain prompt text-box nodes in ComfyUI cannot be edited into buttons; basic text prompt nodes and several others can. Check this after you publish your workflow as a tool, and if it does not support radio buttons, switch to a different prompt text node.)

Whenever you update the workflow behind an AI Tool, the entire tool UI is reset to none. Yes, it was a real headache at the beginning, but now we have an import/export button for the radio buttons (thank goodness). Also, when editing button groups, you might first pick only part of the six or seven groups (for example "character settings" and "role"), add some nice buttons, and later decide to add another group such as "style"; if you press Add for that new group, your previous button data is gone and you restart from the beginning. Be very careful (you will understand what I mean when it happens).

Before updating your workflow, export the radio-button settings as a JSON file; you can import it back later whenever you want. More importantly, you can edit the radio buttons in an ordinary editor (such as MS Visual Studio) for easier copy and paste from existing files. Trust me, this saves an enormous amount of time compared to remaking all those buttons every time the workflow is modified.

Sometimes you will want to edit an existing button JSON file for another AI Tool. Editing JSON is not exactly entertaining work, but it is much better than rebuilding the whole set of buttons in the GUI. Find the place to edit in the JSON file and change it very carefully: the JSON syntax is not very editor-friendly and it is error-prone, but you will get used to it through trial and error. The editor's "find" command is always useful for locating the button you want. You will discover more interesting things while working with the button JSON files; I will leave those for your own pleasant surprise.

I shared the JSON file for this AI Tool in the comfy chatroom on Discord; feel free to use it. I hope this article helps you build radio-button UIs more easily. Enjoy! 🤗😉
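Since the export is plain JSON, bulk edits can be scripted instead of done button by button in an editor. The structure below ("groups" containing "buttons" with "name"/"content" fields) is a hypothetical illustration, not the exact Tensor.Art schema; open your own export to see the real field names and adapt the keys.

```python
import json

# Load an exported radio-button settings file (hypothetical structure).
with open("radio_buttons_export.json", encoding="utf-8") as f:
    data = json.load(f)

# Append a phrase to the prompt stored behind one particular button.
for group in data.get("groups", []):
    for button in group.get("buttons", []):
        if button.get("name") == "Cinematic":
            button["content"] += ", dramatic rim lighting"

with open("radio_buttons_edited.json", "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False, indent=2)
```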
Mastering FLUX Prompt Engineering: A Practical Guide with Tools and Examples

FLUX AI Tools:
https://tensor.art/template/768387980443488839
https://tensor.art/template/759877391077124092
https://tensor.art/template/761803391851647087
https://tensor.art/template/763734477867329638

FLUX Prompt Tool:
https://chatgpt.com/g/g-NLx886UZW-flux-prompt-pro

Although I am doing my best to optimize my AI prompt-generation tool, I am currently facing malicious negative reviews from competitors. If you have suggestions for improvement, please share them and I will do my best to make the necessary optimizations; however, please refrain from unfair ratings, as it really discourages my creative efforts. If you find this GPT helpful, please give it a fair rating. Thank you.

AI-generated images are revolutionizing the creative landscape, and mastering the art of prompt engineering is crucial for creating visually stunning outputs with models like FLUX. This guide provides practical steps and examples, and introduces a specialized tool to help you craft the perfect prompts for FLUX.

1. Start with descriptive adjectives
The foundation of any good prompt lies in the details. Descriptive adjectives are essential for guiding the AI toward the nuances you want. For instance, instead of a simple "cityscape", specify "a bustling, neon-lit cityscape at dusk with reflections on wet asphalt". This level of detail helps FLUX understand the atmosphere and mood you are aiming for, leading to richer, more visually engaging results.

2. Integrate specific themes and styles
Incorporating themes or art styles can significantly shape the output. For example, combine cyberpunk elements with classic art references: "a cyberpunk city with Baroque architectural details, under a sky filled with digital rain". Blending styles lets FLUX draw from several visual traditions and produces a unique, layered image.

3. Use technical specifications
Beyond adjectives and themes, technical aspects such as lighting, perspective, and camera angles add depth to your images. Prompts like "soft, diffused lighting" or "extreme close-up with shallow depth of field" control how FLUX renders the scene. These details can make a significant difference, turning a simple image into a masterpiece by shaping light and shadow and focusing attention where it matters most.

4. Combine multiple elements
To achieve a more complex and detailed output, combine several of the strategies above in a single prompt, for example: "A close-up shot of a futuristic warrior standing on a neon-lit street, wearing cyberpunk armor with glowing accents, under a sky filled with dark clouds and lightning." This merges detailed description, stylistic choices, and technical elements into one vivid, engaging scene. (Magai)

5. Experiment and iterate
Prompt engineering is an iterative process. Start with a basic idea and refine it based on what FLUX generates. If the initial output is not what you expected, adjust the adjectives, tweak the themes, or alter the technical specifications. Continuous refinement is the key to mastering prompt engineering. (Hostinger)

6. Use the FLUX Prompt Pro tool
If you find it challenging to craft precise prompts, or you want to speed up your process, try the FLUX Prompt Pro tool. It is designed to generate accurate English prompts specifically for the FLUX model: you input a basic idea and the tool helps flesh out the details, ensuring your prompts are clear and comprehensive. It is an excellent way to enhance your creative process and achieve better results faster. Try it here: https://chatgpt.com/g/g-NLx886UZW-flux-prompt-pro

7. Practical example
Basic idea: a futuristic city.
Refined prompt: "A wide-angle shot of a neon-lit, futuristic city at night, with towering skyscrapers reflecting in rain-soaked streets, cyberpunk style, featuring soft backlighting from holographic billboards, and a lone figure in a trench coat standing on a rooftop."
This prompt uses descriptive adjectives, specific themes, and technical specifications, and combines multiple elements to create a detailed, dynamic image. By following these steps, you can consistently produce high-quality visuals with FLUX.

Conclusion
Mastering FLUX prompt engineering means blending creativity with precision. By leveraging descriptive language, specific themes, and technical details, and by iterating on your prompts, you can unlock the full potential of FLUX to generate stunning, personalized images. Don't forget the FLUX Prompt Pro tool to streamline your process and achieve even better results. Keep experimenting, stay curious, and enjoy creating!

If you enjoy listening to great music while creating AI-generated art, I highly recommend subscribing to my SUNO AI music channel; I believe it will help ignite your inspiration and creativity. I will be regularly updating it with new AI-generated music, and suggestions for styles you would like to hear are welcome. Here are my AI music channel and featured playlists:
Lo-fi music: https://suno.com/playlist/e1087fe1-950a-448b-94f4-ddb17ccf84d0
FuturEvoLab AI music: https://suno.com/@futurevolab
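The layering recipe above (subject, details, style, lighting, camera) is easy to automate. Here is a tiny helper that assembles those layers into one FLUX prompt; the function name and the example values are illustrative, not part of any tool mentioned in the article.

```python
def build_flux_prompt(subject, details=(), style=None, lighting=None, camera=None):
    """Assemble a FLUX prompt from the layers described in the article."""
    parts = [subject]
    parts.extend(details)
    for extra in (style, lighting, camera):
        if extra:
            parts.append(extra)
    return ", ".join(parts)

prompt = build_flux_prompt(
    subject="a wide-angle shot of a neon-lit, futuristic city at night",
    details=["towering skyscrapers reflecting in rain-soaked streets",
             "a lone figure in a trench coat standing on a rooftop"],
    style="cyberpunk style",
    lighting="soft backlighting from holographic billboards",
    camera="shallow depth of field",
)
print(prompt)
```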
Hunyuan-DiT: Recommendations

Review

Hello everyone! I want to share some of my impressions of the Chinese model Hunyuan-DiT from Tencent. First, some background so we (Westerners) can see what it is meant for.

Hunyuan-DiT works well in multi-modal dialogue with users (mainly in Chinese and English): the better you explain your prompt, the better your generation will be. You don't need to use only keywords, although it understands them quite well. In terms of quality, HYDiT 1.2 sits between SDXL and SD3: it is not as powerful as SD3, but it beats SDXL at almost everything; for me it is what SDXL should have been in the first place. One of the best parts is that Hunyuan-DiT is compatible with almost the entire SDXL node suite.

Hunyuan-DiT v1.2 was trained with 1.5B parameters; its mT5 text encoder has 1.6B parameters.
Recommended VAE: sdxl-vae-fp16-fix
Recommended samplers: ddpm, ddim, or dpmms

Prompt as you would in SD1.5: don't be shy, and go further in terms of length. Hunyuan-DiT combines two text encoders, a bilingual CLIP and a multilingual T5, to improve language understanding and increase the context length. It splits your prompt into meaningful IDs and then processes the whole prompt; the limit is 100 IDs or 256 tokens. T5 works well on a variety of tasks out of the box by prepending a task-specific prefix to the input.

To improve your prompt, place your condensed prompt in the CLIP text-encoder node box (if you disabled T5), or your extended prompt in the T5 text-encoder node box (if you enabled T5). You can use the "simple" text-encode node to use only one prompt, or the regular one to pass different text to CLIP and T5.

The worst part is that the model only benefits from moderate (high for Tensor.Art) step counts: 40 steps is the baseline in most cases.

ComfyUI (ComfyFlow) example: Tensor.Art has added all the elements needed to build a good flow; you should try it too.

Additional
What can we do under the open-source plan? (link)
Official info for LoRA training (link)

References
Analysis of Hunyuan-DiT | https://arxiv.org/html/2405.08748v1
Learn more about T5 | https://huggingface.co/docs/transformers/en/model_doc/t5
How CLIP and T5 work together | https://arxiv.org/pdf/2205.11487
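For reference, here is a minimal sketch of these recommendations using the diffusers port of Hunyuan-DiT rather than ComfyUI. The repo id is, to my knowledge, the official v1.2 diffusers release; the prompt and the guidance value are assumptions for illustration only.

```python
import torch
from diffusers import HunyuanDiTPipeline

pipe = HunyuanDiTPipeline.from_pretrained(
    "Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="a fisherman's wooden boat resting on a misty lake at dawn, soft light, detailed",
    negative_prompt="lowres, bad anatomy, watermark",
    num_inference_steps=40,   # the article's baseline: the model needs ~40 steps to shine
    guidance_scale=5.0,       # assumption: a middle-of-the-road CFG, not specified in the text
).images[0]
image.save("hunyuan_dit_demo.png")
```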
Unlock the Power of Detailed Beauty with TQ-HunYuan-More-Beautiful-Detail v1.7

In the world of digital artistry, achieving the perfect blend of intricate details and stunning visuals can be a game-changer. That is where our latest model, TQ-HunYuan-More-Beautiful-Detail v1.7, comes into play. Designed with precision and a keen eye for aesthetics, this model is your go-to solution for elevating your artwork to new heights.

What is TQ-HunYuan-More-Beautiful-Detail v1.7?
It is a LoRA (Low-Rank Adaptation) model created to enhance the finer details in your digital creations. Whether you are working on portraits, landscapes, or abstract designs, it brings every nuance and subtlety to life with extraordinary clarity and beauty.

Why choose it?
Unmatched detail enhancement: as the name suggests, the model excels at adding more beautiful details to your artwork. It meticulously enhances textures, refines edges, and highlights intricate patterns, making your creations visually striking.
Versatility across genres: no matter the style or genre, from hyper-realistic portraits to fantastical landscapes, it adapts seamlessly and enhances every element with precision.
User-friendly integration: designed for ease of use and compatible with various platforms and software, it lets artists of all levels harness its power without a steep learning curve.
Boost your creativity: by handling the intricate details, the model frees up your creative energy. Focus on the broader aspects of your work while it takes care of the fine-tuning, resulting in a harmonious, polished final piece.

How to get started
Visit this link to access the model, download it, integrate it into your preferred digital art software, and watch your creations transform with enhanced detail and breathtaking beauty. Ready to take your art to the next level? Download TQ-HunYuan-More-Beautiful-Detail v1.7 now and start creating masterpieces with more beautiful detail than ever before.
SD3 - 3D lettering designer

SD3 understands prompts better than SDXL. You can use this to create interesting 3D lettering; for that purpose, use this workflow. You can use a gradient as the background, or any image you like. Have fun!

Link to workflow: SD3 - 3D lettering designer | ComfyUI Workflow | Tensor.Art
Realistic Vision SD3

Realistic Vision

I am excited to present my latest realistic checkpoint model based on SD3 Medium. It has undergone over 100k training steps, ensuring high-quality output.

About this model: this is a photorealistic model, capable of generating photorealistic images. No trigger words are needed. It is designed to produce high-detail, high-resolution images that closely mimic real-life photographs.

Configuration used for training:
GPU: 2x A6000
Dataset: a mix of 5k stock photos and my own dataset
Batch size: 8
Optimizer: AdamW
Scheduler: cosine with restarts
Learning rate: 1e-05
Epochs: target of 300
Captioning: WD14 and BLIP mix

Quick guide and parameters:
CLIP encoder: not required
VAE: not required
Sampler: dpmpp_2m
Scheduler: sgm_uniform
Sampling steps: 25+
CFG scale: 3+

For better results, try ComfyUI; here is a workflow that is low-cost and efficient. Upscaling is currently not possible for specific reasons; I have reported the issue to the TA team and hopefully it will be fixed soon.

Aspect ratios for the demo:
1:1 [1024x1024 square]
8:5 [1216x768 landscape]
4:3 [1152x896 landscape]
3:2 [1216x832 landscape]
7:5 [1176x840 landscape]
16:9 [1344x768 landscape]
21:9 [1536x640 landscape]
19:9 [1472x704 landscape]
3:4 [896x1152 portrait]
2:3 [832x1216 portrait]
5:7 [840x1176 portrait]
9:16 [768x1344 portrait]
9:21 [640x1536 portrait]
5:8 [768x1216 portrait]
9:19 [704x1472 portrait]

Important: do not include NSFW-related/mature words or censored words in your prompt. Doing so may produce unreliable or undesirable results.

Note: this is not a merged or modified model; it is the original Realistic Vision fine-tuned model. Some users have been spreading incorrect information in the model's comment section. If you have any questions or want to know more, join my Discord server or share your thoughts in the comments. Thank you for your time.
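Here is a minimal text-to-image sketch that applies the quick-guide settings with the diffusers StableDiffusion3Pipeline instead of the on-site generator. The model path is a placeholder for wherever the fine-tuned checkpoint is hosted, the prompt is invented, and note that the sampler/scheduler names in the guide (dpmpp_2m, sgm_uniform) are ComfyUI terms; diffusers uses its own default SD3 scheduler.

```python
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "path/or/repo-of/realistic-vision-sd3",   # hypothetical location of the checkpoint
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="candid street portrait of an elderly fisherman, overcast light, 85mm photo",
    num_inference_steps=28,   # "Sampling steps: 25+"
    guidance_scale=3.5,       # "CFG scale: 3+"
    width=832, height=1216,   # the 2:3 portrait size from the aspect-ratio list
).images[0]
image.save("realistic_vision_sd3.png")
```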
SDG - HunyuanDiT loras released

HunyuanDiT - Perfect cute anime
https://tensor.art/models/755812883138538240?source_id=nz-ypFjjk0C7pPcibn708xQi
Enhances character appearance details, eyes, hair, colors, and drawings in an anime style.

HunyuanDiT - Realistic details
https://tensor.art/models/755789054659947864/HunyuanDiT-Realistic-details-V1
Adds more realistic details to images.

HunyuanDiT - Vivid color
https://tensor.art/models/755810413532312715?source_id=nz-ypFjjk0C7pPcibn708xQi
Enhances vivid colors and details in photos.

Hunyuan - Beauty Portrait
https://tensor.art/models/755789995257798458?source_id=nz-ypFjjk0C7pPcibn708xQi
Portraits with more detail in hair, skin, and more.
Hunyuan model online training tutorial

Today I will show you how to use Tensor.Art to train a Hunyuan model online.

Step 1: Open "Online Training".

On the left you will see the dataset window, which is empty by default. You can upload images to create a dataset, or upload a dataset zip file. The zip may include annotation files in the same format as kohya-ss, where each image file has a text annotation file of the same name.

In the model theme section on the right you can choose from options such as anime characters, real people, 2.5D, standard, and custom. Here we select "Base" and choose the Hunyuan model as the base model.

For the base-mode parameters, we recommend setting the number of repetitions per image to 4 and the number of epochs to 16.

After uploading a processed dataset: if your captions already include a character name, you do not need to specify a trigger word. Otherwise, give your model a simple trigger word, such as a character name or style name. Then select an annotation file from the dataset to use as the preview prompt.

If you want to use Professional Mode, click the button in the top-right corner to switch.

In Professional Mode it is recommended to double the learning rate and use the cosine_with_restarts learning-rate scheduler. For the optimizer, you can choose AdamW8bit. Enable label shuffling and keep the first token unchanged (especially if a character-name trigger word is the first token). Disable the noise-offset feature, and set the convolution DIM to 8 and Alpha to 1. In the sample settings, add the negative prompts, and then you can start training.

In the training queue you can view the current loss chart and the four sample images generated for each epoch. Finally, choose the epoch with the best results and download it to your local machine or publish it directly on Tensor.Art. After a few minutes, your model will be deployed and ready.
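As a checklist, here are the recommended Professional Mode settings collected into a plain Python dict. The field names are descriptive labels, not the exact keys used by Tensor.Art or kohya-ss, and the negative prompt is only an example value.

```python
# Recommended online-training preset from the tutorial above (descriptive keys only).
hunyuan_lora_preset = {
    "repeats_per_image": 4,
    "epochs": 16,
    "learning_rate": "2x the base-model default",   # "double the learning rate"
    "lr_scheduler": "cosine_with_restarts",
    "optimizer": "AdamW8bit",
    "shuffle_labels": True,
    "keep_first_token": 1,        # protect a trigger word placed at the start of captions
    "noise_offset": 0.0,          # the tutorial says to disable noise offset
    "conv_dim": 8,
    "conv_alpha": 1,
    "sample_negative_prompt": "lowres, bad anatomy, watermark",  # example value
}
```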
Online Training SD3 Model Tutorial

Today I will show you how to use Tensor.Art to train an SD3 model online.

Step 1: Open "Online Training".

On the left you will see the dataset window, which is empty by default. You can upload images to create a dataset, or upload a dataset zip file. The zip may include annotation files in the same format as kohya-ss, where each image file has a text annotation file of the same name.

In the model theme section on the right you can choose from options such as anime characters, real people, 2.5D, standard, and custom. Here we select "Base" and choose the SD3 model as the base model.

For the base-mode parameters, we recommend setting the number of repetitions per image to 4 and the number of epochs to 16.

After uploading a processed dataset: if your captions already include a character name, you do not need to specify a trigger word. Otherwise, give your model a simple trigger word, such as a character name or style name. Then select an annotation file from the dataset to use as the preview prompt.

If you want to use Professional Mode, click the button in the top-right corner to switch.

In Professional Mode it is recommended to double the learning rate and use the cosine_with_restarts learning-rate scheduler. For the optimizer, you can choose AdamW8bit. Enable label shuffling and keep the first token unchanged (especially if a character-name trigger word is the first token). Disable the noise-offset feature, and set the convolution DIM to 8 and Alpha to 1. In the sample settings, add the negative prompts, and then you can start training.

In the training queue you can view the current loss chart and the four sample images generated for each epoch. Finally, choose the epoch with the best results and download it to your local machine or publish it directly on Tensor.Art. After a few minutes, your model will be deployed and ready.
How to Use Hunyuan DiT Online Training

First, click the avatar in the top-right corner and choose "My trained models" from the drop-down menu to enter the training center. If you have trained models before, you will see many training tasks here. Then click the Online Training button to start a new training run.

On the left is the dataset window, empty by default. You can upload images as a dataset, or upload a dataset zip archive; the archive may contain annotation files in the same format as kohya-ss, with one .txt annotation file per image, sharing the image's name.

In the model theme panel on the right you can choose anime characters, real people, 2.5D, standard, or custom. To train a Hunyuan model, we choose Standard here and select the Hunyuan 1.2 model as the base model. The Hunyuan model uses 40-depth blocks, so it is very large, trains relatively slowly, and needs a higher learning rate: the default is 4e-4, with a default of 5 repetitions per image and the AdamW optimizer.

In basic mode, the recommended parameters are 5 repetitions per image and 16 epochs.

After uploading a processed dataset: if your captions include a character name, you can skip the trigger word; otherwise give your model a simple trigger word, such as a character name or style name. Then pick one annotation file from the dataset as the preview prompt.

If you want Professional Mode, use the button in the top-right corner to switch. In Professional Mode it is recommended to double the learning rate, use the cosine_with_restarts scheduler, and choose AdamW or AdamW8bit as the optimizer. Enable caption shuffling and keep the first token fixed (if a character-name trigger word comes first). Disable noise offset; the convolution DIM and Alpha can be set to 8 and 1. Add negative prompts in the sample-image settings, and then you can start training.

In the training queue you can see the current loss chart and the four sample images produced each epoch. Finally, choose the best epoch and download it locally or publish it directly on Tensor.Art.
SD3 - composition repair

SD3 can generate interesting images, but it has a huge problem with the human body. However, I noticed that simply reducing the image size to 60% can, in most cases, eliminate issues with composition as well as extra hands or legs. This workflow does not solve the six-fingers problem, etc. :)

Base model: https://tensor.art/models/751330255836302856/Aderek-SD3-v1 or https://civitai.com/models/600179/aderek-sd3

Look at the image below. You might say, "Hey, nothing's wrong here." Well, that's because you're already seeing the generation based on the reduced size; below it, you have the original image. Turn composition on to use this trick.

Have fun!

Support Paweł Tomczuk on Ko-fi: ko-fi.com/aderek514, where creators get support from fans through donations, memberships, and shop sales. Visit my DeviantArt page: Aderek - Hobbyist, Digital Artist | DeviantArt
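The trick boils down to "render at 60% of the target size". Here is a small helper that computes those reduced dimensions while keeping them divisible by 64, a commonly used granularity for SD-family latents; the function name and the rounding rule are my own choices, not part of the workflow.

```python
def compositional_size(width, height, factor=0.6, multiple=64):
    """Scale a target resolution down (default 60%) and snap to a safe multiple."""
    def snap(value):
        return max(multiple, int(round(value * factor / multiple)) * multiple)
    return snap(width), snap(height)

print(compositional_size(1024, 1024))  # -> (640, 640)
print(compositional_size(832, 1216))   # -> (512, 704)
```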
🆘 ERROR | Exception

Exception (routeId: 7544339967855538950230)

Suspect nodes: <String Function>, <LayerStyle>, <LayerUtility>, <FaceDetailer>, many <TextBox>, <Bumpmap>.

After some research on my own, I have found:
- The <FaceDetailer> node is completely broken.
- The <TextBox> and <MultiLine:TextBox> nodes cause this error if you enter more than roughly 250 characters. I am not sure about the exact number, but you can no longer enter a decent amount of text.
- More than 40 nodes, regardless of their function, will cause this error.

How do I know this? I built a functional ComfyFlow following those rules: https://tensor.art/template/754955251181895419

The next functional ComfyFlow suddenly stopped generating; it is almost the same flow as the previous one, but with <FaceDetailer> and large text strings to polish the prompt. It works again now: https://tensor.art/template/752678510492967987 (proof it really worked here).

I feel bad for you if this error suddenly disrupts your day; feel bad for me because I bought the yearly membership of this broken product and can't refund it. I will be happy to delete this bad review if you fix this error.

News
08/11/24 | <String Function> has been taken down. ComfyFlow works slowly (but works).
08/10/24 | Everything is broken again, lmao; we can't generate outside TAMS.
08/06/24 | A <Reroute> output node can trigger this error when linked to many inputs.
07/28/24 | The <FaceDetailer> node seems to work again.
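If you want to pre-check a workflow against the limits observed above before publishing, something like the sketch below can help. It assumes a ComfyUI UI-export JSON with a top-level "nodes" list whose entries carry "type" and "widgets_values"; the thresholds mirror the numbers reported in this post and may not be exact, so adjust both the keys and the limits to your own situation.

```python
import json

MAX_NODES = 40      # workflows above ~40 nodes reportedly trigger the exception
MAX_TEXT_LEN = 250  # very long text widgets reportedly do too

def check_workflow(path):
    with open(path, encoding="utf-8") as f:
        workflow = json.load(f)
    nodes = workflow.get("nodes", [])
    if len(nodes) > MAX_NODES:
        print(f"warning: {len(nodes)} nodes (observed limit ~{MAX_NODES})")
    for node in nodes:
        for value in node.get("widgets_values", []) or []:
            if isinstance(value, str) and len(value) > MAX_TEXT_LEN:
                print(f"warning: {len(value)}-char text in node {node.get('type')}")

check_workflow("my_comfyflow.json")
```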
Upscaling in ComfyUI: Algorithm or Latent?

Hello again! In this little article I want to explain the upscaling methods I know and have researched in ComfyUI. I hope they help you when building your workflows and AI tools. If you have any useful knowledge of your own, please share it in the comments to enrich the topic. Also, please excuse any spelling mistakes; I am still learning English.

Let's get to the point! To the best of my knowledge, there are two widely used ways to upscale in ComfyUI (you decide which to use according to your needs): the algorithm method and the latent method.

Algorithm method

This is one of the most commonly used methods, and it is readily available. It consists of loading an upscaling model and connecting it to the workflow, so the image pixels are manipulated as the user wishes. It is very similar to the upscale method used in Tensor Art's normal image-creation mode. The following nodes are needed:
A. Load Upscale Model
B. Upscale Image (Using Model)
These nodes are connected into the workflow between the "VAE Decode" and "Save Image" nodes. Once this structure is in place, you can choose any of the 23 models offered by the "Load Upscale Model" node, from "2x-ESRGAN.pth" to "SwinIR_4x", and experiment with them; just click on the node and the list will be displayed.

The same thing can be achieved with the "Upscale Image By" node. The structure is even simpler, because only that node is connected between VAE Decode and Save Image. Once it is connected, you are free to select the upscaling mode (upscale_method) and the factor by which the pixel dimensions are scaled (scale_by).

Strengths and weaknesses of the algorithm method: its strengths are its ease of integration into the workflow, the choice between several upscaling models, and fast generation both in ComfyUI and in AI Tools. Its weakness is that it is not very effective in some contexts: the algorithm can upscale the image pixels but does not alter the underlying generation size, so the result can end up looking blurred in some cases.

Latent method

This is the alternative to the algorithm method, focused on highlighting image detail and maximizing quality, and it is one of the most used in the workflow mode of the various AI content-creation platforms. Here, upscaling is performed while the image is being generated from latent space (latent space is where the AI takes the data from the prompt, deconstructs it for analysis, and then reconstructs it into an image).

The "Latent Upscale" node is placed between two KSamplers: the first KSampler is connected to the "Empty Latent Image" node, while the second is connected to "VAE Decode" to ensure correct processing and representation of the generated image. Note that the "Empty Latent Image" and "VAE Decode" nodes are already included by default in the Text2Image templates in workflow mode. (For more information about Text2Image, see my other article, "ComfyUI: Text2Image Basic Glossary".)

For this method to work properly, you have to keep a correct balance between the original size of the image and its upscaled size. For example, you can generate a 512x512 image and upscale it to 1024x1024, but it is not recommended to take a 512x512 (square) image and upscale it to 768x1152 (rectangular), since the shape of the image would not match its upscaled version. Pay attention to the values of "Empty Latent Image" and "Latent Upscale" so they always stay proportional. In the "Empty Latent Image" node you set the original dimensions (for example 768x1152), while in the "Latent Upscale" node you set the resized dimensions (for example 1152x1728). This gives you the freedom to set the image size at your own discretion; I always recommend looking at the size and upscale values of the normal creation mode, so you know which values to set and which are compatible, and then writing them into the nodes listed above.

Once everything is connected and configured, you can produce images of any size you want; experiment to your taste.

Strengths and weaknesses of the latent method: it gives access to excellent image quality if everything is configured correctly, it lets you create images of a custom size and upscale by the values you want, and it brings out detail in both SD and XL images. On the downside, you have to reconfigure everything manually every time you change the size or shape of the images, and this method is a little slower at generation than the algorithm method.

Which is better, algorithm or latent? Neither method is better than the other; both are useful in different contexts. Workflows differ from user to user, because we all have different ways of creating and designing things. It all depends on your taste and whether you want something simpler or more elaborate. I hope this article helps you build more complex workflows and makes it easier to create the images you want.

Extra tip: if you cannot find any of the nodes mentioned here, double-click on any empty spot in the workflow and search for the node by name; just remember to type the name without spaces.
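As a bare-bones illustration of the two ideas outside ComfyUI: the pixel ("algorithm") path resizes the decoded image, while the latent path resizes the latent tensor before a second sampling pass would refine it. This is a conceptual sketch, not a full workflow; the file names are placeholders, and the plain Lanczos resize stands in for a learned upscale model such as ESRGAN.

```python
import torch
import torch.nn.functional as F
from PIL import Image

# Algorithm-style: upscale the finished image in pixel space.
img = Image.open("render_512.png")
img_up = img.resize((img.width * 2, img.height * 2), Image.LANCZOS)
img_up.save("render_pixel_upscaled.png")

# Latent-style: interpolate the latent (shape [batch, 4, H/8, W/8] for SD-family
# models) to the target size, then hand it to a second sampler for a refinement pass.
latent = torch.randn(1, 4, 64, 64)                 # stand-in for a 512x512 latent
latent_up = F.interpolate(latent, scale_factor=2,  # enough latent for 1024x1024
                          mode="nearest")
print(latent_up.shape)                             # torch.Size([1, 4, 128, 128])
```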
Controlnet with SD3

Today I noticed that I can add ControlNet to the SD3 model. The Tiled function works very well, so I incorporated it into my workflow and created a group for generating artistic images based on a given photo or a previously generated image. In the main part of the workflow I simply set a very short prompt, like "grass, flowers", and I get an image that blends grass and flowers in an arrangement resembling the base photo.

https://youtu.be/sv35wKNiFGs
Controlnet with SD3 | ComfyUI Workflow | Tensor.Art
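For comparison, here is a sketch of the same idea in diffusers, assuming one of the community SD3 ControlNets. The repo ids, the control strength, and the preprocessed control map are assumptions; swap in the tile or canny model and conditioning image you actually use.

```python
import torch
from diffusers import StableDiffusion3ControlNetPipeline, SD3ControlNetModel
from diffusers.utils import load_image

controlnet = SD3ControlNetModel.from_pretrained(
    "InstantX/SD3-Controlnet-Canny", torch_dtype=torch.float16  # assumed community model
)
pipe = StableDiffusion3ControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    controlnet=controlnet, torch_dtype=torch.float16,
).to("cuda")

control_image = load_image("base_photo_control_map.png")  # preprocessed from the base photo
image = pipe(
    prompt="grass, flowers",                    # the short prompt from the article
    control_image=control_image,
    controlnet_conditioning_scale=0.7,          # assumption: moderate control strength
    num_inference_steps=28,
).images[0]
image.save("sd3_controlnet_grass_flowers.png")
```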
How to Use SD3 Online Training

First, click the avatar in the top-right corner and choose "My trained models" from the drop-down menu to enter the training center. If you have trained models before, you will see many training tasks here. Then click the Online Training button to start a new training run.

On the left is the dataset window, empty by default. You can upload images as a dataset, or upload a dataset zip archive; the archive may contain annotation files in the same format as kohya-ss, with one .txt annotation file per image, sharing the image's name.

In the model theme panel on the right you can choose anime characters, real people, 2.5D, standard, or custom. Here we choose Custom and select the SD3 model as the base model. Note that in the version drop-down you should select the T5XXL version, so that the T5 text encoder can be trained.

In basic mode, the recommended parameters are 4 repetitions per image and 16 epochs.

After uploading a processed dataset: if your captions include a character name, you can skip the trigger word; otherwise give your model a simple trigger word, such as a character name or style name. Then pick one annotation file from the dataset as the preview prompt.

If you want Professional Mode, use the button in the top-right corner to switch. In Professional Mode it is recommended to double the learning rate, use the cosine_with_restarts scheduler, and choose AdamW8bit as the optimizer. Enable caption shuffling and keep the first token fixed (if a character-name trigger word comes first). Disable noise offset; the convolution DIM and Alpha can be set to 8 and 1. Add negative prompts in the sample-image settings, and then you can start training.

In the training queue you can see the current loss chart and the four sample images produced each epoch. Finally, choose the best epoch and download it locally or publish it directly on Tensor.Art.
SD3 - training on your own PC

First, you need to update your version of OneTrainer. Second, you need to download ALL of the files and folders (and rename them) from stabilityai/stable-diffusion-3-medium-diffusers at main (huggingface.co), then put them in place. With float16 output, the LoRA is only 36 MB.

This is my setting for a style training. You can download my checkpoint for testing for free: Aderek SD3 - v1 | Stable Diffusion Model - Checkpoint | Tensor.Art, and my LoRAs: Aderek514's Profile | Tensor.Art.

So, good luck!
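Fetching "ALL files and folders" from the SD3 diffusers repo can be scripted with huggingface_hub instead of clicking through the browser. This is a sketch under the assumption that you have already accepted the model license and are logged in with a token; the local folder name is your choice.

```python
from huggingface_hub import snapshot_download

# Downloads the full repo tree (model index, text encoders, VAE, transformer, ...).
local_dir = snapshot_download(
    repo_id="stabilityai/stable-diffusion-3-medium-diffusers",
    local_dir="models/stable-diffusion-3-medium-diffusers",  # point OneTrainer at this folder
)
print("downloaded to", local_dir)
```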
ReActor Node for ComfyUI (Face Swap)

ReActor Node for ComfyUI
Download: https://github.com/lingkops4/lingko-FaceReActor-Node
Workflow: https://github.com/lingkops4/lingko-FaceReActor-Node/blob/main/face_reactor_workflows.json

The fast and simple face-swap extension node for ComfyUI, based on the ReActor SD-WebUI Face Swap Extension. This node ships without an NSFW filter (uncensored; use it at your own responsibility).

| Installation | Usage | Troubleshooting | Updating | Disclaimer | Credits | Note! |

What's new in the latest update (0.5.1 ALPHA1)
- Support for GPEN 1024/2048 restoration models (available in the HF dataset https://huggingface.co/datasets/Gourieff/ReActor/tree/main/models/facerestore_models).
- ReActorFaceBoost node: an attempt to improve the quality of swapped faces. The idea is to restore and scale the swapped face (according to the face_size parameter of the restoration model) BEFORE pasting it into the target image (via the inswapper algorithms); more information is in PR#321.

Installation
- SD WebUI: AUTOMATIC1111 or SD.Next
- Standalone (portable) ComfyUI for Windows

Usage
You can find the ReActor nodes inside the ReActor menu or by using search (just type "ReActor" in the search field).

List of nodes:
Main nodes: ReActorFaceSwap (main node), ReActorFaceSwapOpt (main node with an additional Options input), ReActorOptions (options for ReActorFaceSwapOpt), ReActorFaceBoost (face booster node), ReActorMaskHelper (masking helper).
Operations with face models: ReActorSaveFaceModel (save face model), ReActorLoadFaceModel (load face model), ReActorBuildFaceModel (build blended face model), ReActorMakeFaceModelBatch (make face model batch).
Additional nodes: ReActorRestoreFace (face restoration), ReActorImageDublicator (duplicate one image into an image list), ImageRGBA2RGB (convert RGBA to RGB).

Connect all required slots and run the query.

Main node inputs:
- input_image: the image to be processed (the target image, analogous to "target image" in the SD WebUI extension). Supported nodes: "Load Image", "Load Video", or any other node providing images as an output.
- source_image: an image with a face or faces to swap into the input_image (the source image, analogous to "source image" in the SD WebUI extension). Supported nodes: "Load Image" or any other node providing images as an output.
- face_model: the input for the "Load Face Model" node or another ReActor node providing a face-model file (face embedding) you created earlier via the "Save Face Model" node. Supported nodes: "Load Face Model", "Build Blended Face Model".

Main node outputs:
- IMAGE: the resulting image. Supported nodes: any node that takes images as an input.
- FACE_MODEL: the source face's model, built during the swapping process. Supported nodes: "Save Face Model", "ReActor", "Make Face Model Batch".

Face restoration
Since version 0.3.0, the ReActor node has built-in face restoration. Just download the models you want (see the installation instructions) and select one of them to restore the resulting face(s) during the face swap. It will enhance face details and make your result more accurate.

Face indexes
By default, ReActor detects faces in images from "large" to "small". You can change this by adding a ReActorFaceSwapOpt node with ReActorOptions. If you need to specify faces, you can set indexes for the source and input images; the index of the first detected face is 0, and you can list indexes in any order you need. For example, 0,1,2 for Source and 1,0,2 for Input means the second input face (index 1) will be swapped with the first source face (index 0), and so on.

Genders
You can specify the gender to detect in images; ReActor will swap a face only if it meets the given condition.

Face models
Since version 0.4.0 you can save face models as "safetensors" files (stored in ComfyUI\models\reactor\faces) and load them into ReActor, enabling different scenarios while keeping super-lightweight face models of the faces you use. To make new models appear in the "Load Face Model" node list, just refresh your ComfyUI web page. (I recommend using ComfyUI Manager; otherwise your workflow can be lost after you refresh the page if you did not save it first.)

Troubleshooting

I. (For Windows users) If you still cannot build Insightface, or you just don't want to install Visual Studio or the VS C++ Build Tools:
- (ComfyUI Portable) From the root folder, check the Python version: run CMD and type python_embeded\python.exe -V
- Download the prebuilt Insightface package for Python 3.10, 3.11, or 3.12 (matching the version from the previous step) and put it into the stable-diffusion-webui (A1111 or SD.Next) root folder (where the "webui-user.bat" file lives), or into the ComfyUI root folder if you use ComfyUI Portable.
- From the root folder run: (SD WebUI) CMD and .\venv\Scripts\activate; (ComfyUI Portable) run CMD.
- Update pip: (SD WebUI) python -m pip install -U pip; (ComfyUI Portable) python_embeded\python.exe -m pip install -U pip
- Install Insightface: (SD WebUI) pip install insightface-0.7.3-cp310-cp310-win_amd64.whl (for 3.10), or the cp311/cp312 wheel for 3.11/3.12; (ComfyUI Portable) python_embeded\python.exe -m pip install insightface-0.7.3-cp310-cp310-win_amd64.whl (for 3.10), or the cp311/cp312 wheel for 3.11/3.12.
- Enjoy!

II. "AttributeError: 'NoneType' object has no attribute 'get'"
This error may occur if something is wrong with the model file inswapper_128.onnx. Try downloading it manually and putting it into ComfyUI\models\insightface, replacing the existing one.

III. "reactor.execute() got an unexpected keyword argument 'reference_image'"
This means the input points have changed with the latest update. Remove the current ReActor node from your workflow and add it again.

IV. ControlNet Aux Node IMPORT failed error when used with the ReActor node
Close ComfyUI if it is running, go to the ComfyUI root folder, open CMD there and run:
python_embeded\python.exe -m pip uninstall -y opencv-python opencv-contrib-python opencv-python-headless
python_embeded\python.exe -m pip install opencv-python==4.7.0.72
That's it!

V. "ModuleNotFoundError: No module named 'basicsr'" or "subprocess-exited-with-error" during future-0.18.3 installation
Download https://github.com/Gourieff/Assets/raw/main/comfyui-reactor-node/future-0.18.3-py3-none-any.whl, put it into the ComfyUI root, and run:
python_embeded\python.exe -m pip install future-0.18.3-py3-none-any.whl
Then:
python_embeded\python.exe -m pip install basicsr

VI. "fatal: fetch-pack: invalid index-pack output" when you try to git clone the repository
Try cloning with --depth=1 (last commit only):
git clone --depth=1 https://github.com/Gourieff/comfyui-reactor-node
Then retrieve the rest (if you need it):
git fetch --unshallow
ComfyUi: Text2Image Basic Glossary

Hello! This is my first article; I hope it will be of benefit to whoever reads it. I still have limited knowledge about workflows, but I have researched and learned little by little. If anyone would like to contribute some content, you are totally free to do so. Thank you.
I wrote this article to give a brief explanation of the basic concepts behind ComfyUI and workflows. This is a technology with many possibilities, and it would be great to make it easier for everyone to use!
What is a Workflow?
Workflow is one of the two main image-generation systems that Tensor.Art offers at the moment. It is a generation method characterized by a great capacity to stimulate users' creativity; it also gives Free users access to some Pro features.
How do I access the WorkFlow mode?
To access the WorkFlow mode, place the mouse cursor on the "Create" tab as if you were going to create an image in the conventional way. Then click on the "ComfyFlow" option and you are done.
After that, you will see a tab with two options, "New WorkFlow" and "Import WorkFlow". The first lets you start a workflow from a template or from scratch, while the second lets you load a workflow you have saved on your PC as a JSON file.
If you click on the "New WorkFlow" option, a tab with a list of templates will be displayed (each template has a different purpose). The main one is "Text2Image"; it lets us create images from text, much like the conventional method we always use. You can also create a workflow from scratch with the "Empty WorkFlow Template" option, but to better explain the basics we will use "Text2Image".
Once you click on the "Text2Image" option, wait a few seconds and a new tab will open with the template, which contains the basics needed to create an image from text.
Nodes and Borders: what are they and how do they work?
To understand the basics of how a workflow operates, you need a clear idea of what Nodes and Borders are.
Nodes are the small boxes present in the workflow; each node has a specific function needed for creating, enhancing or editing the image or video. The basic nodes of Text2Image are the Checkpoint loader, the CLIP Text Encoders, the Empty Latent Image, the KSampler, the VAE decoder, and Save Image. Note that there are hundreds of other nodes besides these basics, with many different functions.
The "Borders", on the other hand, are the small colored wires that connect the different nodes. They determine which nodes are directly related to each other.
The Borders are color-coded, and each color generally corresponds to a specific function:
The purple one relates to the Model or LoRA used.
The yellow one connects the model or LoRA to the boxes where you place the prompt.
The red one refers to the VAE.
The orange one refers to the connection between the prompt boxes and the "KSampler" node.
The fuchsia one refers to the latent; it serves many purposes, but in this case it connects the "Empty Latent Image" node to the "KSampler" node and sets the number and size of the images to be generated.
The blue one relates to everything that has to do with images; it has many uses, but here it connects to the "Save Image" node.
What are the Text2Image template nodes used for?
Understanding this is very important, since it tells you what each node in this basic template does. It's like knowing what each piece in a Lego set is for and how the pieces should be connected to create a beautiful masterpiece! Also, once you know what these nodes do, it will be easier to intuit the functionality of their variants and other derived nodes.
A) The first is the node called "Load Checkpoint", which has three specific functions. The first is to load the base model or checkpoint with which an image will be created. The second is the CLIP, which connects the positive and negative prompts you write to the checkpoint. The third is that it connects and helps to load the VAE model.
B) The second is the "Empty Latent Image", the node in charge of processing the image dimensions in latent space. It has two functions: first, to set the width and height of the image; and second, to set how many images will be generated simultaneously via the "Batch Size" option.
C) The third is the pair of "CLIP Text Encode" nodes: there will always be at least two of these, since they set both the positive and negative prompts that describe the image you want. They are usually connected to the "Load Checkpoint" node or to a LoRA, and they also connect to the "KSampler" node.
D) Then there is the "KSampler" node. This node is the central point of the whole workflow; it sets the most important parameters for image creation. It has several functions: the first is to determine the seed of the image and to regulate how much it changes from one generation to the next via the "control_after_generate" option. The second is to set how many steps are used to create the image (you set them as you wish); the third is to choose which sampling method is used and which scheduler drives it (the scheduler controls how the noise is removed across the steps).
E) The penultimate one is the VAE decoder. This node assists in processing the image to be generated: its main job is to decode the latent produced by the KSampler into the final image. In other words, it reconstructs the description of the image we want as one of the last steps of the generation process. The result is then passed to the "Save Image" node to display the generated image as the final product.
F) The last node to explain is "Save Image".
This node has the simple job of saving the generated image and giving the user a view of the final work, which is later stored in the task bar where all the generated images are kept.
Final Consideration:
This has been a small summary and explanation of very basic ComfyUI concepts; you could even call it a small glossary of general terms. I have tried to give a brief overview to make this image-generation tool easier to understand. There is still a lot to explain, and I will try to cover all the topics, but the information would not fit in a single article (ComfyUI is a whole universe of possibilities). Thank you so much for taking the time to read this article!
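As a rough illustration of how this node chain maps onto code, here is an unofficial sketch using the Hugging Face diffusers library rather than ComfyUI itself; the checkpoint name is just an example, and each argument is annotated with the node it loosely corresponds to:

from diffusers import StableDiffusionXLPipeline
import torch

# Load Checkpoint: the pipeline bundles the model weights, the CLIP text encoders and the VAE.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # example checkpoint
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="a cozy cabin in a snowy forest, warm light in the windows",  # CLIP Text Encode (positive)
    negative_prompt="blurry, low quality, watermark",                    # CLIP Text Encode (negative)
    width=1024, height=1024,                                             # Empty Latent Image
    num_inference_steps=25,                                              # KSampler: steps
    guidance_scale=7.0,                                                  # KSampler: CFG scale
    generator=torch.Generator("cuda").manual_seed(42),                   # KSampler: seed
).images[0]  # the VAE Decode step happens inside the pipeline before images are returned

image.save("text2image_example.png")  # Save Image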
Textual Inversion Embeddings ComfyUI_Examples

ComfyUI_examples
Textual Inversion Embeddings Examples
Here is an example of how to use Textual Inversion / Embeddings.
To use an embedding, put the file in the models/embeddings folder and then reference it in your prompt, as with the SDA768.pt embedding used in the example picture.
Note that you can omit the filename extension, so these two are equivalent:
embedding:SDA768.pt
embedding:SDA768
You can also set the strength of the embedding just like regular words in the prompt:
(embedding:SDA768:1.2)
Embeddings are basically custom words, so where you put them in the text prompt matters.
For example, if you had an embedding of a cat:
red embedding:cat
This would likely give you a red cat.
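Outside ComfyUI, the same idea can be reproduced with the diffusers library. This is only a hedged sketch: the model id, the embedding path and the token name are placeholders for whatever you actually downloaded.

from diffusers import StableDiffusionPipeline
import torch

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # example SD 1.5 checkpoint id
    torch_dtype=torch.float16,
).to("cuda")

# Load the textual inversion embedding and bind it to a token you can then use in prompts.
pipe.load_textual_inversion("./embeddings/SDA768.pt", token="SDA768")

image = pipe("a portrait of a woman, SDA768").images[0]
image.save("textual_inversion_example.png")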
Art Mediums (127 Style)

Art Mediums
Various art mediums, prompted with '{medium} art of a woman':
Metalpoint, Miniature Painting, Mixed Media, Monotype Printing, Mosaic Tile Art, Mosaic, Neon, Oil Paint, Origami, Papermaking, Papier-mâché, Pastel, Pen And Ink, Performance Art, Photography, Photomontage, Plaster, Plastic Arts, Polymer Clay, Printmaking, Puppetry, Pyrography, Quilling, Quilt Art, Recycled Art, Relief Printing, Resin, Reverse Glass Painting, Sand, Scratchboard Art, Screen Printing, Scrimshaw, Sculpture Welding, Sequin Art, Silk Painting, Silverpoint, Sound Art, Spray Paint, Stained Glass, Stencil, Stone, Tapestry, Tattoo Art, Tempera, Terra-cotta, Textile Art, Video Art, Virtual Reality Art, Watercolor, Wax, Weaving, Wire Sculpture, Wood, Woodcut, Glass, Glitch Art, Gold Leaf, Gouache, Graffiti, Graphite Pencil, Ice, Ink Wash Painting, Installation Art, Intaglio Printing, Interactive Media, Kinetic Art, Knitting, Land Art, Leather, Lenticular Printing, Light Projection, Lithography, Macrame, Marble, Metal, Colored Pencil, Computer-generated Imagery (CGI), Conceptual Art, Copper Etching, Crochet, Decoupage, Digital Mosaic, Digital Painting, Digital Sculpture, Diorama, Embroidery, Enamel, Encaustic Painting, Environmental Art, Etching, Fabric, Felting, Fiber, Foam Carving, Found Objects, Fresco, Augmented Reality Art, Batik, Beadwork, Body Painting, Bookbinding, Bronze, Calligraphy, Cast Paper, Ceramics, Chalk, Charcoal, Clay, Collage, Collagraphy, 3D Printing, Acrylic Paint, Airbrush, Algorithmic Art, Animation, Art Glass, Assemblage
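If you want to batch-generate prompts from this list, a trivial Python loop over the article's own template does the job; the handful of mediums below is just an illustrative subset.

# Turn a few of the mediums above into ready-to-paste prompts using the article's template.
mediums = ["Watercolor", "Charcoal", "Mosaic", "Glitch Art", "Oil Paint"]

for medium in mediums:
    print(f"{medium} art of a woman")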
Anime Vision | Detail Enhancer SD3

SD3 Anime LoRA is Finally Here!
I am thrilled to announce that the SD3 Anime LoRA model is finally available. In addition, I am releasing a new update that includes an SD3 anime checkpoint model. Currently, I am publishing a beta version as I continue to work diligently to perfect the model. I aim to have the final release ready by the end of this month or early August. Stay tuned, as the SD3 Anime beta version will be available within the next couple of days!
Here are some guidelines to use this LoRA to its full potential:
If you are trying to create a specific subject or object, use a trigger word like 'anime style' in your prompt.
If you are targeting a character, you can skip that keyword and go with something like this:
For a male character: 'anime boy'
For a female character: 'anime girl'
Simple, right? You can also use the trigger word 'anime style' most of the time; I've noticed it gives better results.
Recommended Parameters:
LoRA Weight: 0🆙1
VAE: not needed
Sampler: DPM++ 2M SGM Uniform
Steps: 20➡30
CFG: 3➡4
Upscaler: R-ESRGAN 4x+
If you encounter any issues, I recommend using ComfyUI for a better experience. Here's the workflow: ComfyUI Workflow. Open the link, select the LoRA model, choose the LoRA strength, and hit the run button.
Join my community, share your feedback, learn, and have fun with us! 😊
Discord➡️https://discord.gg/QQKd7bu97P
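For anyone running this outside the site, here is a hedged diffusers sketch of loading an SD3 checkpoint with an anime LoRA at the recommended step and CFG ranges. The base model id (which may require Hugging Face access) and the LoRA file path are placeholders, not the author's exact files.

from diffusers import StableDiffusion3Pipeline
import torch

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",  # example SD3 base model
    torch_dtype=torch.float16,
).to("cuda")

# Placeholder filename for the downloaded anime LoRA weights.
pipe.load_lora_weights("./loras/anime_vision_sd3.safetensors")

image = pipe(
    prompt="anime style, anime girl with silver hair under cherry blossoms",
    negative_prompt="lowres, bad anatomy, blurry, watermark",
    num_inference_steps=25,  # within the recommended 20-30 range
    guidance_scale=3.5,      # within the recommended CFG 3-4 range
).images[0]
image.save("sd3_anime_lora.png")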
How to set up Radio Button in your AI Tools

Hello everyone! ✨ Today I'm bringing you a really practical tutorial: how to set up a radio-button prompt selector for your AI Tools! 😎 Save it, and you'll never have to worry about how to expose prompt options again! 👌
Ready for the lesson? Let's get started! 🔍
First, open the official TensorArt website. 📂 Once it loads you will see a rich variety of AI tools and resources. 👀
Next, open ComfyFlow and start building our AI Tool! 🤖 This process is simple and fun, so let's explore it together! ✨
In ComfyFlow, click the "New" button, which takes you to a new interface. 🖱️💻 Here we can start creating our own workflow. 🌟
🎉 Next, we need to fill in the positive prompt, which is a key step! 📝✨ In the positive prompt area, enter the content you want. 📋 As a simple teaching example I wrote "a man" 🤵; feel free to write whatever suits your needs. 🌈
🎆🎉 When your workflow is finished, click the "Publish" button in the upper-right corner! 🚀✨ Don't forget to give your AI Tool an interesting name! 💡 A good name makes your tool more attractive. ✨ Also remember to pick the right category, so it is clearly organized and easy for other users to find and use! 📂🔍
🌟 Now let's complete the next step together! 💪 Scroll down the current interface to the user-configurable settings area. 👏 Then click the "Add" button. This step is critical! 🖱️✨ Be sure to add your positive prompt node! 🔍
✨ After adding the node, click the "Set" button on the right to continue. 🔧✨ This step is crucial, so don't miss it! 😉🚀
✨ The next step is also very important! 😊 First select the radio-button option, then click "Add". 🔘✨ Here you can add the options you want to expose to your users! 👍 After selecting them, be sure to click "Confirm"! ✔️
✨ Friends, we have finally reached the last step! 🎉💪 When all the operations are complete, click the "Publish" button to release your AI Tool! 🚀✨ Can't wait to see the result? Generate a picture yourself and try out what you built! 🌟🖼️
Well, that's all for today's tutorial! 😊 I hope everyone can complete it successfully and create their own AI Tools! 👏 If you have any questions, don't hesitate to leave a comment below! ❤️
Guide to Using SDXL / SDXLモデルの利用手引

Guide to Using SDXL
I occasionally see posts about difficulties in generating images successfully, so here is an introduction to the basic setup.
1. Introduction
SDXL is a model that can generate images with higher accuracy compared to SD1.5. It produces high-quality representations of human bodies and structures, with fewer distortions and more realistic fine details, textures, and shadows.
With SD1.5, generation parameters were generally applicable across different models, so there was no need for specific adjustments. However, while SDXL can still use some SD1.5 techniques without issues, the recommended generation parameters vary significantly depending on the model. Additionally, LoRA and Embeddings (such as EasyNegative) are completely incompatible, requiring a review of prompt construction. Notably, embeddings commonly used in SD1.5 negative prompts are recognized merely as strings by XL models, so you must replace them with corresponding embeddings or add appropriate tags.
This guide explains the recommended parameter settings for using SDXL.
2. Basic Parameters
VAE
Selecting "sdxl-vae-fp16-fix.safetensors" will suffice. Many models have this built in, so specification might not be necessary.
Image Size
Using the resolution presets provided by TensorArt should be sufficient. Small or excessively large resolutions may not yield appropriate generation results, so please avoid the sizes that were frequently used with SD1.5 wherever possible. Even if you want to create vertically or horizontally elongated images, do so within a range that does not significantly alter the total pixel count (for example, adjust by increasing height and decreasing width).
Sampling Method
Choose the sampler recommended for the model first, then select according to your preference. Typically, Euler a or DPM++ 2M SDE Karras should work well.
Sampling Steps
XL models might generate images effectively with lower steps thanks to optimizations like LCM or Turbo. Be sure to check the recommended values for the selected model.
CFG Scale
This varies by model, so check the recommended values. Typically, the range is around 2 to 8.
Hires.fix
For free users, specifying 1.5x might hit the upper limit, so use custom settings with the following resolutions:
768x1152 -> 1024x1536
1152x768 -> 1536x1024
1024x1024 -> 1248x1248
Choose the upscaler according to your preference. Set the denoising strength to around 0.3 to 0.4.
3. Prompt
SDXL handles natural language better. You can input elements separated by commas or simply write a complete sentence in English, and it will generate images as intended. Using a tool like ChatGPT to create prompts can also be beneficial. However, depending on how the model was additionally trained, it might be better to use existing tags. Furthermore, some models have tags specified to enhance quality, so always check the model's page. For example:
AnimagineXL3.1: masterpiece, best quality, very aesthetic, absurdres is recommended.
Pony models: score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up is recommended.
ToxicEchoXL: masterpiece, best quality, aesthetic is recommended.
In this way, especially for XL models, particularly anime or illustration models, appropriate tag usage is crucial.
4. Negative Prompts
Forget the negative prompts used in SD1.5.
"EasyNegative" is just a string.The embeddings usable on TensorArt are negativeXL_D and unaestheticXLv13.Choose according to your preference.Some models have recommended prompts listed.For AnimagineXLnsfw, lowres, (bad), text, error, fewer, extra, missing, worst quality, jpeg artifacts, low quality, watermark, unfinished, displeasing, oldest, early, chromatic aberration, signature, extra digits, artistic error, username, scan, [abstract]For ToxicEchoXLnsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digits, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, artist name.For photo models, sometimes it is better not to use negative prompts to create a certain atmosphere, so try various approaches.5. Recommended SDXL modelToxicEnvisionXLhttps://tensor.art/models/736585744778443103/ToxicEnvisionXL-v1Recently released high-quality photo model. Yes, I created it.If you are looking for a photo model, you can't go wrong with this one.Check the related posts to see what kind of images can be created.You can create a variety of realistic images, from analog photo styles to gravure, movies, fantasy, and surreal depictions.Although it is primarily a photo-based model, it can also create analog-style images.ToxicEtheRealXLhttps://tensor.art/models/702813703965453448/ToxicEtheRealXL-v1A versatile model that supports both illustrations and photorealistic images. Yes, I created it.The model's flexibility requires well-crafted prompts to determine whether the output is more illustrative or photorealistic.Using LoRA to strengthen the direction might make it easier to use.ToxicEchoXLhttps://tensor.art/models/689378702666043553/ToxicEchoXL-v1A high-performance model specialized for illustrations. Yes, I created it.It features a unique style based on watercolor painting, with custom learning and adjustments.I have also created various LoRA for style changes, so please visit my user page.My current favorite is Beautiful Warrior XL + atmosphere.The model covers a range from illustrations to photos, so give it a try.However, it is weak in generating copyrighted characters, so use LoRA or models like AnimagineXL or Pony for those.ToxicEchoXL can produce unique illustration styles when using character LoRA, making it highly suitable for fan art.6. ConclusionI hope this guide helps those who struggle to generate images as well as others.Well... if you remix from Model Showcase, you can create beautiful images without this guide...SD3 has also been released, so if possible, I would like to create models for that as well.It seems that a commercial license is required for commercial use, though...SDXLモデルの利用手引ここではSDXLの基本的な設定を紹介します。1. はじめにSDXLはSD1.5と比較してより高精度な生成が行えるモデルです。人体や構造物はより高品質で破綻が少なく、微細なディテールがよりリアルに表現され、自然なテクスチャや影を描写します。SD1.5ではどのモデルでも生成パラメータは概ね流用可能で、特に気にする必要はありませんでした。SDXLは一部SD1.5の手法を利用しても問題ありませんが、推奨される生成パラメータがモデルによってもだいぶ変わります。またLoRAやEmbeddings(EasyNegativeなど)も一切互換性はありませんので、プロンプトの構築も見直す必要があります。特にSD1.5のネガティブプロンプトでよく使用されているEmbeddingsをそのままXLモデルで入力しても、ただの文字列としてしか認識されていませんので、対応するEmbeddingsに差し替えるか、適切なタグを追加しなければいけません。このガイドでは、SDXLを使用する際の推奨パラメータ設定について説明します。2. 
基本的なパラメータVAEsdxl-vae-fp16-fix.safetensorsを選択しておけば問題ありません。モデルに内蔵されている場合も多いですので、指定しなくても大丈夫な場合もあります。画像サイズ解像度はTensorArtで用意されているプリセットを使えば問題ありません。小さかったり大きすぎる解像度は適切な生成結果を得られなくなりますので、SD1.5でよく使用していたサイズはなるべく使用しないでください。プリセットよりも縦長や横長にしたい場合でも、総ピクセル数を大幅に変更しない範囲で行ってください。(縦を増やしたら横は減らす等で調整)サンプリング法モデルによって推奨されるサンプラーがありますので、まずはそれを選択してください。あとはお好みです。基本は Euler a か DPM++ 2M SDE Karras あたりを選択しておけば大丈夫です。サンプリング回数XLではLCMやターボなど低ステップで生成できるようになっていたりしますので、必ずモデルの推奨値を確認してください。CFG Scaleこれもモデルによって異なりますので推奨値を確認してください。概ね2~8程度です。高解像度修復無料ユーザーだと1.5xを指定すると上限に引っかかってしまいますので、使用する場合はカスタムにして以下の解像度を指定してください768x1152 -> 1024x15361152x768 -> 1536x10241024x1024 -> 1248x1248Upscalerはお好みで指定してください。Denoising strengthは0.3~0.4程度。3. プロンプトSDXLはより自然言語の取り扱いに長けています。要素をコンマで区切って入力するだけではなく、普通に英文を入力するだけでも意図した通りの生成が行えます。ChatGPTなどにプロンプトを作ってもらうのもいいでしょう。ただしモデルが追加学習をどのように行ったかによって、既存のタグで記述したほうがいい場合もあります。また、モデルによっては品質を上げるためのタグが指定されていますので、使用するモデルのページは必ず見るようにしましょう。例えば…AnimagineXL3.1では「masterpiece, best quality, very aesthetic, absurdres」を指定することが推奨されています。Pony系モデルでは「score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up」が基本テンプレートとなっています。ToxiEchoXLでは「masterpiece, best quality, aesthetic」を指定することが推奨されています。このように、XLモデル、特にアニメ・イラストモデルでは適切なタグの使用が求められる場合があります。4. ネガティブプロンプトSD1.5で使用していたネガティブプロンプトは忘れてください。EasyNegativeはただの文字列です。TensorArtで使用できるEmbeddingsは negativeXL_D と unaestheticXLv13 です。お好みで指定してください。推奨されるプロンプトが記載されているモデルもあります。AnimagineXLでは以下のようなプロンプトが推奨されていますので、これをベースに組むのがいいかもしれません。nsfw, lowres, (bad), text, error, fewer, extra, missing, worst quality, jpeg artifacts, low quality, watermark, unfinished, displeasing, oldest, early, chromatic aberration, signature, extra digits, artistic error, username, scan, [abstract]ToxicEchoXLでは以下のようなプロンプトが推奨されていますnsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digits, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, artist name,フォトモデルではネガティブプロンプト無しのほうが雰囲気のある画作りができる場合もありますので、色々試してみてください。5. おすすめのSDLXモデル紹介ToxicEnvisionXLhttps://tensor.art/models/736585744778443103/ToxicEnvisionXL-v1最近リリースされた高品質フォトモデル。実写系モデルを探しているならこれを選んでおけば間違いありません。関連する投稿からどういった画像が作成できるか見てみてください。アナログ写真風からグラビア、映画、ファンタジー、非現実的な描写等、様々な実写的な画像が作成できます。基本的にはフォトベースのモデルですが、アナログ画風も作成できたりします。ToxicEtheRealXLhttps://tensor.art/models/702813703965453448/ToxicEtheRealXL-v1イラストからフォトリアルまで幅広く対応したモデル。プロンプトによってイラストかフォトリアルか振れ幅が大きいので、明確にプロンプトの作り込みが必要です。LoRAで方向性を強めると使いやすいかもしれません。ToxicEchoXLhttps://tensor.art/models/689378702666043553/ToxicEchoXL-v1イラスト特化の超高性能モデル。水彩をベースに独自の学習・調整を行っているので、わりと独特な画風を持っています。画風変更に様々なLoRAも作成していますので、是非私のユーザーページへお越しください。https://tensor.art/u/649265516304702656最近のお気に入りはBeautiful Warrior XL + atmosphere です。イラストからフォトまで一通り網羅できるので、是非使ってみてください。なお版権キャラの生成は弱いので、その辺はLoRAかAnimagineXLとかPonyとか使うといいと思います。ToxicEchoXLはキャラLoRAを使うと他のモデルとはタッチの違うイラストが作れますので、ファンアート適正自体は高いです。6. おわりにモデルのサンプルやみんなみたいにうまく生成できないな…という方の助けになれば幸いです。まあ…モデルのショーケースからリミックスすればこんなガイド見なくてもきれいな画像が作れますけどね…SD3もリリースされたので、もし可能ならそちらのモデルも作成してみたいですね。どうも商用利用は有償のライセンスが必要そうですが…
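To make the guide's recommendations concrete, here is an unofficial diffusers sketch of an SDXL generation that uses one of the preset-friendly resolutions, an Euler a-style sampler, and a CFG value inside the suggested 2-8 range. The checkpoint id is only a generic example, not one of the models recommended above.

from diffusers import StableDiffusionXLPipeline, EulerAncestralDiscreteScheduler
import torch

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # example SDXL checkpoint
    torch_dtype=torch.float16,
).to("cuda")

# "Euler a" from the guide corresponds to the Euler Ancestral scheduler in diffusers.
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)

image = pipe(
    prompt="masterpiece, best quality, a knight overlooking a misty valley at dawn",
    negative_prompt="lowres, worst quality, low quality, jpeg artifacts, watermark",
    width=1152, height=768,      # one of the SDXL-friendly preset resolutions
    num_inference_steps=28,
    guidance_scale=6.0,          # within the 2-8 range the guide suggests
).images[0]
image.save("sdxl_guide_example.png")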
Understanding the Impact of Negative Prompts: When and How Do They Take Effect?

The Dynamics of Negative Prompts in AI: A Comprehensive Study, by Yuanhao Ban (UCLA), Ruochen Wang (UCLA), Tianyi Zhou (UMD), Minhao Cheng (PSU), Boqing Gong, and Cho-Jui Hsieh (UCLA). Summary via Synthical.
This study addresses the gap in understanding the impact of negative prompts in AI diffusion models. By focusing on the dynamics of diffusion steps, the research aims to answer the question: "When and how do negative prompts take effect?". The investigation categorizes the mechanism of negative prompts into two primary tasks: noun-based removal and adjective-based alteration.
The role of prompts in AI diffusion models is crucial for guiding the generation process. Negative prompts, which instruct the model to avoid generating certain features, have been studied far less than their positive counterparts. This study provides a detailed analysis of negative prompts, identifying the critical steps at which they begin to influence the image generation process.
Findings
Critical Steps for Negative Prompts
Noun-Based Removal: The influence of noun-based negative prompts peaks at the 5th diffusion step. At this critical step, the negative prompt initially generates the target object at a specific location within the image and then neutralizes the positive noise through a subtractive process, effectively erasing the object. However, introducing a negative prompt in the early stages paradoxically results in the generation of the specified object. Therefore, the optimal timing for introducing these prompts is after the critical step.
Adjective-Based Alteration: The influence of adjective-based negative prompts peaks around the 10th diffusion step. During the initial stages, the absence of the object leads to a subdued response. Between the 5th and 10th steps, as the object becomes clearer, the negative prompt accurately focuses on the intended area and maintains its influence.
Cross-Attention Dynamics
At the peak around the 5th step for noun-based prompts, the negative prompt attempts to generate objects in the middle of the image, regardless of the positive prompt's context. As this process approaches its peak, the negative prompt begins to assimilate layout cues from its positive counterpart, trying to remove the object. This represents the zenith of its influence. For adjective-based prompts, during the peak around the 10th step, the negative prompt maintains its influence on the intended area, accurately targeting the object as it becomes clear.
The study highlights the paradoxical effect of introducing negative prompts in the early stages of diffusion, which leads to the unintended generation of the specified object. This finding suggests that the timing of negative prompt introduction is crucial for achieving the desired outcome.
Reverse Activation Phenomenon
A significant phenomenon observed in the study is Reverse Activation. This occurs when a negative prompt, introduced early in the diffusion process, unexpectedly leads to the generation of the specified object within the context of that negative prompt. To explain this, the researchers borrowed the concept of the energy function from Energy-Based Models to represent the data distribution. Real-world distributions often feature elements like clear blue skies or uniform backgrounds, alongside distinct objects such as the Eiffel Tower. These elements typically possess low energy scores, making the model inclined to generate them.
The energy function is designed to assign lower energy levels to more 'likely' or 'natural' images according to the model's training data, and higher energy levels to less likely ones. A positive difference indicates that the presence of the negative prompt effectively induces the inclusion of this component in the positive noise: the negative prompt promotes the formation of the object within the positive noise. Without the negative prompt, the implicit guidance is insufficient to generate the intended object; applying one intensifies the distribution guidance towards the object rather than preventing it from materializing.
As a result, negative prompts typically do not attend to the correct place until around step 5, well after the positive prompt has taken effect. Using negative prompts in the initial steps can significantly skew the diffusion process, potentially altering the background.
Conclusions
Use at least 10 sampling steps; going beyond 25 steps makes no noticeable difference for negative prompting.
Negative prompts can complement your positive prompts, depending on how well the model and LoRA have learned their keywords, so they can be understood as an extension of their counterparts.
Over-weighting negative keywords may cause reverse activation and break your image; try to keep the relative influence of all your LoRAs and models balanced.
Reference
https://synthical.com/article/Understanding-the-Impact-of-Negative-Prompts%3A-When-and-How-Do-They-Take-Effect%3F-171ebba1-5ca7-410e-8cf9-c8b8c98d37b6?
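As a practical takeaway, here is a minimal diffusers sketch that follows the conclusions above: a plain, moderately worded negative prompt and a step count comfortably above 10 but capped around 25. The checkpoint id is only an example.

from diffusers import StableDiffusionPipeline
import torch

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # example checkpoint id
    torch_dtype=torch.float16,
).to("cuda")

# Keep steps in the useful 10-25 window and avoid heavily weighted negative keywords.
image = pipe(
    prompt="a quiet mountain lake at sunrise, soft mist over the water",
    negative_prompt="people, boats, buildings, text, watermark",
    num_inference_steps=25,
    guidance_scale=7.0,
).images[0]
image.save("negative_prompt_example.png")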
[ 🔥🔥🔥 SD3 MEDIUM OPEN DOWNLOAD - 2024.06.12 🔥🔥🔥]

Finally! It's happening! The Medium version will be released first!
Stability AI co-CEO Christian Laporte has announced the release of the weights: Stable Diffusion 3 Medium, the company's most advanced text-to-image model, will soon be available, and you can download the weights from Hugging Face starting Wednesday, June 12.
SD3 Medium is the SD3 model with 2 billion parameters, designed to excel in areas where previous models struggled. Key features include:
• Photorealism: Overcomes common artifacts in hands and faces to deliver high-quality images without complex workflows.
• Typography: Provides powerful typography results that surpass the latest large models.
• Performance: Optimized size and efficiency make it ideal for both consumer systems and enterprise workloads.
• Fine-Tuning: Can absorb fine details from small datasets, perfect for customization and creativity.
SD3 Medium weights and code are available for non-commercial use only. If you wish to discuss a self-hosting license for commercial use of Stable Diffusion 3, please fill out the form and the Stability team will contact you shortly.
What exactly are the "node" and the "workflow" in AI image platform (explanation for the beginner)

The Traditional Way of Generating AI Images for the Beginner
If you are a beginner in the AI community, you may be very confused and have no clue about what a "Node" and a "Workflow" are, and how they relate to "AI Tools" on TensorArt.
To start in the simplest way, we first need to mention how a user generates an image with the "Remix" button, which brings us to the normal creation menu. Needless to say, by just editing the prompt (what you would like your picture to look like) and the negative prompt (what you do not want to see in the output image), then pushing the Generate button, the wonderful AI tool will kindly draw a new illustration for you within a minute!
That sounds great, don't you think? Especially if we imagine how much time humans spent in the past to publish just one single piece of art. (Although today, in 2024, in my personal opinion, neither AI nor human abilities are fully replaceable, especially when it comes to beautiful, perfect hands :P)
However, the backbone of what happens behind the user-friendly menu, which allows us to "Select model", "Add LoRA", "Add ControlNet", "Set the aspect ratio (the original size of the image)" and so on, is a collection of "Nodes" in a very complex "Workflow".
PS.1. The Checkpoint and the Model often refer to the same thing: the core program that has been trained to draw the illustration. Each one has its strengths and weaknesses (i.e. anime-oriented or realistic-oriented).
PS.2. A LoRA (Low-Rank Adaptation) is like an add-on to the Model, allowing it to adapt to a different style, theme, or user preference. A concrete example is an anime character LoRA.
PS.3. ControlNet is like a condition setting for the image. It helps the model truly understand what the text prompt alone cannot describe, for instance how a character poses in each direction and the angle of the camera.
So here comes "ComfyFlow" (the nickname of the Workflow; people also call it "ComfyUI"), which gave me a super headache the first time I saw something like this in my life! (The image shown is a flow I have spent a lot of time studying; it combines what is in two images into a single one.)
Yeah, maybe it is my fault that I did not take a class about workflows from the beginning or search for a tutorial on YouTube the first time (as English is not my first language). But wouldn't it be better if we had an instructor to walk us through it step by step here on Tensor.Art? And that is the reason I was inspired to write this article solely for the beginner. So let's start with the main content.
What is ComfyFlow?
ComfyFlow, or the Workflow, is an innovative AI image-generating platform that allows users to create stunning visuals with ease. To get the most out of this tool, it's important to understand two key concepts: "workflow" and "node". Let's break these down in the simplest way possible.
What is a Workflow?
A workflow is like a blueprint or a recipe that guides the creation of an image. Just as a recipe outlines the steps to make a dish, a workflow outlines the steps and processes needed to generate an image.
It's a sequence of actions that the AI follows to produce the final output. Think of it like this:
Recipe (Workflow): tells you what ingredients to use and in what order.
Ingredients (Nodes): each step or component used in the recipe.
Despite the recommended pre-set templates that TensorArt kindly gives to users, from a beginner's viewpoint, without knowledge of workflows, they are not that helpful, because after clicking the "Try" button we are bombarded with the complexity of the nodes!
What is a Node?
Nodes are the building blocks of a workflow. Each node represents a specific action or process that contributes to the final image. In ComfyFlow, nodes can be thought of as individual steps in the workflow, each performing a distinct function. Imagine nodes as parts of a puzzle: individual pieces that fit together to complete the picture (the workflow).
How Do Workflows and Nodes Work Together?
1-2) Starting point: every workflow begins with initial nodes, which might be an image input from the user, together with the Checkpoint and LoRA serving as image references.
3-4) Processing nodes: nodes that draw or modify the image in some way, such as adding color or texture, or applying filters.
5) Ending point: the final node outputs the completed image, working closely with the previous stage's nodes for sampling and VAE decoding.
PS. A Variational Autoencoder (VAE) is a generative model that learns from input data, such as images, to reconstruct and generate new, similar images or variations based on the patterns it has learned.
Here is the list of nodes I used for a normal image generation of my waifu, with one checkpoint and two LoRAs, to help the reader understand how ComfyFlow works. The numbers 1-5 represent the overall flow and the role of each type of node mentioned above. However, in more complex tasks like AI Tools, the number of nodes is sometimes higher than 30!
By the way, when starting with an empty ComfyFlow page, the way to add a node is "Right Click" -> "Add Node" -> then scroll to the top, since the most frequently used nodes are listed there.
1) loaders -> Load Checkpoint
As in the normal task creation menu, this node is where we choose the Checkpoint, or core model. It is important to note that nodes work together through inputs and outputs. The "MODEL/CLIP/VAE" output circles have to connect to the corresponding inputs of the next node. We link them by left-clicking inside the circle and dragging to the destination.
PS. CLIP (Contrastive Language-Image Pre-training) is a model developed by OpenAI that links images and text together in a way that helps AI understand and generate images based on textual descriptions.
2) loaders -> Load LoRA
The checkpoint is very closely related to the LoRA, which is why they are connected through the inputs/outputs named "model/MODEL" and "clip/CLIP". Since in this example I used two LoRAs (the first for the theme of the picture and the second as the character reference for my waifu), the two LoRA nodes also have to be connected to each other. Here we can adjust the strength (weight) of each LoRA, just as in the normal task generation menu.
3) CLIP Text Encode (Prompt)
This node holds the prompt and negative prompt we normally see in the menu.
The only input here is clip (Contrastive Language-Image Pre-training), and the output is "CONDITIONING".
User tip: if you click on an output circle of the "Load LoRA" node and drag it to an empty area, ComfyFlow will pop up a list of compatible next nodes so you can create one with ease.
4) KSampler & Empty Latent Image
The sampling method tells the AI how to start generating visual patterns from the initial noise, and everything related to its adjustment is set in this sampling node together with the "Empty Latent Image" node. The inputs at this step are the model (from the LoRA node) and the positive and negative conditioning (from the prompt nodes), and the output is "LATENT".
5) VAE Decode & final output node
Once the sampling node is set up, its "LATENT" output connects to "samples", while "vae" links this node back to the "Load Checkpoint" node from the beginning. When everything is done, the "IMAGE" output, the final result, is served into your hands.
PS. An AI Tool is a more complex workflow created for a specific task, such as swapping a face in the original picture with a target face, or changing the style of an input illustration to another one, and so on.
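The one-checkpoint-plus-two-LoRAs setup described above has a rough code equivalent outside ComfyUI. Here is an unofficial diffusers sketch (it needs the peft package for multi-adapter support); the checkpoint id and LoRA filenames are placeholders, not the author's actual files.

from diffusers import StableDiffusionXLPipeline
import torch

# Load Checkpoint
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # example checkpoint
    torch_dtype=torch.float16,
).to("cuda")

# Load LoRA x2: one for the overall theme, one for the character (filenames are placeholders).
pipe.load_lora_weights("./loras/theme_style.safetensors", adapter_name="theme")
pipe.load_lora_weights("./loras/waifu_character.safetensors", adapter_name="character")
pipe.set_adapters(["theme", "character"], adapter_weights=[0.7, 0.9])  # per-LoRA strength

# CLIP Text Encode (prompts), KSampler settings and VAE decoding all happen inside this call.
image = pipe(
    prompt="1girl, silver hair, ornate festival kimono, lantern-lit street at night",
    negative_prompt="lowres, bad hands, watermark",
    num_inference_steps=28,
    guidance_scale=6.5,
).images[0]
image.save("two_lora_example.png")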
PhotoReal Makeup Edition - V3 Slider

PhotoReal Makeup Edition - V3 Slider (no trigger)Introducing the PhotoReal Makeup Edition - V3 Slider! Slide to the right to add beautiful, realistic makeup. Slide to the left to reduce the makeup effect for a more natural look. It's perfect for adjusting the makeup to get just the style you want.Try it out and see the amazing changes you can make!More Information:- Model linkYour feedback is invaluable to me. Feel free to share your experiences and suggestions in the comment section. For more personal interactions, join our Discord server where we can discuss and learn together.Thank you for your continued support!

Tips for new Users

Intro Hey there! If you're reading this, you're probably new to AI image generation and want to learn more. If you're not, you probably already know more than me :). Yeah, full disclosure: I'm still pretty inexperienced at this whole thing, but I thought I could still share some of the things I've learned with you! So, in no particular order:1. You can like your own posts I doubt there's anyone who doesn't know this already, but if you're posting your favorite generations and you care about getting likes, you can always like them yourself. Sketchy? Kinda. Do I still do it? Yes. And on the topic of getting more likes:2. Likes will often be returned Whenever I receive a like on one of my posts, I'll look at that person's pictures and heart any that I particularly enjoy. I know a lot of people do this, so one of the best ways to get people to notice and like your content is to just browse through posts and be generous with your own likes. It's a great way to get inspiration too!3. Use turbo/lightning LORAs If you find yourself running out of credits, there are ways to conserve them. When I'm iterating on an idea, I'll use a SDXL model (Meina XL) paired with this LORA. This lets me get high quality images in 10 steps for only 0.4 credits! It's really nice, and works with any SDXL model. Unfortunately, if there is a similar method for speeding up SD 1.5 models I don't know it, so it only works with XL.4. Use ADetailer smartly ADetailer is the best solution I've found for improving faces and hands. It's also a little difficult to figure out. So, though I'm still not a professional with it, I thought I could share some of the tricks I've learned. The models I normally use are face_yolo8s.pt and hand_yolo8s.pt. The "8s" versions are better than the "8n" versions, though they are slightly slower. In addition to these models, I'll often add the Attractive Eyes and Perfect Hand LORAs respectively. These are all just little things you can do to improve these notoriously hard parts of image generation. Also, using ADetailer before upscaling the image is cheaper in terms of credits, though the upscaling process can sometimes mess up the hands and face a little bit so there's some give and take there.5. Use an image editing app Wait a minute, I hear you saying, isn't this a guide for using Tensor Art? Yes, but you can still use other tools to improve your images. If I don't like a specific part of my image, I'll download it, open it in Krita (Or Photoshop or Gimp) and work on it. My art skills are pretty bad, (which is why I'm using this site in the first place,) but I can still remove, recolor, or edit certain aspects of the image. I can then reupload it to Tensor Art, and Img2img with a high denoising strength to improve it further. You could also just try inpainting the specific thing you want to change, but I always find it a bit of a struggle to get inpaint to make the changes I want.6. Experiment! The best way to learn is to do, so just start generating images, fiddling with settings, and trying new things. I still feel like I'm learning new stuff every day, and this technology is improving so fast that I don't think anyone will ever truly master it. But we can still try our hardest and hone our skills through experimentation, sharing knowledge, and getting more familiar with these models. And all the anime girls are a big plus too.Outro If you have anything to add, or even a tip you'd like to share, definitely leave a comment and maybe I can add it to this article. 
This list is obviously not exhaustive, and I'm nowhere near as talented as some of the people on this platform. Still, I hope I have helped at least one person today. If that was you, maybe give the article a like? I appreciate it a ton, so if you enjoyed it, just let me know. Thanks for reading!
• MOOD MAGIC SERIES • I. Melancholy

MOOD MAGIC: adding emotion to your promptsMelancholy & GloomOvercast: Cloud-covered skies for subdued lighting.Dim Lighting: Limited light sources for creating deep shadows.Muted Colors: Toned-down color palette to convey sadness or desolation.Dusky: Twilight ambiance, suggesting the fading light of day.Foggy: A thick mist that obscures details and softens the scene.Drizzly: Gentle rain that adds a reflective, melancholic quality.Cloudy: Thick clouds that reduce brightness and saturate the scene with grey.Desaturated: Low color saturation to enhance the bleak feel.Shadowed: Prominent shadows that deepen the mood.Moody Lighting: Emotionally charged lighting with strong contrasts.Gloomy: Overall dark and dismal atmosphere.Monochrome: Black and white or single-color dominance to strip away cheer.Underexposed: Darker exposure to mimic a sense of foreboding.Chiaroscuro: Strong contrasts between light and dark, emphasizing turmoil.Hazy: Blurred or smoky atmosphere, creating a sense of mystery or unease.Twilight: Dim natural lighting that can feel lonely or isolating.Stormy: Implication of an approaching or ongoing storm to add tension.Wintery: Cold, barren landscape cues, even in urban settings.Grainy: Visual noise that adds an old or troubled quality.Bleak: Stark, harsh lighting or barren scenery settings.Ominous Clouds: Dark, menacing clouds that threaten bad weather.Subdued Tones: Soft, low-key colors that don't catch the eye.Cold Colors: Blues and greys to suggest chilliness and discomfort.Rusty: Implications of decay and neglect.Aged: A sense of time wearing down the scene, historical weariness.Soft Focus: Slightly out-of-focus elements to create a sense of disorientation or confusion.Tenebrous: Deeply shadowed, almost pitch-dark.Low-Key Lighting: Minimal lighting mostly in darkness with occasional highlights.Pensive: Engaged in, involving, or reflecting deep or serious thought.Yearning: A feeling of intense longing for something typically something that one has lost or been separated from.Weary: Conveying a sense of tiredness or exhaustion, both physical and emotional.Sparse: Minimalist or bare settings that suggest simplicity or emptiness.Brooding: A deep, serious, and sometimes dark contemplation.Silent: Lack of sound or motion, emphasizing solitude or contemplation.Ephemeral: Fleeting or transitory, suggesting the transient nature of moments and emotions.Desolate: Emptiness that conveys a sense of abandonment or loneliness.Poetic: Imbued with a sense of beauty and melancholy, often through lyrical expression.Moody Skies: Cloudy, stormy, or unsettled skies that reflect a turbulent emotional landscape.Cold Light: Harsh, unyielding light that doesn’t warm but isolates subjects.Autumnal: Related to autumn, often seen as a melancholic season due to its association with the end of summer.Faded: Colors or elements that have lost brightness, suggesting the passing of time.Blue Hour: Moody cool natural lighting obtained in the twilight hour just after sunset or just before sunrise.Example using Stable Diffusion SDXL + refinerCheckpoint: RealVis4Cfg: 5.5Steps: 40Sampler: DPM++ 3m SDE KarrasVisualize a close-up portrait of a young woman standing by a foggy window, her gaze distant and contemplative. The room is dimly lit, with only a soft, diffuse light filtering through the heavy overcast outside, casting subtle shadows across her face. The colors are desaturated, emphasizing a palette of cool grays and muted blues that reflect her somber mood. 
Her expression is serene yet melancholic, with her eyes slightly downcast as if lost in thought. The background is blurred, enhancing the sense of isolation and introspection. This portrait captures the essence of melancholy, framed in a moment of quiet solitude.negative: illustration, cartoon, anime, 3d, digital art, bad quality, CGI, sketch, drawn, blurry, painting, worst quality, low quality, bad anatomy, bad hands, bad body, missing fingers, extra digit, fewer digits
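If you want to reproduce a setup like this outside the site, here is a hedged diffusers sketch: the checkpoint id is a generic SDXL placeholder standing in for RealVis4, and the "DPM++ 3M SDE Karras" sampler is approximated with the corresponding multistep DPM-solver options.

from diffusers import StableDiffusionXLPipeline, DPMSolverMultistepScheduler
import torch

# Placeholder checkpoint id; swap in the RealVis4-style SDXL model you actually use.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# "DPM++ 3M SDE Karras" roughly corresponds to these scheduler settings in diffusers.
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config,
    algorithm_type="sde-dpmsolver++",
    solver_order=3,
    use_karras_sigmas=True,
)

image = pipe(
    prompt=("close-up portrait of a young woman by a foggy window, distant contemplative gaze, "
            "dim diffuse light, desaturated cool greys and muted blues, melancholic, quiet solitude"),
    negative_prompt="illustration, cartoon, anime, 3d, bad quality, blurry, bad anatomy, bad hands",
    num_inference_steps=40,
    guidance_scale=5.5,
).images[0]
image.save("melancholy_portrait.png")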
Buzz words: LIGHTING

Getting the lighting right is key to making your AI-generated images look super realistic. This guide gives you the top keywords to use in your prompts to nail the lighting every time. Whether you're after dramatic shadows or soft, natural light, these tips will help your images look lifelike and set the tone of your composition.
Ambient light: Soft, even lighting that fills the entire scene, reducing shadows.
Chiaroscuro lighting: A technique that uses strong contrasts between light and dark to create a dramatic, three-dimensional effect.
Rim light: Light that outlines the subject, emphasizing its edges and creating a glowing effect.
Diffused light: Soft light scattered in many directions, minimizing harsh shadows.
Natural light: Light from the sun, moon, or other natural sources, offering realism and variation.
Backlight: Light coming from behind the subject, creating a silhouette or halo effect.
Volumetric light: Light that interacts with particles in the air, such as fog or dust, creating visible light rays and enhancing the sense of depth in the scene.
Polarized light: Light that vibrates in parallel planes.
Emissive light: Light emitted from surfaces or objects themselves, often used to simulate glowing materials or lights.
Directional light: Focused light from a specific direction, creating strong shadows and highlights.
Soft light: Gentle light that produces minimal shadows, creating a smoother look.
Hard light: Sharp, intense light that casts strong shadows and highlights details.
Spotlight: An intense, focused beam that highlights a set area or subject.
Artificial light: Light from man-made sources, allowing precise control over the scene. Examples: halogen, fluorescent, blacklight, LED, xenon, plasma, ultraviolet, incandescent, neon, infrared, sodium vapor lights, metal halide lights, krypton, photoluminescent, ceramic metal halide, HMI, CCFL, CFL.
Low-key light: Predominantly dark lighting with high contrast, often creating a dramatic or moody atmosphere.
High-key light: Bright, low-contrast lighting that minimizes shadows.
Bounce lighting / Reflected lighting: Light reflected off a surface to soften the effect and spread it more evenly.
Side lighting: Light coming from the side of the subject.
Caustic lighting: Light patterns created when light is refracted or reflected through transparent or reflective materials, producing intricate and often beautiful effects.
Uplighting: Light directed upwards.
Great for emphasizing architectural features.Color Gel Lighting:The use of colored filters over lights to alter the color or mood of the scene.Gobo Lighting:Using a stencil or template placed in front of a light source to project patterns or shapes onto a surface.Split Lighting:Lighting that illuminates one half of the subject's face while leaving the other half in shadow, creating a strong, dramatic effectButterfly Lighting:Light placed above and in front of the subject, creating a butterfly-shaped shadow under the nose, often used in glamour photography.Rembrandt Lighting:technique where light creates a triangle of illumination on the cheek opposite the light source, adding depth and character.Specular lighting:Sharp, bright reflections from shiny surfaces, emphasizing glossiness and texture.Natural Breakup Lighting/Dappled Lighting:Using irregular patterns to mimic natural light effects, such as light filtering through leaves.Subsurface Scattering:Light that penetrates the surface of a translucent material, scattering within and then exiting at a different point, adding realism to materials like skin or wax.Golden Hour:Warm golden natural lighting obtained shortly after sunrise or shortly before sunset. Creates long soft shadows.Blue Hour:Moody cool natural lighting obtained in the twilight hour just after sunset or just before sunrise.Clamshell Lighting:portrait lighting setup using two light sources, one above and one below the subject's face.Catch light:A small reflection of the light source in the subject's eyes, adding life and dimension to portraits.Cross lighting:two light sources positioned at opposite sides of the subject, creating dramatic shadows and highlights.Tenebrism:Aggressive contrast between light and dark producing dark and gloomy images.Contre-jour:Lighting technique that produces clear silhouettes by the use of backlighting.Sfumato:Artistic lighting technique soft transitions between colors and tones resulting in a dreamy effect with no clear boundaries. Ie. The Mona Lisa.Ray tracing: Rendering technique that simulates the way the light interacts with the scene. Traces the light from the source, bounces off surfaces and reaches the viewers eye. Three point lighting:Cinematic lighting technique using key light, fill light and backlight. Global Illumination: Computer graphic technique that adds more realistic lighting to 3d scenery. Bloom: simulates the glow around bright light sources, creating a soft halo. Luminescence:emission of light by a substance not resulting from heat. It occurs through various processes such as chemical reactions, electrical energy, or other means.Bioluminescence:A cold light produced out of a chemical reaction inside of a living organism.
Quickstart Guide to Stable Video Diffusion

What is Stable Video Diffusion (SVD)?
Stable Video Diffusion (SVD), from Stability AI, is an extremely powerful image-to-video model which accepts an image input, into which it "injects" motion, producing some fantastic scenes. SVD is a latent diffusion model trained to generate short video clips from image inputs. There are two models: the first, img2vid, was trained to generate 14 frames of motion at a resolution of 576×1024, and the second, img2vid-xt, is a finetune of the first, trained to generate 25 frames of motion at the same resolution. The newly released (2/2024) SVD 1.1 is further finetuned on a set of parameters to produce excellent, high-quality outputs, but requires specific settings, detailed below.
Why should I be excited by SVD?
SVD creates beautifully consistent video movement from our static images!
How can I use SVD?
ComfyUI is leading the pack when it comes to SVD image generation, with official SVD support! 25 frames of 1024×576 video uses less than 10 GB of VRAM to generate. It's entirely possible to run the img2vid and img2vid-xt models on a GTX 1080 with 8 GB of VRAM! There's still no word (as of 11/28) on official SVD support in Automatic1111.
If you'd like to try SVD on Google Colab, this workbook works on the free tier: https://github.com/sagiodev/stable-video-diffusion-img2vid/. Generation time varies, but is generally around 2 minutes on a V100 GPU.
You'll need to download one of the SVD models from the links below, placing them in the ComfyUI/models/checkpoints directory. After updating your ComfyUI installation, you'll see new nodes for VideoLinearCFGGuidance and SVD_img2vid_Conditioning. The Conditioning node takes the inputs listed below. You can download ComfyUI workflows for img2video and txt2video below, but keep in mind you'll need an updated ComfyUI and may be missing additional nodes for video. I recommend using the ComfyUI Manager to identify and download missing nodes!
Suggested Settings
The settings below are suggested settings for each SVD component (node), which I've found produce the most consistently usable outputs with the img2vid and img2vid-xt models.
Settings – img2vid-xt-1.1
February 2024 saw the release of a finetuned SVD model, version 1.1. This version only works with a very specific set of parameters to improve the consistency of outputs. If using the img2vid-xt-1.1 model, the following settings must be applied to produce the best results.
The easiest way to generate videos
In tensor.art you can generate videos much more easily than the approach explained above: all you need to do is input the prompt you want, select the model you like, set the ratio, and set the frame count in the AnimateDiff menu.
Output Examples
Limitations
It's not perfect! Currently there are a few issues with the implementation, including:
Generations are short! Only <=4 second generations are possible at present.
Sometimes there's no motion in the outputs. We can tweak the conditioning parameters, but sometimes the images just refuse to move.
The models cannot be controlled through text.
Faces, and bodies in general, often aren't the best!
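If you prefer to script SVD rather than use ComfyUI, here is a minimal, unofficial diffusers sketch of the img2vid-xt model; the input image path is a placeholder and the motion parameters are just reasonable starting values, not the article's exact settings.

from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video
import torch

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

# SVD expects a 1024x576 (or 576x1024) input image; "input.jpg" is a placeholder path.
image = load_image("input.jpg").resize((1024, 576))

frames = pipe(
    image,
    decode_chunk_size=8,        # lower this if you run out of VRAM
    motion_bucket_id=127,       # higher values give more motion
    noise_aug_strength=0.02,    # small noise helps the image "let go" of being static
).frames[0]

export_to_video(frames, "svd_output.mp4", fps=7)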
List of style collection - focusing on anime character examples (continue updating)

AI image-generating platforms like Tensor.art offer diverse anime styles, enabling users to create artwork in various distinct masterpieces of art inspired by popular anime aesthetics. These collections aim to cater to different preferences from classic to contemporary anime illustrations within one place.P.S.1 I will continue updating this post maybe every 2 weeks when I find a unique style (both for LoRA and model) that is worth listing here solely from my perspective - Anyway if anyone has a list of favorite styles in mind, feel free to share them here or even create your post. :DP.S.2 People normally mix multiple LoRA at once, and the core model (checkpoint) has a variation in base style depending on the prompt used. Therefore, in the following example, I will choose only a single LoRA or Checkpoint to represent without mixing anything. However, if confusion about the contribution to the style happens, I have to apologize in advance since I am just a beginner in the art community. Here are some examples: Anime Lineart / Manga-like (线稿/線画/マンガ風/漫画风) Style (LORA) https://tensor.art/models/623935989624337542 Spacezin Sketch Style (LoRA) https://tensor.art/models/638083414328801488 Cute Chibi - V.1 (LoRA) https://tensor.art/models/726716640076597245 CAT - Citron Anime Treasure (Checkpoint) https://tensor.art/models/713607777118974323 LizMix V.7.0 (Checkpoint) https://tensor.art/models/721034681811855891 Flower style - (LORA) https://tensor.art/models/699582840586758007 Art Nouveau Style - Oosayam (LoRA) https://tensor.art/models/654562112921690173 Torino Style - v.2.0.09 (LoRA) https://tensor.art/models/705577639974520212 Yody PVC 3D Print - 1.0 (Checkpoint) https://tensor.art/models/673632484975460872 Eldritch Expressionism style (LoRA) https://tensor.art/models/708171473803739178 [Y5] Impressionism Style 印象派风格 (LoRA) https://tensor.art/models/621173217551417505 surrealism - 2024-02-17 (LoRA) https://tensor.art/models/695557949424221333 pop-art - 01 style (LoRA) https://tensor.art/models/697182692602582375 FF Style: Kazimir Malevich | Suprematism (LoRA) https://tensor.art/models/655758742350092928 Hoping these collections (today and in the future) will allow A.I. artists and enthusiasts to generate anime-inspired images effortlessly, blending creativity with advanced AI technology to bring their visions to life. :D
25
2
Prompt reference for "Lighting Effects"

Prompt reference for "Lighting Effects"

Hello. I usually add "lighting / lighting effects" terms when generating images, so I will introduce some of the words I use. Please note that these words alone do not guarantee the effect: the result will differ depending on the base model, the LoRA, the sampling method, and where you place the word in the prompt.
Words related to "lighting effects"
・ Backlight : light from behind the subject
・ Colorful lighting : the subject itself is not recolored, but the color changes depending on the light
・ Moody lighting : natural, atmospheric lighting rather than direct artificial light
・ Studio lighting : the artificial lighting of a photography studio
・ Directional light : a light source that shines parallel rays in a single chosen direction
・ Dramatic lighting : high-impact lighting techniques from photography
・ Spot lighting : artificial light concentrated on a small area
・ Cinematic lighting : a catch-all term for several lighting techniques used in movies
・ Bounce lighting : light reflected from a reflector or similar surface
・ Practical lighting : the light source itself appears in the composition
・ Volumetric lighting : a term from 3DCG; tends to produce divine-looking golden shafts of light
・ Dynamic lighting : hard to pin down, but tends to create high-contrast images
・ Warm lighting : a warm picture illuminated with warm colors
・ Cold lighting : a cold-toned light source
・ High-key lighting : soft light, minimal shadows, low contrast, resulting in bright frames
・ Low-key lighting : high contrast, though the overall effect can be somewhat subdued
・ Hard light : strong light; highlights appear intense
・ Soft light : faint, diffused light
・ Strobe lighting : strong artificial light (stroboscopic lighting)
・ Ambient light : ambient / indoor lighting
・ Flash lighting : for some reason the characters themselves tend to emit light, and there are often flashes of light (flash lighting photography)
・ Natural lighting : tends to create a natural-looking picture, in contrast with artificial light
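As an illustrative example of how I combine these (my own test prompt, not tied to any particular model):
1girl, silver hair, black dress, standing in a cathedral, backlight, volumetric lighting, cinematic lighting, high contrast
How strongly each word shows up still depends on the base model and where it sits in the prompt, so it is worth adding them one at a time and comparing.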
67
14
The future of AI image generation: endless possibilities -

The future of AI image generation: endless possibilities -

Introduction {{For those who are about to start AI image generation}}
In recent years, advances in AI technology have brought about revolutionary changes in the field of image generation. In particular, AI-powered illustration generation has become a powerful tool for artists and designers. However, as this technology advances, issues of creativity and copyright arise. In this article, we will explain the possibilities of AI image generation, specific use cases, how to create prompts, how to use LoRA and its effects, keywords for improving image quality, copyright considerations, and more.
Fundamentals of AI image generation
AI image generation uses artificial intelligence to learn from data and generate new images. Deep learning techniques are often used for this, and one notable approach is Stable Diffusion. Stable Diffusion employs a probabilistic method called a diffusion model to gradually remove noise during image generation, resulting in highly realistic, high-quality output.
Generating realistic images
AI technology is excellent not only for creating cute illustrations, but also for generating realistic images. For example, you can generate high-resolution images that resemble photorealistic landscapes or portraits. By utilizing Stable Diffusion, it is possible to generate more detailed images, which expands the possibilities of application in various fields such as advertising, film production, and game design.
Generating cute illustrations
One of the practical applications of AI image generation is the creation of cute illustrations. This is useful for things like character design and avatar creation, allowing you to quickly generate different styles. This process typically involves collecting a large dataset of illustrations, training an AI model on this data to learn different styles and patterns, and generating new illustrations based on user input or keywords.
Creativity and AI
AI image generation also influences creative ideas. Artists can use AI-generated images as inspiration for new works or to expand on ideas, which can lead to the creation of new styles and concepts never thought of before.
Use and effects of LoRA
LoRA (Low-Rank Adaptation) is a technique used to improve the performance of AI models. Its benefits include:
1. Fine-tuning models: LoRA allows you to fine-tune existing AI models to learn specific styles and features, allowing for customization based on user needs.
2. Efficient learning: LoRA reduces the need for large-scale data collection and training costs by efficiently training models using small datasets.
3. Rapid adaptation: LoRA allows you to quickly adapt to new styles and trends, making it easy to generate images tailored to current needs.
For example, LoRA can be leveraged to efficiently achieve high-quality results when generating illustrations in a specific style. (A minimal sketch of the underlying low-rank idea follows this article.)
Creating a prompt
When instructing an AI to generate illustrations, it is important to create effective prompts. Key points include providing specific instructions, using the right keywords, trial and error, and optionally a reference image to help the AI understand what you are looking for.
Keywords for improving image quality
When creating prompts for AI image generation, you can incorporate keywords related to image quality to improve the overall quality of the generated images. Useful keywords include "high resolution," "detail," "clean lines," "high quality," "sharp," "bright colors," and "photorealistic."
Copyright considerations
Image generation using AI also raises copyright issues. If the dataset used to train an AI model contains copyrighted works, the resulting images may infringe copyright. When using AI image generation tools, it is important to be aware of the data source, ensure that the generated images comply with copyright law, and check the license agreement.
Conclusion
AI image generation offers great possibilities for artists and designers, but it also raises challenges related to copyright. By using data responsibly and understanding copyright law, you can leverage AI technology to create innovative work. Leveraging techniques like LoRA can further improve efficiency and quality, and users can adjust the output by incorporating image-enhancement keywords into the prompt. Let's explore new ways of expression while staying aware of advances in AI technology and the considerations that come with them!
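To make the "low-rank" part of LoRA concrete, here is a minimal, illustrative PyTorch sketch of the idea. The class and argument names are mine, and real trainers apply this to the attention layers of the diffusion model rather than to a standalone linear layer:

```python
# Sketch of the low-rank idea behind LoRA: a frozen weight W is adapted as
# W + (alpha / r) * B @ A, where A and B are small trainable matrices of rank r.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # the original weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r               # the dim/alpha ratio seen in trainer UIs

    def forward(self, x):
        # base output plus the low-rank update; only A and B receive gradients
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

layer = LoRALinear(nn.Linear(768, 768), r=16, alpha=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # ~24k trainable parameters vs ~590k frozen ones
```

This is why LoRA files are so small and cheap to train: only the two rank-r matrices are learned and shipped, while the base model stays untouched.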
32
21
Stylistic QR Code with Stable Diffusion

Stylistic QR Code with Stable Diffusion

source: anfu.me (you can now easily create QR codes inside Tensor.art with ControlNet; next time I will write a guide about that)
Yesterday, I created this image using Stable Diffusion and ControlNet, and shared it on Twitter and Instagram – an illustration that also functions as a scannable QR code.
The process of creating it was super fun, and I'm quite satisfied with the outcome. In this post, I would like to share some insights into my learning journey and the approaches I adopted to create this image. Additionally, I want to take this opportunity to credit the remarkable tools and models that made this project possible.
Get into Stable Diffusion
This year has witnessed an explosion of mind-boggling AI technologies, such as ChatGPT, DALL-E, Midjourney, Stable Diffusion, and many more. As a former photographer, also with some interest in design and art, being able to generate images directly from imagination in minutes is undeniably tempting.
So I started by trying Midjourney. It's super easy to use, very expressive, and the quality is actually pretty good. It would honestly be my recommendation for anyone who wants to get started with generative AI art. By the way, Inès has also delved into it and become quite good at it; go check her work on her new Instagram account @a.i.nes.
On my end, being a programmer with strong preferences, I would naturally seek greater control over the process. This brought me to the realm of Stable Diffusion. I started with this guide: Stable Diffusion LoRA Models: A Complete Guide. The benefit of being late to the party is that there are already a lot of tools and guides ready to use. Setting up the environment was quite straightforward, and luckily my M1 Max's GPU is supported.
QR Code Image
A few weeks ago, nhciao on Reddit posted a series of artistic QR codes created using Stable Diffusion and ControlNet. The concept behind them fascinated me, and I definitely wanted to make one of my own. So I did some research and managed to find the original article in Chinese: Use AI to Generate Scannable Images. The author provided insights into their motivations and the process of training the model, although they did not release the model itself. They are building a service called QRBTF.AI to generate such QR codes, but it is not yet available.
Then one day I found a community model, QR Pattern Controlnet Model, on CivitAI. I knew I had to give it a try!
Setup
My goal was to generate a QR code image that directs to my website while containing elements that reflect my interests. I ended up going with a slightly cypherpunk style and a character representing myself :P
Disclaimer: I'm certainly far from being an expert in AI or related fields. In this post, I'm simply sharing what I've learned and the process I followed. My understanding may not be entirely accurate, and there are likely optimizations that could simplify the process. If you have any suggestions or comments, please feel free to reach out using the links at the bottom of the page. Thank you!
1. Setup Environment
I pretty much followed Stable Diffusion LoRA Models: A Complete Guide to install the web UI AUTOMATIC1111/stable-diffusion-webui, download models you are interested in from CivitAI, and so on. As a side note, I found that the user experience of the web UI is not super friendly; some of the issues are, I guess, architectural and might not be easy to improve, but luckily I found a pretty nice theme, canisminor1990/sd-webui-kitchen-theme, that improves a bunch of small things.
In order to use ControlNet, you will also need to install the Mikubill/sd-webui-controlnet extension for the web UI. Then you can download the QR Pattern Controlnet Model, put the two files (.safetensors and .yaml) under the stable-diffusion-webui/models/ControlNet folder, and restart the web UI.
2. Create a QR Code
There are hundreds of QR code generators full of ads or paid services, and we certainly don't need that fanciness – because we are going to make it much fancier 😝!
So I ended up finding the QR Code Generator Library, a playground for an open-source QR code generator. It's simple but exactly what I need! It's better to use a medium error correction level or above to make the code more easily recognizable later. A small tip: you can try different mask patterns to find a color distribution that better fits your design. (If you prefer to generate the code programmatically, see the short sketch at the end of this article.)
3. Text to Image
As in the regular Text2Image workflow, we need to provide some prompts for the AI to generate the image from. Here are the prompts I used:
Prompts
(one male engineer), medium curly hair, from side, (mechanics), circuit board, steampunk, machine, studio, table, science fiction, high contrast, high key, cinematic light, (masterpiece, top quality, best quality, official art, beautiful and aesthetic:1.3), extreme detailed, highest detailed, (ultra-detailed)
Negative Prompts
(worst quality, low quality:2), overexposure, watermark, text, easynegative, ugly, (blurry:2), bad_prompt, bad-artist, bad hand, ng_deepnegative_v1_75t
Then we need to go to the ControlNet section, upload the QR code image we generated earlier, and configure the parameters as suggested on the model homepage. Then you can generate a few images and see if they meet your expectations. You will also need to check whether the generated image is scannable; if not, you can tweak the Start controlling step and End controlling step to find a good balance between stylization and QR-code-likeness.
4. I'm feeling lucky!
After finding a set of parameters that I am happy with, I increase the Batch Count to around 100 and let the model generate variations randomly. Later I go through them and pick the one with the best composition and details for further refinement. This can take a lot of time, and also a lot of resources from your processors, so I usually start it before going to bed and leave it overnight.
Here are some examples of the generated variations (not all of them are scannable). From approximately one hundred variations, I ultimately chose the following image as the starting point. It has a pretty interesting composition while being less obvious as a QR code, so I decided to proceed with it and add a bit more detail. (You can compare it with the final result to see the changes I made.)
5. Refining Details
Update: I recently built a toolkit to help with this process; check my new blog post 👉 Refine AI Generated QR Code for more details.
The images generated by the model are not perfect in every detail. For instance, you may have noticed that the hand and face appear slightly distorted, and the three anchor boxes in the corner are less visually appealing.
We can use the inpaint feature to tell the model to redraw some parts of the image (it works better if you keep the same or similar prompts as the original generation). Inpainting typically requires a similar amount of time as generating a text-to-image, and it involves either luck or patience. Often, I use Photoshop to "borrow" some parts from previously generated images and use the spot healing brush tool to clean up glitches and artifacts. My Photoshop layers look like this:
After making these adjustments, I send the combined image back for inpainting again to ensure a more seamless blend, or to search for components that I didn't find in other images.
Specifically for the QR code, in some cases ControlNet may not have enough priority, causing the prompts to take over and resulting in certain parts of the QR code not matching. To address this, I overlay the original QR code image onto the generated image (as shown in the left image below), identify any mismatches, and use a brush tool to paint those parts with the correct colors (as shown in the right image below). I then export the marked image for inpainting once again, adjusting the Denoising strength to approximately 0.7. This ensures that the model overrides our marks while still respecting the colors to some degree. Ultimately, I iterate through this process multiple times until I am satisfied with every detail.
6. Upscaling
The recommended generation size is 920x920 pixels. However, the model does not always generate highly detailed results at the pixel level; as a result, details like the face and hands can appear blurry when they are too small. To overcome this, we can upscale the image, giving the model more pixels to work with. The SD Upscale script in the img2img tab is particularly effective for this purpose. You can refer to the guide Upscale Images With Stable Diffusion for more information.
7. Post-processing
Lastly, I use Photoshop and Lightroom for subtle color grading and post-processing, and we are done!
The one I ended up with does not have very good error tolerance; you might need to try a few times, or use a more forgiving scanner, to get it scanned :P
Using a similar process, I made another one for Inès.
Conclusion
Creating this image took me a full day, with a total of 10 hours of learning, generating, and refining. The process was incredibly enjoyable for me, and I am thrilled with the end result! I hope this post can offer you some fundamental concepts or inspire you to embark on your own creative journey. There is undoubtedly much more to explore in this field, and I am eager to see what's coming next!
Join my Discord server and let's explore more together! If you want to learn more about the refining process, check my new blog post: Refining AI Generated QR Code.
References
Here is the list of resources for easier reference.
Concepts
Stable Diffusion
ControlNet
Tools
Hardware & software I am using:
AUTOMATIC1111/stable-diffusion-webui - Web UI for Stable Diffusion
canisminor1990/sd-webui-kitchen-theme - Nice UI enhancement
Mikubill/sd-webui-controlnet - ControlNet extension for the web UI
QR Code Generator Library - QR code generator that is ad-free and customisable
Adobe Photoshop - The tool I used to blend the QR code and the illustration
Models
ControlNet models for QR codes (you can pick any one of them):
QR Pattern Controlnet Model
Controlnet QR Code Monster
IoC Lab Control Net
Checkpoint model (you can use any checkpoint you like):
Ghostmix Checkpoint - A very high quality checkpoint I use. You can use any other checkpoint you like.
Tutorials
Stable Diffusion LoRA Models: A Complete Guide - The one I used to get started
(Chinese) Use AI to Generate Scannable Images - Unfortunately the article is in Chinese and I didn't find an English version of it.
Upscale Images With Stable Diffusion - Enlarge the image while adding more details
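As a programmatic alternative to the web playground mentioned in step 2 (my own sketch, not what the article's author used), the control image can also be produced with the Python qrcode package; the URL and sizes below are placeholders:

```python
# Generate a QR code image to feed into the ControlNet unit.
# Requires: pip install "qrcode[pil]"
import qrcode

qr = qrcode.QRCode(
    error_correction=qrcode.constants.ERROR_CORRECT_M,  # medium or above, as recommended
    box_size=16,    # pixels per module, so the control image is large enough
    border=4,       # quiet zone around the code
    mask_pattern=3, # try patterns 0-7 for a nicer black/white distribution
)
qr.add_data("https://example.com")  # hypothetical target URL
qr.make(fit=True)

img = qr.make_image(fill_color="black", back_color="white")
img.save("qr_control.png")  # upload this image to the ControlNet section
```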
The Marvel of Tanjore Temple: A Timeless Treasure

The Marvel of Tanjore Temple: A Timeless Treasure

Introduction
The Tanjore Temple, also known as Brihadeeswarar Temple, is a striking example of India’s architectural grandeur and rich cultural heritage. Nestled in the historic town of Thanjavur in Tamil Nadu, this UNESCO World Heritage Site draws thousands of visitors each year, eager to marvel at its towering vimana (temple tower), intricate carvings, and vibrant history.
Historical Background
Built by the great Chola emperor Raja Raja Chola I in the 11th century, the Tanjore Temple stands as a testament to the ingenuity and vision of ancient Indian architects and artisans. Completed in 1010 AD, it celebrated its millennium in 2010, marking a thousand years of awe-inspiring presence.
Architectural Splendor
The Vimana
The most striking feature of the Tanjore Temple is its colossal vimana, which rises to a height of 66 meters. This towering structure is crowned with a massive dome, made from a single piece of granite weighing approximately 80 tons. This engineering marvel leaves historians and architects alike in awe, given the lack of modern machinery during its construction.
The Sanctum
At the heart of the temple lies the sanctum sanctorum, housing a massive Shiva lingam. The inner walls of the sanctum are adorned with exquisite frescoes and murals, depicting various mythological scenes and showcasing the artistic brilliance of the Chola period.
Intricate Carvings
Every inch of the Tanjore Temple is a canvas of intricate carvings. From the elaborate depictions of deities and mythological narratives on the walls to the ornate pillars and ceilings, the temple is a visual feast. These carvings not only serve as decorative elements but also provide a glimpse into the socio-cultural milieu of the Chola dynasty.
Cultural Significance
Religious Importance
The Tanjore Temple is dedicated to Lord Shiva and holds immense religious significance for Hindus. It is one of the largest temples in India and serves as a major pilgrimage site, especially during festivals like Maha Shivaratri. Devotees from across the country flock to the temple to seek blessings and participate in the vibrant festivities.
Artistic Heritage
The temple is a treasure trove of Chola art and architecture. The frescoes and murals, in particular, offer invaluable insights into the artistic and cultural landscape of the period. The depictions of dance forms, musical instruments, and attire provide a vivid picture of the era’s cultural richness.
Visiting Tanjore Temple
Best Time to Visit
The ideal time to visit Tanjore Temple is between October and March, when the weather is pleasant. The temple complex is open from early morning till evening, allowing visitors ample time to explore and soak in its magnificence.
How to Reach
Thanjavur is well-connected by road, rail, and air. The nearest airport is Tiruchirappalli International Airport, about 60 kilometers away. Thanjavur Junction is the nearest railway station, with regular trains from major cities like Chennai, Bangalore, and Coimbatore. Buses and taxis are also readily available for local transportation.
Accommodation
Thanjavur offers a range of accommodation options, from budget hotels to luxury resorts, catering to the diverse needs of travelers. Staying in the town allows visitors to explore not just the temple, but also other nearby attractions like the Thanjavur Royal Palace and the Saraswathi Mahal Library.
Conclusion
The Tanjore Temple is more than just an architectural marvel; it is a living testament to India’s rich cultural and religious heritage.
Its towering vimana, intricate carvings, and historical significance make it a must-visit destination for history enthusiasts, art lovers, and spiritual seekers alike. Plan your visit to this timeless treasure and immerse yourself in the grandeur of the Chola dynasty.
4
[Guide] Make your own Loras, easy and free

[Guide] Make your own Loras, easy and free

This article helped me create my first Lora and upload it to Tensor.art. Although Tensor.art has its own Lora Train feature, this article helps you understand how Lora creation really works.
🏭 Preamble
Even if you don't know where to start or don't have a powerful computer, I can guide you to making your first Lora and more!
In this guide we'll be using resources from my GitHub page. If you're new to Stable Diffusion I also have a full guide to generate your own images and learn useful tools.
I'm making this guide for the joy it brings me to share my hobbies and the work I put into them. I believe all information should be free for everyone, including image generation software. However, I do not support you if you want to use AI to trick people, scam people, or break the law. I just do it for fun. Also, here's a page where I collect Hololive loras.
📃 What you need
An internet connection. You can even do this from your phone if you want to (as long as you can prevent the tab from closing).
Knowledge about what Loras are and how to use them.
Patience. I'll try to explain these new concepts in an easy way. Just try to read carefully, use critical thinking, and don't give up if you encounter errors.
🎴 Making a Lora
It has a reputation for being difficult: so many options, and nobody explains what any of them do. Well, I've streamlined the process such that anyone can make their own Lora starting from nothing in under an hour, all while keeping some advanced settings you can use later on.
You could of course train a Lora on your own computer, granted that you have an Nvidia graphics card with 6 GB of VRAM or more. We won't be doing that in this guide though; we'll be using Google Colab, which lets you borrow Google's powerful computers and graphics cards for free for a few hours a day (some say it's 20 hours a week). You can also pay $10 to get up to 50 extra hours, but you don't have to. We'll also be using a little bit of Google Drive storage.
This guide focuses on anime, but it also works for photorealism. However, I won't help you if you want to copy real people's faces without their consent.
🎡 Types of Lora
As you may know, a Lora can be trained and used for:
A character or person
An artstyle
A pose
A piece of clothing
etc.
However, there are also different types of Lora now:
LoRA: The classic, works well for most cases.
LoCon: Has more layers which learn more aspects of the training data. Very good for artstyles.
LoHa, LoKR, (IA)^3: These use novel mathematical algorithms to process the training data. I won't cover them as I don't think they're very useful.
📊 First Half: Making a Dataset
This is the longest and most important part of making a Lora. A dataset is (for us) a collection of images and their descriptions, where each pair has the same filename (e.g. "1.png" and "1.txt"), and they all have something in common which you want the AI to learn. The quality of your dataset is essential: you want your images to have at least 2 examples each of different poses, angles, backgrounds, clothes, etc. If all your images are face close-ups, for example, your Lora will have a hard time generating full body shots (but it's still possible!), unless you add a couple of examples of those. As you add more variety, the concept will be better understood, allowing the AI to create new things that weren't in the training data. For example, a character may then be generated in new poses and in different clothes.
You can train a mediocre Lora with a bare minimum of 5 images, but I recommend 20 or more, and up to 1000.
As for the descriptions, for general images you want short and detailed sentences such as "full body photograph of a woman with blonde hair sitting on a chair". For anime you'll need to use booru tags (1girl, blonde hair, full body, on chair, etc.). Let me describe how tags work in your dataset: you need to be detailed, as the Lora will reference what's going on by using the base model you use for training. If there is something in all your images that you don't include in your tags, it will become part of your Lora. This is because the Lora absorbs details that can't be described easily with words, such as faces and accessories. Thanks to this you can let those details be absorbed into an activation tag, which is a unique word or phrase that goes at the start of every text file and which makes your Lora easy to prompt.
You may gather your images online and describe them manually. But fortunately, you can do most of this process automatically using my new 📊 dataset maker colab. Here are the steps:
1️⃣ Setup: This will connect to your Google Drive. Choose a simple name for your project and a folder structure you like, then run the cell by clicking the floating play button on the left side. It will ask for permission; accept to continue the guide. If you already have images to train with, upload them to your Google Drive's "lora_training/datasets/project_name" (old) or "Loras/project_name/dataset" (new) folder, and you may choose to skip step 2.
2️⃣ Scrape images from Gelbooru: In the case of anime, we will use the vast collection of available art to train our Lora. Gelbooru sorts images through thousands of booru tags describing everything about an image, which is also how we'll tag our images later. Follow the instructions on the colab for this step; basically, you want to request images that contain specific tags representing your concept, character or style. When you run this cell it will show you the results and ask if you want to continue. Once you're satisfied, type yes and wait a minute for your images to download.
3️⃣ Curate your images: There are a lot of duplicate images on Gelbooru, so we'll be using the FiftyOne AI to detect them and mark them for deletion. This will take a couple of minutes once you run this cell. They won't be deleted yet though: eventually an interactive area will appear below the cell, displaying all your images in a grid. Here you can select the ones you don't like and mark them for deletion too. Follow the instructions in the colab. It is beneficial to delete low quality or unrelated images that slipped their way in. When you're finished, send Enter in the text box above the interactive area to apply your changes.
4️⃣ Tag your images: We'll be using the WD 1.4 tagger AI to assign anime tags that describe your images, or the BLIP AI to create captions for photorealistic/other images. This takes a few minutes. I've found good results with a tagging threshold of 0.35 to 0.5. After running this cell it'll show you the most common tags in your dataset, which will be useful for the next step.
5️⃣ Curate your tags: This step for anime tags is optional, but very useful. Here you can assign the activation tag (also called trigger word) for your Lora. If you're training a style, you probably don't want any activation tag so that the Lora is always in effect.
If you're training a character, I myself tend to delete (prune) common tags that are intrinsic to the character, such as body features and hair/eye color. This causes them to get absorbed by the activation tag. Pruning makes prompting with your Lora easier, but also less flexible. Some people like to prune all clothing to have a single tag that defines a character outfit; I do not recommend this, as too much pruning will affect some details. A more flexible approach is to merge tags: for example, if we have some redundant tags like "striped shirt, vertical stripes, vertical-striped shirt", we can replace all of them with just "striped shirt". You can run this step as many times as you want.
6️⃣ Ready: Your dataset is stored in your Google Drive. You can do anything you want with it, but we'll be going straight to the second half of this tutorial to start training your Lora!
⭐ Second Half: Settings and Training
This is the tricky part. To train your Lora we'll use my ⭐ Lora trainer colab. It consists of a single cell with all the settings you need. Many of these settings don't need to be changed. However, this guide and the colab will explain what each of them does, so that you can play with them in the future. Here are the settings:
▶️ Setup: Enter the same project name you used in the first half of the guide and it'll work automatically. Here you can also change the base model for training. There are 2 recommended default ones, but alternatively you can copy a direct download link to a custom model of your choice. Make sure to pick the same folder structure you used in the dataset maker.
▶️ Processing: These are the settings that change how your dataset will be processed.
The resolution should stay at 512 this time, which is normal for Stable Diffusion. Increasing it makes training much slower, but it does help with finer details.
flip_aug is a trick to learn more evenly, as if you had more images, but it makes the AI confuse left and right, so it's your choice.
shuffle_tags should always stay active if you use anime tags, as it makes prompting more flexible and reduces bias.
activation_tags is important; set it to 1 if you added one during the dataset part of the guide. This is also called keep_tokens.
▶️ Steps: We need to pay attention here. There are 4 variables at play: your number of images, the number of repeats, the number of epochs, and the batch size. These result in your total steps. You can choose to set the total epochs or the total steps; we will look at some examples in a moment. Too few steps will undercook the Lora and make it useless, and too many will overcook it and distort your images. This is why we choose to save the Lora every few epochs, so we can compare and decide later. For this reason, I recommend few repeats and many epochs.
There are many ways to train a Lora. The method I personally follow focuses on balancing the epochs, such that I can choose between 10 and 20 epochs depending on whether I want a fast cook or a slow simmer (which is better for styles). Also, I have found that more images generally need more steps to stabilize. Thanks to the new min_snr_gamma option, Loras take fewer epochs to train.
Here are some healthy values for you to try (the same arithmetic appears as a short sketch at the end of this guide):
10 images × 10 repeats × 20 epochs ÷ 2 batch size = 1000 steps
20 images × 10 repeats × 10 epochs ÷ 2 batch size = 1000 steps
100 images × 3 repeats × 10 epochs ÷ 2 batch size = 1500 steps
400 images × 1 repeat × 10 epochs ÷ 2 batch size = 2000 steps
1000 images × 1 repeat × 10 epochs ÷ 3 batch size = 3300 steps
▶️ Learning: The most important settings. However, you don't need to change any of these your first time. In any case:
The unet learning rate dictates how fast your Lora will absorb information. As with steps, if it's too small the Lora won't do anything, and if it's too large the Lora will deep-fry every image you generate. There's a flexible range of working values, especially since you can change the intensity of the Lora in prompts. Assuming you set dim between 8 and 32 (see below), I recommend 5e-4 unet for almost all situations. If you want a slow simmer, 1e-4 or 2e-4 will be better. Note that these are in scientific notation: 1e-4 = 0.0001.
The text encoder learning rate is less important, especially for styles. It helps learn tags better, but the Lora will still learn them without it. It is generally accepted that it should be either half or a fifth of the unet; good values include 1e-4 or 5e-5. Use Google as a calculator if you find these small values confusing.
The scheduler guides the learning rate over time. This is not critical, but it still helps. I always use cosine with 3 restarts, which I personally feel keeps the Lora "fresh". Feel free to experiment with cosine, constant, and constant with warmup; you can't go wrong with those. There's also the warmup ratio, which should help the training start efficiently, and the default of 5% works well.
▶️ Structure: Here is where you choose the type of Lora from the 2 I mentioned at the beginning. Also, dim/alpha determine the size of your Lora. Larger does not usually mean better. I personally use 16/8, which works great for characters and is only 18 MB.
▶️ Ready: Now you're ready to run this big cell, which will train your Lora. It will take 5 minutes to boot up, after which it starts performing the training steps. In total it should take less than an hour, and it will put the results in your Google Drive.
🏁 Third Half: Testing
You read that right. I lied! 😈 There are 3 parts to this guide.
When you finish your Lora you still have to test it to know if it's good. Go to your Google Drive, inside the /lora_training/outputs/ folder, and download everything inside your project name's folder. Each of these is a different Lora saved at a different epoch of your training, and each has a number like 01, 02, 03, etc.
Here's a simple workflow to find the optimal way to use your Lora:
Put your final Lora in your prompt with a weight of 0.7 or 1, and include some of the most common tags you saw during the tagging part of the guide. You should see a clear effect, hopefully similar to what you tried to train. Adjust your prompt until you're either satisfied or can't seem to get it any better.
Use the X/Y/Z plot to compare different epochs. This is a built-in feature in the webui. Go to the bottom of the generation parameters and select the script. Put the Lora of the first epoch in your prompt (like "<lora:projectname-01:0.7>"), and in the script's X value write something like "-01, -02, -03", etc. Make sure the X value is in "Prompt S/R" mode. These will perform replacements in your prompt, causing it to go through the different numbers of your Lora so you can compare their quality.
You can first compare every 2nd or every 5th epoch if you want to save time. You should ideally do batches of images to compare more fairly.
Once you've found your favorite epoch, try to find the best weight. Do an X/Y/Z plot again, this time with an X value like ":0.5, :0.6, :0.7, :0.8, :0.9, :1". It will replace a small part of your prompt to go over different Lora weights. Again, it's better to compare in batches. You're looking for a weight that gives the best detail without distorting the image. If you want, you can do steps 2 and 3 together as X/Y; it'll take longer but be more thorough.
If you found results you liked, congratulations! Keep testing different situations, angles, clothes, etc., to see if your Lora can be creative and do things that weren't in the training data.
source: civitai/holostrawberry
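For reference, here is the step arithmetic from the "healthy values" table above as a tiny Python sketch (the function name is mine; the trainer colab computes this for you):

```python
# total steps = images * repeats * epochs / batch_size
def total_steps(images: int, repeats: int, epochs: int, batch_size: int) -> int:
    return images * repeats * epochs // batch_size

# The "healthy values" from the guide (the last one rounds to ~3300 in the text):
for images, repeats, epochs, batch in [(10, 10, 20, 2), (20, 10, 10, 2),
                                       (100, 3, 10, 2), (400, 1, 10, 2),
                                       (1000, 1, 10, 3)]:
    print(f"{images} images -> {total_steps(images, repeats, epochs, batch)} steps")
```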
21
3
Area Composition

Area Composition

Get more specific generations each time! Have you ever heard of area composition?
Area composition is a technique where you specify custom locations and sizes for every element you want to generate. In order to create this simple but effective workflow, all you need are the following nodes:
1. Load Checkpoint: here you select your desired model.
2. Load LoRA: here you select your desired style with any LoRA (this one is optional).
3. CLIP Set Last Layer: this node works as your Clip Skip (set it to -2 for better results).
4. CLIP Text Encode: here is where your lovely prompt will be. You will need two of these, because one will work as your positive and the other as your negative.
5. KSampler: this node is important because it is like the brain of the main process; here is where your prompt and image size are read and transformed into an image. You can use the sampler and scheduler you like the most (set the denoise strength to 1.0 for better results).
6. Empty Latent Image: just as important as the KSampler, this node is where you decide the size of your initial image (it can be portrait or landscape).
7. CLIP Text Encode: wait, again? Yes. Just like the earlier ones, this node focuses on a specific element you want to generate. It is important to keep it simple and only describe the main element to represent. You can have as many of these nodes as elements you want to generate, and keep in mind that they only work as positives. For this example I will use 2 of them.
8. MultiArea Conditioning: this is the most important node of the process. For explanation purposes, I will call each of my positives a conditioning:
conditioning 0 is my first positive (the one I made in step 4);
conditioning 1 and conditioning 2 are my second and third positives (the ones I made in step 7).
For each conditioning you have to set a size and position. In this example I set conditioning 0 to 512x718, because it is the base prompt and I want it to cover the whole canvas. For conditioning 1, which is my main character, I set it to 384x576 in the lower-center part of the canvas. And for conditioning 2, which is the background/setting, I set it to 512x718 because I want the whole canvas to work as the background. (You may notice that while setting each conditioning's position, a different color shows up on the MultiArea Conditioning node. Keep calm; these colors are just a visual representation of each element's position. See the small layout sketch after this article.)
Also important: as you may have figured out, this node works as a super detailed composition instruction, so the MultiArea Conditioning node acts as your positive; be sure to connect it as the positive input of your KSampler.
9. Upscale Latent: up to this point we have only created the base image, which means it is time to upscale it. The Upscale Latent node not only upscales the image to the desired size but also introduces more detail in the process.
10. KSampler: yes, again. This second KSampler works along with the Upscale Latent node to refine details, so using the same configuration as your first one (step 5) is a good idea. Lowering the denoise strength on this second KSampler helps avoid drastic changes; for this example I set it to 0.5.
11. VAE Decode: the variational autoencoder (VAE) node is important because it turns the final latent into your beautiful masterpiece.
12. Preview/Save Image: lastly, what is left to add is the Preview/Save Image node (this one does not need an explanation, right?).
And there you go, you will now be able to generate more personalized images.
Intended image to create: cyborg girl inside an abandoned building.
Do not forget to set this article as a favorite if you found it useful.
Happy generations!
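Purely as an illustration of the layout described above (not code from the article, and not the exact parameter names of the MultiArea Conditioning node), here is a small Python sketch that writes out the three conditioning regions and checks that they fit the 512x718 canvas; the x/y offsets are my own guesses for "lower-center":

```python
# Illustrative only: the region layout expressed as plain data, with a bounds check.
CANVAS = (512, 718)  # width, height of the Empty Latent Image

regions = {
    "conditioning 0 (base prompt)":    {"x": 0,  "y": 0,   "w": 512, "h": 718},
    "conditioning 1 (main character)": {"x": 64, "y": 142, "w": 384, "h": 576},  # lower-center
    "conditioning 2 (background)":     {"x": 0,  "y": 0,   "w": 512, "h": 718},
}

for name, r in regions.items():
    fits = r["x"] + r["w"] <= CANVAS[0] and r["y"] + r["h"] <= CANVAS[1]
    print(f"{name}: {r['w']}x{r['h']} at ({r['x']}, {r['y']}) - fits canvas: {fits}")
```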
17
6