Tensor.Art
Articles

Controlnet with SD3

Today, I noticed that I can add ControlNet to the SD3 model. The Tiled function works very well, so I incorporated it into my workflow and created a group for generating artistic images based on a given photo or a previously generated image. In the main part of the workflow, I simply set a very short prompt, like "grass, flowers," and I get an image that blends grass and flowers in an arrangement resembling the base photo.

https://youtu.be/sv35wKNiFGs

Controlnet with SD3 | ComfyUI Workflow | Tensor.Art
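How to Use SD3 Online Training

First, click your avatar in the upper-right corner and choose "My Trained Models" from the drop-down menu to enter the training center. If you have trained models before, you will see your training tasks listed here. Then click the Online Training button to start a training run.

On the left is the dataset window, which is empty by default. You can upload individual images as the dataset, or upload a dataset archive. The archive may include caption files in the same format as kohya-ss: each image file has a caption .txt file with the same name.

Under model theme on the right you can choose Anime Character, Realistic Character, 2.5D, Standard, or Custom. Here we choose Custom and select the SD3 model as the base model. Note that in the version drop-down you should choose the T5XXL version; only then can the T5 text encoder be trained.

In basic mode, the recommended parameters are 4 repeats per image and 16 epochs. After uploading a prepared dataset, you can leave the trigger word empty if your captions already contain the character's name. Otherwise, give your model a simple trigger word, such as the character name or the style name. Then choose one caption file from the dataset as the preview prompt.

If you want to use professional mode, click the button in the upper-right corner to switch to it. In professional mode, I recommend doubling the learning rate and using the cosine_with_restarts learning rate scheduler; for the optimizer you can choose AdamW8bit. Enable tag shuffling (shuffle) and keep the first token (if you have a character-name trigger word in first position). Turn off noise offset. Convolution DIM and Alpha can be set to 8 and 1. In the sample image settings, add a negative prompt, and then you can start training.

In the training queue you can see the current loss curve and the 4 sample images produced for each epoch. Finally, pick the epoch with the best results and download it locally or publish it directly on tensorart.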
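The kohya-ss style pairing described above (one caption .txt per image, same base name) and the recommended repeat and epoch counts can be checked locally before uploading. The following is only a minimal sketch: the function names, the list of image extensions, and the folder layout are assumptions for illustration, not part of Tensor.Art or kohya-ss.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever your dataset contains.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def missing_captions(dataset_dir):
    """Return the names of images that have no same-name .txt caption file."""
    folder = Path(dataset_dir)
    return sorted(
        p.name
        for p in folder.iterdir()
        if p.suffix.lower() in IMAGE_EXTS and not p.with_suffix(".txt").exists()
    )


def image_passes(num_images, repeats=4, epochs=16):
    """How many times images are shown in total: images x repeats x epochs."""
    return num_images * repeats * epochs
```

With the recommended 4 repeats and 16 epochs, every image is seen 64 times, so a 20-image dataset gives `image_passes(20) == 1280` image passes in total.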
SD3 - training on your own PC

First, you need to update your version of OneTrainer.

Second, you need to download ALL files and folders (and rename them) from stabilityai/stable-diffusion-3-medium-diffusers at main (huggingface.co), then put them here:

[image]

With float16 output, the LoRA is only 36 MB:

[image]

These are my settings for style training:

[image]

You can download my test checkpoint for free: Aderek SD3 - v1 | Stable Diffusion Model - Checkpoint | Tensor.Art

My LoRAs: Aderek514's Profile | Tensor.Art

So, good luck!
ReActor Node for ComfyUI (Face Swap)

The Fast and Simple Face Swap Extension Node for ComfyUI, based on the ReActor SD-WebUI Face Swap Extension.

This Node goes without an NSFW filter (uncensored, use it at your own responsibility).

What's new in the latest update

0.5.1 ALPHA1
- Support of GPEN 1024/2048 restoration models (available in the HF dataset https://huggingface.co/datasets/Gourieff/ReActor/tree/main/models/facerestore_models)
- ReActorFaceBoost Node - an attempt to improve the quality of swapped faces. The idea is to restore and scale the swapped face (according to the face_size parameter of the restoration model) BEFORE pasting it to the target image (via inswapper algorithms). More information is in PR#321.

Installation
- SD WebUI: AUTOMATIC1111 or SD.Next
- Standalone (Portable) ComfyUI for Windows

Usage

You can find ReActor Nodes inside the menu ReActor or by using search (just type "ReActor" in the search field).

List of Nodes:

Main Nodes
- ReActorFaceSwap (Main Node)
- ReActorFaceSwapOpt (Main Node with the additional Options input)
- ReActorOptions (Options for ReActorFaceSwapOpt)
- ReActorFaceBoost (Face Booster Node)
- ReActorMaskHelper (Masking Helper)

Operations with Face Models
- ReActorSaveFaceModel (Save Face Model)
- ReActorLoadFaceModel (Load Face Model)
- ReActorBuildFaceModel (Build Blended Face Model)
- ReActorMakeFaceModelBatch (Make Face Model Batch)

Additional Nodes
- ReActorRestoreFace (Face Restoration)
- ReActorImageDublicator (Duplicate one Image to Images List)
- ImageRGBA2RGB (Convert RGBA to RGB)

Connect all required slots and run the query.

Main Node Inputs
- input_image - the image to be processed (target image, analog of "target image" in the SD WebUI extension). Supported Nodes: "Load Image", "Load Video" or any other nodes providing images as an output.
- source_image - an image with a face or faces to swap into the input_image (source image, analog of "source image" in the SD WebUI extension). Supported Nodes: "Load Image" or any other nodes providing images as an output.
- face_model - the input for the "Load Face Model" Node or another ReActor node that provides a face model file (face embedding) you created earlier via the "Save Face Model" Node. Supported Nodes: "Load Face Model", "Build Blended Face Model".

Main Node Outputs
- IMAGE - an output with the resulting image. Supported Nodes: any nodes which have images as an input.
- FACE_MODEL - an output providing the source face's model built during the swapping process. Supported Nodes: "Save Face Model", "ReActor", "Make Face Model Batch".

Face Restoration

Since version 0.3.0 ReActor Node has built-in face restoration. Just download the models you want (see the Installation instructions) and select one of them to restore the resulting face(s) during the faceswap. It will enhance face details and make your result more accurate.

Face Indexes

By default ReActor detects faces in images from "large" to "small". You can change this option by adding the ReActorFaceSwapOpt node with ReActorOptions.

If you need to specify faces, you can set indexes for source and input images. The index of the first detected face is 0. You can set indexes in the order you need. E.g.: 0,1,2 (for Source); 1,0,2 (for Input). This means: the second Input face (index = 1) will be swapped by the first Source face (index = 0) and so on.

Genders

You can specify the gender to detect in images. ReActor will swap a face only if it meets the given condition.

Face Models

Since version 0.4.0 you can save face models as "safetensors" files (stored in ComfyUI\models\reactor\faces) and load them into ReActor, implementing different scenarios and keeping super lightweight face models of the faces you use.

To make new models appear in the list of the "Load Face Model" Node, just refresh the page of your ComfyUI web application. (I recommend using ComfyUI Manager, otherwise your workflow can be lost after you refresh the page if you didn't save it before that.)

Troubleshooting

I. (For Windows users) If you still cannot build Insightface for some reason or just don't want to install Visual Studio or VS C++ Build Tools, do the following:

- (ComfyUI Portable) From the root folder check the version of Python: run CMD and type
    python_embeded\python.exe -V
- Download the prebuilt Insightface package for Python 3.10, for Python 3.11 (if in the previous step you see 3.11) or for Python 3.12 (if in the previous step you see 3.12) and put it into the stable-diffusion-webui (A1111 or SD.Next) root folder (where you have the "webui-user.bat" file) or into the ComfyUI root folder if you use ComfyUI Portable.
- From the root folder run:
    (SD WebUI) CMD and .\venv\Scripts\activate
    (ComfyUI Portable) run CMD
- Then update your PIP:
    (SD WebUI) python -m pip install -U pip
    (ComfyUI Portable) python_embeded\python.exe -m pip install -U pip
- Then install Insightface:
    (SD WebUI)
    pip install insightface-0.7.3-cp310-cp310-win_amd64.whl (for 3.10)
    pip install insightface-0.7.3-cp311-cp311-win_amd64.whl (for 3.11)
    pip install insightface-0.7.3-cp312-cp312-win_amd64.whl (for 3.12)
    (ComfyUI Portable)
    python_embeded\python.exe -m pip install insightface-0.7.3-cp310-cp310-win_amd64.whl (for 3.10)
    python_embeded\python.exe -m pip install insightface-0.7.3-cp311-cp311-win_amd64.whl (for 3.11)
    python_embeded\python.exe -m pip install insightface-0.7.3-cp312-cp312-win_amd64.whl (for 3.12)
- Enjoy!

II. "AttributeError: 'NoneType' object has no attribute 'get'"

This error may occur if there is something wrong with the model file inswapper_128.onnx. Try to download it manually from here and put it into ComfyUI\models\insightface, replacing the existing one.

III. "reactor.execute() got an unexpected keyword argument 'reference_image'"

This means that input points have been changed with the latest update. Remove the current ReActor Node from your workflow and add it again.

IV. ControlNet Aux Node IMPORT failed error when using with ReActor Node

- Close ComfyUI if it is running.
- Go to the ComfyUI root folder, open CMD there and run:
    python_embeded\python.exe -m pip uninstall -y opencv-python opencv-contrib-python opencv-python-headless
    python_embeded\python.exe -m pip install opencv-python==4.7.0.72
- That's it!

V. "ModuleNotFoundError: No module named 'basicsr'" or "subprocess-exited-with-error" during future-0.18.3 installation

- Download https://github.com/Gourieff/Assets/raw/main/comfyui-reactor-node/future-0.18.3-py3-none-any.whl
- Put it into the ComfyUI root and run:
    python_embeded\python.exe -m pip install future-0.18.3-py3-none-any.whl
- Then:
    python_embeded\python.exe -m pip install basicsr

VI. "fatal: fetch-pack: invalid index-pack output" when you try to git clone the repository

Try to clone with --depth=1 (last commit only):
    git clone --depth=1 https://github.com/Gourieff/comfyui-reactor-node
Then retrieve the rest (if you need):
    git fetch --unshallow
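The Face Indexes rule above (comma-separated index lists for Source and Input, matched position by position) can be illustrated with a few lines of Python. This is only a sketch of the pairing as the text describes it; `pair_face_indexes` is a hypothetical helper and not ReActor's actual code.

```python
def pair_face_indexes(source_indexes, input_indexes):
    """Pair source and input face indexes position by position.

    Each returned tuple is (source_face, input_face): the source face
    at that index replaces the input face at the paired index.
    """
    src = [int(i) for i in source_indexes.split(",")]
    dst = [int(i) for i in input_indexes.split(",")]
    return list(zip(src, dst))


# Example from the text: "0,1,2" for Source and "1,0,2" for Input.
for source_face, input_face in pair_face_indexes("0,1,2", "1,0,2"):
    print(f"source face {source_face} -> input face {input_face}")
```

The first pair is (0, 1), which matches the statement that the second Input face (index 1) is swapped by the first Source face (index 0).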
ComfyUI: Text2Image Basic Glossary

Hello! This is my first article; I hope it will be of benefit to whoever reads it. I still have limited knowledge about WorkFlow, but I have researched and learned little by little. If anyone would like to contribute some content, you are totally free to do so. Thank you.

I wrote this article to give a brief explanation of the basic concepts of ComfyUI, or WorkFlow. This is a technology with many possibilities, and it would be great to make it easier for everyone to use!

What is Workflow?

Workflow is one of the two main image generation systems that Tensor Art has at the moment. It is a generation method with a great capacity to stimulate the creativity of its users; it also gives Free users access to some Pro features.

How do I access the WorkFlow mode?

To access the WorkFlow mode, place the mouse cursor on the "Create" tab as if you were going to create an image by conventional means. Then click on the "ComfyFlow" option and you are done.

After that, you will see a tab with two options: "New WorkFlow" and "Import WorkFlow". The first one allows you to start a workflow from a template or from scratch, while the second allows you to load a workflow that you have saved on your PC as a JSON file.

If you click on the "New WorkFlow" option, a tab with a list of various templates will be displayed (each template has a different purpose). The main one is "Text2Image"; it allows us to create images from text, similarly to the conventional method we always use. You can also create a workflow from scratch with the "Empty WorkFlow Template" option, but to better explain the basics we will use "Text2Image".

Once you click on the "Text2Image" option, wait a few seconds and a new tab will be displayed with the template, which contains the basics needed to create an image from text.

Nodes and Borders: What are they and how do they work?

To understand the basics of how a WorkFlow works, it is necessary to have a clear understanding of what Nodes and Borders are.

Nodes are the small boxes present in the workflow; each node has a specific function necessary for the creation, enhancement or editing of the image or video. The basics of Text2Image are the Checkpoint loader, the CLIP Text Encoders, the Empty Latent Image, the KSampler, the VAE decoder, and Save Image. There are hundreds of other nodes besides these basics, and they all have different functions.

The "Borders", on the other hand, are the small colored wires that connect the different nodes. They determine which nodes are directly related. The Borders are color-coded, and each color is generally related to a specific function:

- Purple is related to the Model or LoRA used.
- Yellow connects the model or LoRA to the node where the prompt is placed.
- Red refers to the VAE.
- Orange connects the prompt nodes to the "KSampler" node.
- Fuchsia refers to the latent, which serves many purposes; in this case it connects the "Empty Latent Image" node to the "KSampler" node and establishes the number and size of the images that will be generated.
- Blue is related to everything that has to do with images; it has many uses, but in this case it leads to the "Save Image" node.

What are the Text2Image template Nodes used for?

Having this clear is of utmost relevance, since it tells you what each node of this basic template is for. It's like knowing what each piece in a Lego set is for and understanding how the pieces should be connected to create a beautiful masterpiece! Also, once you know what these nodes are for, it will be easier to guess the functionality of their variants and other derived nodes.

A) The first one is the node called "Load Checkpoint". This node has three specific functions. The first is to load the base model or checkpoint with which an image will be created. The second is the CLIP output, which connects the positive and negative prompts that you write to the checkpoint. The third is that it provides the VAE model.

B) The second one is the "Empty Latent Image", the node that defines the image dimensions in latent space. It has two functions: first, it sets the width and height of the image; second, it sets how many images will be generated simultaneously through the "Batch Size" option.

C) The third is the pair of "CLIP Text Encode" nodes. There will always be at least two of these nodes, since they hold the positive and negative prompts that you write to describe the image you want. They are usually connected to the "Load Checkpoint" node or to a LoRA, and they are also connected to the "KSampler" node.

D) Then there is the "KSampler" node. This node is the central point of the whole WorkFlow; it sets the most important parameters of image creation. It has several functions: the first is to determine the seed of the image and, through the "control_after_generate" option, how the seed changes from one generated image to the next. The second is to set how many steps are used to create the image (you set them as you wish). The third is to determine which sampling method is used and which scheduler goes with it (the scheduler regulates how much noise is removed at each step while the image is created).

E) The penultimate one is the VAE decoder. This node handles one of the final steps of the generation process: it converts the latent produced by the KSampler, which was guided by the written prompt, into an actual image. The result is then passed to the "Save Image" node to display the generated image as the final product.

F) The last node to explain is "Save Image". This node has the simple function of saving the generated image and giving the user a view of the final work, which will later be stored in the task bar where all the generated images are located.

Final Consideration:

This has been a small summary and explanation of very basic ComfyUI concepts; you could even call it a small glossary of general terms. I have tried to give an overview that makes this image generation tool easier to understand. There is still a lot to explain, and I will try to cover all the topics, but the information would not fit in a single article (ComfyUI is a whole universe of possibilities). Thank you so much for taking the time to read this article!
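The six Text2Image nodes and the wires between them can also be written down as data. The sketch below uses the node class names and the `[node_id, output_index]` link convention of the open-source ComfyUI API-format workflow JSON; whether Tensor.Art's ComfyFlow export matches it exactly is an assumption, and the checkpoint file name, prompts and sampler values are placeholders.

```python
# Minimal Text2Image graph; each link is [source_node_id, output_index].
# "Load Checkpoint" outputs: 0 = MODEL, 1 = CLIP, 2 = VAE.
WORKFLOW = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "example_model.safetensors"}},  # placeholder
    "2": {"class_type": "CLIPTextEncode",  # positive prompt
          "inputs": {"text": "a cat in a garden", "clip": ["1", 1]}},
    "3": {"class_type": "CLIPTextEncode",  # negative prompt
          "inputs": {"text": "blurry, low quality", "clip": ["1", 1]}},
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 512, "height": 768, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"seed": 42, "steps": 20, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal",
                     "denoise": 1.0, "model": ["1", 0],
                     "positive": ["2", 0], "negative": ["3", 0],
                     "latent_image": ["4", 0]}},
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",
          "inputs": {"filename_prefix": "text2image", "images": ["6", 0]}},
}


def list_links(workflow):
    """Return every wire as (from_node, to_node, input_name)."""
    links = []
    for node_id, node in workflow.items():
        for name, value in node["inputs"].items():
            if isinstance(value, list) and len(value) == 2:
                links.append((value[0], node_id, name))
    return links


for src, dst, name in list_links(WORKFLOW):
    print(f"{WORKFLOW[src]['class_type']} -> {WORKFLOW[dst]['class_type']}.{name}")
```

Reading the printed links against the color list above gives the same picture: the checkpoint feeds the model, CLIP and VAE wires, the two prompt nodes and the empty latent feed the KSampler, and the decoded image ends at Save Image.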
Textual Inversion Embeddings ComfyUI_Examples

ComfyUI_examples

Textual Inversion Embeddings Examples

Here is an example of how to use Textual Inversion/Embeddings.

To use an embedding, put the file in the models/embeddings folder, then use it in your prompt like I used the SDA768.pt embedding in the previous picture.

Note that you can omit the filename extension, so these two are equivalent:

embedding:SDA768.pt
embedding:SDA768

You can also set the strength of the embedding just like regular words in the prompt:

(embedding:SDA768:1.2)

Embeddings are basically custom words, so where you put them in the text prompt matters.

For example, if you had an embedding of a cat:

red embedding:cat

This would likely give you a red cat.
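The prompt syntax above is plain text, so it is easy to generate. A minimal sketch, assuming only the `embedding:NAME` and `(embedding:NAME:STRENGTH)` forms shown in the examples; `embedding_token` is a hypothetical helper name, not part of ComfyUI.

```python
def embedding_token(name, strength=None):
    """Build the prompt text for an embedding, optionally with a strength."""
    token = f"embedding:{name}"
    if strength is None:
        return token
    return f"({token}:{strength})"


# Position matters: the embedding behaves like a custom word in the prompt.
prompt = f"red {embedding_token('cat')}"
print(prompt)
print(embedding_token("SDA768", 1.2))
```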
Art Mediums (127 Style)

Art Mediums

Various art mediums. Prompted with '{medium} art of a woman'.

Metalpoint, Miniature Painting, Mixed Media, Monotype Printing, Mosaic Tile Art, Mosaic, Neon, Oil Paint, Origami, Papermaking, Papier-mâché, Pastel, Pen And Ink, Performance Art, Photography, Photomontage, Plaster, Plastic Arts, Polymer Clay, Printmaking, Puppetry, Pyrography, Quilling, Quilt Art, Recycled Art, Relief Printing, Resin, Reverse Glass Painting, Sand, Scratchboard Art, Screen Printing, Scrimshaw, Sculpture Welding, Sequin Art, Silk Painting, Silverpoint, Sound Art, Spray Paint, Stained Glass, Stencil, Stone, Tapestry, Tattoo Art, Tempera, Terra-cotta, Textile Art, Video Art, Virtual Reality Art, Watercolor, Wax, Weaving, Wire Sculpture, Wood, Woodcut

Glass, Glitch Art, Gold Leaf, Gouache, Graffiti, Graphite Pencil, Ice, Ink Wash Painting, Installation Art, Intaglio Printing, Interactive Media, Kinetic Art, Knitting, Land Art, Leather, Lenticular Printing, Light Projection, Lithography, Macrame, Marble, Metal

Colored Pencil, Computer-generated Imagery (cgi), Conceptual Art, Copper Etching, Crochet, Decoupage, Digital Mosaic, Digital Painting, Digital Sculpture, Diorama, Embroidery, Enamel, Encaustic Painting, Environmental Art, Etching, Fabric, Felting, Fiber, Foam Carving, Found Objects, Fresco

Augmented Reality Art, Batik, Beadwork, Body Painting, Bookbinding, Bronze, Calligraphy, Cast Paper, Ceramics, Chalk, Charcoal, Clay, Collage, Collagraphy

3d Printing, Acrylic Paint, Airbrush, Algorithmic Art, Animation, Art Glass, Assemblage
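The '{medium} art of a woman' template can be filled in for any of the mediums listed. A small sketch, assuming the template is used literally as quoted; the three mediums in the example are taken from the list above and `build_prompts` is a hypothetical helper.

```python
TEMPLATE = "{medium} art of a woman"


def build_prompts(mediums):
    """Fill the prompt template once per medium."""
    return [TEMPLATE.format(medium=m) for m in mediums]


for line in build_prompts(["Watercolor", "Gold Leaf", "Stained Glass"]):
    print(line)
```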
Anime Vision | Detail Enhancer SD3

SD3 Anime LoRA is Finally Here!

I am thrilled to announce that the SD3 Anime LoRA model is finally available. In addition, I am releasing a new update that includes an SD3 anime checkpoint model.

Currently, I am publishing a beta version as I continue to work diligently to perfect the model. I aim to have the final release ready by the end of this month or early August.

Stay tuned, as the SD3 Anime beta version will be available within the next couple of days!

Here are some guidelines to use this LoRA to its full potential:

- If you are trying to create a specific subject or object, use a trigger word like 'anime style' in your prompt.
- If you're targeting a character, you can skip the keyword and go with something like this: 'anime boy' for a male character, 'anime girl' for a female character.

Simple, right? You can also use the trigger word 'anime style' most of the time. I've noticed it gives better results.

Recommended Parameters:

- LoRA Weight: 0 to 1
- VAE: No Need
- Sampler: DPM++ 2M SGM Uniform
- Steps: 20 to 30
- CFG: 3 to 4
- Upscaler: R-ESRGAN 4x+

If you encounter any issues, I recommend using ComfyUI for a better experience. Here's the workflow: ComfyUI Workflow. Open the link, select the LoRA model, choose the LoRA strength, and hit the run button.

Join my community, share your feedback, learn, and have fun with us! 😊

Discord: https://discord.gg/QQKd7bu97P
How to set up Radio Button in your AI Tools

Hello everyone! ✨ Today I bring you a super practical tutorial: how to set up a convenient radio-button prompt selector for your AI Tools! 😎 Save it now, and you will never have to worry about how to set prompt words again! Are you ready? Let's get started! 🔍

First, open the official TensorArt website. 📂 You will see a rich variety of AI tools and resources.

Next, open ComfyFlow and start making our AI Tool! 🤖 The process is simple and fun, so let's explore it together.

In ComfyFlow, click the "New" button, which takes you to a new interface where we can start creating our own workflow. 🌟

Next, we need to fill in the positive prompt words, which is a critical step! 📝 In the positive prompt area, enter the content you want. Here I simply wrote "a man" as an example. 🤵 This is just for the convenience of teaching; feel free to write whatever fits your needs.

When you have completed the workflow, click the "Publish" button in the upper-right corner! 🚀 Don't forget to give your AI Tool an interesting name, since a good name makes your tool more attractive. 💡 Also remember to choose the right category, so that the tool is clearly organized and easy for your friends to find and use.

Now scroll down the current page and find the user-configurable settings area. Click the "Add" button and make sure you add your positive prompt node! 🔍

After adding the node, click the "Set" button on the right to proceed. 🔧 Don't miss this step!

The next step is also very important. First click the radio button option, then click "Add". 🔘 Here you can add the choices you want to offer to the user. After selecting them, be sure to click "Confirm"! ✔️

Friends, we have finally reached the last step! 🎉 When you have completed all the operations, click the "Publish" button to publish your AI Tool. 🚀 Can't wait to see the results? Generate a picture yourself to try it out! 🖼️

Well, that's all for today's tutorial! 😊 I hope everyone can complete it successfully and create their own AI Tools. If you have any questions, don't hesitate to leave a comment in the comment section at any time! ❤️
Guide to Using SDXL

I occasionally see posts about difficulties in generating images successfully, so here is an introduction to the basic setup.

1. Introduction

SDXL is a model that can generate images with higher accuracy than SD1.5. It produces high-quality representations of human bodies and structures, with fewer distortions and more realistic fine details, textures, and shadows.

With SD1.5, generation parameters were generally applicable across different models, so there was no need for specific adjustments. While SDXL can still use some SD1.5 techniques without issues, the recommended generation parameters vary significantly depending on the model.

Additionally, SD1.5 LoRA and Embeddings (such as EasyNegative) are completely incompatible, so prompt construction needs to be reviewed. Notably, embeddings commonly used in SD1.5 negative prompts are recognized merely as strings by an XL model, so you must replace them with corresponding XL embeddings or add appropriate tags.

This guide explains the recommended parameter settings for using SDXL.

2. Basic Parameters

VAE
Selecting "sdxl-vae-fp16-fix.safetensors" will suffice. Many models have a VAE built in, so specifying one might not be necessary.

Image Size
Using the resolution presets provided by TensorArt should be sufficient. Resolutions that are too small or excessively large may not yield appropriate results, so please avoid the sizes that were frequently used with SD1.5 wherever possible. Even if you want images that are taller or wider than the presets, stay within a range that does not significantly alter the total pixel count (for example, increase the height and decrease the width).

Sampling Method
Choose the sampler recommended for the model first. Then select according to your preference. Typically, Euler a or DPM++ 2M SDE Karras should work well.

Sampling Steps
XL models might generate images effectively with fewer steps due to optimizations like LCM or Turbo. Be sure to check the recommended values for the selected model.

CFG Scale
This varies by model, so check the recommended values. Typically, the range is around 2 to 8.

Hires.fix
For free users, specifying 1.5x might hit the upper limit, so use custom settings with the following resolutions:
- 768x1152 -> 1024x1536
- 1152x768 -> 1536x1024
- 1024x1024 -> 1248x1248
Choose the upscaler according to your preference. Set the denoising strength to around 0.3 to 0.4.

3. Prompt

SDXL handles natural language better. You can input elements separated by commas or simply write a complete sentence in English, and it will generate images as intended. Using a tool like ChatGPT to create prompts can also be beneficial.

However, depending on how the model was additionally trained, it might be better to use existing tags. Furthermore, some models specify tags that enhance quality, so always check the model's page. For example:
- AnimagineXL3.1: masterpiece, best quality, very aesthetic, absurdres is recommended.
- Pony Models: score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up is recommended.
- ToxicEchoXL: masterpiece, best quality, aesthetic is recommended.

In this way, appropriate tag usage is crucial for XL models, particularly anime or illustration models.

4. Negative Prompts

Forget the negative prompts used in SD1.5. "EasyNegative" is just a string. The embeddings usable on TensorArt are negativeXL_D and unaestheticXLv13. Choose according to your preference.

Some models list recommended negative prompts, which are a good base to build on.

For AnimagineXL:
nsfw, lowres, (bad), text, error, fewer, extra, missing, worst quality, jpeg artifacts, low quality, watermark, unfinished, displeasing, oldest, early, chromatic aberration, signature, extra digits, artistic error, username, scan, [abstract]

For ToxicEchoXL:
nsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digits, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, artist name

For photo models, it is sometimes better not to use negative prompts at all in order to create a certain atmosphere, so try various approaches.

5. Recommended SDXL models

ToxicEnvisionXL
https://tensor.art/models/736585744778443103/ToxicEnvisionXL-v1
Recently released high-quality photo model. Yes, I created it. If you are looking for a photo model, you can't go wrong with this one. Check the related posts to see what kind of images can be created. You can create a variety of realistic images, from analog photo styles to gravure, movies, fantasy, and surreal depictions. Although it is primarily a photo-based model, it can also create analog-style images.

ToxicEtheRealXL
https://tensor.art/models/702813703965453448/ToxicEtheRealXL-v1
A versatile model that supports both illustrations and photorealistic images. Yes, I created it. Because the output can swing widely between illustrative and photorealistic, it needs well-crafted prompts to steer it. Using LoRA to strengthen the direction might make it easier to use.

ToxicEchoXL
https://tensor.art/models/689378702666043553/ToxicEchoXL-v1
A high-performance model specialized for illustrations. Yes, I created it. It features a unique style based on watercolor painting, with custom training and adjustments. I have also created various LoRA for style changes, so please visit my user page: https://tensor.art/u/649265516304702656
My current favorite is Beautiful Warrior XL + atmosphere. The model covers a range from illustrations to photos, so give it a try. However, it is weak at generating copyrighted characters, so use LoRA or models like AnimagineXL or Pony for those. ToxicEchoXL can produce unique illustration styles when used with a character LoRA, making it highly suitable for fan art.

6. Conclusion

I hope this guide helps those who struggle to generate images as good as the model samples or other people's posts.

Well... if you remix from the Model Showcase, you can create beautiful images without this guide...

SD3 has also been released, so if possible, I would like to create models for that as well. It seems that a commercial license is required for commercial use, though...
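The image-size advice above (change the aspect ratio but keep the total pixel count roughly constant) can be turned into a small calculation. This is a sketch under stated assumptions: a pixel budget of 1024x1024 and rounding to multiples of 64 are common SDXL conventions that I am assuming here, not values given in the guide, and `sdxl_size` is a hypothetical helper.

```python
import math


def sdxl_size(aspect_w, aspect_h, target_pixels=1024 * 1024, multiple=64):
    """Pick a width/height for an aspect ratio at roughly constant pixel count."""
    ratio = aspect_w / aspect_h
    height = math.sqrt(target_pixels / ratio)
    width = height * ratio

    def snap(value):
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


for aspect in [(1, 1), (2, 3), (16, 9)]:
    w, h = sdxl_size(*aspect)
    print(f"{aspect[0]}:{aspect[1]} -> {w}x{h} ({w * h} pixels)")
```

Increasing the height automatically reduces the width, which is the adjustment the guide recommends when a preset does not fit.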
Fix EXIF data from EMS-#####-EMS using ExifTool

Introduction

You download your images from this website, but the EXIF data for the model / LoRA looks like:

    Model: EMS-342970-EMS, or <lora:EMS-45352-EMS:0.500000>

The ExifTool utility can fix this. I am using Linux, but it should also work on Mac/Windows if you follow https://exiftool.org/install.html

PreReq

- Download ExifTool from https://exiftool.org/ and extract the archive into your home drive.
- Make a new dot-file called .ExifTool_config in the same folder as exiftool.
- Linux example: ~/Image-ExifTool-12.86/.ExifTool_config
- Windows might need a cmd command like: echo.>.ExifTool_config

.ExifTool_config file

Edit the config file. Copy/paste the basic example.

- This is Perl, plain search and replace, and not optimized, but it works. Switch is probably more efficient.
- The \+ is an escape for the + in the model name.
- The /g at the end replaces all instances.
- 'Parameters' is the block it changes in the EXIF.
- Add all your desired entries and save the file.

You only need to make new entries like:

    $val =~ s/EMS-151022-EMS/RealCartoon Realistic v11/g;

The full config file:

    %Image::ExifTool::UserDefined = (
        'Image::ExifTool::Composite' => {
            MyParameters => {
                Require => 'Parameters',
                ValueConv => q{
                    # MODEL
                    $val =~ s/EMS-151022-EMS/RealCartoon Realistic v11/g;
                    $val =~ s/EMS-219023-EMS/ShampooMix_v4-fp16-no-ema/g;
                    $val =~ s/EMS-230098-EMS/RealCartoon Realistic v12/g;
                    $val =~ s/EMS-379840-EMS/Lazymix\+ - v4/g;
                    # LoRa
                    $val =~ s/EMS-72516-EMS/Realistic Fusion X - V1/g;
                    $val =~ s/EMS-343944-EMS/A simple nun suit - v1/g;
                    return $val;
                },
            },
        },
    );
    1; # end

Modify the EXIF

I use Linux, ExifTool is extracted to a folder inside my home folder, and my files are in my Downloads folder, so the command I run is this, where "~/Downloads" has my raw files:

    perl ~/Image-ExifTool-12.86/exiftool "-Parameters<MyParameters" ~/Downloads

It writes new files and appends "_original" to the names of the old ones. You can add -overwrite_original to skip keeping the old files once you are absolutely sure your config file works. This does not forgive. I am not responsible for lost EXIF.

Copy into folders based on Model

This parses your EXIF for "Model:", grabs everything up to the first comma, and copies the file into a subfolder of the destination named after the Model. Ideally you have already modified the EXIF to fix the model name. In this example the files are in my home Downloads folder on Linux.

- ~/Downloads/ is the source folder with the files.
- /path/to/destination/ is the destination parent folder. You need to change this.
- -r is recursive (optional); if you want it, make it -r -o .
- The "-o ." is the copy argument. Remove it to move instead, at your own risk.
- If you only run this without first doing the above section, you'll get a bunch of EMS-###-EMS folders. The next section combines everything into one command.

    perl ~/Image-ExifTool-12.86/exiftool -if '($Parameters=~/Model/i)' -o . '-Directory</path/to/destination/${Parameters;m/\bModel:\s+(\w+[^,]*)/;$_=$1;}' ~/Downloads/

Combine into one command

This combines the above into one command. This example does a move, not a copy. I also renamed my exiftool folder.

- ~/Downloads appears twice. Change both to the folder with your EMS-###-EMS pictures.
- /path/to/destination/ is where you want to move the files after renaming the EMS-###-EMS to the Model name.

    perl ~/ExifTool/exiftool -if '($Parameters=~/Model/i)' "-Parameters<MyParameters" ~/Downloads -overwrite_original -execute '-Directory</path/to/destination/${Parameters;m/\bModel:\s+(\w+[^,]*)/;$_=$1;}' ~/Downloads

Notes

- You can rename the Image-ExifTool-12.86 directory or keep it wherever you like.
- On Windows you might need to change the ' to " when referencing directories.
- This runs Perl code, so maybe rename exiftool to something else for safety.
- ExifTool was created by Phil Harvey. Very impressive. There is an active community and forum at the creator's website.
- It can do advanced operations and scripting. Above my pay grade.
- Please understand this post isn't an offer for support. This took me all day to figure out. I don't know what I am doing.
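For readers who want to check their replacement table before touching any files, here is a minimal Python sketch of the same two ideas: swapping EMS-#####-EMS placeholders for names, and pulling out the text after "Model:" up to the first comma. It is not the ExifTool method above, only an illustration of the logic. The names `EMS_NAMES`, `fix_parameters`, and `model_name` are hypothetical, and the sample Parameters string is made up for the example.

```python
import re

# Hypothetical lookup table; the IDs and names are taken from the config example.
EMS_NAMES = {
    "EMS-151022-EMS": "RealCartoon Realistic v11",
    "EMS-379840-EMS": "Lazymix+ - v4",
    "EMS-72516-EMS": "Realistic Fusion X - V1",
}

EMS_PATTERN = re.compile(r"EMS-\d+-EMS")


def fix_parameters(params: str) -> str:
    """Replace every known EMS placeholder; unknown ones are left untouched."""
    return EMS_PATTERN.sub(lambda m: EMS_NAMES.get(m.group(0), m.group(0)), params)


def model_name(params: str):
    """Same pattern as the -Directory expression: text after 'Model:' up to the first comma."""
    match = re.search(r"\bModel:\s+(\w+[^,]*)", params)
    return match.group(1) if match else None


# Made-up Parameters string, for illustration only.
sample = "Steps: 25, Model: EMS-151022-EMS, <lora:EMS-72516-EMS:0.500000>, <lora:EMS-1-EMS:1.0>"
fixed = fix_parameters(sample)
print(fixed)
print(model_name(fixed))
```

Using a dictionary lookup means a literal + in a name needs no escaping, unlike the search-and-replace lines in the config file.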
Understanding the Impact of Negative Prompts: When and How Do They Take Effect?

📝 - Synthical

The Dynamics of Negative Prompts in AI: A Comprehensive Study

By: Yuanhao Ban (UCLA), Ruochen Wang (UCLA), Tianyi Zhou (UMD), Minhao Cheng (PSU), Boqing Gong, Cho-Jui Hsieh (UCLA)

This study addresses the gap in understanding the impact of negative prompts in AI diffusion models. By focusing on the dynamics of diffusion steps, the research aims to answer the question: "When and how do negative prompts take effect?". The investigation categorizes the mechanism of negative prompts into two primary tasks: noun-based removal and adjective-based alteration.

The role of prompts in AI diffusion models is crucial for guiding the generation process. Negative prompts, which instruct the model to avoid generating certain features, have been less studied than their positive counterparts. This study provides a detailed analysis of negative prompts, identifying the critical steps at which they begin to influence the image generation process.

Findings

Critical Steps for Negative Prompts

Noun-Based Removal: The influence of noun-based negative prompts peaks at the 5th diffusion step. At this critical step, the negative prompt first generates the target object at a specific location within the image. This neutralizes the positive noise through a subtractive process, effectively erasing the object. However, introducing a negative prompt in the early stages paradoxically results in the generation of the specified object. Therefore, the optimal timing for introducing these prompts is after the critical step.

Adjective-Based Alteration: The influence of adjective-based negative prompts peaks around the 10th diffusion step. During the initial stages, the absence of the object leads to a subdued response. Between the 5th and 10th steps, as the object becomes clearer, the negative prompt accurately focuses on the intended area and maintains its influence.

Cross-Attention Dynamics

Before the peak around the 5th step for noun-based prompts, the negative prompt attempts to generate objects in the middle of the image, regardless of the positive prompt's context. As the process approaches its peak, the negative prompt begins to assimilate layout cues from its positive counterpart and tries to remove the object. This represents the zenith of its influence.

For adjective-based prompts, during the peak around the 10th step, the negative prompt maintains its influence on the intended area, accurately targeting the object as it becomes clear.

The study highlights the paradoxical effect of introducing negative prompts in the early stages of diffusion, which leads to the unintended generation of the specified object. This finding suggests that the timing of negative prompt introduction is crucial for achieving the desired outcome.

Reverse Activation Phenomenon

A significant phenomenon observed in the study is Reverse Activation. This occurs when a negative prompt, introduced early in the diffusion process, unexpectedly leads to the generation of the very object named in that negative prompt. To explain this, the researchers borrowed the concept of the energy function from Energy-Based Models to represent the data distribution.

Real-world distributions often feature elements like clear blue skies or uniform backgrounds, alongside distinct objects such as the Eiffel Tower. These elements typically possess low energy scores, making the model inclined to generate them. The energy function is designed to assign lower energy levels to more "likely" or "natural" images according to the model's training data, and higher energy levels to less likely ones.

A positive difference indicates that the presence of the negative prompt effectively induces the inclusion of this component in the positive noise. The presence of a negative prompt promotes the formation of the object within the positive noise. Without the negative prompt, implicit guidance is insufficient to generate the intended object. The application of a negative prompt intensifies the distribution guidance towards the object, preventing it from materializing.

As a result, negative prompts typically do not attend to the correct place until step 5, well after the application of positive prompts. The use of negative prompts in the initial steps can significantly skew the diffusion process, potentially altering the background.

Conclusions

- Do not use fewer than 10 steps; going beyond 25 steps makes no further difference for negative prompting.
- Negative prompts can enhance your positive prompts, depending on how well the model and LoRA have learned their keywords, so they can be understood as an extension of their counterparts.
- Weighting up negative keywords may cause reverse activation and break up your image; try to keep the influence of all your LoRAs and models balanced.

Reference

https://synthical.com/article/Understanding-the-Impact-of-Negative-Prompts%3A-When-and-How-Do-They-Take-Effect%3F-171ebba1-5ca7-410e-8cf9-c8b8c98d37b6?
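The timing finding can be made concrete with a small Python sketch. It assumes the commonly used classifier-free guidance form, where the guided noise is baseline + scale * (positive - baseline) and the negative prompt takes the place of the empty-prompt baseline. That formula is an assumption brought in for illustration and is not quoted from the study. The function name `guided_noise`, the `critical_step` default of 5 (the noun-based step reported above), and the toy numbers are all hypothetical.

```python
def guided_noise(eps_pos, eps_neg, eps_uncond, step, scale=7.5, critical_step=5):
    """Classifier-free guidance in which the negative prompt is only used after critical_step.

    Up to and including critical_step the empty-prompt prediction is the baseline,
    so the negative prompt cannot trigger reverse activation early on.
    """
    baseline = eps_neg if step > critical_step else eps_uncond
    return [b + scale * (p - b) for p, b in zip(eps_pos, baseline)]


# Toy two-element "noise predictions", purely illustrative.
eps_pos = [1.0, 0.0]     # prediction for the positive prompt
eps_neg = [0.5, 0.5]     # prediction for the negative prompt
eps_uncond = [0.0, 0.0]  # prediction for the empty prompt

early = guided_noise(eps_pos, eps_neg, eps_uncond, step=2, scale=2.0)
late = guided_noise(eps_pos, eps_neg, eps_uncond, step=8, scale=2.0)
print(early, late)
```

In the early call the negative prediction is ignored entirely; in the late call it is subtracted from the result, which is the "subtractive process" described in the findings.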
[ 🔥🔥🔥 SD3 MEDIUM OPEN DOWNLOAD - 2024.06.12 🔥🔥🔥]

Finally! It's happening! The Medium version will be released first!

Stability AI Co-CEO Christian Laforte has announced the release of the weights.

Stable Diffusion 3 Medium, our most advanced text-to-image model, will soon be available! You can download the weights from Hugging Face starting Wednesday, June 12.

SD3 Medium is the SD3 model with 2 billion parameters, designed to excel in areas where previous models struggled. Key features include:

• Photorealism: Overcomes common artifacts in hands and faces to deliver high-quality images without complex workflows.
• Typography: Provides powerful typography results that surpass the latest large models.
• Performance: Optimized size and efficiency make it ideal for both consumer systems and enterprise workloads.
• Fine-Tuning: Can absorb fine details from small datasets, perfect for customization and creativity.

SD3 Medium weights and code are available for non-commercial use only. If you wish to discuss a self-hosting license for commercial use of Stable Diffusion 3, please fill out the form below and our team will contact you shortly.
What exactly are "nodes" and "workflows" on an AI image platform? (An explanation for beginners)

The Traditional Way of Generating AI Images for the Beginner

If you are a beginner in the AI community, you may be confused and have no clue what a "Node" and a "Workflow" are, or how they relate to the "AI Tools" in TensorArt.

Let's start with the simplest way. First we need to mention how a user generates an image with the "Remixing" button, which brings us to the normal creation menu.

Needless to say, you just edit the prompt (what you would like your picture to look like) and the negative prompt (what you do not want to see in the output image), then push the Generate button, and the wonderful AI tool will kindly draw a new illustration for you within a minute!

That sounds great, don't you think, if we imagine how much time humans spent in the past to publish just a single piece of art? (Yeah, today, in 2024, in my personal opinion, AI and human abilities still cannot fully replace each other, especially when it comes to beautiful, perfect hands :P)

However, everything behind the user-friendly menu that lets us "Select model", "Add LoRA", "Add ControlNet", "Set the aspect ratio (the original size of the image)" and so on is a collection of "Nodes" in a very complex "Workflow".

PS.1 "Checkpoint" and "Model" often refer to the same thing. They are the core program that has been trained to draw the illustration. Each one has its strengths and weaknesses (e.g. anime-oriented or realistic-oriented).

PS.2 A LoRA (Low-Rank Adaptation) is like an add-on to the Model that allows it to adapt to a different style, theme, or user preference. A concrete example is an anime character LoRA.

PS.3 A ControlNet is like a condition setting for the image. It helps the model understand what the text prompt alone cannot describe, for instance how a character poses in each direction and the angle of the camera.

So here comes "ComfyFlow" (the nickname of the Workflow; people also refer to it as "ComfyUI"), which gave me a super headache when I saw something like this for the first time in my life!

(This image is a flow I have spent a lot of time studying. It is a flow for combining what is in two images into a single one.)

Yeah, maybe it is my fault that I did not take a class about workflows from the beginning or search for a tutorial on YouTube the first time (as my first language is not English). But wouldn't it be better if we had an instructor to tell us step by step here in Tensor.Art?

That is the reason why I got inspired to write this article solely for beginners. So let's start with the main content of the article.

What is ComfyFlow

ComfyFlow, or the Workflow, is an innovative AI image-generating platform that allows users to create stunning visuals with ease. To get the most out of this tool, it's important to understand two key concepts: "workflow" and "node." Let's break these down in the simplest way possible.

What is a Workflow?

A workflow is like a blueprint or a recipe that guides the creation of an image. Just as a recipe outlines the steps to make a dish, a workflow outlines the steps and processes needed to generate an image. It's a sequence of actions that the AI follows to produce the final output.

Think of it like this:

Recipe (Workflow): Tells you what ingredients to use and in what order.
Ingredients (Nodes): Each step or component used in the recipe.

Although TensorArt kindly gives users recommended pre-set templates, from a beginner's viewpoint, without knowledge of workflows, they are not that helpful, because after clicking the "Try" button we are bombarded with the complexity of the nodes!

What is a Node?

Nodes are the building blocks of a workflow. Each node represents a specific action or process that contributes to the final image. In ComfyFlow, nodes can be thought of as individual steps in the workflow, each performing a distinct function.

Imagine nodes as parts of a puzzle:

Nodes: Individual pieces that fit together to complete the picture (workflow).

How Do Workflows and Nodes Work Together?

1-2) Starting Point: Every workflow begins with an initial node, which might be an image input from the user, together with the Checkpoint and LoRA serving the role of image references.
3-4) Processing Nodes: These are nodes that draw or modify the image in some way, such as adding color or texture, or applying filters.
5) Ending Point: The node that outputs the completed image. It works very closely with the node of the previous stage in terms of sampling and the VAE.

PS. A Variational Autoencoder (VAE) is a generative model that learns from input data, such as images, to reconstruct them and to generate new, similar images or variations based on the patterns it has learned.

Here is the list of nodes I used for a normal image generation of my Waifu, using 1 checkpoint and 2 LoRAs, to help the reader understand how ComfyFlow works.

The numbers 1-5 represent the overall process of the workflow and the role of each type of node I have mentioned above. However, for more complex tasks like those in AI Tools, the number of nodes is sometimes higher than 30!

By the way, when starting with an empty ComfyFlow page, the way to add a node is "Right Click" -> "Add Node" -> scroll to the top, since the most frequently used nodes are over there.

1) loaders -> Load Checkpoint

Like in the normal task creation menu, this node is where we choose the Checkpoint, or core model.

It is important to note that nodes work together through inputs and outputs. Each "MODEL/CLIP/VAE" output circle has to connect to the corresponding input on the next node. We link them by left-clicking inside the circle and then dragging to the destination.

PS. CLIP (Contrastive Language-Image Pre-training) is a model developed by OpenAI that links images and text together in a way that helps AI understand and generate images based on textual descriptions.

2) loaders -> Load LoRA

The Checkpoint is very closely related to the LoRA, and that is why they are connected through the inputs/outputs named "model/MODEL" and "clip/CLIP".

Since in this example I have used 2 LoRAs (the first for the theme of the picture and the second for the character reference of my Waifu), the two LoRA nodes have to be connected to each other as well. Here we can adjust the strength, or weight, of the LoRA, just as in the normal task generation menu.

3) CLIP Text Encode (Prompt)

This node is the prompt and negative prompt we normally see in the menu. The only input here is clip (Contrastive Language-Image Pre-training) and the output is "CONDITIONING".

User tip: If you click on the output circle of the "Load LoRA" node and drag it to an empty area, ComfyFlow will pop up a list of corresponding next nodes so you can create a new one with ease.

4) KSampler & Empty Latent Image

The sampling method tells the AI how it should start generating visual patterns from the initial noise, and everything associated with adjusting it is set here in this sampling node, together with the "Empty Latent Image" node.

The inputs in this step are model (from the LoRA node) and positive and negative (from the prompt nodes), and the output is "LATENT".

5) VAE Decode & Final output node

Once we have set up the sampling node, its output named "LATENT" has to connect to "samples". Meanwhile, "vae" is the link between this node and the "Load Checkpoint" node from the beginning.

When everything is done, the "IMAGE", as the final output, will be served into your hands.

PS. An AI Tool is a more complex Workflow created to do a specific task, such as swapping the face of the person in the original picture with a target face, or changing the style of the input illustration to another one, etc.
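The five steps above can also be written down as data. The sketch below is only an illustration: it assumes the dictionary layout and node class names that ComfyUI uses when a workflow is exported in its API format (`CheckpointLoaderSimple`, `LoraLoader`, `CLIPTextEncode`, `EmptyLatentImage`, `KSampler`, `VAEDecode`, `SaveImage`), where a link is written as [source node id, output index]. Whether TensorArt's ComfyFlow uses exactly the same names is an assumption, and every file name, prompt, and setting is a made-up placeholder. The helper `execution_order` is hypothetical and only shows that a node can run once the nodes it reads from have run.

```python
# A link is [source_node_id, output_index]; for the checkpoint loader the
# outputs are 0 = MODEL, 1 = CLIP, 2 = VAE. All file names are placeholders.
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "example_checkpoint.safetensors"}},
    "2": {"class_type": "LoraLoader",  # first LoRA: theme of the picture
          "inputs": {"model": ["1", 0], "clip": ["1", 1],
                     "lora_name": "theme.safetensors",
                     "strength_model": 0.8, "strength_clip": 0.8}},
    "3": {"class_type": "LoraLoader",  # second LoRA: character reference
          "inputs": {"model": ["2", 0], "clip": ["2", 1],
                     "lora_name": "character.safetensors",
                     "strength_model": 0.8, "strength_clip": 0.8}},
    "4": {"class_type": "CLIPTextEncode",  # positive prompt
          "inputs": {"clip": ["3", 1], "text": "1girl, flower garden"}},
    "5": {"class_type": "CLIPTextEncode",  # negative prompt
          "inputs": {"clip": ["3", 1], "text": "lowres, bad hands"}},
    "6": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 768, "height": 1152, "batch_size": 1}},
    "7": {"class_type": "KSampler",
          "inputs": {"model": ["3", 0], "positive": ["4", 0], "negative": ["5", 0],
                     "latent_image": ["6", 0], "seed": 0, "steps": 25, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0}},
    "8": {"class_type": "VAEDecode",
          "inputs": {"samples": ["7", 0], "vae": ["1", 2]}},
    "9": {"class_type": "SaveImage",
          "inputs": {"images": ["8", 0], "filename_prefix": "example"}},
}


def upstream(node):
    """IDs of the nodes this node reads from (inputs that are links)."""
    return [value[0] for value in node["inputs"].values() if isinstance(value, list)]


def execution_order(graph):
    """Depth-first topological sort: every node appears after the nodes it depends on."""
    order, seen = [], set()

    def visit(node_id):
        if node_id in seen:
            return
        seen.add(node_id)
        for dependency in upstream(graph[node_id]):
            visit(dependency)
        order.append(node_id)

    for node_id in graph:
        visit(node_id)
    return order


order = execution_order(workflow)
for node_id in order:
    print(node_id, workflow[node_id]["class_type"])
```

Reading the dictionary top to bottom follows the same path as dragging the circles in the editor: the "vae" input of node 8 points back to output 2 of the checkpoint loader, which is the link described in step 5.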
PhotoReal Makeup Edition - V3 Slider

PhotoReal Makeup Edition - V3 Slider (no trigger)

Introducing the PhotoReal Makeup Edition - V3 Slider! Slide to the right to add beautiful, realistic makeup. Slide to the left to reduce the makeup effect for a more natural look. It's perfect for adjusting the makeup to get just the style you want.

Try it out and see the amazing changes you can make!

More Information:
- Model link

Your feedback is invaluable to me. Feel free to share your experiences and suggestions in the comment section. For more personal interactions, join our Discord server where we can discuss and learn together.

Thank you for your continued support!

Tips for new Users

Intro

Hey there! If you're reading this, you're probably new to AI image generation and want to learn more. If you're not, you probably already know more than me :). Yeah, full disclosure: I'm still pretty inexperienced at this whole thing, but I thought I could still share some of the things I've learned with you! So, in no particular order:

1. You can like your own posts

I doubt there's anyone who doesn't know this already, but if you're posting your favorite generations and you care about getting likes, you can always like them yourself. Sketchy? Kinda. Do I still do it? Yes. And on the topic of getting more likes:

2. Likes will often be returned

Whenever I receive a like on one of my posts, I'll look at that person's pictures and heart any that I particularly enjoy. I know a lot of people do this, so one of the best ways to get people to notice and like your content is to just browse through posts and be generous with your own likes. It's a great way to get inspiration too!

3. Use turbo/lightning LORAs

If you find yourself running out of credits, there are ways to conserve them. When I'm iterating on an idea, I'll use an SDXL model (Meina XL) paired with this LORA. This lets me get high quality images in 10 steps for only 0.4 credits! It's really nice, and works with any SDXL model. Unfortunately, if there is a similar method for speeding up SD 1.5 models I don't know it, so it only works with XL.

4. Use ADetailer smartly

ADetailer is the best solution I've found for improving faces and hands. It's also a little difficult to figure out. So, though I'm still not a professional with it, I thought I could share some of the tricks I've learned. The models I normally use are face_yolo8s.pt and hand_yolo8s.pt. The "8s" versions are better than the "8n" versions, though they are slightly slower. In addition to these models, I'll often add the Attractive Eyes and Perfect Hand LORAs respectively. These are all just little things you can do to improve these notoriously hard parts of image generation. Also, using ADetailer before upscaling the image is cheaper in terms of credits, though the upscaling process can sometimes mess up the hands and face a little bit, so there's some give and take there.

5. Use an image editing app

Wait a minute, I hear you saying, isn't this a guide for using Tensor Art? Yes, but you can still use other tools to improve your images. If I don't like a specific part of my image, I'll download it, open it in Krita (or Photoshop or GIMP) and work on it. My art skills are pretty bad (which is why I'm using this site in the first place), but I can still remove, recolor, or edit certain aspects of the image. I can then reupload it to Tensor Art and run img2img with a high denoising strength to improve it further. You could also just try inpainting the specific thing you want to change, but I always find it a bit of a struggle to get inpaint to make the changes I want.

6. Experiment!

The best way to learn is to do, so just start generating images, fiddling with settings, and trying new things. I still feel like I'm learning new stuff every day, and this technology is improving so fast that I don't think anyone will ever truly master it. But we can still try our hardest and hone our skills through experimentation, sharing knowledge, and getting more familiar with these models. And all the anime girls are a big plus too.

Outro

If you have anything to add, or even a tip you'd like to share, definitely leave a comment and maybe I can add it to this article. This list is obviously not exhaustive, and I'm nowhere near as talented as some of the people on this platform. Still though, I hope to have helped at least one person today. If that was you, maybe give the article a like? I appreciate it a ton, so if you enjoyed, just let me know. Thanks for reading!
• MOOD MAGIC SERIES • I. Melancholy

MOOD MAGIC: adding emotion to your prompts

Melancholy & Gloom

Overcast: Cloud-covered skies for subdued lighting.
Dim Lighting: Limited light sources for creating deep shadows.
Muted Colors: Toned-down color palette to convey sadness or desolation.
Dusky: Twilight ambiance, suggesting the fading light of day.
Foggy: A thick mist that obscures details and softens the scene.
Drizzly: Gentle rain that adds a reflective, melancholic quality.
Cloudy: Thick clouds that reduce brightness and saturate the scene with grey.
Desaturated: Low color saturation to enhance the bleak feel.
Shadowed: Prominent shadows that deepen the mood.
Moody Lighting: Emotionally charged lighting with strong contrasts.
Gloomy: Overall dark and dismal atmosphere.
Monochrome: Black and white or single-color dominance to strip away cheer.
Underexposed: Darker exposure to mimic a sense of foreboding.
Chiaroscuro: Strong contrasts between light and dark, emphasizing turmoil.
Hazy: Blurred or smoky atmosphere, creating a sense of mystery or unease.
Twilight: Dim natural lighting that can feel lonely or isolating.
Stormy: Implication of an approaching or ongoing storm to add tension.
Wintery: Cold, barren landscape cues, even in urban settings.
Grainy: Visual noise that adds an old or troubled quality.
Bleak: Stark, harsh lighting or barren scenery settings.
Ominous Clouds: Dark, menacing clouds that threaten bad weather.
Subdued Tones: Soft, low-key colors that don't catch the eye.
Cold Colors: Blues and greys to suggest chilliness and discomfort.
Rusty: Implications of decay and neglect.
Aged: A sense of time wearing down the scene, historical weariness.
Soft Focus: Slightly out-of-focus elements to create a sense of disorientation or confusion.
Tenebrous: Deeply shadowed, almost pitch-dark.
Low-Key Lighting: Minimal lighting, mostly in darkness with occasional highlights.
Pensive: Engaged in, involving, or reflecting deep or serious thought.
Yearning: A feeling of intense longing for something, typically something that one has lost or been separated from.
Weary: Conveying a sense of tiredness or exhaustion, both physical and emotional.
Sparse: Minimalist or bare settings that suggest simplicity or emptiness.
Brooding: A deep, serious, and sometimes dark contemplation.
Silent: Lack of sound or motion, emphasizing solitude or contemplation.
Ephemeral: Fleeting or transitory, suggesting the transient nature of moments and emotions.
Desolate: Emptiness that conveys a sense of abandonment or loneliness.
Poetic: Imbued with a sense of beauty and melancholy, often through lyrical expression.
Moody Skies: Cloudy, stormy, or unsettled skies that reflect a turbulent emotional landscape.
Cold Light: Harsh, unyielding light that doesn't warm but isolates subjects.
Autumnal: Related to autumn, often seen as a melancholic season due to its association with the end of summer.
Faded: Colors or elements that have lost brightness, suggesting the passing of time.
Blue Hour: Moody cool natural lighting obtained in the twilight hour just after sunset or just before sunrise.

Example using Stable Diffusion SDXL + refiner

Checkpoint: RealVis4
Cfg: 5.5
Steps: 40
Sampler: DPM++ 3M SDE Karras

Visualize a close-up portrait of a young woman standing by a foggy window, her gaze distant and contemplative. The room is dimly lit, with only a soft, diffuse light filtering through the heavy overcast outside, casting subtle shadows across her face. The colors are desaturated, emphasizing a palette of cool grays and muted blues that reflect her somber mood. Her expression is serene yet melancholic, with her eyes slightly downcast as if lost in thought. The background is blurred, enhancing the sense of isolation and introspection. This portrait captures the essence of melancholy, framed in a moment of quiet solitude.

negative: illustration, cartoon, anime, 3d, digital art, bad quality, CGI, sketch, drawn, blurry, painting, worst quality, low quality, bad anatomy, bad hands, bad body, missing fingers, extra digit, fewer digits
Buzz words: LIGHTING

Getting the lighting right is key to making your AI-generated images look super realistic. This guide gives you the top keywords to use in your prompts to nail the lighting every time. Whether you're after dramatic shadows or soft, natural light, these tips will help your images look lifelike and set the tone of your composition.

Ambient light: Soft, even lighting that fills the entire scene, reducing shadows.
Chiaroscuro Lighting: A technique that uses strong contrasts between light and dark to create a dramatic, three-dimensional effect.
Rim light: Light that outlines the subject, emphasizing its edges and creating a glowing effect.
Diffused light: Soft light scattered in many directions, minimizing harsh shadows.
Natural light: Light from the sun, moon, or other natural sources, offering realism and variation.
Backlight: Light coming from behind the subject, creating a silhouette or halo effect.
Volumetric light: Light that interacts with particles in the air, such as fog or dust, creating visible light rays and enhancing the sense of depth in the scene.
Polarized light: Light that vibrates in parallel planes.
Emissive light: Light emitted from surfaces or objects themselves, often used to simulate glowing materials or lights.
Directional light: Focused light from a specific direction, creating strong shadows and highlights.
Soft light: Gentle light that produces minimal shadows, creating a smoother look.
Hard light: Sharp, intense light that casts strong shadows and highlights details.
Spotlight: Intense focused beam that highlights a set area or subject.
Artificial light: Light from man-made sources, allowing precise control over the scene. Examples: halogen, fluorescent, blacklight, LED, xenon, plasma, ultraviolet, incandescent, neon, infrared, sodium vapor lights, metal halide lights, krypton, photoluminescent, ceramic metal halide, HMI, CCFL, CFL.
Low Key Light: Predominantly dark lighting with high contrast, often creating a dramatic or moody atmosphere.
High Key Light: Bright, low-contrast lighting that minimizes shadows.
Bounce Lighting/Reflected Lighting: Light reflected off a surface to soften the effect and spread it more evenly.
Side Lighting: Light coming from the side of the subject.
Caustic Lighting: Light patterns created when light is refracted or reflected through transparent or reflective materials, producing intricate and often beautiful effects.
Uplighting: Light directed upwards. Great for emphasizing architectural features.
Color Gel Lighting: The use of colored filters over lights to alter the color or mood of the scene.
Gobo Lighting: Using a stencil or template placed in front of a light source to project patterns or shapes onto a surface.
Split Lighting: Lighting that illuminates one half of the subject's face while leaving the other half in shadow, creating a strong, dramatic effect.
Butterfly Lighting: Light placed above and in front of the subject, creating a butterfly-shaped shadow under the nose, often used in glamour photography.
Rembrandt Lighting: Technique where light creates a triangle of illumination on the cheek opposite the light source, adding depth and character.
Specular lighting: Sharp, bright reflections from shiny surfaces, emphasizing glossiness and texture.
Natural Breakup Lighting/Dappled Lighting: Using irregular patterns to mimic natural light effects, such as light filtering through leaves.
Subsurface Scattering: Light that penetrates the surface of a translucent material, scattering within and then exiting at a different point, adding realism to materials like skin or wax.
Golden Hour: Warm golden natural lighting obtained shortly after sunrise or shortly before sunset. Creates long soft shadows.
Blue Hour: Moody cool natural lighting obtained in the twilight hour just after sunset or just before sunrise.
Clamshell Lighting: Portrait lighting setup using two light sources, one above and one below the subject's face.
Catch light: A small reflection of the light source in the subject's eyes, adding life and dimension to portraits.
Cross lighting: Two light sources positioned at opposite sides of the subject, creating dramatic shadows and highlights.
Tenebrism: Aggressive contrast between light and dark, producing dark and gloomy images.
Contre-jour: Lighting technique that produces clear silhouettes by the use of backlighting.
Sfumato: Artistic technique that uses soft transitions between colors and tones, resulting in a dreamy effect with no clear boundaries, e.g. the Mona Lisa.
Ray tracing: Rendering technique that simulates the way light interacts with the scene. It traces the light from the source as it bounces off surfaces and reaches the viewer's eye.
Three point lighting: Cinematic lighting technique using key light, fill light and backlight.
Global Illumination: Computer graphics technique that adds more realistic lighting to 3D scenery.
Bloom: Simulates the glow around bright light sources, creating a soft halo.
Luminescence: Emission of light by a substance not resulting from heat. It occurs through various processes such as chemical reactions, electrical energy, or other means.
Bioluminescence: A cold light produced by a chemical reaction inside a living organism.
Quickstart Guide to Stable Video Diffusion

What is Stable Video Diffusion (SVD)?

Stable Video Diffusion (SVD), from Stability AI, is an extremely powerful image-to-video model. It accepts an image input and "injects" motion into it, producing some fantastic scenes.

SVD is a latent diffusion model trained to generate short video clips from image inputs. There are two models. The first, img2vid, was trained to generate 14 frames of motion at a resolution of 576×1024, and the second, img2vid-xt, is a finetune of the first, trained to generate 25 frames of motion at the same resolution.

The newly released (2/2024) SVD 1.1 is further finetuned on a set of parameters to produce excellent, high-quality outputs, but requires specific settings, detailed below.

Why should I be excited by SVD?

SVD creates beautifully consistent video movement from our static images!

How can I use SVD?

ComfyUI is leading the pack when it comes to SVD generation, with official SVD support! Generating 25 frames of 1024×576 video uses < 10 GB of VRAM.

It's entirely possible to run the img2vid and img2vid-xt models on a GTX 1080 with 8GB of VRAM!

There's still no word (as of 11/28) on official SVD support in Automatic1111.

If you'd like to try SVD on Google Colab, this notebook works on the Free Tier: https://github.com/sagiodev/stable-video-diffusion-img2vid/. Generation time varies, but is generally around 2 minutes on a V100 GPU.

You'll need to download one of the SVD models from the links below and place it in the ComfyUI/models/checkpoints directory.

After updating your ComfyUI installation, you'll see new nodes for VideoLinearCFGGuidance and SVD_img2vid_Conditioning. The Conditioning node takes the following inputs:

You can download ComfyUI workflows for img2video and txt2video below, but keep in mind you'll need an updated ComfyUI, and you may also be missing additional nodes for video. I recommend using the ComfyUI Manager to identify and download missing nodes!

Suggested Settings

The settings below are suggested settings for each SVD component (node), which I've found produce the most consistently usable outputs with the img2vid and img2vid-xt models.

Settings - Img2vid-xt-1.1

February 2024 saw the release of a finetuned SVD model, version 1.1. This version only works with a very specific set of parameters to improve the consistency of outputs. If using the Img2vid-xt-1.1 model, the following settings must be applied to produce the best results:

The easiest way to generate videos

In Tensor.Art, you can generate videos very easily compared to the explanation above. All you need to do is input the prompt you want, select the model you like, and set the ratio and the frames in the AnimateDiff menu.

Output Examples

Limitations

It's not perfect! Currently there are a few issues with the implementation, including:

- Generations are short! Only <=4 second generations are possible at present.
- Sometimes there's no motion in the outputs. We can tweak the conditioning parameters, but sometimes the images just refuse to move.
- The models cannot be controlled through text.
- Faces, and bodies in general, often aren't the best!
List of style collections - focusing on anime character examples (continuously updated)

AI image-generating platforms like Tensor.art offer diverse anime styles, enabling users to create artwork in a variety of distinct styles inspired by popular anime aesthetics. These collections aim to cater to different preferences, from classic to contemporary anime illustration, in one place.

P.S.1 I will continue updating this post, maybe every 2 weeks, whenever I find a unique style (LoRA or model) that I think is worth listing here, purely from my own perspective. If anyone has a list of favorite styles in mind, feel free to share it here or even create your own post. :D

P.S.2 People normally mix multiple LoRAs at once, and the base style of the core model (checkpoint) varies with the prompt used. Therefore, in each of the following examples I use only a single LoRA or checkpoint, without mixing anything. If there is still any confusion about what contributes to a style, I apologize in advance, since I am just a beginner in the art community.

Here are some examples:

Anime Lineart / Manga-like (线稿/線画/マンガ風/漫画风) Style (LORA) https://tensor.art/models/623935989624337542
Spacezin Sketch Style (LoRA) https://tensor.art/models/638083414328801488
Cute Chibi - V.1 (LoRA) https://tensor.art/models/726716640076597245
CAT - Citron Anime Treasure (Checkpoint) https://tensor.art/models/713607777118974323
LizMix V.7.0 (Checkpoint) https://tensor.art/models/721034681811855891
Flower style - (LORA) https://tensor.art/models/699582840586758007
Art Nouveau Style - Oosayam (LoRA) https://tensor.art/models/654562112921690173
Torino Style - v.2.0.09 (LoRA) https://tensor.art/models/705577639974520212
Yody PVC 3D Print - 1.0 (Checkpoint) https://tensor.art/models/673632484975460872
Eldritch Expressionism style (LoRA) https://tensor.art/models/708171473803739178
[Y5] Impressionism Style 印象派风格 (LoRA) https://tensor.art/models/621173217551417505
surrealism - 2024-02-17 (LoRA) https://tensor.art/models/695557949424221333
pop-art - 01 style (LoRA) https://tensor.art/models/697182692602582375
FF Style: Kazimir Malevich | Suprematism (LoRA) https://tensor.art/models/655758742350092928

I hope these collections (today and in the future) will allow AI artists and enthusiasts to generate anime-inspired images effortlessly, blending creativity with advanced AI technology to bring their visions to life. :D
Prompt reference for "Lighting Effects"

Hello. I usually use "lighting/lighting effects" when generating images. Here I will introduce some of the "words" I use when I want to add something.

Please note that these words alone are not 100% effective, and the effect you get will differ depending on the base model, the LoRA, the sampling method, and where you place the word in the prompt.

Words related to "lighting effects"

・ Backlight : Light from behind the subject.
・ Colorful lighting : The picture itself is not tinted, but the colors change depending on the light.
・ Moody lighting : Natural lighting, not direct artificial light.
・ Studio lighting : A term for the artificial lighting of a photography studio.
・ Directional light : A light source that shines parallel rays in a selected direction.
・ Dramatic lighting : A lighting technique from the field of photography.
・ Spot lighting : A lighting technique that uses artificial light on a small area.
・ Cinematic lighting : A single term that covers several lighting techniques used in movies.
・ Bounce lighting : Light bounced off a reflector, etc.
・ Practical lighting : The light source itself appears in the composition, as in photographs and videos.
・ Volumetric lighting : A term derived from 3DCG. It tends to give a picture with a divine, golden light source.
・ Dynamic lighting : I don't really understand what it means, but it tends to create high-contrast images.
・ Warm lighting : Creates a warm picture lit in warm colors.
・ Cold lighting : Lighting from a cold light source.
・ High-key lighting : Soft light, minimal shadows, and low contrast, resulting in bright frames.
・ Low-key lighting : Gives high contrast, but the effect is a little weak.
・ Hard light : Strong light. Highlights appear strong.
・ Soft light : A term for faint light.
・ Strobe lighting : Strong artificial light (stroboscopic lighting).
・ Ambient light : A term for ambient lighting/indoor lighting.
・ Flash lighting : For some reason the characters themselves tend to emit light, and there are often flashes of light (flash lighting photography).
・ Natural lighting : Tends to create a natural-looking picture, in contrast with artificial light.
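Since the effect depends on where the word sits in the prompt, it can help to generate the same prompt with the lighting word at the start and at the end and compare the results. The sketch below is only an illustration: `add_lighting` is a made-up helper, not part of Tensor.art or any library, and the `(word:1.2)` weight syntax is an assumption that holds for A1111-style prompt parsers but may not apply to every UI.

```python
def add_lighting(prompt, term, position="end", weight=None):
    """Return `prompt` with a lighting term added at the start or the end.

    Hypothetical helper for illustration; the prompt is treated as a
    comma-separated list of tags.
    """
    tags = [t.strip() for t in prompt.split(",") if t.strip()]
    # Assumption: "(term:1.2)" is the attention-weight syntax of A1111-style UIs.
    tag = f"({term}:{weight})" if weight is not None else term
    if position == "start":
        tags.insert(0, tag)
    elif position == "end":
        tags.append(tag)
    else:
        raise ValueError("position must be 'start' or 'end'")
    return ", ".join(tags)


base = "1girl, silver hair, city street at night"
for pos in ("start", "end"):
    # Same base prompt, lighting word in two different places.
    print(add_lighting(base, "cinematic lighting", position=pos))
```

Generating both variants with the same seed makes it easier to judge how much the placement alone changes the picture.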
The future of AI image generation: endless possibilities -

Introduction
(For those who are about to start AI image generation)

In recent years, advances in AI technology have brought about revolutionary changes in the field of image generation. In particular, AI-powered illustration generation has become a powerful tool for artists and designers. However, as this technology advances, issues of creativity and copyright arise. In this article, we will explain the possibilities of AI image generation, specific use cases, how to create prompts, how to use LoRA and its effects, keywords for improving image quality, consideration for copyright, etc.

Fundamentals of AI image generation

AI image generation uses artificial intelligence to learn from data and generate new images. Deep learning techniques are often used for this, and one notable approach is Stable Diffusion. Stable Diffusion employs a probabilistic method called a diffusion model to gradually remove noise during image generation, resulting in highly realistic, high-quality output.

Generating realistic images

AI technology is excellent not only for creating cute illustrations, but also for generating realistic images. For example, you can generate high-resolution, photorealistic landscapes or portraits. By utilizing Stable Diffusion, it is possible to generate more detailed images, which expands the possibilities of application in various fields such as advertising, film production, and game design.

Generating cute illustrations

One of the practical applications of AI image generation is the creation of cute illustrations. This is useful for things like character design and avatar creation, allowing you to quickly generate different styles. This process typically involves collecting a large dataset of illustrations, training an AI model on this data to learn different styles and patterns, and generating new illustrations based on user input or keywords.

Creativity and AI

AI image generation also influences creative ideas.
Artists can use AI-generated images as inspiration for new works or to expand on ideas, which can lead to the creation of new styles and concepts never thought of before.

Use and effects of LoRA

LoRA (Low-Rank Adaptation) is a technique for efficiently fine-tuning AI models. Its impacts include:

1. Fine-tune models: LoRA allows you to fine-tune existing AI models to learn specific styles and features, allowing for customization based on user needs.
2. Efficient learning: LoRA reduces the need for large-scale data collection and training costs by efficiently training models using small datasets.
3. Rapid adaptation: LoRA allows you to quickly adapt to new styles and trends, making it easy to generate images tailored to your current needs.

For example, LoRA can be leveraged to efficiently achieve high-quality results when generating illustrations in a specific style.

Creating a prompt

When instructing an AI to generate illustrations, it's important to create effective prompts. Key points for creating prompts include providing specific instructions, using the right keywords, trial and error, and, optionally, a reference image to help the AI figure out what you're looking for.

Keywords for improving image quality

When creating prompts for AI image generation, you can incorporate keywords related to image quality to improve the overall quality of the images generated. Useful keywords include "high resolution," "detail," "clean lines," "high quality," "sharp," "bright colors," and "photorealistic."

Copyright considerations

Image generation using AI also raises copyright issues. If the dataset used to train an AI model contains copyrighted works, the resulting images may infringe copyright.
When using AI image generation tools, it's important to be aware of the data source, ensure that the generated images comply with copyright laws, and check the license agreement.

Conclusion

AI image generation offers great possibilities for artists and designers, but it also raises challenges related to copyright. By using data responsibly and understanding copyright law, you can leverage AI technology to create innovative work. Leveraging technologies like LoRA can further improve efficiency and quality. Users can adjust the output by incorporating image enhancement keywords into the prompt. Let's explore new ways of expression while being aware of advances in AI technology and the considerations that come with it!
Stylistic QR Code with Stable Diffusion

source: antfu.me (you can now easily create QR codes with ControlNet inside tensor.art; next time I will write a guide about that)

Yesterday, I created this image using Stable Diffusion and ControlNet, and shared it on Twitter and Instagram – an illustration that also functions as a scannable QR code.

The process of creating it was super fun, and I’m quite satisfied with the outcome.

In this post, I would like to share some insights into my learning journey and the approaches I adopted to create this image. Additionally, I want to take this opportunity to credit the remarkable tools and models that made this project possible.

Get into the Stable Diffusion

This year has witnessed an explosion of mind-boggling AI technologies, such as ChatGPT, DALL-E, Midjourney, Stable Diffusion, and many more. As a former photographer with some interest in design and art, I find being able to generate images directly from imagination in minutes undeniably tempting.

So I started by trying Midjourney. It’s super easy to use, very expressive, and the quality is actually pretty good. It would honestly be my recommendation for anyone who wants to get started with generative AI art.

By the way, Inès has also delved into it and become quite good at it; go check her work on her new Instagram account @a.i.nes.

On my end, being a programmer with strong preferences, I would naturally seek greater control over the process. This brought me to the realm of Stable Diffusion. I started with this guide: Stable Diffusion LoRA Models: A Complete Guide. The benefit of being late to the party is that there are already a lot of tools and guides ready to use. Setting up the environment is quite straightforward, and luckily my M1 Max’s GPU is supported.

QR Code Image

A few weeks ago, nhciao on reddit posted a series of artistic QR codes created using Stable Diffusion and ControlNet. The concept behind them fascinated me, and I definitely wanted to make one of my own. So I did some research and managed to find the original article in Chinese: Use AI to Generate Scannable Images. The author provided insights into their motivations and the process of training the model, although they did not release the model itself. On the other hand, they are building a service called QRBTF.AI to generate such QR codes; however, it is not yet available.

Then one day I found a community model, QR Pattern Controlnet Model, on CivitAI. I knew I had to give it a try!

Setup

My goal was to generate a QR code image that directs to my website while including elements that reflect my interests. I ended up going for a slightly cypherpunk style with a character representing myself :P

Disclaimer: I’m certainly far from being an expert in AI or related fields. In this post, I’m simply sharing what I’ve learned and the process I followed. My understanding may not be entirely accurate, and there are likely optimizations that could simplify the process. If you have any suggestions or comments, please feel free to reach out using the links at the bottom of the page. Thank you!

1. Setup Environment

I pretty much followed Stable Diffusion LoRA Models: A Complete Guide to install the web ui AUTOMATIC1111/stable-diffusion-webui, download the models I was interested in from CivitAI, etc. As a side note, I found that the user experience of the web ui is not super friendly. Some of the problems are, I guess, architectural issues that might not be easy to improve, but luckily I found a pretty nice theme, canisminor1990/sd-webui-kitchen-theme, that improves a bunch of small things.

In order to use ControlNet, you will also need to install the Mikubill/sd-webui-controlnet extension for the web ui.

Then you can download the QR Pattern Controlnet Model, put the two files (.safetensors and .yaml) under the stable-diffusion-webui/models/ControlNet folder, and restart the web ui.

2. Create a QR Code

There are hundreds of QR Code generators full of ads or paid services, and we certainly don’t need that fanciness – because we are going to make it much fancier 😝!

So I ended up finding the QR Code Generator Library, a playground for an open source QR Code generator. It’s simple but exactly what I need! It’s better to use a medium error correction level or above, to make the code easier to recognize later. A small tip: you can try different Mask patterns to find a color distribution that better fits your design.

3. Text to Image

As in the regular Text2Image workflow, we need to provide some prompts for the AI to generate the image from. Here are the prompts I used:

Prompts
(one male engineer), medium curly hair, from side, (mechanics), circuit board, steampunk, machine, studio, table, science fiction, high contrast, high key, cinematic light, (masterpiece, top quality, best quality, official art, beautiful and aesthetic:1.3), extreme detailed, highest detailed, (ultra-detailed)

Negative Prompts
(worst quality, low quality:2), overexposure, watermark, text, easynegative, ugly, (blurry:2), bad_prompt,bad-artist, bad hand, ng_deepnegative_v1_75t

Then we need to go to the ControlNet section, upload the QR code image we generated earlier, and configure the parameters as suggested on the model homepage.

Then you can start to generate a few images and see if they meet your expectations. You will also need to check whether the generated image is scannable; if not, you can tweak the Starting Control Step and Ending Control Step to find a good balance between stylization and QRCode-likeness.

4. I’m feeling lucky!

After finding a set of parameters that I am happy with, I increase the Batch Count to around 100 and let the model generate variations randomly. Later I can go through them and pick the one with the best composition and details for further refinement. This can take a lot of time, and also a lot of resources from your processors.
So I usually start it before going to bed and leave it overnight.

Here are some examples of the generated variations (not all of them are scannable):

From approximately one hundred variations, I ultimately chose the following image as the starting point:

It has a pretty interesting composition, while being less obvious as a QR code. So I decided to proceed with it and add a bit more detail. (You can compare it with the final result to see the changes I made.)

5. Refining Details

Update: I recently built a toolkit to help with this process, check my new blog post 👉 Refine AI Generated QR Code for more details.

The generated images from the model are not perfect in every detail. For instance, you may have noticed that the hand and face appear slightly distorted, and the three anchor boxes in the corners are less visually appealing. We can use the inpaint feature to tell the model to redraw some parts of the image (it is better to keep the same or similar prompts as in the original generation).

Inpainting typically requires a similar amount of time to a text-to-image generation, and it involves either luck or patience. Often, I use Photoshop to "borrow" some parts from previously generated images, and use the spot healing brush tool to clean up glitches and artifacts. My Photoshop layers look like this:

After making these adjustments, I send the combined image back for inpainting again to ensure a more seamless blend, or to search for other components that I didn’t find in the other images.

Specifically for the QR Code, in some cases ControlNet may not have enough priority, causing the prompts to take over and leaving certain parts of the QR Code not matching. To address this, I overlay the original QR Code image onto the generated image (as shown in the left image below), identify any mismatches, and use a brush tool to paint those parts with the correct colors (as shown in the right image below).

I then export the marked image for inpainting once again, adjusting the Denoising strength to approximately 0.7. This ensures that the model overrides our marks while still respecting the color to some degree.

Ultimately, I iterate through this process multiple times until I am satisfied with every detail.

6. Upscaling

The recommended generation size is 920x920 pixels. However, the model does not always generate highly detailed results at the pixel level. As a result, details like the face and hands can appear blurry when they are too small. To overcome this, we can upscale the image, providing the model with more pixels to work with. The SD Upscaler script in the img2img tab is particularly effective for this purpose. You can refer to the guide Upscale Images With Stable Diffusion for more information.

7. Post-processing

Lastly, I use Photoshop and Lightroom for subtle color grading and post-processing, and we are done!

The one I ended up with doesn’t have very good error tolerance; you might need to try a few times or use a more forgiving scanner to get it scanned :P

And using a similar process, I made another one for Inès:

Conclusion

Creating this image took me a full day, with a total of 10 hours of learning, generating, and refining. The process was incredibly enjoyable for me, and I am thrilled with the end result! I hope this post can offer you some fundamental concepts or inspire you to embark on your own creative journey.
There is undoubtedly much more to explore in this field, and I am eager to see what’s coming next!

Join my Discord Server and let’s explore more together!

If you want to learn more about the refining process, go check my new blog post: Refining AI Generated QR Code.

References

Here is the list of resources for easier reference.

Concepts
Stable Diffusion
ControlNet

Tools
Hardware and software I am using.
AUTOMATIC1111/stable-diffusion-webui - Web UI for Stable Diffusion
canisminor1990/sd-webui-kitchen-theme - Nice UI enhancement
Mikubill/sd-webui-controlnet - ControlNet extension for the webui
QR Code Generator Library - QR code generator that is ad-free and customisable
Adobe Photoshop - The tool I used to blend the QR code and the illustration

Models
Control Net Models for QR Code (you can pick one of them)
QR Pattern Controlnet Model
Controlnet QR Code Monster
IoC Lab Control Net
Checkpoint Model (you can use any checkpoints you like)
Ghostmix Checkpoint - A very high quality checkpoint I use. You can use any other checkpoints you like

Tutorials
Stable Diffusion LoRA Models: A Complete Guide - The one I used to get started
(Chinese) Use AI to generate scannable images - Unfortunately the article is in Chinese and I didn’t find an English version of it.
Upscale Images With Stable Diffusion - Enlarge the image while adding more details
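As a side note on step 5: the post describes finding QR mismatches by eye, by overlaying the original code on the generated image in Photoshop. A rough automated version of that comparison might look like the sketch below. It is not the author's method. It assumes the generated image has already been reduced to one brightness value (0 to 255) per QR module, for example by averaging the pixels of each cell, and that a fixed threshold of 128 separates dark from light, which real scanners do not rely on. The function name and the input format are made up for illustration.

```python
def find_mismatches(qr_modules, luminance, threshold=128):
    """Return (row, col) of modules whose brightness disagrees with the QR code.

    qr_modules: 2D list of bools, True for a dark module (hypothetical format).
    luminance:  2D list of 0-255 values, one per module of the generated image.
    """
    if len(qr_modules) != len(luminance):
        raise ValueError("grids must have the same number of rows")
    mismatches = []
    for r, (qr_row, lum_row) in enumerate(zip(qr_modules, luminance)):
        if len(qr_row) != len(lum_row):
            raise ValueError(f"row {r} has a different length in the two grids")
        for c, (is_dark, value) in enumerate(zip(qr_row, lum_row)):
            looks_dark = value < threshold  # assumption: simple fixed threshold
            if looks_dark != is_dark:
                mismatches.append((r, c))
    return mismatches


# Tiny 3x3 example: the centre module should be dark but came out bright.
qr = [[True, False, True],
      [False, True, False],
      [True, False, True]]
image = [[20, 230, 35],
         [240, 200, 250],
         [10, 220, 40]]
print(find_mismatches(qr, image))
```

The returned coordinates are the cells one would then paint over with the correct color before sending the image back for inpainting.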
The Marvel of Tanjore Temple: A Timeless Treasure

Introduction

The Tanjore Temple, also known as Brihadeeswarar Temple, is a striking example of India’s architectural grandeur and rich cultural heritage. Nestled in the historic town of Thanjavur in Tamil Nadu, this UNESCO World Heritage Site draws thousands of visitors each year, eager to marvel at its towering vimana (temple tower), intricate carvings, and vibrant history.

Historical Background

Built by the great Chola emperor Raja Raja Chola I in the 11th century, the Tanjore Temple stands as a testament to the ingenuity and vision of ancient Indian architects and artisans. Completed in 1010 AD, it celebrated its millennium in 2010, marking a thousand years of awe-inspiring presence.

Architectural Splendor

The Vimana

The most striking feature of the Tanjore Temple is its colossal vimana, which rises to a height of 66 meters. This towering structure is crowned with a massive dome, made from a single piece of granite weighing approximately 80 tons. This engineering marvel leaves historians and architects alike in awe, given the lack of modern machinery during its construction.

The Sanctum

At the heart of the temple lies the sanctum sanctorum, housing a massive Shiva lingam. The inner walls of the sanctum are adorned with exquisite frescoes and murals, depicting various mythological scenes and showcasing the artistic brilliance of the Chola period.

Intricate Carvings

Every inch of the Tanjore Temple is a canvas of intricate carvings. From the elaborate depictions of deities and mythological narratives on the walls to the ornate pillars and ceilings, the temple is a visual feast. These carvings not only serve as decorative elements but also provide a glimpse into the socio-cultural milieu of the Chola dynasty.

Cultural Significance

Religious Importance

The Tanjore Temple is dedicated to Lord Shiva and holds immense religious significance for Hindus. It is one of the largest temples in India and serves as a major pilgrimage site, especially during festivals like Maha Shivaratri.
Devotees from across the country flock to the temple to seek blessings and participate in the vibrant festivities.

Artistic Heritage

The temple is a treasure trove of Chola art and architecture. The frescoes and murals, in particular, offer invaluable insights into the artistic and cultural landscape of the period. The depictions of dance forms, musical instruments, and attire provide a vivid picture of the era’s cultural richness.

Visiting Tanjore Temple

Best Time to Visit

The ideal time to visit Tanjore Temple is between October and March, when the weather is pleasant. The temple complex is open from early morning till evening, allowing visitors ample time to explore and soak in its magnificence.

How to Reach

Thanjavur is well-connected by road, rail, and air. The nearest airport is Tiruchirappalli International Airport, about 60 kilometers away. Thanjavur Junction is the nearest railway station, with regular trains from major cities like Chennai, Bangalore, and Coimbatore. Buses and taxis are also readily available for local transportation.

Accommodation

Thanjavur offers a range of accommodation options, from budget hotels to luxury resorts, catering to the diverse needs of travelers. Staying in the town allows visitors to explore not just the temple, but also other nearby attractions like the Thanjavur Royal Palace and the Saraswathi Mahal Library.

Conclusion

The Tanjore Temple is more than just an architectural marvel; it is a living testament to India’s rich cultural and religious heritage. Its towering vimana, intricate carvings, and historical significance make it a must-visit destination for history enthusiasts, art lovers, and spiritual seekers alike. Plan your visit to this timeless treasure and immerse yourself in the grandeur of the Chola dynasty.
[Guide] Make your own Loras, easy and free

This article helped me create my first Lora and upload it to Tensor.art. Tensor.art has its own Lora training tool, but this article helps you understand how to make a Lora well.

🏭 Preamble

Even if you don't know where to start or don't have a powerful computer, I can guide you to making your first Lora and more!

In this guide we'll be using resources from my GitHub page. If you're new to Stable Diffusion, I also have a full guide to generating your own images and learning useful tools.

I'm making this guide for the joy it brings me to share my hobbies and the work I put into them. I believe all information should be free for everyone, including image generation software. However, I do not support you if you want to use AI to trick people, scam people, or break the law. I just do it for fun.

Also, here's a page where I collect Hololive loras.

📃 What you need

An internet connection. You can even do this from your phone if you want to (as long as you can prevent the tab from closing).
Knowledge about what Loras are and how to use them.
Patience. I'll try to explain these new concepts in an easy way. Just try to read carefully, use critical thinking, and don't give up if you encounter errors.

🎴 Making a Lora

It has a reputation for being difficult. So many options, and nobody explains what any of them do. Well, I've streamlined the process such that anyone can make their own Lora starting from nothing in under an hour, all while keeping some advanced settings you can use later on.

You could of course train a Lora on your own computer, provided that you have an Nvidia graphics card with 6 GB of VRAM or more. We won't be doing that in this guide though; we'll be using Google Colab, which lets you borrow Google's powerful computers and graphics cards for free for a few hours a day (some say it's 20 hours a week). You can also pay $10 to get up to 50 extra hours, but you don't have to.
We'll also be using a little bit of Google Drive storage.

This guide focuses on anime, but it also works for photorealism. However, I won't help you if you want to copy real people's faces without their consent.

🎡 Types of Lora

As you may know, a Lora can be trained and used for:

A character or person
An artstyle
A pose
A piece of clothing
etc.

However, there are also different types of Lora now:

LoRA: The classic, works well for most cases.
LoCon: Has more layers which learn more aspects of the training data. Very good for artstyles.
LoHa, LoKR, (IA)^3: These use novel mathematical algorithms to process the training data. I won't cover them as I don't think they're very useful.

📊 First Half: Making a Dataset

This is the longest and most important part of making a Lora. A dataset is (for us) a collection of images and their descriptions, where each pair has the same filename (e.g. "1.png" and "1.txt"), and they all have something in common which you want the AI to learn. The quality of your dataset is essential: you want your images to have at least 2 examples of: poses, angles, backgrounds, clothes, etc. If all your images are face close-ups, for example, your Lora will have a hard time generating full body shots (but it's still possible!), unless you add a couple of examples of those. As you add more variety, the concept will be better understood, allowing the AI to create new things that weren't in the training data. For example, a character may then be generated in new poses and in different clothes. You can train a mediocre Lora with a bare minimum of 5 images, but I recommend 20 or more, and up to 1000.

As for the descriptions, for general images you want short and detailed sentences such as "full body photograph of a woman with blonde hair sitting on a chair". For anime you'll need to use booru tags (1girl, blonde hair, full body, on chair, etc.).
Let me describe how tags work in your dataset: you need to be detailed, as the Lora will reference what's going on by using the base model you use for training. If there is something in all your images that you don't include in your tags, it will become part of your Lora. This is because the Lora absorbs details that can't be described easily with words, such as faces and accessories. Thanks to this, you can let those details be absorbed into an activation tag, which is a unique word or phrase that goes at the start of every text file, and which makes your Lora easy to prompt.

You may gather your images online and describe them manually. But fortunately, you can do most of this process automatically using my new 📊 dataset maker colab.

Here are the steps:

1️⃣ Setup: This will connect to your Google Drive. Choose a simple name for your project and a folder structure you like, then run the cell by clicking the floating play button on the left side. It will ask for permission; accept to continue the guide.

If you already have images to train with, upload them to your Google Drive's "lora_training/datasets/project_name" (old) or "Loras/project_name/dataset" (new) folder, and you may choose to skip step 2.

2️⃣ Scrape images from Gelbooru: In the case of anime, we will use the vast collection of available art to train our Lora. Gelbooru sorts images through thousands of booru tags describing everything about an image, which is also how we'll tag our images later. Follow the instructions on the colab for this step; basically, you want to request images that contain specific tags that represent your concept, character or style. When you run this cell it will show you the results and ask if you want to continue. Once you're satisfied, type yes and wait a minute for your images to download.

3️⃣ Curate your images: There are a lot of duplicate images on Gelbooru, so we'll be using the FiftyOne AI to detect them and mark them for deletion.
This will take a couple of minutes once you run this cell. They won't be deleted yet though: eventually an interactive area will appear below the cell, displaying all your images in a grid. Here you can select the ones you don't like and mark them for deletion too. Follow the instructions in the colab. It is beneficial to delete low quality or unrelated images that slipped their way in. When you're finished, press Enter in the text box above the interactive area to apply your changes.

4️⃣ Tag your images: We'll be using the WD 1.4 tagger AI to assign anime tags that describe your images, or the BLIP AI to create captions for photorealistic/other images. This takes a few minutes. I've found good results with a tagging threshold of 0.35 to 0.5. After running this cell it'll show you the most common tags in your dataset, which will be useful for the next step.

5️⃣ Curate your tags: This step for anime tags is optional, but very useful. Here you can assign the activation tag (also called trigger word) for your Lora. If you're training a style, you probably don't want any activation tag, so that the Lora is always in effect. If you're training a character, I myself tend to delete (prune) common tags that are intrinsic to the character, such as body features and hair/eye color. This causes them to get absorbed by the activation tag. Pruning makes prompting with your Lora easier, but also less flexible. Some people like to prune all clothing to have a single tag that defines a character outfit; I do not recommend this, as too much pruning will affect some details. A more flexible approach is to merge tags: for example, if we have some redundant tags like "striped shirt, vertical stripes, vertical-striped shirt", we can replace all of them with just "striped shirt". You can run this step as many times as you want.

6️⃣ Ready: Your dataset is stored in your Google Drive.
You can do anything you want with it, but we'll be going straight to the second half of this tutorial to start training your Lora!

⭐ Second Half: Settings and Training

This is the tricky part. To train your Lora we'll use my ⭐ Lora trainer colab. It consists of a single cell with all the settings you need. Many of these settings don't need to be changed. However, this guide and the colab will explain what each of them does, so that you can play with them in the future.

Here are the settings:

▶️ Setup: Enter the same project name you used in the first half of the guide and it'll work automatically. Here you can also change the base model for training. There are 2 recommended default ones, but alternatively you can copy a direct download link to a custom model of your choice. Make sure to pick the same folder structure you used in the dataset maker.

▶️ Processing: Here are the settings that change how your dataset will be processed.

The resolution should stay at 512 this time, which is normal for Stable Diffusion. Increasing it makes training much slower, but it does help with finer details.
flip_aug is a trick to learn more evenly, as if you had more images, but it makes the AI confuse left and right, so it's your choice.
shuffle_tags should always stay active if you use anime tags, as it makes prompting more flexible and reduces bias.
activation_tags is important: set it to 1 if you added one during the dataset part of the guide. This is also called keep_tokens.

▶️ Steps: We need to pay attention here. There are 4 variables at play: your number of images, the number of repeats, the number of epochs, and the batch size. These result in your total steps.

You can choose to set the total epochs or the total steps; we will look at some examples in a moment. Too few steps will undercook the Lora and make it useless, and too many will overcook it and distort your images. This is why we choose to save the Lora every few epochs, so we can compare and decide later.
For this reason, I recommend few repeats and many epochs.There are many ways to train a Lora. The method I personally follow focuses on balancing the epochs, such that I can choose between 10 and 20 epochs depending on if I want a fast cook or a slow simmer (which is better for styles). Also, I have found that more images generally need more steps to stabilize. Thanks to the new min_snr_gamma option, Loras take less epochs to train. Here are some healthy values for you to try:10 images × 10 repeats × 20 epochs ÷ 2 batch size = 1000 steps20 images × 10 repeats × 10 epochs ÷ 2 batch size = 1000 steps100 images × 3 repeats × 10 epochs ÷ 2 batch size = 1500 steps400 images × 1 repeat × 10 epochs ÷ 2 batch size = 2000 steps1000 images × 1 repeat × 10 epochs ÷ 3 batch size = 3300 steps▶️ Learning: The most important settings. However, you don't need to change any of these your first time. In any case:The unet learning rate dictates how fast your Lora will absorb information. Like with steps, if it's too small the Lora won't do anything, and if it's too large the Lora will deepfry every image you generate. There's a flexible range of working values, specially since you can change the intensity of the lora in prompts. Assuming you set dim between 8 and 32 (see below), I recommend 5e-4 unet for almost all situations. If you want a slow simmer, 1e-4 or 2e-4 will be better. Note that these are in scientific notation: 1e-4 = 0.0001The text encoder learning rate is less important, specially for styles. It helps learn tags better, but it'll still learn them without it. It is generally accepted that it should be either half or a fifth of the unet, good values include 1e-4 or 5e-5. Use google as a calculator if you find these small values confusing.The scheduler guides the learning rate over time. This is not critical, but still helps. I always use cosine with 3 restarts, which I personally feel like it keeps the Lora "fresh". 
Feel free to experiment with cosine, constant, and constant with warmup. Can't go wrong with those. There's also the warmup ratio which should help the training start efficiently, and the default of 5% works well.▶️ Structure: Here is where you choose the type of Lora from the 2 I mentioned in the beginning. Also, the dim/alpha mean the size of your Lora. Larger does not usually mean better. I personally use 16/8 which works great for characters and is only 18 MB.▶️ Ready: Now you're ready to run this big cell which will train your Lora. It will take 5 minutes to boot up, after which it starts performing the training steps. In total it should be less than an hour, and it will put the results in your Google Drive.🏁 Third Half: TestingYou read that right. I lied! 😈 There are 3 parts to this guide.When you finish your Lora you still have to test it to know if it's good. Go to your Google Drive inside the /lora_training/outputs/ folder, and download everything inside your project name's folder. Each of these is a different Lora saved at different epochs of your training. Each of them has a number like 01, 02, 03, etc.Here's a simple workflow to find the optimal way to use your Lora:Put your final Lora in your prompt with a weight of 0.7 or 1, and include some of the most common tags you saw during the tagging part of the guide. You should see a clear effect, hopefully similar to what you tried to train. Adjust your prompt until you're either satisfied or can't seem to get it any better.Use the X/Y/Z plot to compare different epochs. This is a builtin feature in webui. Go to the bottom of the generation parameters and select the script. Put the Lora of the first epoch in your prompt (like "<lora:projectname-01:0.7>"), and on the script's X value write something like "-01, -02, -03", etc. Make sure the X value is in "Prompt S/R" mode. 
These will perform replacements in your prompt, causing it to go through the different numbers of your lora so you can compare their quality. You can first compare every 2nd or every 5th epoch if you want to save time. You should ideally do batches of images to compare more fairly.Once you've found your favorite epoch, try to find the best weight. Do an X/Y/Z plot again, this time with an X value like ":0.5, :0.6, :0.7, :0.8, :0.9, :1". It will replace a small part of your prompt to go over different lora weights. Again it's better to compare in batches. You're looking for a weight that results in the best detail but without distorting the image. If you want you can do steps 2 and 3 together as X/Y, it'll take longer but be more thorough.If you found results you liked, congratulations! Keep testing different situations, angles, clothes, etc, to see if your Lora can be creative and do things that weren't in the training data.source: civitai/holostrawberry
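The step arithmetic from the Steps section of the guide can be sketched in a few lines of Python. This is only an illustration of the formula images × repeats × epochs ÷ batch size; the function names are hypothetical and not part of the colab, and a real trainer may round per epoch, so its reported step count can differ slightly.

```python
def total_steps(images: int, repeats: int, epochs: int, batch_size: int) -> int:
    """Guide's arithmetic: images x repeats x epochs / batch size, rounded down."""
    if min(images, repeats, epochs, batch_size) < 1:
        raise ValueError("all values must be at least 1")
    return images * repeats * epochs // batch_size


def epochs_for_target(images: int, repeats: int, batch_size: int, target_steps: int) -> int:
    """Smallest number of epochs whose total reaches target_steps (hypothetical helper)."""
    per_epoch_numerator = images * repeats
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-target_steps * batch_size // per_epoch_numerator)


# The example values listed in the guide: (images, repeats, epochs, batch size).
examples = [
    (10, 10, 20, 2),
    (20, 10, 10, 2),
    (100, 3, 10, 2),
    (400, 1, 10, 2),
    (1000, 1, 10, 3),
]

for images, repeats, epochs, batch_size in examples:
    steps = total_steps(images, repeats, epochs, batch_size)
    print(f"{images} images x {repeats} repeats x {epochs} epochs / {batch_size} = {steps} steps")
```

The last example works out to 3333 steps, which the guide rounds to about 3300. The helper epochs_for_target goes the other way: given a dataset and a step budget, it returns how many epochs to set.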
Area Composition

Get more specific generations each time!

Have you ever heard of area composition?

Area composition is a technique that lets you set a custom location for every element you want to generate. To create this simple but effective workflow, all you need is:

Nodes

1. Load checkpoint: here you select your desired model.

2. Load LoRA: here you select your desired style with any LoRA (this one is optional).

3. Clip Set Last Layer: this node works as your Clip Skip (set it to -2 for better results).

4. Clip text encode: here is where your lovely prompt will be. You will need two of these, because one will work as your positive prompt and the other as your negative prompt.

5. Ksampler: this node is important because it is the brain of the main process. Here is where your prompt and image size get read and transformed into an image. You can use the sampler and scheduler you like the most (set the denoise strength to 1.0 for better results).

6. Empty latent image: as important as the Ksampler, the empty latent image node is where you decide the specific size of your initial image (it can be portrait or landscape).

7. Clip text encode: wait, again? Yes. Just like the last ones, but this node focuses on one specific element you want to generate. It is important to keep it simple and only describe the main element to represent. You can add one node for every element you want to generate; keep in mind that these nodes only work as positives. For this example I will only use 2 of these Clip text encode nodes.

8. MultiArea conditioning: this is the most important node of the process. For explanation purposes, I will call each of my positives a conditioning.
Conditioning 0 is my first positive (the one I made in step 4).
Conditioning 1 and 2 are my second and third positives (the ones I made in step 7).
It is very important to know that you have to set a size and position for each conditioning. In this example, I set conditioning 0 to 512x718 because it is the base prompt and I want the whole canvas to represent it. Conditioning 1, which is my main character, I set to 384x576 in the lower part of the center of the canvas. Conditioning 2, which is the background/setting, I set to 512x718 because I want the whole canvas to work as the background. (You may notice that while you set the position of each conditioning, a different color shows on the MultiArea conditioning node. Keep calm, these colors are just a visual representation of the position of each element.)
Also important: as you may have figured out, this node works as a super detailed composition instruction. The MultiArea conditioning node therefore acts as your positive, so be sure to connect it as the positive input of your Ksampler.

9. Upscale latent: up to this point we have only created the base image, which means it is time to upscale it. To do so, I used the Upscale latent node. It not only upscales the image to a desired size but also introduces more detail in the process.

10. Ksampler: yes, again. This second Ksampler works together with the Upscale latent node to refine details, so using the same configuration as your first one (step 5) is a good idea. Lowering the denoise strength on this second Ksampler helps avoid drastic changes; for this example I set it to 0.5.

11. VAE decode: the variational autoencoder (VAE) node is important because it decodes the sampled latent into the final image, your beautiful masterpiece.

12. Preview/Save image: lastly, what is left to add is the preview/save image node. (This one does not need an explanation, right?)

And there you go, you will now be able to generate more personalized images.

Intended image to create: cyborg girl inside abandoned building.

Do not forget to set this article as a favorite if you found it useful.

Happy generations!
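The area layout described for the MultiArea conditioning node can be sketched as plain rectangle arithmetic. The snippet below is not the ComfyUI API: the Area class and both helper functions are hypothetical, and placing the character exactly bottom-centered is an assumption, since the article only says "lower part of the center of the canvas" without giving offsets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Area:
    """A rectangular region of the canvas in pixels (hypothetical helper)."""
    label: str
    x: int
    y: int
    width: int
    height: int

    def fits(self, canvas_w: int, canvas_h: int) -> bool:
        """True if the rectangle lies entirely inside the canvas."""
        return (
            self.x >= 0
            and self.y >= 0
            and self.x + self.width <= canvas_w
            and self.y + self.height <= canvas_h
        )


def full_canvas(label: str, canvas_w: int, canvas_h: int) -> Area:
    """Region covering the whole canvas, as used for the base prompt and background."""
    return Area(label, 0, 0, canvas_w, canvas_h)


def bottom_center(label: str, canvas_w: int, canvas_h: int, width: int, height: int) -> Area:
    """Region centered horizontally and aligned to the bottom edge (assumed placement)."""
    return Area(label, (canvas_w - width) // 2, canvas_h - height, width, height)


# Canvas and region sizes taken from the article's example.
CANVAS_W, CANVAS_H = 512, 718

areas = [
    full_canvas("conditioning 0: base prompt", CANVAS_W, CANVAS_H),
    bottom_center("conditioning 1: main character", CANVAS_W, CANVAS_H, 384, 576),
    full_canvas("conditioning 2: background", CANVAS_W, CANVAS_H),
]

for area in areas:
    status = "ok" if area.fits(CANVAS_W, CANVAS_H) else "out of bounds"
    print(f"{area.label}: {area.width}x{area.height} at ({area.x}, {area.y}) [{status}]")
```

Under that assumption the character region starts at x = 64, y = 142, which leaves a 64-pixel margin on each side and 142 pixels of background above the character. These are the kind of values you would then enter by hand for conditioning 1 in the node.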

Posts