Tensor.Art
Creation

Get started with Stable Diffusion!
💥 SD3 & DiT

ComfyFlow

ComfyUI's amazing experience!
🎭 TAttoo Event

Host My Model

Share my models, get more attention!
💸 Double Earnings

Online Training

Make LoRA Training easier!
🤖 Make Fun

AI Tools

750974722736426081
131
11

Tribal Tattoo Style ( Black minimalist line⚫)

742811845332595136
1.2K
22

Soul Merger XL

Models

750930029065784799
CHECKPOINT HunyuanDiT

HunYuanDiT-V1.2-EMA

27K 619
750523639511693904
CHECKPOINT Kolors

Kolors-1.0.fp16

9.5K 156
739119827031156378
CHECKPOINT SD3

Stable Diffusion 3 SD3-medium

158K 1.3K
740591342956212867
LORA
EXCLUSIVE

Hands Repair |Lora-V5

57K 251
669853098447197493
LORA XL

Sticky Rice Style XL-v1.0

2.1K 66
698544588085932029
LORA

Nagi Kirishima from Bloody Roar-v1.0

2.2K 61
655758312861748818
LORA XL

FF Style: Esao Andrews (ESAO)-Esao Andrews - LoRA v.1.0

150K 190
656984448715437967
LORA XL
EXCLUSIVE

SDVN-PortraitArt-v1

39K 198
668099794012654096
LORA
EXCLUSIVE

SDVN-3DCharacter-v1.0

7K 224
699871312073616300
LORA

DonM - Geometric Toon Style [SD1.5,SDXL]-SD1.5

578 12
695840468086504801
LORA

Sylvain Sarrailh LoRA (Art Style)-1

1.9K 66
684621794465887631
LORA

Seek - The Doors - Roblox-v1.0

18K 119
658715883176360845
LOCON

Blackwork Style | LoCon-v1.0

20K 265
755723951558183445
LORA HunyuanDiT
EXCLUSIVE

Anime Style | Hunyuan DiT LoRA-v1

170 3
712491102810306167
LORA

Miniature house like thing smth smth that i found cool-v1.0

901 42
686543160970792338
LORA
EXCLUSIVE

Dragon New Year Insane-v1

2.1K 48
662491412832694887
LORA XL

Christmas Critters [SDXL LoRA]-SDXL

22K 90
755018632028941075
LORA SD3
EARLY ACCESS

[SD3]Scene: Monster Lab - FuturEvoLab-v1

27 4
644666190452690859
LORA

Rubberhose (Low weight:0.1-0.3)-✦

10K 290
657931214126189049
LORA

Fireflies ホタル-4960_fireflies(=v1.0)

88K 343
756502899707816641
LORA SD3

HH-Newcolony_SD3-V10

16 2
690553251680190457
CHECKPOINT XL
EXCLUSIVE

ToxicEchoXL-v1

298K 1.2K
656995224788200974
LOCON
EXCLUSIVE

Neg4All!Both Positive High quality、Details and Negative worse quality、bad hand in one LoRA!-v3.5

72K 167

Articles

SDG - HunyuanDiT loras released

HunyuanDiT - Perfect cute anime
https://tensor.art/models/755812883138538240?source_id=nz-ypFjjk0C7pPcibn708xQi
Enhance character appearance details, eyes, hair, colors, and drawings in anime style.

HunyuanDiT - Realistic details
https://tensor.art/models/755789054659947864/HunyuanDiT-Realistic-details-V1
Add more realistic details to images.

HunyuanDiT - Vivid color
https://tensor.art/models/755810413532312715?source_id=nz-ypFjjk0C7pPcibn708xQi
Enhance vivid colors and details in photos.

Hunyuan - Beauty Portrait
https://tensor.art/models/755789995257798458?source_id=nz-ypFjjk0C7pPcibn708xQi
Portrait with more detail in hair, skin...
1
Hunyuan model online training tutorial

Today, I will teach you how to use TensorArt to train a Hunyuan model online.

Step 1: Open "Online Training."

On the left side, you will see the dataset window, which is empty by default. You can upload some images to create a dataset or upload a dataset zip file. The zip file can include annotation files, following the same format as kohya-ss, where each image file corresponds to a text annotation file with the same name.

In the model theme section on the right, you can choose from options such as anime characters, real people, 2.5D, standard, and custom. Here, we select "Base" and choose the Hunyuan model as the base model.

For the base model parameter settings, we recommend setting the number of repetitions per image to 4 and the number of epochs to 16.

After uploading a processed dataset, if your dataset annotations include character names, you don't need to specify a trigger word. Otherwise, you should assign a simple trigger word to your model, such as a character name or style name.

Next, select an annotation file from the dataset to use as a preview prompt.

If you want to use Professional Mode, click the button in the top right corner to switch to it. In Professional Mode, it is recommended to double the learning rate and use the cosine_with_restarts learning rate scheduler. For the optimizer, you can choose AdamW8bit. Enable label shuffling and ensure that the first token remains unchanged (especially if you have a character name trigger word as the first token). Disable the noise offset feature; you can set the convolution DIM to 8 and Alpha to 1.

In the sample settings, add the negative prompts, and then you can start the training process. In the training queue, you can view the current loss value chart and the four sample images generated for each epoch.

Finally, you can choose the epoch with the best results to download to your local machine or publish directly on TensorArt. After a few minutes, your model will be deployed and ready.
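The dataset rule described above (each image in the zip has a text annotation file with the same name) can be checked before uploading. The following is a minimal sketch, not part of TensorArt: the function names, the demo file names, and the set of accepted image extensions are assumptions made for illustration.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Assumed list of image extensions; adjust to whatever your dataset uses.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def find_unpaired_images(zip_bytes: bytes) -> list[str]:
    """Return image paths in the zip that have no same-named .txt annotation."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        names = set(zf.namelist())
    missing = []
    for name in sorted(names):
        path = PurePosixPath(name)
        if path.suffix.lower() in IMAGE_EXTS:
            caption = str(path.with_suffix(".txt"))
            if caption not in names:
                missing.append(name)
    return missing


def build_demo_zip() -> bytes:
    """Build a tiny in-memory dataset: one paired image, one without a caption."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hero_01.png", b"fake image bytes")
        zf.writestr("hero_01.txt", "hero, red cape, smiling")
        zf.writestr("hero_02.png", b"fake image bytes")
    return buf.getvalue()


if __name__ == "__main__":
    print(find_unpaired_images(build_demo_zip()))
```

Images reported by this check would presumably be uploaded without annotations, so it may be worth captioning them first.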
Online Training SD3 Model Tutorial

Today, I will teach you how to use TensorArt to train an SD3 model online.

Step 1: Open "Online Training."

On the left side, you will see the dataset window, which is empty by default. You can upload some images to create a dataset or upload a dataset zip file. The zip file can include annotation files, following the same format as kohya-ss, where each image file corresponds to a text annotation file with the same name.

In the model theme section on the right, you can choose from options such as anime characters, real people, 2.5D, standard, and custom. Here, we select "Base" and choose the SD3 model as the base model.

For the base model parameter settings, we recommend setting the number of repetitions per image to 4 and the number of epochs to 16.

After uploading a processed dataset, if your dataset annotations include character names, you don't need to specify a trigger word. Otherwise, you should assign a simple trigger word to your model, such as a character name or style name.

Next, select an annotation file from the dataset to use as a preview prompt.

If you want to use Professional Mode, click the button in the top right corner to switch to it. In Professional Mode, it is recommended to double the learning rate and use the cosine_with_restarts learning rate scheduler. For the optimizer, you can choose AdamW8bit. Enable label shuffling and ensure that the first token remains unchanged (especially if you have a character name trigger word as the first token). Disable the noise offset feature; you can set the convolution DIM to 8 and Alpha to 1.

In the sample settings, add the negative prompts, and then you can start the training process. In the training queue, you can view the current loss value chart and the four sample images generated for each epoch.

Finally, you can choose the epoch with the best results to download to your local machine or publish directly on TensorArt. After a few minutes, your model will be deployed and ready.
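To see what the cosine_with_restarts scheduler recommended above does to the learning rate, here is a small sketch. It assumes the common "cosine with hard restarts" formula, without warmup; TensorArt's actual implementation is not documented in the tutorial, and the base learning rate used here is a placeholder, not a TensorArt default.

```python
import math


def cosine_with_restarts(step: int, total_steps: int, base_lr: float,
                         num_cycles: int = 1) -> float:
    """Learning rate at `step` for a cosine schedule with hard restarts.

    Within each cycle the rate decays from base_lr toward 0 along a cosine
    curve, then jumps back to base_lr when the next cycle starts.
    """
    progress = step / max(1, total_steps)
    if progress >= 1.0:
        return 0.0
    cycle_pos = (num_cycles * progress) % 1.0  # position inside the current cycle
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * cycle_pos))


if __name__ == "__main__":
    default_lr = 1e-4            # placeholder value, assumed for illustration
    doubled_lr = default_lr * 2  # "double the learning rate" in Professional Mode
    for step in (0, 25, 50, 75, 100):
        print(step, cosine_with_restarts(step, 100, doubled_lr, num_cycles=2))
```

With two cycles over 100 steps, the rate starts at the doubled value, falls to half of it by step 25, and restarts at the full value at step 50.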
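How to train online with Hunyuan DiT

First, click your avatar in the top right corner and choose "My trained models" from the drop-down menu to enter the training center. If you have trained models before, you will see a list of training tasks here. Then click the Online Training button to start a training run.

On the left is the dataset window, which is empty by default. You can upload some images as a dataset, or upload a dataset archive. The archive can include annotation files in the same format as kohya-ss: each image file has a .txt annotation file with the same name.

In the model theme section on the right, you can choose anime characters, real people, 2.5D, standard, or custom. To train a Hunyuan model, we choose standard here and select the Hunyuan 1.2 model as the base model. The Hunyuan model uses blocks with a depth of 40, so it is very large, trains relatively slowly, and needs a higher learning rate. The default learning rate is 4e-4, the default number of repetitions per image is 5, and the default optimizer is AdamW.

For the basic mode parameters, we recommend 5 repetitions per image and 16 epochs. After uploading a processed dataset, if your dataset annotations include character names, you can leave the trigger word empty. Otherwise, you should give your model a simple trigger word, such as a character name or a style name. Next, select an annotation file from the dataset to use as the preview prompt.

If you want to use Professional Mode, click the button in the top right corner to switch to it. In Professional Mode, we recommend doubling the learning rate, using the cosine_with_restarts learning rate scheduler, and choosing AdamW or AdamW8bit as the optimizer. Enable tag shuffling (shuffle) and keep the first token fixed (if you have a character-name trigger word in first position). Disable the noise offset feature; the convolution DIM and Alpha can be set to 8 and 1. In the sample image settings, add the negative prompts, and then you can start training.

In the training queue, you can see the current loss chart and the 4 sample images generated for each epoch. Finally, you can choose the epoch with the best results to download to your local machine or publish directly on tensorart.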
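The settings above (5 repetitions per image, 16 epochs, default learning rate 4e-4, doubled in Professional Mode) translate into a step count as follows. This is only a back-of-the-envelope sketch: the dataset size and batch size are made-up assumptions, and the formula is the usual images × repetitions × epochs convention, which the article does not state explicitly.

```python
import math


def total_training_steps(num_images: int, repeats: int, epochs: int,
                         batch_size: int = 1) -> int:
    """Estimate optimizer steps: each epoch sees every image `repeats` times."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


if __name__ == "__main__":
    num_images = 20  # hypothetical dataset size
    steps = total_training_steps(num_images, repeats=5, epochs=16)
    professional_lr = 4e-4 * 2  # default learning rate from the article, doubled
    print(f"{steps} steps at learning rate {professional_lr}")
```

Because the model is large and trains slowly, an estimate like this can help decide whether to trim the dataset before queuing a run.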
3
SD3 - composition repair

SD3 can generate interesting images, but it has a huge problem with the human body. However, I noticed that simply reducing the image size to 60% can, in most cases, eliminate issues with image composition as well as extra hands or legs. This workflow does not solve the problem of having six fingers, etc. :)

Base model: https://tensor.art/models/751330255836302856/Aderek-SD3-v1 or https://civitai.com/models/600179/aderek-sd3

Look at the image below. You might say: "Hey, nothing's wrong here." Well, that's because you're already seeing the generation based on the reduced size. Below, you have the original image.

Use composition on to use this trick&tips.

Have fun!

Support Paweł Tomczuk on Ko-fi: ko-fi.com/aderek514
Visit my DeviantArt page: Aderek - Hobbyist, Digital Artist | DeviantArt
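The 60% reduction described in this article is simple arithmetic, sketched below. Snapping the result to a multiple of 8 is an assumption added here (latent diffusion workflows commonly expect such dimensions); the article itself only says to reduce the size to 60%.

```python
def reduced_size(width: int, height: int, scale: float = 0.6,
                 multiple: int = 8) -> tuple[int, int]:
    """Scale both sides by `scale`, snapping each to the nearest `multiple`."""
    def snap(value: int) -> int:
        snapped = int(round(value * scale / multiple)) * multiple
        return max(multiple, snapped)  # never return a zero-sized side

    return snap(width), snap(height)


if __name__ == "__main__":
    print(reduced_size(1024, 1024))  # (616, 616)
    print(reduced_size(832, 1216))   # (496, 728)
```

The reduced size would be used for the first generation pass, and the result can then be enlarged back to the target size.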
6
2
🆘 ERROR | Exception (routeId: 7544351650166750320230)

Exception (routeId: 7544339967855538950230)

Suspect nodes: <string function>, <LayerStyle>, <LayerUtility>, <FaceDetailer>, many <TextBox>, <Bumpmap>

After some research (on my own) I've found:

The <FaceDetailer> node is completely broken.

The <TextBox> and <MultiLine:Textbox> nodes will cause this error if you enter more than about 250 characters. I'm not very sure about this number, but you won't be able to enter a decent amount of text anymore.

More than 40 nodes, regardless of their function, will cause this error.

How do I know this? Well, I made a functional comfyflow following those rules: https://tensor.art/template/754955251181895419

The next functional comfyflow suddenly stopped generating. It's almost the same flow as the previous one, but with <FaceDetailer> and large text strings to polish the prompt. It works again, yay! https://tensor.art/template/752678510492967987 Proof that it really worked (here).

I feel bad for you if this error suddenly disrupts your day; feel bad for me because I bought the yearly membership of this broken product and can't get a refund. I'll be happy to delete this bad review if you fix this error.

News (07/28/24) | <FaceDetailer> seems to work again.
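The two limits reported above (more than 40 nodes, text inputs longer than roughly 250 characters) can be checked against a workflow before running it. The sketch below assumes the workflow is available as an API-format ComfyUI export, that is, a dictionary keyed by node id where each node has "class_type" and "inputs". The thresholds are the author's observations, not documented limits, and the function name is made up for this example.

```python
MAX_NODES = 40         # limit observed by the article's author, not an official value
MAX_TEXT_CHARS = 250   # the author is unsure about this number


def find_risky_parts(workflow: dict) -> list[str]:
    """Return human-readable warnings for parts of a workflow that may trigger the error."""
    warnings = []
    if len(workflow) > MAX_NODES:
        warnings.append(f"{len(workflow)} nodes (more than {MAX_NODES})")
    for node_id, node in workflow.items():
        for key, value in node.get("inputs", {}).items():
            # Links to other nodes are lists; only literal strings count as text.
            if isinstance(value, str) and len(value) > MAX_TEXT_CHARS:
                warnings.append(
                    f"node {node_id} input '{key}' has {len(value)} characters"
                )
    return warnings


if __name__ == "__main__":
    demo = {
        "1": {"class_type": "TextBox", "inputs": {"text": "a" * 300}},
        "2": {"class_type": "SaveImage", "inputs": {"images": ["1", 0]}},
    }
    print(find_risky_parts(demo))
```

An empty result does not guarantee the workflow will run; it only rules out the two causes listed in this article.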
4
Upscaling in ComfyUI: Algorithm or Latent?

Hello again! In this short article I want to explain the upscaling methods in ComfyUI that I know and have researched. I hope they help you and that you can use them when creating your workflows and AI tools. Remember that if you have any useful knowledge, you can share it in the comments section to enrich the topic. Also, please excuse any spelling mistakes; I am still learning English, hehe.

Let’s get to the point!

To the best of my knowledge, there are two widely used ways to achieve upscaling in ComfyUI (you decide which one to use according to your needs): the Algorithm Method and the Latent Method.

Algorithm Method:

This is one of the most commonly used methods and is readily available. It consists of loading an upscaling model and connecting it to the workflow, so that the image pixels are manipulated as the user wishes. It is very similar to the upscale method used in the normal image creation mode in Tensor Art.

The following nodes are needed:
A. Load Upscale Model.
B. Upscale Image (Using Model).

These nodes are connected to the workflow between the “VAE Decode” and “Save Image” nodes, as shown in the image. Once this structure is created, you can choose from all the different models offered by the “Load Upscale Model” node, ranging from “2x-ESRGAN.pth” to “SwinIR_4x”. You can use any of the 23 available models and experiment with them. Just click on the node and the list will be displayed.

This can also be achieved with another node, “Upscale Image By”. The structure is simpler to create because only that node is connected between “VAE Decode” and “Save Image”, as shown in the following image. Once the node is connected, you are free to select the mode in which you want to upscale the image (Upscale_method), and you can also set the factor by which the image is rescaled (Scale By).

Strengths and Weaknesses of the Algorithm Method:

Among the strengths of this method are its ease of integration into the workflow and the choice between several upscaling models. It also allows fast generation, both in ComfyUI and when using AI tools.

However, it is not very effective in some specific contexts. For example, the algorithm can upscale the image pixels but does not add new detail to the image, so the generated image can end up blurred in some cases.

Latent Method:

This is the alternative to the algorithm method. It focuses on highlighting image details and maximizing quality, and it is also one of the most used methods in the Workflow mode of different AI visual content creation platforms. Here, upscaling is performed while the image is generated from latent space. (Latent space is where the AI takes data from the prompt, deconstructs it for analysis, and then reconstructs it to represent it in an image.)

The “Latent Upscale” node is placed between two KSamplers. The first KSampler is connected to the “Empty Latent Image” node, while the second one is connected to “VAE Decode” to ensure the correct processing and representation of the generated image.

Note that the “Empty Latent Image” node and the “VAE Decode” node are already included by default in the Text2Image templates in Workflow mode. (For more information about Text2Image, you can see my other article called “ComfyUi: Text2Image Basic Glossary”.)

For this method to work properly, you have to keep a correct balance between the original size of the image and its upscaled size. For example, you can generate a 512x512 image and upscale it to 1024x1024, but it is not recommended to make a 512x512 image (square image) and upscale it to 768x1152 (rectangular image), since the shape of the image would not be compatible with its upscaled version. For this reason, you have to pay attention to the values of “Empty Latent Image” and “Latent Upscale”, so that they are always proportional.

In the “Empty Latent Image” node you must enter the original image dimensions (for example: 768x1152), while in the “Latent Upscale” node you must enter the resized image dimensions (for example: 1152x1728). In this way you have the freedom to set the image size at your own discretion. I always recommend looking at the size and upscale values of the normal mode in which we create illustrations; that way we will always know which values to set and which will be compatible, as you can see in the image. Look at those values, and then enter them in the nodes listed above.

Once everything is connected and configured, you can have images of any size you want. You can experiment to your taste.

Strengths and Weaknesses of the Latent Method:

As a strength, this option gives you excellent quality images if everything is correctly configured. It also allows you to create images of a custom size and upscale them with the values you want. It brings out the details in both SD and XL images.

On the negative side, you have to configure everything manually every time you want to change the size or shape of the images. Also, this method is a little slower in the generation process than the algorithm method.

Which is better: Algorithm or Latent?

Neither method is better than the other. Both are useful in different contexts. Remember that workflows will differ from user to user, because we all have different ways of creating and designing things.

It all depends on your taste and whether you want something simpler or more elaborate. I hope the explanation in this article helps you build more complex workflows and makes it easier to create the images you want.

Extra Tip:

If you cannot find any of the nodes mentioned in this article, you can double click on any empty place in the workflow and search for the name of the node you are looking for. Just remember to type the name without spaces.
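The proportionality rule from the Latent Method section (the “Empty Latent Image” size and the “Latent Upscale” size must share the same shape) can be checked with a few lines of arithmetic. This is a sketch with made-up helper names; it only compares aspect ratios and does not talk to ComfyUI.

```python
from fractions import Fraction


def same_aspect_ratio(base: tuple[int, int], upscaled: tuple[int, int]) -> bool:
    """True when both (width, height) pairs have exactly the same proportions."""
    return Fraction(base[0], base[1]) == Fraction(upscaled[0], upscaled[1])


def upscaled_size(base: tuple[int, int], factor: float) -> tuple[int, int]:
    """Multiply both sides by the same factor so the shape is preserved."""
    width, height = base
    exact = Fraction(factor).limit_denominator(100)  # e.g. 1.5 -> 3/2
    return int(width * exact), int(height * exact)


if __name__ == "__main__":
    print(upscaled_size((768, 1152), 1.5))                 # (1152, 1728)
    print(same_aspect_ratio((768, 1152), (1152, 1728)))    # True
    print(same_aspect_ratio((512, 512), (768, 1152)))      # False
```

The first pair matches the article's 768x1152 to 1152x1728 example; the last pair is the square-to-rectangle case the article advises against.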
3
Controlnet with SD3

Today, I noticed that I can add ControlNet to the SD3 model.

The Tiled function works very well, so I incorporated it into my workflow and created a group for generating artistic images based on a given photo or a previously generated image. In the main part of the workflow, I simply set a very short prompt, like "grass, flowers," and I get an image that blends grass and flowers in an arrangement resembling the base photo.

https://youtu.be/sv35wKNiFGs
Controlnet with SD3 | ComfyUI Workflow | Tensor.Art
2
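How to train online with SD3

First, click your avatar in the top right corner and choose "My trained models" from the drop-down menu to enter the training center. If you have trained models before, you will see a list of training tasks here. Then click the Online Training button to start a training run.

On the left is the dataset window, which is empty by default. You can upload some images as a dataset, or upload a dataset archive. The archive can include annotation files in the same format as kohya-ss: each image file has a .txt annotation file with the same name.

In the model theme section on the right, you can choose anime characters, real people, 2.5D, standard, or custom. Here we choose custom and select the SD3 model as the base model. Note that in the version drop-down you should select the T5XXL version; only then can the T5 text encoder be trained.

For the basic mode parameters, we recommend 4 repetitions per image and 16 epochs. After uploading a processed dataset, if your dataset annotations include character names, you can leave the trigger word empty. Otherwise, you should give your model a simple trigger word, such as a character name or a style name. Next, select an annotation file from the dataset to use as the preview prompt.

If you want to use Professional Mode, click the button in the top right corner to switch to it. In Professional Mode, we recommend doubling the learning rate and using the cosine_with_restarts learning rate scheduler; for the optimizer, you can choose AdamW8bit. Enable tag shuffling (shuffle) and keep the first token fixed (if you have a character-name trigger word in first position). Disable the noise offset feature; the convolution DIM and Alpha can be set to 8 and 1. In the sample image settings, add the negative prompts, and then you can start training.

In the training queue, you can see the current loss chart and the 4 sample images generated for each epoch. Finally, you can choose the epoch with the best results to download to your local machine or publish directly on tensorart.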
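The tag shuffling option described above, where the first token stays in place so a character-name trigger word is always seen first, can be illustrated with a short sketch. It assumes comma-separated tag captions and uses a hypothetical helper; the parameter name keep_tokens follows kohya-ss naming, but how TensorArt implements the option internally is not stated in the article.

```python
import random


def shuffle_caption(caption: str, keep_tokens: int = 1, rng=None) -> str:
    """Shuffle comma-separated tags, leaving the first `keep_tokens` tags in place."""
    rng = rng or random.Random()
    tags = [tag.strip() for tag in caption.split(",") if tag.strip()]
    fixed, rest = tags[:keep_tokens], tags[keep_tokens:]
    rng.shuffle(rest)  # only the tags after the fixed prefix change order
    return ", ".join(fixed + rest)


if __name__ == "__main__":
    caption = "nagi, long hair, red eyes, sword, night city"
    for seed in range(3):
        print(shuffle_caption(caption, keep_tokens=1, rng=random.Random(seed)))
```

Every printed line starts with the trigger word "nagi", while the remaining tags appear in a varying order, which is the behaviour the setting is meant to provide during training.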
3
1
SD3 - training on your own PC

So first, you need to update your version of OneTrainer.

Second, you need to download ALL files and folders (and rename them):
stabilityai/stable-diffusion-3-medium-diffusers at main (huggingface.co)

Then you put it:

With float16 output, the LoRA is only 36MB:

This is my setting for a style training:

My checkpoint for testing, which you can download for free:
Aderek SD3 - v1 | Stable Diffusion Model - Checkpoint | Tensor.Art

And my LoRAs: Aderek514's Profile | Tensor.Art

So, good luck!
8
ReActor Node for ComfyUI (Face Swap)

ReActor Node for ComfyUI (Download)

The Fast and Simple Face Swap Extension Node for ComfyUI, based on the ReActor SD-WebUI Face Swap Extension.

This Node goes without NSFW filter (uncensored, use it on your own responsibility).

| Installation | Usage | Troubleshooting | Updating | Disclaimer | Credits | Note!

What's new in the latest update

0.5.1 ALPHA1

Support of GPEN 1024/2048 restoration models (available in the HF dataset https://huggingface.co/datasets/Gourieff/ReActor/tree/main/models/facerestore_models)

ReActorFaceBoost Node - an attempt to improve the quality of swapped faces. The idea is to restore and scale the swapped face (according to the face_size parameter of the restoration model) BEFORE pasting it to the target image (via inswapper algorithms); more information is here (PR#321).

Installation

SD WebUI: AUTOMATIC1111 or SD.Next
Standalone (Portable) ComfyUI for Windows

Usage

You can find ReActor Nodes inside the menu ReActor or by using a search (just type "ReActor" in the search field).

List of Nodes:

••• Main Nodes •••
ReActorFaceSwap (Main Node Download)
ReActorFaceSwapOpt (Main Node with the additional Options input)
ReActorOptions (Options for ReActorFaceSwapOpt)
ReActorFaceBoost (Face Booster Node)
ReActorMaskHelper (Masking Helper)

••• Operations with Face Models •••
ReActorSaveFaceModel (Save Face Model)
ReActorLoadFaceModel (Load Face Model)
ReActorBuildFaceModel (Build Blended Face Model)
ReActorMakeFaceModelBatch (Make Face Model Batch)

••• Additional Nodes •••
ReActorRestoreFace (Face Restoration)
ReActorImageDublicator (Duplicate one Image to Images List)
ImageRGBA2RGB (Convert RGBA to RGB)

Connect all required slots and run the query.

Main Node Inputs

input_image - is an image to be processed (target image, analog of "target image" in the SD WebUI extension);
Supported Nodes: "Load Image", "Load Video" or any other nodes providing images as an output;

source_image - is an image with a face or faces to swap in the input_image (source image, analog of "source image" in the SD WebUI extension);
Supported Nodes: "Load Image" or any other nodes providing images as an output;

face_model - is the input for the "Load Face Model" Node or another ReActor node to provide a face model file (face embedding) you created earlier via the "Save Face Model" Node;
Supported Nodes: "Load Face Model", "Build Blended Face Model";

Main Node Outputs

IMAGE - is an output with the resulting image;
Supported Nodes: any nodes which have images as an input;

FACE_MODEL - is an output providing a source face's model being built during the swapping process;
Supported Nodes: "Save Face Model", "ReActor", "Make Face Model Batch";

Face Restoration

Since version 0.3.0 ReActor Node has a built-in face restoration.
Just download the models you want (see Installation instruction) and select one of them to restore the resulting face(s) during the faceswap. It will enhance face details and make your result more accurate.

Face Indexes

By default ReActor detects faces in images from "large" to "small".
You can change this option by adding the ReActorFaceSwapOpt node with ReActorOptions.
And if you need to specify faces, you can set indexes for source and input images.
Index of the first detected face is 0.
You can set indexes in the order you need.
E.g.: 0,1,2 (for Source); 1,0,2 (for Input).
This means: the second Input face (index = 1) will be swapped by the first Source face (index = 0) and so on.

Genders

You can specify the gender to detect in images.
ReActor will swap a face only if it meets the given condition.

Face Models

Since version 0.4.0 you can save face models as "safetensors" files (stored in ComfyUI\models\reactor\faces) and load them into ReActor implementing different scenarios and keeping super lightweight face models of the faces you use.
To make new models appear in the list of the "Load Face Model" Node - just refresh the page of your ComfyUI web application.
(I recommend you to use ComfyUI Manager - otherwise your workflow can be lost after you refresh the page if you didn't save it before that).

Troubleshooting

I. (For Windows users) If you still cannot build Insightface for some reasons or just don't want to install Visual Studio or VS C++ Build Tools - do the following:

(ComfyUI Portable) From the root folder check the version of Python: run CMD and type
python_embeded\python.exe -V

Download prebuilt Insightface package for Python 3.10 or for Python 3.11 (if in the previous step you see 3.11) or for Python 3.12 (if in the previous step you see 3.12) and put it into the stable-diffusion-webui (A1111 or SD.Next) root folder (where you have the "webui-user.bat" file) or into the ComfyUI root folder if you use ComfyUI Portable.

From the root folder run:
(SD WebUI) CMD and .\venv\Scripts\activate
(ComfyUI Portable) run CMD

Then update your PIP:
(SD WebUI) python -m pip install -U pip
(ComfyUI Portable) python_embeded\python.exe -m pip install -U pip

Then install Insightface:
(SD WebUI) pip install insightface-0.7.3-cp310-cp310-win_amd64.whl (for 3.10) or pip install insightface-0.7.3-cp311-cp311-win_amd64.whl (for 3.11) or pip install insightface-0.7.3-cp312-cp312-win_amd64.whl (for 3.12)
(ComfyUI Portable) python_embeded\python.exe -m pip install insightface-0.7.3-cp310-cp310-win_amd64.whl (for 3.10) or python_embeded\python.exe -m pip install insightface-0.7.3-cp311-cp311-win_amd64.whl (for 3.11) or python_embeded\python.exe -m pip install insightface-0.7.3-cp312-cp312-win_amd64.whl (for 3.12)

Enjoy!

II. "AttributeError: 'NoneType' object has no attribute 'get'"

This error may occur if there's something wrong with the model file inswapper_128.onnx.
Try to download it manually from here and put it to the ComfyUI\models\insightface folder, replacing the existing one.

III. "reactor.execute() got an unexpected keyword argument 'reference_image'"

This means that input points have been changed with the latest update.
Remove the current ReActor Node from your workflow and add it again.

IV.
ControlNet Aux Node IMPORT failed error when using with ReActor NodeClose ComfyUI if it runsGo to the ComfyUI root folder, open CMD there and run:python_embeded\python.exe -m pip uninstall -y opencv-python opencv-contrib-python opencv-python-headlesspython_embeded\python.exe -m pip install opencv-python==4.7.0.72That's it!reactor+controlnetV. "ModuleNotFoundError: No module named 'basicsr'" or "subprocess-exited-with-error" during future-0.18.3 installationDownload https://github.com/Gourieff/Assets/raw/main/comfyui-reactor-node/future-0.18.3-py3-none-any.whlPut it to ComfyUI root And run:python_embeded\python.exe -m pip install future-0.18.3-py3-none-any.whlThen:python_embeded\python.exe -m pip install basicsrVI. "fatal: fetch-pack: invalid index-pack output" when you try to git clone the repository"Try to clone with --depth=1 (last commit only):git clone --depth=1 https://github.com/Gourieff/comfyui-reactor-nodeThen retrieve the rest (if you need):git fetch --unshallow
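The pairing rule from the Face Indexes section above can be made concrete with a small sketch. This is not ReActor's own code; `pair_face_indexes` is a hypothetical helper that only illustrates how the two comma-separated lists line up position by position.

```python
def pair_face_indexes(source_order: str, input_order: str) -> list[tuple[int, int]]:
    """Return (source_face, input_face) pairs, matched position by position.

    Illustrative only: the function name and the handling of lists with
    different lengths (extra entries are ignored) are assumptions.
    """
    source = [int(i) for i in source_order.split(",")]
    target = [int(i) for i in input_order.split(",")]
    return list(zip(source, target))


# The example from the text: 0,1,2 for Source and 1,0,2 for Input.
for src, dst in pair_face_indexes("0,1,2", "1,0,2"):
    print(f"source face {src} replaces input face {dst}")
```

Read this way, the first pair says that Source face 0 goes onto Input face 1, which matches the explanation given in the text.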
ComfyUi: Text2Image Basic Glossary


Hello! This is my first article, and I hope it will be useful to whoever reads it. My knowledge of WorkFlow is still limited, but I have been researching and learning little by little. If anyone would like to contribute some content, you are totally free to do so. Thank you.

I wrote this article to give a brief explanation of the basic concepts behind ComfyUI, or WorkFlow. This is a technology with many possibilities, and it would be great to make it easier for everyone to use!

What is Workflow?

Workflow is one of the two main image generation systems that Tensor Art currently offers. It is a generation method that gives users a lot of room for creativity, and it also gives Free users access to some Pro features.

How do I access the WorkFlow mode?

To access the WorkFlow mode, place the mouse cursor on the "Create" tab as if you were going to create an image by conventional means. Then click on the "ComfyFlow" option and you are done.

After that, you will see a tab with two options: "New WorkFlow" and "Import WorkFlow". The first one lets you start a workflow from a template or from scratch, while the second one lets you load a workflow that you have saved on your PC as a JSON file.

If you click on "New WorkFlow", a tab with a list of templates will be displayed (each template has a different purpose). The main one is "Text2Image", which creates images from text, much like the conventional method we always use. You can also create a workflow from scratch with the "Empty WorkFlow Template" option, but to explain the basics we will use "Text2Image".

Once you click on "Text2Image", wait a few seconds and a new tab will open with the template, which contains the basics needed to create an image from text.
Nodes and Borders: What are they and how do they work?

To understand the basics of how a WorkFlow works, you need a clear idea of what Nodes and Borders are.

Nodes are the small boxes in the workflow. Each node has a specific function needed to create, enhance or edit the image or video. The basic Text2Image nodes are the Checkpoint loader, the CLIP Text Encoders, the Empty Latent Image, the KSampler, the VAE decoder, and Save Image. There are hundreds of other nodes besides these, and they all have different functions.

The "Borders", on the other hand, are the small colored wires (ComfyUI calls them links) that connect the nodes. They determine which nodes are directly related. The Borders are color-coded, and each color is generally tied to a specific kind of data:
- Purple is related to the Model or LoRA used.
- Yellow connects the model or LoRA to the boxes where the prompt is written.
- Red refers to the VAE.
- Orange connects the prompt boxes to the "KSampler" node.
- Fuchsia refers to the latent, which serves many purposes; in this case it connects the "Empty Latent Image" node to the "KSampler" node and sets the number and size of the images that will be generated.
- Blue is related to everything that has to do with images; it has many uses, but in this case it leads to the "Save Image" node.

What are the Text2Image template Nodes used for?

Getting this clear matters a lot, because it tells you what each node of this basic template is for. It's like knowing what each piece in a Lego set is for and understanding how the pieces should be connected to create a beautiful masterpiece!
Also, once you know what these nodes are for, it will be easier to guess what their variants and other derived nodes do.

A) The first one is the node called "Load Checkpoint". This node has three specific functions. The first is to load the base model or checkpoint with which the image will be created. The second is to provide the CLIP, the text encoder that turns the positive and negative prompts you write into something the checkpoint can use. The third is to provide the VAE model.

B) The second one is the "Empty Latent Image", the node that creates the blank latent the image will be generated from. It has two functions: first, it sets the width and height of the image; second, it sets how many images will be generated at once through the "Batch Size" option.

C) The third is the pair of "CLIP Text Encode" nodes. There will always be at least two of these nodes, since they hold the positive and the negative prompts you write to describe the image you want. They are usually connected to "Load Checkpoint" or to a LoRA on one side and to the "KSampler" node on the other.

D) Then there is the "KSampler" node. This node is the central point of the whole WorkFlow; it sets the most important parameters of the image creation. It has several functions. The first is to set the seed of the image and, through the "control_after_generate" option, how the seed changes from one generated image to the next. The second is to set how many steps are used to create the image (you set them as you wish). The third is to choose the sampling method and its scheduler (the scheduler controls how the noise is reduced from step to step while the image is created).

E) The penultimate one is the VAE decoder. This node converts the latent result produced by the KSampler into an actual image. In other words, it is one of the final steps of the generation process: it turns the internal representation of what we described into pixels. Then the image is passed to the "Save Image" node to be displayed as the final product.

F) The last node to explain is "Save Image". This node has the simple function of saving the generated image and giving the user a view of the final work, which is later stored in the task bar where all the generated images are located.

Final Consideration:

This has been a short summary of some very basic concepts of the ComfyUI mode; you could even call it a small glossary of general terms. I have tried to give an overview that makes this image generation tool easier to understand. There is still a lot to explain, and I will try to cover all the topics, but the information would not fit in a single article (ComfyUI is a whole universe of possibilities). Thank you so much for taking the time to read this article!
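To make the node-and-wire picture above more tangible, here is a minimal sketch of the same Text2Image graph written as a Python dictionary. It is modeled on the API-style JSON that ComfyUI uses (node class names such as `CheckpointLoaderSimple` and `KSampler` are ComfyUI built-ins), but the checkpoint file name, prompts and parameter values are placeholders, and the JSON that Tensor Art exports may be laid out differently. Each `["node_id", slot]` pair plays the role of one wire.

```python
# Placeholder values throughout; only the overall shape is the point here.
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "example_model.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",  # positive prompt
          "inputs": {"text": "a red cat", "clip": ["1", 1]}},
    "3": {"class_type": "CLIPTextEncode",  # negative prompt
          "inputs": {"text": "blurry, low quality", "clip": ["1", 1]}},
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 512, "height": 512, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0],
                     "latent_image": ["4", 0], "seed": 42, "steps": 20, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0}},
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",
          "inputs": {"images": ["6", 0], "filename_prefix": "text2image"}},
}


def list_links(graph):
    """Return every wire as (source node, output slot, target node, input name)."""
    links = []
    for node in graph.values():
        for input_name, value in node["inputs"].items():
            # A two-element list is a reference to another node's output.
            if isinstance(value, list) and len(value) == 2:
                src_id, src_slot = value
                links.append((graph[src_id]["class_type"], src_slot,
                              node["class_type"], input_name))
    return links


for link in list_links(workflow):
    print(link)
```

Listing the links shows the three outputs of the checkpoint loader (model, CLIP, VAE) fanning out to the sampler, the two text encoders and the decoder, which is the same structure the colored wires draw on screen.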
Textual Inversion Embeddings  ComfyUI_Examples


ComfyUI_examples

Textual Inversion Embeddings Examples

Here is an example of how to use Textual Inversion/Embeddings.

To use an embedding, put the file in the models/embeddings folder, then use it in your prompt like I used the SDA768.pt embedding in the previous picture.

Note that you can omit the filename extension, so these two are equivalent:

    embedding:SDA768.pt
    embedding:SDA768

You can also set the strength of the embedding just like regular words in the prompt:

    (embedding:SDA768:1.2)

Embeddings are basically custom words, so where you put them in the text prompt matters.

For example, if you had an embedding of a cat:

    red embedding:cat

This would likely give you a red cat.
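As a rough illustration of the syntax described above, the sketch below pulls `embedding:` references out of a prompt string and treats names with and without the file extension as the same thing. It is not ComfyUI's actual parser: the regular expression, the default strength of 1.0, and the extensions other than `.pt` are assumptions made for the example.

```python
import re

# name, then an optional ":weight" as in "(embedding:SDA768:1.2)"
_EMBEDDING = re.compile(r"embedding:([^\s,():]+)(?::([0-9]+(?:\.[0-9]+)?))?")

# ".pt" comes from the text; the other extensions are assumed for illustration.
_EXTENSIONS = (".pt", ".safetensors", ".bin")


def find_embeddings(prompt: str) -> list[tuple[str, float]]:
    """Return (embedding name without extension, strength) for each reference."""
    found = []
    for name, weight in _EMBEDDING.findall(prompt):
        for ext in _EXTENSIONS:
            if name.endswith(ext):
                name = name[: -len(ext)]
                break
        found.append((name, float(weight) if weight else 1.0))
    return found


print(find_embeddings("embedding:SDA768.pt"))
print(find_embeddings("(embedding:SDA768:1.2)"))
print(find_embeddings("red embedding:cat"))
```

The first two calls resolve to the same embedding name, which mirrors the statement that the extension is optional.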
Art Mediums (127 Style)


Art Mediums

Various art mediums. Prompted with '{medium} art of a woman'.

Metalpoint, Miniature Painting, Mixed Media, Monotype Printing, Mosaic Tile Art, Mosaic, Neon, Oil Paint, Origami, Papermaking, Papier-mâché, Pastel, Pen And Ink, Performance Art, Photography, Photomontage, Plaster, Plastic Arts, Polymer Clay, Printmaking, Puppetry, Pyrography, Quilling, Quilt Art, Recycled Art, Relief Printing, Resin, Reverse Glass Painting, Sand, Scratchboard Art, Screen Printing, Scrimshaw, Sculpture Welding, Sequin Art, Silk Painting, Silverpoint, Sound Art, Spray Paint, Stained Glass, Stencil, Stone, Tapestry, Tattoo Art, Tempera, Terra-cotta, Textile Art, Video Art, Virtual Reality Art, Watercolor, Wax, Weaving, Wire Sculpture, Wood, Woodcut

Glass, Glitch Art, Gold Leaf, Gouache, Graffiti, Graphite Pencil, Ice, Ink Wash Painting, Installation Art, Intaglio Printing, Interactive Media, Kinetic Art, Knitting, Land Art, Leather, Lenticular Printing, Light Projection, Lithography, Macrame, Marble, Metal

Colored Pencil, Computer-generated Imagery (cgi), Conceptual Art, Copper Etching, Crochet, Decoupage, Digital Mosaic, Digital Painting, Digital Sculpture, Diorama, Embroidery, Enamel, Encaustic Painting, Environmental Art, Etching, Fabric, Felting, Fiber, Foam Carving, Found Objects, Fresco

Augmented Reality Art, Batik, Beadwork, Body Painting, Bookbinding, Bronze, Calligraphy, Cast Paper, Ceramics, Chalk, Charcoal, Clay, Collage, Collagraphy

3d Printing, Acrylic Paint, Airbrush, Algorithmic Art, Animation, Art Glass, Assemblage
Anime Vision | Detail Enhancer SD3


SD3 Anime LoRA is Finally Here!

I am thrilled to announce that the SD3 Anime LoRA model is finally available. In addition, I am releasing a new update that includes an SD3 anime checkpoint model.

Currently, I am publishing a beta version as I continue to work diligently to perfect the model. I aim to have the final release ready by the end of this month or early August.

Stay tuned, as the SD3 Anime beta version will be available within the next couple of days!

Here are some guidelines for using this LoRA to its full potential:
- If you are trying to create a specific subject or object, use a trigger word like 'anime style' in your prompt.
- If you're targeting a character, you can skip the keyword and go with something like this:
  - For a male character: 'anime boy'
  - For a female character: 'anime girl'

Simple, right? You can also use the trigger word 'anime style' most of the time. I've noticed it gives better results.

Recommended Parameters:
- LoRA Weight: 0 to 1
- VAE: not needed
- Sampler: DPM++ 2M SGM Uniform
- Steps: 20 to 30
- CFG: 3 to 4
- Upscaler: R-ESRGAN 4x+

If you encounter any issues, I recommend using ComfyUI for a better experience. Here's the workflow: ComfyUI Workflow. Open the link, select the LoRA model, choose the LoRA strength, and hit the run button.

Join my community, share your feedback, learn, and have fun with us! 😊
Discord: https://discord.gg/QQKd7bu97P
How to set up Radio Button in your AI Tools


Hello everyone! ✨ Today I'm bringing you a super practical tutorial: how to set up a convenient radio-button prompt selector for your AI Tools! 😎 Save it now, and you will never have to worry about how to set prompts again! 👌

Are you ready? Let's get started! 🔍

First, open the official TensorArt website. 📂 Once it loads, you will see a rich variety of AI tools and resources. 👀

Next, open ComfyFlow and start making our AI Tool! 🤖 The process is simple and fun, so let's explore it together! ✨

In ComfyFlow, click the "New" button, which takes you to a new interface. 🖱️ Here we can start creating our own workflow. 🌟

Next, we need to fill in the positive prompt, which is a super critical step! 📝 In the positive prompt box, enter the content you want. As a simple example, I wrote "a man". 🤵 This is just for teaching purposes; feel free to write whatever suits your needs. 🌈

When you have completed the workflow, click the "Publish" button in the upper right corner! 🚀 Don't forget to give your AI Tool an interesting name, since a good name makes your tool more attractive. 💡 Also remember to put it in the right category, so that it is easy to read and easy for your friends to find and use! 📂🔍

Now let's move on to the next step! 💪 Scroll down the current page and find the user-configurable settings area. Then click the "Add" button. This step is very important: remember to add your positive prompt node! 🔍

After adding the node, click the "Set" button on the right to continue. 🔧 Don't miss this one! 😉

The next step is also very important! 😊 First select the radio button option, then click "Add". 🔘 Here you can add the choices you want to offer to the user. 👍 After selecting them, be sure to click "Confirm"! ✔️

Friends, we have finally reached the last step! 🎉 When you have completed all the operations, remember to click the "Publish" button to publish your AI Tool! 🚀 Can't wait to see the results? Generate a picture yourself and try it out! 🌟🖼️

Well, that's all for today's tutorial! 😊 I hope everyone can complete it successfully and create their own AI Tools! 👏 If you have any questions, don't hesitate to leave a comment at any time! ❤️
Guide to Using SDXL / SDXLモデルの利用手引


Guide to Using SDXL

I occasionally see posts about difficulties in generating images successfully, so here is an introduction to the basic setup.

1. Introduction

SDXL is a model that can generate images with higher accuracy than SD1.5. It produces high-quality representations of human bodies and structures, with fewer distortions and more realistic fine details, textures, and shadows.

With SD1.5, generation parameters were generally applicable across different models, so there was no need for specific adjustments. SDXL can still use some SD1.5 techniques without issues, but the recommended generation parameters vary significantly depending on the model.

Additionally, LoRA and Embeddings made for SD1.5 (such as EasyNegative) are completely incompatible with SDXL, so you need to review how you build your prompts. Notably, embeddings commonly used in SD1.5 negative prompts are recognized merely as strings by an XL model, so you must replace them with corresponding XL embeddings or add appropriate tags.

This guide explains the recommended parameter settings for using SDXL.

2. Basic Parameters

VAE
Selecting "sdxl-vae-fp16-fix.safetensors" will suffice. Many models have a VAE built in, so specifying one might not be necessary.

Image Size
Using the resolution presets provided by TensorArt should be sufficient. Resolutions that are too small or excessively large may not yield appropriate results, so please avoid the sizes that were frequently used with SD1.5 wherever possible. Even if you want an image that is taller or wider than the presets, stay within a range that does not significantly alter the total pixel count (for example, increase the height and decrease the width).

Sampling Method
Choose the sampler recommended for the model first. After that, select according to your preference. Typically, Euler a or DPM++ 2M SDE Karras should work well.

Sampling Steps
Some XL models can generate images effectively with few steps thanks to optimizations like LCM or Turbo. Be sure to check the recommended values for the selected model.

CFG Scale
This varies by model, so check the recommended values. Typically, the range is around 2 to 8.

Hires.fix
For free users, specifying 1.5x might hit the upper limit, so use custom settings with the following resolutions:
- 768x1152 -> 1024x1536
- 1152x768 -> 1536x1024
- 1024x1024 -> 1248x1248
Choose the upscaler according to your preference. Set the denoising strength to around 0.3 to 0.4.

3. Prompt

SDXL handles natural language better. You can input elements separated by commas or simply write a complete sentence in English, and it will generate images as intended. Using a tool like ChatGPT to create prompts can also be beneficial.

However, depending on how the model was additionally trained, it might be better to use existing tags. Furthermore, some models have tags specified to enhance quality, so always check the model's page. For example:
- AnimagineXL3.1: masterpiece, best quality, very aesthetic, absurdres is recommended.
- Pony Models: score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up is recommended.
- ToxicEchoXL: masterpiece, best quality, aesthetic is recommended.

In this way, appropriate tag usage is crucial for XL models, particularly anime or illustration models.

4. Negative Prompts

Forget the negative prompts used in SD1.5. "EasyNegative" is just a string. The embeddings usable on TensorArt are negativeXL_D and unaestheticXLv13. Choose according to your preference.

Some models have recommended prompts listed.

For AnimagineXL:
nsfw, lowres, (bad), text, error, fewer, extra, missing, worst quality, jpeg artifacts, low quality, watermark, unfinished, displeasing, oldest, early, chromatic aberration, signature, extra digits, artistic error, username, scan, [abstract]

For ToxicEchoXL:
nsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digits, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, artist name

For photo models, it is sometimes better not to use negative prompts in order to create a certain atmosphere, so try various approaches.

5. Recommended SDXL models

ToxicEnvisionXL
https://tensor.art/models/736585744778443103/ToxicEnvisionXL-v1
A recently released high-quality photo model. Yes, I created it. If you are looking for a photo model, you can't go wrong with this one. Check the related posts to see what kind of images can be created. You can create a variety of realistic images, from analog photo styles to gravure, movies, fantasy, and surreal depictions. Although it is primarily a photo-based model, it can also create analog-style images.

ToxicEtheRealXL
https://tensor.art/models/702813703965453448/ToxicEtheRealXL-v1
A versatile model that supports both illustrations and photorealistic images. Yes, I created it. Because the model is so flexible, it needs well-crafted prompts to determine whether the output is more illustrative or photorealistic. Using LoRA to strengthen the direction might make it easier to use.

ToxicEchoXL
https://tensor.art/models/689378702666043553/ToxicEchoXL-v1
A high-performance model specialized for illustrations. Yes, I created it. It features a unique style based on watercolor painting, with custom training and adjustments. I have also created various LoRA for style changes, so please visit my user page. My current favorite is Beautiful Warrior XL + atmosphere. The model covers a range from illustrations to photos, so give it a try. However, it is weak at generating copyrighted characters, so use LoRA or models like AnimagineXL or Pony for those. With a character LoRA, ToxicEchoXL can produce illustrations with a different touch from other models, making it highly suitable for fan art.

6. Conclusion

I hope this guide helps those who struggle to generate images as well as the model samples or other users do. Well... if you remix from the Model Showcase, you can create beautiful images without this guide... SD3 has also been released, so if possible, I would like to create models for that as well. It seems that a paid license is required for commercial use, though...

SDXLモデルの利用手引

ここではSDXLの基本的な設定を紹介します。

1. 
はじめにSDXLはSD1.5と比較してより高精度な生成が行えるモデルです。人体や構造物はより高品質で破綻が少なく、微細なディテールがよりリアルに表現され、自然なテクスチャや影を描写します。SD1.5ではどのモデルでも生成パラメータは概ね流用可能で、特に気にする必要はありませんでした。SDXLは一部SD1.5の手法を利用しても問題ありませんが、推奨される生成パラメータがモデルによってもだいぶ変わります。またLoRAやEmbeddings(EasyNegativeなど)も一切互換性はありませんので、プロンプトの構築も見直す必要があります。特にSD1.5のネガティブプロンプトでよく使用されているEmbeddingsをそのままXLモデルで入力しても、ただの文字列としてしか認識されていませんので、対応するEmbeddingsに差し替えるか、適切なタグを追加しなければいけません。このガイドでは、SDXLを使用する際の推奨パラメータ設定について説明します。2. 基本的なパラメータVAEsdxl-vae-fp16-fix.safetensorsを選択しておけば問題ありません。モデルに内蔵されている場合も多いですので、指定しなくても大丈夫な場合もあります。画像サイズ解像度はTensorArtで用意されているプリセットを使えば問題ありません。小さかったり大きすぎる解像度は適切な生成結果を得られなくなりますので、SD1.5でよく使用していたサイズはなるべく使用しないでください。プリセットよりも縦長や横長にしたい場合でも、総ピクセル数を大幅に変更しない範囲で行ってください。(縦を増やしたら横は減らす等で調整)サンプリング法モデルによって推奨されるサンプラーがありますので、まずはそれを選択してください。あとはお好みです。基本は Euler a か DPM++ 2M SDE Karras あたりを選択しておけば大丈夫です。サンプリング回数XLではLCMやターボなど低ステップで生成できるようになっていたりしますので、必ずモデルの推奨値を確認してください。CFG Scaleこれもモデルによって異なりますので推奨値を確認してください。概ね2~8程度です。高解像度修復無料ユーザーだと1.5xを指定すると上限に引っかかってしまいますので、使用する場合はカスタムにして以下の解像度を指定してください768x1152 -> 1024x15361152x768 -> 1536x10241024x1024 -> 1248x1248Upscalerはお好みで指定してください。Denoising strengthは0.3~0.4程度。3. プロンプトSDXLはより自然言語の取り扱いに長けています。要素をコンマで区切って入力するだけではなく、普通に英文を入力するだけでも意図した通りの生成が行えます。ChatGPTなどにプロンプトを作ってもらうのもいいでしょう。ただしモデルが追加学習をどのように行ったかによって、既存のタグで記述したほうがいい場合もあります。また、モデルによっては品質を上げるためのタグが指定されていますので、使用するモデルのページは必ず見るようにしましょう。例えば…AnimagineXL3.1では「masterpiece, best quality, very aesthetic, absurdres」を指定することが推奨されています。Pony系モデルでは「score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up」が基本テンプレートとなっています。ToxiEchoXLでは「masterpiece, best quality, aesthetic」を指定することが推奨されています。このように、XLモデル、特にアニメ・イラストモデルでは適切なタグの使用が求められる場合があります。4. 
ネガティブプロンプトSD1.5で使用していたネガティブプロンプトは忘れてください。EasyNegativeはただの文字列です。TensorArtで使用できるEmbeddingsは negativeXL_D と unaestheticXLv13 です。お好みで指定してください。推奨されるプロンプトが記載されているモデルもあります。AnimagineXLでは以下のようなプロンプトが推奨されていますので、これをベースに組むのがいいかもしれません。nsfw, lowres, (bad), text, error, fewer, extra, missing, worst quality, jpeg artifacts, low quality, watermark, unfinished, displeasing, oldest, early, chromatic aberration, signature, extra digits, artistic error, username, scan, [abstract]ToxicEchoXLでは以下のようなプロンプトが推奨されていますnsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digits, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, artist name,フォトモデルではネガティブプロンプト無しのほうが雰囲気のある画作りができる場合もありますので、色々試してみてください。5. おすすめのSDLXモデル紹介ToxicEnvisionXLhttps://tensor.art/models/736585744778443103/ToxicEnvisionXL-v1最近リリースされた高品質フォトモデル。実写系モデルを探しているならこれを選んでおけば間違いありません。関連する投稿からどういった画像が作成できるか見てみてください。アナログ写真風からグラビア、映画、ファンタジー、非現実的な描写等、様々な実写的な画像が作成できます。基本的にはフォトベースのモデルですが、アナログ画風も作成できたりします。ToxicEtheRealXLhttps://tensor.art/models/702813703965453448/ToxicEtheRealXL-v1イラストからフォトリアルまで幅広く対応したモデル。プロンプトによってイラストかフォトリアルか振れ幅が大きいので、明確にプロンプトの作り込みが必要です。LoRAで方向性を強めると使いやすいかもしれません。ToxicEchoXLhttps://tensor.art/models/689378702666043553/ToxicEchoXL-v1イラスト特化の超高性能モデル。水彩をベースに独自の学習・調整を行っているので、わりと独特な画風を持っています。画風変更に様々なLoRAも作成していますので、是非私のユーザーページへお越しください。https://tensor.art/u/649265516304702656最近のお気に入りはBeautiful Warrior XL + atmosphere です。イラストからフォトまで一通り網羅できるので、是非使ってみてください。なお版権キャラの生成は弱いので、その辺はLoRAかAnimagineXLとかPonyとか使うといいと思います。ToxicEchoXLはキャラLoRAを使うと他のモデルとはタッチの違うイラストが作れますので、ファンアート適正自体は高いです。6. おわりにモデルのサンプルやみんなみたいにうまく生成できないな…という方の助けになれば幸いです。まあ…モデルのショーケースからリミックスすればこんなガイド見なくてもきれいな画像が作れますけどね…SD3もリリースされたので、もし可能ならそちらのモデルも作成してみたいですね。どうも商用利用は有償のライセンスが必要そうですが…
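The image-size advice in section 2 of the guide (change the aspect ratio without significantly changing the total pixel count) can be turned into a small calculation. The sketch below is only an illustration: the 1024x1024 pixel budget is taken from the square resolution listed in the guide, while the rounding to multiples of 64 and the function name `size_for_aspect` are assumptions, not TensorArt rules.

```python
import math


def size_for_aspect(aspect_w: int, aspect_h: int,
                    total_pixels: int = 1024 * 1024,
                    multiple: int = 64) -> tuple[int, int]:
    """Pick a width and height with the given aspect ratio and roughly
    `total_pixels` pixels, snapped to a multiple of `multiple` (assumed)."""
    ratio = aspect_w / aspect_h
    width = math.sqrt(total_pixels * ratio)
    height = math.sqrt(total_pixels / ratio)

    def snap(value: float) -> int:
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


print(size_for_aspect(1, 1))   # square image
print(size_for_aspect(2, 3))   # portrait: taller, but narrower to compensate
print(size_for_aspect(16, 9))  # landscape
```

The point of the sketch is the trade-off itself: when the height goes up, the width comes down so that the pixel count stays close to the budget the model handles well.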
Fix EXIF data from EMS-#####-EMS using ExifTool


Introduction

You download your images from this website, but the EXIF data for the model / LoRA looks like:
Model: EMS-342970-EMS, or <lora:EMS-45352-EMS:0.500000>.

The ExifTool utility can fix this. I am using Linux, but it should also work on Mac/Windows if you follow https://exiftool.org/install.html

PreReq
- Download ExifTool from https://exiftool.org/ and extract the archive into your home drive.
- Make a new dot-file called .ExifTool_config in the same folder as exiftool.
- Linux example: ~/Image-ExifTool-12.86/.ExifTool_config
- Windows might need a cmd like: echo.>.ExifTool_config

.ExifTool_config file

Edit the config file. Copy/paste the basic example.
- This is the Perl language, using search and replace. It is not optimized, but it works. Switch is probably more efficient.
- The \+ is an escape for the + in the model name.
- The /g at the end searches for all instances.
- 'Parameters' is the block it changes in the EXIF.
- Add all your desired entries and save the file.

You only need to make new entries like:

    $val =~ s/EMS-151022-EMS/RealCartoon Realistic v11/g;

The basic example:

    %Image::ExifTool::UserDefined = (
        'Image::ExifTool::Composite' => {
            MyParameters => {
                Require => 'Parameters',
                ValueConv => q{
                    # MODEL
                    $val =~ s/EMS-151022-EMS/RealCartoon Realistic v11/g;
                    $val =~ s/EMS-219023-EMS/ShampooMix_v4-fp16-no-ema/g;
                    $val =~ s/EMS-230098-EMS/RealCartoon Realistic v12/g;
                    $val =~ s/EMS-379840-EMS/Lazymix\+ - v4/g;
                    # LoRa
                    $val =~ s/EMS-72516-EMS/Realistic Fusion X - V1/g;
                    $val =~ s/EMS-343944-EMS/A simple nun suit - v1/g;
                    return $val;
                },
            },
        },
    );
    1; # end

Modify the EXIF

I use Linux, extracted ExifTool to a folder inside my home folder, and my files are in my Downloads folder, so the command I run is this, where "~/Downloads" has my raw files:

    perl ~/Image-ExifTool-12.86/exiftool "-Parameters<MyParameters" ~/Downloads

It will make new files and append "_original" to the names of the old ones. You can add -overwrite_original to delete the old files once you are absolutely sure your config file works. This does not forgive. I am not responsible for lost EXIF.

Copy into folders based on Model

This will parse your EXIF for the Model: entry, grab everything up to the first comma, and copy the file into a subfolder of the destination named after the Model. Ideally you have already modified the EXIF to fix the model name. In this example the files are in my home Downloads folder on Linux.
- ~/Downloads/ is the source folder with the files.
- /path/to/destination/ is the destination parent folder. You need to change this.
- -r is recursive; if you want it, make it -r -o .
- The "-o ." is the copy argument. Remove it to move instead, at your own risk.
- If you only run this without first doing the above section, you'll get a bunch of EMS-###-EMS folders. The next section combines everything into one command.

    perl ~/Image-ExifTool-12.86/exiftool -if '($Parameters=~/Model/i)' -o . '-Directory</path/to/destination/${Parameters;m/\bModel:\s+(\w+[^,]*)/;$_=$1;}' ~/Downloads/

Combine into one command

This combines the above into one command. This example does a move, not a copy. I also renamed my exiftool folder.
- ~/Downloads appears twice. Change both to the folder with your EMS-###-EMS pictures.
- /path/to/destination/ is where you want to move the files after renaming EMS-###-EMS to the Model name.

    perl ~/ExifTool/exiftool -if '($Parameters=~/Model/i)' "-Parameters<MyParameters" ~/Downloads -overwrite_original -execute '-Directory</path/to/destination/${Parameters;m/\bModel:\s+(\w+[^,]*)/;$_=$1;}' ~/Downloads

Notes
- You can rename the Image-ExifTool-12.86 directory or keep it wherever you like.
- On Windows you might need to change the ' to " when referencing directories.
- This runs Perl code, so maybe rename exiftool to something else for safety.
- ExifTool was created by Phil Harvey. Very impressive. There is an active community and forum at the creator's website.
- It can do advanced operations and scripting. Above my pay grade.
- Please understand this post isn't an offer for support. This took me all day to figure out. I don't know what I am doing.
Understanding the Impact of Negative Prompts: When and How Do They Take Effect?


📝 - Synthical

The Dynamics of Negative Prompts in AI: A Comprehensive Study

by: Yuanhao Ban (UCLA), Ruochen Wang (UCLA), Tianyi Zhou (UMD), Minhao Cheng (PSU), Boqing Gong, Cho-Jui Hsieh (UCLA)

This study addresses the gap in understanding the impact of negative prompts in AI diffusion models. By focusing on the dynamics of diffusion steps, the research aims to answer the question: "When and how do negative prompts take effect?". The investigation categorizes the mechanism of negative prompts into two primary tasks: noun-based removal and adjective-based alteration.

The role of prompts in AI diffusion models is crucial for guiding the generation process. Negative prompts, which instruct the model to avoid generating certain features, have been less studied than their positive counterparts. This study provides a detailed analysis of negative prompts, identifying the critical steps at which they begin to influence the image generation process.

Findings

Critical Steps for Negative Prompts

Noun-Based Removal: The influence of noun-based negative prompts peaks at the 5th diffusion step. At this critical step, negative prompts initially generate a target object at a specific location within the image. This neutralizes the positive noise through a subtractive process, effectively erasing the object. However, introducing a negative prompt in the early stages paradoxically results in the generation of the specified object. Therefore, the optimal timing for introducing these prompts is after the critical step.

Adjective-Based Alteration: The influence of adjective-based negative prompts peaks around the 10th diffusion step. During the initial stages, the absence of the object leads to a subdued response. Between the 5th and 10th steps, as the object becomes clearer, the negative prompt accurately focuses on the intended area and maintains its influence.

Cross-Attention Dynamics

At the peak around the 5th step for noun-based prompts, the negative prompt attempts to generate objects in the middle of the image, regardless of the positive prompt's context. As this process approaches its peak, the negative prompt begins to assimilate layout cues from its positive counterpart, trying to remove the object. This represents the zenith of its influence.

For adjective-based prompts, during the peak around the 10th step, the negative prompt maintains its influence on the intended area, accurately targeting the object as it becomes clear.

The study highlights the paradoxical effect of introducing negative prompts in the early stages of diffusion, leading to the unintended generation of the specified object. This finding suggests that the timing of negative prompt introduction is crucial for achieving the desired outcome.

Reverse Activation Phenomenon

A significant phenomenon observed in the study is Reverse Activation. This occurs when a negative prompt, introduced early in the diffusion process, unexpectedly leads to the generation of the specified object within the context of that negative prompt. To explain this, the researchers borrowed the concept of the energy function from Energy-Based Models to represent data distribution.

Real-world distributions often feature elements like clear blue skies or uniform backgrounds, alongside distinct objects such as the Eiffel Tower. These elements typically possess low energy scores, making the model inclined to generate them. The energy function is designed to assign lower energy levels to more 'likely' or 'natural' images according to the model's training data, and higher energy levels to less likely ones.

A positive difference indicates that the presence of the negative prompt effectively induces the inclusion of this component in the positive noise. The presence of a negative prompt promotes the formation of the object within the positive noise. Without the negative prompt, implicit guidance is insufficient to generate the intended object. The application of a negative prompt intensifies the distribution guidance towards the object, preventing it from materializing.

As a result, negative prompts typically do not attend to the correct place until step 5, well after the application of positive prompts. The use of negative prompts in the initial steps can significantly skew the diffusion process, potentially altering the background.

Conclusions

Do not use fewer than 10 steps; going beyond 25 steps makes no difference for negative prompting.

Negative prompts can enhance your positive prompts, depending on how well the model and LoRA have learned their keywords, so they can be understood as an extension of their positive counterparts.

Weighting up negative keywords may cause reverse activation and break up your image; try keeping the influence ratio of all your LoRAs and models equal.

Reference

https://synthical.com/article/Understanding-the-Impact-of-Negative-Prompts%3A-When-and-How-Do-They-Take-Effect%3F-171ebba1-5ca7-410e-8cf9-c8b8c98d37b6?
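The timing advice above (introduce the negative prompt only after the critical step) can be sketched in a few lines of Python. This is a toy illustration, not the study's code: it assumes the common classifier-free guidance setup in which the negative prompt takes the place of the empty prompt, and it uses plain lists of floats in place of real noise tensors. The function name, the `critical_step` default of 5, and the `guidance_scale` default are assumptions for the example.

```python
from typing import List, Sequence


def guided_noise(
    eps_positive: Sequence[float],
    eps_negative: Sequence[float],
    eps_empty: Sequence[float],
    step: int,
    critical_step: int = 5,      # assumed cut-off, taken from the noun-based finding
    guidance_scale: float = 7.5,  # assumed, a typical CFG value
) -> List[float]:
    """Classifier-free guidance with a delayed negative prompt.

    Up to and including the critical step, the empty-prompt prediction is the
    baseline, so the negative prompt has no effect. After the critical step,
    the negative-prompt prediction becomes the baseline that is pushed away from.
    """
    baseline = eps_negative if step > critical_step else eps_empty
    return [b + guidance_scale * (p - b) for p, b in zip(eps_positive, baseline)]


# Toy one-element "noise predictions" to show the switch.
early = guided_noise([1.0], [0.5], [0.0], step=2, guidance_scale=2.0)
late = guided_noise([1.0], [0.5], [0.0], step=6, guidance_scale=2.0)
print(early, late)
```

In a real sampler the three predictions would come from the denoising model at each step; only the choice of baseline changes with the step index.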
[ 🔥🔥🔥 SD3 MEDIUM OPEN DOWNLOAD - 2024.06.12 🔥🔥🔥]


Finally! It's happening! The Medium version will be released first!

Stability.AI Co-CEO Christian Laporte has announced the release of the weights.

Stable Diffusion 3 Medium, our most advanced text-to-image model, will soon be available! You can download the weights from Hugging Face starting Wednesday, June 12.

SD3 Medium is the SD3 model with 2 billion parameters, designed to excel in areas where previous models struggled. Key features include:

• Photorealism: Overcomes common artifacts in hands and faces to deliver high-quality images without complex workflows.
• Typography: Provides powerful typography results that surpass the latest large models.
• Performance: Optimized size and efficiency make it ideal for both consumer systems and enterprise workloads.
• Fine-Tuning: Can absorb fine details from small datasets, perfect for customization and creativity.

SD3 Medium weights and code are available for non-commercial use only. If you wish to discuss a self-hosting license for commercial use of Stable Diffusion 3, please fill out the form below and our team will contact you shortly.
What exactly are the "node" and the "workflow" in AI image platform (explanation for the beginner)


The Traditional Way of Generating AI Images for the Beginner

If you are a beginner in the AI community, you may be confused about what a "Node" and a "Workflow" are, and how they relate to the "AI Tools" in TensorArt.

Let's start with the simplest way. We first need to mention how a user generates an image with the "Remixing" button, which brings us to the normal creation menu.

There, you just edit the prompt (what you would like your picture to look like) and the negative prompt (what you do not want to see in the output image), then push the Generate button, and the wonderful AI tool will kindly draw a new illustration for you within a minute!

That sounds great, don't you think, if we imagine how much time humans spent in the past to produce a single piece of art? (Yeah, today, in 2024, in my personal opinion, AI and human abilities still cannot fully replace each other, especially when it comes to beautiful, perfect hands :P )

However, behind the user-friendly menu that lets us "Select model", "Add LoRA", "Add ControlNet", "Set the aspect ratio (the original size of the image)" and so on, each of these options is a "Node" collected in a very complex "Workflow".

PS.1 The Checkpoint and the Model usually refer to the same thing. They are the core program that has been trained to draw the illustration. Each one has its strengths and weaknesses (e.g. anime-oriented or realism-oriented).

PS.2 A LoRA (Low-Rank Adaptation) is like an add-on to the Model that allows it to adapt to a different style, theme, or user preference. A concrete example is an anime character LoRA.

PS.3 A ControlNet is like a condition setting for the image. It helps the model understand what the text prompt alone cannot describe, for instance how a character poses in each direction and the angle of the camera.

So here comes "ComfyFlow" (the nickname of the Workflow; people also refer to it as "ComfyUI"), which gave me a super headache when I saw something like this for the first time in my life!

(This image is a flow I have spent a lot of time studying; it combines what is in two images into a single one.)

Yeah, maybe it is my fault that I did not take a class about workflows from the beginning or search for a tutorial on YouTube the first time (as my first language is not English). But wouldn't it be better if we had an instructor to guide us step by step here in Tensor.Art?

That is why I was inspired to write this article solely for beginners. So let's start with the main content of the article.

What is ComfyFlow

ComfyFlow, or the Workflow, is an innovative AI image-generating platform that allows users to create stunning visuals with ease. To get the most out of this tool, it's important to understand two key concepts: "workflow" and "node." Let's break these down in the simplest way possible.

What is a Workflow?

A workflow is like a blueprint or a recipe that guides the creation of an image. Just as a recipe outlines the steps to make a dish, a workflow outlines the steps and processes needed to generate an image. It's a sequence of actions that the AI follows to produce the final output.

Think of it like this:

Recipe (Workflow): Tells you what ingredients to use and in what order.
Ingredients (Nodes): Each step or component used in the recipe.

TensorArt kindly gives users recommended pre-set templates, but from the viewpoint of a beginner without any knowledge of workflows they are not that helpful, because after clicking the "Try" button we are bombarded with the complexity of the nodes!

What is a Node?

Nodes are the building blocks of a workflow. Each node represents a specific action or process that contributes to the final image. In ComfyFlow, nodes can be thought of as individual steps in the workflow, each performing a distinct function.

Imagine nodes as parts of a puzzle:

Nodes: Individual pieces that fit together to complete the picture (workflow).

How Do Workflows and Nodes Work Together?

1-2) Starting Point: Every workflow begins with an initial node, which might be an image input from the user, together with the Checkpoint and LoRA, which serve as the image references.
3-4) Processing Nodes: These are nodes that draw or modify the image in some way, such as adding color or texture, or applying filters.
5) Ending Point: The node that outputs the completed image. It works very closely with the nodes of the previous stage in terms of sampling and VAE.

PS. A Variational Autoencoder (VAE) is a generative model that learns from input data, such as images, to reconstruct them and to generate new, similar images or variations based on the patterns it has learned.

Here is the list of nodes I used for a normal image generation of my Waifu, using 1 checkpoint and 2 LoRAs, to help the reader understand how ComfyFlow works.

The numbers 1-5 represent the overall process of the workflow and the role of each type of node mentioned above. However, for more complex tasks like AI Tools, the number of nodes is sometimes higher than 30!

By the way, when starting with an empty ComfyFlow page, the way to add a node is "Right Click" -> "Add Node" -> scroll to the top of the list, since the most frequently used nodes are there.

1) loaders -> Load CheckPoint

As in the normal creation menu, this is the node where we choose the Checkpoint, or core model.

It is important to note that nodes work together through inputs and outputs. Each output circle ("MODEL", "CLIP", "VAE") has to connect to the corresponding input on the next node. We link them by left-clicking inside the circle and dragging to the destination.

PS. CLIP (Contrastive Language-Image Pre-training) is a model developed by OpenAI that links images and text together in a way that helps AI understand and generate images based on textual descriptions.

2) loaders -> Load LoRA

The Checkpoint is very closely related to the LoRA, which is why they are connected through the inputs/outputs named "model/MODEL" and "clip/CLIP".

In this example I have used 2 LoRAs (the first for the theme of the picture and the second for the character reference of my Waifu), so the two LoRA nodes have to be connected to each other as well. Here we can adjust the strength, or weight, of each LoRA, just as in the normal creation menu.

3) CLIP Text Encode (Prompt)

This node holds the prompt and negative prompt we normally see in the menu. The only input here is "clip" and the output is "CONDITIONING".

User tip: If you click on the output circle of the "Load LoRA" node and drag it to an empty area, ComfyFlow will pop up a list of corresponding next nodes so you can create a new one with ease.

4) KSampler & Empty Latent Image

The sampling method tells the AI how it should start generating visual patterns from the initial noise. Everything associated with its adjustment is set here in the sampling node, together with the "Empty Latent Image" node.

The inputs in this step are "model" (from the LoRA node) and "positive" and "negative" (from the prompt nodes), and the output is "LATENT".

5) VAE Decode & Final output node

Once we have set up the sampling node, its output named "LATENT" has to connect to "samples". Meanwhile, "vae" is the link between this node and the "Load Checkpoint" node from the beginning.

And when everything is done, the "IMAGE", the final output, will be served into your hands.

PS. An AI Tool is a more complex Workflow created to do a specific task, such as swapping the face of a person in the original picture with a target face, or changing the style of the input illustration to another one, and so on.
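The five stages above can also be written down as data. The sketch below describes the same 1-checkpoint, 2-LoRA flow as a Python dictionary in the style of stock ComfyUI's API-format JSON, where each input that comes from another node is a pair of node id and output index. The node class names are those of stock ComfyUI; whether Tensor.Art's ComfyFlow exports exactly this shape is an assumption, and all file names, prompts, and numeric values are hypothetical.

```python
from typing import Dict, List

# Each link is [source_node_id, output_index].
# Load Checkpoint outputs: 0 = MODEL, 1 = CLIP, 2 = VAE.
workflow: Dict[str, dict] = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "my_checkpoint.safetensors"}},      # hypothetical file
    "2": {"class_type": "LoraLoader",
          "inputs": {"model": ["1", 0], "clip": ["1", 1],
                     "lora_name": "theme_lora.safetensors",           # hypothetical file
                     "strength_model": 0.8, "strength_clip": 0.8}},
    "3": {"class_type": "LoraLoader",
          "inputs": {"model": ["2", 0], "clip": ["2", 1],
                     "lora_name": "character_lora.safetensors",       # hypothetical file
                     "strength_model": 0.7, "strength_clip": 0.7}},
    "4": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "1girl, smiling, garden", "clip": ["3", 1]}},
    "5": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "low quality, bad hands", "clip": ["3", 1]}},
    "6": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 512, "height": 768, "batch_size": 1}},
    "7": {"class_type": "KSampler",
          "inputs": {"model": ["3", 0], "positive": ["4", 0], "negative": ["5", 0],
                     "latent_image": ["6", 0], "seed": 1, "steps": 25, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0}},
    "8": {"class_type": "VAEDecode",
          "inputs": {"samples": ["7", 0], "vae": ["1", 2]}},
    "9": {"class_type": "SaveImage",
          "inputs": {"images": ["8", 0], "filename_prefix": "waifu"}},
}


def upstream(node: dict) -> List[str]:
    """Ids of the nodes this node reads from (inputs whose value is a link)."""
    return [value[0] for value in node["inputs"].values() if isinstance(value, list)]


def execution_order(graph: Dict[str, dict]) -> List[str]:
    """Order the nodes so that every node comes after the nodes it depends on."""
    order: List[str] = []
    visited = set()

    def visit(node_id: str) -> None:
        if node_id in visited:
            return
        visited.add(node_id)
        for dependency in upstream(graph[node_id]):
            visit(dependency)
        order.append(node_id)

    for node_id in graph:
        visit(node_id)
    return order


for node_id in execution_order(workflow):
    print(node_id, workflow[node_id]["class_type"])
```

Reading the printed order from top to bottom follows the same path as the numbered stages: loaders first, then prompts and the empty latent, then sampling, decoding, and saving.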
PhotoReal Makeup Edition - V3 Slider


PhotoReal Makeup Edition - V3 Slider (no trigger)

Introducing the PhotoReal Makeup Edition - V3 Slider! Slide to the right to add beautiful, realistic makeup. Slide to the left to reduce the makeup effect for a more natural look. It's perfect for adjusting the makeup to get just the style you want.

Try it out and see the amazing changes you can make!

More Information:
- Model link

Your feedback is invaluable to me. Feel free to share your experiences and suggestions in the comment section. For more personal interactions, join our Discord server where we can discuss and learn together.

Thank you for your continued support!

Tips for new Users

Intro

Hey there! If you're reading this, you're probably new to AI image generation and want to learn more. If you're not, you probably already know more than me :). Yeah, full disclosure: I'm still pretty inexperienced at this whole thing, but I thought I could still share some of the things I've learned with you! So, in no particular order:

1. You can like your own posts

I doubt there's anyone who doesn't know this already, but if you're posting your favorite generations and you care about getting likes, you can always like them yourself. Sketchy? Kinda. Do I still do it? Yes. And on the topic of getting more likes:

2. Likes will often be returned

Whenever I receive a like on one of my posts, I'll look at that person's pictures and heart any that I particularly enjoy. I know a lot of people do this, so one of the best ways to get people to notice and like your content is to just browse through posts and be generous with your own likes. It's a great way to get inspiration too!

3. Use turbo/lightning LORAs

If you find yourself running out of credits, there are ways to conserve them. When I'm iterating on an idea, I'll use an SDXL model (Meina XL) paired with this LORA. This lets me get high quality images in 10 steps for only 0.4 credits! It's really nice, and works with any SDXL model. Unfortunately, if there is a similar method for speeding up SD 1.5 models I don't know it, so it only works with XL.

4. Use ADetailer smartly

ADetailer is the best solution I've found for improving faces and hands. It's also a little difficult to figure out. So, though I'm still not a professional with it, I thought I could share some of the tricks I've learned. The models I normally use are face_yolo8s.pt and hand_yolo8s.pt. The "8s" versions are better than the "8n" versions, though they are slightly slower. In addition to these models, I'll often add the Attractive Eyes and Perfect Hand LORAs respectively. These are all just little things you can do to improve these notoriously hard parts of image generation. Also, using ADetailer before upscaling the image is cheaper in terms of credits, though the upscaling process can sometimes mess up the hands and face a little bit, so there's some give and take there.

5. Use an image editing app

Wait a minute, I hear you saying, isn't this a guide for using Tensor Art? Yes, but you can still use other tools to improve your images. If I don't like a specific part of my image, I'll download it, open it in Krita (or Photoshop or GIMP) and work on it. My art skills are pretty bad (which is why I'm using this site in the first place), but I can still remove, recolor, or edit certain aspects of the image. I can then reupload it to Tensor Art and run img2img with a high denoising strength to improve it further. You could also just try inpainting the specific thing you want to change, but I always find it a bit of a struggle to get inpaint to make the changes I want.

6. Experiment!

The best way to learn is to do, so just start generating images, fiddling with settings, and trying new things. I still feel like I'm learning new stuff every day, and this technology is improving so fast that I don't think anyone will ever truly master it. But we can still try our hardest and hone our skills through experimentation, sharing knowledge, and getting more familiar with these models. And all the anime girls are a big plus too.

Outro

If you have anything to add, or even a tip you'd like to share, definitely leave a comment and maybe I can add it to this article. This list is obviously not exhaustive, and I'm nowhere near as talented as some of the people on this platform. Still though, I hope to have helped at least one person today. If that was you, maybe give the article a like? I appreciate it a ton, so if you enjoyed, just let me know. Thanks for reading!
• MOOD MAGIC SERIES • I. Melancholy


MOOD MAGIC: adding emotion to your prompts

Melancholy & Gloom

Overcast: Cloud-covered skies for subdued lighting.
Dim Lighting: Limited light sources for creating deep shadows.
Muted Colors: Toned-down color palette to convey sadness or desolation.
Dusky: Twilight ambiance, suggesting the fading light of day.
Foggy: A thick mist that obscures details and softens the scene.
Drizzly: Gentle rain that adds a reflective, melancholic quality.
Cloudy: Thick clouds that reduce brightness and saturate the scene with grey.
Desaturated: Low color saturation to enhance the bleak feel.
Shadowed: Prominent shadows that deepen the mood.
Moody Lighting: Emotionally charged lighting with strong contrasts.
Gloomy: Overall dark and dismal atmosphere.
Monochrome: Black and white or single-color dominance to strip away cheer.
Underexposed: Darker exposure to mimic a sense of foreboding.
Chiaroscuro: Strong contrasts between light and dark, emphasizing turmoil.
Hazy: Blurred or smoky atmosphere, creating a sense of mystery or unease.
Twilight: Dim natural lighting that can feel lonely or isolating.
Stormy: Implication of an approaching or ongoing storm to add tension.
Wintery: Cold, barren landscape cues, even in urban settings.
Grainy: Visual noise that adds an old or troubled quality.
Bleak: Stark, harsh lighting or barren scenery settings.
Ominous Clouds: Dark, menacing clouds that threaten bad weather.
Subdued Tones: Soft, low-key colors that don't catch the eye.
Cold Colors: Blues and greys to suggest chilliness and discomfort.
Rusty: Implications of decay and neglect.
Aged: A sense of time wearing down the scene, historical weariness.
Soft Focus: Slightly out-of-focus elements to create a sense of disorientation or confusion.
Tenebrous: Deeply shadowed, almost pitch-dark.
Low-Key Lighting: Minimal lighting, mostly in darkness with occasional highlights.
Pensive: Engaged in, involving, or reflecting deep or serious thought.
Yearning: A feeling of intense longing for something, typically something that one has lost or been separated from.
Weary: Conveying a sense of tiredness or exhaustion, both physical and emotional.
Sparse: Minimalist or bare settings that suggest simplicity or emptiness.
Brooding: A deep, serious, and sometimes dark contemplation.
Silent: Lack of sound or motion, emphasizing solitude or contemplation.
Ephemeral: Fleeting or transitory, suggesting the transient nature of moments and emotions.
Desolate: Emptiness that conveys a sense of abandonment or loneliness.
Poetic: Imbued with a sense of beauty and melancholy, often through lyrical expression.
Moody Skies: Cloudy, stormy, or unsettled skies that reflect a turbulent emotional landscape.
Cold Light: Harsh, unyielding light that doesn't warm but isolates subjects.
Autumnal: Related to autumn, often seen as a melancholic season due to its association with the end of summer.
Faded: Colors or elements that have lost brightness, suggesting the passing of time.
Blue Hour: Moody cool natural lighting obtained in the twilight hour just after sunset or just before sunrise.

Example using Stable Diffusion SDXL + refiner

Checkpoint: RealVis4
Cfg: 5.5
Steps: 40
Sampler: DPM++ 3M SDE Karras

Visualize a close-up portrait of a young woman standing by a foggy window, her gaze distant and contemplative. The room is dimly lit, with only a soft, diffuse light filtering through the heavy overcast outside, casting subtle shadows across her face. The colors are desaturated, emphasizing a palette of cool grays and muted blues that reflect her somber mood. Her expression is serene yet melancholic, with her eyes slightly downcast as if lost in thought. The background is blurred, enhancing the sense of isolation and introspection. This portrait captures the essence of melancholy, framed in a moment of quiet solitude.

negative: illustration, cartoon, anime, 3d, digital art, bad quality, CGI, sketch, drawn, blurry, painting, worst quality, low quality, bad anatomy, bad hands, bad body, missing fingers, extra digit, fewer digits
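If you prefer short keyword prompts over a long descriptive paragraph, the mood words above can simply be appended to a subject. The following is a minimal Python sketch of that idea; the helper name `build_prompt`, the subject text, and the choice of keywords are illustrative assumptions, while the settings values are the ones listed in the example above.

```python
from typing import Dict, List, Union


def build_prompt(subject: str, mood_keywords: List[str]) -> str:
    """Join a subject and a list of mood keywords into one comma-separated prompt."""
    parts = [subject.strip()] + [keyword.strip().lower() for keyword in mood_keywords]
    return ", ".join(part for part in parts if part)


# Settings copied from the example above; how they are passed to a generator
# depends on the tool you use.
settings: Dict[str, Union[str, float, int]] = {
    "checkpoint": "RealVis4",
    "cfg": 5.5,
    "steps": 40,
    "sampler": "DPM++ 3M SDE Karras",
}

prompt = build_prompt(
    "close-up portrait of a young woman by a window",
    ["Foggy", "Dim Lighting", "Desaturated", "Pensive"],
)
print(prompt)
```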
Buzz words: LIGHTING


Getting the lighting right is key to making your AI-generated images look super realistic. This guide gives you the top keywords to use in your prompts to nail the lighting every time. Whether you're after dramatic shadows or soft, natural light, these tips will help your images look lifelike and set the tone of your composition.

Ambient light: Soft, even lighting that fills the entire scene, reducing shadows.
Chiaroscuro Lighting: A technique that uses strong contrasts between light and dark to create a dramatic, three-dimensional effect.
Rim light: Light that outlines the subject, emphasizing its edges and creating a glowing effect.
Diffused light: Soft light scattered in many directions, minimizing harsh shadows.
Natural light: Light from the sun, moon, or other natural sources, offering realism and variation.
Backlight: Light coming from behind the subject, creating a silhouette or halo effect.
Volumetric light: Light that interacts with particles in the air, such as fog or dust, creating visible light rays and enhancing the sense of depth in the scene.
Polarized light: Light that vibrates in parallel planes.
Emissive light: Light emitted from surfaces or objects themselves, often used to simulate glowing materials or lights.
Directional light: Focused light from a specific direction, creating strong shadows and highlights.
Soft light: Gentle light that produces minimal shadows, creating a smoother look.
Hard light: Sharp, intense light that casts strong shadows and highlights details.
Spotlight: Intense focused beam that highlights a set area or subject.
Artificial light: Light from man-made sources, allowing precise control over the scene. Examples: halogen, fluorescent, blacklight, LED, xenon, plasma, ultraviolet, incandescent, neon, infrared, sodium vapor lights, metal halide lights, krypton, photoluminescent, ceramic metal halide, HMI, CCFL, CFL.
Low key light: Predominantly dark lighting with high contrast, often creating a dramatic or moody atmosphere.
High Key Light: Bright, low-contrast lighting that minimizes shadows.
Bounce Lighting/Reflected Lighting: Light reflected off a surface to soften the effect and spread it more evenly.
Side Lighting: Light coming from the side of the subject.
Caustic Lighting: Light patterns created when light is refracted or reflected through transparent or reflective materials, producing intricate and often beautiful effects.
Uplighting: Light directed upwards. Great for emphasizing architectural features.
Color Gel Lighting: The use of colored filters over lights to alter the color or mood of the scene.
Gobo Lighting: Using a stencil or template placed in front of a light source to project patterns or shapes onto a surface.
Split Lighting: Lighting that illuminates one half of the subject's face while leaving the other half in shadow, creating a strong, dramatic effect.
Butterfly Lighting: Light placed above and in front of the subject, creating a butterfly-shaped shadow under the nose, often used in glamour photography.
Rembrandt Lighting: A technique where light creates a triangle of illumination on the cheek opposite the light source, adding depth and character.
Specular lighting: Sharp, bright reflections from shiny surfaces, emphasizing glossiness and texture.
Natural Breakup Lighting/Dappled Lighting: Using irregular patterns to mimic natural light effects, such as light filtering through leaves.
Subsurface Scattering: Light that penetrates the surface of a translucent material, scattering within and then exiting at a different point, adding realism to materials like skin or wax.
Golden Hour: Warm golden natural lighting obtained shortly after sunrise or shortly before sunset. Creates long soft shadows.
Blue Hour: Moody cool natural lighting obtained in the twilight hour just after sunset or just before sunrise.
Clamshell Lighting: A portrait lighting setup using two light sources, one above and one below the subject's face.
Catch light: A small reflection of the light source in the subject's eyes, adding life and dimension to portraits.
Cross lighting: Two light sources positioned at opposite sides of the subject, creating dramatic shadows and highlights.
Tenebrism: Aggressive contrast between light and dark, producing dark and gloomy images.
Contre-jour: A lighting technique that produces clear silhouettes by the use of backlighting.
Sfumato: An artistic technique of soft transitions between colors and tones, resulting in a dreamy effect with no clear boundaries, e.g. the Mona Lisa.
Ray tracing: A rendering technique that simulates the way light interacts with the scene. It traces the light from the source as it bounces off surfaces and reaches the viewer's eye.
Three point lighting: A cinematic lighting technique using key light, fill light, and backlight.
Global Illumination: A computer graphics technique that adds more realistic lighting to 3D scenery.
Bloom: Simulates the glow around bright light sources, creating a soft halo.
Luminescence: Emission of light by a substance not resulting from heat. It occurs through various processes such as chemical reactions, electrical energy, or other means.
Bioluminescence: A cold light produced by a chemical reaction inside a living organism.
Quickstart Guide to Stable Video Diffusion


What is Stable Video Diffusion (SVD)?

Stable Video Diffusion (SVD) from Stability AI is an extremely powerful image-to-video model, which accepts an image input, into which it "injects" motion, producing some fantastic scenes.

SVD is a latent diffusion model trained to generate short video clips from image inputs. There are two models. The first, img2vid, was trained to generate 14 frames of motion at a resolution of 576×1024, and the second, img2vid-xt, is a finetune of the first, trained to generate 25 frames of motion at the same resolution.

The newly released (2/2024) SVD 1.1 is further finetuned on a set of parameters to produce excellent, high-quality outputs, but requires specific settings, detailed below.

Why should I be excited by SVD?

SVD creates beautifully consistent video movement from our static images!

How can I use SVD?

ComfyUI is leading the pack when it comes to SVD video generation, with official SVD support! Generating 25 frames of 1024×576 video uses < 10 GB VRAM.

It's entirely possible to run the img2vid and img2vid-xt models on a GTX 1080 with 8GB of VRAM!

There's still no word (as of 11/28) on official SVD support in Automatic1111.

If you'd like to try SVD on Google Colab, this notebook works on the Free Tier: https://github.com/sagiodev/stable-video-diffusion-img2vid/. Generation time varies, but is generally around 2 minutes on a V100 GPU.

You'll need to download one of the SVD models, from the links below, and place it in the ComfyUI/models/checkpoints directory.

After updating your ComfyUI installation, you'll see new nodes for VideoLinearCFGGuidance and SVD_img2vid_Conditioning. The Conditioning node takes the following inputs:

You can download ComfyUI workflows for img2video and txt2video below, but keep in mind you'll need an updated ComfyUI, and you may also be missing additional nodes for video. I recommend using the ComfyUI Manager to identify and download missing nodes!

Suggested Settings

The settings below are suggested settings for each SVD component (node), which I've found produce the most consistently usable outputs with the img2vid and img2vid-xt models.

Settings – Img2vid-xt-1.1

February 2024 saw the release of a finetuned SVD model, version 1.1. This version only works with a very specific set of parameters to improve the consistency of outputs. If using the Img2vid-xt-1.1 model, the following settings must be applied to produce the best results:

The easiest way to generate videos

In tensor.art, you can generate videos very easily compared to the explanation above. All you need to do is input the prompt you want, select the model you like, set the ratio, and set the frames in the AnimateDiff menu.

Output Examples

Limitations

It's not perfect! Currently there are a few issues with the implementation, including:

Generations are short! Only <=4 second generations are possible, at present.
Sometimes there's no motion in the outputs. We can tweak the conditioning parameters, but sometimes the images just refuse to move.
The models cannot be controlled through text.
Faces, and bodies in general, often aren't the best!
List of style collection - focusing on anime character examples (continue updating)


AI image-generating platforms like Tensor.art offer diverse anime styles, enabling users to create artwork in a variety of distinct styles inspired by popular anime aesthetics. These collections aim to cater to different preferences, from classic to contemporary anime illustration, within one place.

P.S.1 I will continue updating this post, maybe every 2 weeks, when I find a unique style (both LoRA and model) that is worth listing here, solely from my perspective. Anyway, if anyone has a list of favorite styles in mind, feel free to share it here or even create your own post. :D

P.S.2 People normally mix multiple LoRAs at once, and the core model (checkpoint) varies in base style depending on the prompt used. Therefore, in the following examples, I will choose only a single LoRA or Checkpoint to represent each style without mixing anything. However, if there is any confusion about which part contributes to the style, I apologize in advance since I am just a beginner in the art community.

Here are some examples:

Anime Lineart / Manga-like (线稿/線画/マンガ風/漫画风) Style (LORA) https://tensor.art/models/623935989624337542
Spacezin Sketch Style (LoRA) https://tensor.art/models/638083414328801488
Cute Chibi - V.1 (LoRA) https://tensor.art/models/726716640076597245
CAT - Citron Anime Treasure (Checkpoint) https://tensor.art/models/713607777118974323
LizMix V.7.0 (Checkpoint) https://tensor.art/models/721034681811855891
Flower style - (LORA) https://tensor.art/models/699582840586758007
Art Nouveau Style - Oosayam (LoRA) https://tensor.art/models/654562112921690173
Torino Style - v.2.0.09 (LoRA) https://tensor.art/models/705577639974520212
Yody PVC 3D Print - 1.0 (Checkpoint) https://tensor.art/models/673632484975460872
Eldritch Expressionism style (LoRA) https://tensor.art/models/708171473803739178
[Y5] Impressionism Style 印象派风格 (LoRA) https://tensor.art/models/621173217551417505
surrealism - 2024-02-17 (LoRA) https://tensor.art/models/695557949424221333
pop-art - 01 style (LoRA) https://tensor.art/models/697182692602582375
FF Style: Kazimir Malevich | Suprematism (LoRA) https://tensor.art/models/655758742350092928

I hope these collections (today and in the future) will allow A.I. artists and enthusiasts to generate anime-inspired images effortlessly, blending creativity with advanced AI technology to bring their visions to life. :D
Prompt reference for "Lighting Effects"


Hello. I usually use "lighting/lighting effects" words when generating images. Here I will introduce some of the words I use when I want to add something.

Please note that these words alone are not 100% effective: the effect you get will differ depending on the base model, the LoRA, the sampling method, and where you place the word in the prompt.

Words related to "lighting effects"

・ Backlight : Light from behind the subject.
・ Colorful lighting : The image as a whole is not tinted, but the colors change depending on the light.
・ moody lighting : Natural lighting, not direct artificial light.
・ studio lighting : A term for the artificial lighting of a photography studio.
・ Directional Light : A directional light source shines parallel rays in a selected direction.
・ Dramatic lighting : A lighting technique from the field of photography.
・ Spot lighting : A lighting technique that uses artificial light on a small area.
・ Cinematic lighting : A single term that covers several lighting techniques used in movies.
・ Bounce Lighting : Light bounced off a reflector board, etc.
・ Practical Lighting : Photos and videos that show the light source itself in the composition.
・ Volumetric lighting : A term derived from 3DCG. It tends to produce a picture with a divine golden light source.
・ Dynamic lighting : I don't really understand what it means, but it tends to create high-contrast images.
・ Warm lighting : Creates a warm picture illuminated with warm colors.
・ Cold lighting : Lighting from a cold-colored light source.
・ High-key lighting : Soft light, minimal shadows, low contrast, resulting in bright frames.
・ Low-key lighting : It provides high contrast, but the effect is a little weak.
・ Hard light : Strong light. Highlights appear strong.
・ soft light : A term that refers to faint light.
・ strobe lighting : Strong artificial light (stroboscopic lighting).
・ Ambient light : A term that refers to ambient lighting/indoor lighting.
・ flash lighting : For some reason, the characters themselves tend to emit light, and there are often flashes of light (flash lighting photography).
・ Natural lighting : This tends to create a natural-looking picture that contrasts with artificial light.
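
Because the effect of each word depends on the model and on where it sits in the prompt, it can help to build one prompt per lighting word and compare the results side by side. The following is a minimal Python sketch of that idea. The base prompt, the function name `lighting_variants`, and the option to put the word at the start or the end are illustrative assumptions, not something a particular tool requires.

```python
# Hypothetical helper: build one prompt per lighting keyword for side-by-side tests.
LIGHTING_WORDS = [
    "backlight",
    "moody lighting",
    "studio lighting",
    "cinematic lighting",
    "volumetric lighting",
]


def lighting_variants(base_prompt, words=LIGHTING_WORDS, at_start=False):
    """Return a list of (word, prompt) pairs, one per lighting keyword."""
    variants = []
    for word in words:
        if at_start:
            prompt = f"{word}, {base_prompt}"
        else:
            prompt = f"{base_prompt}, {word}"
        variants.append((word, prompt))
    return variants


if __name__ == "__main__":
    for word, prompt in lighting_variants("1girl, standing in a forest"):
        print(f"{word:>20} -> {prompt}")
```

Generating each variant with the same seed and settings makes it easier to see what a single word changes.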
The future of AI image generation: endless possibilities -

Introduction
For those who are about to start AI image generation

In recent years, advances in AI technology have brought about revolutionary changes in the field of image generation. In particular, AI-powered illustration generation has become a powerful tool for artists and designers. However, as this technology advances, issues of creativity and copyright arise. In this article, we will explain the possibilities of AI image generation, specific use cases, how to create prompts, how to use LoRA and its effects, keywords for improving image quality, copyright considerations, and more.

Fundamentals of AI image generation

AI image generation uses artificial intelligence to learn from data and generate new images. Deep learning techniques are often used for this, and one notable approach is Stable Diffusion. Stable Diffusion employs a probabilistic method called a diffusion model, which gradually removes noise during image generation, resulting in highly realistic, high-quality output.

Generating realistic images

AI technology is excellent not only for creating cute illustrations, but also for generating realistic images. For example, you can generate high-resolution, photorealistic landscapes or portraits. With Stable Diffusion it is possible to generate more detailed images, which expands the possible applications in fields such as advertising, film production, and game design.

Generating cute illustrations

One of the practical applications of AI image generation is the creation of cute illustrations. This is useful for things like character design and avatar creation, allowing you to quickly generate different styles. The process typically involves collecting a large dataset of illustrations, training an AI model on this data to learn different styles and patterns, and generating new illustrations based on user input or keywords.

Creativity and AI

AI image generation also influences creative ideas.
Artists can use AI-generated images as inspiration for new works or to expand on ideas, which can lead to the creation of new styles and concepts never thought of before.

Use and effects of LoRA

LoRA (Low-Rank Adaptation) is a technique for efficiently fine-tuning existing AI models. Its effects include:

1. Fine-tune models: LoRA allows you to fine-tune existing AI models to learn specific styles and features, allowing for customization based on user needs.
2. Efficient learning: LoRA reduces the need for large-scale data collection and lowers training costs by efficiently training models on small datasets.
3. Rapid adaptation: LoRA allows you to quickly adapt to new styles and trends, making it easy to generate images tailored to your current needs.

For example, LoRA can be leveraged to efficiently achieve high-quality results when generating illustrations in a specific style.

Creating a prompt

When instructing an AI to generate illustrations, it's important to create effective prompts. Key points for creating prompts include providing specific instructions, using the right keywords, iterating through trial and error, and optionally supplying a reference image to help the AI figure out what you're looking for.

Keywords for improving image quality

When creating prompts for AI image generation, you can include keywords related to image quality to improve the overall quality of the generated images. Useful keywords include "high resolution," "detail," "clean lines," "high quality," "sharp," "bright colors," and "photorealistic."

Copyright considerations

Image generation using AI also raises copyright issues. If the dataset used to train an AI model contains copyrighted works, the resulting images may infringe on those copyrights.
When using AI image generation tools, it's important to be aware of the data source, ensure that the generated images comply with copyright laws, and check the license agreement.

Conclusion

AI image generation offers great possibilities for artists and designers, but it also raises challenges related to copyright. By using data responsibly and understanding copyright law, you can leverage AI technology to create innovative work. Leveraging technologies like LoRA can further improve efficiency and quality. Users can adjust the output by incorporating image enhancement keywords into the prompt. Let's explore new ways of expression while being aware of advances in AI technology and the considerations that come with it!
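
To make the LoRA section of this article a little more concrete: LoRA keeps the original weight matrix W frozen and trains two small matrices, B and A, whose product is added to it, roughly W' = W + (alpha / r) * B A. The plain-Python sketch below only counts trainable parameters to show why this is cheaper than full fine-tuning. The layer size (768 x 768) and the rank (16) are illustrative assumptions, not values taken from this article.

```python
def full_param_count(d, k):
    """Trainable parameters when fine-tuning a full d x k weight matrix."""
    return d * k


def lora_param_count(d, k, r):
    """Trainable parameters for a rank-r LoRA update: B is d x r, A is r x k."""
    return r * (d + k)


if __name__ == "__main__":
    d, k, r = 768, 768, 16  # assumed example sizes
    full = full_param_count(d, k)
    lora = lora_param_count(d, k, r)
    print(f"full fine-tune: {full} parameters")
    print(f"LoRA (rank {r}): {lora} parameters ({lora / full:.1%} of full)")
```

The smaller the rank r, the fewer parameters are trained and the smaller the resulting LoRA file, at the cost of less capacity to capture a style.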
Stylistic QR Code with Stable Diffusion

source: anfu.me (You can now easily create QR codes with ControlNet inside Tensor.art; next time I will write a guide about that.)

Yesterday, I created this image using Stable Diffusion and ControlNet, and shared it on Twitter and Instagram: an illustration that also functions as a scannable QR code.

The process of creating it was super fun, and I’m quite satisfied with the outcome.

In this post, I would like to share some insights into my learning journey and the approaches I adopted to create this image. Additionally, I want to take this opportunity to credit the remarkable tools and models that made this project possible.

Get into the Stable Diffusion

This year has witnessed an explosion of mind-boggling AI technologies, such as ChatGPT, DALL-E, Midjourney, Stable Diffusion, and many more. As a former photographer with some interest in design and art, I find being able to generate images directly from imagination in minutes undeniably tempting.

So I started by trying Midjourney. It’s super easy to use, very expressive, and the quality is actually pretty good. It would honestly be my recommendation for anyone who wants to get started with generative AI art.

By the way, Inès has also delved into it and become quite good at it now; go check her work on her new Instagram account @a.i.nes.

On my end, being a programmer with strong preferences, I would naturally seek greater control over the process. This brought me to the realm of Stable Diffusion. I started with this guide: Stable Diffusion LoRA Models: A Complete Guide. The benefit of being late to the party is that there are already a lot of tools and guides ready to use. Setting up the environment was quite straightforward, and luckily my M1 Max’s GPU is supported.

QR Code Image

A few weeks ago, nhciao on reddit posted a series of artistic QR codes created using Stable Diffusion and ControlNet. The concept behind them fascinated me, and I definitely wanted to make one of my own.
So I did some research and managed to find the original article in Chinese: Use AI to Generate Scannable Images. The author provided insights into their motivations and the process of training the model, although they did not release the model itself. On the other hand, they are building a service called QRBTF.AI to generate such QR codes; however, it is not yet available.

Then one day I found a community model, QR Pattern Controlnet Model, on CivitAI. I knew I had to give it a try!

Setup

My goal was to generate a QR code image that directs to my website while incorporating elements that reflect my interests. I ended up going with a slightly cypherpunk style with a character representing myself :P

Disclaimer: I’m certainly far from being an expert in AI or related fields. In this post, I’m simply sharing what I’ve learned and the process I followed. My understanding may not be entirely accurate, and there are likely optimizations that could simplify the process. If you have any suggestions or comments, please feel free to reach out using the links at the bottom of the page. Thank you!

1. Setup Environment

I pretty much followed Stable Diffusion LoRA Models: A Complete Guide to install the web ui AUTOMATIC1111/stable-diffusion-webui, download models I was interested in from CivitAI, etc. As a side note, I found that the user experience of the web ui is not super friendly. Some of the problems are, I guess, architectural issues that might not be easy to improve, but luckily I found a pretty nice theme, canisminor1990/sd-webui-kitchen-theme, that improves a bunch of small things.

In order to use ControlNet, you will also need to install the Mikubill/sd-webui-controlnet extension for the web ui.

Then you can download the QR Pattern Controlnet Model, put the two files (.safetensors and .yaml) under the stable-diffusion-webui/models/ControlNet folder, and restart the web ui.

2. Create a QR Code

There are hundreds of QR code generators full of ads or paid services, and we certainly don’t need that fanciness, because we are going to make it much fancier 😝!

So I ended up finding the QR Code Generator Library, a playground of an open source QR code generator. It’s simple but exactly what I need! It’s better to use the medium error correction level or above to make the code easier to recognize later. A small tip: you can try different Mask patterns to find a better color distribution that fits your design.

3. Text to Image

As in the regular Text2Image workflow, we need to provide some prompts for the AI to generate the image from. Here are the prompts I used:

Prompts
(one male engineer), medium curly hair, from side, (mechanics), circuit board, steampunk, machine, studio, table, science fiction, high contrast, high key, cinematic light, (masterpiece, top quality, best quality, official art, beautiful and aesthetic:1.3), extreme detailed, highest detailed, (ultra-detailed)

Negative Prompts
(worst quality, low quality:2), overexposure, watermark, text, easynegative, ugly, (blurry:2), bad_prompt,bad-artist, bad hand, ng_deepnegative_v1_75t

Then we need to go to the ControlNet section, upload the QR code image we generated earlier, and configure the parameters as suggested on the model homepage.

Then you can start to generate a few images and see if they meet your expectations. You will also need to check whether the generated image is scannable; if not, you can tweak the Start controlling step and End controlling step to find a good balance between stylization and QRCode-likeness.

4. I’m feeling lucky!

After finding a set of parameters that I am happy with, I will increase the Batch Count to around 100 and let the model generate variations randomly. Later I can go through them and pick one with the best composition and details for further refinement. This can take a lot of time, and also a lot of resources from your processors.
So I usually start it before going to bed and leave it overnight.

Here are some examples of the generated variations (not all of them are scannable):

From approximately one hundred variations, I ultimately chose the following image as the starting point:

It has a pretty interesting composition, while being less obvious as a QR code. So I decided to proceed with it and add a bit more detail. (You can compare it with the final result to see the changes I made.)

5. Refining Details

Update: I recently built a toolkit to help with this process, check my new blog post 👉 Refine AI Generated QR Code for more details.

The generated images from the model are not perfect in every detail. For instance, you may have noticed that the hand and face appear slightly distorted, and the three anchor boxes in the corner are less visually appealing. We can use the inpaint feature to tell the model to redraw some parts of the image (it is better to keep the same or similar prompts as the original generation).

Inpainting typically requires a similar amount of time as generating a text-to-image, and it involves either luck or patience. Often, I use Photoshop to "borrow" some parts from previously generated images and use the spot healing brush tool to clean up glitches and artifacts. My Photoshop layers look like this:

After making these adjustments, I’ll send the combined image back for inpainting again to ensure a more seamless blend, or to search for some other components that I didn’t find in other images.

Specifically on the QR code, in some cases ControlNet may not have enough priority, causing the prompts to take over and result in certain parts of the QR code not matching.
To address this, I would overlay the original QR code image onto the generated image (as shown in the left image below), identify any mismatches, and use a brush tool to paint those parts with the correct colors (as shown in the right image below).

I then export the marked image for inpainting once again, adjusting the Denoising strength to approximately 0.7. This ensures that the model overrides our marks while still respecting the color to some degree.

Ultimately, I iterate through this process multiple times until I am satisfied with every detail.

6. Upscaling

The recommended generation size is 920x920 pixels. However, the model does not always generate highly detailed results at the pixel level. As a result, details like the face and hands can appear blurry when they are too small. To overcome this, we can upscale the image, providing the model with more pixels to work with. The SD Upscaler script in the img2img tab is particularly effective for this purpose. You can refer to the guide Upscale Images With Stable Diffusion for more information.

7. Post-processing

Lastly, I use Photoshop and Lightroom for subtle color grading and post-processing, and we are done!

The one I ended up with does not have very good error tolerance; you might need to try a few times or use a more forgiving scanner to get it scanned :P

And using a similar process, I made another one for Inès:

Conclusion

Creating this image took me a full day, with a total of 10 hours of learning, generating, and refining. The process was incredibly enjoyable for me, and I am thrilled with the end result! I hope this post can offer you some fundamental concepts or inspire you to embark on your own creative journey.
There is undoubtedly much more to explore in this field, and I am eager to see what’s coming next!

Join my Discord Server and let’s explore more together!

If you want to learn more about the refining process, go check my new blog post: Refining AI Generated QR Code.

References

Here is the list of resources for easier reference.

Concepts
Stable Diffusion
ControlNet

Tools
Hardware & software I am using.
AUTOMATIC1111/stable-diffusion-webui - Web UI for Stable Diffusion
canisminor1990/sd-webui-kitchen-theme - Nice UI enhancement
Mikubill/sd-webui-controlnet - ControlNet extension for the webui
QR Code Generator Library - QR code generator that is ad-free and customisable
Adobe Photoshop - The tool I used to blend the QR code and the illustration

Models
Control Net Models for QR Code (you can pick one of them)
QR Pattern Controlnet Model
Controlnet QR Code Monster
IoC Lab Control Net
Checkpoint Model (you can use any checkpoints you like)
Ghostmix Checkpoint - A very high quality checkpoint I use. You can use any other checkpoints you like

Tutorials
Stable Diffusion LoRA Models: A Complete Guide - The one I used to get started
(Chinese) Use AI to Generate Scannable Images - Unfortunately the article is in Chinese and I didn’t find an English version of it.
Upscale Images With Stable Diffusion - Enlarge the image while adding more details
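
The overlay check from step 5 (comparing the original QR code against the generated image and marking the parts that do not match) can also be roughly sketched in code. This is a minimal Python sketch under stated assumptions, not the author's actual workflow, which was done by eye in Photoshop. It assumes both images have already been reduced to one value per QR module, and the function name `find_mismatches` and the brightness threshold of 128 are hypothetical.

```python
def find_mismatches(qr_modules, image_brightness, threshold=128):
    """Return (row, col) positions where the generated image disagrees with the QR code.

    qr_modules: 2D list of booleans, True for a dark module in the original QR code.
    image_brightness: 2D list of 0-255 values, the average brightness of the
        generated image over each module (same shape as qr_modules).
    """
    mismatches = []
    for row, (qr_row, img_row) in enumerate(zip(qr_modules, image_brightness)):
        for col, (is_dark, brightness) in enumerate(zip(qr_row, img_row)):
            looks_dark = brightness < threshold
            if looks_dark != is_dark:
                mismatches.append((row, col))
    return mismatches


if __name__ == "__main__":
    qr = [[True, False], [False, True]]
    img = [[30, 220], [90, 40]]  # module (1, 0) is too dark in the image
    print(find_mismatches(qr, img))
```

The returned positions are the modules you would paint over with the correct color before sending the image back for inpainting.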
The Marvel of Tanjore Temple: A Timeless Treasure

Introduction

The Tanjore Temple, also known as Brihadeeswarar Temple, is a striking example of India’s architectural grandeur and rich cultural heritage. Nestled in the historic town of Thanjavur in Tamil Nadu, this UNESCO World Heritage Site draws thousands of visitors each year, eager to marvel at its towering vimana (temple tower), intricate carvings, and vibrant history.

Historical Background

Built by the great Chola emperor Raja Raja Chola I in the 11th century, the Tanjore Temple stands as a testament to the ingenuity and vision of ancient Indian architects and artisans. Completed in 1010 AD, it celebrated its millennium in 2010, marking a thousand years of awe-inspiring presence.

Architectural Splendor

The Vimana

The most striking feature of the Tanjore Temple is its colossal vimana, which rises to a height of 66 meters. This towering structure is crowned with a massive dome, made from a single piece of granite weighing approximately 80 tons. This engineering marvel leaves historians and architects alike in awe, given the lack of modern machinery during its construction.

The Sanctum

At the heart of the temple lies the sanctum sanctorum, housing a massive Shiva lingam. The inner walls of the sanctum are adorned with exquisite frescoes and murals, depicting various mythological scenes and showcasing the artistic brilliance of the Chola period.

Intricate Carvings

Every inch of the Tanjore Temple is a canvas of intricate carvings. From the elaborate depictions of deities and mythological narratives on the walls to the ornate pillars and ceilings, the temple is a visual feast. These carvings not only serve as decorative elements but also provide a glimpse into the socio-cultural milieu of the Chola dynasty.

Cultural Significance

Religious Importance

The Tanjore Temple is dedicated to Lord Shiva and holds immense religious significance for Hindus. It is one of the largest temples in India and serves as a major pilgrimage site, especially during festivals like Maha Shivaratri.
Devotees from across the country flock to the temple to seek blessings and participate in the vibrant festivities.

Artistic Heritage

The temple is a treasure trove of Chola art and architecture. The frescoes and murals, in particular, offer invaluable insights into the artistic and cultural landscape of the period. The depictions of dance forms, musical instruments, and attire provide a vivid picture of the era’s cultural richness.

Visiting Tanjore Temple

Best Time to Visit

The ideal time to visit Tanjore Temple is between October and March when the weather is pleasant. The temple complex is open from early morning till evening, allowing visitors ample time to explore and soak in its magnificence.

How to Reach

Thanjavur is well-connected by road, rail, and air. The nearest airport is Tiruchirappalli International Airport, about 60 kilometers away. Thanjavur Junction is the nearest railway station, with regular trains from major cities like Chennai, Bangalore, and Coimbatore. Buses and taxis are also readily available for local transportation.

Accommodation

Thanjavur offers a range of accommodation options, from budget hotels to luxury resorts, catering to the diverse needs of travelers. Staying in the town allows visitors to explore not just the temple, but also other nearby attractions like the Thanjavur Royal Palace and the Saraswathi Mahal Library.

Conclusion

The Tanjore Temple is more than just an architectural marvel; it is a living testament to India’s rich cultural and religious heritage. Its towering vimana, intricate carvings, and historical significance make it a must-visit destination for history enthusiasts, art lovers, and spiritual seekers alike. Plan your visit to this timeless treasure and immerse yourself in the grandeur of the Chola dynasty.
[Guide] Make your own Loras, easy and free

This article helped me create my first Lora and upload it to Tensor.art. Although Tensor.art has its own Lora training, this article helps you understand how to create a Lora well.

🏭 Preamble

Even if you don't know where to start or don't have a powerful computer, I can guide you to making your first Lora and more!

In this guide we'll be using resources from my GitHub page. If you're new to Stable Diffusion I also have a full guide to generate your own images and learn useful tools.

I'm making this guide for the joy it brings me to share my hobbies and the work I put into them. I believe all information should be free for everyone, including image generation software. However I do not support you if you want to use AI to trick people, scam people, or break the law. I just do it for fun.

Also here's a page where I collect Hololive loras.

📃 What you need

An internet connection. You can even do this from your phone if you want to (as long as you can prevent the tab from closing).
Knowledge about what Loras are and how to use them.
Patience. I'll try to explain these new concepts in an easy way. Just try to read carefully, use critical thinking, and don't give up if you encounter errors.

🎴 Making a Lora

It has a reputation for being difficult. So many options and nobody explains what any of them do. Well, I've streamlined the process such that anyone can make their own Lora starting from nothing in under an hour. All while keeping some advanced settings you can use later on.

You could of course train a Lora on your own computer, provided that you have an Nvidia graphics card with 6 GB of VRAM or more. We won't be doing that in this guide though; we'll be using Google Colab, which lets you borrow Google's powerful computers and graphics cards for free for a few hours a day (some say it's 20 hours a week). You can also pay $10 to get up to 50 extra hours, but you don't have to.
We'll also be using a little bit of Google Drive storage.

This guide focuses on anime, but it also works for photorealism. However I won't help you if you want to copy real people's faces without their consent.

🎡 Types of Lora

As you may know, a Lora can be trained and used for:
A character or person
An artstyle
A pose
A piece of clothing
etc.

However there are also different types of Lora now:
LoRA: The classic, works well for most cases.
LoCon: Has more layers which learn more aspects of the training data. Very good for artstyles.
LoHa, LoKR, (IA)^3: These use novel mathematical algorithms to process the training data. I won't cover them as I don't think they're very useful.

📊 First Half: Making a Dataset

This is the longest and most important part of making a Lora. A dataset is (for us) a collection of images and their descriptions, where each pair has the same filename (e.g. "1.png" and "1.txt"), and they all have something in common which you want the AI to learn. The quality of your dataset is essential: you want your images to include at least 2 examples of each variation: poses, angles, backgrounds, clothes, etc. If all your images are face close-ups for example, your Lora will have a hard time generating full body shots (but it's still possible!), unless you add a couple of examples of those. As you add more variety, the concept will be better understood, allowing the AI to create new things that weren't in the training data. For example a character may then be generated in new poses and in different clothes. You can train a mediocre Lora with a bare minimum of 5 images, but I recommend 20 or more, and up to 1000.

As for the descriptions, for general images you want short and detailed sentences such as "full body photograph of a woman with blonde hair sitting on a chair". For anime you'll need to use booru tags (1girl, blonde hair, full body, on chair, etc.).
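
The pairing rule described above (each image has a description file with the same filename, such as "1.png" and "1.txt") is easy to check before training. Here is a small sketch in Python using only the standard library. The function name `find_unpaired` and the set of image extensions are my own assumptions, and the dataset maker colab may well do its own checks.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}  # assumed set of image types


def find_unpaired(dataset_dir):
    """Return sorted names of images that have no matching .txt description file."""
    folder = Path(dataset_dir)
    missing = []
    for path in folder.iterdir():
        if path.suffix.lower() in IMAGE_EXTENSIONS:
            # "1.png" is paired with "1.txt" in the same folder.
            if not path.with_suffix(".txt").exists():
                missing.append(path.name)
    return sorted(missing)
```

Calling `find_unpaired` on your dataset folder returns an empty list when every image has its description file.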
Let me describe how tags work in your dataset: You need to be detailed, as the Lora will reference what's going on by using the base model you use for training. If there is something in all your images that you don't include in your tags, it will become part of your Lora. This is because the Lora absorbs details that can't be described easily with words, such as faces and accessories. Thanks to this you can let those details be absorbed into an activation tag, which is a unique word or phrase that goes at the start of every text file, and which makes your Lora easy to prompt.

You may gather your images online, and describe them manually. But fortunately, you can do most of this process automatically using my new 📊 dataset maker colab.

Here are the steps:

1️⃣ Setup: This will connect to your Google Drive. Choose a simple name for your project, and a folder structure you like, then run the cell by clicking the floating play button to the left side. It will ask for permission; accept to continue the guide.

If you already have images to train with, upload them to your Google Drive's "lora_training/datasets/project_name" (old) or "Loras/project_name/dataset" (new) folder, and you may choose to skip step 2.

2️⃣ Scrape images from Gelbooru: In the case of anime, we will use the vast collection of available art to train our Lora. Gelbooru sorts images through thousands of booru tags describing everything about an image, which is also how we'll tag our images later. Follow the instructions on the colab for this step; basically, you want to request images that contain specific tags that represent your concept, character or style. When you run this cell it will show you the results and ask if you want to continue. Once you're satisfied, type yes and wait a minute for your images to download.

3️⃣ Curate your images: There are a lot of duplicate images on Gelbooru, so we'll be using the FiftyOne AI to detect them and mark them for deletion.
This will take a couple of minutes once you run this cell. They won't be deleted yet though: eventually an interactive area will appear below the cell, displaying all your images in a grid. Here you can select the ones you don't like and mark them for deletion too. Follow the instructions in the colab. It is beneficial to delete low quality or unrelated images that slipped their way in. When you're finished, send Enter in the text box above the interactive area to apply your changes.

4️⃣ Tag your images: We'll be using the WD 1.4 tagger AI to assign anime tags that describe your images, or the BLIP AI to create captions for photorealistic/other images. This takes a few minutes. I've found good results with a tagging threshold of 0.35 to 0.5. After running this cell it'll show you the most common tags in your dataset, which will be useful for the next step.

5️⃣ Curate your tags: This step for anime tags is optional, but very useful. Here you can assign the activation tag (also called trigger word) for your Lora. If you're training a style, you probably don't want any activation tag so that the Lora is always in effect. If you're training a character, I myself tend to delete (prune) common tags that are intrinsic to the character, such as body features and hair/eye color. This causes them to get absorbed by the activation tag. Pruning makes prompting with your Lora easier, but also less flexible. Some people like to prune all clothing to have a single tag that defines a character outfit; I do not recommend this, as too much pruning will affect some details. A more flexible approach is to merge tags: for example, if we have some redundant tags like "striped shirt, vertical stripes, vertical-striped shirt" we can replace all of them with just "striped shirt". You can run this step as many times as you want.

6️⃣ Ready: Your dataset is stored in your Google Drive.
You can do anything you want with it, but we'll be going straight to the second half of this tutorial to start training your Lora!

⭐ Second Half: Settings and Training

This is the tricky part. To train your Lora we'll use my ⭐ Lora trainer colab. It consists of a single cell with all the settings you need. Many of these settings don't need to be changed. However, this guide and the colab will explain what each of them does, such that you can play with them in the future.

Here are the settings:

▶️ Setup: Enter the same project name you used in the first half of the guide and it'll work automatically. Here you can also change the base model for training. There are 2 recommended default ones, but alternatively you can copy a direct download link to a custom model of your choice. Make sure to pick the same folder structure you used in the dataset maker.

▶️ Processing: Here are the settings that change how your dataset will be processed.
The resolution should stay at 512 this time, which is normal for Stable Diffusion. Increasing it makes training much slower, but it does help with finer details.
flip_aug is a trick to learn more evenly, as if you had more images, but makes the AI confuse left and right, so it's your choice.
shuffle_tags should always stay active if you use anime tags, as it makes prompting more flexible and reduces bias.
activation_tags is important, set it to 1 if you added one during the dataset part of the guide. This is also called keep_tokens.

▶️ Steps: We need to pay attention here. There are 4 variables at play: your number of images, the number of repeats, the number of epochs, and the batch size. These result in your total steps.

You can choose to set the total epochs or the total steps; we will look at some examples in a moment. Too few steps will undercook the Lora and make it useless, and too many will overcook it and distort your images. This is why we choose to save the Lora every few epochs, so we can compare and decide later.
For this reason, I recommend few repeats and many epochs.

There are many ways to train a Lora. The method I personally follow focuses on balancing the epochs, such that I can choose between 10 and 20 epochs depending on if I want a fast cook or a slow simmer (which is better for styles). Also, I have found that more images generally need more steps to stabilize. Thanks to the new min_snr_gamma option, Loras take fewer epochs to train. Here are some healthy values for you to try:

10 images × 10 repeats × 20 epochs ÷ 2 batch size = 1000 steps
20 images × 10 repeats × 10 epochs ÷ 2 batch size = 1000 steps
100 images × 3 repeats × 10 epochs ÷ 2 batch size = 1500 steps
400 images × 1 repeat × 10 epochs ÷ 2 batch size = 2000 steps
1000 images × 1 repeat × 10 epochs ÷ 3 batch size ≈ 3300 steps

▶️ Learning: The most important settings. However, you don't need to change any of these your first time. In any case:

The unet learning rate dictates how fast your Lora will absorb information. Like with steps, if it's too small the Lora won't do anything, and if it's too large the Lora will deepfry every image you generate. There's a flexible range of working values, especially since you can change the intensity of the lora in prompts. Assuming you set dim between 8 and 32 (see below), I recommend 5e-4 unet for almost all situations. If you want a slow simmer, 1e-4 or 2e-4 will be better. Note that these are in scientific notation: 1e-4 = 0.0001

The text encoder learning rate is less important, especially for styles. It helps learn tags better, but it'll still learn them without it. It is generally accepted that it should be either half or a fifth of the unet; good values include 1e-4 or 5e-5. Use Google as a calculator if you find these small values confusing.

The scheduler guides the learning rate over time. This is not critical, but still helps. I always use cosine with 3 restarts, which I personally feel keeps the Lora "fresh".
Feel free to experiment with cosine, constant, and constant with warmup. Can't go wrong with those. There's also the warmup ratio, which should help the training start efficiently, and the default of 5% works well.

▶️ Structure: Here is where you choose the type of Lora from the 2 I mentioned in the beginning. Also, dim/alpha determine the size of your Lora. Larger does not usually mean better. I personally use 16/8, which works great for characters and is only 18 MB.

▶️ Ready: Now you're ready to run this big cell which will train your Lora. It will take 5 minutes to boot up, after which it starts performing the training steps. In total it should be less than an hour, and it will put the results in your Google Drive.

🏁 Third Half: Testing

You read that right. I lied! 😈 There are 3 parts to this guide.
When you finish your Lora you still have to test it to know if it's good. Go to your Google Drive inside the /lora_training/outputs/ folder, and download everything inside your project name's folder. Each of these is a different Lora saved at a different epoch of your training. Each of them has a number like 01, 02, 03, etc.

Here's a simple workflow to find the optimal way to use your Lora:
1. Put your final Lora in your prompt with a weight of 0.7 or 1, and include some of the most common tags you saw during the tagging part of the guide. You should see a clear effect, hopefully similar to what you tried to train. Adjust your prompt until you're either satisfied or can't seem to get it any better.
2. Use the X/Y/Z plot to compare different epochs. This is a built-in feature in webui. Go to the bottom of the generation parameters and select the script. Put the Lora of the first epoch in your prompt (like "<lora:projectname-01:0.7>"), and in the script's X value write something like "-01, -02, -03", etc. Make sure the X value is in "Prompt S/R" mode.
These will perform replacements in your prompt, causing it to go through the different numbers of your Lora so you can compare their quality. You can first compare every 2nd or every 5th epoch if you want to save time. You should ideally do batches of images to compare more fairly.
3. Once you've found your favorite epoch, try to find the best weight. Do an X/Y/Z plot again, this time with an X value like ":0.5, :0.6, :0.7, :0.8, :0.9, :1". It will replace a small part of your prompt to go over different Lora weights. Again, it's better to compare in batches. You're looking for a weight that results in the best detail but without distorting the image. If you want, you can do steps 2 and 3 together as X/Y; it'll take longer but be more thorough.

If you found results you liked, congratulations! Keep testing different situations, angles, clothes, etc., to see if your Lora can be creative and do things that weren't in the training data.

source: civitai/holostrawberry
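The step arithmetic from the Steps part of the guide above can be sketched in a few lines of Python. This is only a rough estimate using the guide's own formula (images × repeats × epochs ÷ batch size). The function name total_steps is made up for illustration and is not part of the colab, and a real trainer may round once per epoch, so the number it reports can differ slightly.

```python
def total_steps(images: int, repeats: int, epochs: int, batch_size: int) -> int:
    """Estimate training steps: images x repeats x epochs / batch size, rounded down.

    Hypothetical helper for illustration only; the actual trainer may
    round differently (for example, once per epoch).
    """
    if min(images, repeats, epochs, batch_size) < 1:
        raise ValueError("all values must be at least 1")
    return images * repeats * epochs // batch_size


# The example settings listed in the guide: (images, repeats, epochs, batch size).
examples = [
    (10, 10, 20, 2),
    (20, 10, 10, 2),
    (100, 3, 10, 2),
    (400, 1, 10, 2),
    (1000, 1, 10, 3),
]

for images, repeats, epochs, batch_size in examples:
    steps = total_steps(images, repeats, epochs, batch_size)
    print(f"{images} images x {repeats} repeats x {epochs} epochs / "
          f"{batch_size} batch size = {steps} steps")
```

The last example comes out a little above the rounded figure quoted in the guide, which is fine: these values are ballpark targets, not exact requirements.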
Area Composition

Get more specific generations each time!

Have you ever heard of area composition?
Area composition is a technique where you can specify and set custom locations for every element you want to generate. To create this simple but effective workflow, all you need is:

Nodes
1. Load checkpoint: here you select your desired model.
2. Load LoRA: here you select your desired style with any LoRA (this one is optional).
3. Clip Set Last Layer: this node works as your Clip Skip (set it to -2 for better results).
4. Clip text encode: this is where your lovely prompt will be. You will need two of these, because one will work as your positive and the other as your negative.
5. Ksampler: this node is important because it is like the brain of the main process. This is where your prompt and image size get read and transformed into an image. Here you can use the sampler and scheduler you like the most (set the denoise strength to 1.0 for better results).
6. Empty latent image: as important as the Ksampler, the empty latent image node is where you decide the specific size of your initial image (it can be portrait or landscape).
7. Clip text encode: wait, again? Yes. Just like the last ones, but this node will focus on the specific element you want to generate. It is important to keep it simple and only describe the main element to represent. (You can have one node for every element you want to generate. Keep in mind that these nodes will only work as positives. For this example I will only use 2 Clip text encode nodes.)
8. MultiArea conditioning: OK, so this is the most important node of the process. Here, for the sake of explanation, I will call each of my positives a conditioning.
Conditioning 0 will be my first positive (the one I made in step 4).
Conditioning 1 and 2 will be my second and third positives (the ones I made in step 7).
It is very important to know that for each conditioning you will have to set a desired size for each element.
In this example, I set conditioning 0 to 512x718 because it is the base prompt and I want the whole canvas to represent it. For conditioning 1, which is my main character, I set it to 384x576 in the lower part of the center of the canvas. And for conditioning 2, which is the background/setting, I set it to 512x718 because I want the whole canvas to work as the background. (You may notice that for each conditioning, while setting its position, a different color will show on the MultiArea conditioning node. Keep calm, these colors are just a visual representation of the position of each element.)
Also important: as you may have figured out, this node works as a super detailed composition instruction. The MultiArea conditioning node therefore acts as your positive, so be sure to connect it as the positive in your Ksampler.
9. Upscale latent: up to this point we have only created the base image, which means it is time to upscale it. To do so, I used the Upscale latent node. It not only upscales the image to a desired size but also introduces more detail in the process.
10. Ksampler: yes, again. This second Ksampler works along with the Upscale latent node to refine details, so using the same configuration as your first one (step 5) is a good idea. (Lowering the denoise strength on this second Ksampler will help avoid drastic changes. For this example I set it to 0.5.)
11. VAE decode: the variational autoencoder (VAE) node is important because it transforms the sampled latent into your beautiful masterpiece.
12. Preview/Save image: lastly, what is left to add is the Preview/Save image node. (This one does not need an explanation, right?)

And there you go, you will now be able to generate more personalized images.
Intended image to create: cyborg girl inside abandoned building.
Do not forget to set this article as favorite if you found it useful.
Happy generations!
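To make the area sizes from the MultiArea conditioning step more concrete, here is a small Python sketch that works out where each region would sit on the 512x718 canvas. It does not call ComfyUI. The Area class and the helper functions are hypothetical names made up for illustration, and "lower part of the center" is interpreted here as horizontally centered and flush with the bottom edge, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Area:
    """A rectangular region on the canvas, in pixels (hypothetical helper)."""
    x: int
    y: int
    width: int
    height: int


def full_canvas(canvas_w: int, canvas_h: int) -> Area:
    """Region covering the whole canvas."""
    return Area(0, 0, canvas_w, canvas_h)


def bottom_center(canvas_w: int, canvas_h: int, width: int, height: int) -> Area:
    """Region centered horizontally and aligned with the bottom edge (assumed placement)."""
    if width > canvas_w or height > canvas_h:
        raise ValueError("region does not fit inside the canvas")
    return Area((canvas_w - width) // 2, canvas_h - height, width, height)


# Canvas and region sizes taken from the example in the article.
CANVAS_W, CANVAS_H = 512, 718

areas = {
    "conditioning 0 (base prompt)": full_canvas(CANVAS_W, CANVAS_H),
    "conditioning 1 (main character)": bottom_center(CANVAS_W, CANVAS_H, 384, 576),
    "conditioning 2 (background)": full_canvas(CANVAS_W, CANVAS_H),
}

for name, area in areas.items():
    print(f"{name}: x={area.x}, y={area.y}, size={area.width}x{area.height}")
```

The x and y values printed for the main character are the kind of offsets you would dial in on the node itself, and the colored preview on the node should show the same layout.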

Posts