AnySomnium XL - v2

Name: AnySomnium XL - v2
Availability: InStock
Author: KetenganDiffusion

CHECKPOINT

Original

Updated: Sep 8, 2024 11:36 AM

[Proudly introducing AnySomniumXL v3, an SDXL model]

You can support me on Ko-Fi

This SDXL model with a 2D (cartoonish) style is trained on the base SDXL model (SDXL Base v1.0). The text encoder was trained as well, so the model generates the 2D style from natural-language prompts and is unlikely to produce the realistic style inherent in SDXL Base.

The model is trained on 133,000+ images curated from hundreds of thousands of images from various sources. The dataset keeps only images with an aesthetic score of at least 17 and at most 50, measured by our proprietary aesthetic scoring mechanism; the upper bound keeps the model cartoonish rather than too realistic. Images must also be free of text and watermarks, such as signatures or comic/manga pages. Any image that scores below 17 or above 50, or that contains a watermark or text, is discarded.
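The curation rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Candidate` record, its field names, and the sample data are hypothetical, and the scores are assumed to have been computed already by the proprietary scorer, which is not shown here.

```python
from dataclasses import dataclass

# Thresholds taken from the description; the scoring scale itself is proprietary.
MIN_SCORE = 17.0
MAX_SCORE = 50.0


@dataclass
class Candidate:
    """Hypothetical record for one collected image (field names are assumptions)."""
    path: str
    aesthetic_score: float
    has_text: bool = False
    has_watermark: bool = False


def keep(image: Candidate) -> bool:
    """Return True if the image passes the score range and is free of text/watermarks."""
    in_range = MIN_SCORE <= image.aesthetic_score <= MAX_SCORE
    return in_range and not (image.has_text or image.has_watermark)


def curate(images: list[Candidate]) -> list[Candidate]:
    """Keep only the images that satisfy the curation rule."""
    return [image for image in images if keep(image)]


if __name__ == "__main__":
    # Made-up sample data, purely for illustration.
    sample = [
        Candidate("a.png", 32.5),
        Candidate("b.png", 12.0),                      # score too low
        Candidate("c.png", 61.0),                      # score too high (too realistic)
        Candidate("d.png", 40.0, has_watermark=True),  # watermark
        Candidate("e.png", 17.0),                      # boundary value is kept
    ]
    print([image.path for image in curate(sample)])
```

Both bounds are treated as inclusive here, matching "at least 17 and a maximum of 50".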

AnySomniumXL v3 Technical Specifications:

Trained for 16 epochs (the AnySomniumXL release uses the epoch 16 result)
Captioned by a proprietary multimodal LLM, which we consider better than LLaVA
Trained with a bucket size of 1280x1280
Shuffle Caption: Yes
Clip Skip: 2
Trained with 2x NVIDIA A100 80GB

The dataset pipeline uses a combination of the CLIP model and the MLP scoring method by christophschuhmann, modified by us. It uses ViT-L/14 to produce an aesthetic score on a scale of -1 to 100, and we added our own watermark detection.
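As a rough picture of how a CLIP-plus-MLP scorer is structured, the sketch below takes an image embedding, normalizes it, and passes it through a small MLP head that outputs one number. Everything specific is an assumption made for illustration: the layer sizes, the ReLU activation, and the random placeholder weights are not the author's, and the embedding is faked rather than produced by a real CLIP ViT-L/14 encoder. The modified scorer and the watermark detector described above are proprietary and are not reproduced here.

```python
import math
import random

EMBED_DIM = 768  # size of a CLIP ViT-L/14 image embedding


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def linear(vec, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def aesthetic_score(embedding, layers):
    """Run a normalized embedding through an MLP head and return a scalar.

    `layers` is a list of (weights, bias) pairs. The ReLU between layers is
    an illustrative choice, not something the description specifies.
    """
    hidden = l2_normalize(embedding)
    for index, (weights, bias) in enumerate(layers):
        hidden = linear(hidden, weights, bias)
        if index < len(layers) - 1:
            hidden = [max(0.0, x) for x in hidden]
    return hidden[0]


def random_layers(sizes, seed=0):
    """Placeholder weights only; a real head would load trained weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


if __name__ == "__main__":
    rng = random.Random(1)
    fake_embedding = [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]  # stand-in for CLIP output
    head = random_layers([EMBED_DIM, 16, 1])
    print(aesthetic_score(fake_embedding, head))
```

With untrained weights the printed number is meaningless; the point is only the data flow from embedding to a single score, which a filter can then compare against thresholds.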

Achievements:

✓ Produces 2D-style images from natural language by default, without the need for excessive negative or positive prompts

✓ Likely to produce better fingers than the average Stable Diffusion model, without ADetailer or inpainting

✓ Produces a more authentic 2D style without the need for negative prompts

✓ Does not produce images with random watermarks or text

Limitations:

✓ Still slightly weak at rendering characters that hold objects, such as weapons or items, correctly

✓ Still requires broader dataset training

✓ There are still some gaps in the text encoder. There is room for improvement

✓ Text cannot be generated correctly

✓ The model is optimized for generating humans or mutated humans. Non-human subjects such as SCPs, ponies, and others may not turn out as you expect

AnySomniumXL v3 Pro tips:

Because AnySomniumXL v3 was trained at 1280x1280, the best resolution for many aspect ratios may differ from the standard SDXL model.

Best resolutions (swap the width and height to switch between landscape and portrait):

1280x1280
1472x1088
1152x1408
1536x1024
1856x832
1024x1600
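A small helper can make the list above easier to use. The sketch below stores the listed resolutions, flips one to the requested orientation, and picks the listed size closest to a target aspect ratio. The function names and the "closest ratio" selection rule are assumptions added for illustration; the list itself only gives the sizes.

```python
import math

# Resolutions copied from the list above, as (width, height).
BEST_RESOLUTIONS = [
    (1280, 1280),
    (1472, 1088),
    (1152, 1408),
    (1536, 1024),
    (1856, 832),
    (1024, 1600),
]


def oriented(width, height, landscape):
    """Return the same resolution flipped to landscape or portrait."""
    long_side, short_side = max(width, height), min(width, height)
    return (long_side, short_side) if landscape else (short_side, long_side)


def closest_resolution(target_ratio):
    """Pick the listed resolution, in either orientation, nearest to width/height = target_ratio."""
    candidates = set()
    for width, height in BEST_RESOLUTIONS:
        candidates.add((width, height))
        candidates.add((height, width))

    def distance(size):
        # Compare ratios on a log scale so 2:1 and 1:2 are equally far from 1:1.
        return abs(math.log(size[0] / size[1]) - math.log(target_ratio))

    # Sorting first makes the choice deterministic if two candidates tie.
    return min(sorted(candidates), key=distance)


if __name__ == "__main__":
    print(oriented(1152, 1408, landscape=True))
    print(closest_resolution(3 / 2))
    print(closest_resolution(2 / 3))
```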

More versions will follow, with broader datasets and a trained text encoder. Our target is to build the largest clean dataset possible for our training. We recommend using this model with the Automatic1111 webui.
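For readers who drive the Automatic1111 webui through its HTTP API (available when the webui is started with `--api`), a request could look roughly like the sketch below. The local URL, the example prompt, and the step count are assumptions. Applying Clip Skip 2 at generation time is also an assumption, carried over from the training specification above. The checkpoint is assumed to be selected in the webui already.

```python
import json
import urllib.request

# Default local webui address; adjust to your own setup (assumption).
API_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"


def build_payload(prompt, width=1280, height=1280, steps=28):
    """Build a txt2img request body using one of the recommended resolutions."""
    return {
        "prompt": prompt,
        "negative_prompt": "",
        "width": width,
        "height": height,
        "steps": steps,  # arbitrary example value, not from the model card
        # Clip Skip 2, mirroring the training setting (assumed to apply at inference).
        "override_settings": {"CLIP_stop_at_last_layers": 2},
    }


def txt2img(payload, url=API_URL):
    """POST the payload to a running webui and return the parsed JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the request body; call txt2img(payload) when a webui is running.
    payload = build_payload("a girl reading a book under a tree, 2D style")
    print(json.dumps(payload, indent=2))
```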

Version Detail

Uploaded

Dec 23, 2023 8:35 AM

Base Model

SDXL 1.0

Steps

338560

Epoch

Description

We are thrilled to announce that our SDXL model AnySomniumXL v2 is released! This is a major upgrade from the previous version.

Key features of AnySomniumXL v2:

✔ Wider training dataset. The model was trained on 33k+ curated images covering a range of scenarios and themes. The images were curated with our dataset filtering algorithm, which detects watermarks, bad images, and images that are too realistic.

✔ Better variety of shots. It can generate different angles and perspectives of the same scene using only natural language or booru tags.

✔ Better at characters holding something. It can create more natural and expressive poses, such as holding a sword, a book, a staff, a cup of coffee, and many more.

✔ Better character concepts. It can design more original and diverse characters, as seen in anime, series, and video games, without LoRA or other embeddings*). Thanks to its flexibility, you can customize their appearance or let the model surprise you.

✔ Improved natural-language input. You can now describe your scene in natural language without worrying about tags or keywords. Just let the model generate your wildest dream.

✔ Better prompt understanding and accuracy. The model now interprets prompts more accurately and generates scenes that match your intentions and expectations, thanks to the power of a multimodal LLM.

✔ Improved tears. Whether it's joy or sorrow, the model can generate realistic tears on your characters' faces, adding more emotion and depth to the images.

✔ Improved fingers. Fewer unnatural finger positions.

✔ More diversity of skin color and gender. Our mission is to generate a wider range of skin colors and genders, so we do not focus on only one gender or skin color. Waifus, husbandos, and even ponies will be trained as long as the dataset is sufficient and not wiped out by our automatic aesthetic measure algorithm.

*) Because I didn't want to overbake the characters and wanted to keep them flexible, extra prompts for clothing and perhaps eye color may be required.

This model's dataset cutoff is November 2023. Any characters or styles newer than November 2023 may be trained into AnySomniumXL v4. There are a few things we must finish before we start gathering new datasets.

Project Permissions

Use Permissions

Use in TENSOR Online
As an online training base model on TENSOR
Use without crediting me
Share merges of this model
Use different permissions on merges

Commercial Use

Sell generated contents
Use on generation services
Sell this model or merges