Kirazuri (Anima) - 3.0

Author: motimalu

CHECKPOINT

Original

Updated: Jun 11, 2026 9:24 AM

Version 3.0 (Latest)

For in-depth details of version 3.0 training and tooling, see: Kirazuri (Anima) 3.0 Training Diary

Training Details Summary

Trainer: diffusion-pipe commit b0aa4f1e03169f3280c8518d37570a448420f8be

Training device: NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation Edition

Total training time: ~10 days

Total samples seen (unbatched steps): ~2,550,000

Training resolutions:

512^2
768^2
1024^2
1280^2
1536^2

Stage 1

Samples seen (unbatched steps): ~2,000,000
Training time: ~125 hrs
Learning Rate: 6e-6
Learning Rate Scheduler: Cosine
LLM Adaptor Learning Rate: 8e-7
Precision: Mixed BF16
Optimizer: AdamW8bit with Kahan Summation
Weight Decay: 0.01
Timestep Sampling Strategy: Logit-Normal
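
Why Kahan summation is paired with BF16 weights can be illustrated with a small sketch. At learning rates like the ones listed here, a single update is usually far smaller than the gap between adjacent BF16 values, so a plain BF16 add rounds it away. The code below is not diffusion-pipe's optimizer: `to_bf16`, `plain_step`, and `kahan_step` are hypothetical names, BF16 is emulated in pure Python, and the constant update of 1e-4 is a made-up value chosen only to make the effect visible.

```python
import struct

def to_bf16(x: float) -> float:
    """Round a Python float to the nearest bfloat16 value (emulated)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    # Round to nearest even on the 16 low bits that bfloat16 drops.
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFF0000
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def plain_step(weight: float, update: float) -> float:
    """Apply an update directly in bfloat16."""
    return to_bf16(weight + update)

def kahan_step(weight: float, comp: float, update: float):
    """Apply an update while carrying the rounding error in a compensation buffer."""
    comp = to_bf16(comp + update)                 # accumulate the pending update
    new_weight = to_bf16(weight + comp)           # try to apply everything pending
    comp = to_bf16(comp - (new_weight - weight))  # keep whatever rounding dropped
    return new_weight, comp

plain_w = 1.0
kahan_w, comp = 1.0, 0.0
update = 1e-4  # illustrative value, far below the bfloat16 spacing near 1.0

for _ in range(1000):
    plain_w = plain_step(plain_w, update)
    kahan_w, comp = kahan_step(kahan_w, comp, update)

# plain_w stays at exactly 1.0; kahan_w tracks the intended total of about 1.1
print(plain_w, kahan_w)
```

The compensation buffer costs one extra BF16 value per weight, which is the usual trade-off against keeping a full FP32 master copy.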

Stage 2

Samples seen (unbatched steps): ~550,000
Training time: ~118 hrs
Learning Rate: 3e-6
Learning Rate Scheduler: Cosine
LLM Adaptor Learning Rate: 0
Flux Shift: Enabled
Multi-Scale Loss Weight: 0.5
Precision: Mixed BF16
Optimizer: AdamW8bit with Kahan Summation
Weight Decay: 0.01
Timestep Sampling Strategy: Logit-Normal
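
As a rough illustration of logit-normal timestep sampling combined with a shift, here is a minimal sketch. It assumes the convention where t = 1 is pure noise, a logit-normal with mean 0 and standard deviation 1, and a fixed shift of 3.0. None of these values are stated in the source, and the function names are hypothetical rather than diffusion-pipe's API. Flux-style shifting normally derives the shift from the image resolution; a constant is used here to keep the example short.

```python
import math
import random

def sample_logit_normal(rng: random.Random, mean: float = 0.0, std: float = 1.0) -> float:
    """Draw t in (0, 1) by passing a normal sample through a sigmoid."""
    z = rng.gauss(mean, std)
    return 1.0 / (1.0 + math.exp(-z))

def shift_timestep(t: float, shift: float) -> float:
    """Push t toward 1 (more noise) when shift > 1; shift = 1 leaves t unchanged."""
    return shift * t / (1.0 + (shift - 1.0) * t)

rng = random.Random(0)
base = [sample_logit_normal(rng) for _ in range(10_000)]
shifted = [shift_timestep(t, shift=3.0) for t in base]  # 3.0 is a placeholder value

print(sum(base) / len(base), sum(shifted) / len(shifted))
```

With a shift above 1, more of the sampled timesteps land at high noise levels, which is generally the motivation for using a shift at larger resolutions.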

Additional Features

Tag Dropout: 30%, with the first 8 tags protected
Tag Shuffle: Applied to the remaining unprotected tags
Natural Language: Short and Long Caption variants
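
The tag dropout and shuffle settings above could be sketched roughly as follows. This is not the actual training code: `augment_tags` is a hypothetical helper, and it assumes that dropout is applied independently to each unprotected tag with probability 0.30 and that the shuffle happens after dropout. The source states neither detail.

```python
import random

def augment_tags(tags, rng, protected=8, dropout=0.30):
    """Keep the first `protected` tags as-is; drop and shuffle the rest."""
    head = list(tags[:protected])
    # Each unprotected tag survives with probability 1 - dropout (assumption).
    tail = [tag for tag in tags[protected:] if rng.random() >= dropout]
    rng.shuffle(tail)
    return head + tail

rng = random.Random(0)
tags = [f"tag{i}" for i in range(20)]  # placeholder tags for illustration
augmented = augment_tags(tags, rng)
print(", ".join(augmented))
```

Protecting the leading tags keeps whatever the captioning scheme places first in a stable position, while the rest of the caption varies between samples.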

Changes from Kirazuri (Anima) v2.0

Dataset adds 7,071 newly curated images, increasing the total from 35,537 to 42,608 images
Dataset cutoff is now 2026/05/12
Trained at 5 resolutions across two stages:
- Stage 1 - 512^2, 768^2, 1024^2
- Stage 2 - 1024^2, 1280^2, 1536^2
Introduced a cosine learning rate scheduler for a smooth learning rate transition between training stages
Re-captioned the full dataset to add a second natural language caption variant, using an updated captioning script
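
For readers unfamiliar with cosine scheduling, a minimal sketch of the decay curve follows. The floor value (`min_lr`) and the absence of warmup are assumptions, since the source states only the learning rate of each stage and the scheduler type. `cosine_lr` is a hypothetical helper, not a diffusion-pipe function, and the example uses Stage 1's figures with Stage 2's rate as the floor purely for illustration.

```python
import math

def cosine_lr(samples_seen, total_samples, peak_lr, min_lr=0.0):
    """Cosine decay from peak_lr at the start to min_lr at the end."""
    progress = min(max(samples_seen / total_samples, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Illustrative only: Stage 1 peak of 6e-6 decaying toward an assumed floor of 3e-6.
total = 2_000_000
for seen in (0, 500_000, 1_000_000, 1_500_000, 2_000_000):
    print(seen, cosine_lr(seen, total, peak_lr=6e-6, min_lr=3e-6))
```

Because the curve flattens out near the end, the rate changes slowly as one stage finishes, which avoids an abrupt step when the next stage begins.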

Recognitions

Thanks to Circlestone Labs for the Anima Preview base model.
Thanks to tdrussell of Circlestone Labs for the diffusion-pipe trainer.
Thanks to bluvoll for support using their fork of diffusion-pipe.
Thanks to narugo1992 and the deepghs team for open-sourcing various training sets, image processing tools, and models.

License

This model is released under the same license as the base model.

See the base model for details of the CircleStone Labs Non-Commercial License.

Version Detail

Uploaded

Jun 11, 2026 9:23 AM

Base Model

Anima

Trigger Words

Description

Kirazuri (Anima) 3.0 Training Diary: https://github.com/motimalu/diffusion-training-configs/blob/main/diffusion-pipe/anima/notes/kirazuri3.0-notes.md

Project Permissions

Use Permissions

Use in TENSOR Online
As an online training base model on TENSOR
Use without crediting me
Share merges of this model
Use different permissions on merges

Commercial Use

Sell generated contents
Use on generation services
Sell this model or merges