
[AiDv2.91] Two-dimensional illustration design

Tags: Anime, Anime Character, Girl, Checkpoint, SD 1.5
Recently Updated: 2023/10/24
First Published: 2023/07/09

0 Preface

I had grown tired of traditional AI drawing models that produce repetitive faces, poses, and styles, so I wanted to break away from merged models. At first I relied on prompts alone, but I could never reproduce a subtle line, color, lighting, texture, composition, or sense of storytelling, nor could I recapture the stunning styles that models occasionally produced by accident. These fleeting stylistic variations, though only subtly different from a generic style, are aesthetically captivating. I therefore wanted a model that could faithfully learn artistic styles and output them consistently. I began collecting material to train the stylization model in November 2022, specially labeling the material that carried these subtle differences. By early 2023, the model had developed its own distinctive style, and it became the AIDv1.0 model.

Why fine-tune the full model rather than train a set of add-on weights? I have always believed that full fine-tuning yields better results: it does not merely lean on the frozen base model, and during training all training images move toward minimum error together, instead of only a small set of additional weights being optimized. That said, I am also exploring ways to blend specific styles into large models seamlessly, to reduce the training burden.

Over the following six months, I spent more than 20,000 RMB, cropping images, labeling them, and tweaking the training scripts myself. Training runs ranged from thousands to millions of steps, on hardware ranging from an RTX 3060 and an RTX 3090 to an A100. From dataset preparation to training, AID gradually grew into a complete engineering project.

In the process, I discovered that a style is learned best when the model slightly "overfits" to the noise of the original images. I deliberately let every style overfit, then used negative embeddings to absorb the overfitting noise and balance the learning progress across styles; this produced the bad, badhand, and aid series. This regularization method served me well: a properly fitted negative embedding not only leaves the base model's style intact but actually strengthens its stylistic characteristics.

As the model iterated, I believe I reached the limit of SD 1.5. Even with fine-tuning, the distinctive lines, colors, lighting, compositions, and storytelling of beautiful illustration styles are difficult to imitate with a plain SD 1.5 model. From underfitting to overfitting, I never achieved perfectly stylized features, especially when the model must optimize more than a hundred different art styles simultaneously.

Therefore, I very much look forward to more capable SDXL models providing me with a breakthrough.

During the model training, I did not focus on writing a large number of prompts or mixing different styles. Some people have achieved stunning results by combining Lora with very complex prompts, and I truly appreciate their innovation and enthusiasm.

Finally, thanks to @BananaCat for the Chinese translation of this article. I am eager to share and exchange ideas with SD enthusiasts worldwide. The AID models are all driven by professional interest. If you are interested in more details on material processing and model training, or if you want to share your training ideas with me, please feel free to leave a comment in the comment section, and I will reply as soon as possible.


I Introduction

AnimeIllustDiffusion (AID) is a pre-trained, non-commercial, multi-style anime illustration model. It does not generate "AI faces". It ships with a wide range of styles, and you can use special trigger words (see Appendix A) to generate images in specific styles. Because of its extensive content, AID requires strong negative prompts to work properly. Generic negative prompts (such as "low quality", "bad anatomy", etc.) have limited effect, so if your generated images are noisy, use the provided negative textual embeddings [1] to eliminate the noise. For the matching negative textual embedding, refer to each version's information. The preferred VAE is sd-vae-ft-mse-original [5]. Use Clip Skip = 1.


The AID model features over 200 stable anime illustration styles and 100 anime characters. Refer to Appendix A for special trigger words for generating styles. For characters, simply use the character names. The AID model is like a palette where you can create new styles by combining different trigger words.
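The "palette" idea above amounts to simple prompt composition. The helper and trigger tokens below are hypothetical placeholders of my own (real trigger words are listed in Appendix A):

```python
def build_prompt(style_triggers, character, extras=()):
    """Compose a prompt by mixing style trigger words, like blending
    colors on a palette, then appending a character name and any
    extra tags. All tokens here are placeholders, not real AID triggers."""
    return ", ".join(list(style_triggers) + [character] + list(extras))

# Hypothetical trigger words, for illustration only:
prompt = build_prompt(["styleA", "styleB"], "1girl", ["full body", "smile"])
```

In practice you would paste the resulting string into your generation UI's prompt field.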


Each version of AID has its strengths; newer versions are not necessarily better.

  • Suitable for first-time users: v2.8, v2.91 - Weak, v2.10beta1
  • Highly creative: v2.6, v2.7, v2.91 - Weak, v2.91 - Strong
  • Relatively stable: v2.5, v2.6, v2.8, v2.91 - Weak
  • Diverse styles: v2.91 - Weak, v2.91 - Strong, v2.10beta1


The cover image on this page is a compilation of all AID version cover images. Only the AIDv2.91 Weak version is uploaded on this page. If you are interested in other versions, please visit:

https://civitai.com/models/16828?modelVersionId=91090


II Advantages

AID specializes in two-dimensional character illustration. It is proficient in flat, thick, and semi-thick painting styles, with artistic lines and colors, bold and flexible compositions, and a knack for posing and dynamic movement. Its details are clean and soft, free of the 2.5D texture of merged models; the result is a style of its own, closer to hand-drawn work than to typical AI output.

The model knows many popular anime characters, so pairing style prompts with character prompts can work particularly well.


III Disadvantages

It is not adept at drawing scenes without characters, nor skilled in oil painting and watercolor styles. It requires custom negative embeddings to eliminate noise, and the strength of trigger words may not be well balanced. Its understanding of natural language is weak, and it is not compatible with most style prompts and some character prompts.


IV Disclaimer

This model is intended for testing multi-style model training; it is non-profit, non-commercial, and made purely out of interest. Any infringing content will be promptly removed.

Users are authorized only to generate images with this model; reposting it without authorization is not allowed.

Commercial use of this model is strictly prohibited.

Do not use this model to generate bloody, violent, pornographic, or otherwise infringing content! For this reason, Appendix A provides only a subset of the trained keywords.


Appendix A

Please visit the original model address to get the trigger word list:

https://civitai.com/models/16828/animeillustdiffusion

AIDv2.10 marks a significant update. It is trained on a dataset twice the size of AIDv2.91's, with higher image quality. The training resolution has increased from 768 to 1024 pixels, so you can generate images directly at 1024-pixel scale (with variable aspect ratios, e.g., 768x1532) without distortion. It also has more balanced style weights. Furthermore, AIDv2.10 supports over 200 different illustration styles (up from about 100 in AIDv2.91).
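As a side note, Stable Diffusion's latent space downsamples images by a factor of 8, so most front ends expect width and height divisible by 8. A minimal sketch (my own addition, not part of AID) for snapping arbitrary sizes to that grid:

```python
def snap(dim, base=8):
    """Round a dimension to the nearest multiple of `base`; SD front
    ends typically require width and height divisible by 8."""
    return max(base, int(round(dim / base)) * base)

def generation_size(width, height):
    """Return a (width, height) pair on the 8-pixel grid."""
    return snap(width), snap(height)
```

For example, an off-grid request is nudged to the nearest valid size while on-grid sizes pass through unchanged.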

Similarly, AIDv2.10 has its own exclusive negative textual embedding, aid210 (labeled bad17 in the cover image). Without it, you may get very ugly images. Place it in the first position of the negative prompt for the intended effect.
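That placement rule can be expressed as a tiny helper (a sketch of my own, assuming negative prompts are comma-separated tag lists) that forces the embedding token into first position:

```python
def negative_prompt(embedding_token, other_negatives):
    """Build a negative prompt with the embedding token (e.g. aid210)
    in first position, as the model card recommends, dropping any
    duplicate of the token from the remaining terms."""
    rest = [t for t in other_negatives if t != embedding_token]
    return ", ".join([embedding_token] + rest)

neg = negative_prompt("aid210", ["lowres", "bad anatomy", "aid210"])
```

The resulting string goes straight into the negative-prompt field of your generation UI.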

The model cover images are all generated purely, without using any LoRA or ControlNet, and are partially restored from high-resolution or second-generation images (generated with exactly the same parameters) for enlargement.


Version Details

  • Creator: Euge (Artist)
  • Type: Checkpoint
  • Base Model: SD 1.5
  • Format: safetensors, 5.28 GB (Verified: 2023/07/13)
  • Trigger Words: See Appendix A
  • Recommended Parameters: Sampler Euler a, CFG 10, VAE None
  • License: can be generated online; can be merged
  • Commercial Use Rules: generated images cannot be used for commercial purposes; models cannot be resold or sold as merged models
  • *Licensing scope is set by the creator; users must comply with its requirements and standards
