Reuploaded from Hugging Face to Civitai for your enjoyment.
WD 1.5 Beta 3 is fine-tuned directly from stable-diffusion-2-1 (768), using v-prediction and variable aspect bucketing (maximum pixel area of 896x896) with real-life and anime images. Given the broad range of concepts encompassed in WD 1.5, we expect it to serve as an ideal candidate for further fine-tuning, LoRAs, and other embedding applications. - [Notion.site]
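To get a feel for what "variable aspect bucketing (maximum pixel area of 896x896)" means, here's a minimal sketch of how bucket resolutions could be enumerated: every (width, height) pair in steps of 64 whose area stays at or under 896x896. The step size and dimension range are illustrative assumptions, not WD's actual training code.

```python
# Illustrative aspect-bucket enumeration (assumed parameters, not WD's code).
MAX_AREA = 896 * 896  # maximum pixel area per the model card
STEP = 64             # latent-friendly multiple (assumption)

def make_buckets(min_dim=512, max_dim=1280):
    """Return all (w, h) pairs in STEP increments under the area budget."""
    buckets = []
    for w in range(min_dim, max_dim + 1, STEP):
        for h in range(min_dim, max_dim + 1, STEP):
            if w * h <= MAX_AREA:
                buckets.append((w, h))
    return buckets

buckets = make_buckets()
print((896, 896) in buckets)  # True  -- the square bucket fits exactly
print((960, 896) in buckets)  # False -- over the area budget
```

During training each image is resized into the closest-aspect bucket, which is why the model is comfortable with non-square resolutions.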
Model is good. Think of it like NAI when it first came out. It's a good way to kickstart a lot of finetuning right? Well you can just do that with WD 1.5 B3. - KaraKaraWitch
To be uploaded.
Download the 3 files.
Same deal as how you install SD 2.1.
Use the magic sauce VAE.
If you can't do that, well, uhh... try Googling it and figuring it out? I think this could help.
Use the following "mastering" prompts for improved looks:
Positive Prompt:
(exceptional, best aesthetic, new, newest, best quality, masterpiece, extremely detailed, anime, waifu:1.2)
Negative Prompt:
lowres, ((bad anatomy)), ((bad hands)), missing finger, extra digits, fewer digits, blurry, ((mutated hands and fingers)), (poorly drawn face), ((mutation)), ((deformed face)), (ugly), ((bad proportions)), ((extra limbs)), extra face, (double head), (extra head), ((extra feet)), monster, logo, cropped, worst quality, jpeg, humpbacked, long body, long neck, ((jpeg artifacts)), deleted, old, oldest, ((censored)), ((bad aesthetic)), (mosaic censoring, bar censor, blur censor)
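If you run the model through diffusers instead of a webui, the mastering prompts are just strings you pass along with your own prompt. A minimal sketch, assuming a Hugging Face repo id of `waifu-diffusion/wd-1-5-beta3` (an assumption — use wherever you actually got the checkpoint). Note that the `(tag:1.2)` weighting syntax is a webui convention; vanilla diffusers treats those parentheses literally.

```python
# The "mastering" prompts from the model card as reusable constants.
POSITIVE = ("(exceptional, best aesthetic, new, newest, best quality, "
            "masterpiece, extremely detailed, anime, waifu:1.2)")
NEGATIVE = ("lowres, ((bad anatomy)), ((bad hands)), missing finger, "
            "extra digits, fewer digits, blurry, ...")  # full list above

def generate(prompt):
    """Sketch of a generation call; heavy imports kept inside the function."""
    import torch
    from diffusers import StableDiffusionPipeline
    pipe = StableDiffusionPipeline.from_pretrained(
        "waifu-diffusion/wd-1-5-beta3",  # assumed repo id
        torch_dtype=torch.float16,
    ).to("cuda")
    return pipe(prompt + ", " + POSITIVE, negative_prompt=NEGATIVE).images[0]
```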
Model can do the following:
- Realistic: add (realistic, real life:1.2) to the positive prompt.
- Horny (The typical stuff, probably better with finetunes lol)
- Whatever you want to tune it with lol.
- Tuning is rather easy too. LoRA works (Kohya ones) and so does LyCORIS (tested LoCon and it works, sooo yeah).
Text encoder training has been fixed, so the TE is actually trained now; give it a try if you're coming from Beta 2.
It's... Complicated.
TLDR: Just follow the Fair AI Public License 1.0-SD (https://freedevproject.org/faipl-1.0-sd/). If any derivative of this model is made, please share your changes accordingly. Special thanks to ronsor/undeleted (https://undeleted.ronsor.com/) for help with the license.
It does kinda go against the spirit of civitai but uhhh whatever lol.
1. Use BLIP/BLIP2 and WD Tagger to give every image natural-language captions and booru tags.
2. Apply a date gradient.
3. Bucket aesthetic scores into Exceptional, Best, Normal & Bad.
4. Add stars to booru images & bucket 'em (Masterpiece, Best, High, Medium, Normal, Low & Worst).
5. Train
6. ???
7. Profit.
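Steps 2–4 above boil down to mapping per-image metadata onto caption tags. A minimal sketch of that mapping — all thresholds and years here are illustrative guesses, not the actual WD 1.5 pipeline values:

```python
# Illustrative tag-mapping for the dataset recipe (assumed thresholds).

def aesthetic_tag(score):
    """Step 3: bucket a 0-1 aesthetic score into the four buckets."""
    if score >= 0.9:
        return "exceptional"
    if score >= 0.7:
        return "best aesthetic"
    if score >= 0.4:
        return "normal aesthetic"
    return "bad aesthetic"

def quality_tag(fav_count):
    """Step 4: map a booru favourite count onto the seven quality stars."""
    for threshold, tag in [(1000, "masterpiece"), (500, "best quality"),
                           (250, "high quality"), (100, "medium quality"),
                           (25, "normal quality"), (5, "low quality")]:
        if fav_count >= threshold:
            return tag
    return "worst quality"

def date_tag(year):
    """Step 2: the date gradient -- newer images get 'new'/'newest' tags."""
    if year >= 2022:
        return "newest"
    if year >= 2019:
        return "new"
    if year >= 2012:
        return "old"
    return "oldest"

print(quality_tag(1200), aesthetic_tag(0.95), date_tag(2023))
# masterpiece exceptional newest
```

This is also why tags like "new, newest" and "bad aesthetic" work in the mastering prompts: they were baked into captions at training time.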
KaraKaraWitch here. Okay, so here are some of my comments from initial trial runs with WD 1.5 B3, and some common pitfalls.
1. Use the VAE provided. Do not use the model's built-in VAE.
2. Enable --v2 and --v_parameterization
3. Train as per usual
"Wait what that's it?!"
Yes. Do note, however, that your final loss should hover around 0.3. Anything lower (like 0.29) might indicate overfitting.
"Amongus sus"
I meannn I only did a couple of styles and it did work out like that soo...
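Putting the steps above together, a kohya-ss sd-scripts LoRA run might look like this. The paths and most hyperparameters are placeholders; the two flags that actually matter for this model are `--v2` and `--v_parameterization`:

```shell
# Sketch of a kohya sd-scripts LoRA run against WD 1.5 B3.
# All /path/to/... values are placeholders.
accelerate launch train_network.py \
  --pretrained_model_name_or_path=/path/to/wd15-beta3.safetensors \
  --v2 \
  --v_parameterization \
  --network_module=networks.lora \
  --train_data_dir=/path/to/dataset \
  --output_dir=/path/to/output \
  --resolution=896,896 \
  --save_model_as=safetensors
```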
According to the devs, there is no perceivable difference in quality between fp16 and fp32. (Unless you use memory optimizations like xformers; those will cause bigger issues than saving at fp32 would.)
See when salt uploads the thingy to HF lol
Like I said in the beginning:
> Think of it like NAI when it first came out. It's a good way to kickstart a lot of finetuning right? Well you can just do that with WD 1.5 B3.
It is recommended and encouraged to finetune and/or locon/lora it!