v4.0: My best version so far. Still 8 dim, very good similarity, better picture quality than before, and I believe it is not inferior to any of the large 64/128 dim loras on this site.
This time I trained with an 8x8 batch size (using gradient accumulation) and a high lr, and I also trained the text encoder. It seems you just have to use a lower lr for the te than for the unet.
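A rough sketch of what gradient accumulation does, for anyone unfamiliar with it. I'm assuming here that "8x8" means a micro-batch of 8 with 8 accumulation steps; the toy loss and data are made up just for illustration and have nothing to do with the real trainer.

```python
# Sketch only: the split into micro-batch 8 and 8 accumulation steps is an assumption.
MICRO_BATCH = 8
ACCUM_STEPS = 8


def grad_mse(w, batch):
    """Gradient of the mean squared error of y = w * x over one batch of (x, y) pairs."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, micro_batch, accum_steps):
    """Average the gradients of several micro-batches before a single optimizer step."""
    total = 0.0
    for i in range(accum_steps):
        chunk = data[i * micro_batch:(i + 1) * micro_batch]
        total += grad_mse(w, chunk) / accum_steps
    return total


# Samples seen per optimizer step.
effective_batch = MICRO_BATCH * ACCUM_STEPS

# Toy dataset (hypothetical): y = 3x for x = 1..64.
data = [(float(x), 3.0 * x) for x in range(1, effective_batch + 1)]

g_accum = accumulated_grad(0.5, data, MICRO_BATCH, ACCUM_STEPS)
g_full = grad_mse(0.5, data)

print(effective_batch)
print(abs(g_accum - g_full) < 1e-6)
```

The accumulated gradient matches the gradient of one big batch of 64, which is why you can get a large effective batch without the VRAM for it.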
Let me know if there are any other characters you would like me to train.
v3.0: Still 8 dim. I can't really say this version is a lot better, but it's an interesting alternative: smaller dataset, more epochs. I personally believe it achieves good flexibility.
v2.0 update: Yes, I proved it. Even 8 dim is enough.
STOP UPLOADING HUGE 128 or 256 dim LORAs.
p.s. I've already replaced it with the 30 epoch ver, which is basically better than the 12 epoch one.
I trained this lora to show one thing:
You don't need a HUGE lora to recreate a real person. 16 dim is more than enough.
Too many SDXL loras on this site are way too big now. A full SDXL model is about 6-7 GB, and some loras take 1.7 GB? Outrageous.
Stop training real person loras with more than 32 dim... It's a waste of training time and disk space. A single anime character could use less than 8 dim. Save your time. Save bandwidth for this site.
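To put numbers on the size argument: a LoRA layer stores two small matrices, A (dim x in) and B (out x dim), so the parameter count grows linearly with dim. A minimal sketch, where the 1280 x 1280 layer shape is just a hypothetical example and not taken from any specific SDXL layer:

```python
def lora_params(rank, in_features, out_features):
    """Parameters added by one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


# Hypothetical layer shape, for illustration only.
IN_FEATURES, OUT_FEATURES = 1280, 1280

for rank in (8, 16, 32, 128, 256):
    print(rank, lora_params(rank, IN_FEATURES, OUT_FEATURES))

# How much bigger a dim 256 lora is than a dim 8 one on the same layers.
ratio = lora_params(256, IN_FEATURES, OUT_FEATURES) / lora_params(8, IN_FEATURES, OUT_FEATURES)
print(ratio)
```

With the same target layers and the same precision, file size follows roughly the same ratio, so going from 8 dim to 256 dim costs about 32x the disk space.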
Anyway, my lora here takes ~200 images and ~1 hour to train. 1 repeat, 42 epochs for this one. It's simple to use, just use the name. The whole thing took me no more than 2 hours, from gathering the images and captioning to training.
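If you want to estimate the number of optimizer steps for a run like this, the arithmetic is simple. This is only a sketch: the batch size is not stated above, so the value of 4 below is a placeholder assumption, and `total_steps` is a made-up helper, not part of any trainer.

```python
import math


def total_steps(num_images, repeats, epochs, batch_size):
    """Optimizer steps for a run: steps per epoch (rounded up) times epochs."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


# ~200 images, 1 repeat, 42 epochs as described above; batch size 4 is an assumption.
steps = total_steps(num_images=200, repeats=1, epochs=42, batch_size=4)
print(steps)
```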
The key is to use a proper VLM to generate captions for your dataset, like LLaVA or CogVLM, or better, GPT-4V. Use natural language because it works well on SDXL. Do not train text encoders.
For people or characters already known by the base model, DO NOT use a separate trigger token. SDXL knows who Ana de Armas is, and the same goes for Taylor Swift, Jenna Ortega... NEVER use something like "ohwx".
For those who aren't in the base model, it's good enough to use their names as the trigger token. The model can infer something from the name, like race, nationality...
Feel free to tell me what you think in the comments.
My preview images were generated with ComfyUI. You can download them and open them in ComfyUI to load my workflow.