NOTE: I am exhausted from trying to create women's apparel modest enough not to get flagged by Civitai's filters (1cm of skin below the middle of the throat is sometimes enough to trigger them, while at other times they seem to be turned off entirely), so I have decided to post the minimum number of images possible.
It can take 3-12 hours for Amazon Rekognition to make a judgment on submitted photos, so there may in fact be no photos accompanying this post. In future I may simply use real images (flagged as such) and let users try the LoRA for themselves, rather than spend hours negotiating with the Civitai flagging system.
This is a custom-trained LoRA of American actress Annette O'Toole as she appeared in the early 1980s in films such as Superman III and Cat People. 100 training images were used, with some augmentation.
Use a LoRA strength between 0.1 and 0.15. Anything more than that and you will get absolute garbage (see 'the three-times trained method' below).
This LoRA uses a new technique first shared on Reddit in late September 2023 by the user shootthesound. Please see the above link for details, but the long and short of it is that you create two versions of the same training data, one square and one portrait (for instance, 512x512 and 512x768), and train a LoRA on each of them.
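As a rough illustration of preparing the two versions of the training data, the sketch below computes centered crop boxes for a square and a portrait copy of each source image. Center cropping is an assumption here (the original Reddit post may prepare the images differently), and the names `center_crop_box`, `plan_crops` and `RESOLUTIONS` are hypothetical helpers, not part of Kohya or any other tool.

```python
# Hypothetical helper: plan a square and a portrait crop for one source image.
# Assumption: both versions are made by center-cropping to the target aspect
# ratio, then resizing to 512x512 and 512x768 in an image tool of your choice.

RESOLUTIONS = {"square": (512, 512), "portrait": (512, 768)}


def center_crop_box(width, height, target_w, target_h):
    """Return (left, top, right, bottom) of the largest centered box
    that has the same aspect ratio as target_w x target_h."""
    target_ratio = target_w / target_h
    if width / height > target_ratio:
        # Source is wider than the target: keep full height, trim the sides.
        crop_h = height
        crop_w = round(height * target_ratio)
    else:
        # Source is taller than (or matches) the target: keep full width.
        crop_w = width
        crop_h = round(width / target_ratio)
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)


def plan_crops(width, height):
    """Map each dataset version name to its crop box for one image."""
    return {
        name: center_crop_box(width, height, tw, th)
        for name, (tw, th) in RESOLUTIONS.items()
    }


if __name__ == "__main__":
    # Example: a 1600x2000 source image.
    for name, box in plan_crops(1600, 2000).items():
        print(name, box)
```

The boxes use the (left, top, right, bottom) convention, so they can be passed to a typical image library's crop function before resizing each copy into its own training folder.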
You then pick the best trained checkpoint from each and merge them in Kohya at 100% strength each. See the original post for comments from a machine learning expert on why this massively improves the quality of the LoRA; suffice it to say that the merged LoRA has the best of both worlds.
Because they are merged at 100% each, the two LoRAs technically represent 200% strength! That is why LoRAs made with this technique generally need to be used at reduced strength, typically around 0.4 (this one needs to go lower still; see the recommended range above).
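The "200%" arithmetic can be sketched with toy numbers. This is a minimal illustration, not Kohya's merge code: it assumes a merge is a weighted sum of each LoRA's weight update (up matrix times down matrix), and it uses the idealized case where both LoRAs learned exactly the same update. Real square and portrait LoRAs will differ, so the true combined strength is only roughly double. All function names are hypothetical.

```python
# Toy illustration of why two LoRAs merged at 100% each act like ~200% strength.
# Assumption: a merge is a weighted sum of each LoRA's update (up @ down).
# Pure Python, tiny matrices; this does not reproduce any real merge tool.


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_delta(up, down, strength=1.0):
    """Weight update contributed by one LoRA at a given strength."""
    return [[strength * v for v in row] for row in matmul(up, down)]


def merge_deltas(deltas, ratios):
    """Weighted element-wise sum of several same-shaped updates."""
    rows, cols = len(deltas[0]), len(deltas[0][0])
    return [
        [sum(r * d[i][j] for d, r in zip(deltas, ratios)) for j in range(cols)]
        for i in range(rows)
    ]


def frobenius(m):
    """Overall magnitude of a matrix."""
    return sum(v * v for row in m for v in row) ** 0.5


# Rank-1 toy LoRA: a 2x1 "up" matrix and a 1x3 "down" matrix.
up = [[1.0], [2.0]]
down = [[0.5, -1.0, 0.25]]

delta_square = lora_delta(up, down)    # stand-in for the 512x512 LoRA
delta_portrait = lora_delta(up, down)  # idealized: identical update

merged = merge_deltas([delta_square, delta_portrait], [1.0, 1.0])
ratio = frobenius(merged) / frobenius(delta_square)

# Applying the merged update at 0.5 recovers a single LoRA at full strength.
halved = [[0.5 * v for v in row] for row in merged]

if __name__ == "__main__":
    print("merged / single magnitude:", ratio)
    print("merged at 0.5 equals single:", halved == delta_square)
```

Under these assumptions the merged update is twice the size of either input, which is the intuition behind running the merged LoRA well below 1.0.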
There are several benefits to this approach:
Faces are much more detailed, even when they are small in frame.
The overall image quality improves dramatically.
You can ramp up the CFG almost to the end of the scale before there is any degradation of quality, which means your prompt instructions can be followed without sacrificing quality.
The resulting LoRA is incredibly disentangled, and can adopt poses and characteristics present in the LAION dataset that do not exist anywhere in the training material that you used for the LoRA.
And there is more besides, but try it and see for yourself.
This model is very likely to produce NSFW and nude renderings unless counter-prompted.