02 Feb 2023
Classic Negative (SD 2.1 768px v0.2)
I finally managed to train an improved version of my original Classic Negative Model for SD 2.1 768.
The improvements mostly come from better and more accurate captions, as well as a more diverse dataset. I also used some pictures generated with the original version for the training.
I attached a few comparisons between the default 2.1 model, v0.1 and v0.2, which I used to evaluate whether it actually improved. Compared to the default model, it offers vastly improved lighting, a more pleasing color palette, and better depth of field and composition. Compared to v0.1, it produces a smoother depth of field falloff and slightly more realistic images. The colors are also more in line with what I originally intended.
15 Jan 2023
- Update -
After several failed attempts, I finally managed to train a usable 2.1 version on the same dataset I used for my 1.5 Classic Negative model. I wish I could show you a more diverse set of pictures, but I'm busy creating one cute animal after the other.
- for 3:2 aspect ratio images, 1152x768px works really well
- for 21:9 aspect ratio images, 1344x576px works really well
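If you want to try other sizes, here is a small sketch for checking what aspect ratio a resolution really has. The helper names are made up for illustration, and the step of 64 is my assumption about a convenient size granularity, not something the model requires. Written as width:height, 1152x768 reduces to 3:2 and 1344x576 reduces to 7:3, which is the same as 21:9.

```python
from math import gcd


def aspect_ratio(width, height):
    """Reduce width x height to its smallest whole-number ratio."""
    d = gcd(width, height)
    return width // d, height // d


def is_multiple_of(width, height, step=64):
    """Check that both sides are multiples of `step` (64 is an assumption)."""
    return width % step == 0 and height % step == 0


if __name__ == "__main__":
    for w, h in [(1152, 768), (1344, 576)]:
        rw, rh = aspect_ratio(w, h)
        print(f"{w}x{h}px -> {rw}:{rh}, multiple of 64: {is_multiple_of(w, h)}")
```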
Make sure to place the config file in the same folder as the model, and give both files exactly the same name (only the file extension differs).
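A quick way to double-check this is a small script like the one below. It is only a sketch: I'm assuming the config uses a .yaml extension and the model is a .ckpt or .safetensors file, and the function names and the folder path are placeholders, so adjust them to your setup.

```python
from pathlib import Path


def find_config(model_path, config_suffix=".yaml"):
    """Return the config next to the model with the same base name, or None."""
    config = Path(model_path).with_suffix(config_suffix)
    return config if config.is_file() else None


def models_missing_config(folder,
                          model_suffixes=(".ckpt", ".safetensors"),
                          config_suffix=".yaml"):
    """List model files in `folder` that have no matching config file."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.suffix in model_suffixes and find_config(p, config_suffix) is None
    )


if __name__ == "__main__":
    # Hypothetical folder name, replace it with your own model folder.
    folder = Path("models/Stable-diffusion")
    if folder.is_dir():
        print(models_missing_config(folder))
```

An empty list means every model in the folder has a config with a matching name.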
13 Jan 2023
- Original Post -
I'll preface this by saying that I have no idea what I'm doing. Also, this is by no means a complete or perfect model. But after many tries I'm at a point where I'm happy with sharing some pictures and an early version for you to try out.
Classic Negative (SD 1.5)
With Classic Negative I tried to train a model with DreamBooth that closely mimics my style of photography. Its name comes from a built-in camera profile in Fujifilm cameras, "Classic Negative". I use a modified version of this profile in basically all of my photos. To mimic my style, the model must achieve the following:
- recreate the color profile of Classic Negative: muted and desaturated greens
- introduce faded blacks and diffused highlights (like a Tiffen Glimmerglass filter would do)
- reliably create a nice depth of field effect like you would get with large aperture lenses
- improve the composition of the default model (foreground and background objects, framing, point of view)
- improve the lighting of the default model
- add grain and preferably a slight vignetting
- try to recreate the look and feel of old 35mm film photos
Training
For training I used 100 of my personal images, consisting mainly of environmental portraits and photos of my dog, plus some macro and some landscape shots. The model is probably biased towards forest and garden pictures, since that's where I took the majority of my photos. It seems to be on the verge of being overfitted: in some generated pictures I could clearly make out the general structure of my backyard.
The captions were written manually for all of the photos. Nothing too complicated, here's an example: https://i.imgur.com/prf8VxS.png
I trained for 1800 steps with a learning rate of 1e-5 and 350 text encoder steps using TheLastBen's Fast DreamBooth ipynb.
Prompts & Parameters
The prompts I tried so far are very simple. The activation token is classicnegative.
- classicnegative photo of a cute raccoon sitting between bushes in a garden, purple tulip flowers
- classicnegative photo of a cute small red panda sitting on a branch in the jungle
- classicnegative photo of a white fluffy rabbit standing in a garden illuminated by fairy lights, winter, heavy snow, snowflakes
Parameters: Euler A, CFG Scale 7, 30 Steps, 860x360px
I then went seed hunting, although so far every batch of 4 had at least one usable picture. When a good picture was generated, I set the same seed and ran it again with Hires. fix enabled (which takes about 3.5 minutes per picture on my GTX 1070).
Hires. fix Parameters: ESRGAN_4x, 30 Steps, 0.3 Denoising, Upscale by 2
I discovered this by accident, but using these settings the picture stays exactly the same and all the film photo characteristics like the grain won't get lost during upscaling.
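For anyone who prefers scripting this two-pass workflow, here is a rough sketch that only builds the request settings and doesn't send anything. I'm assuming the AUTOMATIC1111 web UI and the field names of its txt2img API as I remember them (sampler_name, enable_hr, hr_upscaler, hr_second_pass_steps, hr_scale, denoising_strength), so please verify them against your own install. The function names and the seed value are made up for illustration.

```python
def txt2img_payload(prompt, seed=-1, batch_size=4):
    """First pass: seed hunting with the base parameters from this post."""
    return {
        "prompt": prompt,
        "sampler_name": "Euler a",  # assumed API name for "Euler A"
        "cfg_scale": 7,
        "steps": 30,
        "width": 860,
        "height": 360,
        "seed": seed,               # -1 is assumed to mean "random seed"
        "batch_size": batch_size,
    }


def hires_payload(prompt, seed):
    """Second pass: same prompt and seed, with Hires. fix switched on."""
    payload = txt2img_payload(prompt, seed=seed, batch_size=1)
    payload.update({
        "enable_hr": True,
        "hr_upscaler": "ESRGAN_4x",
        "hr_second_pass_steps": 30,
        "denoising_strength": 0.3,
        "hr_scale": 2,              # "Upscale by 2"
    })
    return payload


if __name__ == "__main__":
    prompt = ("classicnegative photo of a cute small red panda "
              "sitting on a branch in the jungle")
    good_seed = 1234567890  # hypothetical seed of a picture you liked
    print(txt2img_payload(prompt))
    print(hires_payload(prompt, good_seed))
```

With Upscale by 2, the 860x360px picture comes out at 1720x720px.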
If the effect of the model is too strong, try adding tokens like sharp focus, high contrast, clarity to your prompt, or just increase the contrast in post. But yes, sometimes it becomes a bit too much; I'll have to look into it for a future revision.
What's next
- more testing is needed, with different parameters and subjects
- create an SD 2.1 768px version
- fine-tuning
Please feel free to try the model out and test its limitations. If you have any advice on how I can create a better version of it, please let me know ;)