Sunday, August 6, 2023

Unstable Diffusion

Still messing around with AI, although it looks like -- Photoshop's ad campaign to the contrary -- the tools really aren't up to automating the ugly parts of the process of making book cover art.

In a nutshell, the things that are a pain to do (like fixing bad blends, restoring textures) the AI messes up in various horrid and creative ways. And the things the AI does well (like really deep textures and complex lighting effects) are exactly the things you don't want back in the stages when you are still editing the thing.

Here's the imagined workflow: start with some stock art, change the clothing and pose. Add the right lighting and focus effects. Get everything balanced and properly composed. Now do a final clean-up and dazzle pass to fix tiny errors and add a little sparkle.

AI pretty much sucks as a partner for that workflow. At best, you can approximate it by being extremely iterative. I've spent a bit too long in the inpainting stage. Select a shirt sleeve that looks wrong. Hand-paint over the worst of the problems. Run the AI a few dozen times -- twenty seconds a run really adds up here -- until you find one that seems to be in the right direction. Bounce that render back to the workspace, lather, rinse, repeat. Also dozens of times.

At the early stages it can be rather spectacular. But you are still dealing with the tragedy of the commons and how to speak search engine; you have to put it in terms the AI understands, and show it a picture that it can recognize. If you get it right...

Say you want to add a fresh apple to a table. Make a blob of red paint. I'm serious; the most primitive pen tool in the cheapest came-with-Windows paint program is plenty. Then emphasize the prompt terms, perhaps with specifics that will lead the AI to look in the right directions. And crop the selection carefully.

And if you get it right, a fully detailed photo-realistic apple will appear.
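For what "emphasize the prompt terms" looks like in practice: the Automatic1111 webui reads "(term:1.3)" as "pay about 1.3 times the usual attention to this term." Here is a minimal sketch of building such a prompt in Python. The helper names, the 1.3 weight, and the example details are all my own placeholders, not anything the webui ships with.

```python
def emphasize(term: str, weight: float = 1.3) -> str:
    """Wrap a term in the (term:weight) attention syntax used by the webui."""
    return f"({term}:{weight:g})"


def build_prompt(subject: str, details: list[str], weight: float = 1.3) -> str:
    """Put the emphasized subject first, then the supporting details."""
    return ", ".join([emphasize(subject, weight)] + details)


# Hypothetical example values; tune the wording and weight to taste.
prompt = build_prompt("fresh red apple", ["on a wooden table", "soft window light"])
print(prompt)  # (fresh red apple:1.3), on a wooden table, soft window light
```

None of this rescues a bad selection or a missing paint blob; it only nudges the weighting toward the thing you actually asked for.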

The ways in which it may go wrong are too many to list. The table turning into an apple is just one of them. The AI is also notorious for not parsing well; "Man in white shirt on the beach" returns, ninety percent of the time, a white sand beach. With or without a man in a shirt. It is basically like talking to the terrible search engines they have these days, which I am convinced are built like those tempt-you shelves they put right in front of the checkout counter at the store. With any possible excuse they will send you towards what is popular with many people -- and profitable to them.

***

So I wanted to mess with my own model. Actually sort of got it working; threw in several photos of my own holocron and got the AI to render something that looked sort of like it when prompted.

But Dreambooth continues to bork. Here's the thing: these are largely labors of love by programmers and people in academia. Which means the current tools are largely written by programmers for an audience comfortable with at least basic Python-tinkering. The installation of Automatic1111 is all "CD to the webui directory and pip the dll from github..."

And it is all multiple programs maintained by different people. Automatic1111 is a GUI -- a webui -- front-end to what would otherwise be a Python command line to the Stable Diffusion core, which depends in turn on a dozen different pieces like CUDA and Torch to do the math, interface with the GPU for the computationally heavy work, etc. And then it can host extensions like Dreambooth, which may come with their own dependencies.
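When the stack is this many layers deep, it helps to at least know what is actually installed. A small sketch using only the Python standard library follows; the function name is mine, and it has to be run with the same Python environment the webui uses or it will report on the wrong set of packages.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


# "torch" and "xformers" are the PyPI names of the two packages named here.
for name, found in installed_versions(["torch", "xformers"]).items():
    print(f"{name}: {found or 'not installed'}")
```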

The current Automatic1111 expects xformers 0.0.18 or higher. The developer for xformers has already brought it up to 0.0.20, and in the BAT file for the Automatic1111 installation there's an automatic pip for the most recent version but, fortunately, it still works.

What doesn't work is Dreambooth, which requires a version under 0.0.18 -- which the developer no longer supports and isn't on github anymore -- but there is a 0.0.17.dev build that will work if you edit several of the py files that point to it. And figure out how to force an unsupported version of xformers that can no longer be installed directly but has to be hacked in and possibly built manually in your own environment...
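The conflict itself is easy to state in code, even if it is miserable to fix. Below is a rough sketch of the two competing checks. It assumes xformers releases are numbered in the 0.0.x style, and the thresholds and installed version are illustrative values rather than anything read from the real config files. The comparison is deliberately crude: it looks only at the leading numbers and ignores suffixes like ".dev".

```python
import re


def release_tuple(version_string):
    """Return the leading numeric part of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"no numeric version in {version_string!r}")
    return tuple(int(part) for part in match.group().split("."))


def satisfies(installed, at_least=None, below=None):
    """True if installed >= at_least and installed < below (when each is given)."""
    have = release_tuple(installed)
    if at_least is not None and have < release_tuple(at_least):
        return False
    if below is not None and have >= release_tuple(below):
        return False
    return True


installed = "0.0.20"  # illustrative value
print("webui happy:", satisfies(installed, at_least="0.0.18"))
print("Dreambooth happy:", satisfies(installed, below="0.0.18"))
```

No single version can pass both checks, which is the whole problem in two lines.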

So basically my install keeps borking and throwing up "please install xformers!" and I'm just living with it. Any day now there will be a new version along and it will break everything again in some new and exciting way.

Meanwhile, I have writing to do.



