Sunday, October 12, 2025

Blue

Another plot bunny visited. This one is blue.

 

(From fanpop.com)

The first novel I finished grew out of an article I had been writing for a gaming magazine. This one was a little more straightforward; I was thinking about story concepts as a way to grasp the cinematographic challenges of AI.

In that mix are thoughts I've had about wanting to do one of those Hornblower-esque career space navy things, and about wanting to do an engineer (who is more of a hacker), and thoughts about how to help Penny survive those situations where being Mistress of Waif-Fu would really, really come in handy.

And I ended up with a nice demonstration of how one idea can snowball into elaborate world-building...if you follow the potential implications. Start with a "heuristic implant." We're not going to get into the specifics of the tech here. Practically speaking, the young would-be soldier gets sat under a helmet for a few hours, and when they get up...


Except this one is experimental, very think-tank, with everything that McNamara-style thinking promises. What they get up with is a whole set of combat skills laid down as deep-level muscle memory, firing basically automatically under the right stimuli. That right there is a whole host of problems. Bad enough your hands need to be licensed as deadly weapons -- now they are self-driving.

So that's a great character flaw, this killer instinct that could fire at the wrong moment, but at the same time, something that could help them through a sticky situation. Obviously (obviously, that is, when looked at from the needs of story, not through empty speculation on fantasy technology), they start meditating to at least control when it happens. And their relationship with this phantom driver...evolves.

(Yeah...Diadem from the Stars, The Stars My Destination, The Last Airbender -- not like this is exactly new ground.)

And that suggests an evolving situation, where a new kind of low-intensity warfare is challenging a military that is organized entirely around capital-ship combat. Of course that's a tired truism: "always fighting the last war."

But that leads to wondering if this is really a navy at all. Or more like the United Fruit Company: a massive corporate mercantile thing that works by rote and training and regulation, has largely been getting by on a huge industrial base and leading-edge tech, but has expanded into a sector of space where the rules are a little different.

And now we've got multiple parties in the mix: old-school frontier traders who have the wisdom of experience, a cadre of experienced officers who want to create an actual military with a good esprit de corps (not the same thing as free donut day at the office and "work smarter" posters in the cubicles), the friction between what is becoming an actual navy versus what is more like a merchant marine, the fresh-from-university theorists who pay far too much attention to how management thinks the world works (or should) and want quick-fix technological solutions over expensive training...

...and Ensign Blue in the middle of it, a trial run of one of the crazier outliers of the "super-soldier" package that various hard-liners have convinced themselves is the best way to win low-intensity conflict in an extremely politicized environment, a ship's engineer for a merchant ship who has no business at all getting hauled off to do dangerous missions on contested planets.

Oh, yeah. And the engineer? Blue is the kind of engineer I've been seeing a lot of recently. Can (and often will) science the hell out of something, making the most amazing calculations. Then can't resist trying out an idea with ham-handed duct tape and rat's-nest wiring that too often breaks (and sometimes catches fire).

Where does this go in character and career progression? Is there some third party, some out-of-context threat lurking behind what are still seen as basically raiding parties on the company's mining outposts? Are the rules about to go through another paradigm shift, tipping everyone from new enemies to unexpected allies into a brutal war?

Yeah, I got other books to write.

Tuesday, October 7, 2025

Anaconda

My go-to ComfyUI workflow now has more spaghetti than my most recent factory.


(Not mine; some guy on Reddit.)

The VRAM crunch for long videos seems to rest primarily in the KSampler. There's an s2v workflow in the templates of a standard ComfyUI install that uses a tricky little module that picks up the latent and renders another chunk of video, with all the chunks stitched together at the end. With that module, the major VRAM crunch is the size of the image rather than the length of the video.
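
The stitching step at the end is nothing exotic. A minimal sketch, assuming the chunks have already come out as separate files (hypothetical names) and that imageio with its ffmpeg plugin is installed:

```python
import imageio

# hypothetical chunk files produced by the extend-video module
chunk_files = ["chunk_00.mp4", "chunk_01.mp4", "chunk_02.mp4"]

frames = []
for path in chunk_files:
    # mimread pulls every frame of the clip into memory as ndarrays
    frames.extend(imageio.mimread(path, memtest=False))

# write the concatenated frames back out at the project frame rate
imageio.mimwrite("stitched.mp4", frames, fps=16)
```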

Of course there's still the decoherence issue. I've been running 40-second tests to see how badly the image decomposes over that many frames. Also found the quality is acceptable rendering at 720 and upscaling to 1024 via a simple frame-by-frame Lanczos upscaler (nothing AI about it). And I'm rather proud I figured that out all by myself. At 16 fps and with Steps set down at 4, I can get a second of video for every minute the floor heater is running.
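
That upscaler is about ten lines of Pillow. A sketch, assuming the render was saved out as numbered PNG frames (the folder names are made up):

```python
from pathlib import Path
from PIL import Image

SRC = Path("frames_720")   # hypothetical folder of rendered frames
DST = Path("frames_1024")
DST.mkdir(exist_ok=True)

for frame in sorted(SRC.glob("*.png")):
    img = Image.open(frame)
    scale = 1024 / max(img.size)  # bring the long edge up to 1024
    new_size = (round(img.width * scale), round(img.height * scale))
    img.resize(new_size, Image.LANCZOS).save(DST / frame.name)
```

After that, the upscaled frames get reassembled into a clip the same way the chunks were stitched above.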

Scripting is still a big unknown. I've been experimenting with the s2v (sound to video) and as usual there are surprises. AI, after all, is an exercise in probabilities. "These things are often found with those things." It is, below the layers of agents and control nets and weighting, a next-word autocomplete.

Which means it can seem to have an uncanny ability to extract emotional and semantic meaning from speech. It is strictly associational: videos in the training material tended to show a person pointing when the vocal patterns of "look over there" occurred. More emergence. Cat logic, even.



So anyhow, I broke Automatic1111. Sure, it had a venv folder, but somehow PATH got pointed in the wrong direction. Fortunately I was able to delete Python and do a clean install of 3.10.9 inside the SD folder; Automatic1111 came back up and ComfyUI was still safe in its own sandbox. And now to try to install Kohya.


Experimenting with the tech has led to thinking about shots, and that in turn has circled back to the same thing I identified earlier, a thing that becomes particularly visible when talking about AI.

We all have an urge to create. And we all have our desires and internal landscapes that, when given the chance, will attempt to shape the work. Well, okay, writing forums talk about the person who wants to have written a book, the book itself being of no import, just as the nature of the film they starred in has nothing to do with the desire to be a famous actor. It is the fame and fortune that is the object.

In any case, the difference between the stereotype of push-button art (paint by numeric control) and the application of actual skills that took time and effort to learn is, in relation to the process of creation itself, just a matter of how granular you are getting about it.

Music has long had its chance operations and aleatoric compositions. Some artists throw paint at a canvas. And some people hire or collaborate. Is a composer not a composer if they hire an arranger?

That said, I feel that in video, the approach taken by many in AI is getting in the way of achieving a meaningful goal. As it exists right now, AI video is poorly scriptable, and its cinematography -- the choice of shots and cutting in order to tell the story -- is lacking. This, as with all things AI, will change.

But right now a lot of people getting into AI are crowding the subreddits asking how to generate longer videos.

I'm sorry, but wrong approach. In today's cinematography, 15 seconds is considered a long shot. Many movies are cut at a faster tempo than that. Now, there is the issue of coverage...but I'll get there. In any case, this is just another side of the AI approach that wants nothing more than to press buttons. In fact, it isn't even the time, effort, or artistic skills or tools that are being avoided. It is the burden of creativity. People are using AI to create the prompts to create AI images. And not just sometimes; there are workflows designed to automate this terribly challenging chore of getting ChatGPT to spit out a string of words that can be plugged into ComfyUI.

Art and purposes change. New forms arise. A sonnet is not a haiku. There is an argument for recognizing, as a form in its own right, the short AI video that stitches together semi-related clips in a montage style.

But even here, the AI is going to do poorly at generating it all in one go. It will do better if each shot is rendered separately, and something (a human editor, even!) splices the shots together. And, especially if the target is TikTok or the equivalent, the individual shots are rarely going to be more than five seconds in length.
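
The splicing itself is the easy part; ffmpeg's concat demuxer will do it without re-encoding, as in this sketch (file names are hypothetical, and it assumes the clips share codec and resolution, which they will if they came out of the same workflow):

```python
import subprocess

shots = ["shot_01.mp4", "shot_02.mp4", "shot_03.mp4"]  # hypothetical clip names

# the concat demuxer wants a small text manifest listing the clips in order
with open("shots.txt", "w") as f:
    f.writelines(f"file '{name}'\n" for name in shots)

subprocess.run(
    ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
     "-i", "shots.txt", "-c", "copy", "cut_together.mp4"],
    check=True,
)
```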


Cutting to develop a story, using language similar to modern filmic language, is a different beast entirely. The challenge I'm thinking a lot about now is consistency. Consistency of character, consistency of set. There are also challenges in matching camera motions and angles if you want to apply the language correctly. For that shot-reverse-shot of which the OTS is often part, you have to obey the 180 rule or the results become confusing.

One basic approach is image to video. With i2v, every shot has the same starting point, although they diverge from there. As a specific example, imagine a render of a car driving off. In one render, the removal of the car reveals a fire hydrant. In the second render from the same start point, a mailbox. The AI rolled the dice each time because that part of the background wasn't in the original reference.

One weird problem as well. In editing, various kinds of buffer shots are inserted to hide the cuts made to the master shot. The interview subject coughed. If you just cut, there'd be a stutter in the film. So cut to the interviewer nodding as if listening (those are usually filmed at a different time, and without the subject at all!) Then cut back.

In the case of an i2v workflow, a cutaway done like this would create a strange déjà vu; after the cut, the main shot seems to have reset in time.

So this might actually be an argument for a longer clip, but not one used as the final output; rather, a master shot to be cut into for story beats.

Only we run into another problem. It is poorly scriptable at present. In the workflows I am currently using, there's essentially one idea per clip. So a simple idea such as "he sees the gun and starts talking rapidly" doesn't work with this process.

What you need is to create two clips with different prompts. And you need to steal the last frame from the first clip and use it as the starting image of the second clip. Only this too has problems; the degradation over the length of a clip means that even if you add a node in the workflow to automatically save the target frame, it will need to be cleaned up, corrected back to being on-model, and have the resolution increased back to the original.
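
The frame-stealing part is easy enough to do by hand outside the workflow. A sketch with OpenCV (my choice here, not necessarily what any given ComfyUI node does under the hood):

```python
import cv2

cap = cv2.VideoCapture("clip_01.mp4")                     # hypothetical first clip
last_index = int(cap.get(cv2.CAP_PROP_FRAME_COUNT)) - 1
cap.set(cv2.CAP_PROP_POS_FRAMES, last_index)              # jump to the final frame
ok, frame = cap.read()
cap.release()

if ok:
    # this still needs the cleanup / upscale pass before it can serve
    # as the starting image of the second clip
    cv2.imwrite("clip_02_start.png", frame)
```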

And, yes, I've seen a workflow that automates all of that, right down to a preset noise setting in the AI model that regenerates a fresh and higher-resolution image.

My, what a tangled web we weave.

Monday, October 6, 2025

Cryptic Triptych

I got the PC I built up and running, after the usual 22H2 hassle (tip: don't use the internal updater. Run the web installer from Microsoft. For as long as that lasts!)

ComfyUI is sandboxed (and a one-click install) and Automatic1111, though now an abandoned project, also installs a venv folder within the stable_diffusion folder, meaning it can run on Python 3.10.6. Now trying to get Kohya running, and learning venv so I can get that onto 3.10.11 or higher...without breaking everything else.
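
The principle, at least, is simple: one venv per tool, each pinned to the interpreter it wants, and nothing ever touches the system Python. A minimal sketch, assuming the Windows py launcher and folder names that may not match mine:

```python
import subprocess

# one venv per tool; the folder names here are illustrative
TOOLS = {
    r"stable-diffusion-webui\venv": "3.10",  # Automatic1111 wants 3.10.x
    r"kohya_ss\venv": "3.10",
}

for venv_path, version in TOOLS.items():
    # the py launcher picks the requested interpreter version
    subprocess.run(["py", f"-{version}", "-m", "venv", venv_path], check=True)
```

After that, each tool gets launched through its own venv\Scripts\python.exe, and a broken PATH can't take the whole kit down with it.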

I still like the primitive but functional GUI of Automatic1111 for stills. But ComfyUI opens up video as well. Motion.

And that got me thinking about linear narrative.

There does exist a form called "non-linear narrative." But that refers to the relationship between the narrative and some other chronology. The latter may be shifted around. A writer can at any point refer to a different time, including such techniques as the flashback and flash-forward. But the narrative itself remains linear. One reads one word at a time.

(Arguably, from our understanding of the process of reading, we parse chunks of text, and thus multiple words may be included in what is experienced as a single unit of extracted meaning.)

This means it is extremely difficult to capture the near-simultaneous flow of information that a real person not reading an account in a book would experience. In our old gaming circles, the joke was the monster quietly waiting until the room description was finished. It is a basic problem in writing; you can't tell it all at the same time. And the order you choose influences the relative weight given.

Again, arguably, our attention can't be split too many ways. In most cases, the realization you had while you were in the middle of doing something else arrives as a discrete event. You may have heard the voice behind you, and been processing it, but the moment of understanding that cry of "Stop!" can be treated narratively at the moment it becomes the focus of attention. And the observations that led to that moment of realization get back-filled at that time, as they, too, rise to the top of the consciousness.

Or in another way of putting it, a narrative is an alias of the stream of consciousness and the order of presentation can be taken as the order of items brought into focus.

This idea of the sequential scroll of attention has been used in artwork. We normally absorb a piece of art by moving from one focus to another (in a matrix of probable interest including size, color, position, human faces, etc.) The artist can construct a narrative through this shifting of focus.


This one sneaks up in stages. The first impression is very calming. The next impressions are not. Especially in some periods, there could be subtler and subtler clues and symbols that you don't notice until you've been looking for a while.


Or there are works, from the triptych to the Bayeux Tapestry, that arrange distinct framed panels in sequential order.

Motion controls this flow of narrative more tightly. Not to say there can't be the same slow realizations. But it means thinking sequentially.


In comic book terminology, the words "Closure" and "Encapsulation" are used to describe the concepts I've been talking about. "Closure" is the mental act of bringing together information that had been presented over a sequence of panels in order to extract the idea of a single thing or event. "Encapsulation" is a single panel that is both the highlight of and a reference pointer or stand-in for that event.

In text, narrative, especially immersive narrative that is keyed to a strong POV or, worse, a first-person POV, has a bias towards moving chronologically. Especially in first-person, this will lead the unwary writer into documenting every moment from waking to sleep (which is why I call it "Day-Planner Syndrome.")


I've been more and more conscious of the advantages and drawbacks of jumping into a scene at a more interesting point and rapidly back-filling (tell not show) the context of what that moment came out of. I don't like these little loops and how they disturb the illusion of a continuous consciousness that the reader is merely eavesdropping on as they go about their day, but I like even less spending pages on every breakfast.

And speaking of time. The best way to experience the passage of time is to have time pass. That is, if you want the reader to feel that long drive through the desert, you have to make them spend some time reading it. There's really no shortcut.

I decided for The Early Fox I wanted to present Penny as more of a blank slate, and to keep the focus within New Mexico. So no talking about her past experiences, comparisons to other places she's been, comparison or discussion of the histories of other places, technical discussions that bring in questions of where Penny learned geology or Latin or whatever, or quite so many pop-culture references.

And that means I am seriously running out of ways I can describe yucca.


In any case, spent a chunk of the weekend doing test runs with WAN2.2 and 2.1 on the subject of "will it move?" Which is basically the process of interrogating an AI model to see what it understands and what form the answer will take.

My first test on any new install is the prompt "bird." Just the one word. Across a number of checkpoints the result is a bird on the ground, usually grass. A strange and yet almost specific and describable bird; it is sort of a combination of bluebird and puffin with a large hooked beak, black/white mask, blue plumage and yellow chicken legs.
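
For what it's worth, that one-word smoke test doesn't even need the GUI. A sketch against the Automatic1111 web UI API (this assumes the UI was launched with the --api flag and is sitting on its default port):

```python
import base64
import requests

payload = {"prompt": "bird", "steps": 20, "width": 512, "height": 512}
r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=600)
r.raise_for_status()

# the API returns base64-encoded PNGs
with open("bird_test.png", "wb") as f:
    f.write(base64.b64decode(r.json()["images"][0]))
```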

In investigating motion in video, I discovered there are two major things going on under the hood. The first is that when you get out of the mainstream ("person talking") and into a more specific motion ("person climbing a cliff") you run into the paucity of training data problem. When there is a variety of data, the AI can synthesize something that appears original. When the selection is too small, the AI recaps that bit of data in a way that becomes recognizable. Oh, that climbing move where he steps up with his left foot, then nods his head twice.

The other is subject-background detection. AI video works now (more-or-less) because of subject consistency. The person walking remains in the same clothing from the first frame to the last. It does interpolate, creating its own synthesized 3d version, but it can be thought of as, basically, detaching the subject then sliding it around on the background.

We've re-invented Flash.


Now, because the AI is detaching then interpolating, and the interpolation makes use of the training data of what the back of a coat or the rest of a shoe looks like (and, for video models, moves like), it does have the ability to animate things like hair appropriately when that subject is in motion. But AI is pretty good at not recognizing stuff, too. In this case, it takes the details it doesn't quite understand and basically turns them into a game skin.

Whether this is something the programmers intended, or an emergent behavior in which the AI is discovering ways of approximating reality similar to what game creators have been doing, the subject becomes basically a surface mesh that gets the large-scale movements right but can reveal that things like the pauldrons on a suit of armor are just surface details, parts of the "mesh."

It can help to think of AI animation as Flash in 3D. The identified subjects move around a background, with both given consistency from frame to frame. And think of the subject, whether it is a cat or a planet, as a single object that can be folded and stretched with the surface details more-or-less following.

But back to that consistency thing. For various reasons, video renders are limited to the low hundreds of frames (the default starter, depending on model, is 33 to 77 frames). And each render is a fresh roll of the dice. 
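
The "roll of the dice" is quite literal: each render starts from fresh latent noise, and everything downstream follows from it. A toy illustration (the latent shape here is just the usual example for a 512x512 Stable Diffusion image, not anything WAN-specific):

```python
import torch

def initial_latent(seed, shape=(1, 4, 64, 64)):
    # the starting noise every diffusion run denoises from
    g = torch.Generator().manual_seed(seed)
    return torch.randn(shape, generator=g)

a = initial_latent(42)
b = initial_latent(42)    # same seed -> identical starting noise
c = initial_latent(1234)  # new seed -> a different "interpretation"
print(torch.equal(a, b), torch.equal(a, c))  # True False
```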

It is a strange paradox, possibly unavoidable in the way we are currently doing this thing we call "AI." In order to have something with the appearance of novelty, it has to fold in the larger bulk of training data. In order to have consistency, it has to ignore most of that data. And since we've decided to interrogate the black box of the engine with a text prompt, we are basically left with "make me a bird" and the engine spitting out a fresh interpretation every time.

That plays hell on making an actual narrative. Replace the comic-book panel with the film-terminology "shot," have that "Closure" built on things developed over multiple shots, and you are confronted with the problem that the actors and setting are based on concepts, not on a stable model that exists outside the world of an individual render. If you construct "Bird walking," "Bird flies off," and "Bird in the sky" with each render interpreting the conceptual idea of "Bird" in a different way, it is going to be a harder story to understand.


That is going to change. There are going to be character turn-arounds or virtual set building soon enough. As I understand it, though, the necessary randomness means the paradox is baked into the process. No matter what the model or template, it is treated the same as a prompt or a LoRA or any other weighting; as a suggestion. One that gets interpreted in the light of what that roll of the dice spat out that run.

And that's why the majority of those AI videos currently clogging YouTube go for conceptual snippets arranged in a narrative order, not a tight sequence of shots in close chronological time. You can easily prompt the AI to render the hero walking into a spaceport, and the hero piloting his spacecraft...now wearing a spacesuit and with a visibly different haircut.

For now, the best work-around appears to be using the "I2V" subset. That generates a video from an image reference. The downside is that anything that isn't in the image -- the back of the head, say -- is interpolated, and thus will be different in every render. It also requires creating starter images that are themselves on-model.

A related trick is pulling the last frame of the first render and using that as the starter image for a second render. The problem this runs into is the Xerox Effect; the same problem that is part of why there is a soft limit on the number of frames of animation that can be rendered in a single run.


(The bigger problem in render length is memory management. I am not entirely clear why...)

As with most things AI, or 3D for that matter, it turns into the Compile Dance. Since each run is a roll of the dice, you often can't tell if there is a basic error of setup (bad prompt, a mistake in the reference image, a node connected backwards) or just a bad draw from the deck. You have to render a couple of times. Tweak a setting. Render a couple times to see if that change was in the right direction. Lather, rinse.

With my new GPU and the convenient test size I have been working with, render times fall into the sour spot: 1-3 minutes. Not long enough to do something else, but long enough that it is annoying to wait it out.

I still don't have an application, but it is an amusing enough technical problem to keep chasing for a bit longer. The discussions on the main subreddit seem to show a majority of questioners who just want "longer video" and hope that by crafting the right prompt, they can build a narrative in an interesting way.

The small minority is there, however, explaining that cutting together shorter clips better approaches how the movies have been doing it for a long time; a narrative approach that seems to work for the viewer. But that really throws things back towards the problem of consistency between clips.

And that's why I'm neck-deep in Python, trying not to break the rest of the tool kit in adding a LoRA trainer to the mix.

Thursday, October 2, 2025

How does your garden grow

Sometimes scenes evolve. You realize things the outline didn't address, and addressing them opens up opportunities you hadn't spotted before.

And, I know. The fierce outliners and the dedicated pantsers (and especially the WriMos -- NaNoWriMo is dead but some people still try to do 50K over November) call you stupid if you delay in getting those words on the page. But I think that time comes due eventually. Fix it now, or fix it in rewrites. As long as you are capable of holding enough story in your head, you don't have to have it written down in order to realize it is wrong and needs to be re-written.

The drive from Alamogordo to the tiny census-designated place (it's too small to be a town) of Yah-ta-hey is about five hours. The original plan/outline/mind mapper diagram just said "Mary leads Penny to the rez to talk to a man who knows about the 'Sheep Ranch'."

The original spec was that Penny doesn't get friends in this one. She doesn't get people she can lean on for emotional support, or people who are too helpful. They all have agendas and problems, and they hide stuff from her. Mary is mostly...angry. I got her character and voice from a couple of different writers on the continuing problems with radiological contamination on tribal lands.

But can she and Penny take a five-hour drive without either getting some resolution, or killing each other? Am I better off having Penny go alone? Or...is it the better option to make this a longer scene, to go deeper into Mary's personality, and give them a mini character arc?

Sigh. The hard one, of course.


The world continues to change. I started this in 2018. An American tourist loose in the world. The world's view of America has changed since then. COVID has changed things, the economic slump has changed things, and tourism has changed for everyone; European tourists are in the same hot water with over-travelled destinations and Venice is not the only place fighting back.

It always blind-sides you, change does. I had a minor bit about a minor character; a senior airman at the 49th with a shaving profile. I couldn't even name it, because that's the sort of thing that people who have been in the service would recognize but most of my readers would never have heard of.

Until it suddenly became a big thing to the pushup king, our current Secretary of War.


I did happen to think of a new silly idea, possibly for the "weird high-tech company in the wilds of Colorado" next adventure. Probably more productive is I've finally started making a proper vector of the new series logo so I can try out the new cover.

So there actually is a magical artifact. Sort of not very. It is extremely valuable and holds secrets because it is a replica and there's a USB stick in it with a bunch of trade secrets or something.

And maybe, I thought, it isn't the product of a historical culture, but instead a replica prop. And comes out of an imaginary IP, some sprawling fantasy saga with a lot of borrowing from various bits of real history (like GOT, say).

I mentioned a while back that various authors have tried to create a Disneyworld based on an imagined IP. The tough part is communicating this IP to the reader; you can't just say, "Look, it's Elsa from Frozen!" The fun part is, of course, creating the IP in the first place.


And, yeah, finally broke down and built a new computer. Things have moved on there, as well. SSDs aren't being fitted with expansion rails so they can slot in where the old 3.5" drive bays are. Instead they drop into a slot right on the motherboard, riding the PCIe bus -- underneath a built-in heatsink, because modern gaming machines run hot.

Darkrock case, MSI board, i7 CPU, 64 GB of RAM to start, 2 TB SSD as the "C" drive, 1000 W PSU, and of course a nice floor heater of an RTX 3090 with 24 GB of VRAM.

Mostly did it because I wanted to do it. Not because I had a need for it. But, oh boy, when I finally got the updates installed and running and tried out Satisfactory it looked so good...

The test render on WAN2.2 t2v took 160 seconds. Not bad.

Tuesday, September 23, 2025

Remember the Espirito Santo

I kinda want to send Penny to Texas next. They got oil, some very old towns, the Alamo, and about a million interesting wrecks off the coast. They also got more dinosaurs than you can shake an animatronic thagomizer at.


Goings-on at a fancy dinosaur exhibit are fun. Everybody loves some amusement park shenanigans, and adding dinosaurs is only further insurance that something will go worng.

Of course archaeologists famously don't do dinosaurs.


But I'm not all that enthralled by doing archaeology on missions, old towns, or the Alamo, or even Washington-on-the-Brazos, the "birthplace of Texas."


I'm more intrigued by marine archaeology. Always wanted to work that in at some point. But that's a ton of research. And, really, the good way to phrase "Write what you know" is "Write what you want to know." Write about things that excite you, either through prior exposure or because you really want to write about them.

I've got this emotional stub of Penny being a low-wage worker at some soul-less tech center in some up-and-coming development on the outskirts of a national park...until something goes terribly right. (Wrong for her, right for getting a story out of it.)

Well, I really do have a prior novel to finish. Then I might clean the decks, stick a plastic fern in the corner and hang some paper lanterns, and get the hell out of First Person POV for a while.

Assembled a cart at NewEgg, too. I can only sorta write during downtime at work and looking up parts -- as annoying as it can be -- is a way to pass the time. Of course, the only graphics project I've got that is actually for anyone other than myself, is the new cover I'm trying...and there's no AI involved in that.


Thursday, September 18, 2025

Painting with a potato

There is an artist's method. Like scientific method or engineering method, it is a base process. This idea of processes that are largely application-independent underlies the Maker movement -- which has merged and morphed and now is more like the kind of "hacker" the Hackaday Blog was built for.


This is why there are so many multi-instrumentalists in the home studio music circles, and how visual artists change media. There's cross-over, too. Not to in any way diminish the immense investment in skills in the specific mediums and tools, but there are commonalities at more fractal levels than just the generalized artistic method.

In any case, the thinking is basically tools is tools. I've designed sound and lighting for the stage. I've designed props. I've done book covers. I've done music. I'm not saying I do any of these things well, but there is a hacker approach to it. Understand the new medium and tools and then apply the core skills to it. It all feels sort of the same, reaching for personal aesthetics and understanding of the real world and learning about the tools and the traditions of this specific media.

From a process point of view, from this higher level "making art" point of view, AI is just another tool set.

It tends to be a toolset that is relatively easy to learn, but also less responsive. It really is like painting with a potato (and not to make stamps, either!) You can't really repurpose the hand-and-eye skills learned in inking a comic book panel with a nibbed pen or painting a clean border on a set wall with a sash brush. Because it is a potato. You hold it, you dip it in paint, it makes a big messy mark on the canvas.

It is mostly the upper-level skills and that general artistic approach that you bring to bear. Which of the things I can imagine and want to visualize will this tool let me achieve? You can't get a melody out of a sculpture or paint the fluting on a hand-knapped obsidian point, and you learn quickly which subjects and styles and so forth the AI tools can support, and which are not really what they are meant for.

This will change. This will change quickly. Already, I've discovered even my potato computer could have done those short videos with animation synchronized to the music.

That's been the latest "Hey, I found this new kind of ink marker at the art store and I just had to try it out on something" for me. S2V with an audio-to-video model built around the WAN2.2 (14B) AI image model. In a spaghetti workflow hosted in ComfyUI.

On a potato. (An i5 with a 3060 Ti. I want to upgrade to a 3090 -- generations don't matter as much as that raw 24 GB of VRAM -- but can't stomach another thousand bucks on the proper host of an i9 machine with the bus for DDR5.)

Render times for ten seconds of video at HD (480x640) are around an hour. Hello, 1994; I'm back to Bryce3D!


Actually, right now longer clips lose coherence. That's the reason all that stuff on YouTube right now is based on 30 seconds and under. It's the same problem that underlies why GPT-4 loses the plot a chapter into a novel and ACE-Step forgets the tune or even the key.

I suspect there's a fix by adding another agent in there, so it won't surprise me to find this is a temporary problem. But it also won't surprise me if it proves more difficult than that, as it comes out of something that sort of underlies the entire generative approach. Even on a single still image, I've seen Stable Diffusion forget what it was rendering and start detailing up something other than what it started with. It is particularly a problem with inpainting. "Add a hand here." "Now detail that hand." Responds SD: "What hand?"

(This also may change with the move towards natural language prompting. The traditional approach was spelling things out, as if the AI was an idiot. This is the "Bear, brown, with fur, holding a spatula in one paw..." approach. The WAN (and I assume Flux) tools claim better results with "A bear tending a barbecue" and let the AI figure out what that entails.)

Anyhow. Been experimenting with a wacky pipeline, but this time using speech instead of music.

Rip the audio track off a video with VLC. Extract and clean up a voice sample via Audacity. Drop the voice sample and script into Chatterbox (I've been messing with the TTS and VC branches). The other pipeline uses basic Paint to compose, then Stable Diffusion (with Automatic1111 as a front end, because while it may be aging, the all-in-one canvas approach is much more suitable for inpainting; plus, I know this one pretty well now).
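
That first rip-the-audio step can also be scripted; here it is sketched with ffmpeg instead of VLC (my substitution, not part of the original pipeline):

```python
import subprocess

subprocess.run(
    ["ffmpeg", "-y", "-i", "source_video.mp4",  # hypothetical input file
     "-vn",                       # drop the video stream
     "-ar", "44100", "-ac", "1",  # mono 44.1 kHz is plenty for a voice sample
     "voice_sample.wav"],
    check=True,
)
```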

Throw both into the WAN2.2-base S2V running under ComfyUI. It does decent lip synch. I haven't tried it out yet for instrumentals but apparently it can parse some of that, too. It also has a different architecture than the WAN2.1 I was doing I2V on before. It has a similar model consistency, which is a nice surprise, but the workflow I'm using leverages an "extend video" hack that means that my potato -- as much as it struggles fitting a 20 GB model into 12 GB of VRAM -- can get out to at least 14 seconds at HD.

As usual, the fun part is trying out edge cases. Always, the artist wants to mess around and see what the tools can do. And at some point you are smacking that sculpture with a drum stick, recording the result and programming it into a sample keyboard just to see if it can be done.

There is an image-to-mesh template already on my ComfyUI install. The AI can already interpolate outside of the picture frame, which is what it is doing -- almost seamlessly! -- in the image2video workflow. So it makes sense that it can interpolate the back side of an object in an image. And then produce a surface and export it as an STL file you can send to your physical printer!

So there are many things the tools can do. The underlying paradox remains -- perhaps even more of one as the tools rely on the AI reading your intent and executing that; an understanding that relies on there being a substantial number of people that asked for the same thing using similar words.

It is the new digital version of the tragedy of the commons. It is the search algorithm turned into image generation. It has never been easier to look like everyone else (well, maybe in Paris in 1900, if you were trying to get selected for show by the Académie).

And that indeed is my new fear. I think there is this thought in some of the big companies that this could be a new way to stream content. Give everyone tools to "create art" so, like the nightmare the Incredibles faced, nobody is special. The actual artists starve out even faster than they have been already. The consumer believes they are now the creator, but they aren't; they are only the users of a service that spits out variations on what they have already been sold.

Right now, AI art is in the hacker domain, where academics and tinkerers and volunteers are creating the tools. Soon enough, the walls will close. Copyright and DMCA, aggressive "anti-pornography" stances. None of that actually wrong by itself, but applied surgically to make sure the tools are no longer in your hands but are only rented from the company that owns them.

The big software companies have often been oddly hostile to content creators. "Why would you need to create and share when we've made it so easy to watch DVDs?" This just accelerates in that direction, where the end-user doesn't create and doesn't own.

We're back to streaming content. It is just that the buttons have changed labels. Instead of "play me something that's like what I usually listen to" it will be "create me something that sounds like what I usually listen to."

Doesn't stop me from playing around with the stuff myself.


(Call to Power players will recognize this one.)

Monday, September 15, 2025

It's all about ethics in archaeology

I've a writing worklist right now.

Finishing The Early Fox is top of the list, of course. I didn't find that Civil War graveyard (there are a lot of "Billy the Kid kilt heah" locations but nothing near enough to US-54) but I did find an old Mission style (WPA project, actually) school in a gold boom ghost town. That will do for my "Ecstasy of Gold" scene.

Still not ready for the foray into Navajo lands. 

Contacted the cover artists I've worked with before, but they were uninterested in doing a cover clean-up. They have (most of these guys have) a listed "cover doctor" service, but what they mean is they'll design a new one to replace whatever it is you are using.

So I guess I'm doing my own cover art again. After my last couple of go-arounds, I'm tired of struggling to communicate. And first task on that list is doing my own vectors over the art my logo designer did. And probably changing a few things and trying out a few things, too.

As part of the rebranding, revisions to The Fox Knows Many Things. Including AI -- but not what you'd think! ProWritingAid has an AI development editor thing that is, of course, extra fees but they have a free trial. That's not in any way using AI to revise.

It is, perhaps, one of the best ways of using AI; a thousand-monkey version of crowdsourcing (on stolen data). Point the AI monkey at the manuscript and see if enough of it fell within the kinds of things readers are used to seeing (aka looks like other books) that the AI can figure out what it is looking at.

Always, as with any editing tool or feedback, take it as a suggestion and apply your own instincts and intelligence to those suggestions. But I'd be interested in seeing what it comes up with.

***

I've been thinking about archaeological issues to hang the next book on.

That is my conceit for the series; each book features a place and culture, a bit of history connected to said place and culture, and explores some aspect of modern archaeology.

For the current book, I backed off early from the idea of doing "bad experiences on digs." The sexism on the dig in this book is a plot point, but it isn't defining the dig or Penny's experience (which is largely positive...next book, maybe, I can get her the Job from Hell).

So The Early Fox is more-or-less NAGPRA and associated issues. I was barely able to touch on indigenous archaeology/community archaeology, the mound builder myth, and so on, but at that, it did better than the Paris book at having something to do with archaeology.


Over previous books, I've already delved into archaeological looting, nationalist revisionism and irredentism. And pseudo-archaeology, but there are too many easy pickings there. I could easily fill a series with just running around after the Kensington Runestone or Bosnian pyramids. The fake artifact trade is interesting but I've got a book saved for that. And repatriation issues show up again with the boat episode (assuming I get around to writing either of those).


And somehow that thinking led to something that was an actual excuse other than "this is the random small town I ended up in next" for Penny's involvement. Brought in to help the artifacts look more authentic (for extremely suspect values of "authentic") in some kind of weird high-tech VR sort of thing. Less gaming company (why would they need a fancy tech center?) more experimental direct-mind-connection stuff.

Not really directions I wanted to take the series, but then having gunfights with ISIL over some Buddhist statuary isn't where I want to go, either.