Wednesday, December 18, 2024

Come here Watson, I need you

About midway through the game Horizon: Forbidden West the character Sylens does something that appears on the surface so utterly stupid it feels out of character.

And immediately you are pushed into contemplating the Watsonian vs. Doylist question. Is there an in-world explanation for this baffling choice? Or is this something that was necessary for game purposes, necessary enough to supersede the lack of proper motivation?

Not that it really matters for that game. You can be as paranoid as you like, try to predict where the inevitable betrayal is coming from, but that betrayal is going to be on the other side of a cut scene -- you won't even be able to lay traps in preparation.

And, really, it doesn't matter so much in reading a book or watching a movie, either. That you know what is coming won't change the story that is presented. It will, however, change your experience of it. There is that perfect balance, sometimes referred to as "the expected surprise," when you can look forward to something happening with pleasurable anticipation and then get that jolt of satisfaction as it finally unfolds.

I say delicate balance because if you guess too early or too fully, then the revelation feels pointless when it finally happens. If you guess too late or not at all, there's less power in that key moment. The goal as an author is to ramp up the anticipation, whether it is a revelation or a long-overdue retribution, until it drops with the greatest possible impact.

Back to patterns. There's people (I've met several) who can watch a Perry Mason episode and know by the second reel who did it. There are two main strategies here. One is Watsonian; the desk clerk did it because nobody else would have had an excuse to open the cloakroom door after ten o'clock. The other is Doylist, but comes out of an instinct for structure. The murderer is the maid because she is the third suspect introduced, and the only suspect with a perfect alibi.

I'm calling these story patterns until and unless I find a better term. These are more than tropes or genre conventions. These are basic ways that story tends to get told, patterns that can be recognized.

And they are on a continuum. There are story patterns that come out of the needs of the media; such as recognizing a character will be important because he is being played by a well-known actor. And there are patterns that are part of the language of media and intended to be understood; like the soldier displaying a picture of his girlfriend. Story-telling shorthand, in other words; ways to inform the audience about the structural shape of the story without spelling it out.

They go from the near-universal, expected to be grasped by all audiences, to the more subtle that require experience with that particular genre to read (those Perry Mason guessers were people who had seen a lot of episodes). So they aren't always read by all audiences. Not with equal ease.

***

That same day, I also started reading a new urban fantasy set in Paris. Almost immediately I had two Doylist realizations; the writer was not American. And the writer was also not French (turns out she's German born and now living in New Zealand). It was also pretty obvious the writer was female.

The European attitude is more subtle and harder to boil down to specific observations. It just didn't feel like the way an American author would approach it. The French thing was...it was a little too "look, here's the Eiffel Tower!" Things that were distinctly French and communicated that idea to the reader, but that aren't what a French person would think of as what was important to the story.

This one is a woman who can talk to ghosts (and unlike for Hotspur, the ghosts answer back). It opened in Pere Lachaise and I was already hooked. When it was revealed her day job was at the Pantheon...that's when I bought the book (the main action of the book, however, takes place in the Catacombs).

I was also admiring the experienced way the writer was building the story. The complications (the protagonist's relationship with her family, a suspicious cop, a veiled warning from Victor Hugo) were dropped at exactly the right places in page count and the rhythm of the story. This is another one of those expected surprises that come out of a well-established structure. You don't have to drop the complications -- you don't even have to have the body drop, or any of the other big ones -- but it is so satisfying to a reader when they are happening just when you anticipate they are going to happen.

It is like the experience of listening to music, when the chord sequence is pointing you towards a cadence that finally falls. After reaching all the way to the 9th or the needle tension of the dominant 7th, to finally drop back to the root. (Or to go somewhere completely different, if you are Sondheim...!)

***

I like reading the first book of a new series, but I think I like watching the opening of a television show even more. Because those guys are really, really good at the job. Introducing the world, the cast, the conflict. 

I got a few episodes into Continuum. There's a difference between an older series like Bonanza and a more recent series; the long form. Something like The Expanse doesn't have a status quo. You aren't expecting to find the same cast and the same situation. Take the Enterprise away and Star Trek stops being the same show, but take the Rocinante away and that story continues without a problem.

This means that, as in the self-contained form of the novel, questions are being continuously raised and answered. This also means that not all of the world-building is front-loaded, because those are some of the questions which are reserved to be answered slowly, as the series progresses.

***

It is good I am getting some reading done (and watching) because, since at least my nasty bout of COVID (and possibly since finishing the Paris book), I've been unable to write. Not at all.

More on that later.

Wednesday, December 11, 2024

Hugin and Munin were drones

Yes, that was actually said by someone. Ran into that "theory" at the APN (Archaeology Podcast Network), which I finally remembered to add back to my podcast feed.

I don't think I can pivot the Athena Fox stories. They've gone four volumes and given how slow I work, I'm not about to rewrite all of those from scratch. And, oddly, several of the comments I've gotten say that pivoting into more of an adventure/thriller direction is not what those commenters would want. A friend -- an American ex-pat I visited in Paris -- didn't know what to make of what she called the "James Bond" stuff in the last book. Good thing I didn't try her on the Japan one!

The books want to be mystery books spliced with travel books. They'd be better if they were less dense, and if I did rewrite from scratch, I'd make changes to my protagonist and her background to take it away from going quite so in-depth in culture and history.

Which would also be plausible with a bad idea I had just today.

The latest 'cast of Writing Excuses talked about opening a story with a thriller. Not necessarily a body drop; more like a Call to Action. I have known this for a while, talked about it on Quora, and made a conscious choice to start my first book with stakes. Only there weren't really stakes; Penny started the book with a goal, but the obstacle wasn't immediately obvious.

The other aspect that podcast got into is that this opening thing is usually a shaking of the status quo. It isn't "Here we are in a YA dystopia, things suck, we should do something about it." It is more like, "Here we are in a YA dystopia and oops -- the secret police are banging on the door." Meaning, we have joined the story at that moment where the characters literally cannot go on as they have been.

Thing is, even starting with a body drop (as the New Mexico book will) is hardly the same as having the clear and present danger of a destructive force just uncovered and our heroes the only people in place to stop it. That is really so very much easier with fake history. Not just fake, but a particular kind of fake, where ancient aliens or items of power or a pharaoh's curse or whatever are, well, real.

Which I was loath to do, which is why the Paris book never gets higher than the stakes of an idiot would-be treasure-hunter about to take a crowbar to one of the grotesques on Notre-Dame de Paris.

The bad idea is...what if they are real now?

As in, history is taught in that world the way it is taught in ours, and as with ours, it is largely correct. But something happened. A mad wizard did it (or in SF circles, Alien Space Bats). Now both versions are correct. Imhotep stacked a bunch of mastabas to invent pyramids. And Grey aliens beamed down to awe the puny humans with the ability to stack a bunch of rocks.

This way, we don't have to insult working historians. And our protagonists can declare their astonishment without looking like idiots who never noticed the actual suit of armor worn by King Arthur is on display in the White Tower.

But two big problems. One is on me. Not just that it is too easy to have the protagonists snark about how stupid the idea seems, but that it is very, very hard not to get dragged into all kinds of related story tropes.

There's a fun little two-book series by Seanan McGuire (Indexing) where fairy-tales are coming true in the real world. Warts and all; these are the Grimm versions indeed. But her hand-wave is like Sir Pterry's "Narrativium," where Story (as Seanan puts it) is an almost conscious and alive force that really wants the fairy tale to happen and to play out properly. So all the trope elements happen -- weaponized by both the good guys and the bad guys, even.

Having a mystery thing happen and now Atlantis was real and divers can reach it is far too much temptation for a writer like me to have protagonists and others start reading the Evil Overlord list and weaponizing being genre-aware.

https://www.giantitp.com/comics/oots0763.html

The other is...

The really well-known pseudo-history and magical artifact and lost city stuff is, well, usually not good story. There's no internal consistency. Arthur above; even Malory, combing every single Arthurian epic he could get his hands on and combining them into a coherent whole, was only somewhat successful at it.

As with conspiracy theories in general, the point is never about how Atlantis actually works and how it was hidden -- it is about how "mainstream science" are all poo-poo heads. The most common stuff is of the von Däniken ilk, where it is a bunch of "you can't explain this!" thrown at a wall in hopes some will stick. There's no consistent through-line, no single underlying theory.

(The Apollo Program deniers and the Creationists are so very much like this.)

Sometimes you do get a good story. If you are talking about the modern and specifically against-the-mainstream creations, the full phantasmagorical story of the land of Mu, for instance, has all the right cast of characters and geography and deep history and all of that.

As do older myths and legends, as inconsistent as they are.

There's pretty much shit-all for a well constructed story of how everybody managed to mislay a continent the size of Asia.

Which suggests to me that the "suddenly all the myths are true" isn't a good way to construct a fictional universe that one can have adventures in. I think you need a spoof explanation for why many myths may have a basis in truth. Like the Stargate universe does.



Saturday, December 7, 2024

The Adventure Continues

Horizon: Forbidden West went on sale during the massive and ongoing black-cyber-frimonday sales and I've started it. It looks gorgeous. But something has slipped in the character acting. Even compared to the pre-remastered Horizon Zero Dawn the characters feel less lifelike and less interesting.

A lot of this is intensity. Sun-King Avad is a bit of a dork in the original. He is young and inexperienced and doing his best. But the weight of his position gives him gravitas, he has charisma and he's hot. But above all that, there's intensity.

In the HFW version he just comes across as a bit of a goof. Intensity is the same complaint with several other returning characters, even Varl (despite growing a beard). I've got a sneaking suspicion at the moment that this was intentional. That they told the voice actors to dial it down a little, possibly to make the later original characters look better.

***

Anyhow.

The thing about an adventure archaeologist character is that there are always ideas for them. Especially if you have already decided not to go the way of the greatest hits, because there's only a dozen big name artifacts and even fewer big-name locations to discover (and once you've done the Atlantis story, it is hard to top it with somewhere even bigger, more important, and more magical). There are, however, millions of ways a student archaeologist can get in trouble, even if they end up being more cozy murder mystery plots and the archaeology is tangential.

I know the things I don't like about the previous books, but I don't know how to pivot. Making a big pivot in a series is tough because unless you want to re-write all the early books, you are setting up for one audience then changing to a different audience. How can you introduce the new reader you are after when the first books aren't like that at all? And what about the readers that got interested in what was happening in the older books?

My version today of what went wrong is that the absurd detail and the way that detail is presented is baked in, and will be there as long as I continue with the format of throwing an inexperienced traveler into a new culture.

That's the biggest part of it. The first book was largely about Penny being overwhelmed, and gradually coming into confidence in navigating strange places. And I've kept that, at least as long as she is still going into places that are fully immersive; where she doesn't speak the language, where she is having to eat locally and sleep locally and otherwise deal with the unfamiliar culture 24-7. 

I backed off a little in the Paris book. There are several unusual things about the Paris book. She is largely in tourist areas and most of her conversations are with a fellow American. And her dip is less into modern Parisian culture and more into history -- and at that, it is art history, so further divorced from her Japanese experience of finding herself in a wooden room with tatami mat floors and going, "What am I supposed to be doing here?"

Thing is, I am also doing classic mysteries. My read is that the Cozy Mystery genre introduces a cast of characters with issues and that is the Gordian Knot that needs to be unravelled (cutting isn't usually allowed in those stories).

I don't know if there is a name or even a recognized genre, but I am writing mysteries where the place and the culture are the thing that has to be understood. The solution to the mystery in each of the Athena Fox books is reached through gaining a gestalt of a place and people. And the process of gaining that gestalt is through being a sponge. Learning everything she can because she doesn't know what the important stuff is yet.

Come to think, Asimov's Caves of Steel and some of Niven's Gil the Arm stories also hinge on grasping subtle elements of culture. Many is the case in those stories where someone says, "Belters don't do that because on a single-ship..."

This may change. Several of the things at the top of my list right now for new Athena Fox stories involve a local sub-culture that can be experienced in small doses with a ready retreat back to the familiar. And Penny is gaining confidence and experience to where she isn't intended to be overwhelmed but instead has the tools to pick up what she needs and keep her cool.

***

So what's at the top of the list right now? I mean, I want to do underwater archaeology, and do the Holy Land, and visit Antarctica...but my list of plausible and might do them soonish is rather smaller.

The White Sands one. I've already backed off on trying to work in Old West stuff, or ghost towns, and I might have to put the UFO stuff further back in the mix. More and more, it is about that specific bit of geography and the various peoples that have inhabited it. Three in particular; the nuevomexicanos, who are connected to the pueblo -- mostly Tewa. The Los Alamos group. And the hominid who may have left footprints well before the Clovis peoples.

The Darien Gap one. Archaeological tourism, some mayincatec stuff (whatever seems appropriate), possibly the fake artifact trade, and a survival story.

The Minnesota Vikings one. Penny revisiting a different life-path by getting hooked up with folk music and Viking re-enactors, plus of course some pseudo-archaeology like the Kensington Runestone.

The one on a boat. The private yacht of a billionaire collector is in international waters and an eclectic group of feuding experts (and ringers and spies) are gathered to try and figure out which artifacts should be repatriated, and to whom. 

And last place is split between hanging with warbird fliers and the kind of WWII buffs who dance to big band music at the Hornet Museum, with of course an experimental flying machine unwisely named Icarus in the mix. Or, one about a brand-new science museum with a living exhibit on the space race and early visions of the future; L5 society, plus maybe work some Lustron Houses in there somehow.

Sunday, December 1, 2024

Impatient Inpainter

I started a quick throw-away artwork just as a demonstration of some of the ways img2img and inpainting work within Stable Diffusion, via the Automatic1111 webUI (off-line installation).

I tried to pick something that was fantastic enough that it wouldn't get too hung up on trying to make the tech look plausible or the coke bottles the right shape, something that didn't have obvious ambiguities the AI language parser would get hung up on, and that wasn't layered or a complex pose or something that would be difficult to stack (like my first idea, which was a sort of typical golden age SF cover of a guy in a space suit with a blaster in foreground, a spaceship and barren planet/moon surface behind him).

So went for a retro, well, pretty specifically Love and Rockets sort of flying bike thing.

Here's what the AI was spitting out when I tried to do it with prompts alone (aka the txt2img mode):


Nice details and a lot of surface gloss, but what the hell is that thing? It's also not the classic racing motorbike pose that these flying motorbikes seem to attract. 

So better to rough out the composition in paint first:


That is actually too detailed. Really, the less you put in, the better the AI is at working things out. Among other things, your way of drawing a shape won't be its way of drawing a shape, and even to get at the same end point you are better off handing it less to begin with.

Similarly, the first prompt was pretty bare-bones:


(That checkpoint is a fork off SDXL, but doesn't require the weird two-passes approach of the latter.)

First pass of img2img, with a denoising of .4 or thereabouts:


Yeah, I ran it about a dozen times, keeping the one that preserved more of the details I cared about, and sacrificing others. That's the trouble with doing things in this mode, with a single pass.

The more powerful mode is inpainting and selecting just part of an image. For which you also get to change the denoising level, plus you can edit the prompt to reflect just the part of the image you are trying to iterate on:


Also an example of prompt engineering. Things like automobile grill or air intake weren't leading the AI in the right direction. Fortunately the model I was using was sensitive to vintage stuff, so a reference to the famed Shure-55 (not by name) triggered the look I was after.

I also canned dieselpunk and Cadillac pretty quickly. The latter kept adding a caddy logo to things. The former turned out not to be in the language model but it did seem to be hauling "punk" out of it and was starting to add mohawk and piercings.

At various stages bits had gotten too far off what I was after. Like that fin -- that one, I used the built-in paint window in Automatic1111 to slap red paint over the hood ornament and forced a re-render of a different look.


A couple dozen iterations as I fixed some details and changed my mind about others, and it was more-or-less what I was thinking when I drew it. A final beauty pass at a denoising of .3 to get it to all gel together and, well, good enough for the exercise.

But...what would happen to it if I stayed in img2img and rolled the denoising up past .5? (As a very rough guide: up to .3 is a "clean up," at .4 it begins to change things, and somewhere just before .7 it "snaps" and gives up completely on respecting the original image, making up something new that only vaguely resembles the colors and masses. This depends quite a bit on the model and, to a lesser degree, on how easy that specific image is for the AI to interpret.)
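For anyone who prefers to see that rule of thumb written down, here is a minimal sketch in Python. It is not anything from Automatic1111 or Stable Diffusion itself; the function names are made up for illustration, the 0.65 cutoff is just my stand-in for "somewhere just before .7," and the step-count estimate assumes the common implementation choice of running roughly steps times denoising strength of the schedule, which may not match every sampler or setting.

```python
# Rough sketch only: thresholds are a personal rule of thumb, not constants
# taken from Stable Diffusion. All names here are hypothetical.

CLEAN_UP_MAX = 0.3   # up to here, img2img mostly tidies the image
SNAP_POINT = 0.65    # assumed stand-in for "somewhere just before .7"


def describe_denoising(strength: float) -> str:
    """Map an img2img denoising strength to a rough description of its effect."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("denoising strength is expected to be between 0 and 1")
    if strength <= CLEAN_UP_MAX:
        return "clean up"
    if strength < SNAP_POINT:
        return "changes things"
    return "new image"


def effective_steps(sampling_steps: int, strength: float) -> int:
    """Estimate how many sampling steps actually run (assumed: steps * strength)."""
    return min(int(sampling_steps * strength), sampling_steps)


if __name__ == "__main__":
    for s in (0.3, 0.4, 0.5, 0.7):
        print(s, describe_denoising(s), effective_steps(30, s))
```

The exact numbers will shift with the model and the image, so treat the bands as a starting point for experimenting rather than settings to rely on.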


Ooh, an unexpected prompt-ambiguity lesson. "Riding" a flying bike, indeed. Yee-hah! (Plus, where the hell did the AI get those feathers from?)

But this does show the basic ideas are embedded in the training data; there are stereotypical elements like the hotrod paint job or the fluffy white clouds that the AI puts in there even without me specifying them. (The first image, after all, did not include the cloud background in the prompt. The AI put it in because that's what this sort of image usually gets.)

So plausible. Try a few more runs and see what happens:


This one had promise. You can see both the bane and the boon of AI here; not just the details, but the angle/perspective that gives it such a better look...but yet, the perspective is messed up, and some of the elements are contradictory. 

So back to inpainting. That fin wasn't working, the jets had gotten just stupid...but I knew those were easy fixes. The "magic words" to fix the air intake, after I'd roughly painted out most of it with blue sky and sketched in a better shape, were "P-51." For the jets, "contrail" and "afterburner," even if I did have to make a second pass to remove a Blue Angel that snuck in there.

And that is where I really stopped.


(This is AI, and furthermore, the modified SDXL and one LoRA I used during a few trouble-spots -- not specific to flying rockets or retro SF in particular, just one that was good at this sort of illustration look -- are via Civitai therefore even more copyright-violating than the original SD training data. I present these under the shade of "academic or criticism" as my sop towards fair use.)

Wednesday, November 27, 2024

The eight-fingered hand

The morality of AI art. 

The subject has complexities, but it isn't anywhere near as murky as the people trying so hard to sell AI to the consumer are making it. Admittedly, whatever the core issues are, they are hard to untangle from discussions of the purpose of art, the economics of art, the process of art, and empowerment.

Take the last. Art is hard, and talent doesn't strike all of us, but even those with talent need to have time and -- for many arts -- finances to pursue it. It is far from impossible to go from the wrong side of the tracks to the concert hall, but there is no denying that it is harder than it would be for someone who went to properly-funded schools, had parents who could afford tutors, could afford a decent student instrument, etc. (Don't get me started on the heartbreak of Violin-Shaped Objects.)

There are always gatekeepers who, when art is democratized, cry out that it is being debased. Cutting loops in Ableton "isn't real music." And AI makes it so easy to cry "they didn't make art -- they just pushed a button."

But on the other side, there's a difference between enabling people who might not otherwise be able to pursue art, and selling the illusion of making art in order to make a profit. Not exactly peculiar to AI, this. Any hobby you name is almost instantly overrun by people looking to sell the hobby to you (in the form of "must-have" tools you were getting just fine without, and so on). AI, viewed this way, is a digital paint-by-numbers kit.

The potential customers aren't the only ones buying the illusion, either. Artists are right to be concerned, just as musicians were when they began to be replaced wholesale in certain fields. There is always the economic drive to replace what is good but expensive with what is good enough. And that's a race to the bottom, as the current "good enough" soon becomes the "good but expensive" and the search is on for something else...

As long as the intended audience will accept it, and that is one of the fears. Flood the marketplace with "good enough" and keep it there long enough, and the public would lose the ability to tell the difference. That's what the Académie des Beaux-Arts was afraid of...but what they tried to keep out, and failed to, were the Impressionists.

I think the public is canny enough to keep looking for better, if given the economic opportunity. For all the cheap schlock, there remains a paying audience. For all the fast food, grocery stores, markets and restaurants aren't going out of business. 

Enough of the public have learned to despise the quick-and-dirty AI art that the economic model of many of the big art sites has forced them to take steps to control the flood. Which leads into a discussion of the value of gatekeepers to the consumer. Self-publishing, for instance, exploded. There are so many self-published works that the costs to place your work are going up, and the readers are complaining about the difficulty of finding anything.

Self-publishing did flirt with AI. It got smacked down by Amazon the same way that every other get-rich-quick scheme that tried to use their site did (like low-content books). Writing is in a weird position as it seems to promise wealth and fame, but it is currently difficult for software to do the work for you. The people selling dreams to would-be authors are selling books on how to write, world-building software, and services.

And outright scams, because vanity publishing remains the best money-maker.

Not to say people haven't tried AI, but there are no easy riches in publishing. It is a buyer's market, and despite how much effort individual writers may be putting in, to the greater market their labor is cheap; so cheap there isn't a need for an alternative.

AI art, meanwhile, is having trouble pulling up money outside of the flash-in-the-pan niches of, say, monetized YouTube slide shows. The challenge is similar. The world has no lack of hungry artists willing to work cheap; all that AI can offer is its novelty (the surface gloss, largely) and volume. And the latter is self-defeating, just as it is in publishing.

Which slides into another problem with democratization. The big players in AI art have expensive rigs, and spend a lot of time at it. More and more, they are looking less like artists than like bitcoin miners. Even to how critical a high-end video card is.

And that looks like a segue into "but are you really being an artist." I believe that those slide-show makers are prioritizing pushing output. AI is in a peculiar place that might be inherent, or might be the current circumstance of technology -- and I am biased towards the latter. Right now, the way to make money is to push out a bunch of art before the market gets over-saturated (too late!) and the way to push out art is not by being an artist but by pushing buttons on a powerful and expensive rig.

Exactly the model all those salespeople are wanting. "You too can be an artist...if you drop a thousand bucks with us on the right graphics card."

So here's the thing. We talk up freeing the inner artist from their inability to hold a pen or afford paints, but much more importantly, from the need to have a liberal arts education and time spent in traditional art classes.

But the infinite world of possibilities is...smaller than it appears. This has always been so in the arts, I hasten to add. An artist that wants to be seen or heard uses the modes and forms that are currently understood by the audience. There are always those (like those Impressionists) who are fighting to get something different accepted by a potential audience, but economically this is at best a gamble. The market reality is "do what everyone else is doing."

Technically an AI art engine can create anything; in practice, the people using it are narrowing the existing constraints of the training data even further in order to pursue the flavor-of-the-month and get those eyeballs they crave.

Let me explain in a little more detail. The training data was what the original academic researchers could scrape off the internet. Which means that a well-known painting is more represented than the output of an outsider artist. Meaning the engine is already primed to regurgitate the "look" of current media (which itself is feeding off itself, looking to other shows or other advertisements or other book covers as to what the audience is primed to expect).

It is very, very focused on what is common and normal in mass visual media. Poses, for instance, trend towards the presentational. "Showing off the new winter jacket" is the pose a figure will take even with a heavily weighted prompt that attempts to put them in an action pose.

As I said, many artists in those social media circles where popularity rules are going after what gets eyeballs. To focus in on that flavor of the month (or, more benignly, to focus in on whatever personal vision they are pursuing) the tool is LoRA (and checkpoints, and embeddings...but let's keep it simple).

And this is the thing. The academics who trained the original models had some shred of honesty and did their best to anonymize by using as much data from as many sources as possible. LoRA are more tightly focused. When a young artist thinks "I want to make stuff that looks like Masamune Shirow" they are drawn to a LoRA that was trained specifically on that artist. On a small number of works. Overtrained on them. So much so that, given the right prompts, it can and does recreate enough of a specific image that you can recognize it.

Again this is implicit in the concept of the training data. Tell the AI to give you a Florentine woman with a mysterious smile and you could get anything. Tell it to give you the Mona Lisa and what you get back will be recognizable as Leonardo's painting. But there's a difference between training on a million images, and training on as few as six (some LoRA are that small). In the former, you get a guy in a jacket but it isn't a recognizable individual guy or brand of jacket. In the latter...you might get one of the six images back complete in far too many details.

(It gets worse. Some LoRA go right out and say, "For best results use this image." That is, to base the new, supposedly unique image on an actual specific piece of art. And not with text prompts as in the Mona Lisa example above, but by, basically, taking that image, blurring it slightly, then reconstructing it with AI. But more on this when I post about the inpainting process.)

The social art world is fads of the moment and the successful focus is hyper-narrow. The original inspiration is clearly seen. Basically...this is digital fanfic. I mean; there are lists of prompts for the hopeful new AI artist that are the names of other people creating AI artwork.

(This is really nth-generational stuff. It requires LoRA that are trained on the output of artists who were probably already using the same...)

So while the AI proselytizers are going on about how stealing from a million anonymized images isn't really stealing, the practical reality is far from that case. Yes; the big commercialized online engines are filtered now with long lists of illegal prompts that can't be used anymore, including the names of public figures, the names of artists, and even the names of some art styles. But they are only part of the picture -- and the users are really, really good at finding the loopholes because the data is still there. They just changed the names to make it harder to find it.

Is this, though, different from being inspired by the style of another artist, or even the movement they have begun, and doing work in the same way? How does the specific kind of work done in making this derivative matter? Or is it the nature of the link between them?

From one perspective, AI is absolutely stealing the original because that original was fed into the computer. From another perspective, it has been digitally shredded in a way that makes it impossible to reproduce it exactly. No matter how close the AI reproduction may appear to human eyes, the pixel patterns are not the same.

Is a copycat more ethical than a straight clone? Is the fact that on a pixel level, down at the digital heart, it isn't actually the same an important distinction, or is this just a fancier way of filing off the serial numbers and selling it as unique? There are people flipping, cropping, or blurring clips from movies so they can post them on YouTube without getting caught by the automatic check for copyright violations. Is this really, substantially, different?

And does it matter if the creator could have painted it from scratch themselves? Does it make it better if they are a skilled artist in their own right? Does this have to be the same skill set as the original? Does it make it worse if they were "pressing buttons," that is, doing things that don't look like how we conceive of the process of creating visual art?

Because your average "hand painted" art these days is done on a screen with a hell of a lot of computer assistance. And resources which are not original. And some of those resources might not be paid-for commercial stock or copyright-free (cough Greg Land cough).

On the third hand, is it somehow worse if a forger is skilled enough to have made original art, and chose not to?

This gets really tangled because all the way across the art world, homages, training by copying, working with a mentor in their studio, doing cover versions...this is all how artists learn. And so very many of the good artworks are part of a dialog: Vietnam vet Haldeman reacting to Robert Heinlein's jingoistic Starship Troopers and moved to write The Forever War. Saint-Saëns spoofing themes from Offenbach and Mendelssohn in his Carnival of the Animals. Generations of artists using the pose from Michelangelo's Pietà.

I mean, look, I'm currently working on a novel that is consciously and openly using "used furniture." Something that is meant to be recognized as retro. The characters and background are being carefully crafted to remind the reader of things they know (or think they know; the thing about retro nostalgia is that so much of it is rooted not in a deep understanding of the original, but an exposure to other people's distillation of those elements they find most cliché).

Thing is, though, it is arguable that AI artists are not having a dialog -- because they aren't engaging with the material personally and at that level. They are dancing about architecture; they are entering text instructions to a computer for it to make a mindless reconstruction of what it thinks is happening in the original work.

Perhaps. It is certainly true that one can go to a model that other people are recommending and copy a list of prompts from some forum, push the button and sit back. But I think that even the most production-oriented, assembly-line artist has that urge to chase their own vision. It is difficult not to engage your aesthetic senses. And there are functional choices that can be made at every step; all the way down to picking which generated image to up-rez and post and which to throw out.

For many, they are engaging with the image itself, on the terms of visual art. Adjusting the composition with their internal sense of aesthetics and whatever understanding they have of traditions. Discarding or altering poses and hands and musculature because they understand anatomy in the way a practicing artist does. Perhaps not as deeply, but not every artist has those years of figure drawing behind them. And, even, choosing prompts because they have some grasp of the history of art and the figures in it.

This is of course the basic Google Query problem; understanding that what you want is not the precise and technical term, but a common term -- one which may even be incorrect. You don't type "Elizabeth Tower," you type "Big Ben" to get the result you are seeking. Often in prompt crafting you know the AI will misinterpret, taking the most popular meaning of a word. It is the visual version of autocorrect gone rogue. Especially if what you are targeting is obscure, the best strategy might be to describe something similar but better known. Instead of "the gadget" (as the Trinity device was known), type "sea mine with wires" to get a similar-looking thing.

Again, this is why direct cloning of source images gets used. The AI gravitates towards the easy to understand. Dial up the amount of regeneration and your actual source image of a vintage locomotive will be warped into Thomas the Tank Engine. Which is, again, why AI can be stealing to a degree rather more than the AI fan club likes to admit.

(As a sideline -- more when I talk about the process of inpainting -- it is true that you can make an original digital painting, that is, something more akin to manual painting with traditional tools, and then hand that to the AI to add detail and gloss. But the AI doesn't see things the way we do. A fairly decent sketch is actually less effective than blobs of color. The things we do as traditional artists to sell an image are to the AI artifacts that have to be interpreted. Better to paint a blob and dial up the "denoising" to give the AI a relatively clean slate. So, in this way, AI works against the use of traditional painting skills.)

(The exception to the exception is models that are specifically trained -- or different operations such as ControlNet -- that are designed to interpret drawings or paintings. These will bridge the idea expressed in an outline to the object that this line describes to our human understanding. Without them, a drawn line isn't the external contour of an object to the AI; it is a physical thing itself, a black string hanging in space.)

(And, even, it is possibly good training for the artist in learning to think in color masses, values, and planes, and not get misled into outlines and external contours.)

So this is work, and it does take skill, and some of it is traditional art skill. It is also not debatable that AI takes "less" work. The AI artist may spend a lot of time, but that time may be hitting a button over and over again and waiting for the next generation to complete. It isn't with pen in hand, strongly engaged with every aspect of the artwork from brush stroke to composition.

This is why we draw a line between a mixer or recording engineer and the musicians. Between actors and playwrights. Between authors and editors. 

I don't think you can say that AI isn't art. But you absolutely can say that it isn't traditional art. Being able to paint, and in fact doing that painting, can be a part of the process. But they can also be omitted.

Does this have anything to say about the morality of it, though?

When you are on the social media sites that are currently flooded with the stuff (and many are actively beating back the tide) it is absolutely stealing art. But that was happening before AI, as these are sites where people are sharing their own versions, or remakes, or mark-ups and distortions, of commercial IPs. Where one person will take an image out of a movie and add some corny dialogue, and another person will like it so much they steal that person's marked-up steal, add some of their own scribbles, and post that.

The questions around the training data, along with the backlash, are the biggest reasons why AI is not being used as much for commercial work. Or when it is, the people using it attempt to hide that they have. Sightings out in the wild have included such things as illustrations in a textbook, or in a paper submitted to a scientific journal, though!

AI is tainted, now. Adobe Software is just one entity that is trying very hard to sell it, and to mollify the consumers that reacted badly to the first surge of clumsy images and the questions raised about copyright. On the latter, Adobe (and a growing number of other companies) has pledged that it is not using copyrighted work.

Okay, first, this pledge is coming from a company known for suspiciously qualified statements along the lines of "We are not training this AI engine on work that our users have uploaded to this part of the cloud."

But even then...is it still, morally, stealing art when what you are stealing is free to use? Sure, there are copyright-free and royalty-free resources being used all the time in the arts. They are usually used in a transformative fashion, if for no other reason than that a hundred other people bought that stock photo and you really should do something to make your book cover not look just like the other guy's book cover.

(Guilty. I did my own repainting of the stock images I used for the Athena Fox books, even if I then handed them off to the actual cover artist).

But...the way AI is used is a lot more like the guy that goes to the tray of free cookies and takes two handfuls of them, stuffing some in his pockets for later.

I also worry about small reference pools. If your training data is nothing but what they could get that was royalty-free and cheap, it is going to slant the nature of the data. You risk a sameness. You risk artifacts of drawing too deeply on too few sources.

There are already those structural problems within the commercial art world. Already there are external pressures to make the art look like what the market is currently saturated with. The artists are already working with tools (brushes, filters, stock photography) that are a small slice of that infinite possibility, meaning that even without those external constraints the tools themselves are pressuring the artist to do certain things in a certain way. And more so when they are in a commercial setting where the art directors and supervisors and buyers and so forth are tacitly urging them to use the tools that the company is already using.

I worry that AI tools, the data those tools are based on, and the specifics of the look, are going to spread. Basically, the visual equivalent of Autotune.

Will we forget the lessons of classical art because all we are training on (or meme-ing on) is a cliché impression of the most well-known works, or worse, crass commercialized imagery? Will we become so inured to the artifacts and flaws of AI images we cease to see them any more -- and stop trying to correct them? (Like clipping in video games. We just don't notice it any more.)

And is it driving out people who are taking the slow path, doing things by hand because they want more than the results of button-push art? As I touched on above, having a traditional approach and skills is, at the current state of the art, counter-productive. Knowing and caring what a Dutch Elm looks like is counter-productive when the AI is over-trained on other trees and is almost impossible to force away from duplicating them. So is knowing and caring about weird discontinuities and bad anatomy when, again, the very process conspires against fixing those things.

AI is very good at gloss; at those surface effects, textures, blending, lighting, that are difficult to achieve manually. It is very bad at poses and composition and logic and story. The one conspires against the other, though. You can rarely do a half-decent hand painting that does do good composition and uses the correct historical research or whatever, and then have AI add that last bit of gilding, because the mindless AI will paint the roses gold as well. In the process of adding ground fog or godrays, it will muck the hell out of the foliage.

But that's not...ethics.

Monday, November 25, 2024

The Handbag of Minerva

So every book is a learning experience. I'd never done a contemporary adventure book before, and it took a lot of experimenting before I found the direction the series seemed to want to go.

So that's a lot of choices that were more-or-less made for me as I put in what seemed to work and then found myself stuck with it. No place more than with my protagonist. She started with a checklist of "strong female character" tropes I was determined to avoid. And then accumulated a bunch of baggage I hadn't planned -- things that, again, worked for that book and ended up becoming canon.

She's being pulled in too many directions and is too unfocused as a character. I long for having a single-minded character with a clear emotional goal; "I must find the Lost Talisman of Abraham that my father spent his life searching for."

It hit me over brunch. Maybe it was a couple of BookTube videos on trends in modern fantasy, or a really long dissection of Totally Spies! by a non-French-speaking Canadian (yes, language came up frequently).

I want to write a stock 90s Urban Fantasy heroine. I want to check every one of the little boxes I so carefully unchecked when I created Penny.

A loner that hates crowds, hot as hell but always wears black jeans and leather jackets, never skirts and especially nothing "girly." Hates cheerleaders. Skilled martial artist and fighter and confident in her skills. Viciously snarky. No romantic confidence at all (or experience) but every hot guy in the story -- and there are many -- is chasing her.

Trouble is, I wouldn't be able to get through it without lampshading. 

Wednesday, November 20, 2024

Bar of Adventure

That's not actually the name of the TVTropes entry. The idea is a bar or tavern or club that provides a meeting place for the various heroes, and in many cases, is the neutral grounds where they can also meet up with the bad guys.

There is entangled with it the concept of a central spot, a hub location, for loosely connected adventures. This bar might be in the borderlands of space or in a sprawling multi-cultural trading town or a big city with a violent underbelly.

Basically, this bar is not just a place to relax, it is a place where the next adventure begins.

And boy does this tie into tiki culture, because tiki has since its invention packaged exoticism and the lure of adventure right down to the very drinks.


I am working on the first chapter of The Tiki Stars. My protagonist Rick Starr had a Mai Tai in his hands...and I stopped to look it up. Because part of the game I'm playing is that Ray's bar, out on a sandy spit and across the tidal flats from the spaceport of this colony world, is not just a Bar of Adventure where crooks and drunks, smugglers and revolutionaries are known to meet, it is also the Birth of Tiki. An imaginary version that is no less re-invented than Donn Beach's colorful past.

I was also crossing it with the idea of a pilot's bar, with the typical memorabilia (pictures from the war, a prop hanging on the wall, that sort of thing). I wasn't sure if that made a good blend with the tiki decor, which historically started as, basically, all the crap Donn had lying around that he could pin on a wall.

Yeah. That Mai Tai? Aside from the contested origins of the recipe...so there's Victor Bergeron in the mix, and Sven has a word about that, of course...it also was likely based on and certainly has a relationship to the Test Pilot, the Q.B. Cooler and the PB2Y.

Yeah. Q.B. stands for "Quiet Birdmen," a fraternity of...pilots. There is very much already a connection to pilots, and specifically, W.W. II pilots, in the early days of tiki.

***

But outside of that...

I think this is a good project for me. I am still reading up (now a great -- and angry -- book on the relationship between the nuevomexicanos and Los Alamos). And getting new ideas. But the tiki book is bringing me back to very basic writing, writing on the sentence level. Stopping at every adverb (I've got a problem with those) and with every noun of world-building, asking if this detail is really necessary, and if it is confusing.

The incredibly freeing thing is that I don't feel constrained by the real world. This is a big problem for me in the Athena Fox books; I started them in reaction to the bad history of the typical "Archaeological Adventure" and that sort of infected the process to where I didn't want to alter the menu of a (real) restaurant I wasn't even giving the name of in the text.

I'm not the only one. There's a phrase I've forgotten that gets used around writing circles for when a writer goes out of their way to explain something that the audience was, actually, quite willing to just take on faith. This one is so old, it shows up in Homer. There are several bits where Odysseus goes on laboriously about why he didn't do this other thing instead.

One suspects in that case the "imagined reader" was in fact a very-much-there-in-person heckler as the story was being told (from memory) by a story-teller. And that answer to audience objections ended up getting codified in text when Homer wrote it all down.

For the tiki book, if something in the world looks like it could raise a question with the reader...I just go and change it.