Thursday, December 4, 2025

Turducken

I don't know why I'm rushing to find another project. I'm finally into the meaty scenes of this novel. The stuff I was looking forward to doing when I started this book.

Took three days to hammer out a draft of the "Footloose" scene. That's one of those scenes that's not in the outline but comes organically out of the story. From the first moment I introduced Penny to Alamogordo and sent her to a Blake's Lotaburger, I realized there was a thing I could do there. 


It grew, until it became a scene where Penny confronts a bully like something out of an '80s movie -- a connection she, with her Media Arts degree, makes herself. Which is why the scene indirectly references Flashdance, Back to the Future, Footloose, and War Games. But I also name-drop the Marianas Trench and Manchester United. 

(You might also count in Breaking Bad, as a Blake's makes frequent appearances in that as well.)

It ain't about the name-dropping. That's just an observation. The scene is about how we see ourselves in movies, how some of the plots in movies reflect unhealthy trends in our society, and the point of it is Penny finding a way not to play out those tired old stereotypes. But it is really a side-note scene, at best the resolution of a tiny sub-plot; the real thing going on is pulling her off the path of solving the mystery no matter what, and placing her emotionally where she can take a different path at WIPP as well.


Anyhow, I've been thinking less of potpourri, and more of sequences that stack multiple elements to make something bigger than one alone. And I've been trying to figure out a simple way to describe one of these Turducken set-pieces in the Horizon Zero Dawn series, that becomes one of the more memorable sequences in the second game.

Our protagonist, Aloy, travels to the ruins of Las Vegas, now half-buried in the sands, on her quest to rebuild the terraforming system needed to bring the Earth back from disaster.


When the terraforming system was attacked, key sub-functions achieved a sort of unhappy self-awareness and fled to whatever distant surviving servers they could use as hosts. POSEIDON took refuge in Las Vegas, its arrival triggering the old desalinization plant and flooding the ruins of a grand casino that was once the center and showpiece of Vegas -- a Vegas already rescued from the desert once, through the efforts of a man named Stanley Chen, both investor and inventor of that desalinization system.

With me so far? A lot of stories might have stopped at the vista of ruined casinos overtaken by the desert sands. That's pretty spectacular already. But a big part of the adventure takes place in the transformed lower floors, now filled with water and the holographic illusions of sea life as a drowned god dreams.


In the middle of this mix is a trio of Oseram delvers, their leader driven by visions of the Vegas that once was, discovered then lost again by his father. Delvers who are also showmen, and who make the choice in the end to stay and rebuild the town once again. And all three have amusing quirks and work well off each other; these are hardly throw-away side characters.


And as you, the player, explore, completing the challenges and puzzles to return POSEIDON to the task of saving the world, you encounter recordings that outline the story of Stanley Chen: his betrayal by his own business partners, his long-shot gamble at reviving Vegas, his success, and his final lonely trip through the town he had saved and made his home as the terraforming fails and the deserts sweep in again...leaving the computer core running out of sheer nostalgia, never dreaming that it would one day save Vegas again in ways he could not have imagined.

Of course, all of this is set against the story of the Horizon Zero Dawn series, and Aloy's own personal journey. Themes and plot and story and character are all near-seamlessly interwoven with game mechanics. The very traversal mechanism (a sort of magitech SCUBA mask) that you use comes from those Oseram delvers and their personal story arc. (One reviewer used the subtitle "Gear and Clothing in Las Vegas.")

I love it when you can pull together different elements like this. I was recently trying to talk about this in a Reddit answer to a writer's question about using AI for inspiration. I am split here. Stated baldly, you could assemble one of the combinations mad-lib style. Or with a dartboard, or via ChatGPT. The tough part is joining them in writing.
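For what it's worth, the dartboard version really is trivial to build. A throwaway sketch -- the element lists here are my own stand-ins, loosely echoing the Vegas example above -- just to show how cheap the random-combination step is:

```python
import random

# Invented element lists -- stand-ins for whatever a writer might
# actually be drawing from. Joining the results in writing is the
# hard part; this is everything else.
SETTINGS = ["a drowned casino", "an abandoned missile silo", "a roadside diner"]
CHARACTERS = ["a trio of showman-delvers", "a lonely inventor", "a small-town bully"]
TOUCHSTONES = ["an '80s movie", "an apocalypse log", "a half-remembered song"]

def dartboard(rng=random):
    """Assemble one mad-lib story combination at random."""
    return (f"{rng.choice(CHARACTERS)} in {rng.choice(SETTINGS)}, "
            f"played against {rng.choice(TOUCHSTONES)}")

print(dartboard())
```

Which is exactly the point: any given three-element spin might be a winner, but nothing in the generator knows which.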

But I think it is more likely you would come up with a winning combination because you understood the kinds of connections that worked for the story you wanted to tell. So you aren't trying to force something from a limited selection into working. Instead you are open to ideas, so when you are in the middle of constructing a story or story part, you recognize the germ of a thing that could be added in as it floats by in the form of a random news story, a face in the crowd, a spilled cup of coffee. Instead of an external generator of ideas, the story itself generates them out of wisps and whispers.

I do still expect a few more of those. I recently added a sort of apocalypse log to the Atlas Missile sequence, which I hope will make that more interesting (and is also my current solution for getting both Project Pluto and the next clue for the mystery into what otherwise is a bare silo).


The beats weren't quite working. This would be a good time for Penny to make some wrong choices, to fail a little, and for once not be so analytical about things. 

So Wargames is out, but Kubrick is in. Not the one you are thinking of, either. This isn't a line quote but a cinematographic quote; the Kubrick Stare.

Monday, December 1, 2025

Beta reader


ProWritingAid has really been pushing their AI beta reader. Many, many emails and popups and so forth on the sale of the credits necessary to run the thing. With the subscription I already have comes a small number of "free" credits, so before succumbing to the "now 35% off the 50% off of the special sale price!" Black Friday/Cyber Monday stuff, I figured I should really, really try the thing out.

I did.

It is AI.

Okay, it took a while to figure out how to activate it at all. All of the buttons went to the sale page, not to the "run this thing on my existing credits" page. And full compatibility with Scrivener requires installing ProWritingAid as the always-on grammar checker and wanna-be Clippy for every single bit of text you type on your Mac. Not something I wanted.

But I was able to finally find it on the web version of the software and threw an opening chapter at it.

On the positive side, it seemed to grasp what it was reading. And possibly answered the biggest question I have about my writing (the one, probably, that every writer has): does the reader understand what I'm saying?

Possibly, because this is AI. Which is to say, all it knows is that my text has the same text-shaped objects that it has seen in the other text-shaped objects it has been shown. Which might have been examples of good writing from good writers, or fanfic dredged from wherever it could get it.

The fact that it seemed to understand the three character names as belonging to, well, characters, is a trick that ELIZA was capable of. And that program can be emulated in a few dozen lines of BASIC.
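(A sketch of that trick, in Python rather than period BASIC -- not the original ELIZA script set, just the bare name-spotting reflex. Note that it requires zero understanding of the sentence around the name:)

```python
import re

def eliza_reflect(text, known_names):
    """Minimal ELIZA-style move: spot a known name in the input and
    reflect it back, understanding nothing else about the sentence."""
    for name in known_names:
        if re.search(rf"\b{re.escape(name)}\b", text):
            return f"Tell me more about {name}."
    # The classic content-free fallback.
    return "Go on."

print(eliza_reflect("Penny argued with Mary at the diner.",
                    ["Penny", "Mary", "Edward"]))
# -> Tell me more about Penny.
```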

On the downside, it praised the sample for having a fun and engaging narrative voice, and for weaving the modern-day setting with historical information. But, shit, that's what I was trying to do. So I made text-shaped objects that my meat-brain thought looked like the text-shaped objects made by writers who could actually pull those things off. I borrowed ways of saying things that I'd seen other writers use. 

So the silicon-brain agreeing that I'd accomplished my goals is really it saying: yes, I'd borrowed things from other writers it had seen. What does it bring to the table that allows it to tell if I pulled it off? In what way does it replicate the experience of a human beta reader?

(The pic above is of course the human, Beta; clone-sister to Aloy of the Horizon Zero Dawn franchise.)

The effusiveness of the critique gave it the flavor of friends-and-family feedback. That is, praise you can't trust, especially as it is so content-empty. As with all things, the most trustworthy things the AI spat out were the few small criticisms it was willing to risk.

Yes, I am very suspicious. LLMs are being trained both evolutionarily and programmatically to coddle the users. An AI that criticizes and corrects is going to be less popular and in the end sell fewer copies. And judging from all the sale emails, ProWritingAid really wants to sell some copies. So a tool from them that praises my writing is a tool I cannot trust.

Even that, I could work from. Except for the so-very-typical empty AI phrasing. The "many writers have agreed that this may be a better way to phrase..." stuff that ends up saying almost nothing, but wrapped in language that does its best to hide the lack of anything inside.

The other tools of ProWritingAid are more useful. Sure, it is wrong a significant amount of the time, but it is absolutely clear about which words it doesn't like and why it thinks those words are wrong. So you can work with it, looking into everything it flags and checking, yourself, to see if it found something that should be corrected.

The same sample I threw at the AI engine was automatically sent through the checks for spelling and passive voice and so on. And it found things I would fix. But on the gripping hand...that chapter had already been through ProWritingAid. A couple of years back, but...it missed the stuff then. So what is it missing today?

Saturday, November 29, 2025

Metered

I got into the usual back and forth about the opening of the next scene. Couldn't resist an Eagles reference but decided to save most of the desert highway (dark or not) for the trip to WIPP.

I'd hit the references and thought I'd picked out a decent Geiger counter for her. I didn't remember the specifics, but I remembered the trail I took, and pretty soon one model was leaping to the fore. Okay, good, but what did she learn about such things during her (as yet unwritten) museum trip? Did I have pictures of that cute little interactive exhibit?

I did. And what was in it?


Yeah, that's clearly the same CD-V 700. Cool!

I picked it because it is still a good tool, if basic and lacking any modern logging capabilities. But also made of metal, painted in John Deere yellow, issued to Civil Defense and even has the triangle on one side. And we're still not done. The dial is marked for mr/hr (yeah, not even mR). And despite the manual claiming strontium, what is under the nameplate as a test source for most of them is depleted uranium.

***

So that was a tech success. Also this weekend, I finally found out why my gaming machine was crashing randomly. It wasn't Steam Cloud, it wasn't a bad thumb drive, it wasn't bad RAM, there weren't any bad sectors on the SSD and it wasn't even the old SSD transferred from the other machine.

It was the two-month-old Samsung M.2 that was randomly disconnecting. I can't entirely blame Samsung for not fixing it; it is a rare issue, at least judging from the reviews. They could just put in a tiny subroutine that detects the drive disconnecting and save those users all that pain, though...

...especially since when I'd finally figured it out (due to the boot being corrupt and the repair tools refusing to fix or reinstall Windows), it was too late to get my files off before it disconnected for good.

Yeah, don't talk to me about cloning the system. I can reinstall Windows fast enough. It is getting Python and the Nvidia drivers and everything else back in order that's the pain. Oh, and over a terabyte of files that I didn't have anywhere big enough to back up.

Well, Steam is mostly back up. Maybe just as well my AI pipeline is gone.

Sigh. You'd think, four days off, I could get some stuff done. Clean house, take a walk, write more than a short driving scene. It is already Saturday evening and I've just got Penny standing in the dirt looking at a suspicious drum and wishing archaeology covered more nuclear physics than how radiocarbon dating works in practice.

I am dreading the end of the year. A short vacation for us; we take off Christmas Day and return the first full week of January. Seems like a lot of time, and when we get back everyone will be expecting us to be fully rested up and ready to work the next three months without a single day off (have to, since we have to burn our own vacation days for these mandatory days off).

Really, it is just over a week, and it will pass in an instant.

Sunday, November 23, 2025

Hatches

The "Tewa taco" scene is finally done. And it ended up overstuffed like a Mission burrito.

(For those who don't know, the American burrito is considerably larger than its Mexican ancestors. In SF's Mission District, they came up with a way to steam the burrito in order to fit even more inside it.)

There were a lot of constraints going on there. I did want to do more of the thing I just did above; the processual view of things that I've had Penny doing since the first book (if only to underline that she is an archaeologist). So spotting, say, a connection between the red soil around Santa Clara (many houses are painted that same color) and the famed Santa Clara blackware. 

I also wanted to keep this scene low-key, without conflicts -- and that meant without uncomfortable questions -- because the theme here is a moment of peace. And on top of all of that, I find it harder writing about Native Americans than I did shamelessly exoticizing the French. That means a lot of the obvious things like "how do you say that in your language" are things I don't feel right including.

(Yet for some reason, I get three words in Navajo. And none in Tewa, which I had intended for my focus. In part this is because the character we get closest to is Mary Cartwright, who is Tewa but doesn't want to share. The Navajo man, Edward, we got those three words from is someone they both encounter as an outsider.)


The process of getting through this whole sequence went long enough that I forgot the details of my outline. Besides, discovery writing. Some stuff doesn't feel right now that I've looked at it longer. But I've more or less got it figured out now and I am ready for the Sheep Ranch scene and the Atlas crawl.

The latter went a strange direction. I'm going for a Fallout sort of vibe now, and Penny is discovering a series of short poem-like writings scattered about the place by a prior urban explorer, notes I'm calling an "apocalypse log" after the thing that's in so many games.

But I am seriously considering opening up a C emulator or something and writing a bare-bones text generator for these. I want them to be poetic and I've had my (recent) fill with poetry. I want it to be mad and mad is hard to write. And I just thought of this and I've got so much to write already to finish this one anyhow...
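If I do talk myself into it, the bare-bones version is maybe twenty lines -- sketched here in Python rather than anything needing an emulator, with placeholder word lists where the silo-explorer's actual vocabulary would go:

```python
import random

# Placeholder vocabulary -- the real lists would come out of the
# silo's own imagery, not these stand-ins.
NOUNS = ["the klaxon", "the blast door", "my shadow", "the countdown"]
VERBS = ["forgets", "hums to", "outlives", "waits for"]
FRAGMENTS = ["down here", "after the light", "in the wet concrete", "still"]

TEMPLATES = [
    "{noun} {verb} {noun2}",
    "{frag}, {noun} {verb} me",
    "{noun} / {frag} / {frag2}",
]

def apocalypse_line(rng=random):
    """One line of would-be-mad verse: random fragments jammed into a
    random template. Madness by juxtaposition, not by craft."""
    t = rng.choice(TEMPLATES)
    return t.format(noun=rng.choice(NOUNS), noun2=rng.choice(NOUNS),
                    verb=rng.choice(VERBS),
                    frag=rng.choice(FRAGMENTS), frag2=rng.choice(FRAGMENTS))

for _ in range(3):
    print(apocalypse_line())
```

Madness by random juxtaposition is the easy part; whether the output reads as poetic or just as word salad is the real question, and no generator answers it for you.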




So I don't have a next book yet. At this point, I usually have another Athena Fox I already want to write. I do have three, rather more fantastical, projects in the wings...but I was thinking today how they are not so much a problem of theme as one of philosophy.

The philosophical thing that's beneath my Steampunk Venus story (well, almost everything is beneath them, being as you can't land on the surface of Venus and live)...anyhow, that's about the inertia of systems.

Yeah, sounds thrilling.

Basically, that politics and culture, technology and government and industry, all of these things are complex structures because they need to be. You can make a stone axe with two stones. You can't make a turbojet without factories making the parts you build the factories out of that make the parts. At least that many layers deep, and probably more.

A city-state with its necessary physical infrastructure (um, floating above the acid clouds, anyone?) and the relationships with other polities is going to be big, and have a lot of interconnected systems, and have a lot of history and a lot of cruft.

And this is set against the backdrop of a Venus that someone started terraforming. Or something. The planet is changing, and ecosystems, especially when they interconnect with some crazy high-temperature chemistry, get really, really complex and potentially chaotic.

There are villains, because people are people and some of them are gonna angle for "what's good for me" regardless of the cost to others. But mostly there is inertia, the inertia of past choices and present command structures and fragile economics and the big-ass problem of changing the spark plugs while the damned engine is running. 

The tension of the story is whether humans and their systems can move faster than the changing environment of Venus.

 


Which, actually, said the way I just did, makes it seem interesting. Regardless, I am thinking a lot more about the new idea, my engineer-hero space opera. To sum up the theme of that book, it's post-processual. Err, that is, it is about how structural understandings are a powerful tool -- as long as one understands their limitations.

So there's a lot of that structural understanding going on. Some of it weaponized. But, and especially once some of it has been demonstrated, it gets abused by people who want to take the process without the caveats, or worse, take the results without the process at all. 

That's the Watsonian. On the Doylist side, this is plot driven by one of the older underlying conceits of science fiction as a form: start with a question, then consider the implications.

Not exactly new. In Asimov's The Caves of Steel every clue is rooted in something about that environment and the implications thereof. In his robot stories, each story is the working out of possible implications of the Three Laws of Robotics. In Niven, plot points come out of his behaviorist view of his alien species. Oh, wait -- that Kzin wasn't smiling.

The purest form of this, right at the bread-and-butter level, are those gadgeteer sorts of books. Sometimes the characters invented something, sometimes it is something alien they are working out the possibilities of. But there is fast-paced, mad-scale development of these ideas in real time, and that forms the backbone of the plot.

And yeah, that goes back to the Edisonades. 

And that's what I've been wanting to see in an engineer hero. But even when the book was co-written with Mr. James Doohan himself, it tends to stick with conventional fisticuffs, and when tech is encountered, someone "techs the tech."

(That was what they actually advised outside writers to do on The Next Generation. Just write that, with the planet about to explode, Geordi techs the tech and the Enterprise is saved. The regular staff writers will put the right technobabble in there. Which certainly works for story purposes but only underlines that the science and engineering is never the point.)

Doing a book which wears that Edisonade history on its sleeve, in which the characters actively talk about technical debt and design-for-manufacture, means it is an active part of the ideas being explored when your engineer-hero crawls into a duct to cross-wire a critical circuit. Not just some fine work by Matt Jefferies.


This sort of thing -- usually found in harder SF -- is akin to the mystery form that chooses to play fair with the reader. I am reliably informed, however, that even in Ellery Queen's this style is not the most popular. Playing fair, in this context, means the solution is in the clues that have been provided equally to detective and reader. The reader can, and sometimes does, guess the solution before the detective does.

Really, though, there is much to be said on having what is at play be character and emotion. Us monkeys want to watch other monkeys dance, after all. Not stand in a blank room solving math puzzles. The point being, by both using an engineer hero of a particular mindset, and by making that sort of thinking about design goals and understandable compromises and the implications thereof an explicit theme and in-universe plot points, the process of solving those particular math problems becomes a thing our protagonist does and the reader (hopefully) enjoys following along.

That all puts me in mind of Holmes. Reading or watching him work now, it feels like he pulls it all out of thin air. He might as well be asking the Bat Computer for the answer.

But back when they were written, the often extremely structured lives of the era -- a class society with clearly defined trades and roles, with the esoterica of those trades rather less esoteric than the nature of a transducer-test technician is to us today -- meant his guesses were more believable. Some of it was the satisfyingly comfortable stereotypes, so his deductions felt emotionally right. Some of it was trivia: one could find someone who actually was a printer and confirm that, more-or-less, that was how it worked.

Maybe this is why back with Doyle, or with Christie, we could have these very structured mysteries, these locked-room murders and so forth. And why we've gravitated towards character instead, to the point where it perhaps doesn't even matter that the clue or the method of acquiring it is nonsense.


But alas, science has also increased in complexity and detail and thus in distance from the reader. You could present a clever bit with throwing a rock around an asteroid back when we were mostly in a Newtonian universe. Not saying it can't be done today -- but the clever things Mark Watney did took a lot longer to explain. The Martian spent a lot less time solving chemistry problems, and a lot more time going "Ahhr!"

Which sounds like I am talking myself out of Ensign Blue. Not necessarily. Maybe this stuff is too nerdy and the readership isn't there for as much of it as I want to put in. But the fact that I can do it attracts me a lot more than the rather more hidden themes lurking behind those sulfuric-acid clouds.

Which brings me to mixed drinks. There's some potentially fun stuff in The Tiki Stars. Colonialism, exoticism, the commercialization of leisure. The uneasy balancing act in which "cultural appropriation" is but one slice and one label. That and a sort of fable about the birth of Tiki culture, taking the existing mythologies and re-mythologizing them in a different setting.

There's also a writerly question, about how much you can do and have fun with old-school pulp in this modern age.

But this is definitely the lesser of this trio, when it comes to having an interesting philosophy to work off of. There is, in plainer and simpler words, a lot less to say (or at least less that I am interested in saying).



Thursday, November 20, 2025

Frybread

I'm calling it the "frybread" scene now. They are making Tewa Tacos. I did some more reading up and watched a couple videos so if it felt like I wanted to do some kind of "sounds of cooking came from the kitchen" I'd have some idea of what was going on there.

Well, now I do.

I'd taken a sick day but just before the grocery store closed I woke up with a need to actually cook something. Ran out and threw the basics for Indonesian hot rice in the basket. And then said what the hell and added a bag of flour and a box of baking soda.

No recipe, no measurements, just memories of what I'd seen on YouTube and mild experience with dough in the past (making pasta from scratch. I'm glad I did it, feels nice to know how, don't really want to do it again).

And the frybread came out...okay. Could have been fluffier, and crisper, but it was tasty. 

The other bit of research that cropped up during this scene is I wanted to show family and relationships. The opening there came by accident; I opened the scene with Helen Naranjo on the couch and wrote in another woman just to open the door for Penny. So now I've got three people and there's a chance to do genealogy stuff I've watched my own family do. The "You remember Sarah, don't you? She's the sister of the mother of that friend of my daughter's from school. Well she just moved in with Peter, a man from Boston who goes to the same poetry readings as my friend Lana."

For many Native American nations, this is clan stuff. The whole "Born to Bright Water Clan, born from Beaver Clan." Except. Turns out the clan names (and associated things) are considered private in Tewa culture. They are not shared with outsiders.

My research method failed on this whole sequence. I'd read about a third of The Tewa World (there's actually two books by that name) but by the time I did the scene where Mary talks a little about it, I'd forgotten practically everything. And because of other story reasons, my Tewa character ends up spending most of that conversation talking about Dine mythology instead!

(And, yes, there was a hole in the narrative that Mutton Man fit into perfectly. So I did get to use him after all.)

And apropos of nothing, I'm oddly attracted to San Antonio for the next Athena Fox story. No idea of a plot. I just like the history, and their extensive underground world. 

Anyhow, even with adding this "sister of the mother of the friend of the" stuff, I'm within a half-dozen paragraphs of finishing the Rez sequence. This is the tipping point; the book accelerates from here, and for various reasons both internal to the text and having to do with my working methods, it should go much faster after this.

There's a lot of stuff to go, though. A "Glowing Sea" sequence as Penny pokes through an illegal dump site with a borrowed Geiger counter in one hand, the "Duel" sequence on the highway that ends with Penny being very glad she got collision coverage on her rental, a visit to the War Zone -- sorry, "International District" -- of Albuquerque, the descent into the pit and a few bad moments at the very bottom of an old Atlas-F silo, a confrontation with a bully at a '50s diner that goes strange, a trip to the Waste Isolation Pilot Plant, a doped-out conversation about nuke cats and the heat death of the universe, another "Hello, Clarice" scene, a confrontation with the senior archaeologists and digging up a grave for the third time, a confrontation with a mysterious assailant in the trailer of a dead conspiracy nut with a convenient weapon-lined wall of everything the aspiring mall ninja would want, and then the long, long trek through White Sands on foot after a horse that probably does have a name, dreaming of Lozen and Egtved Girl and even Lucy...

Finishing with a little rock-meets-laser as Penny shows what she's learned about knapping flint at the very base of the Trinity monument. And the final scene with Jackson and Sanchez, when at last they explain what part of the Air Force they work for. Although I don't think Jackson will ever explain what inspired him to buy a Hummer.

Tuesday, November 18, 2025

Structural Elements

 

I am almost through the "Rez" chapter. Just finishing up the conversation between Penny and the aunt whom Mary Cartwright chooses to spend time with. There's a lot about Mary that I'm leaving for the reader to unpack. How much of what she has said and done in other chapters came out of her interactions with her extended family, life in Santa Clara and Albuquerque, even the way this favorite aunt used to work "on the hill" (aka Los Alamos).

In archaeology, they call this processual (there was a lot more to unpack in this movement, also called the New Archaeology, and why we are somewhere in post-post-processual today). In SF, we call it part of the fun; Gernsback's "What happens next?" and so on. And it is absolutely all over the Athena Fox books.

How did this thing come to be? What forces shaped this thing? What are the implications of this thing? You can't stroll through modern Paris and not have the very path you take be shaped by the decisions taken by Baron Haussmann, working out of the philosophies of his time -- and the desires of his Emperor Napoleon III.

This is why I keep coming back to the Athena Fox series. And why I have so much trouble with the tiki book. Because the latter is just conceit. It is surface texture. There are interesting things I'd like to unpack and explore about the nature of tiki; about exoticism, appropriation, the commercialization of leisure. But it doesn't really have those meaty questions of "why."

And that's why I keep taking notes about Ensign Blue. (Working name for the file folder, I will have you know! Not an actual title or character name.)

Which came, mind you, out of experiments I was doing with WAN2.2 et al towards telling a story in animation via AI. Which is a fool's quest but that's another story. So; those same questions you ask of "Why didn't the Maya use the wheel?" or "why are barns red?" are baked into this project from the start. "Blue" is because the renders I used as a starting point had a character in a blue uniform. Okay?

And in longer twisty paths I don't feel like spending the time going over, there are things about the various cultures and their interactions which came out of those WAN experiments. Things about them that became ways of thinking about them, and the kind of exploration I've been talking about.

A theme, even; behavioral determinism with its insights, its process, and its limitations. (Which, to any student of the history of archaeology, is familiar ground; the way the understanding of cultures was shaped by the systems of thought of archaeologists themselves, which themselves came out of the same processual -- and other! -- forces.)

(Something which was way back in the first Athena Fox book. Our way of looking at classical cultures is shaped too much by our history with the Classics. The Greeks and Romans wrote. A lot. And via Rome -- and to a lesser extent the Greek Orthodox strains of Christianity -- western culture preserved the ability to read Latin and Greek. Which circled around and became a status symbol -- something that started back when the clergy were, essentially, the lettered class -- and that meant privileging of a particular way of viewing the classical cultures that also not-coincidentally privileged those doing so.)

(Or so goes the gloss. It would take a very long essay to unpack that one even slightly more than that.)

But to boil it down, Ensign Blue gives me an excuse to play Jared Diamond/Larry Niven games with biological and environmental determinism, while at the same time making pointed commentary when the facile "The Hrunt are descended from herd animals; they will never go to war" gets shown wrong on the pointy end of an incoming armada of Hrunt warships.

And all the fractal way down; Blue is an engineer, a ship's engineer, and there is always a world of "why the hell did they design it this way!" that can, at times, be ways that make it difficult to maintain, ways that make it prone to break under certain conditions, ways it can be repurposed, and ways it can be hacked.

Like, well, vents. You can call them Jefferies Tubes if you like, but at the bread-and-butter level, there's a way that your clever characters can get around the boarding party.

One thing, though. (Well, there's a lot more than one...) Having this sort of underlying structural rationale that can be leveraged to generate "what kind of ships do they have on this planet" answers, or exploited to explain how the good guys (or the bad guys, who are allowed to be clever, too) manage to get some plot-necessary thing to happen, means it would all work better if some of this world was planned before I started writing.

Yeah. I actually have a structural and even thematic reason to want to embrace world-builder's disease.


(That, and I'm plotting backwards. Well, plotting is always a dialogue, but I've had a lot of experience lately with having to build the plot around what is actually on the ground. My concept for this book, it works just fine if I already have the map and the tech and I try to figure out stories that work with that groundwork.)

Wednesday, November 12, 2025

Non-Linear Narrative

Actually, non-linear plotting. Maybe that's why the simple-concept, easy-to-write books are taking so dang long in reality. 

Sigh. I set out to give this one a clearer plot progression. Distinct breadcrumbs; each plot point was something specific Penny needed to know, and took a specific measure to learn. And there would be a defined moment when she learns the thing.

Also, the plot would change course. Not just the direction of the next question, but the shape of the world. Largely, that has turned into different environments the plot takes her through. Where I am in the story right now, it took her into the reservations. So this is the desert level, right after the underwater level that everyone hates.

(Huh. I just realized that in Horizon: Forbidden West the underwater level is in the desert level!)


I knocked down a lot of the reaction, by the military, law enforcement, mystery men in black trucks, etc. I always seem to cut back from what I imagined in my outlines, to where there's less action, fewer bad guys, less intense emotions and all-in-all a more restrained (ahem; "realistic") scene.

So the world changing is mostly that Penny gets kicked off the dig. And that's as much a change in environment as it is the world changing in response to her efforts. 

And here I am again, sending her on a quest after a specific question (who are these guys in a truck following me around) and discovering, Dirk Gently style, random ideas that eventually synthesize into a new realization.

Okay. Heading into the climax of Horizon Zero Dawn, Aloy puts together that HADES inspired the Shadow Carja to attack Meridian so HADES could get access to the MINERVA array and achieve its own goal (aka, wake up the ancient war machines and wipe Earth clean of all life). This is something Aloy puts together not because of specific clues, but through a gestalt of understanding how Zero Dawn worked, what HADES was programmed to do, what role MINERVA had in the project, and so on.

This despite the fact that, on a day-to-day, mission-to-mission level, Aloy is, "go there, talk to this guy, figure out how to climb to a place, beat up the machines there, return to the guy, fight totally expected boss-fight machine that shows up for no reason, collect XP and a new bow."

Because that's what I was thinking of in terms of linear plot. Bad guys have the McGuffin. Chase them. Guys with guns get in the way. Shoot them. Lather, rinse, repeat.

I'm just talking myself into believing the next book won't be as hard.


Friday, October 31, 2025

Poor man's outpainting

So one of the big uses of AI as a tool is filling in blanks in an image (say, if you did a Stalin on it), or extending the image.

(Oddly enough, one of the most famous images of the space program is extended. Neil cut off the top of Buzz's head, but since the surrounding negative was black anyhow, they re-cropped it. And straightened it, too.)

So there's a stupid trick you can do with AI video generators; command a change of pose or camera orbit and let the AI interpolate the new image in three dimensions. With work, you can get what (the AI thinks) your model looks like when seen from a different angle. Pretty much, you can turn around the guy in the photograph. It's just that the AI will make up a new face on the fly.

In practical terms, it will probably require some rework. But it is a fast-and-dirty way to get a different starter pose or camera angle on the same basic set/model/composition.

***

As I posted a bit back, I think the limitation on long renders is not actually a problem. Well, there are shots where you want to do a long tracking shot or a walk-and-talk. And there are formats like the talking-heads interview or podcast where the camera setup remains the same for minutes at a time. But especially if you are trying to tell a story, intercuts not only do no harm, they may even be necessary.

But back to that longer shot. After all, depending on which models you are using, how strong the prompting is, whether you have useful LoRAs, etc., the image can lose cohesion in as little as three seconds. Especially if that outpainting effect kicks in: the camera turns past the previously seen setting, and the AI puts elements that don't fit your vision into those previously un-imaged locations.

In general, the video models are strongly biased towards taking the pixel patterns they see and mapping them to motions that are in their training data. It is a lot like the interpolations img2img uses all the time, except the idea of time/animation progression is added into the mix as a strong constraint.

Unfortunately, the AI really can't separate character moves from camera moves, and it is almost impossible to lock the camera. That active, steadicam-or-handheld camera language is baked into the models. It's the usual figure/ground, map/territory problem with AI. They don't know what a forest is, or what trees are. They get there by the fact that most forests have trees, and many trees are in forests.

So I've been messing around with extended videos.

The simplest solution is a cutaway, or change of angle or subject. I rendered a separate set of insets I could switch to whenever I needed to cover a break or change.

These still require observing the 180 rule, and preferably preserving screen direction as well. The latter is particularly important when cutting between related views. If the vehicle was moving right to left, even if you are cutting to a steering wheel, preserve that right-to-left. It makes the cut much smoother.

***

After that there is daisy-chaining. Especially since you can cut in and out using different angles and insets, you can go a pretty arbitrary length while maintaining the model. Keeping clarity on the set is a different matter and I don't have solutions to that yet.

i2v is the workhorse. This takes a starter image which is on-model, and animates from there. At some point it will diverge enough to become objectionable. In any case, the last-frame-extract node is great here; it pulls out a png of the last frame before compiling the video. (You can also pull the entire frame stream and sift through it.)

Why? Because you can take the last image, clean it up, and run a new i2v on that. Or you can do an arbitrary "generation" animation to get a different starting point, pull an image off that, and clean that up.
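That loop -- render, extract the last frame, clean it up, re-seed the next render -- can be sketched in plain Python. This is only a sketch of the control flow; `render_i2v` and `clean_up` are stand-ins for the real ComfyUI i2v workflow and the manual clean-up pass, and the "frames" here are just placeholder strings.

```python
# A minimal sketch of the daisy-chain loop: each i2v render is seeded
# with the cleaned-up last frame of the previous clip. render_i2v and
# clean_up are hypothetical stubs for the real workflow steps.

def render_i2v(start_frame, prompt, length=3):
    """Stub for an i2v render: returns a short clip that begins
    on the starter image and drifts from there."""
    return [f"{start_frame}|{prompt}|frame{i}" for i in range(length)]

def clean_up(frame):
    """Stub for bringing the extracted last frame back on-model
    (inpainting, repainting, upscaling, etc.)."""
    return frame + "|cleaned"

def daisy_chain(start_frame, prompts):
    """Chain several i2v renders, one prompt (one idea) per clip."""
    clips = []
    seed = start_frame
    for prompt in prompts:
        clip = render_i2v(seed, prompt)
        clips.append(clip)
        seed = clean_up(clip[-1])   # last-frame extract + clean-up
    return clips

clips = daisy_chain("hero_on_model.png", ["he draws his sword", "he charges"])
```

The point of the stub is the re-seeding step: the chain can run to arbitrary length because every clip starts from a frame that has been pulled back on-model, rather than inheriting the accumulated drift.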

f2l has some advantage here. It is especially good for generating a join. You take, as the first image, the saved last frame of the first animation. Then you take, as the last image for the f2l, the starter image of the clip that will follow.

The AI will do some weird things getting from A to B, though. As with all things AI, it sees things in a different way. We didn't notice a subtle change in the background because we were watching the action. The AI did, and has the martial artists suddenly engage in a little moonwalk to get to where it can join up with the background in the final image.

Best one I've had yet: I had done a long daisy-chain, and texture and LoRA burn-in had made the back wall look like a set from Beckett. The AI had the answer; a dozen frames before the end of the splicing clip, it had buckets of mud appear out of the air and throw themselves at the wall.

The odd one out here is s2v. I love sound-to-video because the presence of voice and sound effects makes the AI generate action. As with all things visual AI, it defaults towards static posing. "Model stands looking vaguely at the camera" is what you get so often even when you fill the prompt with action verbs.

I haven't learned that much about controlling the sound. A few experiments show that it is slightly better than Prisoner Zero at figuring out which mouth to work when given multi-character dialogue. I haven't tried it out yet on multiple musical instruments. It does seem to react to emotional content, though. Where it is extracting physical motion from, I don't know.

The other oddity of s2v is it allows use of an extension node that passes the latent on to the next node in the chain. It can get out to thirty seconds before the image degradation becomes too objectionable to continue on.

So what is this about "clean up"?

Yeah, this is what many people are doing now, at least according to the subreddit. Unsurprisingly, everyone wants to let the AI do the work, or at least automate it. So throw it into a Qwen node at low denoise, possibly within the same workflow.

I'm cheating right now in that I haven't finished learning how to make a character LoRA. So instead of being able to plug-and-play, I drop the image back into AUTOMATIC1111, and I flip back and forth between several different models, employing various LoRAs and changing the prompt to focus in on problems.

And, yes, not just inpainting, but looping through an external paint application to address problem areas more directly.

It is a bit more effort than I strictly need to address image degradation and get a clean, high-quality starter image that stays on model, but it also means a frame from an animation that produced a new view or state can be repainted, manipulated, inpainted, and otherwise brought on-model.

This stuff does mean I have a whole scatter of files, for which I have no consistent naming scheme. And it can be a pain searching through clips to find the one that actually bridges two other clips properly. But it all sort of works.

Now I want to explore more interesting story beats. Something to do with fixing spaceships.

Thursday, October 23, 2025

More Little Buildings


I needed a break. And I have a new video card. So what is more natural than a bit of games?

(It beats more messing with AI.)

I've been doing a Connecticut Yankee run on Satisfactory. This is starting with the entire tech tree unlocked (and yes, that includes alternate recipes and Ficsit Shop). I also abused the "allow flying" advanced setting to collect a whole bunch of slugs and sloops.

The reason being, I just wanted to build shit, and I saw no reason to build giant-ass factories if I could overclock everything and use alt recipes to further reduce the footprints.


I didn't have a big plan going in. Mostly just wanted to enjoy that Robinson Crusoe vibe (well, more like Verne's The Mysterious Island, which is a Victorian Robinsonade on steroids). Even with the tech tree unlocked beforehand, the old Satisfactory Red Queen's Race continues; I'd started a huge coal plant to secure my power needs, but by the time I'd scouted coal and built up my industry to the point where I could build it, I'd already discovered oil and had the tech base to exploit that.

That's the turbo-fuel plant on the left there, with the generator towers behind it. In the center, having fun with "what can you make with petroleum coke" alternate recipes. Including the stripped-down aluminium process in that work-in-progress building behind it.


Another thing I wanted to do is avoid the plop-factory look, with buildings just sitting out alone in the middle of nowhere, and things like miners or extractors sitting on top of nodes on their steel skids, lifting eyes visible. So I blueprinted a couple of ad-hoc buildings to hide the latter.

And added various support buildings (which mostly do nothing) and bits of roadway, container yards, plus extended the pads under the buildings to make it look like it actually got built and continues to be supported, not just magicked into existence.


There is one bit of logic here; cleaning up the power lines makes it a lot easier to figure out which way they go (and what you can afford to cut or change when the inevitable upgrades ensue).






 

Thursday, October 16, 2025

Ideas are Easy

There are two stories appearing this week on the BBC's feed. One: someone's been photographing the hyenas that live in an abandoned diamond-mine town in Namibia. The other is about a different mine in a completely different country that the Nazis were using to hide looted artwork.

Yeah, that's a central image right there. A ghost town haunted by memories of old wars and potentially dangerous current wildlife, and somewhere below, in a maze of shafts barely propped up by rotting wood, are priceless art masterpieces.

I've said this before. Writing an archaeological adventure series? The BBC has a plot germ at least once a week.



EDIT: and it continues! Not the BBC this time, but while researching a nice "ominous black car" for the current book and reading about Hummers, I discovered the Northwest Passage Drive Expeditions, including brightly-colored humvees refitted with tracks to use as a test bed for potential Mars rovers. Which is amusing enough already, but connects to the Haughton-Mars Project and their sort-of Mars base on an ancient impact crater on Devon Island, which also has a nice history of failed colonialist attempts, 5,000-year-old archaic Inuit settlements, and the first place where traces of HMS Terror were found (the Franklin Expedition).


And just the topping on this "hoo boy" cake, the one above is named the Okarian. After what, you might ask? The Okar nation of the frozen north of Barsoom.

Sunday, October 12, 2025

Blue

Another plot bunny visited. This one is blue.

 

(From fanpop.com)

The first novel I finished grew out of writing an article for a gaming magazine. This one is a little more straight-forward; I was thinking about concepts as a way to grasp the cinematographic challenges of AI.

In that mix are thoughts I've had about wanting to do one of those Hornblower-esque career space navy things, about wanting to do an engineer (who is more of a hacker), and about how to help Penny survive those situations where being Mistress of Waif-Fu would really, really come in handy.

And I ended up with a nice demonstration of how one idea can snowball into elaborate world-building...if you follow the potential implications. Start with a "heuristic implant." We're not going to get into the specifics of the tech here. Practically speaking, the young would-be soldier is sat under a helmet for a few hours, and when they get up:


Except it's experimental, very think-tank, with all the McNamara that promises. It is a whole set of combat skills that are deep-level muscle memory and fire basically automatically under the right stimuli. That right there is a whole host of problems. Bad enough your hands need to be licensed as deadly weapons -- now they are self-driving.

So that's a great character flaw, this killer instinct that could fire at the wrong moment, but at the same time, something that could help them through a sticky situation. Obviously (obviously, that is, when looked at from the needs of story, not through empty speculation on fantasy technology), they start meditating to at least control when it happens. And their relationship with this phantom driver...evolves.

(Yeah...Diadem from the Stars, The Stars My Destination, The Last Airbender -- not like this is exactly new ground.)

And that suggests an evolving situation, where a new kind of low-intensity warfare is challenging a military that is organized entirely around capital-ship combat. Of course that's a tired truism; "always fighting the last war."

But that leads to wondering if this is really a navy at all. Or more like United Fruit Company, a massive corporate mercantile thing that works by rote and training and regulation, has been largely getting by with having a huge industrial base and leading-edge tech, but has expanded into a sector of space where the rules are a little different.

And now we've got multiple parties in the mix; old-school frontier traders who have the wisdom of experience, a cadre of experienced officers who want to create an actual military with a good esprit de corps (not the same thing as free donut day at the office and "work smarter" posters in the cubicles), the friction between what is becoming an actual navy versus what is more like a merchant marine, the fresh-from-university theorists who pay far too much attention to how management thinks the world works (or should) and want quick-fix technological solutions over expensive training...

...and Ensign Blue in the middle of it, a trial run of one of the crazier outliers of the "super-soldier" package that various hard-liners have convinced themselves is the best way to win low-intensity conflict in an extremely politicized environment, a ship's engineer for a merchant ship who has no business at all getting hauled off to do dangerous missions on contested planets.

Oh, yeah. And engineer? Blue is the kind of engineer I've been seeing a lot of recently. Can (and often will) science the hell out of something, making the most amazing calculations. Then can't resist trying out an idea with ham-handed duct-tape and rat's-nest wiring that too often breaks (and sometimes catches fire).

Where does this go in character and career progression? Is there some third party, some out-of-context threat lurking behind what are still seen as basically raiding parties on the company's mining outposts? Are the rules about to go through another paradigm shift, tipping everyone from new enemies to unexpected allies into a brutal war?

Yeah, I got other books to write.

Tuesday, October 7, 2025

Anaconda

My go-to ComfyUI workflow now has more spaghetti than my most recent factory.


 (Not mine; some guy on Reddit.)

The VRAM crunch for long videos seems to rest primarily in the KSampler. There's an s2v workflow in the templates of a standard ComfyUI install that uses a tricky little module that picks up the latent and renders another chunk of video, for all of them to be stitched together at the end. With that thing, the major VRAM crunch is size of the image.

Of course there's still the decoherence issue. I've been running 40-second tests to see how badly the image decomposes over that many frames. Also found the quality is acceptable rendering at 720 and upscaling to 1024 via a simple frame-by-frame Lanczos upscaler (nothing AI about it). And I'm rather proud I figured that out all by myself. At 16 fps and with Steps set down at 4 I can get a second of video for every minute the floor heater is running.
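That non-AI upscale is just classic Lanczos resampling, one frame at a time. Here's a sketch using Pillow, assuming "720" and "1024" refer to frame heights (the exact dimensions are my assumption, not stated above).

```python
# Plain Lanczos upscaling of a single frame via Pillow -- nothing AI
# about it. Assumes the source is 720 pixels tall and the target is
# 1024 pixels tall; aspect ratio is preserved.
from PIL import Image

def upscale_frame(frame, target_height=1024):
    """Lanczos-resample one frame to the target height."""
    w, h = frame.size
    target_width = round(w * target_height / h)
    return frame.resize((target_width, target_height), Image.LANCZOS)

frame = Image.new("RGB", (1280, 720))   # stand-in for one rendered frame
big = upscale_frame(frame)              # 1280x720 -> 1820x1024
```

In practice you'd run this over every extracted frame before recompiling the video; since it's deterministic, it can't introduce the drift an AI upscaler might.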

Scripting is still a big unknown. I've been experimenting with the s2v (sound to video) and as usual there are surprises. AI, after all, is an exercise in probabilities. "These things are often found with those things." It is, below the layers of agents and control nets and weighting, a next-word autocomplete.

That means it seems to have an uncanny ability to extract emotional and semantic meaning from speech. It is strictly associational; videos in the training material tended to show a person pointing when the vocal patterns of "look over there" occurred. More emergence. Cat logic, even.



So anyhow, I broke Automatic1111. Sure, it had a venv folder, but somehow Path got pointed in the wrong direction. Fortunately I was able to delete Python and do a clean install of 3.10.9 inside the SD folder; Automatic1111 came back up and ComfyUI was still safe in its own sandbox. And now to try to install Kohya.


Experimenting with the tech has led to thinking about shots, and that in turn has circled back to the same thing I identified earlier, a thing that becomes particularly visible when talking about AI.

We all have an urge to create. And we all have our desires and internal landscapes that, when given the chance, will attempt to shape the work. Well, okay, writing forums talk about the person who wants to have written a book; the book itself being of no import, just as the nature of the film they starred in has nothing to do with the desire to be a famous actor. It is the fame and fortune that is the object.

In any case, the difference between the stereotype of push-button art (paint by numeric control) and the application of actual skills that took time and effort to learn is, in relation to the process of creation itself, just a matter of how granular you are getting about it.

Music has long had chance music and aleatoric music. Some artists throw paint at a canvas. And some people hire or collaborate. Is a composer not a composer if they hire an arranger?

That said, I feel that in video, the approach taken by many in AI is getting in the way of achieving a meaningful goal. As it exists right now, AI video is poorly scriptable, and its cinematography -- the choice of shots and cutting in order to tell the story -- is lacking. This, as with all things AI, will change.

But right now a lot of people getting into AI are crowding the subreddits asking how to generate longer videos.

I'm sorry, but wrong approach. In today's cinematography, 15 seconds is considered a long take. Many movies are cut at a faster tempo than that. Now, there is the issue of coverage...but I'll get there. In any case, this is just another side of the AI approach that wants nothing more than to press buttons. In fact, it isn't even the time, effort, or artistic skills or tools that are being avoided. It is the burden of creativity. People are using AI to create the prompts to create AI images. And not just sometimes; there are workflows designed to automate the terribly challenging chore of getting ChatGPT to spit out a string of words that can be plugged into ComfyUI.

Art and purposes change. New forms arise. A sonnet is not a haiku. There is argument to recognize as a form the short-form AI video that stitches together semi-related clips in a montage style.

But even here, the AI is going to do poorly at generating it all in one go. It will do better if each shot is rendered separately, and something (a human editor, even!) splices the shots together. And, especially if the target is TikTok or the equivalent, the individual shots are rarely going to be more than five seconds in length.
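Splicing separately rendered shots doesn't need anything fancy: ffmpeg's concat demuxer will join clips losslessly from a plain list file. A sketch below generates that list in Python; the clip filenames are hypothetical.

```python
# Build the list file that ffmpeg's concat demuxer reads. Joining is
# then one command (run separately):
#   ffmpeg -f concat -safe 0 -i shots.txt -c copy cut.mp4
# The shot filenames here are made up for illustration.
from pathlib import Path

shots = ["01_establishing.mp4", "02_closeup.mp4", "03_cutaway.mp4"]

list_file = Path("shots.txt")
list_file.write_text("".join(f"file '{name}'\n" for name in shots))
```

Because `-c copy` only remuxes, the cut order lives entirely in the list file, which makes it cheap to re-shuffle shots until the montage reads right. (This assumes all clips share the same codec and resolution; otherwise they'd need re-encoding.)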


Cutting to develop a story, using language similar to modern filmic language, is a different beast entirely. The challenge I'm thinking a lot about now is consistency. Consistency of character, consistency of set. There are also challenges in matching camera motions and angles if you want to apply the language correctly. For that shot-reverse-shot of which the OTS is often part, you have to obey the 180 rule or the results become confusing.

One basic approach is image to video. With i2v, every shot has the same starting point, although they diverge from there. As a specific example, imagine a render of a car driving off. In one render, the removal of the car reveals a fire hydrant. In the second render from the same start point, a mailbox. The AI rolled the dice each time because that part of the background wasn't in the original reference.

One weird problem as well. In editing, various kinds of buffer shots are inserted to hide the cuts made to the master shot. The interview subject coughed. If you just cut, there'd be a stutter in the film. So cut to the interviewer nodding as if listening (those are usually filmed at a different time, and without the subject at all!). Then cut back.

In the case of an i2v workflow, a cutaway done like this would create a strange déjà vu; after the cut, the main shot seems to have reset in time.

So this might actually be an argument for a longer clip, but not to be used as the final output; to be used as a master shot to be cut into for story beats.

Only we run into another problem. It is poorly scriptable at present. In the workflows I am currently using, there's essentially one idea per clip. So a simple idea such as "he sees the gun and starts talking rapidly" doesn't work with this process.

What you need is to create two clips with different prompts. And you need to steal the last frame from the first clip and use it as the starting image of the second clip. Only this too has problems: the degradation over the length of a clip means that even if you add a node to the workflow to automatically save the target frame, it will need to be cleaned up, corrected back to being on-model, and have its resolution increased back to the original.

And, yes, I've seen a workflow that automates all of that, right down to a preset noise setting in the AI model that regenerates a fresh and higher-resolution image.

My, what a tangled web we weave.

Monday, October 6, 2025

Cryptic Triptych

I got the PC I built up and running, after the usual 22h2 hassle (tip: don't use the internal updater. Run the web installer at Microsoft. For as long as that lasts!)

ComfyUI is sandboxed (and a one-click install) and Automatic1111, though now an abandoned project, also installs a venv folder within the stable_diffusion folder, meaning it can run on Python 3.10.6. Now I'm trying to get Kohya running, and learning venv so I can put that on 3.10.11 or higher...without breaking everything else.

I still like the primitive but functional GUI of Automatic1111 for stills. But ComfyUI opens up video as well. Motion.

And that got me thinking about linear narrative.

There does exist a form called "non-linear narrative." But that refers to the relationship between the narrative and some other chronology. The latter may be shifted around. A writer can at any point refer to a different time, including such techniques as the flashback and flash-forward. But the narrative itself remains linear. One reads one word at a time.

(Arguably, from our understanding of the process of reading, we parse chunks of text, and thus multiple words may be included in what is experienced as a single unit of extracted meaning.)

This means it is extremely difficult to capture the near-simultaneous flow of information that a real person, not reading an account in a book, would experience. In our old gaming circles, the joke was the monster quietly waiting until the room description was finished. It is a basic problem in writing; you can't tell it all at the same time. And the order you choose influences the relative weight given.

Again, arguably, our attention can't be split too many ways. In most cases, the realization you had while you were in the middle of doing something else arrives as a discrete event. You may have heard the voice behind you, and been processing it, but the moment of understanding that cry of "Stop!" can be treated narratively at the moment it becomes the focus of attention. And the observations that lead to that moment of realization back-filled at that time, as they, too, rise to the top of the consciousness.

Or in another way of putting it, a narrative is an alias of the stream of consciousness and the order of presentation can be taken as the order of items brought into focus.

This idea of the sequential scroll of attention has been used in artwork. We normally absorb a piece of art by moving from one focus to another (in a matrix of probable interest including size, color, position, human faces, etc.) The artist can construct a narrative through this shifting of focus.


This one sneaks up in stages. The first impression is very calming. The next impressions are not. Especially in some periods, there could be subtler and subtler clues and symbols that you don't notice until you've been looking for a while.


Or there are forms, from the triptych to the Bayeux Tapestry, that arrange distinct framed panels in a sequential order.

Motion controls this flow of narrative more tightly. Not to say there can't be the same slow realizations. But it means thinking sequentially.


In comic book terminology, the words "Closure" and "Encapsulation" are used to describe the concepts I've been talking about. "Closure" is the mental act of bringing together information that had been presented over a sequence of panels in order to extract the idea of a single thing or event. "Encapsulation" is a single panel that is both the highlight of and a reference pointer or stand-in for that event.

In text, narrative, especially immersive narrative that is keyed to a strong POV or, worse, a first-person POV, has a bias towards moving chronologically. Especially in first-person, this will lead the unwary writer into documenting every moment from waking to sleep (which is why I call it "Day-Planner Syndrome.")


I've been more and more conscious of the advantages and drawbacks of jumping into a scene at a more interesting point and rapidly back-filling (tell not show) the context of what that moment came out of. I don't like these little loops and how they disturb the illusion of a continuous consciousness that the reader is merely eavesdropping on as they go about their day, but I like even less spending pages on every breakfast.

And speaking of time. The best way to experience the passage of time is to have time pass. That is, if you want the reader to feel that long drive through the desert, you have to make them spend some time reading it. There's really no shortcut.

I decided for The Early Fox I wanted to present Penny as more of a blank slate, and to keep the focus within New Mexico. So no talking about her past experiences, comparisons to other places she's been, comparison or discussion of the histories of other places, technical discussions that bring in questions of where Penny learned geology or Latin or whatever, or quite so many pop-culture references.

And that means I am seriously running out of ways I can describe yucca.


In any case, spent a chunk of the weekend doing test runs with WAN2.2 and 2.1 on the subject of "will it move?" Which is basically the process of interrogating an AI model to see what it understands and what form the answer will take.

My first test on any new install is the prompt "bird." Just the one word. Across a number of checkpoints the result is a bird on the ground, usually grass. A strange and yet almost specific and describable bird; it is sort of a combination of bluebird and puffin with a large hooked beak, black/white mask, blue plumage and yellow chicken legs.

In investigating motion in video, I discovered there are two major things going on under the hood. The first is that when you get out of the mainstream ("person talking") and into a more specific motion ("person climbing a cliff") you run into the paucity of training data problem. When there is a variety of data, the AI can synthesize something that appears original. When the selection is too small, the AI recaps that bit of data in a way that becomes recognizable. Oh, that climbing move where he steps up with his left foot, then nods his head twice.

The other is subject-background detection. AI video works now (more-or-less) because of subject consistency. The person walking remains in the same clothing from the first frame to the last. It does interpolate, creating its own synthesized 3d version, but it can be thought of as, basically, detaching the subject then sliding it around on the background.

We've re-invented Flash.


Now, because the AI is detaching then interpolating, and the interpolation makes use of the training data of what the back of a coat or the rest of a shoe looks like (and, for video models, moves like), it does have the ability to animate things like hair appropriately when that subject is in motion. But AI is pretty good at not recognizing stuff, too. In this case, it takes the details it doesn't quite understand and basically turns them into a game skin.

Whether this is something the programmers intended, or an emergent behavior in which AI is discovering ways of approximating reality similar to what game creators have been doing, the subject becomes basically a surface mesh that gets the large-scale movements right but can reveal that things like the pauldrons on a suit of armor are just surface details, parts of the "mesh."

It can help to think of AI animation as Flash in 3D. The identified subjects move around a background, with both given consistency from frame to frame. And think of the subject, whether it is a cat or a planet, as a single object that can be folded and stretched with the surface details more-or-less following.
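The "Flash in 3D" mental model can be made concrete with a toy: detach the subject, hold the background fixed, and slide the subject across it frame by frame. Pure ASCII, no AI, purely illustrative.

```python
# A toy version of the subject/background model: the subject is a
# detached unit pasted over a consistent background at a new position
# each frame -- the same trick Flash animation leaned on.

def composite(background, subject, x):
    """Paste the subject string over the background at column x."""
    row = list(background)
    row[x:x + len(subject)] = subject
    return "".join(row)

background = "." * 12
frames = [composite(background, "cat", x) for x in range(4)]
# frames[0] is "cat.........", frames[3] is "...cat......"
```

The real models do this in a learned, three-dimensional way, interpolating the unseen sides of the subject; but the core bookkeeping, one consistent subject moved over one consistent background, is the same.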

But back to that consistency thing. For various reasons, video renders are limited to the low hundreds of frames (the default starter, depending on model, is 33 to 77 frames). And each render is a fresh roll of the dice. 

It is a strange paradox, possibly unavoidable in the way we are currently doing this thing we call "AI." In order to have something with the appearance of novelty, it has to fold in the larger bulk of training data. In order to have consistency, it has to ignore most of that data. And since we've decided to interrogate the black box of the engine with a text prompt, we are basically left with "make me a bird" and the engine spitting out a fresh interpretation every time.

That plays hell on making an actual narrative. Replace the comic-book panel with the film term "shot," and have that "Closure" built on things developed over multiple shots, and you are confronted with the problem that the actors and setting are based on concepts, not on a stable model that exists outside the world of an individual render. If you construct "Bird walking," "Bird flies off," and "Bird in the sky," with each render interpreting the conceptual idea of "Bird" in a different way, it is going to be a harder story to understand.


That is going to change. There are going to be character turn-arounds or virtual set building soon enough. As I understand it, though, the necessary randomness means the paradox is baked into the process. No matter what the model or template, it is treated the same as a prompt or a LoRA or any other weighting: as a suggestion, one that gets interpreted in the light of what that roll of the dice spat out that run.

And that's why the majority of those AI videos currently clogging YouTube go for conceptual snippets arranged in a narrative order, not a tight sequence of shots in close chronological time. You can easily prompt the AI to render the hero walking into a spaceport, and the hero piloting his spacecraft...now wearing a spacesuit and with a visibly different haircut.

For now, the best work-around appears to be using the "I2V" (image-to-video) subset, which generates a video from a reference image. The downside is that anything that isn't in the image -- the back of the head, say -- is interpolated, and thus will be different in every render. It also requires creating starter images that are themselves on-model.

A related trick is pulling the last frame of the first render and using that as the starter image for a second render. The problem this runs into is the Xerox Effect; the same problem that is part of why there is a soft limit on the number of frames of animation that can be rendered in a single run.
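The chaining trick itself is simple to sketch. Here `render_clip()` is a stand-in I invented for a real I2V call, with a toy "drift" standing in for the small errors a real model introduces each generation; the point is the plumbing, where each clip's last frame becomes the next clip's starter image, and the errors accumulate copy over copy.

```python
# Sketch of last-frame chaining: render a clip, grab its final frame,
# feed it back in as the next clip's I2V start image. render_clip() is
# a stand-in, not a real model call.

def render_clip(start_frame, num_frames):
    """Fake renderer: nudges every value by 1 per frame, standing in
    for the small per-generation errors of a real model."""
    frames = [start_frame]
    for _ in range(num_frames - 1):
        frames.append([v + 1 for v in frames[-1]])  # toy "drift"
    return frames

def chain_renders(first_frame, clip_len, num_clips):
    """Stitch clips end to end, reusing each clip's last frame as the
    next starter image. The drift compounds: the Xerox Effect."""
    clips = []
    start = first_frame
    for _ in range(num_clips):
        clip = render_clip(start, clip_len)
        clips.append(clip)
        start = clip[-1]    # last frame becomes the next start image
    return clips
```

Each hand-off is seamless by construction (the frames match exactly at the joins), but nothing ever pulls the picture back toward the original, which is exactly the copy-of-a-copy degradation the trick runs into.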


(The bigger problem with render length is memory management, though I am not entirely clear on why.)

As with most things AI, or 3D for that matter, it turns into the Compile Dance. Since each run is a roll of the dice, you often can't tell if there is a basic error of setup (bad prompt, a mistake in the reference image, a node connected backwards) or just a bad draw from the deck. You have to render a couple of times. Tweak a setting. Render a couple more times to see if that change was in the right direction. Lather, rinse.
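One way to make the Compile Dance a little less random is to pin the dice: render each tweak against the same small set of seeds, so the difference between runs reflects the setting and not the draw. The `render()` below is a made-up stand-in that returns a fake quality score; real pipelines take comparable seed and setting arguments.

```python
# Seed-pinned A/B comparison for the Compile Dance. render() is a
# hypothetical stand-in; a real call would invoke the video model.
import random

def render(prompt, setting, seed):
    """Fake render: a setting effect plus seed-dependent noise.
    String-seeding random.Random is deterministic across runs."""
    rng = random.Random(f"{prompt}|{setting}|{seed}")
    return setting * 10 + rng.uniform(-3, 3)

def compare_settings(prompt, settings, seeds=(1, 2, 3)):
    """Render every setting against the same seeds and average, so the
    comparison reflects the tweak rather than the roll of the dice."""
    scores = {}
    for setting in settings:
        runs = [render(prompt, setting, s) for s in seeds]
        scores[setting] = sum(runs) / len(runs)
    return scores
```

With the seeds pinned, "tweak a setting, render a couple of times" becomes a repeatable comparison instead of a fresh gamble each run.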

With my new GPU and the convenient test size I have been working with, render times fall into the sour spot: 1-3 minutes. Not long enough to do something else, but long enough that it is annoying to wait it out.

I still don't have an application, but it is an amusing enough technical problem to keep chasing for a bit longer. The discussions on the main subreddit seem to show a majority of questioners who just want "longer video" and hope that by crafting the right prompt, they can build a narrative in an interesting way.

The small minority is there, however, explaining that cutting together shorter clips comes closer to how the movies have been doing it for a long time, a narrative approach that seems to work for the viewer. But that really throws things back toward the problem of consistency between clips.

And that's why I'm neck-deep in Python, trying not to break the rest of the tool kit in adding a LoRA trainer to the mix.