Is AI Really That Scary? The Bad, the Good, and the Meh

For Halloween, here’s the plot line of a nightmare: The monster Robovoice takes hold of the audiobook industry and puts everybody out of work. Bwaahahahaha!

Okay. Very scary. But let’s breathe into the bag, and think it through.

Bad: Robovoice is here, though I’ve heard only anecdotes about real applications in audiobooks. But they’re coming. My guess: Text-to-speech (T2S) will make inroads in long backlist (nonliterary) nonfiction, especially older titles. Not sure whether the trade book publishers would be early adopters in that space, but it seems like a decent spot for other licensors. But these titles aren’t currently being recorded. No one loses here. (At least, no one so far . . . )

Better: “Newspapers use these applications,” she says. “Why not book publishers?” Conceded: Some information junkies treat audiobooks like newspapers. They fly through programs at 2x+ speed. But is this a typical way to consume, say, fiction? What’s the entertainment value of hearing Mr. T2S read literature? A related matter: Many people developed a taste for audiobooks through podcasts (style of storytelling: naturalistic, spontaneous). Can Robovoice reproduce that style?

Good: Algorithms that personalize voices are restricted by statute. (See “right of publicity” legislation in NY and elsewhere.) You can’t just take a repository of data from X narrator’s audiobooks and start hawking the “X Narrator Robo-Reader.” I’m afraid SAG-AFTRA and others would object strenuously.

Meh: Other stuff to consider before the Great Robovoice Takeover.

  • Authors. Q: Will literary artists choose AI over performing artists? A: I doubt it—until AI is good enough, and the money is right!

  • Production and distribution. Just wondering how this would work. Two possibilities come to mind:

    • Case #1: Publisher contracts with the algorithm developer, Robo Inc., the “producer” of the program. Robo Inc. processes a final text file and delivers master files to the publisher, who forwards them to the retailer.

    • Case #2: Publisher transmits a final text file to the retailer, whose licensed product (from Robo Inc.) generates the audio stream on demand. The consumer might be able to select the voice type or even a particular narrator’s voice.

#1 seems clunkier to manage but easier to implement and control. #2 seems more elegant (and less costly for the publisher) but legally harder to pull off. For starters: The conventional publisher-author relationship is bypassed as the retailer’s robo product dictates the sound of the author’s intellectual property.

But what do I know?

Big takeaway: There are a lot of contradictory signals out there. It’s not worth stressing out about monster takeovers. Yet. Have a Happy Halloween!
