Posts Tagged ‘computer-generated voices’
The thread I began a couple of weeks ago on the topic of next-gen synthetic voices still has legs.
Voice-actor, computer expert, and website wizard Chris Wagner linked to my article, then launched into a fresh vector, extending the discussion in a way I never would’ve anticipated.
Click HERE to read Chris’ Blog: “Computers as Narrators, really? naww…”
Thanks for contributing, Chris!
CourVO
Reaction to my blog post about new advances in synthetic speech (Synthetic Voice: Revolutionary or Repugnant) was mixed, but almost all agreed the sample voice on the Loquendo site was better than any they had heard before.
Before I get into some of the responses, let me refer you to yet another site of this sort I’ve since become aware of: Lessac Technologies.
Click HERE to go to their FTP site with a long list of sample audio files.
Right. I didn’t think they were as good as the Loquendo samples either, but they were certainly understandable. As these technologies improve, a myriad of creative, market, and technology questions arise.
My friend Brett Bumeter, the man who helped me build this WordPress blog you’re reading right now, and a self-avowed student of artificial speech, had this to say about how acceptance of this level of quality relates to price point:
“I think the pricing of the platform is the thing that essentially secures a temporary safe place for voice actors. The price points for using an emotional voice are not all that different from lower end price points for voice actors. Voice actors on the higher end . . . well . . . they are on the higher end sometimes for a reason.
That said, I think the writing is on the wall, or at least on my computer screen, this will definitely be a force to contend with in the next 5 years.
If that price point comes down by half or if the quality goes up by another 20% or the application to convert text to voice becomes even easier (thus saving time for the ‘producer’ and saving money that way), then voice acting price arbitrage will open up to synthesized voices for sure.”
Fellow voice-actor Peter Drew made these observations:
“Audiobook producers will certainly hold out longer against using synthetic voices. The more immediate concern is the industrial/corporation market and retail marketing on the Web via video. Why pay someone to read an already dull script to accompany a human resources video on the latest changes to the company’s benefits package? With hi-def hand-held cameras and desktop video production, many small retailers can make their own videos or contract a local agency to crank out a video for little money, saving even more by using a “voice in a box.”
It’s not a matter of if but when. Technology marches on and the human voice will be synthesized to a relatively high degree of realism and natural character. Will a synthetic voice ever match the artistry and subtlety of a well-trained and experienced actor? We’ll just have to wait and see…”
Peter also referred me to a previous article he had written about this.
Voice over artist Daniel Wallace said:
“After listening to several of the samples I was shocked at how close the voices were to the real thing. The artificial voices are as close as I have ever heard. It also concerns me as a narrator. As the artificial voices get closer and closer to human voices, will publishers for the sake of expediency turn to this technology in place of a human narrator? I do believe no matter how close a computer generated voice gets one cannot replace artistry.”
David Sigmon, a guy I shared Pat Fraley’s weekend AudioBook Workshop with last year, didn’t mince his words:
“While artificial intelligence and computer technology may eventually be able to mimic the human voice flawlessly, it will never be able to mimic the human imagination or insight that comes from living one’s life and experiencing our varied environments and relationships. No software will ever be able to adequately convey the emotional connection acquired when holding a crying newborn peeing all over your shirtfront, or the emotional letdown earned when failing to revive a drunken auto accident victim using CPR. All that said, non-fiction or text/reference books, as well as automated style text, may end up relying on computer voices.”
My friend Steve Hammill offered yet another take on why synthesized voices may or may not succeed:
“There’ll still be a market for live V/O, but synthetic voices will be a real threat to lesser talent very soon. And I have a theory about which markets synthetic voices will hit hard. IMO, short form work will get it first. My reasoning for this is that the human ear/brain will be annoyed by a computer voice in long form. My only “proof” of this is in testing mic preamps. In :30 second A:B comparisons, the differences between preamps were nearly impossible to hear; in long form the differences in preamps became dramatic. My theory is that the same will hold true for synthetic voices. Commercial tags, :10 VOs and other vocal bits (…dare I say imaging which is pretty synthetic already…) will be food for bottom feeders because there won’t be enough synthetic voice to offend the ears of most people.”
Finally, my VO friend Bobbin Beam did not mince her words when she commented:
“I can’t believe that someday someone would pay somebody to take the time to try and articulate just the right nuance of read that can out of one person, which captures heart, brain and vocal instrument for every given piece of copy, and marketplace trend. Hell they can hire an actual voice actor for that, and most probably invest a lot less in the long run!”
Please feel free to chime in on the conversation by commenting below.
It’s clear software engineers and audio creators are not going to rest on what they’ve done so far. As artificial intelligence algorithms improve, my guess is we’ll hear synthetic voices that rise to the level of chess-playing software in their ability to innovate, learn, and approximate human nuances. That’s when market forces will determine whether it’s worth the cost to customers.
CourVO
There’s been a lot of traffic recently on a forum populated by AudioBook readers, bantering about the issue of computer-generated voices.
That topic is traditionally disdained by a group so dedicated to the finer nuances of a good read. These are serious audio-book listeners who celebrate the various human narrators, and the interpretation each one brings to a narrative.
But something new and improved has surfaced, and it’s making some converts even among this hard-core group of those favoring the real human voice.
So listen to the samples at LOQUENDO and then I’ll finish up below. It’s an international site, so you have to scroll down to hear the US/English samples.
——————————
So….whadya think? I agree, it’s the best computer generated voice I’ve ever heard.
Much of the give ‘n’ take on this forum moved into the realm of where the artistry is in this sort of software solution…and how the audiobook publisher business model would change. Beyond that, the discussion also addressed who has rights, and what the revenue stream is. Can a programmer replace a narrator? How labor-intensive and artistry-intensive is that?
This hard-core group of audiobook aficionados agrees this is the best “fake” voice they’ve heard, but also agrees it’s not there…yet.
Which, of course, prompts the question: “When?”
Text-to-Speech and voice recognition programs (e.g., Dragon NaturallySpeaking) have always been reliant on complex formulas or algorithms that incorporate the finer points of artificial intelligence. They’ve steadily gotten better with each new jump in computer speed and function.
It’s likely that we’re not far from a computer-generated voice accomplished enough to satisfy a sector of buyers who aren’t as discerning as the audiobook group mentioned above.
So now, I’ll state the question that has likely already bubbled up in your own mind: “Is this likely to hurt yet another sector of jobs/clients now available to us as voice-actors?”
Your thoughts?
CourVO








