The KHIT Blog: Developments in AI "voice cloning"

Tuesday, May 7, 2024

Developments in AI "voice cloning"

ELEVENLABS IS BUILDING AN ARMY OF VOICE CLONES
A tiny start-up has made some of the most convincing AI voices. Are its creators ready for the chaos they’re unleashing?
By Charlie Warzel

My voice was ready. I’d been waiting, compulsively checking my inbox. I opened the email and scrolled until I saw a button that said, plainly, “Use voice.” I considered saying something aloud to mark the occasion, but that felt wrong. The computer would now speak for me.

I had thought it’d be fun, and uncanny, to clone my voice. I’d sought out the AI start-up ElevenLabs, paid $22 for a “creator” account, and uploaded some recordings of myself. A few hours later, I typed some words into a text box, hit “Enter,” and there I was: all the nasal lilts, hesitations, pauses, and mid-Atlantic-by-way-of-Ohio vowels that make my voice mine.

It was me, only more pompous. My voice clone speaks with the cadence of a pundit, no matter the subject. I type I like to eat pickles, and the voice spits it out as if I’m on Meet the Press. That’s not my voice’s fault; it is trained on just a few hours of me speaking into a microphone for various podcast appearances. The model likes to insert ums and ahs: In the recordings I gave it, I’m thinking through answers in real time and choosing my words carefully. It’s uncanny, yes, but also quite convincing—a part of my essence that’s been stripped, decoded, and reassembled by a little algorithmic model so as to no longer need my pesky brain and body…

The Atlantic, Charlie Warzel: AI voice cloning

This stuff is getting so good, so rapidly. Beyond burgeoning anxieties with regard to job losses, we're gonna have increasing difficulties with "authentication" (the "disinfo" thing).

OK, it's been on my to-do list to record myself on my Mac reading the entire U.S. Constitution from Preamble through the 27th Amendment and then post the mp3 online. I've read it aloud from start to finish several times and have studied it closely piecemeal going all the way back to graduate school in the mid-1990s. I am a fair SME on the 4th Amendment in particular. It comprised a central focus of my nearly 300-page Master's Thesis (pdf). My personal study of a range of legal and constitutional issues has continued ever since graduate school. Here’s a post from last year. So, no, I don’t have much patience for people who blab on about such topics without any substantive underlying knowledge.

Why bother? Well, again, it just goes to my ongoing irritation with our overpopulation of dilettante Barstool ConLaw Geniuses, most of whom have likely never read all of it, or rationally grasped its provisions (many of them elected officials, from Donald Trump on down). I would never be so arrogant as to claim ConLaw expertise (uh, for starters, IANAL). Nonetheless, in addition to my lengthy, ongoing reading-comprehension "hermeneutic"-level efforts, I have dug into and tabulated a bit of info perhaps of interest to all those "textualists" out there.

Current English language word count, 171,476, obsolete 47,156 (some authorities think the current active tally is a significant undercount)

US Constitution

7,420 words total, *
1,065 unique words,
497 appearing only once (48%)
(Preamble thru 27th Amendment, *Signatories’ names excluded)

(Only 0.53% of all English words are in the Constitution)
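For anyone who wants to check tallies like these against their own copy of the text, here is a minimal sketch in Python. The tokenizing rule (lowercase runs of letters, with internal apostrophes or hyphens kept together) is my assumption for illustration, not necessarily the rule behind the counts above, so different rules will give somewhat different totals. The function name `word_tallies` is hypothetical, and the sample string is just the opening words of the Preamble.

```python
import re
from collections import Counter

def word_tallies(text: str) -> dict:
    """Return total words, unique words, and words that appear exactly once."""
    # Assumed rule: a "word" is a run of letters, optionally joined by an
    # internal apostrophe or hyphen. Digits are ignored under this rule.
    words = re.findall(r"[a-z]+(?:['’-][a-z]+)*", text.lower())
    counts = Counter(words)
    once = sum(1 for n in counts.values() if n == 1)
    return {
        "total": len(words),
        "unique": len(counts),
        "once": once,
        # Share of the unique words that are used only a single time.
        "once_share": once / len(counts) if counts else 0.0,
    }

sample = "We the People of the United States, in Order to form a more perfect Union"
tallies = word_tallies(sample)
print(tallies)
```

To run it on the whole document, read your own plain-text copy into a string and pass it to `word_tallies`; whether headings and signatures are counted depends entirely on what is in that file.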

Words not found: “democracy,” “privacy,” “outer,” “perimeter”

Appearing 16 times: “vote”; 14 times: “votes”
Appearing 9 times: “election”

Phrases not found:
   “Co-equal branch(es)”
   “Separation of Powers”
   “Checks and Balances”
   “outer perimeter”
___

153 sentences, 2 Declarative, 151 Imperative. Mostly compound / complex.

Punctuation:
89 semicolons
11 colons
559 commas
195 periods
24 dashes
5 open/close parentheses
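A punctuation tally of this kind can be sketched the same way. This is a rough illustration only: the `MARKS` table and the function name `punctuation_tallies` are hypothetical, the sample string is an arbitrary test phrase rather than Constitutional text, and dashes and parentheses are left out because counting them requires deciding which characters and pairings qualify.

```python
from collections import Counter

# Marks to count, mapped to readable labels (an assumed, partial list).
MARKS = {";": "semicolons", ":": "colons", ",": "commas", ".": "periods"}

def punctuation_tallies(text: str) -> dict:
    """Count each mark in MARKS, one character at a time."""
    counts = Counter(ch for ch in text if ch in MARKS)
    return {label: counts[mark] for mark, label in MARKS.items()}

sample = "one; two: three, four, five. six."
marks = punctuation_tallies(sample)
print(marks)
```

Note that every period is counted, including those in abbreviations, so a period count will generally run higher than a sentence count.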

Commonly listed English parts of speech are noun, verb, adjective, adverb, pronoun, preposition, conjunction, interjection, numeral, article, or determiner.

Numerals 1-10 (total 139, 1-27 inclusive):
1      32
2      29
3      15
4      18
5       8
6       6
7       5
8       3
9       3
10     2

And, while there's legal scholar consensus that the bulk of "Conservative Strict Construction Originalist Textualism" hovers at the semantic/contextual/phraseological level, SCOTUS Justice Barrett recently remarked at Orals "putting aside for the moment the meaning of the word 'and' and the placement of a comma..."

I kiddeth thee not.

Back to my mp3 idea. I apparently can now just pay ElevenLabs $22 to digitally "clone" my voice, after which I could simply post fake recitations of all manner of prose I'd never read. Naysayers might well scoff at any genuine audio V/O.

So, for the near future, one might have to go all the way to a video talking-head recording of such material using YouTube to demonstrate one's chops. But, full-on “deep-fake” video is likely not that far off either.

Charlie Warzel continues:

The uncomfortable reality is that there aren’t a lot of options to ensure bad actors don’t hijack these tools. “We need to brace the general public that the technology for this exists,” Staniszewski said. He’s right, yet my stomach sinks when I hear him say it. Mentioning media literacy, at a time when trolls on Telegram channels can flood social media with deepfakes, is a bit like showing up to an armed conflict in 2024 with only a musket.

BUT, THEN, ON THE OTHER HAND, THERE'S THIS...

That made me cry. Tears of joy. Back in my musician days, I used to sing his song “I’m Gonna Love You Forever” in my solo acoustic sets. He was a great CW artist. Using AI to extend his work in the wake of his severe stroke misfortune is a wonderful application of this technology.

BTW, I riffed a bit on malign AI potential back in December.

UPDATE: SOME THOUGHTS OF JACOB COLLIER

JACOB!!!

Jacob Collier is now 30. He is without any exaggeration a complete musical / music technology genius. He is also no mere studio / tech performer. Below, 2 hours of jaw-dropping live concert performance in Lisbon.

His band (half of them brilliant women) is simply fabulous.

_________


Another important read (pdf)

I love this kind of stuff. It sustains and humbles me. "As politicians, advertisers, salesmen, and propagandists for various political, economic, moral, religious, psychic, environmental, dietary, and artistic doctrinaire positions know only too well, fallible human minds are easily tricked, by clever verbiage... Common language—or at least, the English language—has an almost universal tendency to disguise epistemological statements by putting them into a grammatical form which suggests to the unwary an ontological statement. A major source of error in current probability theory arises from an unthinking failure to perceive this."

Quotes

"An economist is a person who sees something that works in practice and tries to figure out whether it will work in theory."

- J.D. Kleinke, medical economist
___

"The only person who enjoys change is a baby with a wet diaper."

"Every misspent dollar in our health care system is part of somebody's paycheck."

- Brent James, M.D., M.Stat

“We could do healthcare, at markedly higher quality, for everyone in this country, without rationing or denying anybody the care that they need, without having the government dictate how doctors practice or whether hospitals could expand, at half the cost we do it now.”

- Health Care Futurist Joe Flower

"Most of the sciences, unlike parts of medical science, are not concerned with the impossible. There is not complementary and alternative physics, or chemistry, or biochemistry, or engineering. These disciplines compare their ideas against reality, and, if the ideas are found wanting, abandoned."

- Mark A. Crislip, MD

"Q: How much alcohol is too much?
A: More than your doctor drinks."

- a physician I once heard speak during a CME presentation

“Just because science doesn’t know everything, doesn’t mean you get to fill in the gaps with whatever fairy tale most appeals to you.”

- Dara O’Briain

'[I]t is one small step from using the computer for "helping" doctors to monitoring them, judging them, dictating to them what to do, and withdrawing payment for computer non-compliance. The use of computer data is a multi-edged sword. It can be used for the "good," facilitating diagnosis and treatment and making it more accurate and up-to-date, and for “evil,” invading privacy, inviting security breaches, and making decisions based on the opinions of remote authorities rather than those present at the patient-doctor encounter.'

- Richard Reece, MD

“[T]here ARE statistics which are non-political. Just because The Washington Post/Fox News reports the temperature is 75 degrees doesn’t mean it’s really snowing and sunscreen is a liberal/conservative plot. Even if you earn a living being ideological.”

- Michael L. Millenson

"It is generally a fairly convincing argument that people shouldn’t have to be subsidized to undertake a change which is in their best interest.

The reconciliation seems to be that EHR is not supposed to make a doctor’s practice more efficient and higher quality. It is supposed to make the system of care more efficient and higher quality, which is not the same thing. Those of you who took calc recall that maximizing the total of variables is not achieved by maximizing any one variable and this is a perfect example of that.

Those of you who have served in combat certainly noticed that too — if everyone works as a team the unit takes fewer casualties. If you try to save your own hide, you might, but at the expense of more casualties overall."

- Al Lewis

"There are two ideas to keep in mind about Bayesian reasoning and how we tend to mess it up. The first is that base rates matter, even in the presence of evidence about the case at hand. This is often not intuitively obvious. The second is that intuitive impressions of the diagnosticity of evidence are often exaggerated."

- Daniel Kahneman, "Thinking, Fast and Slow"

"Physicians apply advanced scientific knowledge, but they must do so without the favorable conditions that experimental scientists create for themselves. Multitasking is forced on physicians, often in chaotic environments and under severe time and resource constraints."

- Lawrence and Lincoln Weed, "Medicine in Denial"

"It’s time to stop the whining about Obama care and acknowledge we already have universal health care. We just pay for it in the stupidest way possible that ensures problems are that much more disastrous and complicated when they’re finally treated."

- Mark Hoofnagle, MD, PhD

"Every act of conscious learning requires the willingness to suffer an injury to one's self-esteem. That is why young children, before they are aware of their own self-importance, learn so easily."

- Thomas Szasz, MD
___

"Of course, one reason that process metrics* are so popular is that processes are much easier to define and measure than outcomes."

- The Skeptical Scalpel
___

"There is an “illusion of validity” for any random data point, a seductive sense that is colored by what we hope will be true. Mountains of pharmaceutical claims are often made from mere molehills of data."

- Danielle Ofri, MD
___

"Joy empowers people. It is a source of energy that enables people to hope and plan and change their lives for the better. Spend some time around someone who is relentlessly negative and how do you feel–drained, right? More and more research shows that joy is not something that just happens to you, like a bolt of lightning out of the blue. Joy is, instead, a habit to cultivate. Negative thinking and despair are the crabgrass of our souls–weeds that take root and spread, sometimes to all areas of life. Joy, in contrast, is a soul’s rose–hardy when cared for, able to put down roots over time and withstand disease and extremes. Like a rose, however, your joy can become blighted from neglect or harsh conditions. We all need to tend to our joy–to prune away the badness, and to know that, even though it may look like a prickly bare root, if you invest time in a joyous outlook, gorgeous things will bloom, even in the harshest conditions."

- Dr. Jan Gurley
___

"'Solutions' exist only in mathematics."

- Karen Martin
___

"The issue of how to regulate clinical software is, in the long run, indistinguishable from the issue of how to regulate medicine. The only difference is that medicine is practiced in the open, without secrecy, subject to peer review, and under a merit-based state license."

- Adrian Gropper, MD
___

"Economist, rope, tree: some assembly required."

- Source unknown

DISCLAIMER:

I write this blog wholly on my own time and my own dime. The views proffered are expressly my own as a concerned and active citizen/taxpayer (in addition to being the result of my substantive experience in the various IT fields), and in no way reflect any policy views of my former employer, notwithstanding that some of the thinking has indeed obviously been spurred by the implications of the work I have been doing for them.

FAIR USE POLICY
I cite a ton of news and web sources spanning the breadth of relevant technical and policy domains, sometimes at substantial length. I believe I remain well within the bounds of "Fair Use," as [1] I am not doing any of this for profit, [2] I always provide attribution and links -- which, [3] far from negatively impacting any copyright holders' commercial interests, might actually increase traffic to and interest in their offerings.

Nonetheless, should I post anything of yours regarding which you have any objection, just let me know and I will remove it forthwith.
