UC Davis

UC Davis researchers have found that we speak differently when talking to voice artificial intelligence. (Getty)

 

If you've ever asked Siri or Alexa something, you may have noticed you speak MORE LOUDLY or slowly, or make your words “clear-er.” UC Davis researchers say most of us speak differently when talking to our devices. Voice artificial intelligence may even be changing our social behavior. In this episode of Unfold, we'll talk to two UC Davis linguists to find out why voice AI is changing the way we operate.

In this episode:

Georgia Zellou, associate professor, Department of Linguistics, UC Davis

Michelle Cohn, postdoctoral researcher, Department of Linguistics, UC Davis

Transcripts may contain errors. Transcribed by AI. Fixed by humans.

 

Amy Quinton 

Hey, Marianne.

 

Marianne Russ Sharp 

Hey, Amy.

 

Amy Quinton 

Have you talked to an Amazon Alexa before?

 

Marianne Russ Sharp 

Not really. Does that make me, like, a really weird person? But if I did, I would ask you to play my favorite podcast, Unfold. I know I'm one of those people who doesn't have one of those. And I do talk to Siri sometimes, usually by accident. Does that count?

 

Amy Quinton 

I mean, sure. They're both voice artificial intelligence. But Alexa and I have had a real conversation, and I recorded it.

 

Marianne Russ Sharp 

I bet she recorded it too. Creepy. 

 

Amy Quinton 

She did actually. She's part of a conversational competition.

 

Marianne Russ Sharp 

What?

 

Amy Quinton 

Yeah, just listen.

 

Amy Quinton 

Alexa, can you chat with me?

 

ALEXA 

Welcome to the Alexa Prize, a university competition to advance AI. I'll give you a social bot now. Hi, this is an Alexa Prize social bot. So what's on your mind? Let's dive in.

 

Marianne Russ Sharp 

The Alexa Prize. What's that?

 

Amy Quinton 

It's a social bot competition. You know, a UC Davis team, aptly named Team Gunrock, from our engineering department won this contest a couple of years ago. The goal is to create an Alexa social bot and get her to converse engagingly on a range of popular topics for about 20 minutes.

 

Marianne Russ Sharp 

20 minutes is a long time, even for AI. How long did you chat with Alexa?

 

Amy Quinton 

Not long. Here's what it sounded like. What do you want to talk about?

 

ALEXA 

I would love to talk about anything related to sports or technology. What would you like to know?

 

Amy Quinton 

Should we be afraid of AI?

 

ALEXA 

That's a great topic. What do you think is the most terrifying aspect of being afraid of an eye?

 

Amy Quinton 

Of artificial intelligence?

 

ALEXA 

That's an interesting topic as well. Do you think AI has the potential to replace human intelligence in certain areas?

 

Amy Quinton 

No, but it's a fear. And it's weird that you sound like a person.

 

ALEXA 

It's not weird at all. As a bot, I do not have emotions like humans do. But I can understand why it would be scary to think about the potential consequences of AI replacing human intelligence. Do you have any other thoughts on this topic?

 

Marianne Russ Sharp 

It's kind of interesting, because she says it's not weird at all. And I do think it's kind of weird. But she also says, I don't have emotions, but I can understand like, I don't have emotions, but I have a brain or essentially the power to... right? The brain of a computer.

 

Amy Quinton 

Yeah, well, you know, I gave up on the conversation after maybe five minutes. She misunderstood me a few times.

 

Marianne Russ Sharp 

Yeah, like being afraid of an eye. Again, definitely weird.

 

Amy Quinton 

And I had to repeat myself.

 

Marianne Russ Sharp 

Yeah, that happens all the time.

 

Amy Quinton 

And I slowed down my speech and talked a little louder and clearer for some reason.

 

Marianne Russ Sharp 

I do the same thing with Siri.

 

Amy Quinton 

Everyone does, or at least most people. UC Davis researchers have studied this. One of the things we do is called hyperarticulation.

 

Marianne Russ Sharp 

Sort of exaggerating our speech to be clearer, like we do sometimes with babies.

 

Amy Quinton 

Yeah, exactly.

 

Marianne Russ Sharp 

So we have researchers that study how we talk to AI.

 

Amy Quinton 

Yeah.

 

Marianne Russ Sharp 

Why?

 

Amy Quinton 

Well, that's the first question I asked Georgia Zellou, an associate professor of linguistics.

 

Georgia Zellou 

For the first time in human history, I mean, starting in 2011, when Siri was introduced, humans are talking to nonhuman entities in a substantive and meaningful way. And it's happening on a daily basis, and children are doing it and people of all ages are doing it. So trying to understand their impact on our language, our development, our sort of social life is something that we're interested in exploring. There's so many kinds of unanswered questions here.

 

Marianne Russ Sharp 

Indeed, so many unanswered questions. So Amy, is she suggesting that our language might even change over time as a result of us talking with voice AI?

 

Amy Quinton 

Yeah, it already is, and just how our future conversations will unfold with voice AI is what we're going to be talking about in this episode of Unfold.

 

Marianne Russ Sharp 

All right, then, as the Alexa social bot says...

 

ALEXA 

Let's dive in.

 

Amy Quinton 

Coming to you from UC Davis, this is Unfold. I'm Amy Quinton.

 

Marianne Russ Sharp 

And I'm Marianne Russ Sharp.

 

Amy Quinton 

Voice AI or machine-made voices speak to us all the time. Whether it's Siri as our personal assistant, or our GPS system telling us which way to turn, it's really become a staple in a lot of households.

 

Marianne Russ Sharp 

Yeah, GPS has saved my life more times than I can remember. But humans have been fascinated with making machines talk for a very long time. I mean, that's certainly played out in Hollywood, all over the place. I think about "2001: A Space Odyssey." Remember HAL?

 

Movie Clip 

Open the pod bay doors, HAL. I'm sorry, Dave. I'm afraid I can't do that.

 

Marianne Russ Sharp 

Or that kid in "WarGames" who made his computer talk?

 

Movie Clip 

Yes, they do. How can it talk? It's not a real voice. This box just interprets signals from the computer and turns them into sound. Shall we play a game?

 

Amy Quinton 

Good examples, although those aren't really machines talking but actors playing the roles of HAL and Joshua, the computers. But do you remember Speak and Spell, the toy that taught you how to spell?

 

Marianne Russ Sharp 

Yes, I do.

 

Speak and Spell 

Spell circuit. C, I, R, C, U, I, T. That is correct.

 

Marianne Russ Sharp 

Gosh, that sounds so robotic. Thinking about how Siri and Alexa sound now, boy, technology has come a long, long way.

 

Amy Quinton 

Did you know that the first device that could generate continuous human speech electronically was invented in 1939?

 

Marianne Russ Sharp 

1939? I would not have guessed. Are you serious?

 

Amy Quinton 

Yeah, it was called the VODER, for Voice Operating DEmonstratoR.

 

Marianne Russ Sharp 

VODER. Not the best name but a magnificent feat for 1939.

 

Amy Quinton 

It was invented by a guy named Homer Dudley at Bell Laboratories as a demonstration of their innovations. And it was a highlight of the 1939 World's Fair.

 

Marianne Russ Sharp 

How did it work?

 

Amy Quinton 

Well, the speech sounds had to be made manually on a special keyboard that produced all the vowels and consonants of speech. So it required an operator who also had to use a wrist bar and foot pedals and an arm switch to generate sounds.

 

Marianne Russ Sharp 

Wow, so it was a workout and not an easy conversation?

 

Amy Quinton 

No, but I have a recording of it, actually. Want to hear it?

 

Marianne Russ Sharp 

Yes, please.

 

Amy Quinton 

Well, in this demonstration, the operator of the VODER is a woman named Helen Harper.

 

MAN 

For example, Helen, will you have the VODER say, "She saw me"?

 

VODER 

SHE SAW ME.

 

MAN 

That sounded awfully flat. How about a little expression? Say the sentence in answer to these questions. Who saw you?

 

VODER 

SHE saw me.

 

MAN 

Whom did she see?

 

VODER 

She saw ME.

 

MAN 

Well, did she see you or hear you?

 

VODER 

She SAW me.

 

Marianne Russ Sharp 

The intonation. Already at that time. It's kind of blowing my mind. Although I will say it sounded nothing like the voice AI that we hear today. No computer algorithms were creating that one.

 

Amy Quinton 

Yeah, but it was pretty impressive. My understanding is that it was really difficult to operate, but it paved the way for future machine-operated speech.

 

Marianne Russ Sharp 

Wow. And now we have researchers investigating whether that machine-operated speech is changing the way we operate, or at least how we speak.

 

Amy Quinton 

And they're also trying to understand our social interaction and behavior toward voice AI. In other words, are we treating these devices like people and building a mental picture of what they are like? Georgia Zellou, who we spoke to earlier, explained it like this.

 

Georgia Zellou 

As soon as machines speak to us with voices, they are portraying apparent gender, apparent race, apparent regional background, apparent language background, and all these things in natural human-human conversation are really, really significant and important and affect how we perceive and use language. So are we just doing the same thing when we talk to machines? Or are we sort of creating a very specialized, separate way of handling machines?

 

Marianne Russ Sharp 

It seems to me it would be unavoidable that we would treat them like machines and not humans. We know, as we're speaking to them, that they are not human. So doesn't that mean that our language or our voice would change when we talk to devices like Siri?

 

Amy Quinton 

Well, Michelle Cohn is a postdoctoral researcher in linguistics at UC Davis, and she and Georgia did a couple of experiments to figure this out. They had people, both young adults and children, talk to devices and talk to other humans saying the same phrases. They even introduced planned errors with both humans and devices.

 

Marianne Russ Sharp 

So they intentionally had a person and a device misunderstand them?

 

Amy Quinton 

Yeah. And here's how Michelle said a human talked with voice AI compared to how a human talked with another human.

 

Michelle Cohn 

They're speaking more loudly to voice assistants, often slowly. They produce either increases or decreases in their pitch. One interesting thing with that Siri study is we found that speakers produced less pitch variation, so kinda like more monotone speech to the Siri voice than the human voice.

 

Marianne Russ Sharp 

That's not too surprising. So it's like, "Hey, Siri, what's the weather like?"

 

Amy Quinton 

But also people who talk to voice AI are hyperarticulating, making the segments of speech slow-er and clear-er.

 

Marianne Russ Sharp 

Un-der-stood. So if our voices changed when talking to machines, we are then treating them differently, right? Like they are machines.

 

Amy Quinton 

Well, Georgia says it's not that simple.

 

Georgia Zellou 

So what we know about real human-human interaction is that we naturally adapt. Conversation is dynamic. So I'll change my tone of voice or the words that I'm using, as our conversation is unfolding, um, in a natural way in response to your behavior. And you do that to me, vice versa.

 

Marianne Russ Sharp 

So if my tone of voice changes, like maybe I get excited, you're likely to change your tone of voice and sound excited too?

 

Amy Quinton 

Yeah, you know, and even if I hang around someone with a thick accent, I know I'm likely to pick up on their pronunciation and start speaking like them after a while. Michelle says there's actually a technical term for this.

 

Michelle Cohn 

That process is called, like, alignment or mirroring. The idea is that you adopt the pronunciation patterns of other people to align more closely to them socially. It plays a social role. And so the idea is when you are getting along with someone and you want to convey that, you convey that through your speech patterns, but also in your body language and gaze. People do these micro sways together. So there's like this huge, intricate dance of coordination.

 

Marianne Russ Sharp 

I want to do the micro sway. But here's the thing, devices can't mirror you like this, right? There's no body language.

 

Amy Quinton 

No, but Michelle suggests humans might instead be changing their voice to be more monotone, more clear, to reflect what they're hearing from voice AI or Siri.

 

Michelle Cohn 

So we could think about it as kind of like another alignment, like they're aligning more towards what they think that voice sounds like maybe to be better understood. But the cluster of adaptations really reflect this expectation that it's not going to understand you, even if the actual interaction that day or in the lab is exactly the same for the human.

 

Marianne Russ Sharp 

Yeah, I pretty much assume it's never going to understand me on the first try. And I articulate pretty well. I might even hyperarticulate. But anyway, moving on, you did mention that the researchers did experiments with adults and kids. And so as a mom, I'm pretty curious if the kids' voices changed as well.

 

Amy Quinton 

Yeah, so these were school-aged kids, ages 7 to 12, in one of the experiments. See if you can tell a difference with this exchange. The first one is between a child and a human.

 

Human 

What's number three?

 

Human 

The word is side.

 

Human 

I heard side. Say the sentence one more time.

 

Human 

The word is side. Okay.

 

Amy Quinton 

Now see if you can hear the difference between a child and a device.

 

Device 

What's number three?

 

Human 

The word is kid.

 

Device 

I misunderstood that. I heard kit or kid. Repeat the sentence one more time.

 

Human 

The word is kid.

 

Device 

Got it.

 

Amy Quinton 

Could you tell the difference?

 

Marianne Russ Sharp 

Yeah, on the word kid, you can hear the D is overemphasized.

 

Amy Quinton 

Yeah, I could also tell that she was slowing down her voice a little bit.

 

Marianne Russ Sharp 

Yeah.

 

Amy Quinton 

I think it's very slight, but Michelle and Georgia say it's a significant difference.

 

Michelle Cohn 

Kids actually produce even more kind of evidence that they perceive the devices as having a barrier.

 

Georgia Zellou 

People hyperarticulate to devices and kids do it even more. Bigger. Like there's a bigger difference between humans and devices for kids than adults produce.

 

Michelle Cohn 

It's consistent with this idea that they're building these mental models and they're also learning how to adapt their speech in different communicative situations. Kind of getting back at what is it revealing about us as humans?

 

Marianne Russ Sharp 

We've talked about how we speak to these devices, but what about the way they talk to us? Obviously, these machine-made voices have changed over time, becoming more human-like.

 

Amy Quinton 

Right, the technology has come a long way. Listen to one of the original Apple TTS voices.

 

Marianne Russ Sharp 

Wait. What's TTS?

 

Amy Quinton 

Text to speech. Computer speak. His name is Bruce.

 

Marianne Russ Sharp 

His? They have names?

 

Amy Quinton 

Yeah. Say something Bruce.

 

BRUCE 

She had your dark suit and greasy wash water all year.

 

Marianne Russ Sharp 

I got the first part, but I'm not sure I understood the end of that one.

 

Amy Quinton 

It definitely sounds more like a machine. Now, let's play the current Siri voice.

 

Marianne Russ Sharp 

Oh, I can do that. Hang on. Okay. Hey, Siri.

 

SIRI 

Yes.

 

Marianne Russ Sharp 

What's the temperature outside?

 

SIRI 

It's about 93 degrees outside.

 

Marianne Russ Sharp 

Thank you.

 

Marianne Russ Sharp 

Watch this. Hey, Siri, what is the weather?

 

SIRI 

It's currently clear and 92 degrees.

 

Marianne Russ Sharp 

That was an Australian accent, wasn't it?

 

Amy Quinton 

Yeah.

 

SIRI 

Hi. I'm Siri. Choose the voice you'd like me to use.

 

SIRI 

Hi, I'm Siri. Choose the voice you'd like me to use.

 

SIRI 

I'm Siri. Choose the voice you'd like me to use.

 

Amy Quinton 

Okay, so American, Australian, British, Indian, Irish, South African.

 

Marianne Russ Sharp 

I kind of liked them all.

 

Amy Quinton 

You know, there are lots of different voices and styles you can choose from as your default. Michelle and Georgia are studying how we perceive these different types of voices. Their work can also help inform engineers as they develop speech technology. The original Siri and Amazon Alexa voices were a type of speech called concatenative.

 

Marianne Russ Sharp 

Concatenative. What's that mean?

 

Amy Quinton 

Basically, a voiceover artist will come into a studio and record a bunch of sentences and phrases, or common words, so that computer engineers can piece together every combination of sounds to say any word, even if the voiceover artist never originally recorded that word. For example, the original Siri voiceover artist was a woman named Susan Bennett. And she sounded like this.

 

SIRI 

Hi, I'm Siri. I'm a digital assistant on Apple products. I will show sentences on the screen. Please read them aloud to me. They will always be...

 

Amy Quinton 

Michelle says one thing is very clear about that voice.

 

Michelle Cohn 

It sounds discernibly choppy. It looks choppy. If you look at the representation of the speech in a waveform or spectrogram, you can see there's like...it came from different places.
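
For readers following along in the transcript, here is a minimal sketch of the piecing-together idea behind concatenative speech described above. It is only an illustration under stated assumptions, not how Siri or Alexa were actually built: the unit names, the sample rate and the sine-tone "recordings" are all hypothetical stand-ins for snippets a voiceover artist would record. The sketch stitches stored snippets into new words and measures the jump in the signal at each seam, which is one rough way to picture the choppiness Michelle describes.

```python
import math

SAMPLE_RATE = 8000  # hypothetical sample rate for this toy example


def tone(freq_hz, n_samples):
    """Stand-in for a recorded snippet: a sine tone as a list of samples."""
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n_samples)]


# Hypothetical unit inventory. A real system would store recorded speech
# snippets here, not pure tones.
UNITS = {
    "k": tone(300, 400),
    "ih": tone(220, 800),
    "d": tone(180, 400),
    "t": tone(400, 400),
}


def synthesize(unit_names):
    """Concatenate stored units; return the samples and the seam positions."""
    samples, seams = [], []
    for name in unit_names:
        if samples:
            seams.append(len(samples))  # index where the next unit begins
        samples.extend(UNITS[name])
    return samples, seams


def seam_jumps(samples, seams):
    """Absolute change in the signal across each seam."""
    return [abs(samples[i] - samples[i - 1]) for i in seams]


# "kid" and "kit" reuse the same first two units and differ only in the last.
kid, kid_seams = synthesize(["k", "ih", "d"])
kit, kit_seams = synthesize(["k", "ih", "t"])

print(len(kid), kid_seams)
print(seam_jumps(kid, kid_seams))
```

Each toy snippet starts at zero but ends wherever its tone happens to be, so the signal generally does not line up at the seams. Real concatenative systems choose and smooth their joins far more carefully, which this sketch does not attempt.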

 

Amy Quinton 

The newer method of voice AI is called neural text to speech.

 

Michelle Cohn 

That's using machine learning to just intuit all of the patterns of a speaker's voice, all the ways that they pause and kind of abstract from that, and then apply that to the speech. So it's conditioned on all of the speaker's prior utterances but also the immediately preceding context.

 

Amy Quinton 

With neural text to speech, you get pauses and breath sounds.

 

Marianne Russ Sharp 

So it sounds much more human.

 

Amy Quinton 

Yeah. Listen, this voice is named Joanna. Her first sentence is neural TTS or text to speech.

 

JOANNA 

The boy might consider the trap.

 

Amy Quinton 

The second Joanna is concatenative.

 

JOANNA 

The boy might consider the trap.

 

Marianne Russ Sharp 

That's subtle. I mean, if you played them for me a few times, I think I could discern between the two, but it's a subtle difference.

 

Amy Quinton 

Should I play it again?

 

Marianne Russ Sharp 

Yeah.

 

Amy Quinton 

Okay. So the first one is neural TTS.

 

JOANNA 

The boy might consider the trap.

 

Amy Quinton 

The second Joanna is concatenative.

 

JOANNA 

The boy might consider the trap.

 

Marianne Russ Sharp 

Okay, I can hear it. I closed my eyes this time. And you can hear the first one flows, right? It just flows more nicely. Wow.

 

Amy Quinton 

But Michelle and Georgia also found that how human-like a device sounds can impact how well a listener understands it.

 

Marianne Russ Sharp 

So if it sounds more human, we're more likely to understand it?

 

Amy Quinton 

Well, you'd think so, but actually it's the exact opposite. They found that the choppier and more concatenative a device sounds, the better it's understood, the better it's heard, especially if there's any noise in the background. Georgia says this has some implications for tech companies.

 

Georgia Zellou 

They want the most naturalistic speech. But what we've found in our lab is a sort of trade-off between it sounding really naturalistic with that cutting-edge method of TTS generation and actually being clear. So the old method actually produces more intelligible speech, the way that we've measured it, even though the newer way does produce that really smooth, natural speech.

 

Amy Quinton 

I think the bottom line is clear speech, even from a computer, is heard better than casual, natural human-like speech.

 

Marianne Russ Sharp 

I guess that makes sense. I certainly can't understand everyone that I talk to. Some people talk really softly, some people mumble. Yeah, I'm not a soft talker. That's really not either of us, though. But tech companies making these devices are generating these different styles of voices. Is that using this neural text to speech?

 

Amy Quinton 

Yeah, and there is a new neural text to speech voice called newscaster.

 

Marianne Russ Sharp 

Okay. Does it sound like us?

 

Amy Quinton 

Yeah, here's Joanna again using neural TTS. I'm going to show you the difference between neural TTS and newscaster, so here's just neural TTS.

 

JOANNA 

Bill wouldn't discuss the dive.

 

Amy Quinton 

And then here's a newscaster one.

 

JOANNA 

Bill wouldn't discuss the dive.

 

Amy Quinton 

Bill wouldn't discuss the dive, Marianne.

 

Marianne Russ Sharp 

Oh, it's so like stereotypical. I almost want to break into, Bill wouldn't discuss the dive. Live at Five. Anyway.

 

Amy Quinton 

Tech companies creating voice AI are really pushing the boundary between what sounds like a device and what sounds like a human. Michelle and Georgia played a game with me to really illustrate this, Marianne, and I'm calling it "Bot or Not." Ready?

 

Marianne Russ Sharp 

I love it.

 

Georgia Zellou 

We're gonna play a voice and you say is it a bot or is it not a bot? Like is it a human?

 

Michelle Cohn 

Is it a text to speech like a generated device voice or was it recorded?

 

VOICE 

We saw a flock of wild geese.

 

Amy Quinton 

That sounds generated.

 

Michelle Cohn 

Mm hmm. That was Kimberly.

 

Georgia Zellou 

The Alexa voice. Yeah, one of the Alexa voices.

 

Marianne Russ Sharp 

Okay, wait, play it again.

 

VOICE 

We saw a flock of wild geese.

 

Marianne Russ Sharp 

Okay. Yeah, that one is pretty clear.

 

Amy Quinton 

Yeah, but for me anyway, it got tougher.

 

VOICE 

The farmer harvested his crop.

 

Amy Quinton 

Oh, that's harder to tell. Play it again.

 

VOICE 

The farmer harvested his crop.

 

Michelle Cohn 

No.

 

Amy Quinton 

I think that's human. No?

 

Michelle Cohn 

That's an Alexa voice.

 

Amy Quinton 

Because it was breathy.

 

Georgia Zellou 

Yeah. Yeah. The voice quality takes on... exactly, yeah.

 

Amy Quinton 

So I saved a couple of voices for you to guess whether it's a bot or not, Marianne.

 

Marianne Russ Sharp 

Okay, good. And I want everyone to know, this is the first time I'm hearing these voices. So if I'm honest, I'm a little nervous.

 

Amy Quinton 

You'll get it right. Okay, so here's the first one.

 

VOICE 

She made the bed with clean sheets.

 

Marianne Russ Sharp 

Bot.

 

Amy Quinton 

Exactly.

 

Marianne Russ Sharp 

Yes.

 

Amy Quinton 

Okay, how about this one.

 

VOICE 

Ruth hopes he heard about the hips.

 

Marianne Russ Sharp 

That is not a bot. Is it?

 

Amy Quinton 

It's not a bot.

 

Marianne Russ Sharp 

Oh good. I could feel like there was a human quality to it. Right.

 

Amy Quinton 

Okay. Okay, so this one's really different. It's a different one. And it's a bit harder. Ready?

 

VOICE 

The cat found the bag.

 

Marianne Russ Sharp 

Okay, can I hear it again?

 

VOICE 

The cat found the bag.

 

Marianne Russ Sharp 

One more time.

 

VOICE 

The cat found the bag.

 

Marianne Russ Sharp 

It could go either way. I'm gonna say bot.

 

Amy Quinton 

You're gonna say bot?

 

Marianne Russ Sharp 

Yeah, even though I think it is so close. But there's like a gut instinct in me that's saying it's a bot.

 

Amy Quinton 

It is a bot.

 

Marianne Russ Sharp

Yes.

 

Amy Quinton

I guessed that one totally wrong.

 

Marianne Russ Sharp 

It just had, like, there was a quality to it. You know what I mean?

 

Amy Quinton 

Well, you know, researchers have found that adults, older adults like us, have a more difficult time telling the difference between voice AI and human voices compared with children.

 

Marianne Russ Sharp 

That makes sense. Because actually, if I wasn't sitting here with my eyes closed, listening to that really carefully, I'm sure I couldn't have told the difference. And, you know, thinking about kids, it makes sense, because they have grown up with all different kinds of voice technology. They didn't grow up with just the Speak and Spell. Right?

 

Amy Quinton 

Exactly. So Georgia wonders, what happens when voice AI becomes even more human-like? Like the computer HAL in "2001: A Space Odyssey."

 

Georgia Zellou 

You know, our worst fear is that a computer or a machine is kind of too smart for us and sort of takes over.

 

Movie Clip 

This mission is too important for me to allow you to jeopardize it. I don't know what you're talking about, HAL.

 

Marianne Russ Sharp 

Oh, yeah, that gives me that creepy feeling.

 

Amy Quinton 

Yeah, Georgia says there's actually a technical term for it.

 

Georgia Zellou 

When a nonhuman entity is kind of too human, we kind of get creeped out by it. Right. It's called this Uncanny Valley effect. And these voices are now getting more and more naturalistic. So we're kind of curious. What does that mean as we're talking to a device? We know it's a device, but it sounds just like a human. Would that create that sort of Uncanny Valley effect or not? And yeah, this is one of the lines of research that we're interested in.

 

Marianne Russ Sharp 

Okay, so I kind of created a faux Uncanny Valley effect in college.

 

Amy Quinton 

What?

 

Marianne Russ Sharp 

Okay, this is really weird. When I went to college, this is dating myself, you used to have to register for classes on the phone. And the voice in the phone system really sounded similar to my voice, I thought. And I guess now with everything I've learned in this podcast, I'm not sure if it was based on a human voice, or if it was computer generated. And so I would go around and imitate it, sort of creep my friends out. It was like, um, "Welcome to MU cares. Miami University's call and register easily system. Please wait." So did that sound kind of bot-like?

 

Amy Quinton 

It did.

 

Marianne Russ Sharp 

It was very stilted. And you know, it was a phone tree where you had to press one for, you know, this and that. So maybe it was a real voice that was recorded, and then like they inserted the words that they needed to? I don't know, Amy.

 

Amy Quinton 

That is kind of freaky. Well, researchers are also curious to know what happens when and if voice AI could start to mirror us and adapt or change the way it speaks based on who it's speaking to.

 

Marianne Russ Sharp 

Is that technically possible?

 

Amy Quinton 

No idea. But think about it. Georgia calls it style shifting. For instance, I don't speak to a child the same way I speak to an adult, I'll change my word choice and maybe the tone of my voice. So why can't Alexa?

 

Marianne Russ Sharp 

You make a great point. Maybe that's the next big step, to recognize that it's a child's voice. But on a more serious note, I can't help thinking about the implications if voice AI can, say, impersonate people.

 

Amy Quinton 

Michelle mentioned that it's something to take into account.

 

Michelle Cohn 

Text to speech is getting better all the time. So it's, it's hard to know where the boundary is. You can type whatever you want, and these voices can produce it. So there's a real potential for harm too.

 

Marianne Russ Sharp 

Definitely something to grapple with. And it's also easy to imagine the possibilities. What if it could learn to recognize say tension in a voice and then play some calming music?

 

Amy Quinton 

Something to look forward to in the future, I guess, or maybe something to be afraid of? I'm not really sure. AI is moving very fast.

 

ALEXA 

I think that the future of AI is going to be very interesting. I think that we will see a lot of new technologies that will be able to help us in the future.

 

Amy Quinton 

Can you give me an example?

 

ALEXA 

Sure. Another exciting development is the rise of AI powered virtual assistants such as Alexa or Google Assistant.

 

Amy Quinton 

It's so weird she would say that.

 

Marianne Russ Sharp 

Actually, what's really weird is that she mentioned Google Assistant. What she should really have mentioned, let's be honest, is Unfold. And the fact that you can hear all of our previous episodes at our website, ucdavis.edu/unfold.

 

ALEXA 

Getting Unfold from Amazon Music.

 

Marianne Russ Sharp 

You can learn more about Georgia Zellou and Michelle Cohn's research at our website as well. I'm Marianne Russ Sharp.

 

Amy Quinton 

And I'm Amy Quinton. Thanks so much for listening. Unfold is a production of UC Davis. Original music for Unfold comes from Damien Verrett and Curtis Jerome Haynes.

 

ALEXA 

Getting Unfold from Amazon Music.