Audio samples for Real-time Textless Dialogue Generation

Abstract: We propose a real-time textless dialogue generation (RTTL-DG) model capable of: (1) generating real-time responses with minimal delay based on streaming input from the conversation; (2) producing coherent and naturalistic responses with spontaneous expressions; (3) enabling fluid turn-taking, including the ability to interrupt or handle interruptions; (4) generating backchannels during others' speech; and (5) producing responses of diverse lengths, ranging from short back-and-forth phrases to long-form storytelling.

 

Single response generation


Response with pitch variation

Dialogue context Wow, how long were you in there for?
Um, about a year.
Nice to see you weren't in there too long.
RTTL-DG
Cascaded
Ground-Truth

Response with filler words, repetitions, and hesitations

Dialogue context Oh. I'm trying to do it every day. I figure it'll come to like $150 or something.
Yeah, really, why not? You know, you get paid for talking to people and, you know, okay, fine, why not?
Yeah I was trying to think about what kind of crazy research grant they must have for this. That's a lot of money to be shelling out.
RTTL-DG
Cascaded
Ground-Truth
-------------------------
Dialogue context Yeah, we can make one every day to like March 22nd or something
Oh, wow. Yeah, so that's pretty nice.
Yeah, great. So what are you studying there?
RTTL-DG
Cascaded
Ground-Truth
-------------------------
Dialogue context You just don't live on campus.
Right.
How's the, uh, Wisconsin campus, all right?
RTTL-DG
Cascaded
Ground-Truth
-------------------------

Response with pauses

Dialogue context Must have had a lot of participants from this area.
So, what are you, uh, studying there?
I'm studying biology. What do you do in Madison?
RTTL-DG
Cascaded
Ground-Truth
-------------------------
Dialogue context That's what it is, so I might have a little trouble hearing you, so I'll try my best though.
Okay, I have a cold too, so...
Oh yeah, you do sound kinda... is it cold down there?
RTTL-DG
Cascaded
Ground-Truth
-------------------------

Response with laughter

Dialogue context I don't know. I just talked to somebody from Minneapolis, I guess, yesterday.
Uh-huh.
And I said, what's the weather like? And she said, it's, you know. 15 below. I said, oh, it was windchilling. She goes, no. Just 50 below.
RTTL-DG
Cascaded
Ground-Truth
-------------------------
Dialogue context Yeah, cuz I've had some like I've had a couple writing classes and I've had like a couple media classes and then like an interviewing class and interpersonal communications
I'm cool.
Uh, yeah, so it's a little bit of everything basically.
RTTL-DG
Cascaded
Ground-Truth

Response with breathing

Dialogue context Yeah, like running a funeral. Oh man, I'm so...
Okay!
I didn't really know what to say to that. I'm like, that sounds insane.
RTTL-DG
Cascaded
Ground-Truth
-------------------------
Dialogue context Fuck. I got that test tonight.
I think I cheated on a couple of them, so...
That's it. I'll try to do
RTTL-DG
Cascaded
Ground-Truth


Dialogue continuation/generation


Conversation #1

Dialogue context I'm sure it is. Do you two have any plans to keep in touch often?
We talked about it, but it won't be the same.
That's understandable. Have you thought about getting a new roommate, or would you prefer living alone for a while?
RTTL-DG
Cascaded

Conversation #2

Dialogue context By understanding his point of view his radicalization, it make Killmonger more complex and more interesting.
The villain must always fall in MCU but Killmonger will be remembered!
Do you think Thanos will achieve the same success as a villain?
RTTL-DG
Cascaded