Some researchers/engineers from the company Dessa made a Text To Speech (TTS) engine which uses the voice of Joe Rogan. Articles about this mainly focus on the quality of the voice, but systems like Tacotron 2 were already capable of very convincing TTS voices before this.
Is the novelty here that there was no additional recording done by the voice model specifically to make the TTS engine?
I haven’t seen any proof of this, but his initial reactions definitely sound like he was not involved with the team who did this.
It sounds like he's reading ads. In my head the next step would be for it to learn how to fuck it up: difficult words would be stretched and not quite right, and it would use wrong terms and then correct itself.
Or just yell a few words; it could learn that from UFC, I guess.
He's been talking about that thing so many times that it was just a matter of time.
I read about this somewhere else too, but all the articles go with how accurately it sounds like him and I just don’t think it does. It is his voice, but there is no animation in it: none of the sudden raises in voice he has, or the errors like @anon25377527 said. This is like a super clean version of him that is incredibly relaxed and rehearsed, which is not him normally.
Edit: listening to it again I remembered what I was thinking: a lot of the sentences end with the same inflection, down. The pauses between sentences don’t feel natural. It is amazing for sure, but either it is not good enough yet or they picked the wrong target. A slower, more level personality might have been more convincing.
Combine this with deepfake video and a concerted, organized effort to provide video from different angles of a major event taking place (political rally?) and it will give a whole new meaning to the term “Fake News”.
Who’s to say which version of events is accurate?
I am actually really excited for that eventuality, I have been thinking about it since the deepfakes appeared and looked almost right. Both of these are now so close to being correct that it will be pretty amazing when it happens.
As for the way I am excited though? It will force a change in thinking in the viewer and a level of forethought for the person being impersonated, and information will have to be verifiable in a way we just don’t have now. It is sad to think that we will literally have to deceive people on a scale never known before to get some sort of awareness and accuracy that has been needed and eroded for a long time now.
Or that could all be wishful thinking and this will just make everything worse while people run around hair-on-fire style, not doing anything about it but screaming that the end is nigh.
Half the people will be scarred for life, a quarter will have a great time, and the people that are left are pussies that Joe can eat to death or something.