Well, if this is a deep fake, then the audio editing is the fucking best I've ever heard. I've got decades of experience working with audio, and I can tell you this: there is no way someone could string together words from past audio recordings and have the sentences come out with the exact, in-the-moment speech pattern you hear on this recording. The speed, pronunciation, and flow of thought, not to mention the substantive information just spilling out of his mouth, are "in the moment", not the result of editing. This ain't no deep fake.
An audio deep fake isn't about audio editing. It is about creating audio with a trained AI. It is an arms race now: the AI gets better, and so do the tools to detect it:
https://www.nisos.com/blog/synthetic-audio-deepfake/
The speed is still a hint: the AI doesn't change the speed (yet). Some say the "Clapper interrogation" was too fast; others say he was just nervous.