I’d like to use AI to generate transcripts of audio recordings with multiple speakers. I write about local politics for one of my websites and routinely cover the local council meetings. The problem, of course, is that several speakers take turns talking throughout each meeting.
I’d like to create a transcript from an audio file like:
Speaker-1: “Blah Blah Blah”
Speaker-2: “Blue Blue Blue”
From what I have heard, OpenAI’s Whisper works very well for a single speaker, but it isn’t designed to handle multiple speakers.
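To make the target format concrete, here is a rough sketch of how such a transcript might be assembled. It assumes two inputs that are not part of any specific tool shown here: timestamped text segments from a speech-to-text step, and timestamped speaker turns from a separate speaker-identification (diarization) step. The names `Segment`, `Turn`, and `label_segments` are hypothetical and exist only for this illustration.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    """A piece of transcribed text with start/end times in seconds (assumed input)."""
    start: float
    end: float
    text: str


class Turn(NamedTuple):
    """A span of time attributed to one speaker (assumed diarization output)."""
    start: float
    end: float
    speaker: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments: List[Segment], turns: List[Turn]) -> List[str]:
    """Give each segment the speaker whose turn overlaps it most,
    merge consecutive segments from the same speaker, and format the lines."""
    merged = []  # list of [speaker, text] pairs
    for seg in segments:
        best = max(
            turns,
            key=lambda t: overlap(seg.start, seg.end, t.start, t.end),
            default=None,
        )
        if best is None or overlap(seg.start, seg.end, best.start, best.end) == 0:
            speaker = "Unknown"  # no speaker turn covers this segment
        else:
            speaker = best.speaker
        text = seg.text.strip()
        if merged and merged[-1][0] == speaker:
            merged[-1][1] += " " + text
        else:
            merged.append([speaker, text])
    return [f'{speaker}: "{text}"' for speaker, text in merged]


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the output.
    segments = [
        Segment(0.0, 2.0, " Blah Blah"),
        Segment(2.0, 4.0, " Blah"),
        Segment(4.0, 6.0, " Blue Blue Blue"),
    ]
    turns = [Turn(0.0, 4.2, "Speaker-1"), Turn(4.2, 7.0, "Speaker-2")]
    for line in label_segments(segments, turns):
        print(line)
```

Matching on the largest time overlap is just one simple choice; it tolerates small disagreements between the two sets of timestamps, but it will mislabel a segment in which two people speak at once.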