I’d like to find an AI model that can parse and transcribe an audio file that includes multiple speakers.
I know that Whisper is very good at parsing audio from a single speaker; however, I really need something that can handle multiple speakers (anywhere between 5 and 10).
The use case is recording public meeting sessions, generating summaries, and getting a full transcription that is broken down by speaker…
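
To make "broken down by speaker" concrete, here is a rough sketch of the output I have in mind. It assumes two inputs that I don't have yet: timestamped text segments from a transcription model, and timestamped speaker turns from a separate speaker-diarization step. The tuple layouts, the function names, and the `SPEAKER_00`-style labels are all hypothetical, chosen only for illustration; no real library API is used here. Each text segment is given to whichever speaker overlaps it the most in time.

```python
from collections import defaultdict


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments, turns):
    """Attach a speaker label to each transcript segment.

    segments: list of (start, end, text)     -- hypothetical transcription output
    turns:    list of (start, end, speaker)  -- hypothetical diarization output
    """
    labeled = []
    for start, end, text in segments:
        # Sum the overlap per speaker, since one speaker may have several turns.
        totals = defaultdict(float)
        for t_start, t_end, speaker in turns:
            totals[speaker] += overlap(start, end, t_start, t_end)
        best = max(totals, key=totals.get, default=None)
        if best is None or totals[best] == 0.0:
            best = "UNKNOWN"  # no diarization turn covers this segment
        labeled.append((best, text.strip()))
    return labeled


def merge_by_speaker(labeled):
    """Join consecutive segments from the same speaker into one block."""
    merged = []
    for speaker, text in labeled:
        if merged and merged[-1][0] == speaker:
            merged[-1] = (speaker, merged[-1][1] + " " + text)
        else:
            merged.append((speaker, text))
    return merged


# Made-up example data, not real model output.
segments = [
    (0.0, 4.0, "Welcome everyone."),
    (4.0, 7.5, "First item is the budget."),
    (7.5, 11.0, "I have a question about that."),
]
turns = [
    (0.0, 7.8, "SPEAKER_00"),
    (7.8, 11.0, "SPEAKER_01"),
]

for speaker, text in merge_by_speaker(label_segments(segments, turns)):
    print(f"{speaker}: {text}")
```

With 5 to 10 speakers who sometimes talk over each other, picking the single largest overlap is a simplification, so overlapping speech would need more careful handling than this.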