https://huggingface.co/spaces/hf-audio/open_asr_leaderboard
Feedback Suggestion
Please add support for nvidia/parakeet-tdt-0.6b-v3 in the local transcription software.
This model is listed on the Open ASR Leaderboard and demonstrates stronger multilingual performance.
It achieves higher transcription accuracy than openai/whisper-large-v3.
It is also more than 16× faster than openai/whisper-large-v3-turbo, making it highly efficient.
Additionally, since the app already provides the “Configure Your AI – Bring your own API key to power Vowen’s AI features” setting, we suggest building on it with the following features:
Translation feature: Transcribed content could be directly translated into a target language.
Real-time transcription + real-time translation: Making the software more suitable for global use cases and multilingual workflows.
Voice Activity Detection (VAD): Automatically segment transcription based on natural speech pauses instead of manual hotkeys.
Speaker Diarization: Distinguish between different speakers in conversations, improving readability and usability in meetings or multi-speaker recordings.
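
To illustrate the VAD suggestion above, here is a minimal sketch of pause-based segmentation using a simple per-frame energy threshold. It is only an assumption about how such a feature could work, not a description of Vowen's internals; the function name `split_on_pauses` and all parameter names and default values are hypothetical, and a production VAD would more likely rely on a trained model than on a fixed threshold.

```python
import math
from typing import List, Sequence, Tuple


def split_on_pauses(
    samples: Sequence[float],
    sample_rate: int,
    frame_ms: int = 30,             # hypothetical default frame size
    energy_threshold: float = 0.01,  # hypothetical RMS level treated as speech
    min_pause_ms: int = 300,         # hypothetical pause length that ends a segment
) -> List[Tuple[float, float]]:
    """Return (start_sec, end_sec) speech segments separated by pauses."""
    frame_len = max(1, sample_rate * frame_ms // 1000)
    min_pause_frames = max(1, min_pause_ms // frame_ms)

    segments: List[Tuple[float, float]] = []
    start = None          # sample index where the current segment began
    last_voiced_end = 0   # sample index just past the last voiced frame
    silent_run = 0        # consecutive silent frames inside a segment

    for offset in range(0, len(samples), frame_len):
        frame = samples[offset:offset + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms >= energy_threshold:
            if start is None:
                start = offset
            last_voiced_end = offset + len(frame)
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_pause_frames:
                # The pause is long enough: close the current segment.
                segments.append((start / sample_rate, last_voiced_end / sample_rate))
                start = None
                silent_run = 0

    if start is not None:
        # Audio ended while speech was still in progress.
        segments.append((start / sample_rate, last_voiced_end / sample_rate))
    return segments


# Synthetic example: speech, a long pause, then speech again.
audio = [0.5] * 100 + [0.0] * 100 + [0.5] * 100
print(split_on_pauses(audio, sample_rate=1000, frame_ms=10, min_pause_ms=50))
```

Each returned segment could then be passed to the transcription model on its own, so a segment ends at a natural pause instead of at a hotkey press.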

Completed
Feature Request
4 months ago

dao tian