Add support for nvidia/parakeet-tdt-0.6b-v3 (Windows)

https://huggingface.co/spaces/hf-audio/open_asr_leaderboard

Feedback Suggestion

Please add support for nvidia/parakeet-tdt-0.6b-v3 in the local transcription software.

  • This model is listed on the Open ASR Leaderboard and demonstrates stronger multilingual performance.

  • It achieves higher transcription accuracy than openai/whisper-large-v3.

  • It is also more than 16× faster than openai/whisper-large-v3-turbo, making it highly efficient.

Additionally, since the “Configure Your AI – Bring your own API key to power Vowen’s AI features” setting is already available, we suggest expanding functionality with:

  • Translation feature: Transcribed content could be directly translated into a target language.

  • Real-time transcription + real-time translation: Making the software more suitable for global use cases and multilingual workflows.

  • Voice Activity Detection (VAD): Automatically segment transcription based on natural speech pauses instead of manual hotkeys.

  • Speaker Diarization: Distinguish between different speakers in conversations, improving readability and usability in meetings or multi-speaker recordings.
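
To illustrate the VAD suggestion above, here is a minimal sketch of pause-based segmentation using a simple energy threshold. It is not how Vowen or any particular model works. The function name `segment_by_pauses` and all parameter names and default values are hypothetical, chosen only for this example, and a production system would more likely use a trained VAD model than raw energy.

```python
from typing import List, Tuple


def segment_by_pauses(
    samples: List[float],
    sample_rate: int,
    frame_ms: int = 20,             # assumed frame size, illustrative only
    energy_threshold: float = 0.01, # assumed RMS level separating speech from silence
    min_pause_ms: int = 300,        # assumed pause length that ends a segment
) -> List[Tuple[float, float]]:
    """Return (start_sec, end_sec) speech segments, split at long pauses."""
    frame_len = max(1, sample_rate * frame_ms // 1000)
    min_pause_frames = max(1, min_pause_ms // frame_ms)
    n_frames = len(samples) // frame_len

    segments: List[Tuple[float, float]] = []
    start = None     # frame index where the current segment began
    silent_run = 0   # consecutive silent frames inside the current segment

    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        rms = (sum(x * x for x in frame) / len(frame)) ** 0.5
        if rms >= energy_threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_pause_frames:
                # Close the segment at the first frame of the pause.
                end = i - silent_run + 1
                segments.append((start * frame_len / sample_rate,
                                 end * frame_len / sample_rate))
                start = None
                silent_run = 0

    if start is not None:
        # Audio ended mid-segment; drop any trailing silent frames.
        end = n_frames - silent_run
        segments.append((start * frame_len / sample_rate,
                         end * frame_len / sample_rate))
    return segments
```

Each returned segment could then be sent to the transcription model on its own, so that text appears after each natural pause rather than after a hotkey release.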


Status

Completed

Board
💡

Feature Request

Date

4 months ago

Author

dao tian
