The IBM Watson Speech to Text service uses speech recognition capabilities to convert Arabic, English, Spanish, French, Brazilian Portuguese, Japanese, Korean, German, and Mandarin speech into text.
Watson Speech to Text supports .mp3, .mpeg, .wav, .opus, and .flac files up to 200 MB.
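The format and size limits above can be checked locally before a file is submitted. The following is only a minimal sketch, not part of the service or its SDK: the helper name `check_audio_file` is hypothetical, and it assumes that "200 MB" means 200,000,000 bytes and that the limit is inclusive.

```python
from pathlib import Path
from typing import List

# Extensions taken from the list above.
SUPPORTED_EXTENSIONS = {".mp3", ".mpeg", ".wav", ".opus", ".flac"}
# Assumption: 1 MB = 1,000,000 bytes; the service may count differently.
MAX_SIZE_BYTES = 200 * 1000 * 1000


def check_audio_file(name: str, size_bytes: int) -> List[str]:
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = []
    # Compare extensions case-insensitively, so "talk.WAV" is accepted.
    extension = Path(name).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems


if __name__ == "__main__":
    print(check_audio_file("interview.wav", 5_000_000))
    print(check_audio_file("notes.txt", 1_000))
```

This check looks only at the file name and size; it does not verify that the audio data inside the file is actually encoded in the stated format.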
Voice Model:
Keywords to spot: