Question

The robustness of this task can be improved by applying transforms like SPLICE and CMVN. An architecture of composed weighted finite-state transducers is used for this task by the open-source package Kaldi, which does not support the frequently-used CTC loss function in alignment during this task. Classical algorithms for this task extract 39 MFCC features for each time frame of a sliding window, then feed them into an (*) GMM-HMM. Context-dependent models are used in this task to account for allophones. Probabilistic methods for this task compute the product of an acoustic model and a language model to select the most likely word sequence given a sound signal. For 10 points, speech synthesis is the reverse of what natural language processing task used by digital assistants like Siri to process user input?

ANSWER: speech recognition [or automatic speech recognition or ASR; accept transcription or speech-to-text or STT; accept speech alignment; prompt on speech processing or natural language processing until read; prompt on vocal recognition or voice recognition; prompt on answers like understanding speech; prompt on captioning by asking “what more general task is used to create captions?”; reject “text-to-speech” or “TTS” or “speech synthesis”; reject “speaker identification” or “speaker verification” or “vocal identification”]
<VD, Other Science>

Buzzes

Player | Team | Opponent | Buzz Position | Value
Ali Hamzeh | Toyota Tundra Turbos — Twisting Truths - Tackling Trivia - Taming Titans | In the Mood for Buzz | 81 | 10

Summary

2024 ARCADIA Online | 2025-05-17 | Y | 1 | 100% | 0% | 0% | 81.00