Question

Note to moderator: Read answerline carefully. This type of data is processed with the algorithm Snowball, which is an improved version of an algorithm developed by Martin Porter. This type of data may be organized into synsets, which can assist with the task of WSD that may be accomplished with the Lesk algorithm. This type of data is the focus of a tidyverse-based David Robinson and Julia Silge (“SILL-gee”) textbook which uses tf-idf (“t f - i d f”) to analyze this type of data. The Penn Treebank includes this type of data annotated with (*) POS tags. N-gram models may be trained on this type of data that has been processed by being stemmed or tokenized. This type of data may have its valence assessed in sentiment analysis carried out on the decontextualized forms of its corpora. For 10 points, NLP is concerned with the “natural processing” of what type of data used to train LLMs and chatbots? ■END■

ANSWER: text data [accept natural language processing data or language data; accept words; accept WordNet; accept writing or handwriting or written text; accept strings; accept documents; accept dictionary or dictionaries; accept thesauruses or thesauri; accept text corpus or text corpora until “corpora” is read; accept tokens until “tokenized” is read; accept stems until “stemmed” is read; accept lemmas; accept lemmatization; accept descriptions of written or transcribed speech or language; accept Text Mining with R; prompt on topics by asking “of what?”; prompt on speech or parts of speech by asking “in what format?”; reject “voice” or descriptions of recorded noises]
<CH, Other Science: Math>
Buzzes

Player | Team | Opponent | Buzz Position | Value
Travis Tu | I will play anything with a buzzer in front of me | SGV Capital | 80 | 15

Summary

2023 ARCADIA at UC Berkeley | Premiere | Y | 2 | 100% | 100% | 0% | 53.50
2023 ARCADIA at Carleton University | Premiere | Y | 3 | 100% | 0% | 33% | 123.33
2023 ARCADIA at Claremont Colleges | Premiere | Y | 1 | 100% | 100% | 0% | 80.00
2023 ARCADIA at Indiana | Premiere | Y | 5 | 100% | 20% | 0% | 103.40
2023 ARCADIA at RIT | Premiere | Y | 2 | 100% | 50% | 0% | 86.00
2023 ARCADIA at WUSTL | Premiere | Y | 3 | 100% | 0% | 0% | 117.67