Question

Many “large” models of this faculty predict the most likely token given some context window. For 10 points each:
[10e] Name this faculty modeled by LLaMA (“llama”) and GPT-4. Computational tasks associated with this faculty include part-of-speech tagging and machine translation.
ANSWER: natural language [accept natural language processing or NLP]
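The lead-in describes models that "predict the most likely token given some context window." A minimal sketch of that idea follows; the toy vocabulary, context, and logits are invented for illustration and do not come from any real model.

```python
import numpy as np

# Toy next-token prediction: a model scores every vocabulary item given the
# context window, softmaxes the scores into probabilities, and the "most
# likely token" is the argmax. Vocabulary, context, and logits are made up.
vocab = ["the", "llama", "ate", "grass", "<eos>"]
context = ["the", "llama", "ate"]

logits = np.array([0.1, 0.2, 0.3, 2.5, 0.4])  # pretend model outputs

probs = np.exp(logits - logits.max())  # softmax, shifted for stability
probs /= probs.sum()

for tok, p in zip(vocab, probs):
    print(f"P({tok!r} | {' '.join(context)}) = {p:.3f}")
print("most likely next token:", vocab[int(np.argmax(probs))])  # -> 'grass'
```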
[10h] Scaling up transformer context lengths is limited by the quadratic memory cost of this operation. Its usefulness for modeling sequential data led Vaswani et al. to declare that this operation “is all you need.”
ANSWER: attention [accept self-attention; accept multi-head attention; accept “attention is all you need”]
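The quadratic memory cost mentioned here comes from the n-by-n matrix of pairwise scores that scaled dot-product self-attention builds for a length-n sequence. A minimal sketch, assuming a single head with no learned Q/K/V projections and arbitrary dimensions:

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over an (n, d) sequence.
    Single head, no learned projections -- illustration only."""
    n, d = x.shape
    scores = x @ x.T / np.sqrt(d)                 # (n, n) pairwise scores
    scores -= scores.max(axis=-1, keepdims=True)  # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x                            # (n, d) output

for n in (512, 1024, 2048):
    x = np.random.randn(n, 64)
    self_attention(x)
    # Doubling the context length quadruples the (n, n) score matrix,
    # which is the quadratic memory cost that limits context scaling.
    print(f"n={n}: score matrix has {n * n:,} entries")
```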
[10m] Words passed to a transformer are embedded into these objects, whose usage is exemplified by “king minus man plus woman equals queen.” The spaces of these mathematical objects are closed under addition and scalar multiplication.
ANSWER: word vectors [or word vector embeddings; accept vector spaces]
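The “king minus man plus woman equals queen” clue is literal arithmetic on word vectors. A minimal sketch with hand-made 2-D stand-ins for real learned embeddings (which have hundreds of dimensions and are trained from text, e.g. by word2vec or GloVe):

```python
import numpy as np

# Hand-made 2-D "embeddings"; values are invented for illustration.
emb = {
    "king":  np.array([0.9, 0.9]),
    "queen": np.array([0.1, 0.9]),
    "man":   np.array([0.9, 0.1]),
    "woman": np.array([0.1, 0.1]),
    "apple": np.array([0.5, 0.2]),   # distractor word
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# The space is closed under addition and scalar multiplication, so this
# combination is itself a vector in the same space.
target = emb["king"] - emb["man"] + emb["woman"]

# Nearest neighbor of the result, excluding the three input words.
best = max((w for w in emb if w not in {"king", "man", "woman"}),
           key=lambda w: cosine(emb[w], target))
print(best)  # -> 'queen'
```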
<Other Science>

Summary

Tournament                  Date        Exact Match?  Heard  PPB    Easy %  Medium %  Hard %
2023 ACF Winter @ Columbia  11/11/2023  Y             9      15.56  78%     78%       0%

Data

Team         Opponent     Part 1 (10e)  Part 2 (10h)  Part 3 (10m)  Total
Columbia A   Penn B       10            0             10            20
Columbia B   Columbia C   10            0             0             10
Haverford    Cornell C    10            0             10            20
Princeton A  NYU B        0             0             10            10
Yale A       Penn A       10            0             0             10
Vassar       Princeton B  10            0             10            20
Rutgers A    Rowan A      10            0             10            20
NYU A        Yale B       10            0             10            20
Rutgers B    Yale C       0             0             10            10
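
As a sanity check, the summary figures above (9 heard, 15.56 PPB, 78% easy, 78% medium, 0% hard conversion) follow directly from these rows; the short script below, with the rows transcribed by hand, just redoes that arithmetic.

```python
# Per-game bonus results from the table above
# (part 1 = easy, part 2 = hard, part 3 = medium, as in the question).
rows = [
    ("Columbia A",  "Penn B",      10, 0, 10),
    ("Columbia B",  "Columbia C",  10, 0,  0),
    ("Haverford",   "Cornell C",   10, 0, 10),
    ("Princeton A", "NYU B",        0, 0, 10),
    ("Yale A",      "Penn A",      10, 0,  0),
    ("Vassar",      "Princeton B", 10, 0, 10),
    ("Rutgers A",   "Rowan A",     10, 0, 10),
    ("NYU A",       "Yale B",      10, 0, 10),
    ("Rutgers B",   "Yale C",       0, 0, 10),
]

heard = len(rows)                                     # 9
ppb = sum(e + h + m for *_, e, h, m in rows) / heard  # 140 / 9 = 15.56

# Conversion rate per part: fraction of teams that scored on it.
easy, hard, medium = (sum(part > 0 for part in parts) / heard
                      for parts in zip(*[(e, h, m) for *_, e, h, m in rows]))

print(f"heard={heard}, PPB={ppb:.2f}")
print(f"easy={easy:.0%}, medium={medium:.0%}, hard={hard:.0%}")
```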