This architecture was notable for requiring less time to train than other recurrent neural architecture. For 10 points each:
[10m] Identify this deep learning architecture, first proposed in the paper “Attention Is All You Need” by Ashish Vaswani on the Google Brain team. This architecture is used to train large language models like Chat-GPT.
ANSWER: transformer
[10e] GPT was created by OpenAI, which is owned by this big tech company. In 2023, GPT-4 was integrated as part of Bing, a search engine owned by this company.
ANSWER: Microsoft
[10h] This precursor to attention-based architectures is a type of recurrent neural network that uses a cell that stores and retrieves information to handle the vanishing gradient problem.
ANSWER: Long short-term memory network [or LSTM]
<Leo Law, Other Science>