Question
Answer the following about the methods used by Google’s DeepMind to train AlphaStar, an agent developed to play StarCraft II that reached the highest rank of Grandmaster in 2019. For 10 points each:
[10h] Two answers required. The reinforcement learning procedure used by AlphaStar is based on a policy gradient algorithm in a framework named for these two entities. A popular RL algorithm is named for “Asynchronous Advantage” and these two entities, where policy and value functions are simultaneously learned and updated.
ANSWER: actors and critics [accept (Asynchronous) Advantage Actor-Critic; prompt on A2C or A3C]
[10m] The supervised and reinforcement stages of AlphaStar combined losses using this optimizer. Momentum and RMSProp are precursors to this often-default ML optimization method that has a four-letter acronym.
ANSWER: Adam algorithm [or Adaptive Moment Estimation]
[10e] The multi-agent stage of AlphaStar avoids solely using naive self-play because of its tendency to chase these constructs, leading to an infinite loop. In graphs, these constructs are paths that have the same first and last vertex.
ANSWER: cycles [or circuits]
<Science - Other Science - Math>
Summary
Tournament | Date | Exact Match? | Heard | PPB | Easy % | Medium % | Hard % |
2024 ARGOS @ Brandeis | 03/22/2025 | Y | 3 | 16.67 | 67% | 67% | 33% |
2024 ARGOS @ Chicago | 11/23/2024 | Y | 6 | 15.00 | 83% | 50% | 17% |
2024 ARGOS @ Christ's College | 12/14/2024 | Y | 3 | 6.67 | 33% | 33% | 0% |
2024 ARGOS @ Columbia | 11/23/2024 | Y | 3 | 10.00 | 67% | 33% | 0% |
2024 ARGOS @ Stanford | 02/22/2025 | Y | 3 | 10.00 | 67% | 33% | 0% |
2024 ARGOS Online | 03/22/2025 | Y | 3 | 13.33 | 100% | 33% | 0% |
2024 ARGOS @ McMaster | 11/17/2024 | N | 5 | 14.00 | 100% | 40% | 0% |
Data
Team | Opponent | Part 1 | Part 2 | Part 3 | Total |
Banned from ARGOS | Pahkin' the Ahgo | 0 | 10 | 10 | 20 |
Hu up Jinning they Tao | "Powers a question on Stancyzk" that's a clown question bro | 0 | 0 | 0 | 0 |
BHSU ReFantazio | hawk two of | 10 | 10 | 10 | 30 |
Clown Senpais | The Love Song of J Alfred PrufRock and Roll All Nite (and Party Every Day) | 0 | 0 | 10 | 10 |
Music to Help You Stop Smoking | Clown Squad | 0 | 10 | 10 | 20 |
Who is the Colleen Hoover of the Zulus? | Northeast by Northwestern | 0 | 0 | 10 | 10 |
BHSU Rebirth | Notre Dame | 0 | 10 | 10 | 20 |
That Feeling When Knee Surgery Is in Five Days | WashU | 0 | 0 | 0 | 0 |
Import Pandas | |madam| | 10 | 10 | 10 | 30 |
Defying Suavity | Grzegorz Brzęczyszczykiewicz | 0 | 0 | 0 | 0 |
Cambridge | Limp Franceskit | 0 | 0 | 10 | 10 |
Cien Años de Quizboledad | Simple Vibes | 0 | 10 | 0 | 10 |
Walston et. al. | 12 Litres of Green Tea | 0 | 0 | 10 | 10 |
Cope is the thing with feathers | NJ TRANSit (and anwen | 0 | 10 | 10 | 20 |
jeff mcneil #1 morningside heights fan club | just one more half-dot bro | 0 | 0 | 0 | 0 |
Stanford+ | Where are the ACF Nationals recordings? | 0 | 0 | 10 | 10 |
number of tang poems = 75 times number of lines in a shi = 100 times number of lines in a haiku | A is for Amy Robsart who fell down the stairs | 0 | 10 | 10 | 20 |
Cry of the Common Loon | Berkeley | 0 | 0 | 0 | 0 |
I wish it were possible to freeze time so I would never have to watch you retire | Aw we're so sorry to hear that maman died today, she gets five big booms | 0 | 0 | 10 | 10 |
CLEVELAND, THIS IS FOR YOU! | UBC | 0 | 0 | 10 | 10 |
throw away your cards, rally in the streets | Thompson et al. | 0 | 10 | 10 | 20 |