We (w/ @kashif) talked about training LLMs through interaction, using trajectories across games, browsers, or simulators.
Room was packed, a clear sign of interest in where RL post-training is heading.
Sharing the slides!
https://drive.google.com/file/d/16k7YRnf5EJEo0XjXGlRJ_hVeLoFWKyNP/view?usp=sharing