Benchmark LLMs

Go beyond technical benchmarking: test and compare large language models in a game-based challenge.

Get Started