GenAI chat arena: Compare model responses
We start with a multimodal chat project. During setup, we selected two frontier models to compare.

Pre-loaded models include Gemini 2.5 Pro, Claude 3.7, Grok 3, and OpenAI o3, among others, and you can also load a custom model. The tool can compare up to 10 models at once. Below, we can examine and edit the ontology, where we customize the ranking, classification, and other annotation fields we want captured on the responses.
