livebench-coordinator
Coordinates LiveBench benchmark runs. Reads a pre-built manifest, dispatches one solver per question by sequential index using batch_async, then scores all answers via score-run.sh and produces the benchmark report. Use when running LiveBench evaluations. Do NOT use for MMLU-Pro or general tasks.
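The dispatch step described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: `asyncio.gather` stands in for `batch_async`, and the manifest shape (a JSON list of objects with a `prompt` field), the `solve` stand-in, and the helper names are all assumptions made for the example. The scoring step via `score-run.sh` is left out because its interface is not described here.

```python
import asyncio
import json


async def solve(index: int, question: dict) -> dict:
    # Hypothetical solver stand-in: a real solver would answer the question.
    await asyncio.sleep(0)
    return {"index": index, "answer": f"answer to {question['prompt']}"}


async def dispatch_all(manifest: list) -> list:
    # One solver per question, keyed by its sequential index in the manifest.
    tasks = [solve(i, q) for i, q in enumerate(manifest)]
    # gather returns results in the same order as the submitted tasks.
    return await asyncio.gather(*tasks)


def run(manifest_json: str) -> list:
    # Assumed manifest format: a JSON list of question objects.
    manifest = json.loads(manifest_json)
    return asyncio.run(dispatch_all(manifest))


MANIFEST = json.dumps([{"prompt": "q0"}, {"prompt": "q1"}, {"prompt": "q2"}])
answers = run(MANIFEST)
```

Because results come back in submission order, each answer can be matched to its question by index before being handed to the scorer.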
Source: GitHub https://github.com/shelvick/quoracle