livebench-coordinator
Coordinates LiveBench benchmark runs. Reads a pre-built manifest, dispatches one solver per question by sequential index using batch_async, then scores all answers via score-run.sh and produces the benchmark report. Use when running LiveBench evaluations. Do NOT use for MMLU-Pro or general tasks.
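The flow described above (read the manifest, dispatch one solver per question by sequential index, collect the answers for scoring) can be sketched roughly as follows. This is only an illustration under stated assumptions: the manifest is assumed to be a JSON list of question objects with an `id` field, `solve` is a hypothetical stand-in for a real solver, and Python's `asyncio.gather` is swapped in for `batch_async`, whose actual interface is not shown in the source. The scoring step is left out because the arguments of `score-run.sh` are not documented here.

```python
import asyncio
import json
from pathlib import Path


def load_manifest(path):
    """Read the pre-built manifest (assumed here: a JSON list of question objects)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


async def solve(index, question):
    """Hypothetical solver stub; a real solver would actually answer the question."""
    await asyncio.sleep(0)  # yield control, standing in for real asynchronous work
    return {"index": index, "answer": f"answer-for-{question['id']}"}


async def dispatch_all(questions):
    """Start one solver per question, keyed by its sequential index."""
    tasks = [solve(i, q) for i, q in enumerate(questions)]
    # asyncio.gather returns results in the same order the tasks were passed in
    return await asyncio.gather(*tasks)


def run_benchmark(manifest_path):
    """Load the manifest, run all solvers, and return the answers in index order."""
    questions = load_manifest(manifest_path)
    answers = asyncio.run(dispatch_all(questions))
    # Scoring is omitted: the source passes all answers to score-run.sh,
    # but that script's arguments are not documented here.
    return answers
```

Because the answers come back in manifest order, each one can be matched to its question by index alone before scoring.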
Changelog: Source: GitHub https://github.com/shelvick/quoracle