benchmark-runner

Category: Research & Analysis | Uploader: RConsortium | Downloads: 0 | Version: v1.0 (Latest)

Auto-discover all skills with evals in RConsortium/pharma-skills, benchmark each one with and without the skill in matched, isolated sessions, and post the scored results to the linked GitHub issue. Use this whenever someone says "run benchmarks", "compare skill performance", or "eval the skills", or wants to measure whether a skill improves output quality.
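
The discover-then-compare flow described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the directory layout (a `SKILL.md` file next to an `evals/` folder), the function names `discover_skills_with_evals` and `score_delta`, and the use of a plain mean difference as the comparison are all assumptions made for the example.

```python
from pathlib import Path
from statistics import mean


def discover_skills_with_evals(root: Path) -> list[str]:
    """Return names of skill directories that hold both SKILL.md and evals/.

    Assumption: this SKILL.md + evals/ layout is hypothetical and chosen
    only for the sketch; the real repository may be organized differently.
    """
    return sorted(
        d.name
        for d in root.iterdir()
        if d.is_dir() and (d / "SKILL.md").is_file() and (d / "evals").is_dir()
    )


def score_delta(with_skill: list[float], without_skill: list[float]) -> float:
    """Mean score with the skill minus mean score without it.

    Assumption: scores from the matched sessions are plain floats and a
    simple mean difference stands in for whatever scoring the skill uses.
    """
    return mean(with_skill) - mean(without_skill)
```

A positive `score_delta` would suggest the skill improved the scored output for that eval; a value near zero or below would suggest it did not.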

Source: GitHub https://github.com/RConsortium/pharma-skills

Directory Structure

Current level: tree/main/_automation/benchmark-runner/

  • 📁 runs/
    • 📄 .gitkeep 0 B
    • 📄 README.md 501 B
  • 📁 scripts/
    • 📄 generate_dashboard.py 2.3 KB
    • 📄 get_next_eval.py 17.4 KB
    • 📄 post_issue_comment.py 2.0 KB
    • 📄 record_run_result.py 1.8 KB
    • 📄 setup_r_env.sh 7.3 KB
  • 📄 CLAUDE_CODE_ROUTINE.md 537 B
  • 📄 LICENSE 1.0 KB
  • 📄 README.md 4.9 KB
  • 📄 SKILL.md 13.0 KB
