eval-driven-dev

Category: Data & AI | Uploader: yiouliyiouli | Downloads: 0 | Version: v1.0 (Latest)

Instrument Python LLM apps, build golden datasets, write eval-based tests, run them, and root-cause failures, covering the full eval-driven development cycle. Use this skill whenever a user is developing, testing, QA-ing, evaluating, or benchmarking a Python project that calls an LLM, even if they don't say "evals" explicitly. Typical uses include making sure an AI app works correctly, catching regressions after prompt changes, debugging why an agent started behaving differently, and validating output quality before shipping.
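To make the cycle concrete, here is a minimal sketch of an eval-based test run against a golden dataset. It does not use the pixie-qa API, which this listing does not document. Every name in it (`GoldenCase`, `fake_llm_app`, `keyword_score`, `run_evals`) is hypothetical, and the keyword-match scorer is only one simple choice among many possible evaluators. The app is a canned stand-in, so no real LLM is called.

```python
from dataclasses import dataclass


@dataclass
class GoldenCase:
    """One golden-dataset entry: an input plus keywords the answer should contain."""
    prompt: str
    expected_keywords: list[str]


def fake_llm_app(prompt: str) -> str:
    """Hypothetical stand-in for the app under test; returns canned answers."""
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
        "What is 2 + 2?": "2 + 2 equals 4.",
    }
    return canned.get(prompt, "I don't know.")


def keyword_score(output: str, expected_keywords: list[str]) -> float:
    """Fraction of expected keywords found in the output (case-insensitive)."""
    if not expected_keywords:
        return 1.0
    lowered = output.lower()
    hits = sum(1 for kw in expected_keywords if kw.lower() in lowered)
    return hits / len(expected_keywords)


def run_evals(app, dataset, threshold=1.0):
    """Run every golden case through the app and collect the failing ones."""
    failures = []
    for case in dataset:
        output = app(case.prompt)
        score = keyword_score(output, case.expected_keywords)
        if score < threshold:
            # Keep the prompt, raw output, and score to help root-cause the failure.
            failures.append((case.prompt, output, score))
    return failures


GOLDEN_DATASET = [
    GoldenCase("What is the capital of France?", ["Paris"]),
    GoldenCase("What is 2 + 2?", ["4"]),
]


def test_app_passes_golden_dataset():
    """Eval-based test: the suite fails if any golden case scores below threshold."""
    assert run_evals(fake_llm_app, GOLDEN_DATASET) == []
```

A test runner such as pytest would collect `test_app_passes_golden_dataset` automatically. In a real project, the stand-in would be replaced by the actual application entry point, and the recorded failures would be the starting point for root-cause analysis after a prompt change.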

Changelog: Source: GitHub https://github.com/yiouli/pixie-qa

Directory Structure

Current level: .claude/skills/eval-driven-dev/

SKILL.md
