eval-writer

Category: Research & Analysis | Uploader: langchain-ai | Downloads: 0 | Version: v1.0 (Latest)

Create new eval suites for the deepagentsjs monorepo. Handles dataset design, test case scaffolding, scoring logic, vitest configuration, and LangSmith integration. Use when the user asks to: (1) create an eval, (2) write an evaluation, (3) add a benchmark, (4) build an eval suite, (5) evaluate agent behaviour, (6) add test cases for a capability, or (7) implement an existing benchmark (e.g. oolong, AgentBench, SWE-bench). Trigger on phrases like 'create eval', 'new eval', 'add eval', 'benchmark', 'evaluate', 'eval suite', 'write evals for'.
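As a rough illustration of the "dataset design" and "scoring logic" pieces mentioned above, the sketch below scores precomputed agent outputs against a tiny dataset. Everything in it is an assumption made for illustration: the names EvalCase, EvalResult, scoreExactMatch, and scoreSuite are hypothetical and do not come from the deepagentsjs repository, and a real suite would likely wrap this in vitest tests and report to LangSmith.

```typescript
// Hypothetical shapes for illustration only; not taken from deepagentsjs.
interface EvalCase {
  id: string;
  input: string;
  expected: string;
}

interface EvalResult {
  id: string;
  passed: boolean;
}

// Exact-match scorer that ignores surrounding whitespace and letter case.
function scoreExactMatch(output: string, expected: string): boolean {
  return output.trim().toLowerCase() === expected.trim().toLowerCase();
}

// Scores outputs that were already produced by the agent, keyed by case id.
// A case with no recorded output counts as a failure.
function scoreSuite(
  cases: EvalCase[],
  outputs: Record<string, string>,
): { results: EvalResult[]; passRate: number } {
  const results: EvalResult[] = cases.map((c) => {
    const output = outputs[c.id];
    const passed = output !== undefined && scoreExactMatch(output, c.expected);
    return { id: c.id, passed };
  });
  const passedCount = results.filter((r) => r.passed).length;
  // An empty dataset yields a pass rate of 0 rather than NaN.
  const passRate = cases.length === 0 ? 0 : passedCount / cases.length;
  return { results, passRate };
}
```

Keeping generation and scoring separate, as assumed here, lets the same scorer run against cached outputs without calling the agent again.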

Source: GitHub, https://github.com/langchain-ai/deepagentsjs

Directory Structure

Current level: tree/main/.agents/skills/eval-creator/

SKILL.md
