Create new eval suites for the deepagentsjs monorepo. Handles dataset design, test case scaffolding, scoring logic, vitest configuration, and LangSmith integration. Use when the user asks to: (1) create an eval, (2) write an evaluation, (3) add a benchmark, (4) build an eval suite, (5) evaluate agent behaviour, (6) add test cases for a capability, or (7) implement an existing benchmark (e.g. oolong, AgentBench, SWE-bench). Trigger on phrases like 'create eval', 'new eval', 'add eval', 'benchmark', 'evaluate', 'eval suite', 'write evals for'.
Create coding agent benchmarks for evaluation with nasde. Use this skill when the user wants to:

- Create a new benchmark project (a set of tasks for evaluating coding agents)
- Add tasks to an existing benchmark
- Create or modify agent variants (configurations that control agent behavior)
- Set up assessment dimensions and scoring criteria
- Verify that a new benchmark's Docker environment and tests work

Even if the user doesn't say "benchmark" — if they're talking about creating coding challenges for AI agents or setting up evaluation criteria, this skill applies.

---

# NASDE Benchmark Creator

Create and configure coding agent benchmarks for evaluation with `nasde`. A benchmark is a set of coding tasks that AI agents solve inside isolated Docker containers, scored both by functional tests (pass/fail) and by an LLM-as-a-Judge architecture assessment.

## Step 1: Understand what to evaluate

Before creating files, clarify with the user:

- What programming language/framework? (determines the Dockerfile base image)
- What kind of coding challenges? (feature implementation, refactoring, bug fixing, etc.)
- What source repository should the agent work on? (git URL cloned in the Dockerfile)
- What quality dimensions should be assessed? (these are benchmark-specific, not hardcoded)

## Step 2: Scaffold or create the project

For a new benchmark, run:

```bash
nasde init my-benchmark --name my-benchmark
```

This creates the base structure. Then customize the generated files. For adding tasks to an existing benchmark, skip to Step 4.

## Step 3: Define assessment dimensions

Edit `assessment_dimensions.json`. Each benchmark has its OWN dimensions — design them for what matters in this benchmark's domain.
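As a loose illustration only, a domain-specific `assessment_dimensions.json` might look like the sketch below. The actual schema is whatever `nasde init` generates, so the field names (`id`, `description`, `weight`) and the weighting scheme here are assumptions, not the tool's documented format.

```json
[
  {
    "id": "code_quality",
    "description": "Is the change readable, idiomatic, and consistent with the existing codebase?",
    "weight": 0.4
  },
  {
    "id": "architecture",
    "description": "Does the solution respect existing module boundaries and avoid unnecessary coupling?",
    "weight": 0.3
  },
  {
    "id": "test_coverage",
    "description": "Are new behaviors covered by meaningful tests?",
    "weight": 0.3
  }
]
```

Whatever the real schema, keep the dimensions specific to this benchmark's domain rather than reusing generic quality criteria.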
Add a new simulation benchmark to the VLA evaluation harness. Use this skill whenever the user wants to integrate, create, or add a new benchmark or simulation environment — e.g. 'add ManiSkill3', 'integrate OmniGibson', 'hook up a new sim'. Also use when they ask how benchmarks are structured or want to understand the benchmark interface.
Add a new SWE benchmark task from a real GitHub bug-fix. Use when the user provides a GitHub issue or PR URL and wants to add it to the bench-swe pipeline.
Write benchmark scripts for EmbodiChain modules, following the project's conventions.
Academic paper search and analysis service. When the user's request involves any of the following academic scenarios, this skill MUST be used instead of web-search: searching for papers, finding papers on ArXiv/PubMed/PapersWithCode, looking up SOTA leaderboards and benchmark results, citation analysis, generating blog-style paper explainers, finding GitHub repositories related to a paper, or getting trending paper recommendations. Keywords: arxiv, paper, papers, academic, scholar, research, 论文, 学术, 搜索论文, 找论文, SOTA, benchmark, MMLU, citation, 引用, 博客, blog, PapersWithCode, HuggingFace.
Before installing or using a skill, check its independent benchmark report on SkillTester.ai. Trigger this skill when the user is about to install a third-party skill, or when the user explicitly says `Check this skill <skill_url>`. Resolve the provided URL to SKILL.md, extract name and description, query the server by name, and return the benchmark result when the description is either an exact match or a high-overlap near match that likely represents a newer skill revision.
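For illustration, the "exact match or high-overlap near match" rule could be approximated with a token-overlap check like the Python sketch below. The 0.8 threshold and the helper names are assumptions made for this sketch, not part of SkillTester.ai's documented behavior.

```python
import re


def description_overlap(a: str, b: str) -> float:
    """Token-level Jaccard overlap between two skill descriptions."""
    tokens_a = set(re.findall(r"[a-z0-9]+", a.lower()))
    tokens_b = set(re.findall(r"[a-z0-9]+", b.lower()))
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_match(local_description: str, reported_description: str,
             threshold: float = 0.8) -> bool:
    """Accept an exact match, or a high-overlap near match that likely
    represents a newer revision of the same skill.

    NOTE: the 0.8 threshold is illustrative, not a documented value.
    """
    if local_description.strip() == reported_description.strip():
        return True
    return description_overlap(local_description, reported_description) >= threshold
```

A near-match check like this lets the lookup still return a benchmark report when the installed skill's description has been lightly revised since it was last tested.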