Evals and Quality Overview
FastSkill quality work has two complementary tracks:
- Case-based evals with `fastskill eval` to test expected behavior against prompts and rubrics
- Portfolio analysis with `fastskill analyze` to inspect coverage, overlap, clusters, and duplicates across installed skills
Recommended quality flow
- Validate config and skill structure
- Run eval suites and inspect failures
- Score and report results for release decisions
- Analyze clusters and duplicates to improve skill portfolio quality
- Gate CI on thresholds
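
The flow above can be scripted as a short wrapper. The sketch below is one possible arrangement, not an official FastSkill integration. It uses only the commands and flags listed in the quality commands map. The function names, the injectable `runner` parameter, and the rule that a non-zero exit code means a failed step are assumptions made for illustration. `fastskill eval score` is left out because the map describes it as re-scoring a prior run, and the config and skill structure check is represented only by `fastskill eval validate`.

```python
import subprocess
from typing import Callable, List, Sequence

Command = List[str]


def quality_commands(agent: str, run_dir: str) -> List[Command]:
    """Return the FastSkill commands for one quality pass, in flow order."""
    return [
        ["fastskill", "eval", "validate"],
        ["fastskill", "eval", "run", "--agent", agent, "--output-dir", run_dir],
        ["fastskill", "eval", "report", "--run-dir", run_dir],
        ["fastskill", "analyze", "cluster"],
        ["fastskill", "analyze", "duplicates"],
    ]


def _default_runner(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_quality_flow(
    agent: str,
    run_dir: str,
    runner: Callable[[Sequence[str]], int] = _default_runner,
) -> bool:
    """Run each step in order and stop at the first failing one.

    Assumption: a non-zero exit code means the step failed.
    """
    for cmd in quality_commands(agent, run_dir):
        if runner(cmd) != 0:
            print("step failed:", " ".join(cmd))
            return False
    return True
```

In CI, a job could call `run_quality_flow(...)` and exit non-zero when it returns `False`. Passing a fake `runner` makes it possible to test the wrapper without FastSkill installed.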
Quality commands map
| Goal | Command |
|---|---|
| Validate eval definitions | `fastskill eval validate` |
| Execute suite | `fastskill eval run --agent <agent> --output-dir <dir>` |
| Summarize results | `fastskill eval report --run-dir <dir>` |
| Re-score prior run | `fastskill eval score --run-dir <dir>` |
| Inspect clusters | `fastskill analyze cluster` |
| Find near-duplicates | `fastskill analyze duplicates` |
| Similarity matrix | `fastskill analyze matrix` |
What to track over time
- Eval pass rate by suite and tag
- Failure hotspots by case id
- Duplicate pair count above your threshold
- Cluster balance (very sparse or huge clusters usually signal taxonomy issues)
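
The metrics above can be computed from exported results with a few small helpers. This is a minimal sketch under stated assumptions: the record shape `(suite, tags, case_id, passed)`, the similarity and cluster-size dictionaries, and every function name are hypothetical and do not reflect FastSkill's actual output format. Adapt the parsing to whatever your run directory and analyze output contain.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record shape: (suite, tags, case_id, passed)
CaseResult = Tuple[str, List[str], str, bool]


def pass_rates(results: Iterable[CaseResult]) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return pass rate by suite and pass rate by tag for one run."""
    suite_totals = defaultdict(lambda: [0, 0])  # [passed, total]
    tag_totals = defaultdict(lambda: [0, 0])
    for suite, tags, _case_id, passed in results:
        for bucket in [suite_totals[suite]] + [tag_totals[t] for t in tags]:
            bucket[0] += int(passed)
            bucket[1] += 1
    by_suite = {name: p / n for name, (p, n) in suite_totals.items()}
    by_tag = {name: p / n for name, (p, n) in tag_totals.items()}
    return by_suite, by_tag


def failure_hotspots(runs: Iterable[Iterable[CaseResult]], top: int = 5) -> List[Tuple[str, int]]:
    """Count failures per case id across runs, most frequent first."""
    failures = Counter(
        case_id for run in runs for _suite, _tags, case_id, passed in run if not passed
    )
    return failures.most_common(top)


def duplicate_pairs_above(similarities: Dict[Tuple[str, str], float], threshold: float) -> int:
    """Count skill pairs whose similarity is strictly above the threshold."""
    return sum(1 for score in similarities.values() if score > threshold)


def cluster_outliers(sizes: Dict[str, int], min_size: int, max_size: int) -> List[str]:
    """Return names of clusters that are smaller or larger than the allowed range."""
    return sorted(name for name, n in sizes.items() if n < min_size or n > max_size)
```

Recording these values per run and comparing them with fixed limits gives concrete numbers to gate CI on, for example a minimum pass rate per suite and a maximum duplicate pair count.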