Evals and Quality Overview

FastSkill quality work has two complementary tracks:
  • Case-based evals with fastskill eval to test expected behavior against prompts and rubrics
  • Portfolio analysis with fastskill analyze to inspect coverage, overlap, clusters, and duplicates across installed skills
Use both: evals tell you whether individual skills behave correctly, while analysis tells you whether the skill set as a whole is healthy. A typical quality workflow:
  1. Validate config and skill structure
  2. Run eval suites and inspect failures
  3. Score and report results for release decisions
  4. Analyze clusters and duplicates to improve skill portfolio quality
  5. Gate CI on thresholds
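The workflow above maps onto the CLI roughly as follows. The agent name and run directory are placeholders, not defaults; substitute your own values.

```shell
# 1. Validate config and skill structure (eval definitions included)
fastskill eval validate

# 2. Run the eval suite against your agent; results land in the output dir
fastskill eval run --agent my-agent --output-dir runs/latest

# 3. Summarize results, or re-score a prior run after rubric changes
fastskill eval report --run-dir runs/latest
fastskill eval score --run-dir runs/latest

# 4. Inspect portfolio health across installed skills
fastskill analyze cluster
fastskill analyze duplicates
fastskill analyze matrix

# 5. Gate CI on thresholds: wire the report output into your pipeline
```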

Quality commands map

Goal                       Command
Validate eval definitions  fastskill eval validate
Execute suite              fastskill eval run --agent <agent> --output-dir <dir>
Summarize results          fastskill eval report --run-dir <dir>
Re-score prior run         fastskill eval score --run-dir <dir>
Inspect clusters           fastskill analyze cluster
Find near-duplicates       fastskill analyze duplicates
Similarity matrix          fastskill analyze matrix
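Gating CI on thresholds (step 5 above) can be sketched as a small script that reads a run's results and fails the build when the pass rate drops below a target. The summary shape here is an illustrative assumption, not a documented fastskill output format; adapt the parsing to whatever `fastskill eval report` actually emits.

```python
# Hypothetical summary shape -- fastskill's real report format may differ.
# In practice you would load this from the run directory's report output.
SAMPLE_SUMMARY = {
    "suite": "core-skills",
    "cases": [
        {"id": "greeting-001", "passed": True},
        {"id": "greeting-002", "passed": True},
        {"id": "search-001", "passed": False},
    ],
}

def pass_rate(summary):
    """Fraction of eval cases that passed."""
    cases = summary["cases"]
    return sum(c["passed"] for c in cases) / len(cases)

def gate(summary, threshold=0.9):
    """Return 0 if the pass rate meets the threshold, else 1 (CI failure)."""
    rate = pass_rate(summary)
    print(f"{summary['suite']}: pass rate {rate:.1%} (threshold {threshold:.0%})")
    return 0 if rate >= threshold else 1

exit_code = gate(SAMPLE_SUMMARY)
print("gate exit code:", exit_code)
```

In a pipeline you would pass the returned code to `sys.exit()` so a sub-threshold run fails the job.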

What to track over time

  • Eval pass rate by suite and tag
  • Failure hotspots by case id
  • Duplicate pair count above your threshold
  • Cluster balance (very sparse or huge clusters usually signal taxonomy issues)
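A minimal sketch of tracking the first two metrics across runs: pass rate grouped by suite (or any other field, such as run date or tag) and failure hotspots ranked by case id. The per-case record shape is an assumption for illustration; map it onto whatever fields your eval reports contain.

```python
from collections import Counter

# Illustrative per-case records across two runs -- not a documented format.
RECORDS = [
    {"run": "2024-06-01", "suite": "core",   "case_id": "c1", "passed": True},
    {"run": "2024-06-01", "suite": "core",   "case_id": "c2", "passed": False},
    {"run": "2024-06-08", "suite": "core",   "case_id": "c2", "passed": False},
    {"run": "2024-06-08", "suite": "search", "case_id": "s1", "passed": True},
]

def pass_rate_by(records, key):
    """Pass rate grouped by a record field such as 'suite' or 'run'."""
    totals, passes = Counter(), Counter()
    for r in records:
        totals[r[key]] += 1
        passes[r[key]] += r["passed"]
    return {k: passes[k] / totals[k] for k in totals}

def failure_hotspots(records):
    """Case ids ranked by how often they failed across runs."""
    return Counter(r["case_id"] for r in records if not r["passed"])

print(pass_rate_by(RECORDS, "suite"))           # core is low, search is clean
print(failure_hotspots(RECORDS).most_common())  # c2 fails repeatedly
```

A case that fails in every run (like `c2` here) is a hotspot worth fixing before tightening any CI threshold.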

See also