Eval Setup

This page covers the minimum setup for repeatable eval execution.

1) Ensure project prerequisites

  • skill-project.toml exists at project root
  • Skills are installed (fastskill install)
  • Chosen eval agent is available on PATH
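These checks are easy to script. A minimal preflight sketch (the agent name codex is taken from the examples on this page and is an assumption for your setup):

```shell
# Preflight sketch: check the prerequisites above.
# "codex" is an example agent name from this page; substitute your own.
missing=""
[ -f "skill-project.toml" ] || missing="$missing skill-project.toml"
command -v codex >/dev/null 2>&1 || missing="$missing codex(PATH)"
if [ -n "$missing" ]; then
  echo "missing:$missing"
else
  echo "prerequisites ok"
fi
```

Running this before any eval command surfaces missing prerequisites in one pass instead of failing mid-run.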

2) Validate eval configuration

Run validation before any suite execution:
fastskill eval validate
fastskill eval validate --agent codex
Pass --agent to also confirm that the chosen agent runtime is available in the current environment.
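In a setup script, validation can be guarded so it degrades cleanly when the CLI is not installed yet. A sketch, using only the commands shown above:

```shell
# Guarded validation sketch: run `fastskill eval validate` only when the
# CLI is present, and record the result instead of aborting mid-script.
if command -v fastskill >/dev/null 2>&1; then
  fastskill eval validate --agent codex   # flags as shown above
  status=$?
else
  echo "fastskill not on PATH; complete the install step first"
  status=127
fi
echo "validate exit status: $status"
```

A nonzero status here should stop the rest of the setup rather than let a broken configuration reach eval run.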

3) Define a stable output layout

Use a predictable output root (for local and CI):
mkdir -p .fastskill/eval-runs
Then always pass --output-dir to fastskill eval run so every run writes under the same root.
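One common layout (an assumption on our part, not a fastskill convention) is a UTC-timestamped subdirectory per run under that stable root:

```shell
# Create one timestamped run directory per invocation under the stable root.
# The root path comes from this page; the timestamp format is our choice.
RUN_ROOT=".fastskill/eval-runs"
RUN_DIR="$RUN_ROOT/$(date -u +%Y%m%dT%H%M%SZ)"
mkdir -p "$RUN_DIR"
echo "$RUN_DIR"
# Then: fastskill eval run --agent codex --output-dir "$RUN_DIR"
```

Timestamped subdirectories keep runs from overwriting each other while the root stays predictable for CI artifact collection.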

4) Normalize run filters

Use tags and case ids consistently so teams can run:
  • fast feedback: smoke tags
  • release checks: full suite
  • focused debugging: single case id
Example:
fastskill eval run --agent codex --output-dir ./.fastskill/eval-runs --tag smoke
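The shared filter conventions can be wrapped as small helpers so everyone invokes them the same way. A sketch; only flags shown on this page are used, and the single-case flag is left out because its name is not documented here (check fastskill eval run --help):

```shell
# Helper sketch for the shared filter conventions. Only --agent, --tag,
# and --output-dir appear on this page; single-case selection is omitted
# because its flag name is not documented here.
OUT="./.fastskill/eval-runs"

eval_smoke() { fastskill eval run --agent codex --output-dir "$OUT" --tag smoke; }
eval_full()  { fastskill eval run --agent codex --output-dir "$OUT"; }
```

Sourcing these from one shared file keeps local runs and CI jobs on identical invocations.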

5) Define fail policy

Default behavior should fail on quality regression. Use --no-fail only for exploratory runs.
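In CI this typically means letting the run's exit status fail the job. A sketch, assuming the codex agent and the regression tag named elsewhere on this page:

```shell
# CI gating sketch: a quality regression makes `fastskill eval run` exit
# nonzero, which fails the job. Deliberately no --no-fail here; that flag
# is reserved for exploratory local runs.
ci_eval_gate() {
  fastskill eval run \
    --agent codex \
    --output-dir ./.fastskill/eval-runs \
    --tag regression
}
```

Call ci_eval_gate as its own CI step so a regression fails the pipeline at a clearly named stage.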

Setup checklist

  • fastskill eval validate passes
  • Agent binary is available
  • Output directory is captured by the CI artifacts retention policy
  • Team agreed on tag naming (smoke, regression, release)

See also