get_test_runs
Retrieves test run results for an AI Agent. Use this to review pass/fail status, per-criterion evaluation outcomes, and the rationale behind each outcome after triggering a test_run via propose_change.
Example prompts
- “Show me the latest test run results for each of my test cases.”
- “Which test cases failed in the last run, and why?”
- “Get the full details for test run abc123.”
Parameters
Response
Returns per-run results including:
- Test run ID and test case ID.
- Overall pass/fail status.
- Per-evaluation-criterion outcomes and rationale.
- The simulated conversation transcript the evaluation was based on.
- Timestamps and the user who triggered the run.
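
As a rough illustration of how the fields listed above might be used to answer a prompt such as "Which test cases failed in the last run, and why?", the sketch below filters failed runs and collects the rationale for each failed criterion. The response schema is not specified on this page, so every field name here (`test_run_id`, `test_case_id`, `status`, `criteria`, `passed`, `rationale`), the sample data, and the helper `summarize_failures` are hypothetical and should be adjusted to the actual payload.

```python
# Hypothetical response shape: all field names and values below are assumptions
# made for illustration, not the documented schema of get_test_runs.
sample_runs = [
    {
        "test_run_id": "abc123",
        "test_case_id": "case-1",
        "status": "failed",
        "criteria": [
            {"name": "greets_user", "passed": True,
             "rationale": "Agent opened with a greeting."},
            {"name": "offers_refund", "passed": False,
             "rationale": "Agent never mentioned a refund."},
        ],
    },
    {
        "test_run_id": "def456",
        "test_case_id": "case-2",
        "status": "passed",
        "criteria": [
            {"name": "greets_user", "passed": True,
             "rationale": "Agent opened with a greeting."},
        ],
    },
]


def summarize_failures(runs):
    """Return one summary line per failed criterion in each failed run."""
    lines = []
    for run in runs:
        # Skip runs whose overall status is anything other than "failed".
        if run["status"] != "failed":
            continue
        for criterion in run["criteria"]:
            if not criterion["passed"]:
                lines.append(
                    f'{run["test_case_id"]} ({run["test_run_id"]}): '
                    f'{criterion["name"]}: {criterion["rationale"]}'
                )
    return lines


failure_lines = summarize_failures(sample_runs)
for line in failure_lines:
    print(line)
```

Keeping the run ID next to each failure makes it straightforward to follow up with a request for the full details of that specific run.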