✓ Verified 💻 Development

Agent Evaluation

Rating 0 (0 reviews)
Downloads 0
Version 1.0.0

Overview

Test and benchmark LLM agents, covering behavioral testing, capability assessment, and reliability metrics.
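
To make "reliability metrics" concrete, here is a minimal Python sketch of one such metric: the pass rate of a behavioral check over repeated runs. This listing does not document the skill's own interface, so everything named here (`pass_rate`, `make_stub_agent`, the canned responses) is a hypothetical illustration and not part of agent-evaluation itself.

```python
from itertools import cycle
from typing import Callable, Iterable


def pass_rate(
    agent: Callable[[str], str],
    prompt: str,
    check: Callable[[str], bool],
    runs: int = 5,
) -> float:
    """Return the fraction of runs whose output satisfies the behavioral check."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    passed = sum(1 for _ in range(runs) if check(agent(prompt)))
    return passed / runs


def make_stub_agent(responses: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a real agent: replays canned responses in order."""
    replay = cycle(responses)
    return lambda prompt: next(replay)


# Assumption: a real agent call would replace this stub, which answers wrongly on every fourth-cycle third turn.
stub = make_stub_agent(["4", "4", "five", "4"])
rate = pass_rate(stub, "What is 2 + 2?", lambda out: out.strip() == "4", runs=4)
print(f"pass rate: {rate:.2f}")  # prints "pass rate: 0.75"
```

Running the same prompt several times matters because agent output is often non-deterministic; a single passing run says little about how reliably the behavior holds.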

Installation

openclaw install agent-evaluation
    

Tags

#devops_and-cloud #testing

Quick Info

Category Development
Model Claude 3.5
Complexity Multi-Agent
Author rustyorb
Last Updated 3/10/2026
