About EvalDog
AI features ship on top of models you don’t control. When a provider updates the model, your prompts can silently break — no error, just wrong output. EvalDog is the watchdog that catches it.
The problem
Non-deterministic LLM output doesn’t fit a normal assertion: the same prompt can come back worded differently on every run and still be correct. So teams either skip testing it or stand up a heavy eval/observability platform built for ML engineers. Most only find out something broke from a user complaint.
What we believe
Evaluating an AI feature should feel like writing a test you already know how to write: upload your cases, get a pass/fail score, gate it in CI. No ML degree, no token bill for running checks. The same engine as the free CLIs — but hosted, and watching for drift while you sleep.
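To make the "cases in, pass/fail score out, gate in CI" idea concrete, here is a minimal sketch of that flow. Everything in it is an assumption made for illustration: the EvalCase and EvalResult shapes, the casePasses and evaluate functions, the keyword-matching check, and the 0.9 threshold are all hypothetical, and none of it is EvalDog's actual API, case format, or scoring engine.

```typescript
// Hypothetical shapes for illustration only; not EvalDog's API or file format.
interface EvalCase {
  name: string;
  output: string;        // model output captured for this case
  mustInclude: string[]; // phrases a correct answer is expected to contain
}

interface EvalResult {
  passRate: number;   // fraction of cases that passed, 0..1
  passed: boolean;    // true when passRate meets the threshold
  failures: string[]; // names of failing cases
}

// A case passes when every required phrase appears (case-insensitive).
// This is a deterministic check, so scoring it calls no model.
function casePasses(c: EvalCase): boolean {
  const text = c.output.toLowerCase();
  return c.mustInclude.every((phrase) => text.includes(phrase.toLowerCase()));
}

// Score the whole suite against a threshold instead of asserting
// exact string equality on each individual output.
function evaluate(cases: EvalCase[], threshold: number): EvalResult {
  const failures = cases.filter((c) => !casePasses(c)).map((c) => c.name);
  const passRate =
    cases.length === 0 ? 0 : (cases.length - failures.length) / cases.length;
  return { passRate, passed: passRate >= threshold, failures };
}

const result = evaluate(
  [
    { name: "refund-policy", output: "Refunds are available within 30 days.", mustInclude: ["30 days"] },
    { name: "greeting", output: "Hello! How can I help?", mustInclude: ["hello"] },
    { name: "escalation", output: "I cannot help with that.", mustInclude: ["support team"] },
  ],
  0.9, // assumed threshold: at least 90% of cases must pass
);
```

In a pipeline, a wrapper script would turn result.passed into the process exit code, so a drop below the threshold fails the build the same way a failing unit test would.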
Who we are
EvalDog is built by The Testing Academy — a software-testing education brand that has taught QA and automation to a large global community. We’ve spent years helping testers level up; EvalDog is our take on what testing AI should look like.
The CLI is open source
The evaldog command-line tool is MIT-licensed and on npm. Use it free, forever, in any pipeline.