Metrics guide
Measure voice-agent behaviors in ways that point to specific fixes.
Start with a compact, stable scorecard and preserve the rationale and conversation evidence behind every result.
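One way to keep the rationale and conversation evidence attached to every result is to store them on the score record itself. The following is a minimal sketch, not a prescribed schema: the class names (`Evidence`, `MetricResult`, `Scorecard`), the field layout, and the sample conversation are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Evidence:
    """A pointer back to the conversation turn that supports a result."""
    turn_index: int  # position of the turn in the transcript
    speaker: str     # e.g. "agent" or "customer"
    quote: str       # verbatim excerpt from that turn


@dataclass
class MetricResult:
    """One scored metric, stored together with the reasoning behind it."""
    metric: str              # e.g. "task_completion"
    passed: Optional[bool]   # None means the metric could not be assessed
    rationale: str
    evidence: List[Evidence] = field(default_factory=list)


@dataclass
class Scorecard:
    conversation_id: str
    results: List[MetricResult] = field(default_factory=list)

    def failures(self) -> List[MetricResult]:
        """Return failed metrics; each still carries its rationale and evidence."""
        return [r for r in self.results if r.passed is False]


# Illustrative data only; the identifier and quotes are made up.
card = Scorecard(
    conversation_id="conv-001",
    results=[
        MetricResult(
            metric="task_completion",
            passed=False,
            rationale="Agent ended the call before confirming the new appointment time.",
            evidence=[Evidence(turn_index=7, speaker="agent", quote="Thanks, goodbye.")],
        ),
        MetricResult(
            metric="unsupported_claims",
            passed=True,
            rationale="No unsupported statements found.",
        ),
    ],
)

for result in card.failures():
    print(result.metric, "-", result.rationale)
```

Keeping `passed` as an optional value lets a scorecard distinguish "failed" from "not measurable", which helps keep the scorecard stable when some conversations lack the data a metric needs.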
Task completion
Did the conversation achieve the customer and business outcome? Define completion in observable terms for each workflow.
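Defining completion in observable terms can be as simple as listing named checks per workflow and evaluating each one against facts recorded from the conversation. The sketch below assumes a hypothetical `reschedule_appointment` workflow and hypothetical record keys (`booking_status`, `agent_confirmed_time`, `customer_acknowledged`); real workflows would substitute their own criteria.

```python
from typing import Callable, Dict

# Hypothetical conversation record: facts observed from tool calls and the transcript.
Conversation = Dict[str, object]

# Each workflow maps to named, observable checks (all names are illustrative).
COMPLETION_CRITERIA: Dict[str, Dict[str, Callable[[Conversation], bool]]] = {
    "reschedule_appointment": {
        "booking_tool_succeeded": lambda c: c.get("booking_status") == "confirmed",
        "new_time_read_back": lambda c: bool(c.get("agent_confirmed_time")),
        "customer_acknowledged": lambda c: bool(c.get("customer_acknowledged")),
    },
}


def check_completion(workflow: str, conversation: Conversation) -> Dict[str, bool]:
    """Evaluate every criterion for the workflow and return per-check results."""
    criteria = COMPLETION_CRITERIA[workflow]
    return {name: check(conversation) for name, check in criteria.items()}


conversation = {
    "booking_status": "confirmed",
    "agent_confirmed_time": True,
    "customer_acknowledged": False,
}
checks = check_completion("reschedule_appointment", conversation)
completed = all(checks.values())  # complete only if every check passes
print(checks, completed)
```

Returning the per-check results rather than a single boolean shows which criterion failed, so an incomplete task points to a specific fix.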
Resolution quality
Was the outcome correct, complete, and communicated with a useful next step?
Unsupported claims
Did the agent state facts or commitments that were not supported by available policy, tools, or conversation context?
Fallback behavior
When the agent could not proceed, did it recover safely, ask an effective clarifying question, or provide an appropriate escalation?
Operational quality
Track latency, interruptions, silence, repeated turns, and premature termination where provider data makes those signals available.
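Because these signals depend on what the provider exposes, a scorer can compute each one only when its inputs are present and report "unavailable" otherwise. The sketch below covers three of the signals (response latency, interruptions, repeated agent turns) under assumed inputs: the `Turn` fields, the `interrupted` flag, and the exact-match definition of a repeated turn are hypothetical choices, not any provider's format.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List, Optional


@dataclass
class Turn:
    speaker: str                       # "agent" or "customer"
    text: str
    start_s: Optional[float] = None    # None when the provider omits timestamps
    end_s: Optional[float] = None
    interrupted: Optional[bool] = None  # None when barge-in data is unavailable


def operational_signals(turns: List[Turn]) -> Dict[str, Optional[float]]:
    """Compute signals where data exists; report None instead of guessing."""
    latencies = []
    for prev, cur in zip(turns, turns[1:]):
        # Response latency: gap between a customer turn ending and the agent starting.
        if (prev.speaker == "customer" and cur.speaker == "agent"
                and prev.end_s is not None and cur.start_s is not None):
            # Overlapping speech would give a negative gap; clamp it to zero.
            latencies.append(max(0.0, cur.start_s - prev.end_s))

    flags = [t.interrupted for t in turns if t.interrupted is not None]

    # Repeated turn: an agent utterance identical to the agent's previous one.
    agent_texts = [t.text.strip().lower() for t in turns if t.speaker == "agent"]
    repeats = sum(1 for a, b in zip(agent_texts, agent_texts[1:]) if a == b)

    return {
        "median_latency_s": median(latencies) if latencies else None,
        "interruptions": sum(flags) if flags else None,
        "repeated_agent_turns": repeats,
    }


# Illustrative transcript with made-up timings.
turns = [
    Turn("customer", "I need to move my appointment.", 0.0, 2.0, False),
    Turn("agent", "Sure, what day works?", 2.8, 4.0, False),
    Turn("customer", "Friday.", 5.0, 5.5, False),
    Turn("agent", "Sure, what day works?", 6.5, 8.0, True),
]
signals = operational_signals(turns)
print(signals)
```

Silence and premature termination could follow the same pattern: derive them from timestamps or call-end events when the provider supplies them, and return `None` when it does not.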