Agent metrics reference

This reference library contains every metric named in this series. Each metric includes a plain-English definition, common use cases, and where to find it in the Copilot Studio ecosystem.

How to read this library

Each entry has four parts:

Metric name: Use one canonical phrase or standard name throughout your value reviews. One name avoids the confusion that comes from using multiple labels for the same metric.
Definition: Define the metric for readers who aren't familiar with it.
Common use cases: Describe where in your portfolio this metric usually changes.
Source: List the Microsoft surfaces that emit the metric. A metric can appear in multiple surfaces, and the library lists each one so you know your options.

Section	What it covers
Engagement and adoption metrics	Whether people are using your agent
Outcome metrics	Whether users are able to achieve their goals with the agent
Quality and groundedness metrics	Whether the agent's answers are correct, on-brand, and safe
Voice-of-customer and qualitative signals	How users experience your agent beyond the numbers
Productivity and process metrics	Whether the agent is changing the shape of work
Productive-hour value metrics	How to convert time saved into Agent Assisted Hours and dollar value
Business-outcome metrics	Whether the agent moved the numbers for your business plans

How to use this library in practice

Treat this library as your shared vocabulary. For each new use case, pick a small set of metrics that matter most: one or two per pillar. Use the canonical labels in this library, and tell your team where each metric comes from. When the same metric appears in two agents, the shared definition ensures that it means the same thing in both. Engagement rate, for example, is measured the same way for a customer service agent and an HR agent. This consistency turns 16 separate measurement programs into one coherent value story.

Engagement and adoption metrics

These metrics tell you whether people are actually using your agent. Without them, every other metric is theoretical.

Metric	Definition	Common use cases	Where to find it
Total sessions	The count of analytics sessions in a period. One user conversation can produce multiple analytics sessions when the user asks new questions after the End of Conversation topic.	Every conversational agent.	Copilot Studio Analytics
Engagement rate	The share of analytics sessions that triggered a custom topic, Escalate, Fallback, or Conversational Boosting, which signals that the user engaged in a real conversation instead of leaving after the initial greeting.	Customer service, HR, IT helpdesk, knowledge.	Copilot Studio Analytics, Measure agent engagement
Routine adoption rate	The share of eligible users who used the agent on four or more active days in any rolling four-week window, the threshold Microsoft uses to identify users who adopted the agent into their habits.	Sales assist, executive productivity, knowledge, HR, project management.	Copilot Dashboard in Viva Insights
Per-user active days	The count of distinct days a user interacted with the agent over the period, which helps distinguish repeat users from one-time users.	Knowledge, sales assist, executive productivity.	Copilot Dashboard in Viva Insights
Seller active days on the agent	Routine adoption rate restricted to your seller population, used to understand whether sales teams are pulling the agent into their daily rhythm.	Sales and account management.	Copilot business impact report
Trigger use (autonomous)	The count and outcome of each event trigger that started an autonomous agent run, which tells you which triggers drive your business and which are noise.	Finance and invoice processing, procurement intake, cybersecurity, supply chain.	Analyze autonomous agents
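
The routine adoption threshold in the table (four or more active days in any rolling four-week window) can be sketched in a few lines. This is a minimal illustration, not the calculation the Copilot Dashboard runs: the function names, the 28-day reading of "four weeks," and the activity-log shape are all assumptions made for the example.

```python
from datetime import date


def is_routine_user(active_days, min_days=4, window_days=28):
    """Return True if some rolling window of `window_days` days holds
    at least `min_days` distinct active days (assumed threshold logic)."""
    days = sorted(set(active_days))
    # With distinct days sorted, a qualifying window must contain
    # `min_days` consecutive entries of the list, so compare each day
    # with the one `min_days - 1` positions later.
    for i in range(len(days) - min_days + 1):
        if (days[i + min_days - 1] - days[i]).days < window_days:
            return True
    return False


def routine_adoption_rate(activity_by_user, eligible_users):
    """Share of eligible users who meet the routine-use threshold."""
    if not eligible_users:
        return 0.0
    adopted = sum(
        1 for user in eligible_users
        if is_routine_user(activity_by_user.get(user, []))
    )
    return adopted / len(eligible_users)


# Hypothetical activity log: user ID -> dates with at least one interaction.
activity = {
    "ana": [date(2025, 1, 6), date(2025, 1, 13), date(2025, 1, 20), date(2025, 1, 27)],
    "ben": [date(2025, 1, 6), date(2025, 1, 7), date(2025, 2, 24), date(2025, 3, 31)],
}
eligible = ["ana", "ben", "cam"]  # "cam" is eligible but never used the agent

print(f"Routine adoption rate: {routine_adoption_rate(activity, eligible):.0%}")
```

In this made-up data, only one of the three eligible users has four active days inside a single 28-day window. The denominator is the eligible population, not only the users who appear in the log.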

Outcome metrics

These metrics tell you whether users are able to achieve their goals with the agent.

Metric	Definition	Common use cases	Where to find it
Resolution rate	The share of engaged sessions that ended with a resolved outcome, either confirmed by the user (the user said yes to the End of Conversation question) or implied by the flow (the user didn't respond and the agent's logic determined success).	Customer service, HR, IT helpdesk, knowledge.	Copilot Studio Analytics, Measure agent outcomes
Escalation rate	The share of engaged sessions that handed off through the Escalate topic or a Transfer conversation node, which indicates when your agent is ready to involve a human.	Customer service, HR, IT helpdesk, legal.	Measure agent outcomes
Abandon rate	The share of engaged sessions that ended after 60 minutes of inactivity without resolution or escalation, which is your fail-quietly signal.	Customer service, knowledge, web Q&A.	Deflection overview
Deflection rate	The share of incoming requests that were resolved through self-service rather than escalation to a human, which is the key metric for any tier-1 support agent.	Customer service, HR, IT helpdesk, procurement.	Deflection overview
First-contact resolution (FCR)	The share of cases that were resolved on the first interaction without needing a return contact within seven days, which is the clearest indicator that the agent solved the problem.	Customer service, IT helpdesk, field service.	Analyze conversational agents
Run outcomes (autonomous)	The end state of each autonomous agent run, broken into successful and unsuccessful, with average duration. This metric is the autonomous-agent equivalent of resolution rate.	Finance and invoice processing, procurement, cybersecurity, supply chain.	Analyze autonomous agents
Tool use (autonomous)	The count and success rate of each tool the agent invoked during a run, which tells you whether your tools are effective.	Autonomous agents in any function.	Analyze autonomous agents
Knowledge source use	The count of references the agent made to each knowledge source during a session or run, which tells you which content carries your agent and which content sits unused.	Knowledge, customer service, IT helpdesk, legal.	Copilot Studio Analytics
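
Resolution, escalation, and abandon rates in the table above share one denominator: engaged sessions. The sketch below illustrates that arithmetic. The session records, the `engaged` and `outcome` field names, and the outcome labels are hypothetical and don't reflect the Copilot Studio export schema.

```python
from collections import Counter

# Assumed outcome labels for this example only.
OUTCOMES = ("resolved", "escalated", "abandoned")


def outcome_rates(sessions):
    """Return each outcome's share of engaged sessions.

    Unengaged sessions are excluded from the denominator, matching the
    "share of engaged sessions" wording in the definitions.
    """
    engaged = [s for s in sessions if s["engaged"]]
    if not engaged:
        return {name: 0.0 for name in OUTCOMES}
    counts = Counter(s["outcome"] for s in engaged)
    return {name: counts[name] / len(engaged) for name in OUTCOMES}


# Hypothetical period: 8 engaged sessions and 2 unengaged sessions.
sessions = (
    [{"engaged": True, "outcome": "resolved"}] * 4
    + [{"engaged": True, "outcome": "escalated"}] * 2
    + [{"engaged": True, "outcome": "abandoned"}] * 2
    + [{"engaged": False, "outcome": None}] * 2
)

for name, rate in outcome_rates(sessions).items():
    print(f"{name}: {rate:.0%}")
```

Because every engaged session in this example ends in exactly one of the three outcomes, the three rates sum to 100 percent. That makes a quick sanity check when you reconcile numbers across reports.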

Quality and groundedness metrics

These metrics tell you whether the work the agent does is correct, on-brand, and safe.

Metric	Definition	Common use cases	Where to find it
Generated answer rate	The share of user questions that received a generative answer, as opposed to falling through to a fallback or unanswered state.	Knowledge, customer service, HR, IT helpdesk.	Analyze conversational agents
Generated answer quality	An evaluation score for the answers the agent produced, against either reference answers or rubric criteria.	Knowledge, legal, marketing.	Analyze conversational agents, Copilot Studio Kit rubrics
Groundedness	An evaluation score for whether the agent's answer is supported by the knowledge it cited, scored by an AI judge against a configurable rubric.	Knowledge, legal, financial advisory, regulated functions.	Copilot Studio Kit rubrics
Instruction-following score	An evaluation score for whether the agent followed the system and user instructions in its response.	Marketing, legal, communications, any function with brand or compliance constraints.	Copilot Studio Kit rubrics
Topic match score	Whether the agent triggered the topic you expected it to trigger for a given test utterance, which is your regression signal for intent recognition.	Customer service, HR, IT helpdesk, any agent with structured topics.	Copilot Studio Kit test capabilities
Citation accuracy	The share of agent answers whose claims are supported by the citations the agent attached, measured by sampled review.	Knowledge, legal, financial advisory.	Conversation Analyzer
AI-generated insights (preview)	Daily AI-generated recommendations on unanswered generative AI questions, with specific coverage fixes you can apply.	Knowledge, customer service, HR.	Analyze conversational agents

Voice-of-customer and qualitative signals

These signals capture how users experience your agent. They tell the part of the value story that numbers alone can't show.

Signal	Definition	Common use cases	Where to find it
CSAT	The average customer satisfaction score from the End of Conversation survey, on a 1-to-5 scale, where 1-2 maps to dissatisfied, 3 is neutral, and 4-5 is satisfied.	Customer service, HR self-service, IT helpdesk.	Analyze conversational agents
Reactions (thumbs up / thumbs down with comments)	Per-message user feedback retained for 28 days, optionally with free-text comments that explain why a user reacted positively or negatively.	Every conversational agent.	Analyze conversational agents
Sentiment (preview)	An AI-scored percentage of sessions in the period that carried negative user sentiment, which helps you identify friction even in resolved sessions.	Customer service, HR, knowledge, communications.	Analyze conversational agents
Themes (preview)	AI-grouped clusters of user questions that received generative answers, so you can see what users are asking in their own words and create test sets from any theme in one step.	Knowledge, customer service, HR.	Analyze user questions by theme
Customer-experience narrative	Recurring themes from agent transcripts that go beyond CSAT (tone, clarity, friction, trust), surfaced through custom prompts run against transcripts.	Customer service, HR, communications.	Conversation Analyzer
User confidence	Whether users trust the agent's answers enough to act on them, evidenced by reaction comments and Conversation Analyzer themes around confidence and trust.	Knowledge, legal, financial advisory, decision-support agents.	Analyze agent effectiveness plus Conversation Analyzer
Manager stories of reclaimed capacity	Concrete narratives from frontline managers about how their teams are reinvesting the time the agent gives them back, captured in structured quarterly interviews.	Every agent that returns measurable productive hours.	Quarterly value-review interviews, captured in the agent's value record.
Employee sentiment on AI	Standardized sentiment items (productivity, speed, effort, quality) from a Glint or Pulse survey, plus open-ended comments, that show how Copilot is changing the work experience.	Every Copilot rollout, organization-wide.	Viva Glint Copilot Impact Survey, reflected in the Copilot Dashboard
Talent attraction and retention signal	Whether AI fluency and AI-augmented work are showing up in exit interviews, recruiting conversations, and campus-hiring panels.	Strategic-pillar reporting in any function.	HR listening channels, recruiting feedback, exit-interview transcripts.
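
The CSAT entry maps survey scores to three bands (1-2 dissatisfied, 3 neutral, 4-5 satisfied) and reports an average. The following sketch shows one way to compute both from raw scores. The function name and the input list are illustrative assumptions, not part of any Copilot Studio API.

```python
def csat_summary(scores):
    """Summarize 1-to-5 survey scores as an average plus band shares."""
    if not scores:
        raise ValueError("at least one survey score is required")
    if any(score not in (1, 2, 3, 4, 5) for score in scores):
        raise ValueError("scores must be whole numbers from 1 to 5")
    total = len(scores)
    return {
        "average": sum(scores) / total,
        "dissatisfied": sum(1 for s in scores if s <= 2) / total,  # scores 1-2
        "neutral": sum(1 for s in scores if s == 3) / total,       # score 3
        "satisfied": sum(1 for s in scores if s >= 4) / total,     # scores 4-5
    }


# Hypothetical End of Conversation survey responses.
summary = csat_summary([5, 4, 3, 2, 5])
print(f"Average CSAT: {summary['average']:.1f}")
print(f"Satisfied share: {summary['satisfied']:.0%}")
```

Reporting the band shares next to the average helps when a middling average hides a split between very satisfied and very dissatisfied users.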

Productivity and process metrics

These metrics show whether the agent changes how work gets done, not just whether it generates telemetry.

Metric	Definition	Common use cases	Where to find it
Cycle time	The elapsed time from the start of a process to its completion, typically reported as median, P90, and P99 percentiles because tail behavior often matters more than the median.	Finance and invoice processing, procurement, legal, customer service.	Copilot Studio Analytics plus your own ERP or workflow data.
Average handle time (AHT)	The median elapsed time from the start of a session to its resolution, which is the cycle-time variant most relevant to conversational agents.	Customer service, IT helpdesk, HR self-service.	Analyze conversational agents
Touchless rate	The share of transactions an autonomous agent completed end-to-end without any human intervention, which is the clearest signal of full automation in finance, procurement, and supply-chain agents.	Finance and invoice processing, procurement intake, supply chain.	Analyze autonomous agents
Time to first answer	The elapsed time from a user's question to the first useful answer, sampled across knowledge requests.	Knowledge, IT helpdesk, executive productivity.	Copilot Studio Analytics plus baseline sampling for the pre-agent state.
P90 cycle time	The 90th-percentile cycle time: 90 percent of cases complete at or below this value, which exposes the slow tail your median can hide.	Customer service, finance, procurement, field service.	Captured during your prebuild baseline; available in your ERP, ticketing, or workflow system.
P99 cycle time	The 99th-percentile cycle time: 99 percent of cases complete at or below this value, which exposes the worst-case outliers your sponsor gets pulled into.	Customer service, IT operations, legal.	Captured during your prebuild baseline.
Mean time to detect (MTTD)	The median elapsed time from when a security event occurred to when it was detected.	Cybersecurity and SOC operations.	Your SIEM joined to agent telemetry in Application Insights.
Mean time to respond (MTTR)	The median elapsed time from detection to response, which an autonomous triage agent typically reduces by handling enrichment and notification automatically.	Cybersecurity, IT operations.	Azure Application Insights, joined to your incident-management system.
First-time fix rate	The share of field-service visits that resolved the issue on the first trip, without a return visit.	Field service.	Your field-service-management system, joined to agent telemetry.
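
The P90 and P99 definitions above ("90 percent of cases complete at or below this value") correspond to the nearest-rank percentile method. The sketch below uses that method as an assumption. Your ERP, ticketing, or analytics tool might interpolate between values and report slightly different numbers. The sample cycle times are invented.

```python
def percentile_nearest_rank(values, pct):
    """Return the smallest observed value with at least `pct` percent
    of observations at or below it (nearest-rank method)."""
    if not values:
        raise ValueError("at least one observation is required")
    if not isinstance(pct, int) or not 1 <= pct <= 100:
        raise ValueError("pct must be a whole number from 1 to 100")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, which avoids floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical cycle times in hours for ten completed cases.
cycle_times = [2, 3, 3, 4, 4, 5, 6, 8, 12, 40]

for pct in (50, 90, 99):
    print(f"P{pct}: {percentile_nearest_rank(cycle_times, pct)} hours")
```

In this sample, the middle of the distribution sits at a few hours while a single 40-hour case dominates the P99 value. That is the slow tail that a median-only report hides.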

Productive-hour value metrics

These metrics show how much time your agent saves so your team can focus on higher value work.

Metric	Definition	Common use cases	Where to find it
Agent Assisted Hours (AAH)	The Microsoft-published estimate of productive hours your agent returns to your team, computed as (knowledge references plus weighted sessions without references) times the time savings multiplier divided by 60.	Every agent built in Copilot Studio.	Copilot Studio agents report
Agent Assisted Value (AAV)	Agent Assisted Hours converted to dollars at a configurable productive-hour rate, default $72 per hour from U.S. Bureau of Labor Statistics employer-cost data. (You can replace the rate with your own fully loaded productive-hour value.)	Every agent built in Copilot Studio.	Copilot Studio agents report, Copilot Dashboard
Time savings multiplier	The minutes-saved estimate per knowledge reference or per action, which you can override in the Copilot Studio agents report calculator. The Microsoft default is six minutes per knowledge reference, drawn from Microsoft Office of the Chief Economist research.	Every agent computing AAH.	Copilot Studio agents report calculator, with the source on Microsoft WorkLab
Monthly savings (per run)	Estimated time or money saved per successful agent run, entered by the agent owner directly on the Analytics page. Updates retroactively when you change inputs.	Customer service, HR, IT helpdesk, finance.	Analyze time and cost savings for agents
Monthly savings (per tool)	The same savings estimate, broken down by individual tool the agent invoked. Useful when different tools represent different productive-hour contributions.	Finance, procurement, cybersecurity, supply chain.	Analyze time and cost savings for agents
Custom metrics (preview)	Up to three maker-defined metrics per agent, expressed in natural language. Copilot Studio scores a sampling of sessions and shows the result as a labeled donut.	Marketing (on-brand rate), legal (clause coverage), service (issue-category resolution).	Analyze your agent with custom metrics
Cost per transaction	The fully loaded cost to complete one transaction in the current process, including productive-hour value, system costs, and rework. This cost is the baseline against which Agent Assisted Value is compared.	Customer service, finance, procurement.	Captured during your prebuild baseline. Computed from your ERP and HRIS.
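
The Agent Assisted Hours formula and its conversion to Agent Assisted Value can be worked through as a short calculation. This sketch follows the formula as stated in the table and uses the defaults the table cites (six minutes per knowledge reference, $72 per hour). The function names and the input counts are hypothetical, and the weighting applied to sessions without references isn't specified here, so the example takes the already-weighted figure as an input.

```python
def agent_assisted_hours(knowledge_references,
                         weighted_sessions_without_references,
                         minutes_saved_per_unit=6.0):
    """(references + weighted sessions without references) x multiplier / 60."""
    units = knowledge_references + weighted_sessions_without_references
    return units * minutes_saved_per_unit / 60


def agent_assisted_value(hours, productive_hour_rate=72.0):
    """Convert Agent Assisted Hours to dollars at a productive-hour rate."""
    return hours * productive_hour_rate


# Hypothetical monthly inputs for one agent.
hours = agent_assisted_hours(
    knowledge_references=1200,
    weighted_sessions_without_references=300,  # assumed to be weighted already
)
value = agent_assisted_value(hours)

print(f"Agent Assisted Hours: {hours:.0f}")
print(f"Agent Assisted Value: ${value:,.0f}")
```

With these inputs, 1,500 units at six minutes each come to 150 hours, or $10,800 at the default rate. Passing your own `minutes_saved_per_unit` or `productive_hour_rate` mirrors overriding the defaults in the report calculator.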

Business-outcome metrics

These metrics show whether the agent moved the numbers your business cares about. The agent doesn't produce these metrics directly. It influences them through better service, faster cycles, and more productive hours.

Metric	Definition	Common use cases	Where to find it
Conversion lift	The change in conversion rate (lead-to-meeting, meeting-to-pipeline, opportunity-to-close) for cohorts that engaged with the agent versus matched cohorts that didn't.	Sales, marketing, customer service.	Copilot business impact report
Retention delta	The change in customer retention rate for cohorts whose interactions touched the agent versus matched cohorts that didn't.	Customer service, customer success.	Copilot business impact report
Cross-sell rate	The share of customer interactions that resulted in cross-sell or upsell, before and after the agent.	Sales, banking, retail.	Copilot business impact report
Forecast accuracy	The variance between forecast and actual results, narrowed by faster pipeline updates and better seller capacity.	Sales, finance.	Copilot business impact report
DSO (days sales outstanding) delta	The reduction in days from invoice to cash, often improved by faster invoice cycle time and fewer exceptions.	Finance and invoice processing.	Your ERP, joined to agent telemetry through the Copilot business impact report.
Off-catalog spend share	The share of procurement spending that bypasses approved catalogs, expected to decrease when an intake agent enforces catalog and policy.	Procurement.	Your procurement system, joined to agent telemetry.
Audit findings avoided	The reduction in audit findings from quarter to quarter, attributable to better policy compliance, better citation quality, or stronger guardrails.	Legal, finance, regulated functions.	Your audit-tracking system, joined to Copilot Studio Kit Compliance Hub data.
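
Conversion lift compares a cohort that engaged with the agent against a matched cohort that didn't. The sketch below shows the basic arithmetic in two forms: the absolute difference in percentage points and the relative change against the matched cohort. The cohort figures and function names are hypothetical. The Copilot business impact report might define or adjust lift differently, and a real analysis also needs a check that the difference isn't noise.

```python
def conversion_rate(conversions, cohort_size):
    """Share of a cohort that converted."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    return conversions / cohort_size


def conversion_lift(agent_cohort, matched_cohort):
    """Return (absolute_lift, relative_lift) for two cohorts.

    Each cohort is a (conversions, cohort_size) pair. Relative lift is
    None when the matched cohort has a zero conversion rate.
    """
    agent_rate = conversion_rate(*agent_cohort)
    baseline_rate = conversion_rate(*matched_cohort)
    absolute = agent_rate - baseline_rate
    relative = absolute / baseline_rate if baseline_rate else None
    return absolute, relative


# Hypothetical lead-to-meeting results for two matched cohorts of 400 leads.
absolute, relative = conversion_lift(agent_cohort=(60, 400), matched_cohort=(48, 400))

print(f"Absolute lift: {absolute:.1%} (percentage points)")
print(f"Relative lift: {relative:.1%}")
```

State which form you report. A 3-point absolute lift and a 25 percent relative lift describe the same result in this example, but they read very differently in a value review.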

Last updated on 2026-06-04