Virtual agent platform

The virtual agent platform is a feature within Quality AI that provides insights into the performance of conversational agents created with Dialogflow.

View platform

Follow these steps to view the platform in the Conversational Insights console.

  1. Click Quality AI > Agents > Virtual agent.
  2. Select an agent.

Virtual agent performance

The virtual agent platform provides details about virtual agent performance, displaying both operational and Quality AI metrics.

Operational metrics

The virtual agent platform displays each of the following metrics as a number or percentage; a sketch showing how such values could be computed follows the list.

  • Total sessions: Total number of conversations handled by this agent.
  • Escalation rate: Percentage of conversations escalated to a human agent, calculated using a signal from Dialogflow.
  • Turns per session: Average number of turns per conversation.
  • No match rate: Percentage of conversations that did not match any intent, applicable only for flow-based virtual agents.
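
To make these definitions concrete, here is a minimal sketch of how such values could be computed from raw conversation records. The Conversation record and its fields (escalated, turns, had_no_match) are hypothetical stand-ins for whatever signals your pipeline logs; they are not Conversational Insights API fields.

    # Minimal sketch: operational metrics over hypothetical conversation records.
    from dataclasses import dataclass

    @dataclass
    class Conversation:
        escalated: bool     # escalated to a human agent (Dialogflow signal)
        turns: int          # number of turns in the conversation
        had_no_match: bool  # no intent matched (flow-based agents only)

    def operational_metrics(convs: list[Conversation]) -> dict[str, float]:
        total = len(convs)
        if total == 0:
            return {"total_sessions": 0.0, "escalation_rate": 0.0,
                    "turns_per_session": 0.0, "no_match_rate": 0.0}
        return {
            "total_sessions": float(total),
            "escalation_rate": 100.0 * sum(c.escalated for c in convs) / total,
            "turns_per_session": sum(c.turns for c in convs) / total,
            "no_match_rate": 100.0 * sum(c.had_no_match for c in convs) / total,
        }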

The virtual agent platform displays a graph of the change over time for each of the following metrics.

  • Volume: Total number of conversations handled by this agent.
  • Escalation rate: Percentage of conversations escalated to a human agent, calculated using a signal from Dialogflow.
  • Escalation type breakdown: Number of conversations per escalation initiator (the user or the agent), determined by the predefined question Who escalated the conversation?
  • Tool failure rate: Percentage of tool calls that failed across all uses of the tool in conversations for a specific agent, in the selected time period and conversation medium.
  • Tool latency: Average latency of a tool call across all uses of the tool in conversations for a specific agent, in the selected time period and conversation medium.
  • No match rate: Percentage of conversations that did not match any intent, applicable only for flow-based virtual agents.

The virtual agent platform also displays a graph called E2E latency breakdown, which shows the average latency in milliseconds for tool calls, large language model (LLM) calls, and Text-to-Speech (TTS) calls. To compute this average, the tool, LLM, or TTS latency is first averaged across interactions in a conversation, then averaged across conversations.
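
Put differently: compute the mean latency within each conversation, then take the mean of those per-conversation means. Here is a minimal sketch of that two-stage average, where each inner list holds the hypothetical per-interaction latencies (in milliseconds) of one conversation for a single call type:

    # Two-stage average: mean latency per conversation, then mean across
    # conversations, for one call type (tool, LLM, or TTS).
    def e2e_latency_ms(conversations: list[list[float]]) -> float:
        per_conv = [sum(lats) / len(lats) for lats in conversations if lats]
        return sum(per_conv) / len(per_conv) if per_conv else 0.0

    # Conversation means are 100.0 and 250.0, so the result is 175.0 ms,
    # not the flat mean of all five interactions (190.0 ms).
    print(e2e_latency_ms([[50, 150], [200, 250, 300]]))  # 175.0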

Escalation type breakdown

The escalation type breakdown shows the number of conversations for each escalation initiator: user or agent. Quality AI determines the escalation initiator by answering the predefined question Who escalated the conversation? You can drill down on an escalation initiator to view a list of conversations with that initiator.

Tools

Tool metrics are computed for Conversational agent tools. Aggregated metrics like tool latency and failure rates help bot builders identify performance bottlenecks across conversations.
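
As a rough sketch of that aggregation, per-tool failure rate and average latency could be computed as follows. The ToolCall record and its fields (tool, failed, latency_ms) are hypothetical stand-ins for logged tool-call data within the selected time period and conversation medium, not a Conversational Insights API.

    # Sketch: per-tool failure rate and average latency over logged tool calls.
    from collections import defaultdict
    from dataclasses import dataclass

    @dataclass
    class ToolCall:
        tool: str          # tool name
        failed: bool       # whether this call failed
        latency_ms: float  # latency of this call

    def tool_metrics(calls: list[ToolCall]) -> dict[str, dict[str, float]]:
        by_tool: dict[str, list[ToolCall]] = defaultdict(list)
        for call in calls:
            by_tool[call.tool].append(call)
        return {
            tool: {
                "failure_rate": 100.0 * sum(c.failed for c in group) / len(group),
                "avg_latency_ms": sum(c.latency_ms for c in group) / len(group),
            }
            for tool, group in by_tool.items()
        }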

Quality AI metrics

The virtual agent platform displays the following Quality AI metrics; a sketch showing how several of them could be aggregated follows the list.

  • Quality score: Average quality score per scorecard over conversations handled by this agent.
  • Overall sentiment: Average sentiment score over conversations handled by this agent.
  • Sentiment breakdown: A color-coded bar chart showing the number of this agent's conversations per conversation-level sentiment category: negative, neutral, or positive.
  • Conversation outcome: Number of conversations for each possible outcome.
  • Sentiment by topic: Breakdown of the number of conversations per conversation-level sentiment category for each topic.
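
For illustration, here is a minimal sketch of how the first three metrics could be aggregated. The ScoredConversation record and its fields are hypothetical, standing in for the per-conversation results of a Quality AI analysis.

    # Sketch: Quality AI aggregates over hypothetical scored conversations.
    from collections import Counter, defaultdict
    from dataclasses import dataclass

    @dataclass
    class ScoredConversation:
        scorecard: str           # scorecard that produced the quality score
        quality_score: float     # quality score for this conversation
        sentiment_score: float   # conversation-level sentiment score
        sentiment_category: str  # "negative", "neutral", or "positive"

    def quality_metrics(convs: list[ScoredConversation]) -> dict:
        if not convs:
            return {}
        by_scorecard: dict[str, list[float]] = defaultdict(list)
        for c in convs:
            by_scorecard[c.scorecard].append(c.quality_score)
        return {
            # Quality score: average per scorecard.
            "quality_score": {s: sum(v) / len(v) for s, v in by_scorecard.items()},
            # Overall sentiment: average sentiment score.
            "overall_sentiment": sum(c.sentiment_score for c in convs) / len(convs),
            # Sentiment breakdown: conversation count per sentiment category.
            "sentiment_breakdown": Counter(c.sentiment_category for c in convs),
        }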

The platform also displays graphs of the change over time for the following Quality AI metrics.

  • Quality score: Average quality score, as a percentage, across all conversations handled by this agent.
  • Score category breakdown: Quality scores for each predefined category (business, compliance, and customer) and for any custom categories.

Conversation outcome

The conversation outcome graph shows the number of conversations that ended with each of the following possible outcomes:

  • Abandoned
  • Partially resolved
  • Escalated
  • Redirected
  • Successfully resolved
  • Unknown

Quality AI computes conversation outcomes using predefined questions.

  1. To view outcome data in this graph, run a Quality AI analysis with a scorecard that contains a predefined question.
  2. To see a list of conversations with a given outcome, click that outcome.

Sentiment breakdown and sentiment by topic

After you run sentiment analysis on all your conversations, the sentiment breakdown chart displays the number of conversations with an overall sentiment in each category: negative, neutral, and positive.

With your deployed topic model, you can also view the sentiment for each topic. Choose a sentiment category in one of the graphs to view a list of conversations with that sentiment.
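
Sentiment by topic is effectively a nested count: conversations grouped first by topic, then by conversation-level sentiment category. A minimal sketch over hypothetical (topic, sentiment category) pairs:

    # Sketch: conversation counts per topic and sentiment category, from
    # hypothetical (topic, sentiment_category) pairs produced by a topic
    # model plus sentiment analysis.
    from collections import Counter, defaultdict

    def sentiment_by_topic(pairs: list[tuple[str, str]]) -> dict[str, Counter]:
        counts: dict[str, Counter] = defaultdict(Counter)
        for topic, sentiment in pairs:
            counts[topic][sentiment] += 1
        return dict(counts)

    print(sentiment_by_topic([("billing", "negative"), ("billing", "neutral"),
                              ("returns", "positive")]))
    # {'billing': Counter({'negative': 1, 'neutral': 1}),
    #  'returns': Counter({'positive': 1})}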