Compute metrics using Agent Assist conversation data

Conversational Insights lets you import conversations from Agent Assist and analyze them. This document walks you through computing several important conversational metrics on imported Agent Assist data.

Prerequisites

  1. Use Agent Assist to generate conversational data with either the smart reply or summarization feature.

  2. Enable Dialogflow runtime integration in Conversational Insights.

  3. Export your data from Conversational Insights to BigQuery.

Trigger rate

Trigger rate describes how often a type of annotation appears in a conversation.

Sentence-level trigger rate

A sentence-level trigger rate applies to annotations that can appear at least once per sentence.

Annotation name | Annotation type | Criteria
Article suggestion | 'ARTICLE_SUGGESTION' | At least one SuggestArticles request is sent for each message in the conversation.
FAQ assist | 'FAQ' | At least one SuggestFaqAnswers or AnalyzeContent request is sent for each message in the conversation.
Smart reply | 'SMART_REPLY' | At least one SuggestSmartReplies or AnalyzeContent request is sent for each message in the conversation.

In this query and the queries that follow, replace BIGQUERY_TABLE with the name of your exported BigQuery table and ANNOTATION_TYPE with one of the annotation type values listed above.

WITH calculation AS
(
  SELECT
    COUNTIF(
      (
        SELECT COUNT(*)
        FROM UNNEST (sen.annotations) AS annotation
        WHERE annotation.type = 'ANNOTATION_TYPE'
      ) > 0
    ) AS numSentencesAtLeastOneAnnotation,
    COUNT(*) AS numSentencesTotal
  FROM BIGQUERY_TABLE,
  UNNEST (sentences) AS sen
)
SELECT
  numSentencesAtLeastOneAnnotation,
  numSentencesTotal,
  numSentencesAtLeastOneAnnotation / numSentencesTotal AS triggerRate
FROM calculation;

Conversation-level trigger rate

A conversation-level trigger rate applies to annotations that can appear at most once per conversation.

Annotation name | Annotation type | Criteria
Summarization | 'CONVERSATION_SUMMARIZATION' | At least one SuggestConversationSummary request is sent for each conversation.

WITH calculation AS
(
  SELECT
    COUNTIF(
      (
        SELECT COUNT(*)
        FROM
          UNNEST (sentences) AS sen,
          UNNEST (sen.annotations) AS annotation
        WHERE annotation.type = 'ANNOTATION_TYPE'
      ) > 0
    ) AS numConversationsAtLeastOneAnnotation,
    COUNT(*) AS numConversationsTotal
  FROM BIGQUERY_TABLE
)
SELECT
  numConversationsAtLeastOneAnnotation,
  numConversationsTotal,
  numConversationsAtLeastOneAnnotation / numConversationsTotal AS triggerRate
FROM calculation;

Click-through rate

Click-through rate describes how often call agents click the annotations suggested to them.

Sentence-level click-through rate

A sentence-level click-through rate applies to annotations that call agents can click multiple times per sentence.

Annotation name | Annotation type | Criteria
Article suggestion | 'ARTICLE_SUGGESTION' | At least one SuggestArticles request is sent for each message in the conversation.
FAQ assist | 'FAQ' | At least one SuggestFaqAnswers or AnalyzeContent request is sent for each message in the conversation.
Smart reply | 'SMART_REPLY' | At least one SuggestSmartReplies or AnalyzeContent request is sent for each message in the conversation.

WITH calculation AS
(
  SELECT
    COUNTIF(
      (
        SELECT COUNT(*)
        FROM UNNEST (sen.annotations) AS annotation
        WHERE
          annotation.type = 'ANNOTATION_TYPE' AND
          annotation.displayed = TRUE AND
          annotation.clicked = TRUE
      ) > 0
    ) AS numSentencesAtLeastOneClickedAndDisplayedAnnotation,
    COUNTIF(
      (
        SELECT COUNT(*)
        FROM UNNEST (sen.annotations) AS annotation
        WHERE
          annotation.type = 'ANNOTATION_TYPE' AND
          annotation.displayed = TRUE
      ) > 0
    ) AS numSentencesAtLeastOneDisplayedAnnotation
  FROM BIGQUERY_TABLE,
  UNNEST (sentences) AS sen
)
SELECT
  numSentencesAtLeastOneClickedAndDisplayedAnnotation,
  numSentencesAtLeastOneDisplayedAnnotation,
  numSentencesAtLeastOneClickedAndDisplayedAnnotation / numSentencesAtLeastOneDisplayedAnnotation AS clickThroughRate
FROM calculation;

Conversation-level click-through rate

A conversation-level click-through rate doesn't apply. Call agents have to proactively send a request to create a conversation-level annotation such as a summarization, so they always click something to receive this type of annotation.

Feature-specific metrics

This section lists metrics that are specific to each feature.

Smart reply

This section lists metrics that are specific to smart reply.

Effective click-through rate (effective CTR)

The effective click-through rate reflects how often call agents click smart reply annotations more precisely than the regular click-through rate metric does. The following query fetches each agent response along with the smart reply annotations that were generated for that response and clicked by the agent. It then counts the agent responses drafted from smart reply suggestions and divides that count by the total number of agent responses.

WITH calculation AS (
  SELECT
    sen.sentence AS agentResponse,
    (
      # Get a list of effective smart replies that map to each agent response.
      SELECT ARRAY_AGG(JSON_QUERY(PARSE_JSON(JSON_VALUE(annotation.annotationRecord)), "$.reply"))
      FROM
        UNNEST (sentences) AS sen2,
        UNNEST (sen2.annotations) AS annotation
      WHERE
        annotation.type='SMART_REPLY' AND
        annotation.clicked=TRUE AND
        annotation.annotationRecord IS NOT NULL AND
        annotation.createTimeNanos >= (
          # Get the timestamp of the agent response that immediately precedes
          # the current agent response. The timestamp of the effective smart
          # reply must be greater than this timestamp.
          SELECT MAX(sen3.createTimeNanos)
          FROM UNNEST(sentences) AS sen3
          WHERE
            sen3.participantRole = 'HUMAN_AGENT' AND
            sen3.createTimeNanos < sen.createTimeNanos
        ) AND
        annotation.createTimeNanos <= sen.createTimeNanos
    ) AS effectiveSmartRepliesPerAgentResponse
  FROM BIGQUERY_TABLE,
  UNNEST (sentences) AS sen
  WHERE sen.participantRole = 'HUMAN_AGENT'
)
SELECT COUNTIF(ARRAY_LENGTH(effectiveSmartRepliesPerAgentResponse) > 0) / COUNT(agentResponse) AS effectiveCTR
FROM calculation;

Edit rate

Agents can choose to either use suggested smart reply responses verbatim or edit them before sending them to customers. Computing the edit rate of smart reply responses shows you how often the suggested replies are edited. The following query fetches each clicked smart reply and the agent response that immediately follows it. It then counts the occurrences where the agent's response doesn't contain its corresponding smart reply and divides that count by the total smart reply count.

WITH calculation AS
(
  SELECT
    JSON_VALUE(PARSE_JSON(JSON_VALUE(annotation.annotationRecord), wide_number_mode=>'round'), "$.reply") AS smartReply,
    (
      # Get the agent response that immediately follows the smart reply.
      SELECT sen2.sentence
      FROM UNNEST (sentences) AS sen2
      WHERE sen2.createTimeNanos = (
        SELECT MIN(sen3.createTimeNanos)
        FROM UNNEST (sentences) AS sen3
        WHERE
          sen3.createTimeNanos > sen.createTimeNanos AND
          sen3.participantRole = 'HUMAN_AGENT'
      )
    ) AS agentResponseThatFollows
  FROM BIGQUERY_TABLE,
  UNNEST (sentences) AS sen,
  UNNEST (sen.annotations) AS annotation
  WHERE
    annotation.type='SMART_REPLY' AND
    annotation.clicked=TRUE AND
    annotation.annotationRecord IS NOT NULL
)
# Count responses that don't contain their smart reply verbatim. Note that
# REGEXP_CONTAINS treats the reply as a regular expression, so replies
# containing regex metacharacters may need escaping.
SELECT COUNTIF(NOT REGEXP_CONTAINS(agentResponseThatFollows, smartReply)) / COUNT(smartReply) AS editRate
FROM calculation;

Summarization

This section lists metrics that are specific to summarization.

  • The GENERATOR_SUGGESTION_RESULT annotation type contains all generator results, including summaries.
  • The CONVERSATION_SUMMARIZATION_SUGGESTION annotation type contains legacy summaries.

Generator summary

The following sample query accesses the generator summary, and the agent's edits to that summary, as JSON:

SELECT
  IF(JSON_VALUE(annotation.annotationRecord) != '',
     JSON_QUERY(PARSE_JSON(JSON_VALUE(annotation.annotationRecord)), "$.generatorSuggestion.summarySuggestion"),
     NULL) AS generator_summary,
  IF(JSON_VALUE(annotation.detailedFeedback) != '',
     JSON_QUERY(PARSE_JSON(JSON_VALUE(annotation.detailedFeedback)), "$.summarizationFeedback.summaryText"),
     NULL) AS feedback
FROM
  BIGQUERY_TABLE AS c,
  UNNEST(c.sentences) AS sentences,
  UNNEST(sentences.annotations) AS annotation
WHERE
  annotation.type = 'GENERATOR_SUGGESTION_RESULT'
  AND JSON_QUERY(PARSE_JSON(JSON_VALUE(annotation.annotationRecord)), "$.generatorSuggestion.summarySuggestion") IS NOT NULL
LIMIT 10;
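
If you want the summary as plain text rather than JSON, you can flatten its sections. The following is a minimal sketch that assumes the annotationRecord follows the Dialogflow SummarySuggestion shape, where summarySuggestion.summarySections is an array of objects with section and summary fields; verify this shape against your own exported data first.

SELECT
  ARRAY(
    # Render each assumed {section, summary} object as "section: summary".
    SELECT CONCAT(JSON_VALUE(sec, "$.section"), ": ", JSON_VALUE(sec, "$.summary"))
    FROM UNNEST(JSON_QUERY_ARRAY(
      PARSE_JSON(JSON_VALUE(annotation.annotationRecord)),
      "$.generatorSuggestion.summarySuggestion.summarySections")) AS sec
  ) AS summarySectionTexts
FROM
  BIGQUERY_TABLE AS c,
  UNNEST(c.sentences) AS sentences,
  UNNEST(sentences.annotations) AS annotation
WHERE
  annotation.type = 'GENERATOR_SUGGESTION_RESULT'
  AND JSON_QUERY(PARSE_JSON(JSON_VALUE(annotation.annotationRecord)), "$.generatorSuggestion.summarySuggestion") IS NOT NULL
LIMIT 10;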

Edit distance

The edit distance of summarization measures how much the agent-edited summaries differ from the suggested summaries. This metric correlates with the amount of time the agent spent editing the suggested summaries. To compute it:

  1. Query the summarizations and their revised summaries (if any).

  2. Because call agents can submit summary revisions without setting the clicked boolean to true, query summarizations regardless of their clicked status. If the agent's summary is null, either the agent didn't revise the suggested summary or the revised summary wasn't sent to Agent Assist using the preferred feedback route.

  3. Compare the suggested summaries with the agents' revised summaries using your preferred edit distance algorithm to compute the summarization edit distance, for example Levenshtein distance with equal penalties for insertion, deletion, and substitution. See the sketch after the query below.

SELECT
  JSON_VALUE(PARSE_JSON(JSON_VALUE(annotation.annotationRecord)), "$.text") AS suggestedSummary,
  IF(JSON_VALUE(annotation.detailedFeedback) != '',
     JSON_QUERY(PARSE_JSON(JSON_VALUE(annotation.detailedFeedback)), "$.summarizationFeedback.summaryText"),
     NULL) AS editedSummary
FROM
  BIGQUERY_TABLE,
  UNNEST (sentences) AS sen,
  UNNEST (sen.annotations) AS annotation
WHERE
  annotation.type = 'CONVERSATION_SUMMARIZATION_SUGGESTION';
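
For step 3, BigQuery's built-in EDIT_DISTANCE function computes the Levenshtein distance directly, so you don't have to export the summary pairs to another tool. The following is a minimal sketch under the same assumptions as the query above; it uses JSON_VALUE for the edited summary so that both sides are plain strings.

WITH summaryAndEdits AS (
  SELECT
    JSON_VALUE(PARSE_JSON(JSON_VALUE(annotation.annotationRecord)), "$.text") AS suggestedSummary,
    IF(JSON_VALUE(annotation.detailedFeedback) != '',
       JSON_VALUE(PARSE_JSON(JSON_VALUE(annotation.detailedFeedback)), "$.summarizationFeedback.summaryText"),
       NULL) AS editedSummary
  FROM
    BIGQUERY_TABLE,
    UNNEST (sentences) AS sen,
    UNNEST (sen.annotations) AS annotation
  WHERE
    annotation.type = 'CONVERSATION_SUMMARIZATION_SUGGESTION'
)
SELECT
  suggestedSummary,
  editedSummary,
  # EDIT_DISTANCE returns the Levenshtein distance with equal penalties for
  # insertion, deletion, and substitution.
  EDIT_DISTANCE(suggestedSummary, editedSummary) AS editDistance
FROM summaryAndEdits
WHERE suggestedSummary IS NOT NULL AND editedSummary IS NOT NULL;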

Edit rate

You can compute the edit rate for summarization by dividing the edit distance by the string length of the edited summary. This metric reflects how much of the summary the agent edited. The edit rate has no upper bound, but a value equal to or greater than one means the agent changed at least as many characters as the final summary contains, which indicates that the generated summaries are of poor quality and the feature should probably be disabled.
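
As a sketch under the same assumptions, the following query extends the edit distance calculation by normalizing by the edited summary's length; averaging across summaries is one reasonable way to aggregate the per-summary values.

WITH summaryAndEdits AS (
  # Same subquery as in the edit distance sketch above.
  SELECT
    JSON_VALUE(PARSE_JSON(JSON_VALUE(annotation.annotationRecord)), "$.text") AS suggestedSummary,
    IF(JSON_VALUE(annotation.detailedFeedback) != '',
       JSON_VALUE(PARSE_JSON(JSON_VALUE(annotation.detailedFeedback)), "$.summarizationFeedback.summaryText"),
       NULL) AS editedSummary
  FROM
    BIGQUERY_TABLE,
    UNNEST (sentences) AS sen,
    UNNEST (sen.annotations) AS annotation
  WHERE
    annotation.type = 'CONVERSATION_SUMMARIZATION_SUGGESTION'
)
SELECT
  # Average per-summary edit rate: Levenshtein distance divided by the
  # character length of the edited summary.
  AVG(EDIT_DISTANCE(suggestedSummary, editedSummary) / CHAR_LENGTH(editedSummary)) AS avgEditRate
FROM summaryAndEdits
WHERE
  suggestedSummary IS NOT NULL AND
  editedSummary IS NOT NULL AND
  CHAR_LENGTH(editedSummary) > 0;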

Edit percentage

The edit percentage measures the share of suggested summaries that agents edited instead of using verbatim. It's calculated as the number of agent-edited summaries divided by the number of suggested summaries. This metric is a good signal of model quality when the value is close to 0. A value close to 1 doesn't necessarily mean the model quality is bad, though, since even a minor edit counts as the suggested summary having been edited.

WITH summaryAndEdits AS (
  SELECT
    JSON_VALUE(PARSE_JSON(JSON_VALUE(annotation.annotationRecord)), "$.text") AS suggestedSummary,
    IF(JSON_VALUE(annotation.detailedFeedback) != '',
       JSON_QUERY(PARSE_JSON(JSON_VALUE(annotation.detailedFeedback)), "$.summarizationFeedback.summaryText"),
       NULL) AS editedSummary
  FROM
    BIGQUERY_TABLE,
    UNNEST (sentences) AS sen,
    UNNEST (sen.annotations) AS annotation
  WHERE
    annotation.type = 'CONVERSATION_SUMMARIZATION_SUGGESTION'
)
SELECT
  COUNTIF(suggestedSummary IS NOT NULL) AS totalSummaries,
  COUNTIF(editedSummary IS NOT NULL) AS editedSummaries,
  COUNTIF(editedSummary IS NOT NULL) / COUNTIF(suggestedSummary IS NOT NULL) AS editPercentage
FROM
  summaryAndEdits;