Topic modeling basics

This document describes the basic concepts for understanding the Conversational Insights topic modeling feature.

Conversations

Topic modeling analyzes conversations. Each conversation is an interaction between a contact center agent and a user. Topic modeling uses chat transcripts or call transcripts created using the Insights API.

For more information, see the Conversations reference documentation.

Topics

A topic is created by analyzing the key subjects of each conversation and then clustering similar subjects. Topic modeling then identifies the number of distinct clusters and attempts to generate a name for each one. Each name represents a topic, which in turn is represented by an Issue resource.

After topic modeling creates a set of topic names, you can review each name and the conversations labeled with it. Topic modeling can also show you the snippet from the most representative conversation for a topic.

Topic models

Before you use topic modeling to analyze conversations, create a topic model in Conversational Insights. Topic models contain discovered topics and can be used to infer topics for any conversation. From a topic model, you can generate a report that identifies the topics within the model along with the name and description of each topic. You can also deploy a topic model to your project, which lets you infer topics in real time during a conversation with an end user. Topic models are represented by issueModels resources.
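As a minimal sketch of how these resources are addressed, the helpers below build resource names for a topic model (an issueModels resource) and for one topic inside it (an Issue resource), following the projects/locations naming pattern used by the Insights API. The function names are hypothetical and not part of any client library, and the project, location, and IDs are placeholder values.

```python
def issue_model_name(project: str, location: str, issue_model: str) -> str:
    """Return the resource name of a topic model (an issueModels resource)."""
    return f"projects/{project}/locations/{location}/issueModels/{issue_model}"


def issue_name(project: str, location: str, issue_model: str, issue: str) -> str:
    """Return the resource name of one topic (an Issue resource) in a model."""
    return f"{issue_model_name(project, location, issue_model)}/issues/{issue}"


# Placeholder identifiers, for illustration only.
model = issue_model_name("my-project", "us-central1", "1234567890")
topic = issue_name("my-project", "us-central1", "1234567890", "987")
print(model)
print(topic)
```

Keeping the topic name nested under the model name mirrors the relationship described above: a topic always belongs to exactly one topic model.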

Fine-tune a topic model

When using topic modeling, there are three main techniques for improving topic assignments. All these actions affect adjusted topic distributions.

  • Modify an existing topic's name and description.
  • Add a new topic.
  • Remove an existing topic.

When you perform any of these actions, new analyses use the updated topic list, and existing analyses are unchanged. To apply a change to an existing analysis, follow the instructions in Real-time topic inference.

Add or edit topics

You can change your topic list by adding topics to cover areas the model doesn't already represent. Follow these steps:

  1. From the topic list, click New topic.
  2. Enter a name, then click Add.

You can also update a topic's name or description so that it better describes the conversations that should match it, or better suits your business use case. Follow these steps to edit a topic name or description:

  1. From the topic list, go to the topic, click the More menu (three vertical dots), then click Edit topic.
  2. Enter the name and description, then click Done.

Avoid adding duplicate or similar topics because they will negatively impact the quality of topic inferences. When creating or changing a topic, apply the following naming and description guidelines.

Name

  • Use short, descriptive names of three to six words, such as troubleshooting remote control or inquiring about billing policy.

  • Avoid generic or abstract names, such as Sales.

Optionally, follow these best practices:

  • Use readily available custom topic names, such as Billing.

  • Add a short description to the topic name, as in "Billing Errors and Refunds".

  • Choose a suitable model configuration based on the results you want.

Example

A credit card support center runs topic modeling on its archived support call logs. The modeling creates a topic from a cluster of conversations and names it Credit card over the limit inquiries. The business shortens the name to Credit limit inquiries.
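The naming guidelines above can be partly checked before a topic is added. The sketch below is an illustration only: check_topic_name is a hypothetical helper, not a feature of Conversational Insights, and the 0.8 similarity cutoff is an assumed value for flagging names that look like near duplicates of existing topics.

```python
from difflib import SequenceMatcher

MIN_WORDS, MAX_WORDS = 3, 6   # "three to six words" from the naming guideline
SIMILARITY_THRESHOLD = 0.8    # assumed cutoff for treating two names as similar


def check_topic_name(name, existing_names):
    """Return a list of guideline warnings for a proposed topic name."""
    warnings = []
    word_count = len(name.split())
    if not MIN_WORDS <= word_count <= MAX_WORDS:
        warnings.append(
            f"name has {word_count} word(s); aim for {MIN_WORDS} to {MAX_WORDS}"
        )
    for other in existing_names:
        # Character-level similarity, ignoring case.
        ratio = SequenceMatcher(None, name.lower(), other.lower()).ratio()
        if ratio >= SIMILARITY_THRESHOLD:
            warnings.append(f"name is similar to existing topic '{other}'")
    return warnings


existing = ["Troubleshooting remote control"]
print(check_topic_name("Credit limit inquiries", existing))
print(check_topic_name("Sales", existing))
print(check_topic_name("Troubleshooting remote controls", existing))
```

A string-similarity check only catches names that are spelled alike. Two differently worded names for the same subject still need a manual review.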

Description

  • Use a general description followed by a few examples.

  • Avoid including personal information like names, dates, or locations.

  • Avoid too much detail, such as "don't include X topic", which can negatively impact topic inference.

Examples

  • The customer is inquiring about their landline phone service. They may want to cancel it or consult about the current billing.

  • The customer is inquiring about their bill. They may want to know the amount or the due date.

Remove a topic

In the Insights console, follow these steps to remove a topic from the final topic list and the topic inference results.

  1. Select your Insights-enabled project.
  2. Click Topic Models and select a topic model.
  3. Go to the topic and click the More menu (three vertical dots).
  4. Click Remove topic.

Assess training data quality

For voice data, the quality of Speech-to-Text outputs is critical to the performance of the topic model.

  • Ensure that the conversation's speaker roles are assigned properly when the conversation is ingested.

    • Accurately label conversation turns as coming from customer or agent.
    • Use AGENT for human roles, AUTOMATED_AGENT for virtual ones, and, for customer roles, END_USER or CUSTOMER.
    • Make sure that most conversations have transcripts with customer and agent roles labeled. Conversations with only one role won't be used in training.
  • Ensure that the conversations are sufficiently long: about ten total turns, with five from the agent and five from the customer.

  • Avoid using duplicate conversations in the dataset.
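The checks in this list can be approximated with a small pre-filter run before training. This is a sketch under stated assumptions: each conversation is represented as a list of (role, text) tuples, which is a hypothetical in-memory format rather than the Insights API's own, and the "about ten turns" guidance is treated as a hard cutoff of five turns per side.

```python
AGENT_ROLES = {"AGENT", "AUTOMATED_AGENT"}
CUSTOMER_ROLES = {"END_USER", "CUSTOMER"}
MIN_TURNS_PER_SIDE = 5  # "five from the agent and five from the customer"


def is_long_enough(turns):
    """Check that both roles are present with enough turns from each."""
    agent_turns = sum(1 for role, _ in turns if role in AGENT_ROLES)
    customer_turns = sum(1 for role, _ in turns if role in CUSTOMER_ROLES)
    return (agent_turns >= MIN_TURNS_PER_SIDE
            and customer_turns >= MIN_TURNS_PER_SIDE)


def select_training_conversations(conversations):
    """Drop exact duplicates and conversations that fail the length check."""
    seen = set()
    selected = []
    for turns in conversations:
        key = tuple(turns)  # exact-match duplicate detection only
        if key in seen or not is_long_enough(turns):
            continue
        seen.add(key)
        selected.append(turns)
    return selected


# One usable conversation, one agent-only conversation, and one duplicate.
good = []
for i in range(5):
    good.append(("AGENT", f"agent line {i}"))
    good.append(("END_USER", f"customer line {i}"))
agent_only = [("AGENT", f"agent line {i}") for i in range(10)]

result = select_training_conversations([good, agent_only, list(good)])
print(len(result))
```

The duplicate check compares whole transcripts exactly, so near duplicates, such as the same call transcribed twice with small differences, pass through and need a separate review.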

For better-quality topics from the model, try using redacted conversations for topic modeling. However, overly aggressive redaction that removes important information from the transcripts can affect the length of your training conversations. If applicable, check the quality of the Cloud Data Loss Prevention redaction.

Data requirements, including smaller datasets

A minimum of 1,000 conversations, each with five back-and-forth turns between an agent and a customer, is required. We recommend using about 10,000 conversations for training.
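As a rough illustration of these thresholds, the sketch below classifies a dataset by its count of usable conversations. The function name and status strings are hypothetical, and the recommended size of "about 10,000" is treated as an exact cutoff for simplicity.

```python
MIN_CONVERSATIONS = 1_000           # required minimum
RECOMMENDED_CONVERSATIONS = 10_000  # "about 10,000" recommended for training


def dataset_size_status(usable_conversations):
    """Classify a dataset by how many usable conversations it contains."""
    if usable_conversations < MIN_CONVERSATIONS:
        return "below minimum"
    if usable_conversations < RECOMMENDED_CONVERSATIONS:
        return "meets minimum"
    return "meets recommendation"


for count in (400, 2_500, 12_000):
    print(count, dataset_size_status(count))
```

Count only the conversations that already satisfy the turn requirement, because shorter conversations do not count toward the minimum.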