Best practices for rolling out Conversational Analytics with Looker

Conversational Analytics lets users query data that is modeled in LookML by asking natural language questions within a Looker instance.

This guide provides strategies and best practices to help Looker administrators and LookML developers successfully configure, deploy, and optimize Conversational Analytics.

By preparing your LookML model and your Conversational Analytics agents, you can increase user adoption and help ensure that users get accurate and useful answers to their questions.

Learn how and when Gemini for Google Cloud uses your data. As an early-stage technology, Gemini for Google Cloud products can generate output that seems plausible but is factually incorrect. We recommend that you validate all output from Gemini for Google Cloud products before you use it. For more information, see Gemini for Google Cloud and responsible AI.

LookML best practices for Conversational Analytics

Conversational Analytics interprets natural language questions by leveraging two primary inputs:

  1. The LookML model: Conversational Analytics analyzes the structure, fields (dimensions, measures), labels, and descriptions that are defined within the Looker Explores.

  2. Distinct field values: Conversational Analytics examines the data values within fields (specifically, string dimensions) to identify the available categories and entities that users might ask about. Cardinality (the number of unique values) can influence how these values are used.

Although Conversational Analytics is powerful, its effectiveness is directly tied to the quality and clarity of these two inputs. The following list describes common ways that unclear or ambiguous LookML can negatively affect Conversational Analytics, along with solutions for improving the output and user experience.

Lack of clarity: Fields that don't have clear labels or descriptions are ambiguous both to Conversational Analytics and to its users.

  • Apply clear labels: Use the label parameter to give fields intuitive, business-friendly names that users are likely to use in their questions.

Field bloat: Exposing too many fields, especially internal IDs (primary keys), duplicate fields that are inherited from joins, or intermediate calculation fields, can clutter the options that are available to Conversational Analytics.

  • Hide irrelevant fields: Ensure that all primary keys, foreign keys, redundant fields from joins, and purely technical fields remain hidden.
  • (Optional) Extend Explores: If your Explore contains a large number of fields, consider creating a new Explore that extends an existing one. This lets you tailor a dedicated version of popular content for Conversational Analytics without modifying Explores that other content may rely on.

Naming conflicts: Multiple fields that have similar or identical names or labels across different views within the Explore can lead to incorrect field selection.

  • Write thorough descriptions: Descriptions provide critical context for Conversational Analytics. Use the description parameter to describe the field clearly in natural language, to include company- or industry-specific terminology or synonyms, and to explain calculations or context. Conversational Analytics uses descriptions to better identify field meanings and to map user terms. For example, a field that is named user_count could have the description "The total number of unique users who visited the website."
  • Standardize naming: Review field names and labels for consistency and clarity.

Hidden complexity: Relying heavily on dashboard-level custom fields or table calculations means that potentially critical business logic won't be accessible to Conversational Analytics.

  • Incorporate custom logic: Identify important and commonly used custom fields or table calculations, and convert their logic into LookML dimensions and measures so that Conversational Analytics can use them.

Messy data: The following types of inconsistent or poorly structured data make it difficult for Conversational Analytics to interpret queries accurately:

  • Value variations: Inconsistent capitalization or naming conventions (for example, a mix of the values complete, Complete, and COMPLETE) can lead to data duplication or incorrect data relationships in Conversational Analytics.
  • Inconsistent data types: Columns that are intended to be numeric but that contain occasional string values force the field type to string, which prevents numerical operations.
  • Timezone ambiguity: A lack of standardized time zones in timestamp fields can lead to incorrect filtering or aggregation.

To address data quality issues, flag problems (inconsistent values, types, and time zones) that you identify during data curation. Work with data engineering teams to clean up the source data or to apply transformations in the ETL or data modeling layer.
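The practices above can be sketched in LookML. The following hypothetical view illustrates clear labels, thorough descriptions, hidden technical keys, and a dimension that normalizes inconsistent source values; all table, field, and description text is illustrative, not a prescribed implementation.

```lookml
view: users {
  sql_table_name: analytics.users ;;

  # Hide technical keys so they don't clutter Conversational Analytics.
  dimension: id {
    hidden: yes
    primary_key: yes
    type: number
    sql: ${TABLE}.id ;;
  }

  # Normalize inconsistent source values (complete, Complete, COMPLETE).
  dimension: order_status {
    label: "Order Status"
    description: "The status of the user's most recent order, such as Complete or Pending."
    type: string
    sql: INITCAP(LOWER(${TABLE}.order_status)) ;;
  }

  measure: user_count {
    label: "User Count"
    description: "The total number of unique users who visited the website."
    type: count_distinct
    sql: ${id} ;;
  }
}
```

Because the label and description values are what Conversational Analytics matches against user phrasing, write them in the business terms that your users actually use.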

For more best practices for writing clean, efficient LookML, see the Looker documentation.

When to add context to LookML versus Conversational Analytics

In Conversational Analytics, you can add context inputs, such as field synonyms and descriptions, both to LookML and inside agent instructions. When you're deciding where to add context, apply the following guidance: context that is always true should be added directly to your LookML model. Looker Explores may be used in multiple places, including both in dashboards and in Conversational Analytics, so context that's applied in LookML must hold true for all possible users who interact with the data.

Agent context should be qualitative and focused on the user, and there can be many agents serving different users from one Explore. Examples of context that should be included in agent instructions, but not in LookML, are as follows:

  • Who is the user that is interacting with the agent? What is their role? Are they internal or external to the company? What is their previous analytics experience?
  • What is the goal of the user? What type of decision are they looking to make at the end of the conversation?
  • What are some types of questions that this user will ask?
  • What are the top fields that are specific to this user? What are fields that this user will never need to use?
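As an illustration, agent instructions that answer these questions might look like the following. The role, fields, and synonym here are hypothetical; adapt them to your own users and Explore.

```
The user is a regional sales manager with limited analytics experience.
They ask questions to decide where to focus next quarter's sales effort.
Typical questions involve revenue, order counts, and top products by region.
Prefer the Total Revenue and Order Count fields; never use internal ID fields.
"GMV" is a synonym for Total Revenue.
```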

This guide recommends the following phased approach for implementing Conversational Analytics in Looker:

  1. Curate data and define the initial scope.
  2. Configure agents and validate internally.
  3. Expand Conversational Analytics adoption to more users.

This approach lets you start with a small, controlled scope, validate your setup, and then expand to more users and data.

Phase 1: Curate data and define the initial scope

In this phase, prepare your data for users to query with Conversational Analytics and define the scope of the initial deployment. Follow these recommendations for starting with a small and controlled scope:

  • Limit initial user access: To enable internal testing and validation, use Looker's permission system to grant the Gemini role to a small set of users who are familiar with the data.
  • Limit Looker model access for Gemini: When you grant the Gemini role, you can also limit which models Gemini can access. To start, consider limiting Gemini access to one or two models that you have curated for Conversational Analytics.
  • Select curated Explores: Start with one or two well-structured Explores that are based on relatively clean data and that provide clear business value. Optimize these Explores for Conversational Analytics in Looker by following the detailed instructions in LookML best practices for Conversational Analytics.
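One way to curate an Explore without disturbing existing content is to extend it and expose only a focused field set, as in the following sketch. The Explore and field names are hypothetical.

```lookml
# A curated copy of an existing Explore for Conversational Analytics.
# The original orders Explore stays untouched for dashboards and other content.
explore: orders_conversational {
  extends: [orders]
  label: "Orders (Conversational Analytics)"
  # Expose only a small set of business-friendly fields.
  fields: [orders.order_status, orders.total_revenue, users.user_count]
}
```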

Phase 2: Configure agents and validate internally

In this phase, build and refine your Conversational Analytics agents, and then thoroughly test them with internal users to confirm accuracy and effectiveness. This phase involves the following steps:

  1. Create curated agents: Create Conversational Analytics agents that are based only on the curated Explores that you prepared during the curation and initial setup phase.
  2. Refine with agent instructions: Use agent instructions to provide additional context and further guidance. For example:

    • Define synonyms for field names or values.
    • Provide specific context or rules for how certain fields should be used.
  3. Validate internally and iterate: Thoroughly test the agents with users who are familiar with the data. Ask various questions, test edge cases, and identify weaknesses. Make the following changes based on feedback from testing:

    1. Refine the LookML. For example, adjust the values for the label, description, or hidden LookML parameters.
    2. Adjust agent instructions.
    3. Continue flagging issues with data quality.

Phase 3: Expand Conversational Analytics adoption to more users

In this phase, expand Conversational Analytics adoption to more users by granting access, collecting feedback, and iterating on your agents. This phase involves the following steps:

  1. Grant targeted access: Grant Conversational Analytics access to additional users who have the Gemini role, and encourage those users to use the specific, vetted agents that you have created.
  2. Launch and collect feedback: Actively solicit feedback on the following topics:

    • Accuracy of responses
    • Ease of use
    • Missing information or confusing results
  3. Iterate continuously: Use feedback to make further refinements to LookML and agent instructions, and prioritize data cleanup efforts.

  4. Expand access: Once the agents prove stable and valuable, grant the Gemini role to other relevant user groups. You can also introduce new curated agents and expand the models that are available to the Gemini role, following the same processes that you used in the previous phases.