This tutorial describes how to set up and use the AlloyDB AI natural language API using the Google Cloud console. You learn how to configure the AlloyDB AI natural language API so that you can ask natural language questions and receive SQL queries and results.
The examples in this tutorial are intended for demonstration purposes only.
Objectives
- Create and populate tables, and use auto generation to create context.
- Create a value index for the columns in the database.
- Create and configure a natural language configuration (
nl_config
) object. - Create templates for a sample query in the application.
- Use the
get_sql()
function to produce a SQL query that answers a question. - Use the
execute_nl_query()
function to answer a natural language question using the database.
Costs
In this document, you use the following billable components of Google Cloud:
To generate a cost estimate based on your projected usage,
use the pricing calculator.
When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.
Before you begin
Enable billing and required APIs
In the Google Cloud console, select a project.
Make sure that billing is enabled for your Google Cloud project.
Enable the Cloud APIs necessary to create and connect to AlloyDB for PostgreSQL.
- In the Confirm project step, click Next to confirm the name of the project you are going to make changes to.
In the Enable APIs step, click Enable to enable the following:
- AlloyDB API
Create and connect to a database
- Create a cluster and its primary instance.
- Connect to your instance and create a database.
- Enable Vertex AI integration. For more information, see Integrate with Vertex AI.
Enable and install the required extension
To install and use the alloydb_ai_nl extension
, you must first enable the
extension using the alloydb_ai_nl.enabled
flag. For more information, see
Configure an instance's database flags.
To install the alloydb_ai_nl
extension, which is the
AlloyDB AI natural language support API, run the following
query:
CREATE EXTENSION alloydb_ai_nl cascade;
Upgrade the alloydb_ai_nl extension
If you already installed the extension, then run the following statement to update it to the latest version:
ALTER EXTENSION alloydb_ai_nl UPDATE;
Create the nla_demo schema and tables
In the following steps, you create the nla_demo
schema and tables in the
schema. You populate the tables with synthetic data. The provided schema and
data are designed to support the fundamental operations of an online retail
business, with potential applications extending to customer management,
analytics, marketing, and operational aspects.
The sample data shows how you can use AlloyDB AI natural language for development, testing, and demonstration purposes, particularly for features like natural language interfaces.
Create the schema by running the following query:
CREATE SCHEMA nla_demo;
Create tables in the
nla_demo
schema. Theaddresses
table stores the address information for customers and orders.CREATE TABLE nla_demo.addresses ( address_id SERIAL PRIMARY KEY, street_address VARCHAR(255) NOT NULL, city VARCHAR(255) NOT NULL, country VARCHAR(255) );
Create the
customers
table by running the following query. This table stores customer information including the customer ID, name, contact details, address reference, birth date, and record creation time.CREATE TABLE nla_demo.customers ( customer_id SERIAL PRIMARY KEY, first_name VARCHAR(255) NOT NULL, last_name VARCHAR(255) NOT NULL, email VARCHAR(255) UNIQUE NOT NULL, address_id INTEGER REFERENCES nla_demo.addresses(address_id), date_of_birth DATE, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP );
Create the
categories
table, which stores product categories.CREATE TABLE nla_demo.categories ( category_id INTEGER PRIMARY KEY, category_name VARCHAR(255) UNIQUE NOT NULL );
Create the
brands
table, which stores brand names.CREATE TABLE nla_demo.brands ( brand_id INTEGER PRIMARY KEY, brand_name VARCHAR(255) NOT NULL );
Create the
products
table, which stores product information like the product ID, name, description, brand, category linkage, and record creation time.CREATE TABLE nla_demo.products ( product_id INTEGER PRIMARY KEY, name VARCHAR(255) NOT NULL, description TEXT DEFAULT 'Not available', brand_id INTEGER REFERENCES nla_demo.brands(brand_id), category_id INTEGER REFERENCES nla_demo.categories(category_id), created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, price DECIMAL(10, 2), description_embedding VECTOR(768) );
Create the
orders
table. This table stores information about customer orders, including the customer, date, total amount, shipping and billing addresses, and order status.CREATE TABLE nla_demo.orders ( order_id INTEGER PRIMARY KEY, customer_id INTEGER REFERENCES nla_demo.customers(customer_id), order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP, total_amount DECIMAL(10, 2) NOT NULL, shipping_address_id INTEGER REFERENCES nla_demo.addresses(address_id), billing_address_id INTEGER REFERENCES nla_demo.addresses(address_id), order_status VARCHAR(50) );
Create the
order_items
table. This table records individual items in an order, links to the order and product variant, and specifies quantity and price.CREATE TABLE nla_demo.order_items ( order_item_id SERIAL PRIMARY KEY, order_id INTEGER REFERENCES nla_demo.orders(order_id), product_id INTEGER REFERENCES nla_demo.products(product_id), quantity INTEGER NOT NULL, price DECIMAL(10, 2) NOT NULL );
Populate tables in the nla_demo schema
Populate the
addresses
table by running the following query:INSERT INTO nla_demo.addresses (street_address, city, country) VALUES ('1800 Amphibious Blvd', 'Mountain View', 'USA'), ('Avenida da Pastelaria, 1903', 'Lisbon', 'Portugal'), ('8 Rue du Nom Fictif 341', 'Paris', 'France');
Populate the
customers
table.INSERT INTO nla_demo.customers (first_name, last_name, email, address_id, date_of_birth) VALUES ('Alex', 'B.', 'alex.b@example.com', 1, '2003-02-20'), ('Amal', 'M.', 'amal.m@example.com', 2, '1998-11-08'), ('Dani', 'G.', 'dani.g@example.com', 3, '2002-07-25');
Populate the
categories
table.INSERT INTO nla_demo.categories (category_id, category_name) VALUES (1, 'Accessories'), (2, 'Apparel'), (3, 'Footwear'), (4, 'Swimwear');
Populate the
brands
table.INSERT INTO nla_demo.brands (brand_id, brand_name) VALUES (1, 'CymbalPrime'), (2, 'CymbalPro'), (3, 'CymbalSports');
Populate the
products
table.INSERT INTO nla_demo.products (product_id, brand_id, category_id, name, description, price) VALUES (1, 1, 2, 'Hoodie', 'A comfortable, casual sweatshirt with an attached hood.', 79.99), (2, 1, 3, 'Running Shoes', 'Lightweight, cushioned footwear designed for the impact of running.', 99.99), (3, 2, 4, 'Swimsuit', 'A garment designed for swimming or other water activities.', 20.00), (4, 3, 1, 'Tote Bag', 'A large, unfastened bag with two parallel handles.', 69.99), (5, 3, 3, 'CymbalShoe', 'Footwear from Cymbal, designed for your life''s rhythm.', 89.99); UPDATE nla_demo.products SET description_embedding = embedding('text-embedding-004', description);
Populate the
orders
table.INSERT INTO nla_demo.orders (order_id, customer_id, total_amount, shipping_address_id, billing_address_id, order_status) VALUES (1, 1, 99.99, 1, 1, 'Shipped'), (2, 1, 69.99, 1, 1, 'Delivered'), (3, 2, 20.99, 2, 2, 'Processing'), (4, 3, 79.99, 3, 3, 'Shipped');
Populate the
order_items
table.INSERT INTO nla_demo.order_items (order_id, product_id, quantity, price) VALUES (1, 1, 1, 79.99), (1, 3, 1, 20.00), (2, 4, 1, 69.99), (3, 3, 1, 20.00), (4, 2, 1, 79.99);
Create a natural language configuration
To use AlloyDB AI natural language, make sure that the
Vertex AI endpoint is configured.
Then you create a configuration and register a schema.
g_alloydb_ai_nl.g_create_configuration
creates the model.
Create a natural language configuration.
SELECT alloydb_ai_nl.g_create_configuration( 'nla_demo_cfg' );
Register tables to the
nla_demo_cfg
config.SELECT alloydb_ai_nl.g_manage_configuration( operation => 'register_table_view', configuration_id_in => 'nla_demo_cfg', table_views_in=>'{nla_demo.customers, nla_demo.addresses, nla_demo.brands, nla_demo.products, nla_demo.categories, nla_demo.orders, nla_demo.order_items}' );
Create and apply context for tables and columns
To provide accurate answers to natural language questions, you use the
AlloyDB AI natural language API to provide context about tables,
views, and columns. You can use the automated context generation feature of the
AlloyDB AI natural language API to produce context from tables
and columns, and apply the context as COMMENTS
attached to tables, views, and
columns.
To generate schema contexts for the tables and their columns that are registered in the
nla_demo_cfg
configuration, run the following:SELECT alloydb_ai_nl.generate_schema_context( 'nla_demo_cfg', TRUE );
The preceding query populates the
alloydb_ai_nl.generated_schema_context_view
view with context. PassingTRUE
overwrites the context in this view from previous runs.To verify the generated context for the
nla_demo.products
table, run the following query:SELECT object_context FROM alloydb_ai_nl.generated_schema_context_view WHERE schema_object = 'nla_demo.products';
The resulting context is similar to the following:
The products table stores information about products, including their name, a brief description, the brand they belong to (referenced by brand_id), and the category they fall under (referenced by category_id). Each product has a unique identifier (product_id) and a timestamp indicating its creation time (created_at).
To verify the produced context for a column, such as
nla_demo.products.name
, run the following:SELECT object_context FROM alloydb_ai_nl.generated_schema_context_view WHERE schema_object = 'nla_demo.products.name';
The query output is similar to the following:
The name column in the nla_demo.products table contains the specific name or title of each product. This is a short, descriptive text string that clearly identifies the product, like "Hoodie," "Tote Bag," "Running Shoes," or "Swimsuit." It helps distinguish individual products within the broader context of their brand and category. The name column specifies the exact product. This column is essential for users and systems to identify and refer to specific products within the database.
Review the generated context in the
alloydb_ai_nl.generated_schema_context_view
view, and update the context that needs revision.SELECT alloydb_ai_nl.update_generated_relation_context( 'nla_demo.products', 'The "nla_demo.products" table stores product details such as ID, name, description, brand, category linkage, and record creation time.' ); SELECT alloydb_ai_nl.update_generated_column_context( 'nla_demo.products.name', 'The "name" column in the "nla_demo.products" table contains the specific name or title of each product.' );
Apply the generated context that you want to attach to the corresponding objects:
SELECT alloydb_ai_nl.apply_generated_relation_context( 'nla_demo.products', true ); SELECT alloydb_ai_nl.apply_generated_column_context( 'nla_demo.products.name', true );
The resulting context entries in the
alloydb_ai_nl.generated_schema_context_view
view are applied to the corresponding schema objects, and the comments are overwritten.
Construct the value index
The AlloyDB AI natural language API produces accurate SQL queries by using value linking. Value linking associates value phrases in natural language statements with pre-registered concept types and column names which can enrich the natural language question.
For example, the question "Give me the price of a Hoodie" can be answered more
accurately if Hoodie
is associated with a product_name
concept, which is
associated with the nla_demo.products.name
.
column.
To define the
product_name
concept type and associate it with thenla_demo.products.name
column, run the following queries:SELECT alloydb_ai_nl.add_concept_type( concept_type_in => 'product_name', match_function_in => 'alloydb_ai_nl.get_concept_and_value_generic_entity_name', additional_info_in => '{ "description": "Concept type for product name.", "examples": "SELECT alloydb_ai_nl.get_concept_and_value_generic_entity_name(''Hoodie'')" }'::jsonb ); SELECT alloydb_ai_nl.associate_concept_type( 'nla_demo.products.name', 'product_name', 'nla_demo_cfg' );
To verify that the
product_name
concept type is added to the list of concept types, run the following query to make sure thatproduct_name
is included in the result of this query:SELECT alloydb_ai_nl.list_concept_types();
To verify that the
nla_demo.products.name
column is associated with theproduct_name
concept type, run the following query:SELECT * FROM alloydb_ai_nl.value_index_columns WHERE column_names = 'nla_demo.products.name';
To define the
brand_name
concept type and associate it with thenla_demo.brands.brand_name
column, run the following queries:SELECT alloydb_ai_nl.add_concept_type( concept_type_in => 'brand_name', match_function_in => 'alloydb_ai_nl.get_concept_and_value_generic_entity_name', additional_info_in => '{ "description": "Concept type for brand name.", "examples": "SELECT alloydb_ai_nl.get_concept_and_value_generic_entity_name(''CymbalPrime'')" }'::jsonb ); SELECT alloydb_ai_nl.associate_concept_type( 'nla_demo.brands.brand_name', 'brand_name', 'nla_demo_cfg' );
After you define the concept types and associate columns with them, create a value index.
SELECT alloydb_ai_nl.create_value_index('nla_demo_cfg'); SELECT alloydb_ai_nl.refresh_value_index('nla_demo_cfg');
Autogenerate concept type associations
Using AlloyDB AI natural language, you can autogenerate
associations based on the existing concept types, instead of having to
manually associate a concept type with columns—for example, manually
calling alloydb_ai_nl.associate_concept_type
.
To autogenerate a concept type association, follow these steps:
Generate associations for all relations within the scope of
nla_demo_cfg
:SELECT alloydb_ai_nl.generate_concept_type_associations('nla_demo_cfg');
Review the generated associations.
SELECT * from alloydb_ai_nl.generated_value_index_columns_view;
The result is similar to the following. Built-in concepts are considered as well as user-defined concepts.
-[ RECORD 1 ]---+----------------------------------------------------------- id | 1 config | nla_demo_cfg column_names | nla_demo.addresses.city concept_type | city_name additional_info | {} -[ RECORD 2 ]---+----------------------------------------------------------- id | 2 config | nla_demo_cfg column_names | nla_demo.addresses.country concept_type | country_name additional_info | {} -[ RECORD 3 ]---+----------------------------------------------------------- id | 3 config | nla_demo_cfg column_names | nla_demo.customers.first_name,nla_demo.customers.last_name concept_type | full_person_name additional_info | {} -[ RECORD 4 ]---+----------------------------------------------------------- id | 4 config | nla_demo_cfg column_names | nla_demo.brands.brand_name concept_type | brand_name additional_info | {} -[ RECORD 5 ]---+----------------------------------------------------------- id | 5 config | nla_demo_cfg column_names | nla_demo.products.name concept_type | product_name additional_info | {} .... ```
Optional: Update or drop the generated associations.
-- Update, NULL means keeping the original value. SELECT alloydb_ai_nl.update_generated_concept_type_associations( id => 1, column_names => NULL, concept_type => 'generic_entity_name', additional_info => NULL ); -- Drop SELECT alloydb_ai_nl.drop_generated_concept_type_association(id => 1);
Apply the generated associations.
SELECT alloydb_ai_nl.apply_generated_concept_type_associations('nla_demo_cfg');
To reflect the changes, refresh the value index.
SELECT alloydb_ai_nl.refresh_value_index('nla_demo_cfg');
Define query templates
You can define templates to improve the quality of the answers produced by the AlloyDB AI natural language API.
To provide example templates for business critical questions, and to provide anticipated questions for which high accuracy is expected, run the following query to add a template:
SELECT alloydb_ai_nl.add_template( nl_config_id => 'nla_demo_cfg', intent => 'List the first names and the last names of all customers who ordered Swimsuit.', sql => 'SELECT c.first_name, c.last_name FROM nla_demo.Customers c JOIN nla_demo.orders o ON c.customer_id = o.customer_id JOIN nla_demo.order_items oi ON o.order_id = oi.order_id JOIN nla_demo.products p ON oi.product_id = p.product_id AND p.name = ''Swimsuit''', sql_explanation => 'To answer this question, JOIN `nla_demo.Customers` with `nla_demo.orders` on having the same `customer_id`, and JOIN the result with nla_demo.order_items on having the same `order_id`. Then JOIN the result with `nla_demo.products` on having the same `product_id`, and filter rwos that with p.name = ''Swimsuit''. Return the `first_name` and the `last_name` of the customers with matching records.', check_intent => TRUE );
To view the list of added templates, query the view
alloydb_ai_nl.template_store_view
:SELECT nl, sql, intent, psql, pintent FROM alloydb_ai_nl.template_store_view WHERE config = 'nla_demo_cfg';
The following output is returned:
nl | List the first names and the last names of all customers who ordered Swimsuit. sql | SELECT c.first_name, c.last_name | FROM nla_demo.Customers c | JOIN nla_demo.orders o ON c.customer_id = o.customer_id | JOIN nla_demo.order_items oi ON o.order_id = oi.order_id | JOIN nla_demo.products p ON oi.product_id = p.product_id | AND p.name = 'Swimsuit' intent | List the first names and the last names of all customers who ordered | Swimsuit. psql | SELECT c.first_name, c.last_name | FROM nla_demo.Customers c JOIN nla_demo.orders o | ON c.customer_id = o.customer_id | JOIN nla_demo.order_items oi ON o.order_id = oi.order_id | JOIN nla_demo.products p ON oi.product_id = p.product_id | AND p.name = $1 pintent | List the first names and the last names of all customers who ordered | $1.
In this template, the value corresponding to the
psql
attribute is the parameterized SQL query, and the value for thepintent
column is the parameterized intent statement. The ID of a recently added template can be different, based on the templates that you added previously. Templates provide highly accurate answers to questions.To create a template that uses semantic search, run the following example statement:
SELECT alloydb_ai_nl.add_template( nl_config_id => 'nla_demo_cfg', intent => 'List 3 products most similar to a Swimwear.', sql => $$SELECT name FROM nla_demo.products ORDER BY description_embedding <=> embedding('text-embedding-004', 'Swimwear')::vector$$, sql_explanation => $$To answer this question, ORDER products in `nla_demo.products` , based by their distance of the descrption_embedding of the product with the embedding of 'Swimwear'.$$, check_intent => TRUE );
The preceding template adds the following row to the view
alloydb_ai_nl.template_store_view
:nl | List 3 products most similar to a Swimwear. sql | SELECT name FROM nla_demo.products | ORDER BY description_embedding <=> | embedding('text-embedding-004', 'Swimwear')::vector intent | List 3 products most similar to a Swimwear. psql | SELECT name FROM nla_demo.products | ORDER BY description_embedding <=> | embedding('text-embedding-004', $1)::vector pintent | List 3 products most similar to a $1.
Define a query fragment
You can define fragments to improve the quality of the answers produced by the AlloyDB AI natural language API.
To provide a fragment for business critical predicates, and to provide anticipated conditions for which high accuracy is expected, run the following query:
SELECT alloydb_ai_nl.add_fragment(
nl_config_id => 'nla_demo_cfg',
table_aliases => ARRAY['nla_demo.products AS T'],
intent => 'luxury product',
fragment => $$description LIKE '%luxury%' OR description LIKE '%premium%' OR description LIKE '%exclusive%' OR description LIKE '%high-end%' OR description LIKE '%finest%' OR description LIKE '%elite%' OR description LIKE '%deluxe%'$$);
To view the list of added fragments, query the view alloydb_ai_nl.fragment_store_view
:
SELECT intent, fragment, pintent
FROM alloydb_ai_nl.fragment_store_view;
The following output is returned:
intent | luxury product
fragment | description LIKE '%luxury%' OR description LIKE '%premium%' OR description LIKE '%exclusive%' OR description LIKE '%high-end%' OR description LIKE '%finest%' OR description LIKE '%elite%' OR description LIKE '%deluxe%'
pintent | luxury product
Generate SQL results from natural language questions
To use the AlloyDB AI natural language API to produce SQL queries and result sets, run the following query:
SELECT alloydb_ai_nl.get_sql( 'nla_demo_cfg', 'Find the customers who purchased Tote Bag.' ) ->> 'sql';
The following output is returned:
SELECT DISTINCT "c"."first_name", "c"."last_name" FROM "nla_demo"."customers" AS "c" JOIN "nla_demo"."orders" AS "o" ON "c"."customer_id" = "o"."customer_id" JOIN "nla_demo"."order_items" AS "oi" ON "o"."order_id" = "oi"."order_id" JOIN "nla_demo"."products" AS "p" ON "oi"."product_id" = "p"."product_id" WHERE "p"."name" = 'Tote Bag';
The JSON output is a SQL query using the template that you added in Define a query template.
To use the AlloyDB AI natural language API to produce SQL queries, run the following query:
SELECT alloydb_ai_nl.get_sql( 'nla_demo_cfg', 'List the maximum price of any CymbalShoe.' ) ->> 'sql';
The following output is returned:
SELECT max("price") FROM "nla_demo"."products" WHERE "name" = 'CymbalShoe'
AlloyDB AI natural language API recognizes that
CymbalShoe
is the name of the product, by using the value index. Run the following query to replaceCymbalShoe
with a brand name (CymbalPrime
):SELECT alloydb_ai_nl.get_sql( 'nla_demo_cfg', 'List the maximum price of any CymbalPrime.' ) ->> 'sql';
The following output is produced:
SELECT max("price") FROM "nla_demo"."products" AS t1 JOIN "nla_demo"."brands" AS t2 ON t1."brand_id" = t2."brand_id" WHERE t2."brand_name" = 'CymbalPrime';
AlloyDB AI uses the value index created in Construct the value index to resolve
CymbalPrime
into thebrand_name
concept type, and uses thenla_demo.brands.brand_name
column associated to thebrand_name
.To use the AlloyDB AI natural language API to produce the result of a question, run the following query:
SELECT alloydb_ai_nl.execute_nl_query( 'nla_demo_cfg', 'Find the last name of the customers who live in Lisbon.' );
The following output is returned:
execute_nl_query -------------------------- {"last_name":"M."}
To use the AlloyDB AI natural language API to produce SQL statements that use semantic search, run the following query:
SELECT alloydb_ai_nl.get_sql( 'nla_demo_cfg', 'List 2 products similar to a Tote Bag.');
The following SQL statement is returned:
SELECT name FROM nla_demo.products ORDER BY description_embedding <=> embedding( 'text-embedding-004', 'Tote Bag')::vector LIMIT 2;
Get SQL summaries
You can produce a result summary from a natural language question based on the data that's stored in the database. This helps end users understand data by posing natural language questions directly.
To get a SQL summary, run the following example query:
SELECT
alloydb_ai_nl.get_sql_summary(
nl_config_id => 'nla_demo_cfg',
nl_question => 'which brands have the largest number of products.'
);
This query returns a JSON object as output, similar to the following:
"answer": "The result set lists three brands: CymbalSports, CymbalPro, and CymbalPrime. Each brand is represented once, suggesting an equal distribution of products across these three brands within the dataset."
Clean up
To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.
The following sections describe how to delete these resources and objects.
Delete the cluster
When you delete the cluster that you created in Before you begin, all of the objects that you created are also deleted.
In the Google Cloud console, go to the Clusters page.
Click the name of your cluster,
my-cluster
, in the Resource name column.Click delete Delete cluster.
In Delete cluster my-cluster, enter
my-cluster
to confirm you want to delete your cluster.Click Delete.
If you created a private connection when you created a cluster, go to the Google Cloud console VPC networks page and click Delete VPC network.
Delete the objects
You can choose to keep the resources that you set up in Before you begin, and you can delete just the objects that you created in the Google Cloud project.
To remove the template that you defined in Define a query template, run the following query:
SELECT alloydb_ai_nl.drop_template(id) FROM alloydb_ai_nl.template_store_view WHERE config = 'nla_demo_cfg';
To remove the autogenerated concept associations generated in Autogenerate concept type associations, run the following query:
SELECT alloydb_ai_nl.drop_generated_concept_type_association(id) FROM alloydb_ai_nl.generated_value_index_columns_view WHERE config = 'nla_demo_cfg';
To remove the
product_name
concept type that you defined in Construct the value index, run the following query:SELECT alloydb_ai_nl.drop_concept_type('product_name');
To refresh the value index after you remove the
product_name
concept type, run the following query:SELECT alloydb_ai_nl.refresh_value_index();
To remove the
nla_demo_cfg
configuration that you created in Create a natural language configuration, run the following query:SELECT alloydb_ai_nl.g_manage_configuration( 'drop_configuration', 'nla_demo_cfg' );
To remove the nla_demo schema and tables that you created and populated in Create the
nla_demo
schema and tables and Populate tables in thenla_demo
schema, run the following query:DROP SCHEMA nla_demo CASCADE;