Introduction to AI in BigQuery

BigQuery offers various AI capabilities that let you do the following:

  • Perform predictive machine learning (ML).
  • Run inference against large language models (LLMs) such as Gemini.
  • Build applications using embeddings and vector search.
  • Use built-in agents to assist with coding.
  • Create data pipelines.
  • Access BigQuery functionality with agent tools.

Machine learning

With BigQuery ML, you can train, evaluate, and run inference on models for tasks such as time series forecasting, anomaly detection, classification, regression, clustering, dimensionality reduction, and recommendations.

You can work with BigQuery ML through the Google Cloud console, the bq command-line tool, the REST API, or Colab Enterprise notebooks. BigQuery ML lets SQL practitioners use their existing SQL tools and skills to build and evaluate models, which democratizes ML. It also speeds up model development by bringing ML to the data instead of requiring data movement.

To learn more, see the Introduction to ML in BigQuery.
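The train, evaluate, and predict cycle described above can be sketched as three SQL statements. The Python below only assembles the statement text; the dataset, table, model, and label names are hypothetical placeholders, and linear regression is an arbitrary choice among the supported model types.

```python
"""Sketch: the BigQuery ML train / evaluate / predict cycle as SQL text."""

# Hypothetical names, used only for illustration.
MODEL = "mydataset.trip_duration_model"
TRAINING_TABLE = "mydataset.trips_train"
NEW_ROWS_TABLE = "mydataset.trips_new"
LABEL = "duration_minutes"


def create_model_sql(model: str, table: str, label: str) -> str:
    """Build a CREATE MODEL statement that trains a linear regression."""
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = 'LINEAR_REG', input_label_cols = ['{label}']) AS\n"
        f"SELECT * FROM `{table}`"
    )


def evaluate_sql(model: str) -> str:
    """Build a query that returns evaluation metrics for a trained model."""
    return f"SELECT * FROM ML.EVALUATE(MODEL `{model}`)"


def predict_sql(model: str, table: str) -> str:
    """Build a query that runs inference over the rows of `table`."""
    return f"SELECT * FROM ML.PREDICT(MODEL `{model}`, TABLE `{table}`)"


STATEMENTS = [
    create_model_sql(MODEL, TRAINING_TABLE, LABEL),
    evaluate_sql(MODEL),
    predict_sql(MODEL, NEW_ROWS_TABLE),
]

if __name__ == "__main__":
    for statement in STATEMENTS:
        print(statement + ";\n")
```

Each statement can be pasted into the query editor in the Google Cloud console or passed to the bq command-line tool. Because the model is trained where the data already lives, no export step appears anywhere in the cycle.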

AI functions

BigQuery offers various SQL functions that you can use for AI tasks such as text generation, text or unstructured data analysis, and translation. To perform these tasks, the functions call Gemini and partner LLMs available from Vertex AI, Cloud AI APIs, or built-in BigQuery models.

AI functions fall into the following categories:

  • Generative AI functions. These functions help you perform tasks such as content generation, analysis, summarization, structured data extraction, classification, embedding generation, and data enrichment. There are two types of generative AI functions:

    • General-purpose AI functions give you full control over, and transparency into, the choice of model, prompt, and parameters to use.
    • Managed AI functions offer a streamlined syntax for routine tasks such as filtering, rating, and classification. BigQuery chooses a model for you, optimized for cost and quality.
  • Task-specific functions. These functions help you use Cloud AI APIs for tasks such as translation and the analysis of text or unstructured data.

For more information, see Task-specific solutions overview.
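As an illustration of the control that a general-purpose generative function gives over model, prompt, and parameters, the sketch below assembles an ML.GENERATE_TEXT query. It assumes that a remote model over a Gemini endpoint already exists. The model, table, and column names are hypothetical, and the parameter values are arbitrary.

```python
"""Sketch: building an ML.GENERATE_TEXT query with explicit model, prompt, and parameters."""


def sql_string_literal(value: str) -> str:
    """Quote a Python string as a single-quoted SQL string literal."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def generate_text_sql(
    model: str,
    table: str,
    text_column: str,
    instruction: str,
    temperature: float = 0.2,
    max_output_tokens: int = 256,
) -> str:
    """Build a query that prompts the model once per row of `table`."""
    # The prompt is the instruction followed by the row's text column.
    prompt_expr = f"CONCAT({sql_string_literal(instruction + ' ')}, {text_column})"
    return (
        "SELECT *\n"
        "FROM ML.GENERATE_TEXT(\n"
        f"  MODEL `{model}`,\n"
        f"  (SELECT {text_column}, {prompt_expr} AS prompt FROM `{table}`),\n"
        f"  STRUCT({temperature} AS temperature,\n"
        f"         {max_output_tokens} AS max_output_tokens,\n"
        "         TRUE AS flatten_json_output)\n"
        ")"
    )


# Hypothetical model and table names, used only for illustration.
SQL = generate_text_sql(
    model="mydataset.gemini_model",
    table="mydataset.reviews",
    text_column="review_text",
    instruction="Summarize this customer review in one sentence:",
)

if __name__ == "__main__":
    print(SQL)
```

The caller picks the model, writes the prompt, and sets the sampling parameters. With a managed AI function, BigQuery makes those choices instead.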

Search

BigQuery offers a variety of search functions and features to help you efficiently find specific data or discover similarities between data, including multimodal data.

  • Text search. You can use the SEARCH function to perform tokenized search on unstructured text or semi-structured JSON data. You can improve search performance by creating a search index, which lets BigQuery optimize queries that use the SEARCH function, as well as other functions and operators. For more information, see Search indexed data.
  • Embedding generation. You can generate multimodal embeddings by using models provided by or hosted on Vertex AI, or by using models imported and run in BigQuery.
  • Vector search. You can use the VECTOR_SEARCH function to search embeddings to find semantically similar items. Embeddings are high-dimensional numerical vectors that represent entities such as text or images, and are often generated by ML models. You can improve vector search performance by creating a vector index, which uses approximate nearest neighbor (ANN) search techniques to return results faster in exchange for approximate rather than exact matches. Common use cases for vector search include semantic search, recommendation, and retrieval-augmented generation (RAG). For more information, see Introduction to vector search.
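To make the idea of finding semantically similar items concrete, the sketch below runs an exact (brute-force) nearest-neighbor search over three tiny hand-written vectors using cosine distance. The vectors and item names are invented for illustration. Real embeddings have hundreds or thousands of dimensions and come from a model. The accompanying SQL string shows what an equivalent VECTOR_SEARCH call might look like against hypothetical tables.

```python
"""Sketch: exact nearest-neighbor search over embeddings with cosine distance."""
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def top_k(query: list[float], base: dict[str, list[float]], k: int) -> list[tuple[str, float]]:
    """Compare the query with every base vector and keep the k closest."""
    scored = [(cosine_distance(query, vec), key) for key, vec in base.items()]
    return [(key, dist) for dist, key in sorted(scored)[:k]]


# Invented 3-dimensional "embeddings", for illustration only.
BASE = {
    "running shoes": [0.9, 0.1, 0.0],
    "trail sneakers": [0.8, 0.2, 0.1],
    "espresso machine": [0.0, 0.1, 0.9],
}
QUERY = [1.0, 0.0, 0.0]

NEAREST = top_k(QUERY, BASE, k=2)

# A comparable query in BigQuery; table and column names are hypothetical.
VECTOR_SEARCH_SQL = """
SELECT query.id AS query_id, base.id AS match_id, distance
FROM VECTOR_SEARCH(
  TABLE `mydataset.products`, 'embedding',
  (SELECT id, embedding FROM `mydataset.queries`),
  top_k => 2, distance_type => 'COSINE')
"""

if __name__ == "__main__":
    for name, dist in NEAREST:
        print(f"{name}: {dist:.4f}")
```

The brute-force loop compares the query with every stored vector, which is exact but scales linearly with the size of the table. A vector index avoids scanning everything, which is where the speed gain and the approximation both come from.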

Assistive AI features

AI-powered assistance features in BigQuery, collectively referred to as Gemini in BigQuery, help you discover, prepare, query, and visualize your data.

  • Data insights. Generate natural language questions about your data, along with the SQL queries to answer those questions.
  • Data preparation. Generate context-aware recommendations to clean, transform, and enrich your data.
  • SQL code assist. Generate, complete, and explain SQL queries.
  • Python code assist. Generate, complete, and explain Python code, including PySpark and BigQuery DataFrames.
  • Data canvas. Query your data using natural language, visualize results with charts, and ask follow-up questions.
  • SQL translator. Create Gemini-enhanced SQL translation rules to help you migrate queries written in a different dialect to GoogleSQL.

Agents

Agents are software tools that can use AI to complete tasks on your behalf. You can use built-in agents or create your own agents to help you process, manage, analyze, and visualize your data:

  • Use the Data Science Agent to automate exploratory data analysis, data processing, ML tasks, and visualization insights within a Colab Enterprise notebook.

  • Use the Data Engineering Agent to build, modify, and manage data pipelines to load and process data in BigQuery. You can use natural language prompts to generate data pipelines from various data sources or adapt existing data pipelines to suit your data engineering needs.

  • Use the Gemini CLI to interact with BigQuery data in your terminal by using natural language prompts.

  • Use the MCP toolbox to connect your own AI tool to BigQuery and interact with your data.

What's next