Introduction to AI in BigQuery
BigQuery offers various AI capabilities that let you do the following:
- Perform predictive machine learning (ML).
- Run inference against large language models (LLMs) such as Gemini.
- Build applications using embeddings and vector search.
- Use built-in agents to assist with coding.
- Create data pipelines.
- Access BigQuery functionality with agent tools.
Machine learning
With BigQuery ML, you can train, evaluate, and run inference on models for tasks such as time series forecasting, anomaly detection, classification, regression, clustering, dimensionality reduction, and recommendations.
You can work with BigQuery ML capabilities through the Google Cloud console, the bq command-line tool, the REST API, or in Colab Enterprise notebooks. Because BigQuery ML lets SQL practitioners use existing SQL tools and skills to build and evaluate models, it democratizes ML and speeds up model development by bringing ML to the data instead of requiring data movement. You can use BigQuery ML to help you with the following types of ML tasks:
- Create and run ML models by using GoogleSQL queries.
- Create Colab Enterprise notebooks to perform ML workflows. Notebooks let you use SQL and Python interchangeably, and use any AI or ML Python libraries for your development.
- Understand the results of your predictive ML models with explainable AI.
- Use the TimesFM, ARIMA_PLUS, and ARIMA_PLUS_XREG models to perform forecasting and anomaly detection on time series data.
- Generate insights about changes to key metrics in your multi-dimensional data with contribution analysis.
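As a rough illustration of creating and running a model with GoogleSQL, the following Python sketch assembles a CREATE MODEL statement and an ML.PREDICT query as plain strings. The helper functions and every dataset, table, model, and column name in it are hypothetical placeholders chosen for this example; they are not part of any BigQuery client library.

```python
def create_model_sql(model: str, source_table: str, label_col: str,
                     model_type: str = "LINEAR_REG") -> str:
    """Build a GoogleSQL CREATE MODEL statement.

    All identifiers passed in are placeholders supplied by the caller.
    """
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = '{model_type}', "
        f"input_label_cols = ['{label_col}']) AS\n"
        f"SELECT * FROM `{source_table}`\n"
        f"WHERE {label_col} IS NOT NULL"
    )


def predict_sql(model: str, input_table: str) -> str:
    """Build a GoogleSQL query that runs inference with ML.PREDICT."""
    return (
        "SELECT *\n"
        "FROM ML.PREDICT(\n"
        f"  MODEL `{model}`,\n"
        f"  (SELECT * FROM `{input_table}`))"
    )


if __name__ == "__main__":
    # Hypothetical names, used only to show the shape of the statements.
    print(create_model_sql("mydataset.weight_model",
                           "mydataset.penguins", "body_mass_g"))
    print()
    print(predict_sql("mydataset.weight_model", "mydataset.new_penguins"))
```

The generated text could then be run like any other query, for example in the Google Cloud console or with the bq command-line tool.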
To learn more, see the Introduction to ML in BigQuery.
AI functions
BigQuery offers various SQL functions that you can use for AI tasks such as text generation, text or unstructured data analysis, and translation. These functions access Gemini and partner LLM models available from Vertex AI, Cloud AI APIs, or built-in BigQuery models to perform these tasks.
There are several categories of AI functions:
Generative AI functions. These functions help you perform tasks such as content generation, analysis, summarization, structured data extraction, classification, embedding generation, and data enrichment. There are two types of generative AI functions:
- General-purpose AI functions give you full control over, and transparency into, the choice of model, prompt, and parameters to use.
- Managed AI functions offer a streamlined syntax for routine tasks such as filtering, rating, and classification. BigQuery chooses a model for you, optimized for cost and quality.
Task-specific functions. These functions help you use Cloud AI APIs for specific tasks such as translation and the analysis of text or unstructured data.
For more information, see Task-specific solutions overview.
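To make the idea of choosing the model, prompt, and parameters yourself more concrete, here is a hedged Python sketch that builds a query around the ML.GENERATE_TEXT function. It assumes that a remote model over a Gemini endpoint already exists; the model name, table name, column name, and helper functions are all hypothetical and serve only as an illustration.

```python
def sql_string_literal(text: str) -> str:
    """Quote text as a single-quoted GoogleSQL string literal."""
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def generate_text_sql(model: str, source_table: str, text_col: str,
                      instruction: str, temperature: float = 0.2,
                      max_output_tokens: int = 256) -> str:
    """Build a query that sends one prompt per row to ML.GENERATE_TEXT.

    The caller picks the model, the prompt text, and the parameters.
    """
    return (
        "SELECT *\n"
        "FROM ML.GENERATE_TEXT(\n"
        f"  MODEL `{model}`,\n"
        f"  (SELECT CONCAT({sql_string_literal(instruction)}, {text_col}) AS prompt\n"
        f"   FROM `{source_table}`),\n"
        f"  STRUCT({temperature} AS temperature, "
        f"{max_output_tokens} AS max_output_tokens))"
    )


if __name__ == "__main__":
    # Hypothetical model and table, assumed to exist for this sketch.
    print(generate_text_sql("mydataset.gemini_model", "mydataset.reviews",
                            "review_text",
                            "Summarize this customer's review: "))
```

Managed AI functions need less of this scaffolding, because BigQuery selects the model for you.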
Search
BigQuery offers a variety of search functions and features to help you efficiently find specific data or discover similarities between data, including multimodal data.
- Text search. You can use the SEARCH function to perform tokenized search on unstructured text or semi-structured JSON data. You can improve search performance by creating a search index, which lets BigQuery optimize queries that use the SEARCH function, as well as other functions and operators. For more information, see Search indexed data.
- Embedding generation. You can generate multimodal embeddings by using models provided by or hosted on Vertex AI, or by using models imported and run in BigQuery.
- Vector search. You can use the VECTOR_SEARCH function to search embeddings to find semantically similar items. Embeddings are high-dimensional numerical vectors that represent entities such as text or images, and are often generated by ML models. You can improve vector search performance by creating a vector index, which uses Approximate Nearest Neighbor search techniques to return results faster at the cost of some exactness. Common use cases for vector search include semantic search, recommendation, and retrieval-augmented generation (RAG). For more information, see Introduction to vector search.
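The idea behind vector search can be sketched in a few lines of plain Python. The example below performs an exact, brute-force nearest-neighbor search using cosine distance over a handful of made-up two-dimensional embeddings. It is a conceptual illustration only and does not reflect how BigQuery implements VECTOR_SEARCH; the function names and sample vectors are assumptions made for this sketch.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity; smaller means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query: Sequence[float], items: Dict[str, Sequence[float]],
          k: int) -> List[Tuple[str, float]]:
    """Compare the query with every item and keep the k closest ones."""
    scored = [(name, cosine_distance(query, vec))
              for name, vec in items.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


if __name__ == "__main__":
    # Toy embeddings; real embeddings have hundreds of dimensions or more.
    embeddings = {
        "cat": [1.0, 0.0],
        "kitten": [0.9, 0.1],
        "car": [0.0, 1.0],
    }
    for name, distance in top_k([1.0, 0.05], embeddings, k=2):
        print(f"{name}: {distance:.4f}")
```

Because this scan compares the query against every stored vector, its cost grows linearly with the number of items. A vector index avoids the full scan by examining only a subset of candidates, which is why its results are approximate.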
Assistive AI features
AI-powered assistance features in BigQuery, collectively referred to as Gemini in BigQuery, help you discover, prepare, query, and visualize your data.
- Data insights. Generate natural language questions about your data, along with the SQL queries to answer those questions.
- Data preparation. Generate context-aware recommendations to clean, transform, and enrich your data.
- SQL code assist. Generate, complete, and explain SQL queries.
- Python code assist. Generate, complete, and explain Python code, including PySpark and BigQuery DataFrames.
- Data canvas. Query your data using natural language, visualize results with charts, and ask follow-up questions.
- SQL translator. Create Gemini-enhanced SQL translation rules to help you migrate queries written in a different dialect to GoogleSQL.
Agents
Agents are software tools that can use AI to complete tasks on your behalf. You can use built-in agents or create your own agents to help you process, manage, analyze, and visualize your data:
- Use the Data Science Agent to automate exploratory data analysis, data processing, ML tasks, and visualization insights within a Colab Enterprise notebook.
- Use the Data Engineering Agent to build, modify, and manage data pipelines to load and process data in BigQuery. You can use natural language prompts to generate data pipelines from various data sources or adapt existing data pipelines to suit your data engineering needs.
- Use the Gemini CLI to interact with BigQuery data in your terminal by using natural language prompts.
- Use the MCP toolbox to connect your own AI tool to BigQuery and interact with your data.
What's next
- For more information about ML, see Introduction to ML in BigQuery.
- For more information about generative AI functions in SQL, see Generative AI overview.
- For more information about searching your data, see Search indexed data and Introduction to vector search.
- For more information about assistive AI features, see Gemini in BigQuery.
- For more information about using agents with BigQuery, see Use BigQuery with MCP, Gemini CLI, and other agents.