Forecasting overview

Forecasting is a technique in which you analyze historical data to make an informed prediction about future trends. For example, you might analyze historical sales data from several store locations to predict future sales at those locations. In BigQuery ML, you perform forecasting on time series data.

You can perform forecasting in the following ways:

  • By using the AI.FORECAST function with the built-in TimesFM model. Use this approach when you need to forecast future values for a single variable, and don't require the ability to fine-tune the model. This approach doesn't require you to create and manage a model.
  • By using the ML.FORECAST function with the ARIMA_PLUS model. Use this approach when you need to run an ARIMA-based modeling pipeline and decompose the time series into multiple components in order to explain the results. This approach requires you to create and manage a model.
  • By using the ML.FORECAST function with the ARIMA_PLUS_XREG model. Use this approach when you need to forecast future values for multiple variables. This approach requires you to create and manage a model.

ARIMA_PLUS and ARIMA_PLUS_XREG time series models aren't single models. Each one is a time series modeling pipeline that includes multiple models and algorithms. For more information, see Time series modeling pipeline.
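To make the difference between the approaches concrete, the following minimal sketch composes the SQL text for a single AI.FORECAST call and for the two-step ARIMA_PLUS flow (CREATE MODEL, then ML.FORECAST). It only builds strings and doesn't connect to BigQuery. The table `mydataset.store_sales`, the model name `mydataset.sales_arima`, the column names, and the helper functions are hypothetical, chosen for illustration only; check the function reference for the full argument lists before running the generated SQL.

```python
# Sketch only: builds SQL text and does not connect to BigQuery.
# All table, model, and column names used here are hypothetical.


def timesfm_forecast_sql(table, data_col, timestamp_col, id_col, horizon):
    """Render one AI.FORECAST call (built-in TimesFM model, no CREATE MODEL step)."""
    return (
        "SELECT *\n"
        "FROM AI.FORECAST(\n"
        f"  TABLE `{table}`,\n"
        f"  data_col => '{data_col}',\n"
        f"  timestamp_col => '{timestamp_col}',\n"
        f"  id_cols => ['{id_col}'],\n"
        f"  horizon => {horizon}\n"
        ");"
    )


def arima_plus_sql(model, table, data_col, timestamp_col, id_col, horizon):
    """Render the CREATE MODEL statement and the ML.FORECAST call that uses it."""
    create = (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        "OPTIONS (\n"
        "  model_type = 'ARIMA_PLUS',\n"
        f"  time_series_timestamp_col = '{timestamp_col}',\n"
        f"  time_series_data_col = '{data_col}',\n"
        f"  time_series_id_col = '{id_col}'\n"
        ") AS\n"
        f"SELECT {timestamp_col}, {data_col}, {id_col}\n"
        f"FROM `{table}`;"
    )
    forecast = (
        "SELECT *\n"
        "FROM ML.FORECAST(\n"
        f"  MODEL `{model}`,\n"
        f"  STRUCT({horizon} AS horizon, 0.9 AS confidence_level)\n"
        ");"
    )
    return create, forecast


# Approach 1: a single function call, nothing to create or manage.
timesfm_query = timesfm_forecast_sql(
    "mydataset.store_sales", "total_sales", "sale_date", "store_id", 30
)

# Approach 2: train a model first, then forecast from it.
create_stmt, forecast_query = arima_plus_sql(
    "mydataset.sales_arima",
    "mydataset.store_sales",
    "total_sales",
    "sale_date",
    "store_id",
    30,
)

print(timesfm_query)
print(create_stmt)
print(forecast_query)
```

The TimesFM path produces one statement, while the ARIMA_PLUS path produces two and leaves a model in the dataset that you then manage.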

Compare the TimesFM and ARIMA models

Use the following table to determine whether to use AI.FORECAST with the built-in TimesFM model or ML.FORECAST with an ARIMA_PLUS or ARIMA_PLUS_XREG model for your use case:

| Feature | AI.FORECAST with a TimesFM model | ML.FORECAST with an ARIMA_PLUS or ARIMA_PLUS_XREG model |
| --- | --- | --- |
| Model type | Transformer-based foundation model. | Statistical model that uses the ARIMA algorithm for the trend component, and a variety of other algorithms for non-trend components. For more information, see Time series modeling pipeline. |
| Training required | No, the TimesFM model is pre-trained. | Yes, one ARIMA_PLUS or ARIMA_PLUS_XREG model is trained for each time series. |
| SQL ease of use | Very high. Requires a single function call. | High. Requires a CREATE MODEL statement and a function call. |
| Data history used | Uses 512 time points. | Uses all time points in the training data, but can be customized to use fewer time points. |
| Accuracy | Very high. Outperforms a number of other models. For more information, see A Decoder-only Foundation Model for Time-series Forecasting. | Very high, on par with the TimesFM model. |
| Customization | Low. | High. The CREATE MODEL statement offers arguments that let you tune many model settings, such as seasonality, holiday effects, step changes, trend, spikes and dips removal, and forecasting upper and lower bounds. |
| Supports covariates | No. | Yes, when using the ARIMA_PLUS_XREG model. |
| Explainability | Low. | High. You can use the ML.EXPLAIN_FORECAST function to inspect model components. |
| Best use cases | Quick forecasts; need minimal setup. | Model needs fine tuning; need explainability for model output; model input needs more context. |
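
As an illustration of the customization and explainability rows, the following sketch renders a CREATE MODEL statement with a few ARIMA_PLUS tuning options and an ML.EXPLAIN_FORECAST query. As an assumption for this example, the option names (`holiday_region`, `clean_spikes_and_dips`, `adjust_step_changes`, `forecast_limit_lower_bound`) are taken to match the CREATE MODEL reference for ARIMA_PLUS, so verify them there before use. The model, table, column names, and helper functions are hypothetical, and the code only builds SQL text.

```python
# Sketch only: builds SQL text and does not connect to BigQuery.
# Model, table, and column names are hypothetical.


def sql_literal(value):
    """Render a Python value as a SQL literal for an OPTIONS clause."""
    # bool is checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    return f"'{value}'"


def create_model_sql(model, options, training_query):
    """Render CREATE MODEL with one OPTIONS entry per dictionary item."""
    rendered = ",\n".join(
        f"  {name} = {sql_literal(value)}" for name, value in options.items()
    )
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (\n{rendered}\n) AS\n"
        f"{training_query};"
    )


def explain_forecast_sql(model, horizon, confidence_level):
    """Render an ML.EXPLAIN_FORECAST query for a trained model."""
    return (
        "SELECT *\n"
        "FROM ML.EXPLAIN_FORECAST(\n"
        f"  MODEL `{model}`,\n"
        f"  STRUCT({horizon} AS horizon, {confidence_level} AS confidence_level)\n"
        ");"
    )


tuning_options = {
    "model_type": "ARIMA_PLUS",
    "time_series_timestamp_col": "sale_date",
    "time_series_data_col": "total_sales",
    "holiday_region": "US",            # holiday effects
    "clean_spikes_and_dips": True,     # spikes and dips removal
    "adjust_step_changes": True,       # step changes
    "forecast_limit_lower_bound": 0,   # forecasting lower bound
}

tuned_create = create_model_sql(
    "mydataset.sales_arima_tuned",
    tuning_options,
    "SELECT sale_date, total_sales\nFROM `mydataset.store_sales`",
)
explain_query = explain_forecast_sql("mydataset.sales_arima_tuned", 30, 0.9)

print(tuned_create)
print(explain_query)
```

Keeping the options in a dictionary makes it easy to see which settings differ from the defaults. The AI.FORECAST path has no equivalent step because there is no model to configure.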

By using the default settings of BigQuery ML's statements and functions, you can create and use a forecasting model even without much ML knowledge. However, having basic knowledge about ML development, and forecasting models in particular, helps you optimize both your data and your model to deliver better results. We recommend using the following resources to develop familiarity with ML techniques and processes: