Class Pipeline (2.25.0)

Pipeline(steps: typing.List[typing.Tuple[str, bigframes.ml.base.BaseEstimator]])

Pipeline of transforms with a final estimator.

Sequentially apply a list of transforms and a final estimator. Intermediate steps of the pipeline must be transforms. That is, they must implement fit and transform methods. The final estimator only needs to implement fit.

The purpose of the pipeline is to assemble several steps that can be cross-validated together while setting different parameters. This simplifies code and allows for deploying an estimator and preprocessing together, e.g. with Pipeline.to_gbq(...).
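As a minimal sketch of assembling such a pipeline, the function below pairs one transform with a final estimator. It assumes the `bigframes` package is installed; the choice of `StandardScaler` and `LinearRegression` and the step names `"scale"` and `"linreg"` are illustrative, not prescribed by this page.

```python
def build_pipeline():
    """Assemble a two-step pipeline: one transform, then a final estimator."""
    # Imported lazily so this sketch can be loaded without bigframes installed.
    from bigframes.ml.linear_model import LinearRegression
    from bigframes.ml.pipeline import Pipeline
    from bigframes.ml.preprocessing import StandardScaler

    # Each step is a (name, estimator) tuple, matching the `steps` signature.
    return Pipeline(
        [
            ("scale", StandardScaler()),     # intermediate step: a transform
            ("linreg", LinearRegression()),  # final step: only needs fit
        ]
    )
```

Calling `build_pipeline()` returns an unfitted `Pipeline`; fitting it requires a configured BigQuery session.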

Methods

__repr__

__repr__()

Print the estimator's constructor with all non-default parameter values.

fit

fit(
    X: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y: typing.Optional[
        typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series]
    ] = None,
) -> bigframes.ml.pipeline.Pipeline

Fit the model.

Fit all the transformers one after the other and transform the data. Finally, fit the transformed data using the final estimator.

Parameters
Name	Description
`X`	`bigframes.dataframe.DataFrame or bigframes.series.Series` A DataFrame or Series representing training data. Must match the input requirements of the first step of the pipeline.
`y`	`bigframes.dataframe.DataFrame or bigframes.series.Series` A DataFrame or Series representing training targets, if applicable.

Returns
Type	Description
`Pipeline`	Pipeline with fitted steps.
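The fitting order described above can be illustrated with a toy sketch in plain Python. This is not the bigframes implementation: `Center`, `Recorder`, and `fit_steps` are hypothetical names, and plain lists stand in for DataFrames.

```python
from typing import Any, List, Optional, Tuple


class Center:
    """Toy transform (hypothetical): subtracts the training mean."""

    def fit(self, X: List[float], y: Optional[List[float]] = None) -> "Center":
        self.mean_ = sum(X) / len(X)
        return self

    def transform(self, X: List[float]) -> List[float]:
        return [x - self.mean_ for x in X]


class Recorder:
    """Toy final estimator (hypothetical): remembers the data it was fitted on."""

    def fit(self, X: List[float], y: Optional[List[float]] = None) -> "Recorder":
        self.seen_ = list(X)
        return self


def fit_steps(steps: List[Tuple[str, Any]], X, y=None):
    """Fit each transform in order, feeding its output to the next step."""
    *transforms, (_, final) = steps
    for _, transform in transforms:
        X = transform.fit(X, y).transform(X)
    # The final estimator is fitted on the fully transformed data.
    final.fit(X, y)
    return steps


steps = fit_steps([("center", Center()), ("rec", Recorder())], [1.0, 2.0, 3.0])
print(steps[-1][1].seen_)  # prints [-1.0, 0.0, 1.0]
```

The final estimator never sees the raw input: it is fitted on the output of the last transform.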

get_params

get_params(deep: bool = True) -> typing.Dict[str, typing.Any]

Get parameters for this estimator.

Parameter
Name	Description
`deep`	`bool, default True` If True, return the parameters for this estimator and for contained subobjects that are estimators.

Returns
Type	Description
`Dictionary`	A dictionary of parameter names mapped to their values.

predict

predict(
    X: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
) -> bigframes.dataframe.DataFrame

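Predict using the fitted pipeline. The input `X` passes through the pipeline's transforms and then the final estimator's `predict`. `X` may be a BigQuery DataFrames or pandas DataFrame or Series; the result is a BigQuery DataFrames DataFrame.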

score

score(
    X: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y: typing.Optional[
        typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series]
    ] = None,
) -> bigframes.dataframe.DataFrame

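Score the fitted pipeline on `X` and, if applicable, `y`. Returns a DataFrame of evaluation metrics; which metrics appear depends on the final estimator.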
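The sketch below shows one way `fit`, `predict`, and `score` might be chained. The helper name `fit_predict_score` and its argument names are hypothetical; it assumes `pipeline` is a `Pipeline` instance and that the train and test splits were prepared beforehand.

```python
def fit_predict_score(pipeline, X_train, y_train, X_test, y_test):
    """Fit on the training split, then predict and score on the test split."""
    fitted = pipeline.fit(X_train, y_train)  # fit returns the fitted pipeline
    predictions = fitted.predict(X_test)     # DataFrame of predictions
    metrics = fitted.score(X_test, y_test)   # DataFrame of evaluation metrics
    return predictions, metrics
```

Because `fit` returns the pipeline itself, the calls can also be chained directly, for example `pipeline.fit(X, y).predict(X_new)`.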

to_gbq

to_gbq(model_name: str, replace: bool = False) -> bigframes.ml.pipeline.Pipeline

Save the pipeline to BigQuery.

Parameters
Name	Description
`model_name`	`str` The name of the model (pipeline).
`replace`	`bool, default False` Whether to replace the model (pipeline) if it already exists. Defaults to False.

Returns
Type	Description
`Pipeline`	The saved model (pipeline).
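A minimal sketch of fitting and then saving a pipeline follows. The helper name `fit_and_save` is hypothetical, and the caller supplies the model name; the sketch assumes `pipeline` is a `Pipeline` instance and that the BigQuery session has permission to write the model.

```python
def fit_and_save(pipeline, X, y, model_name: str):
    """Fit `pipeline`, then persist it to BigQuery under `model_name`."""
    fitted = pipeline.fit(X, y)
    # replace=True asks to_gbq to replace an existing model of the same name;
    # the default is replace=False.
    return fitted.to_gbq(model_name, replace=True)
```

The return value is the saved pipeline, so it can be used for `predict` or `score` right away.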