Class SimpleImputer (1.31.0)

SimpleImputer(strategy: typing.Literal["mean", "median", "most_frequent"] = "mean")

Univariate imputer for completing missing values with simple strategies.

Replace missing values using a descriptive statistic (e.g. mean, median, or most frequent) along each column.

Examples:

>>> import bigframes.pandas as bpd
>>> from bigframes.ml.impute import SimpleImputer
>>> bpd.options.display.progress_bar = None
>>> X_train = bpd.DataFrame({"feat0": [7.0, 4.0, 10.0], "feat1": [2.0, None, 5.0], "feat2": [3.0, 6.0, 9.0]})
>>> imp_mean = SimpleImputer().fit(X_train)
>>> X_test = bpd.DataFrame({"feat0": [None, 4.0, 10.0], "feat1": [2.0, None, None], "feat2": [3.0, 6.0, 9.0]})
>>> imp_mean.transform(X_test)
   imputer_feat0  imputer_feat1  imputer_feat2
0            7.0            2.0            3.0
1            4.0            3.5            6.0
2           10.0            3.5            9.0
<BLANKLINE>
[3 rows x 3 columns]
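The fill values in the output above come from the training data, not from the frame being transformed. The following is a minimal plain-Python sketch of that fit/transform split under the default 'mean' strategy. The helpers `fit_means` and `transform_with` are hypothetical names used only for illustration; they are not part of bigframes, and the real estimator runs the computation in BigQuery.

```python
from statistics import mean


def fit_means(train):
    """Learn one mean per column from training data, skipping missing (None) values."""
    return {
        name: mean([v for v in column if v is not None])
        for name, column in train.items()
    }


def transform_with(stats, data):
    """Replace each missing value with the statistic learned for its column."""
    return {
        name: [stats[name] if v is None else v for v in column]
        for name, column in data.items()
    }


# Same data as the example above, held in plain dicts of lists.
X_train = {"feat0": [7.0, 4.0, 10.0], "feat1": [2.0, None, 5.0], "feat2": [3.0, 6.0, 9.0]}
X_test = {"feat0": [None, 4.0, 10.0], "feat1": [2.0, None, None], "feat2": [3.0, 6.0, 9.0]}

stats = fit_means(X_train)              # statistics come from X_train only
result = transform_with(stats, X_test)  # X_test gaps are filled with those statistics
print(result)
```

Note that feat1 in X_test has only one observed value (2.0), yet its gaps are filled with 3.5, the mean of feat1 in X_train.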

Parameter

Name Description
strategy {'mean', 'median', 'most_frequent'}, default='mean'

The imputation strategy. 'mean': replace missing values using the mean of each column. 'median': replace missing values using the median of each column. 'most_frequent': replace missing values using the most frequent value of each column.
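To make the three strategies concrete, here is a small plain-Python sketch that computes the fill value for a single column. The function names `column_statistic` and `impute_column` are hypothetical and exist only for this illustration. How the real imputer breaks ties for 'most_frequent' is not stated here, so the sample column is chosen to have no ties.

```python
from collections import Counter
from statistics import mean, median


def column_statistic(values, strategy="mean"):
    """Return the fill value for one column, ignoring missing (None) entries."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        return mean(observed)
    if strategy == "median":
        return median(observed)
    if strategy == "most_frequent":
        # Assumption: with ties, this sketch picks the first value seen;
        # the real imputer may behave differently.
        return Counter(observed).most_common(1)[0][0]
    raise ValueError(f"unknown strategy: {strategy!r}")


def impute_column(values, strategy="mean"):
    """Fill the missing entries of one column with the chosen statistic."""
    fill = column_statistic(values, strategy)
    return [fill if v is None else v for v in values]


column = [1.0, None, 1.0, 4.0, 10.0]
for strategy in ("mean", "median", "most_frequent"):
    print(strategy, impute_column(column, strategy))
```

On this column the three strategies give different fill values (4.0, 2.5, and 1.0), which shows why the choice matters for skewed data.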

Methods

__repr__

__repr__()

Print the estimator's constructor with all non-default parameter values.

fit

fit(
    X: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
    y=None,
) -> bigframes.ml.impute.SimpleImputer

Fit the imputer on X.

Parameters
Name Description
X bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series

The DataFrame or Series with training data.

y default None

Ignored.

Returns
Type Description
SimpleImputer Fitted imputer.

fit_transform

fit_transform(
    X: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
    y: typing.Optional[
        typing.Union[
            bigframes.dataframe.DataFrame,
            bigframes.series.Series,
            pandas.core.frame.DataFrame,
            pandas.core.series.Series,
        ]
    ] = None,
) -> bigframes.dataframe.DataFrame

Fit to data, then transform it.

Parameters
Name Description
X bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series

Series or DataFrame of shape (n_samples, n_features). Input samples.

y bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series

Series or DataFrame of shape (n_samples,) or (n_samples, n_outputs). Default None. Target values (None for unsupervised transformations).

Returns
Type Description
bigframes.dataframe.DataFrame DataFrame of shape (n_samples, n_features_new). Transformed DataFrame.

get_params

get_params(deep: bool = True) -> typing.Dict[str, typing.Any]

Get parameters for this estimator.

Parameter
Name Description
deep bool, default True

If True, returns the parameters for this estimator and for any contained subobjects that are estimators.

Returns
Type Description
Dictionary A dictionary of parameter names mapped to their values.

to_gbq

to_gbq(model_name: str, replace: bool = False) -> bigframes.ml.base._T

Save the transformer as a BigQuery model.

Parameters
Name Description
model_name str

The name of the model.

replace bool, default False

Whether to replace the model if it already exists. Defaults to False.

transform

transform(
    X: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ]
) -> bigframes.dataframe.DataFrame

Impute all missing values in X.

Parameter
Name Description
X bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series

The DataFrame or Series to be transformed.

Returns
Type Description
bigframes.dataframe.DataFrame Transformed result.