Class LogisticRegression (1.5.0)

LogisticRegression(
    *,
    optimize_strategy: typing.Literal[
        "auto_strategy", "batch_gradient_descent", "normal_equation"
    ] = "auto_strategy",
    fit_intercept: bool = True,
    l1_reg: typing.Optional[float] = None,
    l2_reg: float = 0.0,
    max_iterations: int = 20,
    warm_start: bool = False,
    learning_rate: typing.Optional[float] = None,
    learning_rate_strategy: typing.Literal["line_search", "constant"] = "line_search",
    tol: float = 0.01,
    ls_init_learning_rate: typing.Optional[float] = None,
    calculate_p_values: bool = False,
    enable_global_explain: bool = False,
    class_weight: typing.Optional[
        typing.Union[typing.Literal["balanced"], typing.Dict[str, float]]
    ] = None
)

Logistic Regression (aka logit, MaxEnt) classifier.

Parameters

Name Description
optimize_strategy str, default "auto_strategy"

The strategy used to train logistic regression models. Possible values are "auto_strategy", "batch_gradient_descent", and "normal_equation". Defaults to "auto_strategy".

fit_intercept bool, default True

Specifies whether a constant (also known as bias or intercept) is added to the decision function. Defaults to True.

class_weight dict or 'balanced', default None

Weights associated with classes in the form {class_label: weight}. If not given, all classes are assumed to have weight one. The "balanced" mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y)). Dict isn't supported. Defaults to None.
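The "balanced" formula above can be illustrated with a small standalone sketch. It uses only the Python standard library rather than np.bincount, and the helper name balanced_class_weights is hypothetical, not part of bigframes; it only reproduces the arithmetic n_samples / (n_classes * count_of_class).

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class_label: weight} with weight = n_samples / (n_classes * count).

    Hypothetical helper for illustration only; bigframes computes this
    internally when class_weight="balanced".
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# One "spam" row and three "ham" rows: the rarer class gets the larger weight.
weights = balanced_class_weights(["spam", "ham", "ham", "ham"])
print(weights)
```

With four samples and two classes, the minority class is weighted 4 / (2 * 1) and the majority class 4 / (2 * 3), so each class contributes equally in total.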

l1_reg float or None, default None

The amount of L1 regularization applied. Can't be set in "normal_equation" mode. If unset, the value 0 is used. Defaults to None.

l2_reg float, default 0.0

The amount of L2 regularization applied. Defaults to 0.0.

max_iterations int, default 20

The maximum number of training iterations or steps. Defaults to 20.

warm_start bool, default False

Determines whether to train a model with new training data, new model options, or both. Unless you explicitly override them, the initial options used to train the model are used for the warm start run. Defaults to False.

learning_rate float or None, default None

The learning rate for gradient descent when learning_rate_strategy='constant'. If unset, the value 0.1 is used. Setting it when learning_rate_strategy='line_search' returns an error.

learning_rate_strategy str, default "line_search"

The strategy for specifying the learning rate during training. Possible values are "line_search" and "constant". Defaults to "line_search".

tol float, default 0.01

The minimum relative loss improvement that is necessary to continue training when EARLY_STOP is set to true. For example, a value of 0.01 specifies that each iteration must reduce the loss by 1% for training to continue. Defaults to 0.01.
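The stopping rule described for tol can be sketched as a relative-improvement check. This is a minimal illustration that assumes a positive previous loss and a "greater than or equal" comparison at the threshold; the function name should_continue is hypothetical, and the exact comparison used by the training backend is not stated in this reference.

```python
def should_continue(previous_loss, current_loss, tol=0.01):
    """Return True if the loss improved by at least `tol`, relative to the previous loss.

    Hypothetical helper; assumes previous_loss > 0.
    """
    relative_improvement = (previous_loss - current_loss) / previous_loss
    return relative_improvement >= tol


# A 2% reduction clears the default 1% threshold, so training would continue.
keeps_training = should_continue(previous_loss=1.0, current_loss=0.98)

# A 0.5% reduction does not, so training would stop early.
stops_training = not should_continue(previous_loss=1.0, current_loss=0.995)
```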

ls_init_learning_rate float or None, default None

Sets the initial learning rate that learning_rate_strategy='line_search' uses. This option can only be used if line_search is specified. If unset, the value 0.1 is used.
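The interaction between learning_rate, learning_rate_strategy, and ls_init_learning_rate described above can be summarized as a small validation sketch. The function check_learning_rate_options and its error messages are hypothetical and only mirror the documented rules; the real checks happen inside the library or the training backend and may be reported differently.

```python
from typing import Optional


def check_learning_rate_options(
    learning_rate_strategy: str = "line_search",
    learning_rate: Optional[float] = None,
    ls_init_learning_rate: Optional[float] = None,
) -> None:
    """Raise ValueError for option combinations the parameter docs rule out.

    Hypothetical helper for illustration; not part of bigframes.
    """
    if learning_rate_strategy == "line_search" and learning_rate is not None:
        # learning_rate only applies to the "constant" strategy.
        raise ValueError("learning_rate requires learning_rate_strategy='constant'")
    if learning_rate_strategy != "line_search" and ls_init_learning_rate is not None:
        # ls_init_learning_rate only applies to the "line_search" strategy.
        raise ValueError(
            "ls_init_learning_rate requires learning_rate_strategy='line_search'"
        )


# Valid: constant strategy with an explicit learning rate.
check_learning_rate_options("constant", learning_rate=0.05)

# Valid: line search with an explicit initial learning rate.
check_learning_rate_options("line_search", ls_init_learning_rate=0.2)
```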

calculate_p_values bool, default False

Specifies whether to compute p-values and standard errors during training. Defaults to False.

enable_global_explain bool, default False

Specifies whether to compute global explanations using explainable AI to evaluate the global importance of each feature to the model. Defaults to False.

Methods

__repr__

__repr__()

Print the estimator's constructor with all non-default parameter values.

fit

fit(
    X: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
) -> bigframes.ml.base._T

Fit the model according to the given training data.

Parameters
Name Description
X bigframes.dataframe.DataFrame or bigframes.series.Series

Series or DataFrame of shape (n_samples, n_features). Training vector, where n_samples is the number of samples and n_features is the number of features.

y bigframes.dataframe.DataFrame or bigframes.series.Series

Series or DataFrame of shape (n_samples,). Target vector relative to X.

Returns
Type Description
LogisticRegression Fitted estimator.

get_params

get_params(deep: bool = True) -> typing.Dict[str, typing.Any]

Get parameters for this estimator.

Parameter
Name Description
deep bool, default True

If True, returns the parameters for this estimator and for contained subobjects that are estimators. Defaults to True.

Returns
Type Description
Dictionary A dictionary of parameter names mapped to their values.

predict

predict(
    X: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series]
) -> bigframes.dataframe.DataFrame

Predict class labels for samples in X.

Parameter
Name Description
X bigframes.dataframe.DataFrame or bigframes.series.Series

Series or DataFrame of shape (n_samples, n_features). The data matrix for which we want to get the predictions.

Returns
Type Description
bigframes.dataframe.DataFrame DataFrame of shape (n_samples, n_input_columns + n_prediction_columns). Returns predicted values.

register

register(vertex_ai_model_id: typing.Optional[str] = None) -> bigframes.ml.base._T

Register the model to Vertex AI.

After registering, go to the Google Cloud console (https://console.cloud.google.com/vertex-ai/models) to manage the model registries. Refer to https://cloud.google.com/vertex-ai/docs/model-registry/introduction for more options.

Parameter
Name Description
vertex_ai_model_id Optional[str], default None

Optional string to use as the model ID in Vertex AI. If not set, defaults to 'bigframes_{bq_model_id}'. The Vertex AI model ID is truncated to 63 characters because of a Vertex AI length limit.
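The default and truncation behavior described for vertex_ai_model_id can be sketched as follows. The helper resolve_vertex_ai_model_id is hypothetical and assumes the truncation is a simple cut to the first 63 characters; the library's actual normalization may differ.

```python
from typing import Optional

MAX_VERTEX_ID_LENGTH = 63  # limit stated in the parameter description


def resolve_vertex_ai_model_id(
    bq_model_id: str, vertex_ai_model_id: Optional[str] = None
) -> str:
    """Pick the Vertex AI model ID and cut it to the documented length limit.

    Hypothetical helper for illustration; not part of bigframes.
    """
    if vertex_ai_model_id is None:
        # Documented default when no ID is supplied.
        vertex_ai_model_id = f"bigframes_{bq_model_id}"
    return vertex_ai_model_id[:MAX_VERTEX_ID_LENGTH]


default_id = resolve_vertex_ai_model_id("my_model")
long_id = resolve_vertex_ai_model_id("x" * 100)
```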

score

score(
    X: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
) -> bigframes.dataframe.DataFrame

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy, which is a harsh metric since you require that each label set be correctly predicted for each sample.
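To see why subset accuracy is a harsh metric, the following standalone sketch counts a sample as correct only when its whole predicted label set matches the true label set. The function subset_accuracy and the sample data are illustrative assumptions and are not how score() is implemented.

```python
def subset_accuracy(true_label_sets, predicted_label_sets):
    """Fraction of samples whose predicted label set equals the true label set."""
    exact_matches = sum(
        set(true) == set(pred)
        for true, pred in zip(true_label_sets, predicted_label_sets)
    )
    return exact_matches / len(true_label_sets)


y_true = [{"a", "b"}, {"a"}, {"b", "c"}]
y_pred = [{"a", "b"}, {"a", "c"}, {"b"}]

# Only the first sample matches exactly; partially correct sets earn no credit.
accuracy = subset_accuracy(y_true, y_pred)
```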

Parameters
Name Description
X bigframes.dataframe.DataFrame or bigframes.series.Series

Series or DataFrame of shape (n_samples, n_features). Test samples.

y bigframes.dataframe.DataFrame or bigframes.series.Series

Series or DataFrame of shape (n_samples,) or (n_samples, n_outputs). True labels for X.

Returns
Type Description
bigframes.dataframe.DataFrame A DataFrame of the evaluation result.

to_gbq

to_gbq(
    model_name: str, replace: bool = False
) -> bigframes.ml.linear_model.LogisticRegression

Save the model to BigQuery.

Parameters
Name Description
model_name str

The name of the model.

replace bool, default False

Determines whether to replace the model if it already exists. Defaults to False.

Returns
Type Description
LogisticRegression Saved model.
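The methods documented above can be combined into a single workflow. The following is a minimal sketch, not a definitive recipe: the function train_evaluate_and_save and its arguments (table_id, feature_columns, label_column, model_name) are hypothetical placeholders that you supply, and running it requires a configured BigQuery project. The bigframes imports are placed inside the function so that defining it does not require the package.

```python
def train_evaluate_and_save(table_id, feature_columns, label_column, model_name):
    """Fit a LogisticRegression on a BigQuery table, score it, and save it.

    All four arguments are caller-supplied placeholders (assumptions).
    """
    # Imported lazily so that defining this function does not need bigframes.
    import bigframes.pandas as bpd
    from bigframes.ml.linear_model import LogisticRegression

    df = bpd.read_gbq(table_id)
    X = df[feature_columns]  # DataFrame of shape (n_samples, n_features)
    y = df[label_column]     # Series of shape (n_samples,)

    model = LogisticRegression(
        fit_intercept=True,
        class_weight="balanced",
        max_iterations=20,
    )
    model.fit(X, y)

    # For brevity this sketch predicts and scores on the training data;
    # a held-out split gives a more honest estimate.
    predictions = model.predict(X)
    metrics = model.score(X, y)

    # Persist the trained model to BigQuery, replacing any existing one.
    saved_model = model.to_gbq(model_name, replace=True)
    return predictions, metrics, saved_model
```

The function returns the prediction DataFrame, the evaluation DataFrame from score, and the saved model returned by to_gbq.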