KMeans(n_clusters: int = 8)
K-Means clustering.
Parameter | |
---|---|
Name | Description |
n_clusters | int, default 8. The number of clusters to form, which is also the number of centroids to generate. |
Properties
cluster_centers_
Information of cluster centers.
Returns | |
---|---|
Type | Description |
bigframes.dataframe.DataFrame | DataFrame of cluster centers, with one row per feature per centroid. It contains the following columns: centroid_id: an integer that identifies the centroid. feature: the name of the column that contains the feature. numerical_value: if the feature is numeric, the value of the feature for the centroid that centroid_id identifies; otherwise NULL. categorical_value: a list of mappings containing information about categorical features. Each mapping contains the following fields: categorical_value.category: the name of each category. categorical_value.value: the value of categorical_value.category for the centroid that centroid_id identifies. |
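The long layout described above (one row per feature per centroid) is often easier to read once grouped by centroid. The following is a minimal sketch in plain Python, assuming the rows have already been materialized as dictionaries with None standing in for NULL. The feature names, values, and the centers_by_id helper are hypothetical and not part of bigframes.

```python
from collections import defaultdict

# Hypothetical rows in the documented layout: one row per feature per centroid.
# Feature names and values are invented for illustration only.
rows = [
    {"centroid_id": 1, "feature": "height", "numerical_value": 1.5,
     "categorical_value": []},
    {"centroid_id": 1, "feature": "color", "numerical_value": None,
     "categorical_value": [{"category": "red", "value": 0.75},
                           {"category": "blue", "value": 0.25}]},
    {"centroid_id": 2, "feature": "height", "numerical_value": 3.0,
     "categorical_value": []},
    {"centroid_id": 2, "feature": "color", "numerical_value": None,
     "categorical_value": [{"category": "red", "value": 0.1},
                           {"category": "blue", "value": 0.9}]},
]


def centers_by_id(rows):
    """Group long-format rows into {centroid_id: {feature: value}}."""
    centers = defaultdict(dict)
    for row in rows:
        if row["numerical_value"] is not None:
            # Numeric feature: keep the scalar value.
            value = row["numerical_value"]
        else:
            # Categorical feature: flatten the list of mappings into a dict.
            value = {m["category"]: m["value"] for m in row["categorical_value"]}
        centers[row["centroid_id"]][row["feature"]] = value
    return dict(centers)


centers = centers_by_id(rows)
```

Each centroid then maps to a single dictionary of its feature values, with categorical features represented as a category-to-value mapping.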
Methods
__repr__
__repr__()
Return a string showing the estimator's constructor call with all non-default parameter values.
fit
fit(
X: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
y: typing.Optional[
typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series]
] = None,
) -> bigframes.ml.base._T
Compute k-means clustering.
Parameters | |
---|---|
Name | Description |
X | bigframes.dataframe.DataFrame or bigframes.series.Series. DataFrame of shape (n_samples, n_features). Training data. |
y | Default None. Not used; present here for API consistency by convention. |
Returns | |
---|---|
Type | Description |
KMeans | Fitted Estimator. |
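For intuition about what k-means computes, here is a minimal sketch of the textbook Lloyd iteration in plain Python. It is a conceptual illustration only: this page does not describe how training is actually carried out, and the lloyd_kmeans function, its naive initialization, and the sample points are assumptions made for the example.

```python
import math


def lloyd_kmeans(points, k, iterations=10):
    """Textbook Lloyd iteration: assign to nearest centroid, then recompute means."""
    # Naive initialization (an assumption of this sketch): use the first k points.
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


points = [(0.0, 0.0), (10.0, 10.0), (0.0, 2.0), (10.0, 12.0)]
centers = lloyd_kmeans(points, k=2)
```

With these four points, the two centroids settle at the means of the lower-left and upper-right pairs.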
get_params
get_params(deep: bool = True) -> typing.Dict[str, typing.Any]
Get parameters for this estimator.
Parameter | |
---|---|
Name | Description |
deep | bool, default True. |
Returns | |
---|---|
Type | Description |
Dictionary | A dictionary of parameter names mapped to their values. |
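The relationship between get_params and __repr__ can be illustrated with a toy class. This is a sketch under stated assumptions, not the bigframes implementation: ToyKMeans is a hypothetical stand-in that only mirrors the documented behavior (a dictionary of parameter names to values, and a repr that lists non-default parameters).

```python
import inspect


class ToyKMeans:
    """Hypothetical stand-in for illustration; not the bigframes class."""

    def __init__(self, n_clusters: int = 8):
        self.n_clusters = n_clusters

    def get_params(self, deep: bool = True):
        # deep is accepted for signature parity; this toy has no nested estimators.
        names = inspect.signature(type(self).__init__).parameters
        return {name: getattr(self, name) for name in names if name != "self"}

    def __repr__(self):
        # Show only the parameters whose values differ from the constructor defaults.
        defaults = inspect.signature(type(self).__init__).parameters
        changed = {
            name: value
            for name, value in self.get_params().items()
            if value != defaults[name].default
        }
        args = ", ".join(f"{name}={value!r}" for name, value in changed.items())
        return f"{type(self).__name__}({args})"


default_model = ToyKMeans()
custom_model = ToyKMeans(n_clusters=3)
```

Here repr(default_model) shows an empty argument list, while repr(custom_model) includes n_clusters because it differs from the default of 8.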
predict
predict(
X: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series]
) -> bigframes.dataframe.DataFrame
Predict the closest cluster each sample in X belongs to.
Parameter | |
---|---|
Name | Description |
X | bigframes.dataframe.DataFrame or bigframes.series.Series. DataFrame of shape (n_samples, n_features). New data to predict. |
Returns | |
---|---|
Type | Description |
bigframes.dataframe.DataFrame | DataFrame of the cluster each sample belongs to. |
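Conceptually, predicting the closest cluster means assigning each sample to its nearest centroid. The sketch below shows that idea in plain Python, assuming Euclidean distance; the distance measure actually used is not stated on this page, and the nearest_centroid helper, centroid coordinates, and samples are hypothetical.

```python
import math


def nearest_centroid(sample, centroids):
    """Return the id of the centroid closest to sample (Euclidean distance assumed)."""
    return min(centroids, key=lambda cid: math.dist(sample, centroids[cid]))


# Hypothetical centroids keyed by centroid id, and a few samples to assign.
centroids = {1: (0.0, 0.0), 2: (10.0, 10.0)}
samples = [(1.0, 2.0), (9.0, 8.0), (4.0, 4.0)]

assignments = [nearest_centroid(s, centroids) for s in samples]
```

The result is one centroid id per sample, in the same order as the input samples.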
register
register(vertex_ai_model_id: typing.Optional[str] = None) -> bigframes.ml.base._T
Register the model to Vertex AI.
After registering, go to https://console.cloud.google.com/vertex-ai/models to manage the model registry. Refer to https://cloud.google.com/vertex-ai/docs/model-registry/introduction for more options.
Parameter | |
---|---|
Name | Description |
vertex_ai_model_id | Optional[str], default None. Optional string to use as the model ID in Vertex AI. If not set, it defaults to 'bigframes_{bq_model_id}'. The Vertex AI model ID is truncated to 63 characters because of a Vertex AI length limit. |
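The defaulting and truncation rules for the model ID can be sketched as a small helper. This is illustrative only: resolve_vertex_model_id is a hypothetical function, not a bigframes API, and it assumes that truncation keeps the leading 63 characters, which this page does not specify.

```python
VERTEX_ID_MAX_LEN = 63  # length limit stated in the parameter description


def resolve_vertex_model_id(bq_model_id, vertex_ai_model_id=None):
    """Hypothetical helper mirroring the documented defaulting and truncation."""
    if vertex_ai_model_id is None:
        # Documented default: 'bigframes_{bq_model_id}'.
        vertex_ai_model_id = f"bigframes_{bq_model_id}"
    # Assumption: truncation keeps the first 63 characters.
    return vertex_ai_model_id[:VERTEX_ID_MAX_LEN]


default_id = resolve_vertex_model_id("abc")
custom_id = resolve_vertex_model_id("abc", "custom")
long_id = resolve_vertex_model_id("x" * 100)
```

Because of the length limit, two long BigQuery model IDs that share their first characters could resolve to the same Vertex AI ID under this assumption, so passing an explicit short ID may be safer.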
score
score(
X: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series], y=None
) -> bigframes.dataframe.DataFrame
Metrics of the model.
Parameters | |
---|---|
Name | Description |
X | bigframes.dataframe.DataFrame or bigframes.series.Series. DataFrame of shape (n_samples, n_features). New data. |
y | Default None. Not used; present here for API consistency by convention. |
Returns | |
---|---|
Type | Description |
bigframes.dataframe.DataFrame | DataFrame of the metrics. |
to_gbq
to_gbq(model_name: str, replace: bool = False) -> bigframes.ml.cluster.KMeans
Save the model to BigQuery.
Parameters | |
---|---|
Name | Description |
model_name | str. The name of the model. |
replace | bool, default False. Whether to replace the model if it already exists. |
Returns | |
---|---|
Type | Description |
KMeans | Saved model. |