To create a custom model, you need a Python training script that creates and trains the custom model. You initialize your training job with the Python training script, and then invoke the training job's run method to execute the script.
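For context, the snippet below is a minimal sketch of that flow, using hypothetical project, region, and container values; the actual job configuration is covered later in this tutorial.

from google.cloud import aiplatform

# Hypothetical values for illustration; replace with your own project and region.
aiplatform.init(project="your-project-id", location="us-central1")

# Initialize the training job with the Python training script (task.py, created below).
job = aiplatform.CustomTrainingJob(
    display_name="custom_job_unique",
    script_path="task.py",
    # Example prebuilt training image; check the current list of prebuilt containers.
    container_uri="us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-8:latest",
    requirements=["google-cloud-bigquery[pandas]"],
)

# Invoking run() executes task.py as a custom training job.
# CMDARGS is defined at the end of this topic.
# model = job.run(args=CMDARGS, replica_count=1, machine_type="n1-standard-4")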
In this topic, you create the training script and then specify command arguments for it.
Create a training script
In this section, you create a training script. This script is a new file in your notebook environment named task.py. Later in this tutorial, you pass this script to the aiplatform.CustomTrainingJob constructor. When the script runs, it does the following:

- Loads the data in the BigQuery dataset you created.
- Uses the TensorFlow Keras API to build, compile, and train your model.
- Specifies the number of epochs and the batch size to use when the Keras Model.fit method is invoked.
- Specifies where to save model artifacts by using the AIP_MODEL_DIR environment variable. AIP_MODEL_DIR is set by Vertex AI and contains the URI of a directory for saving model artifacts. For more information, see Environment variables for special Cloud Storage directories.
- Exports a TensorFlow SavedModel to the model directory. For more information, see Using the SavedModel format on the TensorFlow website.

To create your training script, run the following code in your notebook:
%%writefile task.py

import argparse
import numpy as np
import os

import pandas as pd
import tensorflow as tf

from google.cloud import bigquery
from google.cloud import storage

# Read environmental variables
training_data_uri = os.getenv("AIP_TRAINING_DATA_URI")
validation_data_uri = os.getenv("AIP_VALIDATION_DATA_URI")
test_data_uri = os.getenv("AIP_TEST_DATA_URI")

# Read args
parser = argparse.ArgumentParser()
parser.add_argument('--label_column', required=True, type=str)
parser.add_argument('--epochs', default=10, type=int)
parser.add_argument('--batch_size', default=10, type=int)
args = parser.parse_args()

# Set up training variables
LABEL_COLUMN = args.label_column

# See https://cloud.google.com/vertex-ai/docs/workbench/managed/executor#explicit-project-selection for issues regarding permissions.
PROJECT_NUMBER = os.environ["CLOUD_ML_PROJECT_ID"]
bq_client = bigquery.Client(project=PROJECT_NUMBER)


# Download a table
def download_table(bq_table_uri: str):
    # Remove bq:// prefix if present
    prefix = "bq://"
    if bq_table_uri.startswith(prefix):
        bq_table_uri = bq_table_uri[len(prefix):]

    # Download the BigQuery table as a dataframe
    # This requires the "BigQuery Read Session User" role on the custom training service account.
    table = bq_client.get_table(bq_table_uri)
    return bq_client.list_rows(table).to_dataframe()

# Download dataset splits
df_train = download_table(training_data_uri)
df_validation = download_table(validation_data_uri)
df_test = download_table(test_data_uri)

def convert_dataframe_to_dataset(
    df_train: pd.DataFrame,
    df_validation: pd.DataFrame,
):
    df_train_x, df_train_y = df_train, df_train.pop(LABEL_COLUMN)
    df_validation_x, df_validation_y = df_validation, df_validation.pop(LABEL_COLUMN)

    y_train = tf.convert_to_tensor(np.asarray(df_train_y).astype("float32"))
    y_validation = tf.convert_to_tensor(np.asarray(df_validation_y).astype("float32"))

    # Convert to numpy representation
    x_train = tf.convert_to_tensor(np.asarray(df_train_x).astype("float32"))
    x_test = tf.convert_to_tensor(np.asarray(df_validation_x).astype("float32"))

    # Convert to one-hot representation
    num_species = len(df_train_y.unique())
    y_train = tf.keras.utils.to_categorical(y_train, num_classes=num_species)
    y_validation = tf.keras.utils.to_categorical(y_validation, num_classes=num_species)

    dataset_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
    dataset_validation = tf.data.Dataset.from_tensor_slices((x_test, y_validation))
    return (dataset_train, dataset_validation)

# Create datasets
dataset_train, dataset_validation = convert_dataframe_to_dataset(df_train, df_validation)

# Shuffle train set
dataset_train = dataset_train.shuffle(len(df_train))

def create_model(num_features):
    # Create model
    Dense = tf.keras.layers.Dense
    model = tf.keras.Sequential(
        [
            Dense(
                100,
                activation=tf.nn.relu,
                kernel_initializer="uniform",
                input_dim=num_features,
            ),
            Dense(75, activation=tf.nn.relu),
            Dense(50, activation=tf.nn.relu),
            Dense(25, activation=tf.nn.relu),
            Dense(3, activation=tf.nn.softmax),
        ]
    )

    # Compile Keras model
    optimizer = tf.keras.optimizers.RMSprop(lr=0.001)
    model.compile(
        loss="categorical_crossentropy", metrics=["accuracy"], optimizer=optimizer
    )

    return model

# Create the model
model = create_model(num_features=dataset_train._flat_shapes[0].dims[0].value)

# Set up datasets
dataset_train = dataset_train.batch(args.batch_size)
dataset_validation = dataset_validation.batch(args.batch_size)

# Train the model
model.fit(dataset_train, epochs=args.epochs, validation_data=dataset_validation)

tf.saved_model.save(model, os.getenv("AIP_MODEL_DIR"))
After you create the script, it appears in the root folder of your notebook.
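If you want to confirm this from the notebook itself, a quick check such as the following (not part of the original tutorial) works:

import os

# Confirm that the %%writefile cell created task.py in the notebook's working directory.
print(os.path.exists("task.py"))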
Define arguments for your training script
You pass the following command-line arguments to your training script:

- label_column: Identifies the column in your data that contains what you want to predict. In this case, that column is species. You defined this in a variable named LABEL_COLUMN when you processed your data. For more information, see Download, preprocess, and split the data.
- epochs: The number of epochs used when you train your model. An epoch is one iteration over the data when training your model. This tutorial uses 20 epochs.
- batch_size: The number of samples that are processed before your model updates. This tutorial uses a batch size of 10.
To define the arguments that are passed to your script, run the following code:
[[["Fácil de entender","easyToUnderstand","thumb-up"],["Meu problema foi resolvido","solvedMyProblem","thumb-up"],["Outro","otherUp","thumb-up"]],[["Difícil de entender","hardToUnderstand","thumb-down"],["Informações incorretas ou exemplo de código","incorrectInformationOrSampleCode","thumb-down"],["Não contém as informações/amostras de que eu preciso","missingTheInformationSamplesINeed","thumb-down"],["Problema na tradução","translationIssue","thumb-down"],["Outro","otherDown","thumb-down"]],["Última atualização 2025-08-25 UTC."],[],[],null,["# Create a training script\n\nTo create a custom model, you need a Python training script that creates and trains the custom model. You initialize your training job with the Python training script, then invoke the training job's [`run`](/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.CustomTrainingJob#google_cloud_aiplatform_CustomTrainingJob_run) method to run the script.\n\n\u003cbr /\u003e\n\nIn this topic, you create the training script, then specify command arguments\nfor your training script.\n\nCreate a training script\n------------------------\n\nIn this section, you create a training script. This script is a new file in your\nnotebook environment named `task.py`. Later in this tutorial, you pass this\nscript to the [`aiplatform.CustomTrainingJob`](/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.CustomTrainingJob) constructor. When the script runs, it does the following:\n\n- Loads the data in the BigQuery dataset you created.\n\n- Uses the\n [TensorFlow Keras API](https://www.tensorflow.org/api_docs/python/tf/keras) to\n build, compile, and train your model.\n\n- Specifies the number of epochs and the batch size to use when the Keras\n [`Model.fit`](https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit)\n method is invoked.\n\n- Specifies where to save model artifacts using the `AIP_MODEL_DIR` environment\n variable. `AIP_MODEL_DIR` is set by Vertex AI and contains the URI of a\n directory for saving model artifacts. For more information, see [Environment\n variables for special Cloud Storage\n directories](/vertex-ai/docs/training/code-requirements#environment-variables).\n\n- Exports a TensorFlow\n [`SavedModel`](https://www.tensorflow.org/api_docs/python/tf/saved_model) to\n the model directory. 
For more information, see [Using the `SavedModel`\n format](https://www.tensorflow.org/guide/saved_model#the_savedmodel_format_on_disk)\n on the TensorFlow website.\n\nTo create your training script, run the following code in your notebook: \n\n %%writefile task.py\n\n import argparse\n import numpy as np\n import os\n\n import pandas as pd\n import tensorflow as tf\n\n from google.cloud import bigquery\n from google.cloud import storage\n\n # Read environmental variables\n training_data_uri = os.getenv(\"AIP_TRAINING_DATA_URI\")\n validation_data_uri = os.getenv(\"AIP_VALIDATION_DATA_URI\")\n test_data_uri = os.getenv(\"AIP_TEST_DATA_URI\")\n\n # Read args\n parser = argparse.ArgumentParser()\n parser.add_argument('--label_column', required=True, type=str)\n parser.add_argument('--epochs', default=10, type=int)\n parser.add_argument('--batch_size', default=10, type=int)\n args = parser.parse_args()\n\n # Set up training variables\n LABEL_COLUMN = args.label_column\n\n # See https://cloud.google.com/vertex-ai/docs/workbench/managed/executor#explicit-project-selection for issues regarding permissions.\n PROJECT_NUMBER = os.environ[\"CLOUD_ML_PROJECT_ID\"]\n bq_client = bigquery.Client(project=PROJECT_NUMBER)\n\n\n # Download a table\n def download_table(bq_table_uri: str):\n # Remove bq:// prefix if present\n prefix = \"bq://\"\n if bq_table_uri.startswith(prefix):\n bq_table_uri = bq_table_uri[len(prefix) :]\n\n # Download the BigQuery table as a dataframe\n # This requires the \"BigQuery Read Session User\" role on the custom training service account.\n table = bq_client.get_table(bq_table_uri)\n return bq_client.list_rows(table).to_dataframe()\n\n # Download dataset splits\n df_train = download_table(training_data_uri)\n df_validation = download_table(validation_data_uri)\n df_test = download_table(test_data_uri)\n\n def convert_dataframe_to_dataset(\n df_train: pd.DataFrame,\n df_validation: pd.DataFrame,\n ):\n df_train_x, df_train_y = df_train, df_train.pop(LABEL_COLUMN)\n df_validation_x, df_validation_y = df_validation, df_validation.pop(LABEL_COLUMN)\n\n y_train = tf.convert_to_tensor(np.asarray(df_train_y).astype(\"float32\"))\n y_validation = tf.convert_to_tensor(np.asarray(df_validation_y).astype(\"float32\"))\n\n # Convert to numpy representation\n x_train = tf.convert_to_tensor(np.asarray(df_train_x).astype(\"float32\"))\n x_test = tf.convert_to_tensor(np.asarray(df_validation_x).astype(\"float32\"))\n\n # Convert to one-hot representation\n num_species = len(df_train_y.unique())\n y_train = tf.keras.utils.to_categorical(y_train, num_classes=num_species)\n y_validation = tf.keras.utils.to_categorical(y_validation, num_classes=num_species)\n\n dataset_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))\n dataset_validation = tf.data.Dataset.from_tensor_slices((x_test, y_validation))\n return (dataset_train, dataset_validation)\n\n # Create datasets\n dataset_train, dataset_validation = convert_dataframe_to_dataset(df_train, df_validation)\n\n # Shuffle train set\n dataset_train = dataset_train.shuffle(len(df_train))\n\n def create_model(num_features):\n # Create model\n Dense = tf.keras.layers.Dense\n model = tf.keras.Sequential(\n [\n Dense(\n 100,\n activation=tf.nn.relu,\n kernel_initializer=\"uniform\",\n input_dim=num_features,\n ),\n Dense(75, activation=tf.nn.relu),\n Dense(50, activation=tf.nn.relu),\n Dense(25, activation=tf.nn.relu),\n Dense(3, activation=tf.nn.softmax),\n ]\n )\n\n # Compile Keras model\n optimizer = tf.keras.optimizers.RMSprop(lr=0.001)\n 
model.compile(\n loss=\"categorical_crossentropy\", metrics=[\"accuracy\"], optimizer=optimizer\n )\n\n return model\n\n # Create the model\n model = create_model(num_features=dataset_train._flat_shapes[0].dims[0].value)\n\n # Set up datasets\n dataset_train = dataset_train.batch(args.batch_size)\n dataset_validation = dataset_validation.batch(args.batch_size)\n\n # Train the model\n model.fit(dataset_train, epochs=args.epochs, validation_data=dataset_validation)\n\n tf.saved_model.save(model, os.getenv(\"AIP_MODEL_DIR\"))\n\nAfter you create the script, it appears in the root folder of your notebook:\n\nDefine arguments for your training script\n-----------------------------------------\n\nYou pass the following command-line arguments to your training script:\n\n- `label_column` - This identifies the column in your data that contains what\n you want to predict. In this case, that column is `species`. You defined this\n in a variable named `LABEL_COLUMN` when you processed your data. For more\n information, see\n [Download, preprocess, and split the data](/vertex-ai/docs/tutorials/tabular-bq-prediction/create-dataset#download-process-public-dataset).\n\n- `epochs` - This is the number of epochs used when you train your model. An\n *epoch* is an iteration over the data when training your model. This tutorial\n uses 20 epochs.\n\n- `batch_size` - This is the number of samples that are processed before your\n model updates. This tutorial uses a batch size of 10.\n\nTo define the arguments that are passed to your script, run the following code: \n\n JOB_NAME = \"custom_job_unique\"\n\n EPOCHS = 20\n BATCH_SIZE = 10\n\n CMDARGS = [\n \"--label_column=\" + LABEL_COLUMN,\n \"--epochs=\" + str(EPOCHS),\n \"--batch_size=\" + str(BATCH_SIZE),\n ]"]]