Hello custom training: Train a custom image classification model
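This page shows you how to run a TensorFlow Keras training application on
Vertex AI. The application trains an image classification
model that can classify flowers by type.
This tutorial has several pages:
Setting up your project and environment.
Training a custom image classification model.
Serving predictions from a custom image classification model.
Cleaning up your project.
Each page assumes that you have already performed the instructions from the
previous pages of the tutorial.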
The rest of this document assumes that you are using the same Cloud Shell
environment that you created when following the first page of this
tutorial. If your original Cloud Shell session is no
longer open, you can return to the environment by doing the following:
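In the Google Cloud console, activate Cloud Shell.
In the Cloud Shell session, run the following command:
cd hello-custom-sample
Run a custom training pipeline
This section describes how to use the training package that you uploaded to
Cloud Storage to run a Vertex AI custom training
pipeline.
In the Google Cloud console, in the Vertex AI section, go to
the Training pipelines page.
Click Create to open the Train new model pane.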
On the Choose training method step, do the following:
In the Dataset drop-down list, select No managed dataset. This
particular training application loads data from the TensorFlow
Datasets library rather than a managed Vertex AI
dataset.
Ensure that Custom training (advanced) is selected.
Click Continue.
On the Model details step, in the Name field, enter
hello_custom. Click Continue.
On the Training container step, provide Vertex AI with
information it needs to use the training package that you uploaded to
Cloud Storage:
Select Prebuilt container.
In the Model framework drop-down list, select TensorFlow.
In the Model framework version drop-down list, select 2.3.
In the Package location field, enter
cloud-samples-data/ai-platform/hello-custom/hello-custom-sample-v1.tar.gz.
In the Python module field, enter trainer.task. trainer is the
name of the Python package in your tarball, and task.py contains your
training code. Therefore, trainer.task is the name of the module that
you want Vertex AI to run.
In the Model output directory field, click Browse. Do the
following in the Select folder pane:
Navigate to your Cloud Storage bucket.
Click Create new folder.
Name the new folder output. Then click Create.
Click Select.
Confirm that the field has the value
gs://BUCKET_NAME/output, where BUCKET_NAME
is the name of your Cloud Storage bucket.
This value gets passed to Vertex AI in the
baseOutputDirectory API
field, which sets
several environment variables that your training application can access
when it runs.
For example, when you set this field to
gs://BUCKET_NAME/output, Vertex AI sets
the AIP_MODEL_DIR environment variable to
gs://BUCKET_NAME/output/model. At the end of training,
Vertex AI uses any artifacts in the AIP_MODEL_DIR directory
to create a model resource.
Click Continue.
On the optional Hyperparameters step, make sure that the Enable
hyperparameter tuning checkbox is cleared. This tutorial does not use
hyperparameter tuning. Click Continue.
On the Compute and pricing step, allocate resources for the custom
training job:
In the Region drop-down list, select us-central1 (Iowa).
In the Machine type drop-down list, select n1-standard-4 from the
Standard section.
Do not add any accelerators or worker pools for this tutorial. Click
Continue.
On the Prediction container step, provide Vertex AI with
information it needs to serve predictions:
Select Prebuilt container.
In the Prebuilt container settings section, do the following:
In the Model framework drop-down list, select TensorFlow.
In the Model framework version drop-down list, select 2.3.
In the Accelerator type drop-down list, select None.
Confirm that the Model directory field has the value
gs://BUCKET_NAME/output, where
BUCKET_NAME is the name of your Cloud Storage
bucket. This matches the Model output directory value that you
provided in a previous step.
Leave the fields in the Predict schemata section blank.
Click Start training to start the custom training pipeline.
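The console choices above correspond to fields of a training pipeline resource in the Vertex AI API. As a rough sketch, and not the request that the console actually sends, the following Python collects the same values into a request body as a plain dictionary. The field names reflect one reading of the v1 REST reference and should be checked against it. The function name build_training_pipeline and the TRAINING_IMAGE_URI and SERVING_IMAGE_URI placeholders are hypothetical; the placeholders stand in for the prebuilt TensorFlow 2.3 training and prediction container images.

```python
import json

# Schema URI for custom training tasks. Assumption: verify this value
# against the current Vertex AI REST reference before relying on it.
CUSTOM_TASK_SCHEMA = (
    "gs://google-cloud-aiplatform/schema/trainingjob/definition/"
    "custom_task_1.0.0.yaml"
)

# The package location from the tutorial, written as a full Cloud Storage URI.
PACKAGE_URI = (
    "gs://cloud-samples-data/ai-platform/hello-custom/"
    "hello-custom-sample-v1.tar.gz"
)


def build_training_pipeline(bucket_name, training_image, serving_image):
    """Return a request body that mirrors the console steps (hypothetical helper)."""
    return {
        # Model details step: the Name field.
        "displayName": "hello_custom",
        "trainingTaskDefinition": CUSTOM_TASK_SCHEMA,
        "trainingTaskInputs": {
            "workerPoolSpecs": [
                {
                    # Compute and pricing step: one n1-standard-4 worker,
                    # no accelerators and no extra worker pools.
                    "machineSpec": {"machineType": "n1-standard-4"},
                    "replicaCount": 1,
                    # Training container step.
                    "pythonPackageSpec": {
                        "executorImageUri": training_image,
                        "packageUris": [PACKAGE_URI],
                        "pythonModule": "trainer.task",
                    },
                }
            ],
            # Model output directory field.
            "baseOutputDirectory": {
                "outputUriPrefix": f"gs://{bucket_name}/output"
            },
        },
        # Prediction container step.
        "modelToUpload": {
            "displayName": "hello_custom",
            "containerSpec": {"imageUri": serving_image},
        },
    }


if __name__ == "__main__":
    # The region (us-central1) belongs in the request URL, not in the body.
    body = build_training_pipeline(
        "BUCKET_NAME", "TRAINING_IMAGE_URI", "SERVING_IMAGE_URI"
    )
    print(json.dumps(body, indent=2))
```

Keeping the values in one dictionary makes it easier to compare a scripted run with the console settings; replace the placeholders before you send the body to the API.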
You can now view your new training pipeline, which is named hello_custom, on
the Training page. (You might need to refresh the page.) The training
pipeline does two main things:
The training pipeline creates a custom job resource named
hello_custom-custom-job. After a few moments, you can view this resource
on the Custom jobs page of the Training section:
The custom job runs the training application using the computing resources
that you specified in this section.
After the custom job completes, the training pipeline finds the artifacts
that your training application creates in the output/model/ directory of
your Cloud Storage bucket. It uses these artifacts to create
a model resource.
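The link between the Model output directory value and the output/model/ directory can be sketched in a few lines of Python. This is an illustration only, not Vertex AI code: derive_output_env and resolve_model_dir are hypothetical helpers. The AIP_MODEL_DIR mapping comes from this page; the AIP_CHECKPOINT_DIR and AIP_TENSORBOARD_LOG_DIR entries are assumptions based on the environment variable documentation and should be verified there.

```python
import os


def derive_output_env(base_output_directory):
    """Mimic how the base output directory maps to environment variables.

    Hypothetical helper: the real values are set by Vertex AI at run time.
    """
    base = base_output_directory.rstrip("/")
    return {
        # Stated on this page: <base>/model
        "AIP_MODEL_DIR": f"{base}/model",
        # Assumptions: verify against the environment variable docs.
        "AIP_CHECKPOINT_DIR": f"{base}/checkpoints",
        "AIP_TENSORBOARD_LOG_DIR": f"{base}/logs",
    }


def resolve_model_dir(environ=None, default="./model"):
    """Return where a training application should write its model artifacts.

    Falls back to a local directory when AIP_MODEL_DIR is not set, for
    example when the code runs outside Vertex AI.
    """
    environ = os.environ if environ is None else environ
    return environ.get("AIP_MODEL_DIR", default)


if __name__ == "__main__":
    env = derive_output_env("gs://BUCKET_NAME/output")
    print(resolve_model_dir(env))
```

A training application that saves its model to the path returned by resolve_model_dir writes to the location where the training pipeline later looks for artifacts.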
Monitor training
To view training logs, do the following:
In the Google Cloud console, in the Vertex AI section, go to
the Custom jobs page.
To view details for the CustomJob that you just created, click
hello_custom-custom-job in the list.
On the job details page, click View logs.
View your trained model
When the custom training pipeline completes, you can find the trained model in
the Google Cloud console, in the Vertex AI section, on the
Models page.
The model has the name hello_custom.
What's next
Follow the next page of this tutorial to serve
predictions from your trained ML model.
Last updated 2025-09-03 UTC.