Learn how to get started with the Gen AI evaluation service using the Google Cloud console.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Verify that billing is enabled for your Google Cloud project.
- Make sure that you have the following role or roles on the project: Storage Admin
Check for the roles
- In the Google Cloud console, go to the IAM page.
- Select the project.
- In the Principal column, find all rows that identify you or a group that you're included in. To learn which groups you're included in, contact your administrator.
- For all rows that specify or include you, check the Role column to see whether the list of roles includes the required roles.
Grant the roles
- In the Google Cloud console, go to the IAM page.
- Select the project.
- Click Grant access.
- In the New principals field, enter your user identifier. This is typically the email address for a Google Account.
- In the Select a role list, select a role.
- To grant additional roles, click Add another role and add each additional role.
- Click Save.
Evaluate your model
To evaluate your model:
In the Google Cloud console, go to the Gen AI Evaluation page.
Click New evaluation to open the evaluation page.
For Define evaluation dataset, select an option:
- Upload file: Click Upload to upload a CSV or JSONL file. The dataset should contain prompts and responses of a model or values, with a maximum of 200 rows.
- Generate data: Enter a Prompt template to guide the Gen AI evaluation service in generating a dataset. For more information, see Use prompt templates.
  - Define variables to generate: Specify variables to generate and descriptions of the variable to guide generation. If needed, click Add another variable description.
  - Enter a Number of samples to generate.
- Use model logs: Use the snapshot of prompts and responses from the logged traffic of the selected model. You can only use this option if you have request-response logs enabled on a deployed model in Vertex AI.
  - Select your Model.
  - Select a Region for your model.
  - Enter a Sampling count.
For Define model responses to evaluate, select an option:
- From dataset: If you uploaded a dataset, select a Response column.
- From model: If you're using model logs as the evaluation dataset, the Gen AI evaluation service uses the model responses from the model logs.
- Call model: Select a model. The Gen AI evaluation service runs prompts on the selected model and uses the responses for evaluation.
(Optional) For Auto-generated evaluation metrics, you can Specify custom instructions to guide the rubrics generated from each prompt. For example: "Evaluate the dataset on cultural sensitivity to the countries {name}". For more information, see Define your evaluation metrics.
For Name and output directory, enter the following:
- Evaluation name: Enter a name for your evaluation.
- Output private data path: Enter the name of a Cloud Storage bucket where you want to store your evaluation, or click Browse to choose the bucket.
Click Evaluate.
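If you plan to use the Upload file option, you can prepare the JSONL file locally before you open the console. The following is a minimal sketch, not an official format reference: the column names `prompt` and `response`, the template text, and the helper names `build_rows` and `write_jsonl` are assumptions made for illustration. The sketch fills a `{name}`-style placeholder with Python's `str.format`, pairs each prompt with a response, and enforces the 200-row limit described in the steps above. It only illustrates local file preparation and does not reflect how the Gen AI evaluation service generates data.

```python
import json
from pathlib import Path

# Row limit for uploaded datasets, as stated in the steps above.
MAX_ROWS = 200


def build_rows(template, variable_sets, responses):
    """Fill the template once per variable set and pair it with a response.

    The "prompt" and "response" keys are assumed column names.
    """
    if len(variable_sets) != len(responses):
        raise ValueError("each prompt needs exactly one response")
    if len(variable_sets) > MAX_ROWS:
        raise ValueError(
            f"dataset has {len(variable_sets)} rows; the limit is {MAX_ROWS}"
        )
    return [
        {"prompt": template.format(**variables), "response": response}
        for variables, response in zip(variable_sets, responses)
    ]


def write_jsonl(rows, path):
    """Write one JSON object per line (the JSONL layout)."""
    with Path(path).open("w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")


# Hypothetical template and data, for illustration only.
template = "Describe a traditional greeting in {name}."
rows = build_rows(
    template,
    [{"name": "Japan"}, {"name": "Brazil"}],
    ["A bow is a common greeting.", "A handshake is a common greeting."],
)
print(json.dumps(rows[0]))
```

To produce a file for upload, call `write_jsonl(rows, "eval_dataset.jsonl")`. If your column names differ, select the matching Response column in the From dataset option.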
View your evaluation results
To view an evaluation result:
In the Google Cloud console, go to the Gen AI Evaluation page.
Click the evaluation name.
For each prompt in your evaluation dataset, the model's response displays along with the evaluation results.