Starting April 29, 2025, the Gemini 1.5 Pro and Gemini 1.5 Flash models are not available in projects that have never used them, including new projects. For details, see Model versions and lifecycle.

Tutorial: Perform evaluation using the Python SDK

This page shows you how to run a model-based evaluation with the Gen AI evaluation service using the Vertex AI SDK for Python.

Before you begin

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

Make sure that billing is enabled for your Google Cloud project.

Install the Vertex AI SDK for Python with the Gen AI evaluation service dependency (the leading `!` runs the command from a notebook cell):
```
!pip install google-cloud-aiplatform[evaluation]
```
Set up your credentials. If you are running this quickstart in Colaboratory, run the following:
```
from google.colab import auth
auth.authenticate_user()
```
For other environments, see Authenticate to Vertex AI.

Import libraries

Import your libraries and set up your project and location.

```
import pandas as pd

import vertexai
from vertexai.evaluation import EvalTask, PointwiseMetric, PointwiseMetricPromptTemplate
from google.cloud import aiplatform

PROJECT_ID = "PROJECT_ID"
LOCATION = "LOCATION"
EXPERIMENT_NAME = "EXPERIMENT_NAME"

vertexai.init(
    project=PROJECT_ID,
    location=LOCATION,
)
```

Note that EXPERIMENT_NAME can contain only lowercase alphanumeric characters and hyphens, up to a maximum of 127 characters.
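
The naming rule above can be checked locally before any call is made. The following is a minimal sketch that encodes only the rule as stated on this page (lowercase alphanumerics and hyphens, at most 127 characters); `is_valid_experiment_name` is a hypothetical helper, not part of the SDK, and the service may enforce further constraints that this check does not cover.

```python
import re

# Encodes only the rule stated on this page: lowercase letters, digits,
# and hyphens, with a length between 1 and 127 characters.
_EXPERIMENT_NAME_RE = re.compile(r"[a-z0-9-]{1,127}")


def is_valid_experiment_name(name: str) -> bool:
    """Return True if `name` satisfies the documented naming rule.

    Hypothetical helper for illustration; not an SDK function.
    """
    return _EXPERIMENT_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("my-eval-experiment-01", "My_Eval_Experiment"):
        print(candidate, is_valid_experiment_name(candidate))
```

Running a check like this before `EvalTask` is created surfaces a bad name immediately instead of partway through an evaluation.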

Configure evaluation metrics based on your criteria

The following metric definition evaluates the quality of text generated by a large language model based on two criteria: fluency and entertaining. The code defines a metric called custom_text_quality using those two criteria:

```
custom_text_quality = PointwiseMetric(
    metric="custom_text_quality",
    metric_prompt_template=PointwiseMetricPromptTemplate(
        criteria={
            "fluency": (
                "Sentences flow smoothly and are easy to read, avoiding awkward"
                " phrasing or run-on sentences. Ideas and sentences connect"
                " logically, using transitions effectively where needed."
            ),
            "entertaining": (
                "Short, amusing text that incorporates emojis, exclamations and"
                " questions to convey quick and spontaneous communication and"
                " diversion."
            ),
        },
        rating_rubric={
            "1": "The response performs well on both criteria.",
            "0": "The response is somewhat aligned with both criteria",
            "-1": "The response falls short on both criteria",
        },
    ),
)
```

Prepare your dataset

Add the following code to prepare your dataset:

```
responses = [
    # An example of good custom_text_quality
    "Life is a rollercoaster, full of ups and downs, but it's the thrill that keeps us coming back for more!",
    # An example of medium custom_text_quality
    "The weather is nice today, not too hot, not too cold.",
    # An example of poor custom_text_quality
    "The weather is, you know, whatever.",
]

eval_dataset = pd.DataFrame({
    "response" : responses,
})
```
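
Because each row of the dataset is judged individually, it can be worth confirming that every response is a non-empty string before building the DataFrame. The sketch below uses only the standard library; `find_blank_responses` is a hypothetical helper introduced for illustration, and how the service itself treats blank rows is not stated on this page.

```python
def find_blank_responses(responses):
    """Return the indices of entries that are not non-empty strings.

    Hypothetical helper for illustration; not an SDK function.
    """
    return [
        index
        for index, text in enumerate(responses)
        if not isinstance(text, str) or not text.strip()
    ]


# Illustrative input: one usable response, one whitespace-only, one missing.
sample = [
    "The weather is nice today, not too hot, not too cold.",
    "   ",
    None,
]

blank = find_blank_responses(sample)
if blank:
    print(f"Rows to fix before evaluating: {blank}")
```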

Run evaluation with your dataset

Run the evaluation:

```
eval_task = EvalTask(
    dataset=eval_dataset,
    metrics=[custom_text_quality],
    experiment=EXPERIMENT_NAME
)

pointwise_result = eval_task.evaluate()
```

View the evaluation results for each response in the metrics_table Pandas DataFrame:

```
pointwise_result.metrics_table
```
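
The scores in the results correspond to the keys of the rating_rubric defined earlier (1, 0, and -1). As a rough sketch, a small lookup can turn a numeric score back into a readable label. `describe_score` and `RATING_LABELS` are hypothetical names introduced here, and the labels are shortened paraphrases of the rubric text rather than anything returned by the service.

```python
# Shortened paraphrases of the rating_rubric entries from this page.
RATING_LABELS = {
    1: "performs well on both criteria",
    0: "somewhat aligned with both criteria",
    -1: "falls short on both criteria",
}


def describe_score(score) -> str:
    """Map a rubric score (int or float such as 1.0) to a short label.

    Hypothetical helper for illustration; not an SDK function.
    """
    try:
        key = int(score)
    except (TypeError, ValueError, OverflowError):
        # Covers None, NaN, infinities, and non-numeric strings.
        return f"unexpected score: {score!r}"
    if key != score or key not in RATING_LABELS:
        return f"unexpected score: {score!r}"
    return RATING_LABELS[key]


if __name__ == "__main__":
    for value in (1.0, 0, -1):
        print(value, "->", describe_score(value))
```

Assuming the score column in metrics_table is named `custom_text_quality/score` (an assumption about the SDK's column naming, not confirmed on this page), the helper could be applied with `pointwise_result.metrics_table["custom_text_quality/score"].map(describe_score)`.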

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

Delete the ExperimentRun created by the evaluation:

```
aiplatform.ExperimentRun(
    run_name=pointwise_result.metadata["experiment_run"],
    experiment=pointwise_result.metadata["experiment"],
).delete()
```