Python 版 RAG 快速入门

本页面介绍了如何使用 Vertex AI SDK 运行 Vertex AI RAG Engine 任务。

您也可以使用此笔记本Vertex AI RAG Engine 简介进行学习。

准备 Google Cloud 控制台

如需使用 Vertex AI RAG 引擎,请执行以下操作:

  1. 安装 Vertex AI SDK for Python

  2. 在 Google Cloud 控制台中运行以下命令以设置项目。

    gcloud config set {project}

  3. 运行此命令以授权您的登录。

    gcloud auth application-default login

运行 Vertex AI RAG Engine

将以下示例代码复制并粘贴到 Google Cloud 控制台中,以便运行 Vertex AI RAG Engine。

Python

from vertexai import rag
from vertexai.generative_models import GenerativeModel, Tool
import vertexai

# Create a RAG Corpus, Import Files, and Generate a response

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# display_name = "test_corpus"
# paths = ["https://drive.google.com/file/d/123", "gs://my_bucket/my_files_dir"]  # Supports Google Cloud Storage and Google Drive Links

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

# Create RagCorpus
# Configure embedding model, for example "text-embedding-005".
embedding_model_config = rag.RagEmbeddingModelConfig(
      vertex_prediction_endpoint=rag.VertexPredictionEndpoint(
          publisher_model="publishers/google/models/text-embedding-005"
      )
)

# Setup a display name of the corpus
display_name = "your_display_name"
backend_config = rag.RagVectorDbConfig(rag_embedding_model_config=embedding_model_config)

rag_corpus = rag.create_corpus(
    display_name=display_name,
    backend_config=backend_config,
)

# List the rag corpus you just created
rag.list_corpora()

# Import Files to the RagCorpus
# choose a corpus to import files to, you can use rag_corpus.name for just created corpus
# or use the name in list_corpora()
corpus_name = rag_corpus.name

transformation_config = rag.TransformationConfig(
      chunking_config=rag.ChunkingConfig(
          chunk_size=512,
          chunk_overlap=100,
      ),
  )

rag.import_files(
    corpus_name,
    paths,
    transformation_config=transformation_config, # Optional
    max_embedding_requests_per_min=1000,  # Optional
)

# Alternatively, you can use async import
response = await rag.import_files_async(
  corpus_name,
  paths,
  transformation_config=transformation_config, # Optional
  max_embedding_requests_per_min=1000,  # Optional
)
result = await response.result()
print(result)

# List the files in the rag corpus
rag.list_files(corpus_name)

# Direct context retrieval
rag_retrieval_config=rag.RagRetrievalConfig(
    top_k=3,  # Optional
    filter=rag.Filter(vector_distance_threshold=0.5)  # Optional
)
response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=corpus_name,
            # Optional: supply IDs from `rag.list_files()`.
            # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    text="What is RAG and why it is helpful?",
    rag_retrieval_config=rag_retrieval_config,
)
print(response)

# Enhance generation
# Create a RAG retrieval tool
rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus=corpus_name,  # Currently only 1 corpus is allowed.
                    # Optional: supply IDs from `rag.list_files()`.
                    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
                )
            ],
            rag_retrieval_config=rag_retrieval_config,
        ),
    )
)
# Create a gemini model instance
rag_model = GenerativeModel(
    model_name="gemini-2.0-flash-001", tools=[rag_retrieval_tool]
)

# Generate response
response = rag_model.generate_content("What is RAG and why it is helpful?")
print(response.text)
# Example response:
#   RAG stands for Retrieval-Augmented Generation.
#   It's a technique used in AI to enhance the quality of responses
# ...

后续步骤