# Vertex AI partner models for MaaS

Vertex AI supports a curated list of models developed by Google partners.
Partner models can be used with [Vertex AI](/vertex-ai) as a model as a
service (MaaS) and are offered as a managed API. When you use a partner model,
you continue to send your requests to Vertex AI endpoints. Partner models
are serverless, so there's no need to provision or manage infrastructure.

Partner models can be discovered using Model Garden. You can also
deploy models using Model Garden. For more information, see
[Explore AI models in Model Garden](/vertex-ai/generative-ai/docs/model-garden/explore-models).
While information about each available partner model can be found on its model
card in Model Garden, only third-party models that run as a
MaaS with Vertex AI are documented in this guide.

Anthropic's Claude and Mistral models are examples of third-party managed models
that are available to use on Vertex AI.

Partner models
--------------

The following partner models are offered as managed APIs on Vertex AI
Model Garden (MaaS):

Vertex AI partner model pricing with capacity assurance
-------------------------------------------------------

Google offers provisioned throughput for some partner models, which reserves
throughput capacity for your models for a fixed fee. You decide on the
throughput capacity and in which regions to reserve that capacity. Because
provisioned throughput requests are prioritized over standard pay-as-you-go
requests, provisioned throughput provides increased availability. When the
system is overloaded, your requests can still be completed as long as the
throughput remains under your reserved throughput capacity. For more information
or to subscribe to the service, [contact sales](/contact).
Last updated (UTC): 2025-08-18.
Regional and global endpoints
-----------------------------

For regional endpoints, requests are served from your specified region. In cases
where you have data residency requirements, or if a model doesn't support the
global endpoint, use the regional endpoints.

When you use the global endpoint, Google can process and serve your requests
from any region that is supported by the model that you are using, which might
result in higher latency in some cases. The global endpoint helps improve
overall availability and helps reduce errors.

There is no price difference with the regional endpoints when you use the global
endpoint. However, the global endpoint quotas and supported model capabilities
can differ from the regional endpoints. For more information, view the related
third-party model page.

### Specify the global endpoint

To use the global endpoint, set the region to `global`.

For example, the request URL for a curl command uses the following format:
`https://aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/global/publishers/PUBLISHER_NAME/models/MODEL_NAME`

For the Vertex AI SDK, a regional endpoint is the default.
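The URL format above can be captured in a small helper. This is an illustrative sketch only; `endpoint_url` and the example publisher/model values are hypothetical names, not part of any Google SDK:

```python
def endpoint_url(project_id: str, location: str, publisher: str, model: str) -> str:
    """Build a Vertex AI publisher-model endpoint URL (illustrative only).

    Regional endpoints use a region-prefixed host such as
    us-central1-aiplatform.googleapis.com; the global endpoint uses the
    bare aiplatform.googleapis.com host with the location set to "global".
    """
    host = ("aiplatform.googleapis.com" if location == "global"
            else f"{location}-aiplatform.googleapis.com")
    return (f"https://{host}/v1/projects/{project_id}/locations/{location}"
            f"/publishers/{publisher}/models/{model}")

# Hypothetical publisher and model values, for demonstration only:
print(endpoint_url("my-project", "global", "anthropic", "claude-model"))
```

The same helper also produces regional URLs when you pass a region such as `us-central1` as the location.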
Set the region to `GLOBAL` to use the global endpoint.

### Supported models

The global endpoint is available for the following models:

- [Claude Opus 4.1](/vertex-ai/generative-ai/docs/partner-models/claude/use-claude#regions)
- [Claude Opus 4](/vertex-ai/generative-ai/docs/partner-models/claude/use-claude#regions)
- [Claude Sonnet 4](/vertex-ai/generative-ai/docs/partner-models/claude/use-claude#regions)
- [Claude 3.7 Sonnet](/vertex-ai/generative-ai/docs/partner-models/claude/use-claude#regions)
- [Claude 3.5 Sonnet v2](/vertex-ai/generative-ai/docs/partner-models/claude/use-claude#regions)

| **Note:** Prompt Caching is supported when using the global endpoint. Provisioned Throughput isn't supported when using the global endpoint.

### Restrict global API endpoint usage

To help enforce the use of regional endpoints, use the
`constraints/gcp.restrictEndpointUsage` organization policy constraint to block
requests to the global API endpoint. For more information, see
[Restricting endpoint usage](/assured-workloads/docs/restrict-endpoint-usage).

Grant user access to partner models
-----------------------------------

To enable partner models and make a prompt request, a Google Cloud
administrator must [set the required permissions](#set-permissions) and [verify
that the organization policy allows the use of the required
APIs](#set-organization-policy).

### Set required permissions to use partner models

The following roles and permissions are required to use partner models:

- You must have the Consumer Procurement Entitlement Manager
  Identity and Access Management (IAM) role. Anyone who's been granted this role
  can enable partner models in Model Garden.

- You must have the `aiplatform.endpoints.predict` permission. This permission
  is included in the Vertex AI User IAM role.
For more
  information, see [Vertex AI
  User](/vertex-ai/docs/general/access-control#aiplatform.user) and
  [Access control](/vertex-ai/generative-ai/docs/access-control#permissions).

### Console

1. To grant the Consumer Procurement Entitlement Manager IAM
   role to a user, go to the **IAM** page.

   [Go to IAM](https://console.cloud.google.com/projectselector/iam-admin/iam?supportedpurview=)

2. In the **Principal** column, find the user
   [principal](/iam/docs/overview#concepts_related_identity) for which you
   want to enable access to partner models, and then click
   **Edit principal** in that row.

3. In the **Edit access** pane, click **Add another role**.

4. In **Select a role**, select **Consumer Procurement Entitlement Manager**.

5. In the **Edit access** pane, click **Add another role**.

6. In **Select a role**, select **Vertex AI User**.

7. Click **Save**.

### gcloud

1. In the Google Cloud console, activate Cloud Shell.

   [Activate Cloud Shell](https://console.cloud.google.com/?cloudshell=true)

2. Grant the Consumer Procurement Entitlement Manager role that's required
   to enable partner models in Model Garden:

       gcloud projects add-iam-policy-binding PROJECT_ID \
           --member=PRINCIPAL --role=roles/consumerprocurement.entitlementManager

3. Grant the Vertex AI User role, which includes the
   `aiplatform.endpoints.predict` permission that's required to make
   prompt requests:

       gcloud projects add-iam-policy-binding PROJECT_ID \
           --member=PRINCIPAL --role=roles/aiplatform.user

   Replace PRINCIPAL with the identifier for the principal.
The identifier takes the form
   `user|group|serviceAccount:email` or `domain:domain`. For example:
   `user:cloudysanfrancisco@gmail.com`,
   `group:admins@example.com`,
   `serviceAccount:test123@example.domain.com`, or
   `domain:example.domain.com`.

   The output is a list of policy bindings that includes the following:

       - members:
         - user:PRINCIPAL
         role: roles/consumerprocurement.entitlementManager

   For more information, see
   [Grant a single role](/iam/docs/granting-changing-revoking-access#grant-single-role)
   and
   [`gcloud projects add-iam-policy-binding`](/sdk/gcloud/reference/projects/add-iam-policy-binding).

### Set the organization policy for partner model access

To enable partner models, your organization policy must allow the following
API: Cloud Commerce Consumer Procurement API
(`cloudcommerceconsumerprocurement.googleapis.com`).

If your organization sets an organization policy to
[restrict service usage](/resource-manager/docs/organization-policy/restricting-resources),
then an organization administrator must verify that
`cloudcommerceconsumerprocurement.googleapis.com` is allowed by
[setting the organization policy](/resource-manager/docs/organization-policy/restricting-resources#setting_the_organization_policy).

Also, if you have an organization policy that restricts model usage in
Model Garden, the policy must allow access to partner models.
For more
information, see [Control model
access](/vertex-ai/generative-ai/docs/control-model-access).

### Partner model regulatory compliance

The [certifications](/security/compliance/services-in-scope) for
[Generative AI on Vertex AI](/vertex-ai/generative-ai/docs/overview) continue to
apply when partner models are used as a managed API through Vertex AI.
If you need details about the models themselves, additional information can be
found in the respective model card, or you can contact the respective model
publisher.

Your data is stored at rest within the selected region or multi-region for
partner models on Vertex AI, but the regionalization of data
processing may vary. For a detailed list of partner models' data processing
commitments, see [Data residency for partner
models](/vertex-ai/generative-ai/docs/learn/locations#ml-processing-partner-models).

Customer prompts and model responses are not shared with third parties when
you use the Vertex AI API, including partner models. Google only processes
Customer Data as instructed by the Customer, as further described in the
[Cloud Data Processing Addendum](/terms/data-processing-addendum).