Send feedback
Run LLM inference on Cloud Run GPUs with vLLM (services)
Stay organized with collections
Save and categorize content based on your preferences.
Preview
— GPU support for Cloud Run services
This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section
of the Service Specific Terms .
Pre-GA features are available "as is" and might have limited support.
For more information, see the
launch stage descriptions .
The following codelab shows how to run a backend service that runs vLLM , which is an
inference engine for production systems, along with Google's Gemma 2 , which is
a 2 billion parameters instruction-tuned model.
See the entire codelab at Run LLM inference on Cloud Run GPUs with vLLM .
Send feedback
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License , and code samples are licensed under the Apache 2.0 License . For details, see the Google Developers Site Policies . Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-09-25 UTC.
[{
"type": "thumb-down",
"id": "hardToUnderstand",
"label":"Hard to understand"
},{
"type": "thumb-down",
"id": "incorrectInformationOrSampleCode",
"label":"Incorrect information or sample code"
},{
"type": "thumb-down",
"id": "missingTheInformationSamplesINeed",
"label":"Missing the information/samples I need"
},{
"type": "thumb-down",
"id": "otherDown",
"label":"Other"
}]
[{
"type": "thumb-up",
"id": "easyToUnderstand",
"label":"Easy to understand"
},{
"type": "thumb-up",
"id": "solvedMyProblem",
"label":"Solved my problem"
},{
"type": "thumb-up",
"id": "otherUp",
"label":"Other"
}]
Need to tell us more?
{"lastModified": "Last updated 2024-09-25 UTC."}
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2024-09-25 UTC."]]