Method: projects.locations.publishers.models.serverStreamingPredict

Perform a server-side streaming online prediction request for Vertex LLM streaming.

HTTP request

POST https://{service-endpoint}/v1beta1/{endpoint}:serverStreamingPredict

Where {service-endpoint} is one of the supported service endpoints.
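As a minimal sketch, the request URL can be assembled from the regional service endpoint and the endpoint resource name; the project, location, and endpoint ID values below are hypothetical placeholders:

```python
# Illustrative only: builds the serverStreamingPredict request URL from a
# regional service endpoint and an endpoint resource name. Both values
# here are made-up placeholders, not real resources.
SERVICE_ENDPOINT = "us-central1-aiplatform.googleapis.com"
ENDPOINT = "projects/my-project/locations/us-central1/endpoints/1234567890"

url = f"https://{SERVICE_ENDPOINT}/v1beta1/{ENDPOINT}:serverStreamingPredict"
print(url)
```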

Path parameters

endpoint

string

Required. The name of the Endpoint requested to serve the prediction. Format: projects/{project}/locations/{location}/endpoints/{endpoint}

Request body

The request body contains data with the following structure:

JSON representation

{
  "inputs": [
    {
      object (Tensor)
    }
  ],
  "parameters": {
    object (Tensor)
  }
}

Fields

inputs[]

object (Tensor)

The prediction input.

parameters

object (Tensor)

The parameters that govern the prediction.
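A request body might be constructed as in the sketch below. The specific Tensor fields shown (stringVal, structVal, floatVal, intVal) and the parameter names are illustrative assumptions; the exact tensor shape depends on the model being served, so consult the Tensor schema for your model:

```python
import json

# Sketch of a serverStreamingPredict request body. The Tensor field
# names used here (stringVal, structVal, floatVal, intVal) and the
# parameter keys are assumptions for illustration only.
body = {
    "inputs": [
        {"stringVal": ["What is Vertex AI?"]}  # the prediction input
    ],
    "parameters": {
        "structVal": {
            "temperature": {"floatVal": [0.2]},
            "maxOutputTokens": {"intVal": [256]},
        }
    },
}

# Serialize to JSON for the POST body.
payload = json.dumps(body)
print(payload)
```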

Response body

If successful, the response body contains a stream of StreamingPredictResponse instances.
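When a server-streaming gRPC method is invoked over REST, the streamed messages are typically delivered as elements of a single JSON array. Assuming that framing, collecting the chunks from a fully buffered response payload could look like the sketch below; the sample payload and its field names are fabricated for illustration:

```python
import json

def collect_stream(payload: str) -> list:
    """Parse a buffered server-streaming REST payload, assumed to be a
    JSON array of StreamingPredictResponse messages."""
    return json.loads(payload)

# Fabricated example payload containing two streamed response chunks.
sample = (
    '[{"outputs": [{"stringVal": ["Hello"]}]},'
    ' {"outputs": [{"stringVal": [", world"]}]}]'
)
chunks = collect_stream(sample)
print(len(chunks))
```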

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the endpoint resource:

  • aiplatform.endpoints.predict

For more information, see the IAM documentation.