# Dataproc optional Presto component

**Note:** In Dataproc image versions 2.1 and later, the Presto optional component is available as the Trino optional component.

You can install additional components like Presto when you create a Dataproc
cluster by using the
Optional components
feature. This page describes how you can optionally install the Presto component
on a Dataproc cluster.
Presto (Trino) is an open
source distributed SQL query engine. The Presto server and
Web UI are by default available on port 8060 (or port 7778 if Kerberos is
enabled) on the cluster's first master node.
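If you don't use the Component Gateway, one common way to reach these ports is an SSH tunnel to the first master node. The cluster name and zone below are placeholder values, and the `-m` suffix assumes the default Dataproc master-node naming convention:

```shell
# Hypothetical cluster master node and zone; substitute your own values.
# Forwards local port 8060 to the Presto Web UI on the first master node.
gcloud compute ssh my-cluster-m \
    --zone=us-central1-a \
    -- -L 8060:localhost:8060
```

With the tunnel open, the Web UI would be reachable at `http://localhost:8060` in a local browser.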
By default, Presto on Dataproc is configured to work with the `Hive`, `BigQuery`,
`Memory`, `TPCH`, and `TPCDS` connectors.
After creating a cluster with the Presto component, you can run queries:

- from a local terminal with the `gcloud dataproc jobs submit presto` command
- from a terminal window on the cluster's first master node using the `presto` CLI (Command Line Interface); see Use Trino with Dataproc
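As a sketch of the first option, a query could be submitted from a local terminal as follows; the cluster name, region, and query text are placeholder values:

```shell
# Hypothetical cluster and region; --execute passes an inline SQL query.
gcloud dataproc jobs submit presto \
    --cluster=my-cluster \
    --region=us-central1 \
    --execute="SELECT * FROM system.runtime.nodes"
```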
## Install the component

Install the component when you create a Dataproc cluster.
Components can be added to clusters created with
Dataproc version 1.3 and later. See
Supported Dataproc versions
for the component version included in each Dataproc image release.

### gcloud command

To create a Dataproc cluster that includes the Presto component,
use the
`gcloud dataproc clusters create` cluster-name
command with the `--optional-components` flag.
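When creating the cluster, also pass the `--enable-component-gateway` flag to enable connecting to the Presto Web UI using the Component Gateway. For example:

```
gcloud dataproc clusters create cluster-name \
    --optional-components=PRESTO \
    --region=region \
    --enable-component-gateway \
    ... other flags
```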
#### Configuring properties

Add the `--properties` flag to the
`gcloud dataproc clusters create` command to set
`presto`, `presto-jvm`, and `presto-catalog` configuration properties.
- **Application properties:** Use cluster properties with the
  `presto:` prefix to configure
  Presto application properties. For example, `--properties="presto:join-distribution-type=AUTOMATIC"`.
- **JVM configuration properties:** Use cluster properties with the
  `presto-jvm:` prefix to configure JVM properties for the Presto
  coordinator and worker Java processes. For example,
  `--properties="presto-jvm:XX:+HeapDumpOnOutOfMemoryError"`.
- **Creating new catalogs and adding catalog properties:** Use
  `presto-catalog:catalog-name.property-name`
  to configure Presto catalogs.
**Example:** The following `--properties` flag can be used
with the `gcloud dataproc clusters create` command to create a Presto cluster
with a "prodhive" Hive catalog. A `prodhive.properties` file will
be created under `/usr/lib/presto/etc/catalog/` to enable the
prodhive catalog.
```
--properties="presto-catalog:prodhive.connector.name=hive-hadoop2,presto-catalog:prodhive.hive.metastore.uri=thrift://localhost:9083"
```

### REST API

The Presto component can be specified through the Dataproc API using
SoftwareConfig.Component as part of a
clusters.create request.

Using the Dataproc `v1` API, set the EndpointConfig.enableHttpPortAccess
property to `true` as part of the clusters.create request to enable connecting
to the Presto Web UI using the Component Gateway.

### Console

1. Enable the component and component gateway.
   - In the Google Cloud console, open the Dataproc Create a cluster page. The Set up cluster panel is selected.
   - In the Components section:
     - Under Optional components, select Presto and other optional components to install on your cluster.
     - Under Component Gateway, select Enable component gateway (see Viewing and Accessing Component Gateway URLs).

Last updated 2025-08-25 UTC.
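As a sketch of the REST approach, the JSON body of such a clusters.create request might look like the following. The cluster name is hypothetical, and the field names follow the REST API's camelCase convention:

```python
import json

# Hypothetical cluster name; softwareConfig.optionalComponents requests the
# Presto component, and endpointConfig.enableHttpPortAccess turns on the
# Component Gateway so the Presto Web UI is reachable.
cluster = {
    "clusterName": "my-presto-cluster",
    "config": {
        "softwareConfig": {"optionalComponents": ["PRESTO"]},
        "endpointConfig": {"enableHttpPortAccess": True},
    },
}

print(json.dumps(cluster, indent=2))
```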