This page contains information about connecting Looker to Apache Spark 3.
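Looker connects to Apache Spark 3+ through a JDBC connection to the Spark Thrift Server.
Note: Looker does not support connections to Apache Spark 1.5 or Apache Spark 2. Queries on connections to Apache Spark 1.5 or Apache Spark 2 will return an error.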
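The Spark Thrift Server speaks the HiveServer2 protocol, so Hive JDBC clients typically address it with a URL of the form jdbc:hive2://host:port/database, optionally followed by ;key=value parameters. The sketch below shows how the Host, Port, Database, and Additional JDBC parameters fields described later on this page could map onto such a URL. It is illustrative only: the helper name build_hive_jdbc_url and the example host name are hypothetical, and the exact connection string that Looker builds internally is not documented here.

```python
from typing import Iterable, Optional


def build_hive_jdbc_url(
    host: str,
    port: int = 10000,          # Thrift server default port, per this page
    database: str = "default",  # assumed fallback schema name
    extra_params: Optional[Iterable[str]] = None,
) -> str:
    """Assemble a HiveServer2-style JDBC URL.

    Hypothetical helper for illustration; this is not a Looker API.
    """
    url = f"jdbc:hive2://{host}:{port}/{database}"
    for param in extra_params or []:
        # Append each parameter as a ';key=value' segment. A leading ';'
        # (the form used on this page) is stripped so it is not doubled.
        url += ";" + param.lstrip(";")
    return url


if __name__ == "__main__":
    # Example host name is made up for the sketch.
    print(build_hive_jdbc_url(
        "spark-thrift.example.internal",
        extra_params=[";auth=noSasl"],
    ))
```

Comparing a URL built this way against the values entered in the connection form can help when troubleshooting with another JDBC client.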
Encrypting network traffic
It is a best practice to encrypt network traffic between the Looker application and your database. Consider one of the options described on the Enabling secure database access documentation page.
Creating the Looker connection to your database
In the Admin section of Looker, select Connections, and then click Add Connection.
Fill out the connection details. The majority of the settings are common to most database dialects. See the Connecting Looker to your database documentation page for information. Some of the settings are described next:
Name: The name of the connection. This is how the connection will be referred to in the LookML model.
Dialect: Select Apache Spark 3+.
Host: The Thrift server host.
Port: The Thrift server port (10000 by default).
Database: The default schema/database that will be modeled. When no database is specified for a table, this will be assumed.
Username: The user that Looker will authenticate as.
Password: The optional password for the Looker user.
Enable PDTs: Use this toggle to enable persistent derived tables. When PDTs are enabled, the Connection window reveals additional PDT settings and the PDT Overrides section.
Temp Database: A temporary schema/database for storing PDTs. It must be created beforehand, with a statement such as CREATE SCHEMA looker_scratch;.
Additional JDBC parameters: Add any additional Hive JDBC parameters here, such as:
;spark.sql.inMemoryColumnarStorage.compressed=true
;auth=noSasl
SSL: Leave this unchecked.
Database Time Zone: The time zone of data stored in Spark. Usually it can be left blank or set to UTC.
Query Time Zone: The time zone to display data queried in Looker.
To verify that the connection is successful, click Test. See the Testing database connectivity documentation page for troubleshooting information.
To save these settings, click Connect.
Feature support
For Looker to support some features, your database dialect must also support them.
Apache Spark 3+
Apache Spark 3+ supports the following features as of Looker 25.12:
Next steps
After you have created the connection, set authentication options.
Last updated 2025-08-14 UTC.