# BigQuery connector
You can use a BigQuery connector to enable programmatic read and write
access to [BigQuery](/bigquery). This is an ideal way to process
data that is stored in BigQuery. Command-line access is not exposed.
The BigQuery connector is a library that enables Spark and Hadoop
applications to process data from BigQuery and write data to
BigQuery using its native terminology.

| The [GoogleCloudDataproc/spark-bigquery-connector](https://github.com/GoogleCloudDataproc/spark-bigquery-connector) is also available for reading data from BigQuery. It takes advantage of the [BigQuery Storage API](/bigquery/docs/reference/storage).
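
The following PySpark sketch illustrates what a basic programmatic read through the connector looks like. It is a minimal example, assuming the connector library is already available on the cluster (as it is on Dataproc images), and it reads a public BigQuery sample table:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("bq-read").getOrCreate()

# "bigquery" is the data source name the connector registers with Spark.
# bigquery-public-data.samples.shakespeare is a public sample table.
df = spark.read.format("bigquery") \
    .load("bigquery-public-data.samples.shakespeare")

df.printSchema()
df.show(5)
```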
Pricing
-------
When using the connector, charges include [BigQuery usage fees](/bigquery/pricing).
The following service-specific charges may also apply:

- [Cloud Storage](/storage) - the connector downloads data into a Cloud Storage bucket before or during job execution. After the job successfully completes, the data is deleted from Cloud Storage. You are charged for this storage according to [Cloud Storage pricing](/storage/pricing). To avoid excess charges, check your Cloud Storage account and remove unneeded temporary files.
- [BigQuery Storage API](/bigquery/docs/reference/storage) - to achieve better performance, the connector reads data using the BigQuery Storage API. You are charged for this usage according to [BigQuery Storage API pricing](/bigquery/pricing#storage-api).
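
To make the Cloud Storage charge concrete, the following minimal PySpark sketch performs an indirect write, which stages data in a bucket you name before loading it into BigQuery. The `my-staging-bucket` and `mydataset.mytable` names are placeholders, and the `temporaryGcsBucket` option is assumed from the Spark BigQuery Connector's indirect write path:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("bq-write").getOrCreate()

df = spark.createDataFrame([("alice", 1), ("bob", 2)], ["name", "score"])

# Indirect writes stage the DataFrame in a Cloud Storage bucket first;
# this staged data is the temporary storage the pricing note above
# refers to. Bucket and table names here are placeholders.
df.write.format("bigquery") \
    .option("temporaryGcsBucket", "my-staging-bucket") \
    .save("mydataset.mytable")
```

Newer connector versions also offer a direct write method over the BigQuery Storage Write API that skips the staging bucket; see the connector's documentation for the option that enables it.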
Available connectors
--------------------

The following BigQuery connectors are available for use in
the Hadoop ecosystem:
1. The [Spark BigQuery Connector](https://github.com/GoogleCloudDataproc/spark-bigquery-connector) adds a Spark data source, which allows DataFrames to interact directly with BigQuery tables using Spark's `read` and `write` operations (see the sketch following this list).
2. The [Hive BigQuery Connector](https://github.com/GoogleCloudDataproc/hive-bigquery-connector) adds a Storage Handler, which allows Apache Hive to interact directly with BigQuery tables using HiveQL syntax.
3. The [Hadoop BigQuery Connector](https://github.com/GoogleCloudDataproc/hadoop-connectors) allows Hadoop mappers and reducers to interact with BigQuery tables using abstracted versions of the [InputFormat](http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/InputFormat.html) and [OutputFormat](http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/OutputFormat.html) classes.
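
As a sketch of how the Spark data source behaves on reads, the example below assumes the connector supports column pruning and an optional `filter` setting that are pushed down through the BigQuery Storage API, so only the selected columns and matching rows are transferred; treat the option name as an assumption rather than canonical usage:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("bq-pushdown").getOrCreate()

# Column selection and the "filter" option are assumed to be pushed
# down to BigQuery, limiting what is read over the Storage API.
df = spark.read.format("bigquery") \
    .option("filter", "word_count > 100") \
    .load("bigquery-public-data.samples.shakespeare") \
    .select("word", "word_count")

df.show(10)
```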
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-08-25 UTC."],[[["\u003cp\u003eThe BigQuery connector enables Spark and Hadoop applications to programmatically read and write data to BigQuery, without direct command-line access.\u003c/p\u003e\n"],["\u003cp\u003eThe Spark BigQuery Connector, Hive BigQuery Connector, and Hadoop BigQuery Connector are available options for integrating BigQuery with Spark, Hive, and Hadoop, respectively.\u003c/p\u003e\n"],["\u003cp\u003eUtilizing the connector incurs charges for BigQuery usage, Cloud Storage for temporary data, and the BigQuery Storage API for optimized data retrieval.\u003c/p\u003e\n"],["\u003cp\u003eThe connector leverages the BigQuery Storage API to enhance performance when reading data, and it downloads data to a temporary Cloud Storage bucket during job execution.\u003c/p\u003e\n"],["\u003cp\u003eQuick start guides are available for Spark and Java MapReduce to assist users in implementing the BigQuery connector in their workflows.\u003c/p\u003e\n"]]],[],null,["# BigQuery connector\n\nYou can use a BigQuery connector to enable programmatic read and write\naccess to [BigQuery](/bigquery). This is an ideal way to process\ndata that is stored in BigQuery. Command-line access is not exposed.\nThe BigQuery connector is a library that enables Spark and Hadoop\napplications to process data from BigQuery and write data to\nBigQuery using its native terminology.\n| The [GoogleCloudDataproc/spark-bigquery-connector](https://github.com/GoogleCloudDataproc/spark-bigquery-connector) is also available for reading data from BigQuery. It takes advantage of the [BigQueryStorage API](/bigquery/docs/reference/storage).\n\nPricing\n-------\n\nWhen using the connector, charges include [BigQuery usage fees](/bigquery/pricing).\nThe following service-specific charges may also apply:\n\n- [Cloud Storage](/storage) - the connector downloads data into a Cloud Storage bucket before or during job execution. After the job successfully completes, the data is deleted from Cloud Storage. You are charged for this storage according to [Cloud Storage pricing](/storage/pricing). To avoid excess charges, check your Cloud Storage account and remove unneeded temporary files.\n- [BigQuery Storage API](/bigquery/docs/reference/storage) - to achieve better performance, the connector reads data using the BigQuery Storage API. You are charged for this usage according to [BigQuery Storage API pricing](/bigquery/pricing#storage-api).\n\nAvailable connectors\n--------------------\n\nThe following BigQuery connectors are available for use in\nthe Hadoop ecosystem:\n\n1. The [Spark BigQuery Connector](https://github.com/GoogleCloudDataproc/spark-bigquery-connector) adds a Spark data source, which allows DataFrames to interact directly with BigQuery tables using Spark's `read` and `write` operations.\n2. The [Hive BigQuery Connector](https://github.com/GoogleCloudDataproc/hive-bigquery-connector) adds a Storage Handler, which allows Apache Hive to interact directly with BigQuery tables using HiveQL syntax.\n3. 
The [Hadoop BigQuery Connector](https://github.com/GoogleCloudDataproc/hadoop-connectors) allows Hadoop mappers and reducers to interact with BigQuery tables using abstracted versions of the [InputFormat](http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/InputFormat.html) and [OutputFormat](http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/OutputFormat.html) classes.\n\nUse the connectors\n------------------\n\nFor a quick start using the BigQuery connector, see the following examples:\n\n- [Spark example](/dataproc/docs/tutorials/bigquery-connector-spark-example)\n- [Java MapReduce example](/dataproc/docs/tutorials/bigquery-connector-mapreduce-example)\n- [Connect Dataproc cluster to BigQuery](https://console.cloud.google.com/?walkthrough_id=dataproc--dataproc-bq-spark-connector)\n\nWhat's next\n-----------\n\n- Learn more about [BigQuery](/bigquery).\n- Follow the [BigQuery example for Spark](/dataproc/docs/tutorials/bigquery-connector-spark-example).\n- Learn more about the [Hive BigQuery Connector](/dataproc/docs/concepts/connectors/hive-bigquery).\n- Follow the [BigQuery example for Java MapReduce](/dataproc/docs/tutorials/bigquery-connector-mapreduce-example)."]]