Metadata federation is a service that lets you access multiple sources of metadata from a single endpoint.
To set up federation, you create a federation service and then configure your metadata sources. Afterward, the service exposes a single gRPC endpoint that you can use to access all of your metadata.
For example, using federation, you can create a Dataproc cluster that exposes multiple Dataproc Metastore services through a single endpoint. Afterward, you can run big data jobs through open-source software (OSS) engines, such as Spark or Hive, to access your metadata across multiple metastores.
How federation works
OSS big data workloads that run on Spark or Hive send requests to the Hive Metastore API to fetch metadata at runtime.
The Hive Metastore interface supports both read and write methods. The federation service exposes a gRPC version of the Hive Metastore interface.
At runtime, when the federation service receives a request, it checks the source ordering to retrieve the appropriate metadata.
Metadata sources
When you create a federation service, you must add a metadata source.
You can use the following sources as backend metastores:
A Dataproc Metastore instance.
A project containing one or more BigQuery datasets.
A Dataplex Universal Catalog Lake (Preview).
Source restrictions
The following sections list the restrictions that you must adhere to when using the various metadata sources.
All sources
The following restrictions apply to all metadata sources:
A federation service doesn't contain its own data. Instead, it only serves metadata from one of its metadata sources.
A federation service can't be a source of metadata in another federation service.
Dataproc Metastore
If you're using a Dataproc Metastore as a source, the following restrictions apply:
Federation services are only available through gRPC endpoints. To use a Dataproc Metastore with federation, create your metastore with a gRPC endpoint.
Federation services can be attached to either single-region or multi-region Dataproc Metastore services.
If the multi-regional metastore is in a different project than the federation service, grant the Dataproc Metastore service account of the federation project the metastore.services.get permission on the Dataproc Metastore instances configured in the multi-region.
BigQuery
If you're using a project that contains BigQuery datasets as a source, you must satisfy the following conditions:
Grant the correct Identity and Access Management (IAM) roles to access the project that contains the BigQuery datasets.
Add at least one Dataproc Metastore service as a source, along with your BigQuery datasets.
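Dataplex Universal Catalog Lakes
If you're using a Dataplex Universal Catalog Lake as a source, you must satisfy the following conditions:
Grant an IAM role that contains the dataplex.lakes.get permission.
Add at least one Dataproc Metastore service as a source, along with your Dataplex Universal Catalog Lake.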
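The structural restrictions above can be checked before you try to create a federation service. The following is a minimal sketch, not part of any Google Cloud client library: the names SourceType, Source, and validate_sources are hypothetical, and the check covers only the rules stated in this section (at least one source, no federation service as a source, and a Dataproc Metastore alongside any BigQuery or Dataplex source). IAM and endpoint-protocol requirements are not modeled.

```python
from dataclasses import dataclass
from enum import Enum


class SourceType(Enum):
    """Hypothetical labels for the source kinds described in this section."""
    DATAPROC_METASTORE = "dataproc_metastore"
    BIGQUERY = "bigquery"
    DATAPLEX_LAKE = "dataplex_lake"
    FEDERATION = "federation"  # present only so the check can reject it


@dataclass(frozen=True)
class Source:
    name: str  # illustrative identifier, not a real resource name format
    type: SourceType


def validate_sources(sources: list[Source]) -> list[str]:
    """Return a list of violations; an empty list means every check passed."""
    problems = []

    # A federation service must have at least one metadata source.
    if not sources:
        problems.append("a federation service needs at least one metadata source")

    # A federation service can't be a source in another federation service.
    for source in sources:
        if source.type is SourceType.FEDERATION:
            problems.append(f"{source.name}: a federation service can't be a source")

    # BigQuery and Dataplex sources must be paired with a Dataproc Metastore.
    needs_metastore = any(
        s.type in (SourceType.BIGQUERY, SourceType.DATAPLEX_LAKE) for s in sources
    )
    has_metastore = any(s.type is SourceType.DATAPROC_METASTORE for s in sources)
    if needs_metastore and not has_metastore:
        problems.append(
            "BigQuery and Dataplex sources require at least one "
            "Dataproc Metastore source"
        )

    return problems
```

For example, a list that contains only a BigQuery project yields one violation, while adding a Dataproc Metastore source to the same list yields none.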
Source ordering
Your federation service processes metadata requests in a priority order. This concept is known as source ordering. The metastore with the lowest rank is known as the primary metastore. At runtime, when the federation service receives a request, it checks the source ordering and completes one of the following actions:
If the request contains a database name. The request is routed to the backend metastore that contains the database name. If more than one metastore contains the same database name, the request is routed to the metastore with the lowest rank among them.
If the request creates or drops a database. The request is routed to the metastore with the lowest rank.
If the request doesn't contain a database name and it doesn't create or drop a database. The request is routed to the Dataproc Metastore instance with the lowest rank. One example of a Hive Metastore request that doesn't specify a database is set_ugi.
If none of the metastores contain the database. The OSS engine responds with the equivalent of a not-found error.
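The routing rules above can be sketched as a small function. This is an illustrative model only, not the service's implementation: Backend, DatabaseNotFound, and route are hypothetical names, and the decision to test for create or drop requests before looking up the database name is an assumption made so that those requests always reach the lowest-ranked metastore.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Backend:
    """Hypothetical model of one backend metastore in a federation service."""
    rank: int                     # lower rank means higher priority
    name: str
    is_dataproc_metastore: bool
    databases: set[str] = field(default_factory=set)


class DatabaseNotFound(Exception):
    """Stands in for the not-found error that the OSS engine reports."""


def route(
    backends: list[Backend],
    database: Optional[str] = None,
    creates_or_drops: bool = False,
) -> Backend:
    """Pick the backend for a request by following the source ordering rules."""
    if not backends:
        raise ValueError("no backend metastores configured")
    ordered = sorted(backends, key=lambda b: b.rank)

    # Requests that create or drop a database go to the lowest-ranked metastore.
    if creates_or_drops:
        return ordered[0]

    # Requests that name a database go to the lowest-ranked metastore holding it.
    if database is not None:
        for backend in ordered:
            if database in backend.databases:
                return backend
        raise DatabaseNotFound(database)

    # Requests without a database name go to the lowest-ranked Dataproc Metastore.
    for backend in ordered:
        if backend.is_dataproc_metastore:
            return backend
    raise ValueError("no Dataproc Metastore backend configured")
```

With three backends ranked 1 (a Dataproc Metastore holding sales and hr), 2 (a BigQuery project holding sales), and 3 (a Dataproc Metastore holding logs), a request for sales resolves to rank 1, a request for logs resolves to rank 3, and a request for an unknown database raises DatabaseNotFound.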
Last updated 2025-09-02 UTC.