Transcode mainframe data moved to Google Cloud using a virtual tape library
Transcoding data locally on a mainframe is a CPU-intensive process that results in high million instructions per second (MIPS) consumption. To avoid this, you can use Cloud Run to move and transcode mainframe data remotely on Google Cloud. This frees up your mainframe for business-critical tasks and also reduces MIPS consumption.
If you want to move very large volumes of data (around 500 GB per day or more) from your mainframe to Google Cloud, and don't want to use your mainframe for this effort, you can use a cloud-enabled Virtual Tape Library (VTL) solution to transfer the data to a Cloud Storage bucket. You can then use Cloud Run to transcode the data present in the bucket and move it to BigQuery.
This page discusses how to read mainframe data copied into a Cloud Storage bucket, transcode it from the extended binary coded decimal interchange code (EBCDIC) dataset to the ORC format in UTF-8, and load the dataset into a BigQuery table.
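Note: This page doesn't explain how to copy your data from your mainframe to a Cloud Storage bucket. The procedure described in this page assumes that you have already moved your mainframe data to a Cloud Storage bucket.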
The following diagram shows how you can move your mainframe data to a Cloud Storage bucket using a VTL solution, transcode the data to the ORC format using Cloud Run, and then move the content to BigQuery.
Remotely transcode mainframe data using VTL
Before you begin
Choose a VTL solution that suits your requirements, move your mainframe data to a Cloud Storage bucket, and save it as a .dat file. Ensure that you add a metadata key named x-goog-meta-lrecl to the uploaded .dat file, and that the value of this metadata key equals the record length of the original file, for example 80.
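For example, one way to set this metadata key after the upload is with the gsutil setmeta command; the object path in this sketch is illustrative:
gsutil setmeta -h "x-goog-meta-lrecl:80" gs://BUCKET/PREFIX/INPUT_FILENAME.dat
Deploy Mainframe Connector on Cloud Run.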
On your mainframe, set the GCSDSNURI environment variable to the prefix that you used for your mainframe data in the Cloud Storage bucket.
export GCSDSNURI="gs://BUCKET/PREFIX"
Replace the following:
BUCKET: The name of the Cloud Storage bucket.
PREFIX: The prefix that you want to use in the bucket.
Create a service account or identify an existing service account to use with Mainframe Connector. This service account must have permissions to access Cloud Storage buckets, BigQuery datasets, and any other Google Cloud resources that you want to use.
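Ensure that the service account you created is granted the Cloud Run Invoker role.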
Transcode mainframe data uploaded to a Cloud Storage bucket
To move mainframe data to Google Cloud using VTL and transcode it remotely, you must perform the following tasks:
Read and transcode the data present in the Cloud Storage bucket to the ORC format. The transcoding operation converts the mainframe EBCDIC dataset to the ORC format in UTF-8.
Load the dataset into a BigQuery table.
(Optional) Run a SQL query on the BigQuery table.
(Optional) Export data from BigQuery into a binary file in Cloud Storage.
To perform these tasks, follow these steps:
On your mainframe, create a job that reads the data from a .dat file in a Cloud Storage bucket and transcodes it to the ORC format, as follows.
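Note the following:
Not all Google Cloud commands support remote transcoding. For more information, see the Mainframe Connector API reference.
Variables with the suffix FILLER are ignored during the import process.
From version 5.12.0 onwards, Mainframe Connector replaces hyphens ("-") with underscores ("_") in variable names. To keep hyphens in your variable names, disable this automatic conversion by setting BQSH_FEATURE_CONVERT_UNDERSCORE_IN_FIELDS_NAME to false.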
For the complete list of environment variables supported by Mainframe Connector, see Environment variables.
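Example JCL
//STEP01 EXEC BQSH
//COPYBOOK DD DISP=SHR,DSN=<HLQ>.COPYBOOK.FILENAME
//STDIN DD *
gsutil cp --replace gs://mybucket/tablename.orc \
  --inDsn INPUT_FILENAME \
  --remoteHost <mainframe-connector-url>.a.run.app \
  --remotePort 443 \
  --project_id PROJECT_NAME
/*
Replace the following:
PROJECT_NAME: The name of the project in which you want to execute the query.
INPUT_FILENAME: The name of the .dat file that you uploaded to the Cloud Storage bucket.
If you want to log the commands executed during this process, you can enable load statistics.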
(Optional) Create and submit a BigQuery query job that executes a SQL read from the QUERY DD file. Typically, the query will be a MERGE or SELECT INTO DML statement that results in the transformation of a BigQuery table. Note that Mainframe Connector logs job metrics but doesn't write query results to a file.
You can query BigQuery in various ways: inline, with a separate dataset using DD, or with a separate dataset using DSN.
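Example JCL
//STEP03 EXEC BQSH
//QUERY DD DSN=<HLQ>.QUERY.FILENAME,DISP=SHR
//STDIN DD *
PROJECT=PROJECT_NAME
LOCATION=LOCATION
bq query --project_id=$PROJECT \
  --location=$LOCATION
/*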
Replace the following:
PROJECT_NAME: The name of the project in which you want to execute the query.
LOCATION: The location where the query will be executed. We recommend that you execute the query in a location close to the data.
(Optional) Create and submit an export job that executes a SQL read from the QUERY DD file, and exports the resulting dataset to Cloud Storage as a binary file.
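Example JCL
//STEP04 EXEC BQSH
//OUTFILE DD DSN=<HLQ>.DATA.FILENAME,DISP=SHR
//COPYBOOK DD DISP=SHR,DSN=<HLQ>.COPYBOOK.FILENAME
//QUERY DD DSN=<HLQ>.QUERY.FILENAME,DISP=SHR
//STDIN DD *
PROJECT=PROJECT_NAME
DATASET_ID=DATASET_ID
DESTINATION_TABLE=DESTINATION_TABLE
BUCKET=BUCKET
bq export --project_id=$PROJECT \
  --dataset_id=$DATASET_ID \
  --destination_table=$DESTINATION_TABLE \
  --location="US" \
  --bucket=$BUCKET \
  --remoteHost <mainframe-connector-url>.a.run.app \
  --remotePort 443
/*
Replace the following:
PROJECT_NAME: The name of the project in which you want to execute the query.
DATASET_ID: The BigQuery dataset ID that contains the table that you want to export.
DESTINATION_TABLE: The BigQuery table that you want to export.
BUCKET: The Cloud Storage bucket that will contain the output binary file.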
[[["Mudah dipahami","easyToUnderstand","thumb-up"],["Memecahkan masalah saya","solvedMyProblem","thumb-up"],["Lainnya","otherUp","thumb-up"]],[["Sulit dipahami","hardToUnderstand","thumb-down"],["Informasi atau kode contoh salah","incorrectInformationOrSampleCode","thumb-down"],["Informasi/contoh yang saya butuhkan tidak ada","missingTheInformationSamplesINeed","thumb-down"],["Masalah terjemahan","translationIssue","thumb-down"],["Lainnya","otherDown","thumb-down"]],["Terakhir diperbarui pada 2025-08-19 UTC."],[],[],null,["# Transcode mainframe data moved to Google Cloud using virtual tape library\n\nTranscoding data locally on a mainframe is a CPU-intensive process that results\nin high million instructions per second (MIPS) consumption. To avoid this, you\ncan use Cloud Run to move and transcode mainframe data remotely on\nGoogle Cloud. This frees up your mainframe for business critical tasks and also\nreduces MIPS consumption.\n\nIf you want to move very large volumes of data (around 500 GB per day or more)\nfrom your mainframe to Google Cloud, and don't want to use your mainframe for\nthis effort, you can use a cloud-enabled [Virtual Tape Library (VTL)](https://en.wikipedia.org/wiki/Virtual_tape_library) solution to transfer the data to a Cloud Storage\nbucket. You can then use Cloud Run to transcode data present in the\nbucket and move it to BigQuery.\n\nThis page discusses how to read mainframe data copied into a Cloud Storage\nbucket, transcode it from the extended binary coded decimal interchange code\n(EBCDIC) dataset to the ORC format in UTF-8, and load the dataset to a\nBigQuery table.\n| **Note:** This page doesn't explain you how to copy your data from your mainframe to a Cloud Storage bucket. The procedure described in this page starts with the assumption that you've moved your mainframe data to a Cloud Storage bucket.\n\nThe following diagram shows how you can move your mainframe data to a\nCloud Storage bucket using a VTL solution, transcode the data to the ORC\nformat using Cloud Run, and then move the content to BigQuery.\n\n\u003cbr /\u003e\n\nRemotely transcode mainframe data using VTL\n\n\u003cbr /\u003e\n\nBefore you begin\n----------------\n\n- Choose a VTL solution that suits your requirements and move your mainframe data to a Cloud Storage bucket and save it as a `.dat`. Ensure that you add a [metadata key](/storage/docs/viewing-editing-metadata#command-line_1) named `x-goog-meta-lrecl` to the uploaded `.dat` file, and that the metadata key length is equal to the original file's record length, for example 80.\n- [Deploy Mainframe Connector on Cloud Run](/mainframe-connector/docs/deploy-mainframe-connector).\n- In your mainframe, set the `GCSDSNURI` environment variable to the prefix that you have used for your mainframe data on Cloud Storage bucket. \n\n ```\n export GCSDSNURI=\"gs://BUCKET/PREFIX\"\n ```\n Replace the following:\n - \u003cvar translate=\"no\"\u003eBUCKET\u003c/var\u003e: The name of the Cloud Storage bucket.\n - \u003cvar translate=\"no\"\u003ePREFIX\u003c/var\u003e: The prefix that you want to use in the bucket.\n- [Create a service account](/iam/docs/service-accounts-create) or identify an existing service account to use with Mainframe Connector. 
This service account must have permissions to access Cloud Storage buckets, BigQuery datasets, and any other Google Cloud resource that you want to use.\n- Ensure that the service account you created is assigned the [Cloud Run Invoker role](/run/docs/reference/iam/roles#run.invoker).\n\nTranscode mainframe data uploaded to a Cloud Storage bucket\n-----------------------------------------------------------\n\nTo move mainframe data to Google Cloud using VTL and transcode remotely,\nyou must perform the following tasks:\n\n1. Read and transcode the data present in a Cloud Storage bucket to the ORC format. The transcoding operation converts a mainframe EBCDIC dataset to the ORC format in UTF-8.\n2. Load the dataset to a BigQuery table.\n3. (Optional) Execute a SQL query on the BigQuery table.\n4. (Optional) Export data from BigQuery into a binary file in Cloud Storage.\n\nTo perform these tasks, follow these steps:\n\n1. In your mainframe, create a job to read the data from a `.dat`\n file in a Cloud Storage bucket, and transcode it to ORC format, as follows.\n\n | **Note**\n | - Not all Google Cloud commands support remote transcoding. For more information, see [Mainframe Connector API reference](/mainframe-connector/docs/reference).\n | - Variables with the suffix FILLER are ignored during the import process.\n | - From version 5.12.0 onwards, Mainframe Connector replaces hyphens (\"-\") with underscores (\"_\") in variable names. If you want to keep hyphens in your variable names, disable this automatic conversion by setting the database variable `BQSH_FEATURE_CONVERT_UNDERSCORE_IN_FIELDS_NAME` to `false`.\n\n For the complete list of environment variables supported by\n Mainframe Connector, see [Environment variables](/mainframe-connector/docs/environment-variables). \n\n //STEP01 EXEC BQSH\n //COPYBOOK DD DISP=SHR,DSN=\u003cHLQ\u003e.COPYBOOK.FILENAME\n //STDIN DD *\n gsutil cp --replace gs://mybucket/tablename.orc \\\n --inDsn \u003cvar translate=\"no\"\u003eINPUT_FILENAME\u003c/var\u003e \\\n --remoteHost \u003cmainframe-connector-url\u003e.a.run.app \\\n --remotePort 443 \\\n --project_id \u003cvar translate=\"no\"\u003ePROJECT_NAME\u003c/var\u003e\n /*\n\n Replace the following:\n - \u003cvar translate=\"no\"\u003ePROJECT_NAME\u003c/var\u003e: The name of the project in which you want to execute the query.\n - \u003cvar translate=\"no\"\u003eINPUT_FILENAME\u003c/var\u003e: The name of the `.dat` file that you uploaded to a Cloud Storage bucket.\n\n If you want to log the commands executed during this process, you can [enable load statistics](/mainframe-connector/docs/reference#enable_load_statistics).\n2. (Optional) Create and submit a BigQuery query job that executes a SQL read from\n the [QUERY DD file](/mainframe-connector/docs/reference#dataset-names).\n Typically the query will be a `MERGE` or `SELECT INTO DML`\n statement that results in transformation of a BigQuery table. Note\n that Mainframe Connector logs in job metrics but doesn't write query\n results to a file.\n\n You can query BigQuery in various ways-inline, with a separate\n dataset using DD, or with a separate dataset using DSN. 
\n\n Example JCL\n //STEP03 EXEC BQSH\n //QUERY DD DSN=\u003cHLQ\u003e.QUERY.FILENAME,DISP=SHR\n //STDIN DD *\n PROJECT=\u003cvar translate=\"no\"\u003ePROJECT_NAME\u003c/var\u003e\n LOCATION=\u003cvar translate=\"no\"\u003eLOCATION\u003c/var\u003e\n bq query --project_id=$PROJECT \\\n --location=$LOCATION/*\n /*\n\n Replace the following:\n - \u003cvar translate=\"no\"\u003ePROJECT_NAME\u003c/var\u003e: The name of the project in which you want to execute the query.\n - \u003cvar translate=\"no\"\u003eLOCATION\u003c/var\u003e: The location for where the query will be executed. We recommended that you execute the query in a location close to the data.\n3. (Optional) Create and submit an export job that executes a SQL read from the\n [QUERY DD file](/mainframe-connector/docs/reference#dataset-names), and exports\n the resulting dataset to Cloud Storage as a binary file.\n\n Example JCL\n //STEP04 EXEC BQSH\n //OUTFILE DD DSN=\u003cHLQ\u003e.DATA.FILENAME,DISP=SHR\n //COPYBOOK DD DISP=SHR,DSN=\u003cHLQ\u003e.COPYBOOK.FILENAME\n //QUERY DD DSN=\u003cHLQ\u003e.QUERY.FILENAME,DISP=SHR\n //STDIN DD *\n PROJECT=\u003cvar translate=\"no\"\u003ePROJECT_NAME\u003c/var\u003e\n DATASET_ID=\u003cvar translate=\"no\"\u003eDATASET_ID\u003c/var\u003e\n DESTINATION_TABLE=\u003cvar translate=\"no\"\u003eDESTINATION_TABLE\u003c/var\u003e\n BUCKET=\u003cvar translate=\"no\"\u003eBUCKET\u003c/var\u003e\n bq export --project_id=$PROJECT \\\n --dataset_id=$DATASET_ID \\\n --destination_table=$DESTINATION_TABLE \\\n --location=\"US\" \\\n --bucket=$BUCKET \\\n --remoteHost \u003cmainframe-connector-url\u003e.a.run.app \\\n --remotePort 443\n /*\n\n Replace the following:\n - \u003cvar translate=\"no\"\u003ePROJECT_NAME\u003c/var\u003e: The name of the project in which you want to execute the query.\n - \u003cvar translate=\"no\"\u003eDATASET_ID\u003c/var\u003e: The BigQuery dataset ID that contains the table that you want to export.\n - \u003cvar translate=\"no\"\u003eDESTINATION_TABLE\u003c/var\u003e: The BigQuery table that you want to export.\n - \u003cvar translate=\"no\"\u003eBUCKET\u003c/var\u003e: The Cloud Storage bucket that will contain the output binary file."]]