# Create a conversation dataset

A conversation dataset contains conversation transcript data, and is used to
train either a Smart Reply or a Summarization custom model.
[Smart Reply](/agent-assist/docs/smart-reply) uses the conversation transcripts
to recommend text responses to human agents conversing with an end-user.
[Summarization custom models](/agent-assist/docs/summarization-console)
are trained on conversation datasets that contain both transcripts and
**annotation** data. They use the annotations to generate conversation
summaries for human agents after a conversation has completed.
There are two ways to create a dataset: using the Console tutorial workflows,
or manually creating a dataset in the Console using the **Data** >
**Datasets** tab. We recommend that you use the Console tutorials as a first
option. To use the Console tutorials, navigate to the
[Agent Assist Console](https://agentassist.cloud.google.com)
and click the **Get started** button under the feature you'd like to test.

This page demonstrates how to create a dataset manually.
Before you begin
----------------
1. Follow the [Dialogflow setup](/dialogflow/es/docs/quick/setup)
   instructions to enable Dialogflow on a Google Cloud Platform project.
2. We recommend that you read the Agent Assist
   [basics](/agent-assist/docs/basics) page before starting this tutorial.
3. If you are implementing Smart Reply using your own transcript data, make
   sure your transcripts are in `JSON` in the specified
   [format](/agent-assist/docs/conversation-data-format#conversation_transcript_data)
   and stored in a
   [Google Cloud Storage bucket](/storage/docs/creating-buckets). A
   conversation dataset must contain at least 30,000 conversations; otherwise
   model training will fail. As a general rule, the more conversations you
   have, the better your model quality will be. We suggest that you remove
   any conversations with fewer than 20 messages or 3 conversation turns
   (changes in which participant is making an utterance). We also suggest
   that you remove any bot messages or messages automatically generated by
   systems (for example, "Agent enters the chat room"). We recommend that you
   upload at least 3 months of conversations to ensure coverage of as many
   use cases as possible. The maximum number of conversations in a
   conversation dataset is 1,000,000.
4. If you are implementing Summarization using your own transcript and
   annotation data, make sure your transcripts are in the specified
   [format](/agent-assist/docs/summarization#summarization_training_data)
   and stored in a
   [Google Cloud Storage bucket](/storage/docs/creating-buckets). The
   recommended minimum number of training annotations is 1,000. The enforced
   minimum is 100.
5. Navigate to the [Agent Assist Console](https://agentassist.cloud.google.com).
   Select your Google Cloud Platform project, then click the **Data** menu
   option on the far left margin of the page. The **Data** menu displays all
   of your data. There are two tabs, one each for **conversation datasets**
   and **knowledge bases**.
6. Click the **conversation datasets** tab, then the **+Create new** button
   at the top right of the conversation datasets page.
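The data-size rules in steps 3 and 4 above can be sketched as a pre-upload check. This is a minimal illustration only: the message fields used here (`role`, `text`) and the role labels for automated messages are assumptions for the sketch, not the real schema, which is defined by the linked conversation data format reference.

```python
# Illustrative pre-upload checks for the dataset requirements above.
# The "role"/"text" fields and SYSTEM_ROLES labels are hypothetical; see the
# linked conversation transcript format reference for the actual schema.

MIN_MESSAGES = 20           # drop conversations shorter than this
MIN_TURNS = 3               # a turn = a change in which participant speaks
MIN_CONVERSATIONS = 30_000  # enforced minimum for Smart Reply training
SYSTEM_ROLES = {"BOT", "SYSTEM"}  # hypothetical labels for automated messages


def count_turns(messages):
    """Count changes in which participant is making an utterance."""
    turns, prev = 0, None
    for m in messages:
        if m["role"] != prev:
            turns, prev = turns + 1, m["role"]
    return turns


def clean_conversation(messages):
    """Strip automated messages; return None if the conversation is too short."""
    human = [m for m in messages if m["role"] not in SYSTEM_ROLES]
    if len(human) < MIN_MESSAGES or count_turns(human) < MIN_TURNS:
        return None
    return human


def check_dataset(conversations, num_annotations=None):
    """Filter conversations and warn when the documented minimums are not met."""
    kept = [c for c in (clean_conversation(m) for m in conversations) if c]
    if len(kept) < MIN_CONVERSATIONS:
        print(f"{len(kept)} conversations: Smart Reply training needs 30,000.")
    if num_annotations is not None and num_annotations < 100:
        print(f"{num_annotations} annotations: Summarization enforces a "
              "minimum of 100 (1,000 recommended).")
    return kept
```

The thresholds mirror this page; only the transcript shape is invented for the example.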
Create a conversation dataset
-----------------------------
1. Enter a **Name** and an optional **Description** for your new dataset. In
   the **Conversation data** field, enter the URI of the storage bucket that
   contains your conversation transcripts. Agent Assist supports use of the
   `*` symbol for wildcard matching. The URI should have the following
   format:
       gs://<bucket name>/<object name>

   For example:

       gs://mydata/conversationjsons/conv0*.json
       gs://mydatabucket/test/conv.json

2. Click **Create**. Your new dataset now appears in the dataset list on the
   **Data** menu page under the **Conversation datasets** tab.

What's next
-----------

Train a [Smart Reply](/agent-assist/docs/smart-reply) or
[Summarization](/agent-assist/docs/summarization-console) model on
one or more conversation datasets
[using the Agent Assist console](/agent-assist/docs/model-training).

Last updated 2025-08-19 UTC.