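Improve transcription results with model adaptation

Overview

You can use model adaptation to make Speech-to-Text recognize specific words or phrases more often than other options it might otherwise suggest. For example, suppose your audio data frequently includes the word "weather". When Speech-to-Text encounters that word, you want it transcribed as "weather" rather than "whether". In this case, you can use model adaptation to bias Speech-to-Text toward recognizing "weather".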

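Model adaptation is particularly helpful in the following use cases:

Improving the accuracy of words and phrases that occur frequently in your audio data. For example, you can alert the recognition model to voice commands that your users typically speak.
Expanding the vocabulary of words that Speech-to-Text recognizes. Speech-to-Text includes a very large vocabulary. However, if your audio data often contains words that are rare in general language use (such as proper names or domain-specific terms), you can add them using model adaptation.
Improving the accuracy of speech transcription when the supplied audio is noisy or not very clear.

To find out whether model adaptation is available for your language, see the language support page.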

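Improve recognition of words and phrases

To increase the probability that Speech-to-Text recognizes the word "weather" when it transcribes your audio data, you can pass the single word "weather" in the PhraseSet object of a SpeechAdaptation resource.

When you provide a multi-word phrase, Speech-to-Text is more likely to recognize those words in sequence. Providing a phrase also increases the probability of recognizing portions of the phrase, including individual words. See the content limits page for limits on the number and size of these phrases.

Optionally, you can fine-tune the strength of model adaptation using the model adaptation boost feature.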
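
As a rough illustration, the adaptation part of a recognition config that biases toward "weather" could be assembled as follows. This is only a sketch: the helper name `build_adaptation` is hypothetical, and the JSON field names (`adaptation`, `phraseSets`, `phrases`, `value`) are taken from the request examples later on this page.

```python
import json


def build_adaptation(phrases):
    """Return an inline adaptation block holding one phrase set (hypothetical helper)."""
    return {"adaptation": {"phraseSets": [{"phrases": [{"value": p} for p in phrases]}]}}


# Merge the adaptation block into an otherwise minimal recognition config.
config = {"languageCode": "en-US", **build_adaptation(["weather"])}
print(json.dumps(config, indent=2))
```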

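Improve recognition using classes

Classes represent common concepts that occur in natural language, such as monetary units and calendar dates. A class lets you improve transcription accuracy for large groups of words that map to a common concept but that don't always include identical words or phrases.

For example, suppose your audio data includes recordings of people saying their street address. You might have a recording of someone saying "My house is 123 Main Street, the fourth house on the left." In this case, you want Speech-to-Text to recognize the first sequence of numerals ("123") as an address rather than as an ordinal ("one-hundred twenty-third"). However, not everyone lives at "123 Main Street", and it's impractical to list every possible street address in a PhraseSet resource. Instead, you can use a class to indicate that a street number should be recognized no matter what the number actually is. In this example, Speech-to-Text could then more accurately transcribe phrases like "123 Main Street" and "987 Grand Boulevard", because the numbers are both recognized as address numbers.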

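Class tokens

To use a class in model adaptation, include a class token in the phrases field of a PhraseSet resource. Refer to the list of supported class tokens to see which tokens are available for your language. For example, to improve the transcription of address numbers in your source audio, provide the value $ADDRESSNUM in your SpeechContext object.

You can use classes either as stand-alone items in the phrases array or embed one or more class tokens in longer multi-word phrases. For example, you can indicate an address number in a larger phrase by including the class token in a string: ["my address is $ADDRESSNUM"]. However, this phrase does not help when the audio contains a similar but non-identical phrase, such as "I am at 123 Main Street". To aid recognition of similar phrases, also include the class token by itself: ["my address is $ADDRESSNUM", "$ADDRESSNUM"]. If you use an invalid or malformed class token, Speech-to-Text ignores the token without triggering an error, and still uses the rest of the phrase for context.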
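
The advice about also listing each class token on its own can be automated. The sketch below is an illustration under assumptions: the helper name is hypothetical, and the regular expression assumes that prebuilt class tokens look like `$` followed by upper-case letters, digits, or underscores, as `$ADDRESSNUM` does.

```python
import re

# Assumed token shape: "$" followed by upper-case letters, digits, or underscores.
CLASS_TOKEN = re.compile(r"\$[A-Z][A-Z0-9_]*")


def add_standalone_tokens(phrases):
    """Append every class token found inside a longer phrase as its own phrase."""
    result = list(phrases)
    for phrase in phrases:
        for token in CLASS_TOKEN.findall(phrase):
            if token not in result:
                result.append(token)
    return result


phrases = add_standalone_tokens(["my address is $ADDRESSNUM"])
print(phrases)  # ['my address is $ADDRESSNUM', '$ADDRESSNUM']
```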

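Custom classes

You can also create your own CustomClass, a class composed of your own custom list of related items or values. For example, suppose you want to transcribe audio data that is likely to include the name of any one of several hundred regional restaurants. Restaurant names are relatively rare in general speech and are therefore less likely to be chosen as the "correct" answer by the recognition model. A custom class lets you bias the recognition model toward correctly identifying these names when they appear in your audio.

To use a custom class, create a CustomClass resource that includes each restaurant name as a ClassItem. Custom classes function in the same way as the prebuilt class tokens. A phrase can include both prebuilt class tokens and custom classes.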
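
The following sketch shows how the payloads for such a custom class might be put together. The restaurant names, the `restaurants` class ID, and the helper names are hypothetical; the field names (`customClassId`, `items`, `value`) and the `${...}` reference syntax mirror the request examples later on this page.

```python
def custom_class_items(values):
    """Body that sets the items of a CustomClass, one item per value."""
    return {"items": [{"value": v} for v in values]}


def class_reference(resource_name):
    """Wrap a CustomClass resource name in the ${...} reference syntax."""
    return "${" + resource_name + "}"


# Hypothetical data, for illustration only.
create_body = {"customClassId": "restaurants"}
update_body = custom_class_items(["Luigi's Kitchen", "Blue Door Cafe"])

ref = class_reference("projects/project_id/locations/global/customClasses/restaurants")
# One phrase that mixes a custom class reference with a prebuilt class token.
phrase = {"value": f"deliver from {ref} to $ADDRESSNUM"}
print(phrase["value"])
```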

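ABNF grammars

You can also use grammars in augmented Backus-Naur form (ABNF) to specify patterns of words. Including an ABNF grammar in the model adaptation of your request increases the probability that Speech-to-Text recognizes all words that match the specified grammar.

To use this feature, include an ABNF grammar object in the SpeechAdaptation field of your request. ABNF grammars can also include references to CustomClass and PhraseSet resources. To learn more about the syntax for this field, see the Speech Recognition Grammar Specification and the code sample later on this page.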

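Fine-tune transcription results using boost

By default, model adaptation should already provide a sufficient effect in most cases. The model adaptation boost feature lets you increase the recognition model bias by assigning more weight to some phrases than to others. We recommend that you implement boost only if 1) you have already implemented model adaptation, and 2) you want to further adjust the strength of model adaptation effects on your transcription results.

For example, suppose you have many recordings of people asking about the "fare to get into the county fair", in which the word "fair" occurs more frequently than "fare". In this case, you can use model adaptation to increase the probability that the model recognizes both "fair" and "fare" by adding them as phrases in a PhraseSet resource. This tells Speech-to-Text to recognize "fair" and "fare" more often than, for example, "hare" or "lair".

However, "fair" should be recognized more often than "fare" because it appears more frequently in the audio. You might have already transcribed your audio with the Speech-to-Text API and found a high number of errors recognizing the correct word ("fair"). In this case, you might additionally use phrases with boost to assign a higher boost value to "fair" than to "fare". The higher weighted value assigned to "fair" biases the Speech-to-Text API toward picking "fair" more often than "fare". Without boost values, the recognition model recognizes "fair" and "fare" with equal probability.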
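
A phrase set for this scenario might look like the sketch below. The boost numbers are illustrative assumptions, not recommended values; the only point is that "fair" carries more weight than "fare".

```python
# Illustrative boost values only; suitable numbers depend on your own audio.
phrase_set = {
    "phrases": [
        {"value": "fair", "boost": 10},  # more frequent in the audio, so weighted higher
        {"value": "fare", "boost": 5},
    ]
}

# List the phrases from strongest to weakest bias.
by_boost = sorted(phrase_set["phrases"], key=lambda p: p["boost"], reverse=True)
print([p["value"] for p in by_boost])  # ['fair', 'fare']
```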

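Boost basics

When you use boost, you assign a weighted value to phrase items in a PhraseSet resource. Speech-to-Text refers to this weighted value when selecting a possible transcription for words in your audio data. The higher the value, the more likely Speech-to-Text is to choose that word or phrase from the possible alternatives.

For example, suppose you want to assign a boost value to the phrase "My favorite exhibit at the American Museum of Natural History is the blue whale". If you add that phrase to a phrase object and assign a boost value, the recognition model is more likely to recognize the phrase in its entirety, word for word.

If boosting a multi-word phrase does not give you the results you expect, we suggest adding all bigrams (two words, in order) that make up the phrase as additional phrase items, and assigning a boost value to each. Continuing the example above, you could also consider adding further bigrams and N-grams (more than two words), such as "my favorite", "my favorite exhibit", "favorite exhibit", "my favorite exhibit at the American Museum of Natural History", "American Museum of Natural History", "blue whale", and so on. The Speech-to-Text recognition model is then more likely to recognize related phrases in your audio that contain parts of the original boosted phrase but don't match it word for word.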
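
Generating the bigrams and longer N-grams of a phrase is mechanical, so it can be scripted. The helper below is a hypothetical sketch that splits on whitespace only; the boost value of 10 is an arbitrary placeholder. For long phrases the number of N-grams grows quickly, so check the result against the content limits before sending it.

```python
def ngrams(phrase, min_n=2, max_n=None):
    """Return every run of min_n..max_n consecutive words, shortest runs first."""
    words = phrase.split()
    if max_n is None:
        max_n = len(words)
    grams = []
    for n in range(min_n, max_n + 1):
        for start in range(len(words) - n + 1):
            grams.append(" ".join(words[start:start + n]))
    return grams


grams = ngrams("my favorite exhibit")
print(grams)  # ['my favorite', 'favorite exhibit', 'my favorite exhibit']

# Turn each N-gram into a boosted phrase item (placeholder boost value).
phrase_items = [{"value": g, "boost": 10} for g in grams]
```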

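Set boost values

A boost value must be a floating-point number between 0 and 20; in practice, 20 is the upper limit. For best results, experiment by adjusting your boost values up or down until you get accurate transcription results.

Higher boost values can result in fewer false negatives, which are cases where the word or phrase occurred in the audio but Speech-to-Text did not recognize it correctly. However, boost can also increase the likelihood of false positives, which are cases where the word or phrase appears in the transcription even though it did not occur in the audio.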
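
When you experiment with boost values, it helps to count both kinds of error on a small set of recordings whose correct transcripts you already know. The function below is a simplified sketch, not part of the Speech-to-Text API: it handles a single target word, compares case-insensitively, and ignores punctuation handling. The sample transcripts are invented for illustration.

```python
def count_errors(target, pairs):
    """Count false negatives and false positives for one target word.

    pairs is an iterable of (reference_transcript, recognized_transcript).
    """
    target = target.lower()
    false_negatives = false_positives = 0
    for reference, recognized in pairs:
        in_reference = target in reference.lower().split()
        in_recognized = target in recognized.lower().split()
        if in_reference and not in_recognized:
            false_negatives += 1  # spoken, but missing from the transcript
        elif in_recognized and not in_reference:
            false_positives += 1  # transcribed, but never spoken
    return false_negatives, false_positives


# Invented transcripts: "fare" was always transcribed as "fair".
pairs = [
    ("what is the fare to the county fair", "what is the fair to the county fair"),
    ("how much is the fare", "how much is the fair"),
]
print(count_errors("fare", pairs))
print(count_errors("fair", pairs))
```

Running the same count after each change to a boost value shows whether false negatives are falling faster than false positives are rising.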

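Receive timeout notifications

Speech-to-Text responses include a SpeechAdaptationInfo field, which provides information about model adaptation behavior during recognition. If a timeout related to model adaptation occurred, adaptationTimeout is true and timeoutMessage specifies which adaptation configuration caused the timeout. When a timeout occurs, model adaptation has no effect on the returned transcript.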
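
A client could check for this condition after each request, roughly as sketched below. The JSON key `speechAdaptationInfo` is an assumption (the lowerCamelCase form of the field name), and the sample response is hand-written for illustration rather than captured from the service.

```python
import json


def adaptation_timeout_message(response):
    """Return the timeout message if adaptation timed out, otherwise None."""
    info = response.get("speechAdaptationInfo", {})  # assumed JSON key
    if info.get("adaptationTimeout"):
        return info.get("timeoutMessage", "")
    return None


# Hand-written sample, not real service output.
sample = json.loads(
    '{"results": [], "speechAdaptationInfo":'
    ' {"adaptationTimeout": true, "timeoutMessage": "example message"}}'
)

message = adaptation_timeout_message(sample)
if message is not None:
    print("Model adaptation was not applied:", message)
```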

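Example use case using model adaptation

The following example walks you through using model adaptation to transcribe an audio recording of someone saying "call me fionity and oh my gosh what do we have here ionity". In this case it is important that the model identifies "fionity" and "ionity" correctly.

The following command performs recognition on the audio without model adaptation. The resulting transcript is incorrect: "call me Fiona tea and oh my gosh what do we have here I own a day".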

curl -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/speech:recognize" \
  -d '{"config": {"languageCode": "en-US"}, "audio": {"uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"}}'

Request example:

{
   "config":{
      "languageCode":"en-US"
   },
   "audio":{
      "uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"
   }
}
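
For readers who prefer a script to curl, the same request can be prepared with Python's standard library. This is a minimal sketch: the function name is hypothetical, `ACCESS_TOKEN` is a placeholder for the token that the `gcloud` command above prints, and the request is only built here, not sent.

```python
import json
import urllib.request

RECOGNIZE_URL = "https://speech.googleapis.com/v1p1beta1/speech:recognize"


def build_recognize_request(access_token, config, audio_uri):
    """Prepare (but do not send) a POST request to the recognize endpoint."""
    body = json.dumps({"config": config, "audio": {"uri": audio_uri}}).encode("utf-8")
    return urllib.request.Request(
        RECOGNIZE_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


request = build_recognize_request(
    "ACCESS_TOKEN",  # placeholder; substitute a real access token
    {"languageCode": "en-US"},
    "gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav",
)
# To send it: urllib.request.urlopen(request), then json.load() the response.
print(request.get_method(), request.full_url)
```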

Improve transcription using a `PhraseSet`

Create a PhraseSet:

curl -X POST -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/projects/project_id/locations/global/phraseSets" \
  -d '{"phraseSetId": "test-phrase-set-1"}'

Request example:

{
   "phraseSetId":"test-phrase-set-1"
}

Get the PhraseSet:

curl -X GET -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/projects/project_id/locations/global/phraseSets/test-phrase-set-1"

Add the phrases "fionity" and "ionity" to the PhraseSet and assign a boost value of 10 to each:

curl -X PATCH -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/projects/project_id/locations/global/phraseSets/test-phrase-set-1?updateMask=phrases" \
  -d '{"phrases": [{"value": "ionity", "boost": 10}, {"value": "fionity", "boost": 10}]}'

The PhraseSet has now been updated to:

{
   "phrases":[
      {
         "value":"ionity",
         "boost":10
      },
      {
         "value":"fionity",
         "boost":10
      }
   ]
}
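
The update URL and body used above can also be derived from their parts. The sketch below uses a hypothetical helper; `project_id` remains a placeholder, as in the curl commands, and the `updateMask=phrases` query parameter limits the update to the phrases field.

```python
from urllib.parse import urlencode

BASE_URL = "https://speech.googleapis.com/v1p1beta1"


def phrase_set_patch(project_id, phrase_set_id, boosts):
    """Return the PATCH URL and JSON body that replace a PhraseSet's phrases."""
    name = f"projects/{project_id}/locations/global/phraseSets/{phrase_set_id}"
    url = f"{BASE_URL}/{name}?" + urlencode({"updateMask": "phrases"})
    body = {"phrases": [{"value": value, "boost": boost} for value, boost in boosts.items()]}
    return url, body


url, body = phrase_set_patch("project_id", "test-phrase-set-1", {"ionity": 10, "fionity": 10})
print(url)
print(body)
```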

Recognize the audio again, this time using model adaptation and the PhraseSet created previously. The transcript is now correct: "call me fionity and oh my gosh what do we have here ionity".

curl -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/speech:recognize" \
  -d '{"config": {"adaptation": {"phrase_set_references": ["projects/project_id/locations/global/phraseSets/test-phrase-set-1"]}, "languageCode": "en-US"}, "audio": {"uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"}}'

Request example:

{
   "config":{
      "adaptation":{
         "phrase_set_references":[
            "projects/project_id/locations/global/phraseSets/test-phrase-set-1"
         ]
      },
      "languageCode":"en-US"
   },
   "audio":{
      "uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"
   }
}
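
Because the request refers to the PhraseSet by its full resource name, a small helper can keep that name consistent across requests. This sketch uses hypothetical helper names; the resource-name pattern and the `phrase_set_references` field are copied from the example above.

```python
def phrase_set_name(project_id, phrase_set_id, location="global"):
    """Build the full resource name of a PhraseSet."""
    return f"projects/{project_id}/locations/{location}/phraseSets/{phrase_set_id}"


def config_with_references(language_code, phrase_set_names):
    """Recognition config that points at stored PhraseSet resources by name."""
    return {
        "adaptation": {"phrase_set_references": list(phrase_set_names)},
        "languageCode": language_code,
    }


config = config_with_references(
    "en-US", [phrase_set_name("project_id", "test-phrase-set-1")]
)
print(config)
```

Referencing a stored PhraseSet keeps requests small and lets you update the phrases without changing the code that sends recognition requests.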

Improve transcription results using a `CustomClass`

Create a CustomClass:

curl -X POST -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/projects/project_id/locations/global/customClasses" \
  -d '{"customClassId": "test-custom-class-1"}'

Request example:

{
   "customClassId": "test-custom-class-1"
}

Get the CustomClass:

curl -X GET -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/projects/project_id/locations/global/customClasses/test-custom-class-1"

Recognize the test audio clip. The CustomClass is empty, so the returned transcript is still incorrect: "call me Fiona tea and oh my gosh what do we have here I own a day".

curl -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/speech:recognize" \
  -d '{"config": {"adaptation": {"phraseSets": [{"phrases": [{"value": "${projects/project_id/locations/global/customClasses/test-custom-class-1}", "boost": "10"}]}]}, "languageCode": "en-US"}, "audio": {"uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"}}'

Request example:

{
   "config":{
      "adaptation":{
         "phraseSets":[
            {
               "phrases":[
                  {
                     "value":"${projects/project_id/locations/global/customClasses/test-custom-class-1}",
                     "boost":"10"
                  }
               ]
            }
         ]
      },
      "languageCode":"en-US"
   },
   "audio":{
      "uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"
   }
}

Add the phrases "fionity" and "ionity" to the custom class:

curl -X PATCH -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/projects/project_id/locations/global/customClasses/test-custom-class-1?updateMask=items" \
  -d '{"items": [{"value": "ionity"}, {"value": "fionity"}]}'

This updates the custom class to the following:

{
   "items":[
      {
         "value":"ionity"
      },
      {
         "value":"fionity"
      }
   ]
}

Recognize the sample audio again, this time with "fionity" and "ionity" in the CustomClass. The transcript is now correct: "call me fionity and oh my gosh what do we have here ionity".

curl -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/speech:recognize" \
  -d '{"config": {"adaptation": {"phraseSets": [{"phrases": [{"value": "${projects/project_id/locations/global/customClasses/test-custom-class-1}", "boost": "10"}]}]}, "languageCode": "en-US"}, "audio": {"uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"}}'

Request example:

{
   "config":{
      "adaptation":{
         "phraseSets":[
            {
               "phrases":[
                  {
                     "value":"${projects/project_id/locations/global/customClasses/test-custom-class-1}",
                     "boost":"10"
                  }
               ]
            }
         ]
      },
      "languageCode":"en-US"
   },
   "audio":{
      "uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"
   }
}

Refer to a `CustomClass` in a `PhraseSet`

Update the PhraseSet resource created earlier so that it refers to the CustomClass:

curl -X PATCH -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/projects/project_id/locations/global/phraseSets/test-phrase-set-1?updateMask=phrases" \
  -d '{"phrases": [{"value": "${projects/project_id/locations/global/customClasses/test-custom-class-1}", "boost": 10}]}'

Request example:

{
   "phrases":[
      {
         "value":"${projects/project_id/locations/global/customClasses/test-custom-class-1}",
         "boost":10
      }
   ]
}

Recognize the audio using the PhraseSet resource, which now refers to the CustomClass. The transcript is correct: "call me fionity and oh my gosh what do we have here ionity".

curl -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/speech:recognize" \
  -d '{"config": {"adaptation": {"phrase_set_references": ["projects/project_id/locations/global/phraseSets/test-phrase-set-1"]}, "languageCode": "en-US"}, "audio": {"uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"}}'

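Request example:

{
   "config":{
      "adaptation":{
         "phrase_set_references":[
            "projects/project_id/locations/global/phraseSets/test-phrase-set-1"
         ]
      },
      "languageCode":"en-US"
   },
   "audio":{
      "uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"
   }
}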

Delete the `CustomClass` and `PhraseSet`

Delete the PhraseSet:

curl -X DELETE -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/projects/project_id/locations/global/phraseSets/test-phrase-set-1"

Delete the CustomClass:

curl -X DELETE -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/projects/project_id/locations/global/customClasses/test-custom-class-1"

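Improve transcription results using an `ABNF Grammar`

Recognize the audio using an abnf_grammar. This example refers to a CustomClass resource (projects/project_id/locations/global/customClasses/test-custom-class-1), an inline CustomClass (test-custom-class-2), a class token (ADDRESSNUM), and a PhraseSet resource (projects/project_id/locations/global/phraseSets/test-phrase-set-1). The first rule in the strings, after the external declarations, is treated as the root.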

Request example:

{
   "config":{
      "adaptation":{
         "abnf_grammar":{
            "abnf_strings":[
               "external ${projects/project_id/locations/global/phraseSets/test-phrase-set-1}",
               "external ${projects/project_id/locations/global/customClasses/test-custom-class-1}",
               "external ${test-custom-class-2}",
               "external $ADDRESSNUM",
               "$root = $test-phrase-set-1 $name lives in $ADDRESSNUM;",
               "$name = $title $test-custom-class-1 $test-custom-class-2;",
               "$title = Mr | Mrs | Miss | Dr | Prof ;"
            ]
         }
      }
   }
}
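
The grammar strings follow a regular pattern (external declarations first, then the root rule, then the remaining rules), so they can be assembled by small helpers. The sketch below only performs string assembly in the style of the example above; the helper names are hypothetical, the grammar is a reduced illustration, and nothing here has been validated against the service.

```python
def external(reference):
    """Declare an externally defined class, token, or phrase set."""
    return f"external {reference}"


def rule(name, expansion):
    """Format one grammar rule, terminated by a semicolon."""
    return f"${name} = {expansion};"


def alternatives(options):
    """Join options with '|' so that any one of them matches."""
    return " | ".join(options)


abnf_strings = [
    external("$ADDRESSNUM"),
    rule("root", "$title lives in $ADDRESSNUM"),  # first rule after externals is the root
    rule("title", alternatives(["Mr", "Mrs", "Miss", "Dr", "Prof"])),
]
adaptation = {"abnf_grammar": {"abnf_strings": abnf_strings}}
print("\n".join(abnf_strings))
```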

What's next

Learn how to use model adaptation in a request to Speech-to-Text.
Review the list of supported class tokens.
