分類內容

內容分類會分析文件內容並傳回符合文件文字內容的類別清單。如要將文件內容分類,請呼叫 classifyText 方法。

如要查看 classifyText 方法傳回的完整內容類別清單,請參閱這裡

您可以設定選用的 classificationModelOptions 欄位,為 classifyText 方法選擇要使用的模型:

本節說明如何將文件中的內容分類。請分別提交每份文件的要求。

分類內容

以下示範如何分類以字串提供的內容:

通訊協定

如要分類文件中的內容,請向 documents:classifyText REST 方法發出 POST 要求,並提供適當的要求主體,如同下列範例所示。

範例中使用的 gcloud auth application-default print-access-token 指令,可取得使用 Google Cloud Platform gcloud CLI 為專案設定的服務帳戶存取權杖。如需安裝 gcloud CLI、使用服務帳戶建立專案的操作說明,請參閱快速入門

curl -X POST \
     -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
     -H "Content-Type: application/json; charset=utf-8" \
     --data "{
  'document':{
    'type':'PLAIN_TEXT',
    'content':'Google, headquartered in Mountain View, unveiled the new Android
    phone at the Consumer Electronic Show.  Sundar Pichai said in his keynote
    that users love their new Android phones.'
  },
  'classificationModelOptions': {
    'v2Model': {
      'contentCategoriesVersion': 'V2',
    }
  }
}" "https://language.googleapis.com/v1/documents:classifyText"

Go

如要瞭解如何安裝及使用 Natural Language 的用戶端程式庫,請參閱 Natural Language 用戶端程式庫。詳情請參閱 Natural Language Go API 參考資料說明文件

如要向 Natural Language 進行驗證,請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。


func classifyText(ctx context.Context, client *language.Client, text string) (*languagepb.ClassifyTextResponse, error) {
	return client.ClassifyText(ctx, &languagepb.ClassifyTextRequest{
		Document: &languagepb.Document{
			Source: &languagepb.Document_Content{
				Content: text,
			},
			Type: languagepb.Document_PLAIN_TEXT,
		},
		ClassificationModelOptions: &languagepb.ClassificationModelOptions{
			ModelType: &languagepb.ClassificationModelOptions_V2Model_{
				V2Model: &languagepb.ClassificationModelOptions_V2Model{
					ContentCategoriesVersion: languagepb.ClassificationModelOptions_V2Model_V2,
				},
			},
		},
	})
}

Java

如要瞭解如何安裝及使用 Natural Language 的用戶端程式庫,請參閱 Natural Language 用戶端程式庫。詳情請參閱 Natural Language Java API 參考資料說明文件

如要向 Natural Language 進行驗證,請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

// Instantiate the Language client com.google.cloud.language.v2.LanguageServiceClient
try (LanguageServiceClient language = LanguageServiceClient.create()) {
  // Set content to the text string
  Document doc = Document.newBuilder().setContent(text).setType(Type.PLAIN_TEXT).build();
  ClassifyTextRequest request = ClassifyTextRequest.newBuilder().setDocument(doc).build();
  // Detect categories in the given text
  ClassifyTextResponse response = language.classifyText(request);

  for (ClassificationCategory category : response.getCategoriesList()) {
    System.out.printf(
        "Category name : %s, Confidence : %.3f\n",
        category.getName(), category.getConfidence());
  }
}

Node.js

如要瞭解如何安裝及使用 Natural Language 的用戶端程式庫,請參閱 Natural Language 用戶端程式庫。詳情請參閱 Natural Language Node.js API 參考資料說明文件

如要向 Natural Language 進行驗證,請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

// Imports the Google Cloud client library
const language = require('@google-cloud/language');

// Creates a client
const client = new language.LanguageServiceClient();

/**
 * TODO(developer): Uncomment the following line to run this code.
 */
// const text = 'Your text to analyze, e.g. Hello, world!';

// Prepares a document, representing the provided text
const document = {
  content: text,
  type: 'PLAIN_TEXT',
};

const classificationModelOptions = {
  v2Model: {
    contentCategoriesVersion: 'V2',
  },
};

// Classifies text in the document
const [classification] = await client.classifyText({
  document,
  classificationModelOptions,
});
console.log('Categories:');
classification.categories.forEach(category => {
  console.log(`Name: ${category.name}, Confidence: ${category.confidence}`);
});

Python

如要瞭解如何安裝及使用 Natural Language 的用戶端程式庫,請參閱 Natural Language 用戶端程式庫。詳情請參閱 Natural Language Python API 參考資料說明文件

如要向 Natural Language 進行驗證,請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

from google.cloud import language_v1


def sample_classify_text(text_content):
    """
    Classifying Content in a String

    Args:
      text_content The text content to analyze.
    """

    client = language_v1.LanguageServiceClient()

    # text_content = "That actor on TV makes movies in Hollywood and also stars in a variety of popular new TV shows."

    # Available types: PLAIN_TEXT, HTML
    type_ = language_v1.Document.Type.PLAIN_TEXT

    # Optional. If not specified, the language is automatically detected.
    # For list of supported languages:
    # https://cloud.google.com/natural-language/docs/languages
    language = "en"
    document = {"content": text_content, "type_": type_, "language": language}

    content_categories_version = (
        language_v1.ClassificationModelOptions.V2Model.ContentCategoriesVersion.V2
    )
    response = client.classify_text(
        request={
            "document": document,
            "classification_model_options": {
                "v2_model": {"content_categories_version": content_categories_version}
            },
        }
    )
    # Loop through classified categories returned from the API
    for category in response.categories:
        # Get the name of the category representing the document.
        # See the predefined taxonomy of categories:
        # https://cloud.google.com/natural-language/docs/categories
        print(f"Category name: {category.name}")
        # Get the confidence. Number representing how certain the classifier
        # is that this category represents the provided text.
        print(f"Confidence: {category.confidence}")

其他語言

C#:請按照用戶端程式庫頁面上的 C# 設定說明操作,然後參閱 .NET 適用的 Natural Language 參考說明文件

PHP:請按照用戶端程式庫頁面上的 PHP 設定操作說明操作,然後參閱 PHP 的 Natural Language 參考文件

Ruby:請按照用戶端程式庫頁面上的 Ruby 設定說明操作,然後參閱 Ruby 適用的 Natural Language 參考文件

分類 Cloud Storage 中的內容

以下是將 Cloud Storage 文字檔案中儲存的內容分類的範例:

通訊協定

如要將 Cloud Storage 中儲存的文件內容分類,請向 documents:classifyText REST 方法發出 POST 要求,並提供適當的要求主體及文件路徑,如同下列範例所示。

curl -X POST \
     -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
     -H "Content-Type: application/json; charset=utf-8" \
     --data "{
  'document':{
    'type':'PLAIN_TEXT',
    'gcsContentUri':'gs://<bucket-name>/<object-name>'
  }
  'classificationModelOptions': {
    'v1Model': {
    }
  }
}" "https://language.googleapis.com/v1/documents:classifyText"

Go

如要瞭解如何安裝及使用 Natural Language 的用戶端程式庫,請參閱 Natural Language 用戶端程式庫。詳情請參閱 Natural Language Go API 參考資料說明文件

如要向 Natural Language 進行驗證,請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。


func classifyTextFromGCS(ctx context.Context, gcsURI string) (*languagepb.ClassifyTextResponse, error) {
	return client.ClassifyText(ctx, &languagepb.ClassifyTextRequest{
		Document: &languagepb.Document{
			Source: &languagepb.Document_GcsContentUri{
				GcsContentUri: gcsURI,
			},
			Type: languagepb.Document_PLAIN_TEXT,
		},
	})
}

Java

如要瞭解如何安裝及使用 Natural Language 的用戶端程式庫,請參閱 Natural Language 用戶端程式庫。詳情請參閱 Natural Language Java API 參考資料說明文件

如要向 Natural Language 進行驗證,請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

// Instantiate the Language client com.google.cloud.language.v2.LanguageServiceClient
try (LanguageServiceClient language = LanguageServiceClient.create()) {
  // Set the GCS content URI path
  Document doc =
      Document.newBuilder().setGcsContentUri(gcsUri).setType(Type.PLAIN_TEXT).build();
  ClassifyTextRequest request = ClassifyTextRequest.newBuilder().setDocument(doc).build();
  // Detect categories in the given file
  ClassifyTextResponse response = language.classifyText(request);

  for (ClassificationCategory category : response.getCategoriesList()) {
    System.out.printf(
        "Category name : %s, Confidence : %.3f\n",
        category.getName(), category.getConfidence());
  }
}

Node.js

如要瞭解如何安裝及使用 Natural Language 的用戶端程式庫,請參閱 Natural Language 用戶端程式庫。詳情請參閱 Natural Language Node.js API 參考資料說明文件

如要向 Natural Language 進行驗證,請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

// Imports the Google Cloud client library.
const language = require('@google-cloud/language');

// Creates a client.
const client = new language.LanguageServiceClient();

/**
 * TODO(developer): Uncomment the following lines to run this code
 */
// const bucketName = 'Your bucket name, e.g. my-bucket';
// const fileName = 'Your file name, e.g. my-file.txt';

// Prepares a document, representing a text file in Cloud Storage
const document = {
  gcsContentUri: `gs://${bucketName}/${fileName}`,
  type: 'PLAIN_TEXT',
};

// Classifies text in the document
const [classification] = await client.classifyText({document});

console.log('Categories:');
classification.categories.forEach(category => {
  console.log(`Name: ${category.name}, Confidence: ${category.confidence}`);
});

Python

如要瞭解如何安裝及使用 Natural Language 的用戶端程式庫,請參閱 Natural Language 用戶端程式庫。詳情請參閱 Natural Language Python API 參考資料說明文件

如要向 Natural Language 進行驗證,請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

from google.cloud import language_v1


def sample_classify_text(gcs_content_uri):
    """
    Classifying Content in text file stored in Cloud Storage

    Args:
      gcs_content_uri Google Cloud Storage URI where the file content is located.
      e.g. gs://[Your Bucket]/[Path to File]
      The text file must include at least 20 words.
    """

    client = language_v1.LanguageServiceClient()

    # gcs_content_uri = 'gs://cloud-samples-data/language/classify-entertainment.txt'

    # Available types: PLAIN_TEXT, HTML
    type_ = language_v1.Document.Type.PLAIN_TEXT

    # Optional. If not specified, the language is automatically detected.
    # For list of supported languages:
    # https://cloud.google.com/natural-language/docs/languages
    language = "en"
    document = {
        "gcs_content_uri": gcs_content_uri,
        "type_": type_,
        "language": language,
    }

    response = client.classify_text(request={"document": document})
    # Loop through classified categories returned from the API
    for category in response.categories:
        # Get the name of the category representing the document.
        # See the predefined taxonomy of categories:
        # https://cloud.google.com/natural-language/docs/categories
        print(f"Category name: {category.name}")
        # Get the confidence. Number representing how certain the classifier
        # is that this category represents the provided text.
        print(f"Confidence: {category.confidence}")

其他語言

C#:請按照用戶端程式庫頁面上的 C# 設定說明操作,然後參閱 .NET 適用的 Natural Language 參考說明文件

PHP:請按照用戶端程式庫頁面上的 PHP 設定操作說明操作,然後參閱 PHP 的 Natural Language 參考文件

Ruby:請按照用戶端程式庫頁面上的 Ruby 設定說明操作,然後參閱 Ruby 適用的 Natural Language 參考文件