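Using the Update API

Overview

The Update API lets client applications download hashed versions of the Web Risk lists and store them in a local or in-memory database. URLs can then be checked locally. If a match is found in the local database, the client sends a request to the Web Risk servers to verify whether the URL is on the Web Risk lists.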

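Updating the local database

To stay current, clients must periodically update the Web Risk lists in their local database. To save bandwidth, clients download the hash prefixes of URLs rather than the raw URLs. For example, if "www.badurl.com/" is on a Web Risk list, clients download the SHA256 hash prefix of that URL rather than the URL itself. In most cases the hash prefixes are 4 bytes long, so the average bandwidth cost of downloading a single list entry is 4 bytes before compression.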
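As a rough illustration of what a client stores, the sketch below derives a 4-byte SHA256 hash prefix from the example string above. The helper name `hash_prefix` is hypothetical, and the sketch assumes the input has already been canonicalized according to the Web Risk URL and hashing rules, which are not reproduced here.

```python
import hashlib


def hash_prefix(expression: str, prefix_size: int = 4) -> bytes:
    """Return the first `prefix_size` bytes of the SHA256 digest of `expression`.

    Assumption: `expression` is already a canonicalized URL expression.
    """
    if not 4 <= prefix_size <= 32:
        raise ValueError("prefix_size must be between 4 and 32 bytes")
    full_hash = hashlib.sha256(expression.encode("utf-8")).digest()
    return full_hash[:prefix_size]


prefix = hash_prefix("www.badurl.com/")
print(len(prefix), prefix.hex())
```

Only the prefix is kept locally; the full 32-byte digest is recomputed from the URL whenever a lookup is needed.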

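To update the Web Risk lists in the local database, send an HTTP GET request to the threatLists.computeDiff method:

  • The HTTP GET request includes the name of the list to update, along with client constraints that account for memory and bandwidth limitations.
  • The HTTP GET response returns either a full update or a partial update. The response might also return a recommended wait time until the next compute diff operation.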
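The query string for such a request can be assembled with the standard library. This is a minimal sketch: the function name `build_compute_diff_url` is hypothetical, the parameter names are taken from the example request in this section, and `API_KEY` is a placeholder.

```python
from urllib.parse import urlencode

BASE_URL = "https://webrisk.googleapis.com/v1/threatLists:computeDiff"


def build_compute_diff_url(
    threat_type: str,
    version_token: str,
    max_diff_entries: int,
    max_database_entries: int,
    compression: str,
    api_key: str,
) -> str:
    """Build a computeDiff URL; urlencode percent-escapes the base64 padding."""
    params = [
        ("threatType", threat_type),
        ("versionToken", version_token),
        ("constraints.maxDiffEntries", max_diff_entries),
        ("constraints.maxDatabaseEntries", max_database_entries),
        ("constraints.supportedCompressions", compression),
        ("key", api_key),
    ]
    return f"{BASE_URL}?{urlencode(params)}"


print(
    build_compute_diff_url(
        "MALWARE", "Gg4IBBADIgYQgBAiAQEoAQ==", 2048, 4096, "RAW", "API_KEY"
    )
)
```

For an initial update, pass an empty string as the version token.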

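Example: threatLists.computeDiff

HTTP GET request

In the following example, the diffs for the MALWARE Web Risk list are requested. For more details, see the threatLists.computeDiff query parameters and the explanations that follow the code example.

HTTP method and URL: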

GET https://webrisk.googleapis.com/v1/threatLists:computeDiff?threatType=MALWARE&versionToken=Gg4IBBADIgYQgBAiAQEoAQ%3D%3D&constraints.maxDiffEntries=2048&constraints.maxDatabaseEntries=4096&constraints.supportedCompressions=RAW&key=API_KEY

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
"https://webrisk.googleapis.com/v1/threatLists:computeDiff?threatType=MALWARE&versionToken=Gg4IBBADIgYQgBAiAQEoAQ%3D%3D&constraints.maxDiffEntries=2048&constraints.maxDatabaseEntries=4096&constraints.supportedCompressions=RAW&key=API_KEY"

PowerShell

Execute the following command:

$headers = @{  }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://webrisk.googleapis.com/v1/threatLists:computeDiff?threatType=MALWARE&versionToken=Gg4IBBADIgYQgBAiAQEoAQ%3D%3D&constraints.maxDiffEntries=2048&constraints.maxDatabaseEntries=4096&constraints.supportedCompressions=RAW&key=API_KEY" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "recommendedNextDiff": "2020-01-08T19:41:45.436722194Z",
  "responseType": "RESET",
  "additions": {
    "rawHashes": [
      {
        "prefixSize": 4,
        "rawHashes": "AArQMQAMoUgAPn8lAE..."
      }
    ]
  },
  "newVersionToken": "ChAIARAGGAEiAzAwMSiAEDABEPDyBhoCGAlTcIVL",
  "checksum": {
    "sha256": "wy6jh0+MAg/V/+VdErFhZIpOW+L8ulrVwhlV61XkROI="
  }
}

Java


import com.google.cloud.webrisk.v1.WebRiskServiceClient;
import com.google.protobuf.ByteString;
import com.google.webrisk.v1.CompressionType;
import com.google.webrisk.v1.ComputeThreatListDiffRequest;
import com.google.webrisk.v1.ComputeThreatListDiffRequest.Constraints;
import com.google.webrisk.v1.ComputeThreatListDiffResponse;
import com.google.webrisk.v1.ThreatType;
import java.io.IOException;

public class ComputeThreatListDiff {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    // The threat list to update. Only a single ThreatType should be specified per request.
    ThreatType threatType = ThreatType.MALWARE;

    // The current version token of the client for the requested list. If the client does not have
    // a version token (this is the first time calling ComputeThreatListDiff), this may be
    // left empty and a full database snapshot will be returned.
    ByteString versionToken = ByteString.EMPTY;

    // The maximum size in number of entries. The diff will not contain more entries
    // than this value. This should be a power of 2 between 2**10 and 2**20.
    // If zero, no diff size limit is set.
    int maxDiffEntries = 1024;

    // Sets the maximum number of entries that the client is willing to have in the local database.
    // This should be a power of 2 between 2**10 and 2**20. If zero, no database size limit is set.
    int maxDatabaseEntries = 1024;

    // The compression type supported by the client.
    CompressionType compressionType = CompressionType.RAW;

    computeThreatDiffList(threatType, versionToken, maxDiffEntries, maxDatabaseEntries,
        compressionType);
  }

  // Gets the most recent threat list diffs. These diffs should be applied to a local database of
  // hashes to keep it up-to-date.
  // If the local database is empty or excessively out-of-date,
  // a complete snapshot of the database will be returned. This Method only updates a
  // single ThreatList at a time. To update multiple ThreatList databases, this method needs to be
  // called once for each list.
  public static void computeThreatDiffList(ThreatType threatType, ByteString versionToken,
      int maxDiffEntries, int maxDatabaseEntries, CompressionType compressionType)
      throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the `webRiskServiceClient.close()` method on the client to safely
    // clean up any remaining background resources.
    try (WebRiskServiceClient webRiskServiceClient = WebRiskServiceClient.create()) {

      Constraints constraints = Constraints.newBuilder()
          .setMaxDiffEntries(maxDiffEntries)
          .setMaxDatabaseEntries(maxDatabaseEntries)
          .addSupportedCompressions(compressionType)
          .build();

      ComputeThreatListDiffResponse response = webRiskServiceClient.computeThreatListDiff(
          ComputeThreatListDiffRequest.newBuilder()
              .setThreatType(threatType)
              .setVersionToken(versionToken)
              .setConstraints(constraints)
              .build());

      // The returned response contains the following information:
      // https://cloud.google.com/web-risk/docs/reference/rpc/google.cloud.webrisk.v1#computethreatlistdiffresponse
      // Type of response: DIFF/ RESET/ RESPONSE_TYPE_UNSPECIFIED
      System.out.println(response.getResponseType());
      // List of entries to add and/or remove.
      // System.out.println(response.getAdditions());
      // System.out.println(response.getRemovals());

      // New version token to be used the next time when querying.
      System.out.println(response.getNewVersionToken());

      // Recommended next diff timestamp.
      System.out.println(response.getRecommendedNextDiff());

      System.out.println("Obtained threat list diff.");
    }
  }
}

Python

from google.cloud import webrisk_v1
from google.cloud.webrisk_v1 import ComputeThreatListDiffResponse


def compute_threatlist_diff(
    threat_type: webrisk_v1.ThreatType,
    version_token: bytes,
    max_diff_entries: int,
    max_database_entries: int,
    compression_type: webrisk_v1.CompressionType,
) -> ComputeThreatListDiffResponse:
    """Gets the most recent threat list diffs.

    These diffs should be applied to a local database of hashes to keep it up-to-date.
    If the local database is empty or excessively out-of-date,
    a complete snapshot of the database will be returned. This Method only updates a
    single ThreatList at a time. To update multiple ThreatList databases, this method needs to be
    called once for each list.

    Args:
        threat_type: The threat list to update. Only a single ThreatType should be specified per request.
            threat_type = webrisk_v1.ThreatType.MALWARE

        version_token: The current version token of the client for the requested list. If the
            client does not have a version token (this is the first time calling ComputeThreatListDiff),
            this may be left empty and a full database snapshot will be returned.

        max_diff_entries: The maximum size in number of entries. The diff will not contain more entries
            than this value. This should be a power of 2 between 2**10 and 2**20.
            If zero, no diff size limit is set.
            max_diff_entries = 1024

        max_database_entries: Sets the maximum number of entries that the client is willing to have in the local database.
            This should be a power of 2 between 2**10 and 2**20. If zero, no database size limit is set.
            max_database_entries = 1024

        compression_type: The compression type supported by the client.
            compression_type = webrisk_v1.CompressionType.RAW

    Returns:
        The response which contains the diff between local and remote threat lists. In addition to the threat list,
        the response also contains the version token and the recommended time for next diff.
    """

    webrisk_client = webrisk_v1.WebRiskServiceClient()

    constraints = webrisk_v1.ComputeThreatListDiffRequest.Constraints()
    constraints.max_diff_entries = max_diff_entries
    constraints.max_database_entries = max_database_entries
    constraints.supported_compressions = [compression_type]

    request = webrisk_v1.ComputeThreatListDiffRequest()
    request.threat_type = threat_type
    request.version_token = version_token
    request.constraints = constraints

    response = webrisk_client.compute_threat_list_diff(request)

    # The returned response contains the following information:
    # https://cloud.google.com/web-risk/docs/reference/rpc/google.cloud.webrisk.v1#computethreatlistdiffresponse
    # Type of response: DIFF/ RESET/ RESPONSE_TYPE_UNSPECIFIED
    print(response.response_type)
    # New version token to be used the next time when querying.
    print(response.new_version_token)
    # Recommended next diff timestamp.
    print(response.recommended_next_diff)

    return response

Web Risk lists

The threatType field identifies the Web Risk list. In the example, the diffs for the MALWARE Web Risk list are requested.

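Version token

The versionToken field holds the current client state of the Web Risk list. Version tokens are returned in the newVersionToken field of the threatLists.computeDiff response. For an initial update, leave the versionToken field empty.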

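Size constraints

The maxDiffEntries field specifies the total number of updates that the client can manage (2048 in the example). The maxDatabaseEntries field specifies the total number of entries that the local database can manage (4096 in the example). Clients should set size constraints to respect memory and bandwidth limitations and to guard against list growth. For more information, see Update Constraints.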
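The comments in the client samples above state that each limit should be a power of 2 between 2**10 and 2**20, with zero meaning that no limit is set. A client might check its configuration before sending a request. The helper below is a hypothetical sketch of that check and is not part of the client library.

```python
def is_valid_size_constraint(value: int) -> bool:
    """Check a maxDiffEntries / maxDatabaseEntries value.

    Rule taken from the sample comments: zero (no limit), or a power of 2
    between 2**10 and 2**20 inclusive.
    """
    if value == 0:
        return True
    in_range = 2**10 <= value <= 2**20
    is_power_of_two = value > 0 and (value & (value - 1)) == 0
    return in_range and is_power_of_two


for candidate in (0, 2048, 4096, 3000):
    print(candidate, is_valid_size_constraint(candidate))
```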

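Supported compressions

The supportedCompressions field lists the compression types that the client supports. In the example, the client supports only raw, uncompressed data. Web Risk, however, supports additional compression types. For more information, see Compression.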

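HTTP GET response

In this example, the response returns a partial update for the Web Risk list, using the requested compression type.

Response body

The response body includes the diff information: the response type, the additions and removals to apply to the local database, the new version token, and a checksum.

In the example, the response also includes a recommended next diff time. For more details, see the threatLists.computeDiff response body and the explanations that follow the code example.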

{
  "responseType": "DIFF",
  "recommendedNextDiff": "2019-12-31T23:59:59.000000000Z",
  "additions": {
    "compressionType": "RAW",
    "rawHashes": [{
      "prefixSize": 4,
      "rawHashes": "rnGLoQ=="
    }]
  },
  "removals": {
    "rawIndices": {
      "indices": [0, 2, 4]
    }
  },
  "newVersionToken": "ChAIBRADGAEiAzAwMSiAEDABEAFGpqhd",
  "checksum": {
    "sha256": "YSgoRtsRlgHDqDA3LAhM1gegEpEzs1TjzU33vqsR8iM="
  }
}

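Database diffs

The responseType field indicates whether the update is partial (DIFF) or full (RESET). In the example, a partial diff is returned, so the response includes both additions and removals. There can be multiple sets of additions, but only one set of removals. For more information, see Database Diffs.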
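Applying such a response to a local database might look like the following sketch. It rests on several assumptions that this page does not spell out: the local database is a plain Python list of byte strings, only RAW compression is handled, and removal indices refer to positions in the lexicographically sorted local list, with removals applied before additions. The function names are hypothetical.

```python
import base64
from typing import List


def split_prefixes(raw_hashes_b64: str, prefix_size: int) -> List[bytes]:
    """Decode a RAW rawHashes string and cut it into fixed-size prefixes."""
    raw = base64.b64decode(raw_hashes_b64)
    if len(raw) % prefix_size != 0:
        raise ValueError("rawHashes length is not a multiple of prefixSize")
    return [raw[i:i + prefix_size] for i in range(0, len(raw), prefix_size)]


def apply_diff(local_prefixes: List[bytes], response: dict) -> List[bytes]:
    """Return the updated, sorted prefix list after applying one response."""
    current = sorted(local_prefixes)
    if response.get("responseType") == "RESET":
        current = []  # a full update replaces the local contents

    # Assumption: indices point into the sorted list as it was before the update.
    indices = response.get("removals", {}).get("rawIndices", {}).get("indices", [])
    removed = set(indices)
    current = [p for i, p in enumerate(current) if i not in removed]

    for addition in response.get("additions", {}).get("rawHashes", []):
        current.extend(split_prefixes(addition["rawHashes"], addition["prefixSize"]))
    return sorted(current)


diff = {
    "responseType": "DIFF",
    "additions": {"rawHashes": [{"prefixSize": 4, "rawHashes": "rnGLoQ=="}]},
    "removals": {"rawIndices": {"indices": [0, 2, 4]}},
}
local = [bytes([n]) * 4 for n in range(6)]
print([p.hex() for p in apply_diff(local, diff)])
```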

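New version token

The newVersionToken field holds the new version token for the newly updated Web Risk list. Clients must save the new client state and send it in subsequent update requests (as the versionToken field of the threatLists.computeDiff request).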

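Checksums

The checksum lets clients verify that the local database has not been corrupted. If the checksum does not match, the client must clear the database and reissue an update with an empty versionToken field. Clients in this situation must still follow the time intervals for updates. For more information, see Request Frequency.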
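A verification step could be sketched as follows. The sketch assumes that the checksum is the SHA256 digest of all local entries concatenated in lexicographic order and that the JSON value is base64-encoded; confirm both against the API reference before relying on them. The function name is hypothetical.

```python
import base64
import hashlib
from typing import Iterable


def checksum_matches(local_prefixes: Iterable[bytes], expected_sha256_b64: str) -> bool:
    """Compare the local database state against a response checksum.

    Assumption: the checksum covers the sorted, concatenated entries.
    """
    digest = hashlib.sha256(b"".join(sorted(local_prefixes))).digest()
    return digest == base64.b64decode(expected_sha256_b64)


# Illustrative values only; a real checksum comes from the computeDiff response.
entries = [b"\x02" * 4, b"\x01" * 4]
expected = base64.b64encode(hashlib.sha256(b"\x01" * 4 + b"\x02" * 4).digest()).decode()
print(checksum_matches(entries, expected))
```

On a mismatch, the client would discard the local entries and request a full update with an empty version token, as described above.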

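The recommendedNextDiff field indicates a timestamp that the client should wait for before sending another update request. Note that the response may or may not include a recommended wait time. For more information, see Request Frequency.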
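One way to turn that timestamp into a wait time is sketched below. It assumes the timestamp always ends in "Z" (UTC) and truncates the nanosecond fraction to microseconds, since Python's `datetime` does not store nanoseconds. The fallback of 1800 seconds used when the field is absent is an arbitrary placeholder, not a documented value, and both function names are hypothetical.

```python
from datetime import datetime, timezone


def parse_utc_timestamp(value: str) -> datetime:
    """Parse a 'Z'-suffixed timestamp, truncating the fraction to microseconds."""
    head, _, fraction = value.rstrip("Z").partition(".")
    micros = fraction[:6].ljust(6, "0")
    parsed = datetime.strptime(f"{head}.{micros}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def seconds_until_next_diff(
    response: dict, now: datetime, fallback_seconds: float = 1800.0
) -> float:
    """Return how long to wait before the next computeDiff request."""
    recommended = response.get("recommendedNextDiff")
    if not recommended:
        return fallback_seconds  # placeholder policy when no time is suggested
    remaining = (parse_utc_timestamp(recommended) - now).total_seconds()
    return max(0.0, remaining)


response = {"recommendedNextDiff": "2020-01-08T19:41:45.436722194Z"}
now = datetime(2020, 1, 8, 19, 41, 15, tzinfo=timezone.utc)
print(seconds_until_next_diff(response, now))
```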

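Checking URLs

To check whether a URL is on a Web Risk list, the client must first compute the hash and hash prefix of the URL. For more information, see URLs and Hashing. The client then queries the local database to determine whether there is a match. If the hash prefix is not present in the local database, the URL is considered safe (that is, not on the Web Risk lists).

If the hash prefix is present in the local database (a hash prefix collision), the client must send the hash prefix to the Web Risk servers for verification. The servers return all full-length SHA256 hashes that contain the given hash prefix. If one of those full-length hashes matches the full-length hash of the URL in question, the URL is considered unsafe. If none of the full-length hashes matches the full-length hash of the URL in question, the URL is considered safe.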
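The two-stage check described above can be sketched as follows. This is an illustration under stated assumptions: the local database is a set of byte-string prefixes of a single length, URL canonicalization is assumed to have happened already, and `fetch_full_hashes` is a hypothetical callable standing in for a hashes.search request.

```python
import hashlib
from typing import Callable, Iterable, Set


def is_unsafe(
    expression: str,
    local_prefixes: Set[bytes],
    fetch_full_hashes: Callable[[bytes], Iterable[bytes]],
    prefix_size: int = 4,
) -> bool:
    """Return True only if a full-length hash match confirms the URL is listed."""
    full_hash = hashlib.sha256(expression.encode("utf-8")).digest()
    prefix = full_hash[:prefix_size]

    # Stage 1: no local prefix match means the URL is considered safe,
    # and nothing is sent to the server.
    if prefix not in local_prefixes:
        return False

    # Stage 2: a prefix collision must be verified against full-length hashes.
    return full_hash in set(fetch_full_hashes(prefix))


listed = hashlib.sha256(b"www.badurl.com/").digest()
print(is_unsafe("www.badurl.com/", {listed[:4]}, lambda prefix: [listed]))
```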

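Google never learns which URLs you are examining. Google does learn the hash prefixes of URLs, but a hash prefix on its own does not reveal the actual URL.

To check whether a URL is on a Web Risk list, send an HTTP GET request to the hashes.search method:

  • The HTTP GET request includes the hash prefix of the URL to check.
  • The HTTP GET response returns the matching full-length hashes, along with the positive and negative expire times.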
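A hashes.search URL can be assembled in the same way as the update request. In this sketch, the hash prefix is encoded with the web-safe base64 variant and the threat types are repeated as separate query parameters, matching the example request in this section. The function name is hypothetical and `YOUR_API_KEY` is a placeholder.

```python
import base64
from typing import Sequence
from urllib.parse import urlencode

SEARCH_URL = "https://webrisk.googleapis.com/v1/hashes:search"


def build_search_url(prefix: bytes, threat_types: Sequence[str], api_key: str) -> str:
    """Build a hashes.search URL for one hash prefix and several threat types."""
    params = {
        "key": api_key,
        "threatTypes": list(threat_types),
        # Web-safe base64 (RFC 4648); urlencode escapes the '=' padding.
        "hashPrefix": base64.urlsafe_b64encode(prefix).decode("ascii"),
    }
    # doseq=True emits one threatTypes parameter per list element.
    return f"{SEARCH_URL}?{urlencode(params, doseq=True)}"


print(
    build_search_url(
        bytes.fromhex("5b0b8975"), ["MALWARE", "SOCIAL_ENGINEERING"], "YOUR_API_KEY"
    )
)
```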

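Example: hashes.search

HTTP GET request

In the following example, the names of two Web Risk lists and a hash prefix are sent for comparison and verification. For more details, see the hashes.search query parameters and the explanations that follow the code example.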

curl \
  -H "Content-Type: application/json" \
  "https://webrisk.googleapis.com/v1/hashes:search?key=YOUR_API_KEY&threatTypes=MALWARE&threatTypes=SOCIAL_ENGINEERING&hashPrefix=WwuJdQ%3D%3D"

Java


import com.google.cloud.webrisk.v1.WebRiskServiceClient;
import com.google.protobuf.ByteString;
import com.google.webrisk.v1.SearchHashesRequest;
import com.google.webrisk.v1.SearchHashesResponse;
import com.google.webrisk.v1.SearchHashesResponse.ThreatHash;
import com.google.webrisk.v1.ThreatType;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;

public class SearchHashes {

  public static void main(String[] args) throws IOException, NoSuchAlgorithmException {
    // TODO(developer): Replace these variables before running the sample.
    // A hash prefix, consisting of the most significant 4-32 bytes of a SHA256 hash.
    // For JSON requests, this field is base64-encoded. Note that if this parameter is provided
    // by a URI, it must be encoded using the web safe base64 variant (RFC 4648).
    String uri = "http://example.com";
    String encodedUri = Base64.getUrlEncoder().encodeToString(uri.getBytes(StandardCharsets.UTF_8));
    MessageDigest digest = MessageDigest.getInstance("SHA-256");
    byte[] encodedHashPrefix = digest.digest(encodedUri.getBytes(StandardCharsets.UTF_8));

    // The ThreatLists to search in. Multiple ThreatLists may be specified.
    // For the list on threat types, see: https://cloud.google.com/web-risk/docs/reference/rpc/google.cloud.webrisk.v1#threattype
    List<ThreatType> threatTypes = Arrays.asList(ThreatType.MALWARE, ThreatType.SOCIAL_ENGINEERING);

    searchHash(ByteString.copyFrom(encodedHashPrefix), threatTypes);
  }

  // Gets the full hashes that match the requested hash prefix.
  // This is used after a hash prefix is looked up in a threatList and there is a match.
  // The client side threatList only holds partial hashes so the client must query this method
  // to determine if there is a full hash match of a threat.
  public static void searchHash(ByteString encodedHashPrefix, List<ThreatType> threatTypes)
      throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the `webRiskServiceClient.close()` method on the client to safely
    // clean up any remaining background resources.
    try (WebRiskServiceClient webRiskServiceClient = WebRiskServiceClient.create()) {

      // Set the hashPrefix and the threat types to search in.
      SearchHashesResponse response = webRiskServiceClient.searchHashes(
          SearchHashesRequest.newBuilder()
              .setHashPrefix(encodedHashPrefix)
              .addAllThreatTypes(threatTypes)
              .build());

      // Get all the hashes that match the prefix. Cache the returned hashes until the time
      // specified in threatHash.getExpireTime()
      // For more information on response type, see: https://cloud.google.com/web-risk/docs/reference/rpc/google.cloud.webrisk.v1#threathash
      for (ThreatHash threatHash : response.getThreatsList()) {
        System.out.println(threatHash.getHash());
      }
      System.out.println("Completed searching threat hashes.");
    }
  }
}

Python

from google.cloud import webrisk_v1


def search_hashes(hash_prefix: bytes, threat_type: webrisk_v1.ThreatType) -> list:
    """Gets the full hashes that match the requested hash prefix.

    This is used after a hash prefix is looked up in a threatList and there is a match.
    The client side threatList only holds partial hashes so the client must query this method
    to determine if there is a full hash match of a threat.

    Args:
        hash_prefix: A hash prefix, consisting of the most significant 4-32 bytes of a SHA256 hash.
            For JSON requests, this field is base64-encoded. Note that if this parameter is provided
            by a URI, it must be encoded using the web safe base64 variant (RFC 4648).
            Example:
                import base64
                import hashlib
                uri = "http://example.com"
                sha256 = hashlib.sha256()
                sha256.update(base64.urlsafe_b64encode(bytes(uri, "utf-8")))
                hash_prefix = sha256.digest()

        threat_type: The ThreatList to search in. This sample takes a single ThreatType and
            wraps it in a list for the request; the API itself accepts multiple ThreatLists.
            For the list on threat types, see:
            https://cloud.google.com/web-risk/docs/reference/rpc/google.cloud.webrisk.v1#threattype
            threat_type = webrisk_v1.ThreatType.MALWARE

    Returns:
        A hash list that contain all hashes that matches the given hash prefix.
    """
    webrisk_client = webrisk_v1.WebRiskServiceClient()

    # Set the hashPrefix and the threat types to search in.
    request = webrisk_v1.SearchHashesRequest()
    request.hash_prefix = hash_prefix
    request.threat_types = [threat_type]

    response = webrisk_client.search_hashes(request)

    # Get all the hashes that match the prefix. Cache the returned hashes until the time
    # specified in threat_hash.expire_time
    # For more information on response type, see:
    # https://cloud.google.com/web-risk/docs/reference/rpc/google.cloud.webrisk.v1#threathash
    hash_list = []
    for threat_hash in response.threats:
        hash_list.append(threat_hash.hash)
    return hash_list

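Web Risk lists

The threatTypes field identifies the Web Risk lists. In the example, two lists are identified: MALWARE and SOCIAL_ENGINEERING.

Threat hash prefixes

The hashPrefix field contains the hash prefix of the URL that you want to check. This field must contain the exact hash prefix that is present in the local database. For example, if the local hash prefix is 4 bytes long, the hashPrefix field must be 4 bytes long. If the local hash prefix has been lengthened to 7 bytes, the hashPrefix field must be 7 bytes long.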
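Because local prefixes can have different lengths, a client needs to find which stored prefix, if any, matches a URL's full hash and then send exactly that prefix. The sketch below assumes the local database is a set of byte strings between 4 and 32 bytes long; the helper name is hypothetical.

```python
import hashlib
from typing import Optional, Set


def matching_local_prefix(full_hash: bytes, local_prefixes: Set[bytes]) -> Optional[bytes]:
    """Return the stored prefix that matches `full_hash`, or None if there is none."""
    for size in range(4, 33):  # prefixes are 4 to 32 bytes long
        candidate = full_hash[:size]
        if candidate in local_prefixes:
            return candidate  # send this exact value as hashPrefix
    return None


full_hash = hashlib.sha256(b"www.badurl.com/").digest()
local_db = {full_hash[:7]}
match = matching_local_prefix(full_hash, local_db)
print(len(match) if match else None)
```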

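HTTP GET response

In the following example, the response returns the matching threats, along with the Web Risk lists they matched and their expire times.

Response body

The response body includes the match information: the list names, the full-length hashes, and the cache durations. For more details, see the hashes.search response body and the explanations that follow the code example.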

{
  "threats": [{
      "threatTypes": ["MALWARE"],
      "hash": "WwuJdQx48jP-4lxr4y2Sj82AWoxUVcIRDSk1PC9Rf-4=",
      "expireTime": "2019-07-17T15:01:23.045123456Z"
    }, {
      "threatTypes": ["MALWARE", "SOCIAL_ENGINEERING"],
      "hash": "WwuJdQxaCSH453-uytERC456gf45rFExcE23F7-hnfD=",
      "expireTime": "2019-07-17T15:01:23.045123456Z"
    }],
  "negativeExpireTime": "2019-07-17T15:01:23.045123456Z"
}

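Matches

The threats field returns the full-length hashes that match the hash prefix. The URLs corresponding to these hashes are considered unsafe. If no match is found for a hash prefix, nothing is returned, and the URL corresponding to that hash prefix is considered safe.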
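Interpreting the response for one URL comes down to comparing its full-length hash with each returned hash. The sketch below works on a parsed JSON response and assumes the hash values are web-safe base64 strings, as they appear in the example response. The function name is hypothetical.

```python
import base64
from typing import List


def matched_threat_types(response: dict, full_hash: bytes) -> List[str]:
    """Return the threat types whose full-length hash equals `full_hash`.

    An empty list means no returned hash matched, so the URL is treated as safe.
    """
    matched: List[str] = []
    for threat in response.get("threats", []):
        if base64.urlsafe_b64decode(threat["hash"]) == full_hash:
            matched.extend(threat.get("threatTypes", []))
    return matched


response = {
    "threats": [
        {
            "threatTypes": ["MALWARE"],
            "hash": "WwuJdQx48jP-4lxr4y2Sj82AWoxUVcIRDSk1PC9Rf-4=",
            "expireTime": "2019-07-17T15:01:23.045123456Z",
        }
    ],
    "negativeExpireTime": "2019-07-17T15:01:23.045123456Z",
}
url_hash = base64.urlsafe_b64decode("WwuJdQx48jP-4lxr4y2Sj82AWoxUVcIRDSk1PC9Rf-4=")
print(matched_threat_types(response, url_hash))
```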

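Expiration time

The expireTime and negativeExpireTime fields indicate until when the hashes must be considered unsafe or safe, respectively. For more information, see Caching.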
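A client-side cache built on these two fields might be sketched as follows. The structure is an assumption made for illustration and is not taken from the Caching page: each queried prefix keeps the unsafe full-length hashes with their expireTime values, plus one negativeExpireTime that covers every other hash sharing that prefix. Timestamps are assumed to be parsed into timezone-aware `datetime` values already, and all names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class PrefixCacheEntry:
    unsafe_until: Dict[bytes, datetime]  # full-length hash -> expireTime
    safe_until: datetime                 # negativeExpireTime for this prefix


def cached_verdict(entry: PrefixCacheEntry, full_hash: bytes, now: datetime) -> Optional[bool]:
    """Return True (unsafe), False (safe), or None when the server must be asked again."""
    expire_time = entry.unsafe_until.get(full_hash)
    if expire_time is not None:
        return True if now < expire_time else None
    if now < entry.safe_until:
        return False
    return None


expiry = datetime(2019, 7, 17, 15, 1, 23, tzinfo=timezone.utc)
entry = PrefixCacheEntry(unsafe_until={b"\x01" * 32: expiry}, safe_until=expiry)
now = datetime(2019, 7, 17, 15, 0, 0, tzinfo=timezone.utc)
print(cached_verdict(entry, b"\x01" * 32, now), cached_verdict(entry, b"\x02" * 32, now))
```

Returning None rather than a verdict keeps expired entries from being treated as either safe or unsafe without a fresh hashes.search request.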