YARA-L 2.0 语言概览

支持的语言:

YARA-L 2.0 是一种计算机语言,用于创建在企业日志数据注入您的 Google Security Operations 实例时搜索数据的规则。YARA-L 语法派生自 VirusTotal 开发的 YARA 语言。 该语言与 Google SecOps Detection Engine 配合使用,让您可以在大量数据中寻找威胁和其他事件。

详情请参阅以下内容:

YARA-L 2.0 示例规则

以下示例展示了使用 YARA-L 2.0 编写的规则。每个示例都展示了如何关联规则语言中的事件。

规则和调整

以下规则会检查事件数据中是否存在特定模式,如果找到这些模式,则会创建检测结果。此规则包含一个用于跟踪事件类型的变量 $e1 和一个 UDM 字段 metadata.event_type。该规则会检查正则表达式匹配项是否包含 e1。当发生 $e1 事件时,系统会创建检测结果。规则中包含 not 条件,以排除某些非恶意路径。您可以添加 not 条件来防止出现假正例。

rule suspicious_unusual_location_svchost_execution
{

 meta:
   author = "Google Cloud Security"
   description = "Windows 'svchost' executed from an unusual location"
   yara_version = "YL2.0"
   rule_version = "1.0"

 events:

   $e1.metadata.event_type = "PROCESS_LAUNCH"
   re.regex($e1.principal.process.command_line, `\bsvchost(\.exe)?\b`) nocase
   not re.regex($e1.principal.process.command_line, `\\Windows\\System32\\`) nocase

condition:

   $e1
}

不同城市的登录信息

以下规则会在 5 分钟内搜索从两个或多个城市登录过您的企业的用户:

rule DifferentCityLogin {
  meta:

  events:
    $udm.metadata.event_type = "USER_LOGIN"
    $udm.principal.user.userid = $user
    $udm.principal.location.city = $city

  match:
    $user over 5m

  condition:
    $udm and #city > 1
}

Match 变量$user

事件变量$udm

占位符变量$city$user

下面介绍了此规则的工作原理:

  • 对用户名为 ($user) 的事件进行分组,并在找到匹配项时返回该值 ($user)。
  • 时间范围为 5 分钟,表示只有间隔不到 5 分钟的事件相关联。
  • 搜索事件类型为 USER_LOGIN 的事件组 ($udm)。
  • 对于该事件组,该规则会将用户 ID 命名为 $user,并将登录城市命名为 $city.
  • 如果 5 分钟时间范围内事件组 ($udm) 中的不同 city 值数量(以 #city 表示)大于 1,则返回匹配项。

快速创建和删除用户

以下规则可搜索在 4 小时内创建然后删除的用户:

rule UserCreationThenDeletion {
  meta:

  events:
    $create.target.user.userid = $user
    $create.metadata.event_type = "USER_CREATION"

    $delete.target.user.userid = $user
    $delete.metadata.event_type = "USER_DELETION"

    $create.metadata.event_timestamp.seconds <=
       $delete.metadata.event_timestamp.seconds

  match:
    $user over 4h

  condition:
    $create and $delete
}

事件变量$create$delete

Match 变量$user

占位符变量:不适用

下面介绍了此规则的工作原理:

  • 对用户名为 ($user) 的事件进行分组,并在找到匹配项时返回该值 ($user)。
  • 时间范围为 4 小时,这意味着只有不到 4 小时的事件才会关联。
  • 搜索两个事件组($create$delete,其中 $create 等同于 #create >= 1)。
  • $create 对应于 USER_CREATION 事件,并以 $user 形式调用用户 ID。
  • $user 用于将两组事件联接在一起。
  • $delete 对应于 USER_DELETION 事件,并以 $user 形式调用用户 ID。此规则查找两个事件组中用户标识符相同的匹配项。
  • 此规则查找事件 $delete 发生的时间晚于 $create 事件的情况,并在发现时返回匹配项。

单事件规则

单事件规则是与单个事件相关联的规则。单事件规则可以是:

  • 没有匹配部分的任何规则。
  • 包含 match 部分和 condition 部分的规则仅检查是否存在 1 个事件(例如,“$e”“#e > 0”“#e >= 1”“1 <= #e”“0 < #e”)。

例如,以下规则会搜索用户的登录事件,然后返回在 Google SecOps 账号中存储的企业数据中遇到的首个事件:

rule SingleEventRule {
  meta:
    author = "noone@altostrat.com"

  events:
    $e.metadata.event_type = "USER_LOGIN"

  condition:
    $e
}

下面是包含匹配部分的单事件规则的另一个示例。此规则会搜索在 5 分钟内至少登录过一次的用户。它只检查用户登录事件是否存在。

rule SingleEventRule {
  meta:
    author = "alice@example.com"
    description = "windowed single event example rule"

  events:
    $e.metadata.event_type = "USER_LOGIN"
    $e.principal.user.userid = $user

  match:
    $user over 5m

  condition:
    #e > 0
}
rule MultiEventRule{
  meta:
    author = "alice@example.com"
    description = "Rule with outcome condition and simple existence condition on one event variable"

  events:
    $e.metadata.event_type = "USER_LOGIN"
    $e.principal.user.userid = $user

  match:
    $user over 10m

  outcome:
    $num_events_in_match_window = count($e.metadata.id)

  condition:
    #e > 0 and $num_events_in_match_window >= 10 // Could be rewritten as #e >= 10
}

多事件规则

使用多事件规则对指定时间范围内的多个事件进行分组,并尝试找到事件之间的相关性。典型的多事件规则包含以下部分:

  • 一个 match 部分,用于指定需要对事件进行分组的时间范围。
  • 一个 condition 部分,用于指定应触发检测和检查多个事件是否存在的条件。

例如,以下规则会搜索在 10 分钟内至少登录了 10 次的用户:

rule MultiEventRule {
  meta:
    author = "noone@altostrat.com"

  events:
    $e.metadata.event_type = "USER_LOGIN"
    $e.principal.user.userid = $user

  match:
    $user over 10m

  condition:
    #e >= 10
}

IP 地址范围内的单个事件

以下示例展示了一条规则,用于搜索两个特定主机名和特定 IP 地址范围之间的匹配项:

rule OrsAndNetworkRange {
  meta:
    author = "noone@altostrat.com"

  events:
    // Checks CIDR ranges.
    net.ip_in_range_cidr($e.principal.ip, "203.0.113.0/24")

    // Detection when the hostname field matches either value using or.
    $e.principal.hostname = /pbateman/ or $e.principal.hostname = /sspade/

  condition:
    $e
}

any 和 all 规则示例

以下规则会搜索所有来源 IP 地址在 5 分钟的时间内与已知安全的 IP 地址不匹配的登录事件。

rule SuspiciousIPLogins {
  meta:
    author = "alice@example.com"

  events:
    $e.metadata.event_type = "USER_LOGIN"

    // Detects if all source IP addresses in an event do not match "100.97.16.0"
    // For example, if an event has source IP addresses
    // ["100.97.16.1", "100.97.16.2", "100.97.16.3"],
    // it will be detected since "100.97.16.1", "100.97.16.2",
    // and "100.97.16.3" all do not match "100.97.16.0".

    all $e.principal.ip != "100.97.16.0"

    // Assigns placeholder variable $ip to the $e.principal.ip repeated field.
    // There will be one detection per source IP address.
    // For example, if an event has source IP addresses
    // ["100.97.16.1", "100.97.16.2", "100.97.16.3"],
    // there will be one detection per address.

    $e.principal.ip = $ip

  match:
    $ip over 5m

  condition:
    $e
}

规则中的正则表达式

以下 YARA-L 2.0 正则表达式示例搜索从 altostrat.com 网域收到电子邮件的事件。由于 nocase 已添加到 $host 变量 regex 比较和 regex 函数,因此这些比较都不区分大小写。

rule RegexRuleExample {
  meta:
    author = "noone@altostrat.com"

  events:
    $e.principal.hostname = $host
    $host = /.*HoSt.*/ nocase
    re.regex($e.network.email.from, `.*altostrat\.com`) nocase

  match:
    $host over 10m

  condition:
    #e > 10
}

复合规则示例

复合检测通过使用复合规则来增强威胁检测功能。 这些复合规则使用其他规则的检测结果作为输入。这样一来,系统便可检测到单个规则可能无法检测到的复杂威胁。如需了解详情,请参阅复合检测概览

绊线检测

触发器复合检测是最简单的一种复合检测,可对检测结果中的字段(例如结果变量或规则元数据)进行操作。它们有助于过滤可能表明风险较高的条件(例如管理员用户或生产环境)的检测结果。

rule composite_admin_detection {
  meta:
    rule_name = "Detection with Admin User"
    author = "Google Cloud Security"
    description = "Composite rule that looks for any detections where the actor is an admin user"
    severity = "Medium"

  events:
    $rule_name = $d.detection.detection.rule_name
    $principal_user = $d.detection.detection.outcomes["principal_users"]
    $principal_user = /admin|root/ nocase

  match:
    $principal_user over 1h

  outcome:
    $risk_score = 75
    $upstream_rules = array_distinct($rule_name)

  condition:
    $d
}

阈值和聚合检测

借助聚合复合检测规则,您可以根据共享属性(例如主机名或用户名)对检测结果进行分组,并分析汇总的数据。以下是常见的应用场景:

  • 识别生成大量安全提醒或汇总风险的用户。
  • 通过汇总相关检测结果来检测具有异常活动模式的主机。

风险汇总示例:

rule composite_risk_aggregation {
  meta:
    rule_name = "Risk Aggregation Composite"
    author = "Google Cloud Security"
    description = "Composite detection that aggregates risk of a user over 48 hours"
    severity = "High"

  events:
    $rule_name = $d.detection.detection.rule_name
    $principal_user = $d.detection.detection.outcomes["principal_users"]
    $risk = $d.detection.detection.risk_score

  match:
    $principal_user over 48h

  outcome:
    $risk_score = 90
    $cumulative_risk = sum($risk)
    $principal_users = array_distinct($principal_users)
    $upstream_rules = array_distinct($rule_name)

  condition:
    $d and $cumulative_risk > 500
}

策略汇总示例:

rule composite_tactic_aggregation {
  meta:
    rule_name = "MITRE Tactic Aggregation Composite"
    author = "Google Cloud Security"
    description = "Composite detection that detects if a user has triggered detections over multiple mitre tactics."
    severity = "Medium"

  events:
    $principal_user = $d.detection.detection.outcomes["principal_users"]
    $tactic = $d.detection.detection.rule_labels["tactic"]
    $rule_name = $d.detection.detection.rule_name

  match:
    $principal_user over 48h

  outcome:
    $mitre_tactics_count = count_distinct($tactic)
    $mitre_tactics = array_distinct($d.detection.rule_labels["tactic"])
    $risk_score = min(100, (50+15*$mitre_tactics_count))
    $upstream_rules = array_distinct($rule_name)

  condition:
    $d and $mitre_tactics_count > 1
}

顺序复合检测

顺序复合检测用于识别相关事件的模式,其中检测顺序非常重要,例如检测到暴力破解登录尝试后,又检测到成功登录。这些模式可能涉及多个基本检测结果,也可能涉及基本检测结果和事件的组合。

rule composite_bruteforce_login {
  meta:
    rule_name = "Bruteforce Login Composite"
    author = "Google Cloud Security"
    description = "Detects when an IP address associated with a Workspace brute force attempt successfully logs in"
    severity = "High"

  events:
    $bruteforce_detection.detection.detection.rule_name = /Workspace Anomalous Failed Logins/
    $bruteforce_ip = $d.detection.detection.outcomes["principal_ips"]

    $login_event.metadata.product_name = "login"
    $login_event.metadata.product_event_type = "login_success"
    $login_event.metadata.vendor_name = "Google Workspace"
    $login_ip = $login_event.principal.ip

    // Ensure the brute force detection and successful login occurred from the same IP
    $login_ip = $bruteforce_ip

    $target_account = $login_event.target.user.email_addresses

    // Ensure the brute force detection occurred before the successful login
    $bruteforce_detection.detection.detection_time.seconds < $login_event.metadata.event_timestamp.seconds

  match:
    $bruteforce_ip over 24h

  outcome:
    $risk_score = 90
    $principal_users = array_distinct($target_account)

  condition:
    $bruteforce_detection and $login_event
}

情境感知检测

情境感知复合检测功能可利用其他情境(例如威胁 Feed 中发现的 IP 地址)来丰富检测结果。

rule composite_tor_enrichment {
  meta:
    rule_name = "Detection with IP from TOR Feed"
    author = "Google Cloud Security"
    description = "Adds additional context from the TOR intel feed to detections"
    severity = "High"

  events:
    $detection_ip = $d.detection.detection.outcomes["principal_ips"]
    $gcti.graph.metadata.entity_type = "IP_ADDRESS"
    $gcti.graph.metadata.vendor_name = "Google Cloud Threat Intelligence"
    $gcti_feed.graph.metadata.source_type = "GLOBAL_CONTEXT"
    $gcti.graph.metadata.product_name = "GCTI Feed"
    $gcti.graph.metadata.threat.threat_feed_name = "Tor Exit Nodes"

    $detection_ip = $gcti.graph.entity.ip

    $rule_name = $d.detection.detection.rule_name
    $risk = $d.detection.detection.outcomes["risk_score"]

  match:
    $detection_ip, $rule_name over 1h

  outcome:
    $risk_score = 80
    $upstream_rule = array_distinct($rule_name)

  condition:
    $d and $gcti
}

共同出现检测

共同出现复合检测是一种聚合形式,可以检测相关事件的组合,例如用户触发的权限升级和数据渗漏检测的组合。

rule composite_privesc_exfil_sequential {
  meta:
    rule_name = "Privilege Escalation and Exfiltration Composite"
    author = "Google Cloud Security"
    description = "Looks for a detection sequence of privilege escalation followed by exfiltration."
    severity = "High"

  events:
    $privilege_escalation.detection.detection.rule_labels["tactic"] = "TA0004"
    $exfiltration.detection.detection.rule_labels["tactic"] = "TA0010"

    $pe_user = $privilege_escalation.detection.detection.outcomes["principal_users"]
    $ex_user = $exfiltration.detection.detection.outcomes["principal_users"]

    $pe_user = $ex_user

  match:
    $pe_user over 48h

  outcome:
    $risk_score = 75
    $privesc_rules = array_distinct($privilege_escalation.detection.detection.rule_name)
    $exfil_rules = array_distinct($exfiltration.detection.detection.rule_name)

  condition:
    $privilege_escalation and $exfiltration
}

滑动窗口规则示例

以下 YARA-L 2.0 滑动窗口示例搜索 firewall_1 事件后缺少 firewall_2 事件。after 关键字与数据透视事件变量 $e1 结合使用,指定在关联事件时应仅检查每个 firewall_1 事件 10 分钟后。

rule SlidingWindowRuleExample {
  meta:
    author = "alice@example.com"

  events:
    $e1.metadata.product_name = "firewall_1"
    $e1.principal.hostname = $host

    $e2.metadata.product_name = "firewall_2"
    $e2.principal.hostname = $host

  match:
    $host over 10m after $e1

  condition:
    $e1 and !$e2
}

零值排除示例

规则引擎会隐式过滤掉 match 部分中使用的所有占位符的零值。如需了解详情,请参阅match 部分中的零值处理。 可以使用 allow_zero_values 选项停用此功能,如 allow_zero_values 中所述。

但是,对于其他引用的事件字段,除非您明确指定此类条件,否则不会排除零值。

rule ExcludeZeroValues {
  meta:
    author = "alice@example.com"

  events:
    $e1.metadata.event_type = "NETWORK_DNS"
    $e1.principal.hostname = $hostname

    // $e1.principal.user.userid may be empty string.
    $e1.principal.user.userid != "Guest"

    $e2.metadata.event_type = "NETWORK_HTTP"
    $e2.principal.hostname = $hostname

    // $e2.target.asset_id cannot be empty string as explicitly specified.
    $e2.target.asset_id != ""

  match:
    // $hostname cannot be empty string. The rule behaves as if the
    // predicate, `$hostname != ""` was added to the events section, because
    // `$hostname` is used in the match section.
    $hostname over 1h

  condition:
    $e1 and $e2
}

包含 outcome 部分的规则示例

您可以在 YARA-L 2.0 规则中添加可选的 outcome 部分,以提取每次检测的其他信息。在条件部分,您还可以指定结果变量的条件。您可以使用检测规则的 outcome 部分来设置供下游使用的变量。例如,您可以根据正在分析的事件中的数据设置严重程度得分。

详情请参阅以下内容:

包含结果部分的多事件规则:

以下规则会查看两个事件,以获取 $hostname 的值。如果 $hostname 的值在 5 分钟内匹配,则应用严重程度得分。在 match 部分中添加时间段时,规则会在指定的时间段内进行检查。

rule OutcomeRuleMultiEvent {
    meta:
      author = "Google Cloud Security"
    events:
      $u.udm.principal.hostname = $hostname
      $asset_context.graph.entity.hostname = $hostname

      $severity = $asset_context.graph.entity.asset.vulnerabilities.severity

    match:
      $hostname over 5m

    outcome:
      $risk_score =
        max(
            100
          + if($hostname = "my-hostname", 100, 50)
          + if($severity = "HIGH", 10)
          + if($severity = "MEDIUM", 5)
          + if($severity = "LOW", 1)
        )

      $asset_id_list =
        array(
          if($u.principal.asset_id = "",
             "Empty asset id",
             $u.principal.asset_id
          )
        )

      $asset_id_distinct_list = array_distinct($u.principal.asset_id)

      $asset_id_count = count($u.principal.asset_id)

      $asset_id_distinct_count = count_distinct($u.principal.asset_id)

    condition:
      $u and $asset_context and $risk_score > 50 and not arrays.contains($asset_id_list, "id_1234")
}

rule OutcomeRuleMultiEvent {
    meta:
      author = "alice@example.com"
    events:
      $u.udm.principal.hostname = $hostname
      $asset_context.graph.entity.hostname = $hostname

      $severity = $asset_context.graph.entity.asset.vulnerabilities.severity

    match:
      $hostname over 5m

    outcome:
      $total_network_bytes = sum($u.network.sent_bytes) + sum($u.network.received_bytes)

      $risk_score = if(total_network_bytes > 1024, 100, 50) + 
        max(
          if($severity = "HIGH", 10)
          + if($severity = "MEDIUM", 5)
          + if($severity = "LOW", 1)
        )

      $asset_id_list =
        array(
          if($u.principal.asset_id = "",
             "Empty asset id",
             $u.principal.asset_id
          )
        )

      $asset_id_distinct_list = array_distinct($u.principal.asset_id)

      $asset_id_count = count($u.principal.asset_id)

      $asset_id_distinct_count = count_distinct($u.principal.asset_id)

    condition:
      $u and $asset_context and $risk_score > 50 and not arrays.contains($asset_id_list, "id_1234")
}

包含结果部分的单事件规则:

rule OutcomeRuleSingleEvent {
    meta:
        author = "alice@example.com"
    events:
        $u.metadata.event_type = "FILE_COPY"
        $u.principal.file.size = $file_size
        $u.principal.hostname = $hostname

    outcome:
        $suspicious_host = $hostname
        $admin_severity = if($u.principal.userid in %admin_users, "SEVERE", "MODERATE")
        $severity_tag = if($file_size > 1024, $admin_severity, "LOW")

    condition:
        $u
}

将多事件结果规则重构为单事件结果规则。

outcome 部分可用于单事件规则(不含 match 部分的规则)和多事件规则(含 match 部分的规则)。如果您之前将规则设计为多事件规则只是为了使用结果部分,那么您可以选择删除 match 部分来重构这些规则,以提高性能。请注意,由于您的规则不再包含应用分组的 match 部分,您可能会收到更多检测结果。此重构仅适用于使用一个事件变量的规则,如以下示例所示。

仅使用一个事件变量的多事件结果规则(重构的理想候选对象):

rule OutcomeMultiEventPreRefactor {
    meta:
      author = "alice@example.com"
      description = "Outcome refactor rule, before the refactor"

    events:
      $u.udm.principal.hostname = $hostname

    match:
      $hostname over 5m

    outcome:
      $risk_score = max(if($hostname = "my-hostname", 100, 50))

    condition:
      $u
}

您可以删除 match 部分来重构规则。请注意,您还必须移除 outcome 部分中的聚合,因为规则现在是单事件规则。如需详细了解聚合,请参阅结果聚合

rule OutcomeSingleEventPostRefactor {
    meta:
      author = "alice@example.com"
      description = "Outcome refactor rule, after the refactor"

    events:
      $u.udm.principal.hostname = $hostname

    // We deleted the match section.

    outcome:
      // We removed the max() aggregate.
      $risk_score = if($hostname = "my-hostname", 100, 50)

    condition:
      $u
}

函数到占位符规则示例

您可以将占位符变量分配给函数调用的结果,并可以在规则的其他部分(例如 match 部分、outcome 部分或 condition 部分)中使用该占位符变量。请参阅以下示例:

rule FunctionToPlaceholderRule {
    meta:
      author = "alice@example.com"
      description = "Rule that uses function to placeholder assignments"

    events:
        $u.metadata.event_type = "EMAIL_TRANSACTION"

        // Use function-placeholder assignment to extract the
        // address from an email.
        // address@website.com -> address
        $email_to_address_only = re.capture($u.network.email.from , "(.*)@")

        // Use function-placeholder assignment to normalize an email:
        // uid@??? -> uid@company.com
        $email_from_normalized = strings.concat(
            re.capture($u.network.email.from , "(.*)@"),
            "@company.com"
        )

        // Use function-placeholder assignment to get the day of the week of the event.
        // 1 = Sunday, 7 = Saturday.
        $dayofweek = timestamp.get_day_of_week($u.metadata.event_timestamp.seconds)

    match:
        // Use placeholder (from function-placeholder assignment) in match section.
        // Group by the normalized from email, and expose it in the detection.
        $email_from_normalized over 5m

    outcome:
        // Use placeholder (from function-placeholder assignment) in outcome section.
        // Assign more risk if the event happened on weekend.
        $risk_score = max(
            if($dayofweek = 1, 10, 0) +
            if($dayofweek = 7, 10, 0)
        )

    condition:
        // Use placeholder (from function-placeholder assignment) in condition section.
        // Match if an email was sent to multiple addresses.
        #email_to_address_only > 1
}

结果条件示例规则

condition 部分中,您可以使用在 outcome 部分中定义的结果变量。以下示例演示了如何使用结果条件根据风险得分进行过滤,以减少检测中的噪声。

rule OutcomeConditionalRule {
    meta:
        author = "alice@example.com"
        description = "Rule that uses outcome conditionals"

    events:
        $u.metadata.event_type = "FILE_COPY"
        $u.principal.file.size = $file_size
        $u.principal.hostname = $hostname

        // 1 = Sunday, 7 = Saturday.
        $dayofweek = timestamp.get_day_of_week($u.metadata.collected_timestamp.seconds)

    outcome:
        $risk_score =
            if($file_size > 500*1024*1024, 2) + // Files 500MB are moderately risky
            if($file_size > 1024*1024*1024, 3) + // Files over 1G get assigned extra risk
            if($dayofweek=1 or $dayofweek=7, 4) + // Events from the weekend are suspicious
            if($hostname = /highly-privileged/, 5) // Check for files from highly privileged devices

    condition:
        $u and $risk_score >= 10
}

需要更多帮助?从社区成员和 Google SecOps 专业人士那里获得解答。