Overview of aliasing and UDM enrichment in Google Security Operations

This document provides an overview of aliasing and UDM enrichment in Google Security Operations. It outlines common use cases and explains how aliasing and enrichment work within the platform.

Aliasing and UDM enrichment are key concepts in Google SecOps. They work together but serve different purposes.

  • Aliasing identifies the different names and additional context data that describe an indicator.
  • Enrichment uses aliasing to add context to a UDM event.

For example, a UDM event includes the hostname alex-macbook and indicates that a malicious file hash was executed by user alex. Using aliasing, we find that the hostname alex-macbook was assigned the IP address 192.0.2.0 at the time of the event, and that alex is leaving the company in 2 weeks. Stitching these aliases into the original UDM event adds context.

Supported aliasing and enrichment capabilities

Google SecOps supports aliasing and enrichment for the following:

  • Assets
  • Users
  • Processes
  • File hash metadata
  • Geographic locations
  • Cloud resources

How aliasing works

Aliasing enables enrichment. For example, using aliasing, you can find other IP addresses and MAC addresses associated with a hostname, or the job title and employment status associated with a user ID.

Like other features in Google SecOps, aliasing requires data to be ingested and indexed. Aliasing is organized into three main categories:

  • Customer-specific data: Data unique to a customer. For example, only Aristocrat can provide data for amal@aristocrat.com. Customer-specific aliasing types include assets, users, and processes.
  • Global data: Ingested and indexed data that applies to all customers. For example, a globally-sourced indication about a malicious file can be used to check for the presence of that file in your enterprise.
  • Third-party service: Aliasing done by a third-party service provider. Google SecOps uses geographical services to find the physical location of IP addresses.

These types of aliasing are used together to generate asset aliasing results.

Asset aliasing

Asset aliasing links hostnames, IP addresses, MAC addresses, asset IDs, and other metadata. It involves the following steps:

  • EDR aliasing: Maps product IDs (asset IDs) to hostnames. EDR mapping fields are derived exclusively from the CS_EDR log type.
  • DHCP aliasing: Uses DHCP events to link hostnames, MAC addresses, and IP addresses.
  • Asset context aliasing: Associates an asset indicator with entity data, such as hostname, IP address, MAC address, software version, and deployment status.

EDR mapping indexed fields

Google SecOps indexes EDR mapping fields to generate aliases that link hostnames and product-specific IDs.

The following table lists the UDM fields and their corresponding indicator types:

UDM field | Indicator type
principal.hostname and principal.asset.hostname | HOSTNAME
principal.asset_id and principal.asset.asset_id | PRODUCT_SPECIFIC_ID

DHCP indexed fields

Google SecOps indexes DHCP records to generate aliases that link hostnames, IP addresses, and MAC addresses.

The following table lists the UDM fields and their corresponding indicator types used for asset aliasing:

UDM field | Indicator type
principal.ip and principal.asset.ip | ASSET_IP_ADDRESS
principal.mac and principal.asset.mac | MAC
principal.hostname and principal.asset.hostname | HOSTNAME
principal.asset_id and principal.asset.asset_id | PRODUCT_SPECIFIC_ID
network.dhcp.yiaddr on ACK, OFFER, WIN_DELETED, and WIN_EXPIRED | ASSET_IP_ADDRESS
network.dhcp.ciaddr on INFORM, RELEASE, and REQUEST | ASSET_IP_ADDRESS
network.dhcp.requested_address on DECLINE | ASSET_IP_ADDRESS
network.dhcp.chaddr | MAC
network.dhcp.client_hostname | HOSTNAME
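
The DHCP rows above can be read as a lookup from message type to the field that carries the asset IP address. The following is a minimal sketch of that reading, not the platform's implementation. It assumes a DHCP record is available as a plain dictionary and that the message type is stored under a "type" key; the function name and dictionary layout are hypothetical.

```python
from typing import Dict, List, Tuple

# Message type -> network.dhcp field that carries the asset IP (from the table above).
DHCP_IP_FIELD_BY_TYPE = {
    "ACK": "yiaddr",
    "OFFER": "yiaddr",
    "WIN_DELETED": "yiaddr",
    "WIN_EXPIRED": "yiaddr",
    "INFORM": "ciaddr",
    "RELEASE": "ciaddr",
    "REQUEST": "ciaddr",
    "DECLINE": "requested_address",
}


def dhcp_indicators(dhcp: Dict) -> List[Tuple[str, str]]:
    """Return (indicator_type, value) pairs for one network.dhcp record.

    Assumption: the record is a dict and the message type lives under "type".
    """
    indicators = []
    ip_field = DHCP_IP_FIELD_BY_TYPE.get(dhcp.get("type", ""))
    if ip_field and dhcp.get(ip_field):
        indicators.append(("ASSET_IP_ADDRESS", dhcp[ip_field]))
    if dhcp.get("chaddr"):
        indicators.append(("MAC", dhcp["chaddr"]))
    if dhcp.get("client_hostname"):
        indicators.append(("HOSTNAME", dhcp["client_hostname"]))
    return indicators
```

In this sketch, a message type that is not listed in the table contributes no IP indicator, but its MAC address and hostname are still returned.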

Asset context indexed fields

Google SecOps ingests ASSET_CONTEXT events as entity context events, rather than UDM events.

The following table lists the entity fields and their corresponding indicator types:

Entity field | Indicator type
entity.asset.product_object_id | PRODUCT_OBJECT_ID
entity.metadata.product_entity_id (if the product object ID for the asset is missing) | PRODUCT_OBJECT_ID
entity.asset.asset_id | PRODUCT_SPECIFIC_ID
entity.asset.hostname | HOSTNAME
entity.asset.ip | ASSET_IP_ADDRESS
entity.asset.mac | MAC
entity.namespace | NAMESPACE
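
The fallback in the second row (use entity.metadata.product_entity_id when the asset has no product object ID) is easiest to see in code. The following is an illustrative sketch only. It assumes the entity is a nested dictionary that mirrors the field names in the table and that ip and mac are lists; the function name is hypothetical.

```python
from typing import Dict, List, Tuple


def asset_context_indicators(entity: Dict) -> List[Tuple[str, str]]:
    """Return (indicator_type, value) pairs for one ASSET_CONTEXT entity (assumed dict shape)."""
    asset = entity.get("asset", {})
    metadata = entity.get("metadata", {})
    indicators = []

    # Prefer the asset's own object ID; fall back to the metadata entity ID.
    object_id = asset.get("product_object_id") or metadata.get("product_entity_id")
    if object_id:
        indicators.append(("PRODUCT_OBJECT_ID", object_id))

    if asset.get("asset_id"):
        indicators.append(("PRODUCT_SPECIFIC_ID", asset["asset_id"]))
    if asset.get("hostname"):
        indicators.append(("HOSTNAME", asset["hostname"]))
    indicators.extend(("ASSET_IP_ADDRESS", ip) for ip in asset.get("ip", []))
    indicators.extend(("MAC", mac) for mac in asset.get("mac", []))
    if entity.get("namespace"):
        indicators.append(("NAMESPACE", entity["namespace"]))
    return indicators
```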

User aliasing

User aliasing finds information through a user indicator. For example, using an employee email address, you can find more details about that employee, such as their name, job title, and employment status. User aliasing uses the USER_CONTEXT event batch type for aliasing.

User context indexed fields

Google SecOps ingests USER_CONTEXT events as entity context events, rather than UDM events.

The following table lists the entity fields and their corresponding indicator types:

Entity field | Indicator type
entity.user.product_object_id | PRODUCT_OBJECT_ID
entity.metadata.product_entity_id (if user product object ID is missing) | PRODUCT_OBJECT_ID
entity.user.userid | USERNAME
entity.user.email_addresses | EMAIL
entity.user.windows_sid | WINDOWS_SID
entity.user.employee_id | EMPLOYEE_ID
entity.namespace | NAMESPACE

Process aliasing

Process aliasing maps a product-specific process ID (product_specific_process_id) to the actual process, and retrieves information about the parent process. Process aliasing uses the EDR event batch type for aliasing.

EDR indexed fields for process aliasing

When a process is launched, metadata such as command lines, file hashes, and parent process details are collected. The EDR software running on the machine assigns a vendor-specific process UUID.

The following table lists the fields that are indexed during a process launch event:

UDM field | Indicator type
target.product_specific_process_id | PROCESS_ID
target.process | Whole process; not just the indicator

In addition to the target.process field from the normalized event, Google SecOps also collects and indexes parent process information.

File hash metadata aliasing

File hash metadata aliasing identifies file metadata, such as other file hashes or file sizes, based on a given file hash (sha256, sha1, or md5). File hash metadata aliasing uses the FILE_CONTEXT event batch type for aliasing.

File context indexed fields

Google SecOps ingests FILE_CONTEXT events from VirusTotal as entity context events. These events are global and not customer-specific.

The following table lists the indexed entity fields and their corresponding indicator types:

Entity field | Indicator type
entity.file.sha256 | PRODUCT_OBJECT_ID
entity.metadata.product_entity_id (if the file sha256 is missing) | PRODUCT_OBJECT_ID
entity.file.md5 | HASH_MD5
entity.file.sha1 | HASH_SHA1
entity.file.sha256 | HASH_SHA256
entity.namespace | NAMESPACE

IP geolocation aliasing

Geographic aliasing provides geolocation-enriched data for external IP addresses. For each IP address in the principal, target, or src field for a UDM event, if the address is unaliased, an ip_geo_artifact subproto is created with the associated location and ASN information.

Geographic aliasing does not use lookback or caching. Due to the high volume of events, Google SecOps maintains an index in memory. The index is sourced from the IPGeo simple server MPM and is updated every two weeks.

Resource aliasing

Resource aliasing returns cloud resource information for a given resource ID. For example, it can return information for a Bigtable instance using its Google Cloud URI. It does not use lookback or caching.

Resource aliasing does not enrich UDM events. However, some products, such as Alert Graph, use resource aliasing. Cloud resource aliasing uses the RESOURCE_CONTEXT event batch type.

Resource context indexed fields

Cloud resource metadata context events are ingested as RESOURCE_CONTEXT events.

The following table lists the entity fields and their corresponding indicator types:

Entity field | Indicator type
entity.resource.product_object_id | PRODUCT_OBJECT_ID
entity.metadata.product_entity_id (if the product object ID for the resource is missing) | PRODUCT_OBJECT_ID
entity.resource.name | CLOUD_RESOURCE_NAME
entity.namespace | NAMESPACE

Enrichment

Enrichment uses aliasing to add context to a UDM indicator or event in the following ways:

  • Identifies alias entities that describe an indicator, typically a UDM field.
  • Populates the related parts of the UDM message with enriched values linked to the returned aliases or entities.

Asset enrichment

For each UDM event, the pipeline extracts the following UDM fields from the principal, src, and target entities:

UDM field | Indicator type
hostname | HOSTNAME
asset_id | PRODUCT_SPECIFIC_ID
mac | MAC
ip (only if asset_id is empty) | IP

Each asset indicator is namespaced. The empty namespace is treated as valid. For each asset indicator, the pipeline performs the following actions:

  • Retrieves the aliases for the full day of the event time.
  • Builds a backstory.Asset message from the aliasing response.
  • Maps each noun type and indicator to a backstory.Asset message and merges all related protos.
  • Sets the top-level asset fields and asset proto message using the merged backstory.Asset message.
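
The extraction step that precedes these actions can be sketched as follows. This is a simplified illustration, not the pipeline's code. It assumes the event is a nested dictionary, that each noun carries hostname, asset_id, mac, ip, and namespace directly, and that mac and ip are lists; the function name is hypothetical. The key rule it shows is that ip is used as an indicator only when asset_id is empty.

```python
from typing import Dict, List, Tuple

NOUNS = ("principal", "src", "target")


def extract_asset_indicators(event: Dict) -> List[Tuple[str, str, str, str]]:
    """Return (noun, namespace, indicator_type, value) tuples for one event."""
    indicators = []
    for noun_name in NOUNS:
        noun = event.get(noun_name) or {}
        # The empty namespace is treated as valid, so it is the default here.
        namespace = noun.get("namespace", "")
        if noun.get("hostname"):
            indicators.append((noun_name, namespace, "HOSTNAME", noun["hostname"]))
        if noun.get("asset_id"):
            indicators.append((noun_name, namespace, "PRODUCT_SPECIFIC_ID", noun["asset_id"]))
        for mac in noun.get("mac", []):
            indicators.append((noun_name, namespace, "MAC", mac))
        # IP addresses are used only when no asset_id is present.
        if not noun.get("asset_id"):
            for ip in noun.get("ip", []):
                indicators.append((noun_name, namespace, "IP", ip))
    return indicators
```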

User enrichment

For each UDM event, the pipeline extracts the following UDM fields from principal, src, and target:

UDM field | Indicator type
email_addresses | EMAIL
userid | USERNAME
windows_sid | WINDOWS_SID
employee_id | EMPLOYEE_ID
product_object_id | PRODUCT_OBJECT_ID

For each indicator, the pipeline performs the following actions:

  • Retrieves a list of user entities. For example, the entities of principal.email_addresses and principal.userid might be the same, or they might be different.
  • Chooses the aliases from the best indicator type, using this priority order: WINDOWS_SID, EMAIL, USERNAME, EMPLOYEE_ID, and PRODUCT_OBJECT_ID.
  • Populates noun.user with the entity whose validity interval intersects with the event time.
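
The selection logic in the last two steps can be sketched as follows. This is one possible reading, not the pipeline's implementation. It assumes aliases are grouped by indicator type, that each alias carries hypothetical valid_from and valid_until timestamps forming a half-open interval, and that only the highest-priority indicator type with any aliases is consulted.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional

# Priority order as stated in the list above.
USER_INDICATOR_PRIORITY = [
    "WINDOWS_SID",
    "EMAIL",
    "USERNAME",
    "EMPLOYEE_ID",
    "PRODUCT_OBJECT_ID",
]


def choose_user(aliases_by_type: Dict[str, List[dict]], event_time: datetime) -> Optional[dict]:
    """Pick the user entity from the best indicator type that is valid at event_time."""
    for indicator_type in USER_INDICATOR_PRIORITY:
        entities = aliases_by_type.get(indicator_type)
        if not entities:
            continue
        for entity in entities:
            # Assumed half-open validity interval: [valid_from, valid_until).
            if entity["valid_from"] <= event_time < entity["valid_until"]:
                return entity["user"]
        # Assumption: lower-priority types are not consulted once a type has aliases.
        return None
    return None
```

Whether the real pipeline falls back to a lower-priority indicator type when no entity of the best type is valid at the event time is not stated in this document; the sketch does not fall back.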

Process enrichment

For each UDM event, the pipeline extracts process.product_specific_process_id (PSPI) from the following fields:

  • principal
  • src
  • target
  • principal.process.parent_process
  • src.process.parent_process
  • target.process.parent_process

The pipeline then finds the actual process from the PSPI using process aliasing, which also returns information about the parent process. It merges this data into the related noun.process field in the enriched message.
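
The lookup and merge can be sketched as follows. This is a minimal illustration under several assumptions: the event is a nested dictionary, the process aliasing results are available as a dictionary keyed by PSPI (process_index is a hypothetical stand-in for the aliasing service), and values already present on the event are kept when they collide with aliased values. The merge policy in particular is an assumption; this document does not specify it.

```python
import copy
from typing import Dict

# Locations of process.product_specific_process_id listed above.
PSPI_PATHS = [
    ("principal", "process"),
    ("src", "process"),
    ("target", "process"),
    ("principal", "process", "parent_process"),
    ("src", "process", "parent_process"),
    ("target", "process", "parent_process"),
]


def enrich_processes(event: Dict, process_index: Dict[str, Dict]) -> Dict:
    """Merge aliased process data into the event in place and return it."""
    for path in PSPI_PATHS:
        node = event
        for key in path:
            child = node.get(key)
            if not isinstance(child, dict):
                node = None
                break
            node = child
        if node is None:
            continue
        pspi = node.get("product_specific_process_id")
        aliased = process_index.get(pspi) if pspi else None
        if not aliased:
            continue
        for field, value in aliased.items():
            # Assumed policy: keep values already on the event; copy to avoid
            # sharing mutable state with the index.
            node.setdefault(field, copy.deepcopy(value))
    return event
```

Because parent paths are visited after their children, a parent process introduced by the first lookup can itself be resolved in the same pass.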

Artifact enrichment

Artifact enrichment adds file hash metadata from VirusTotal and IP locations from geolocation data. For each UDM event, the pipeline extracts and queries context data for these artifact indicators from the principal, src, and target entities:

  • IP address: Queries data only if it's public or routable.
  • File hashes: Queries hashes in the following order:
    • file.sha256
    • file.sha1
    • file.md5
    • process.file.sha256
    • process.file.sha1
    • process.file.md5
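
The two indicator rules above can be sketched as follows. This is an illustration only. It uses the is_global property of Python's standard ipaddress module as a stand-in for "public or routable", which may differ from the check the pipeline performs, and it assumes each noun is a nested dictionary; the function names are hypothetical.

```python
import ipaddress
from typing import Dict, List, Tuple

# Hash lookup order as listed above.
HASH_PATHS = [
    ("file", "sha256"),
    ("file", "sha1"),
    ("file", "md5"),
    ("process", "file", "sha256"),
    ("process", "file", "sha1"),
    ("process", "file", "md5"),
]


def is_public_ip(value: str) -> bool:
    """Return True for globally routable addresses (stand-in for the pipeline's check)."""
    try:
        return ipaddress.ip_address(value).is_global
    except ValueError:
        return False


def hash_indicators(noun: Dict) -> List[Tuple[str, str]]:
    """Return (field_path, hash) pairs that are present on the noun, in query order."""
    found = []
    for path in HASH_PATHS:
        node = noun
        for key in path:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break
        if node:
            found.append((".".join(path), node))
    return found
```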

The pipeline uses UNIX epoch and event hour to define the time range for the file artifact queries. If the geolocation data is available, the pipeline overwrites the following UDM fields for the respective principal, src, and target, based on where the geolocation data originates:

  • artifact.ip
  • artifact.location
  • artifact.network (only if the data includes IP network context)
  • location (only if the original data doesn't include this field)

If the pipeline finds file hash metadata, it adds that metadata to the file or process.file fields, depending on where the indicator comes from. The pipeline keeps any existing values that don't overlap with the new data.
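
One way to read this merge rule is sketched below. It assumes file metadata is a flat dictionary and that, where a field exists in both the event and the fetched metadata, the fetched value takes precedence; existing fields with no counterpart are left untouched. The function name is hypothetical.

```python
from typing import Dict


def merge_file_metadata(existing: Dict, fetched: Dict) -> Dict:
    """Return a new dict: fetched values win on overlap, other existing values are kept."""
    merged = dict(existing)
    merged.update(fetched)
    return merged
```

For example, an event that already records full_path keeps that value, while a size reported by the fetched metadata replaces the size on the event.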