Overview of aliasing and UDM enrichment in Google Security Operations

Supported in:

Google SecOps SIEM

This document provides an overview of aliasing and UDM enrichment in Google Security Operations. It outlines common use cases and explains how aliasing and enrichment work within the platform.

Aliasing and UDM enrichment are key concepts in Google SecOps. They work together but serve different purposes.

Aliasing identifies the different names and additional context data that describe an indicator.
Enrichment uses aliasing to add context to a UDM event.

For example, a UDM event includes the hostname alex-macbook and indicates that a malicious file hash was executed by user alex. Using aliasing, we find that the hostname alex-macbook was assigned the IP address 192.0.2.0 at the time of the event, and that alex is leaving the company in 2 weeks. Stitching these aliases into the original UDM event adds context.
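
The example above can be sketched in Python as follows. This is a minimal illustration, not the platform's implementation: the alias store, the dictionary-shaped event, the `asset_context` and `user_context` keys, and the dates are all hypothetical stand-ins.

```python
from datetime import datetime, timezone

# Hypothetical alias store: (indicator, alias field, alias value, valid_from, valid_to).
ALIASES = [
    ("alex-macbook", "ip", "192.0.2.0",
     datetime(2024, 5, 1, tzinfo=timezone.utc), datetime(2024, 5, 2, tzinfo=timezone.utc)),
    ("alex", "employment_status", "leaving in 2 weeks",
     datetime(2024, 4, 25, tzinfo=timezone.utc), datetime(2024, 5, 9, tzinfo=timezone.utc)),
]

def lookup_aliases(indicator, event_time):
    """Return the alias fields that were valid for the indicator at event_time."""
    return {
        field: value
        for ind, field, value, start, end in ALIASES
        if ind == indicator and start <= event_time < end
    }

def enrich(event):
    """Copy the event and stitch in the aliases of its hostname and user."""
    enriched = dict(event)
    enriched["asset_context"] = lookup_aliases(event["hostname"], event["time"])
    enriched["user_context"] = lookup_aliases(event["user"], event["time"])
    return enriched

event = {
    "hostname": "alex-macbook",
    "user": "alex",
    "time": datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc),
}
enriched = enrich(event)
```

The lookup is time-bound because an alias such as a DHCP-assigned IP address only describes the host during its validity interval.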

Supported aliasing and enrichment capabilities

Google SecOps supports aliasing and enrichment for the following:

Assets
Users
Processes
File hash metadata
Geographic locations
Cloud resources

How aliasing works

Aliasing enables enrichment. For example, using aliasing, you can find other IP addresses and MAC addresses associated with a hostname, or the job title and employment status associated with a user ID.

Like other features in Google SecOps, aliasing requires data to be ingested and indexed. Aliasing is organized into three main categories:

Customer-specific data: Data unique to a customer. For example, only Aristocrat can provide data for amal@aristocrat.com. Customer-specific aliasing types include assets, users, and processes.
Global data: Ingested and indexed data that applies to all customers. For example, a globally sourced indicator for a malicious file can be used to check for the presence of that file in your enterprise.
Third-party service: Aliasing done by a third-party service provider. Google SecOps uses geographical services to find the physical location of IP addresses.

These types of aliasing are used together to generate asset aliasing results.

Asset aliasing

Asset aliasing links hostnames, IP addresses, MAC addresses, asset IDs, and other metadata. It involves the following steps:

EDR aliasing: Maps product IDs (asset IDs) to hostnames. EDR mapping fields are derived exclusively from the CS_EDR log type.
DHCP aliasing: Uses DHCP events to link hostnames, MAC addresses, and IP addresses.
Asset context aliasing: Associates an asset indicator with entity data, such as hostname, IP address, MAC address, software version, and deployment status.

EDR mapping indexed fields

Google SecOps indexes EDR mapping fields to generate aliases that link hostnames and product-specific IDs.

The following table lists the UDM fields and their corresponding indicator types:

UDM field	Indicator type
principal.hostname and principal.asset.hostname	`HOSTNAME`
principal.asset_id and principal.asset.asset_id	`PRODUCT_SPECIFIC_ID`

DHCP indexed fields

Google SecOps indexes DHCP records to generate aliases that link hostnames, IP addresses, and MAC addresses.

The following table lists the UDM fields and their corresponding indicator types used for asset aliasing:

UDM field	Indicator type
principal.ip and principal.asset.ip	`ASSET_IP_ADDRESS`
principal.mac and principal.asset.mac	`MAC`
principal.hostname and principal.asset.hostname	`HOSTNAME`
principal.asset_id and principal.asset.asset_id	`PRODUCT_SPECIFIC_ID`
network.dhcp.yiaddr on ACK, OFFER, WIN_DELETED, and WIN_EXPIRED	`ASSET_IP_ADDRESS`
network.dhcp.ciaddr on INFORM, RELEASE, and REQUEST	`ASSET_IP_ADDRESS`
network.dhcp.requested_address on DECLINE	`ASSET_IP_ADDRESS`
network.dhcp.chaddr	`MAC`
network.dhcp.client_hostname	`HOSTNAME`
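
The DHCP rows of the table above choose the IP address field based on the DHCP message type. A minimal sketch of that selection follows. It assumes the `network.dhcp` section is available as a plain dictionary and that the message type is stored under a `type` key; both are illustrative assumptions rather than documented behavior.

```python
# Which network.dhcp field carries the asset IP address, per message type
# (taken from the table above).
DHCP_IP_FIELD_BY_TYPE = {
    "ACK": "yiaddr",
    "OFFER": "yiaddr",
    "WIN_DELETED": "yiaddr",
    "WIN_EXPIRED": "yiaddr",
    "INFORM": "ciaddr",
    "RELEASE": "ciaddr",
    "REQUEST": "ciaddr",
    "DECLINE": "requested_address",
}

def dhcp_indicators(dhcp):
    """Return (indicator type, value) pairs for one DHCP record."""
    indicators = []
    # The "type" key is an assumed location for the DHCP message type.
    ip_field = DHCP_IP_FIELD_BY_TYPE.get(dhcp.get("type"))
    if ip_field and dhcp.get(ip_field):
        indicators.append(("ASSET_IP_ADDRESS", dhcp[ip_field]))
    if dhcp.get("chaddr"):
        indicators.append(("MAC", dhcp["chaddr"]))
    if dhcp.get("client_hostname"):
        indicators.append(("HOSTNAME", dhcp["client_hostname"]))
    return indicators

ack = {
    "type": "ACK",
    "yiaddr": "192.0.2.10",
    "ciaddr": "0.0.0.0",
    "chaddr": "00:00:5e:00:53:01",
    "client_hostname": "alex-macbook",
}
ack_indicators = dhcp_indicators(ack)
```

For the ACK record, only `yiaddr` is used as the IP indicator; `ciaddr` is ignored because it is indexed only for INFORM, RELEASE, and REQUEST messages.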

Asset context indexed fields

Google SecOps ingests ASSET_CONTEXT events as entity context events, rather than UDM events.

The following table lists the entity fields and their corresponding indicator types:

Entity field	Indicator type
entity.asset.product_object_id	`PRODUCT_OBJECT_ID`
entity.metadata.product_entity_id (if the product object ID for the asset is missing)	`PRODUCT_OBJECT_ID`
entity.asset.asset_id	`PRODUCT_SPECIFIC_ID`
entity.asset.hostname	`HOSTNAME`
entity.asset.ip	`ASSET_IP_ADDRESS`
entity.asset.mac	`MAC`
entity.namespace	`NAMESPACE`
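
The fallback in the second row of the table can be sketched as follows. The function name and the dictionary-shaped entity are hypothetical, and repeated fields such as `ip` and `mac` are treated as single values for brevity.

```python
def asset_context_indicators(entity):
    """Return (indicator type, value) pairs for one ASSET_CONTEXT entity."""
    asset = entity.get("asset", {})
    metadata = entity.get("metadata", {})
    # Fall back to metadata.product_entity_id when the asset has no product object ID.
    object_id = asset.get("product_object_id") or metadata.get("product_entity_id")
    candidates = [
        ("PRODUCT_OBJECT_ID", object_id),
        ("PRODUCT_SPECIFIC_ID", asset.get("asset_id")),
        ("HOSTNAME", asset.get("hostname")),
        ("ASSET_IP_ADDRESS", asset.get("ip")),
        ("MAC", asset.get("mac")),
        ("NAMESPACE", entity.get("namespace")),
    ]
    # Skip indicator types whose source field is missing or empty.
    return [(kind, value) for kind, value in candidates if value]

entity = {
    "metadata": {"product_entity_id": "entity-123"},
    "asset": {"hostname": "alex-macbook", "ip": "192.0.2.10"},
}
indicators = asset_context_indicators(entity)
```

The same fallback pattern applies to the user, file, and resource context tables later in this document, with the field prefix changed accordingly.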

User aliasing

User aliasing finds information through a user indicator. For example, using an employee email address, you can find more details about that employee, such as their name, job title, and employment status. User aliasing uses the USER_CONTEXT event batch type for aliasing.

User context indexed fields

Google SecOps ingests USER_CONTEXT events as entity context events, rather than UDM events.

The following table lists the entity fields and their corresponding indicator types:

Entity field	Indicator type
entity.user.product_object_id	`PRODUCT_OBJECT_ID`
entity.metadata.product_entity_id (if user product object ID is missing)	`PRODUCT_OBJECT_ID`
entity.user.userid	`USERNAME`
entity.user.email_addresses	`EMAIL`
entity.user.windows_sid	`WINDOWS_SID`
entity.user.employee_id	`EMPLOYEE_ID`
entity.namespace	`NAMESPACE`

Process aliasing

Process aliasing maps a product-specific process ID (product_specific_process_id) to the actual process, and retrieves information about the parent process. Process aliasing uses the EDR event batch type for aliasing.

EDR indexed fields for process aliasing

When a process is launched, metadata such as command lines, file hashes, and parent process details are collected. The EDR software running on the machine assigns a vendor-specific process UUID.

The following table lists the fields that are indexed during a process launch event:

UDM field	Indicator type
target.product_specific_process_id	`PROCESS_ID`
target.process	Whole process; not just the indicator

In addition to the target.process field from the normalized event, Google SecOps also collects and indexes parent process information.

File hash metadata aliasing

File hash metadata aliasing identifies file metadata, such as other file hashes or file sizes, based on a given file hash (sha256, sha1, or md5). File hash metadata aliasing uses the FILE_CONTEXT event batch type for aliasing.

File context indexed fields

Google SecOps ingests FILE_CONTEXT events from VirusTotal as entity context events. These events are global and not customer-specific.

The following table lists the indexed entity fields and their corresponding indicator types:

Entity field	Indicator type
entity.file.sha256	`PRODUCT_OBJECT_ID`
entity.metadata.product_entity_id (if the file `sha256` is missing)	`PRODUCT_OBJECT_ID`
entity.file.md5	`HASH_MD5`
entity.file.sha1	`HASH_SHA1`
entity.file.sha256	`HASH_SHA256`
entity.namespace	`NAMESPACE`

IP geolocation aliasing

Geographic aliasing provides geolocation-enriched data for external IP addresses. For each IP address in the principal, target, or src field for a UDM event, if the address is unaliased, an ip_geo_artifact subproto is created with the associated location and ASN information.

Geographic aliasing does not use lookback or caching. Due to the high volume of events, Google SecOps maintains an index in memory. The index is sourced from the IPGeo simple server MPM and is updated every two weeks.

Resource aliasing

Resource aliasing returns cloud resource information for a given resource ID. For example, it can return information for a Bigtable instance using its Google Cloud URI. It does not use lookback or caching.

Resource aliasing does not enrich UDM events. However, some products, such as Alert Graph, use resource aliasing. Cloud resource aliasing uses the RESOURCE_CONTEXT event batch type.

Resource context indexed fields

Cloud resource metadata context events are ingested as RESOURCE_CONTEXT events.

The following table lists the entity fields and their corresponding indicator types:

Entity field	Indicator type
entity.resource.product_object_id	`PRODUCT_OBJECT_ID`
entity.metadata.product_entity_id (if the product object ID for the resource is missing)	`PRODUCT_OBJECT_ID`
entity.resource.name	`CLOUD_RESOURCE_NAME`
entity.namespace	`NAMESPACE`

Enrichment

Enrichment uses aliasing to add context to a UDM indicator or event in the following ways:

Identifies alias entities that describe an indicator, typically a UDM field.
Populates the related parts of the UDM message with enriched values linked to the returned aliases or entities.

Asset enrichment

For each UDM event, the pipeline extracts the following UDM fields from the principal, src, and target entities:

UDM field	Indicator type
hostname	`HOSTNAME`
asset_id	`PRODUCT_SPECIFIC_ID`
mac	`MAC`
ip (only if asset_id is empty)	`IP`

Each asset indicator is namespaced. The empty namespace is treated as valid. For each asset indicator, the pipeline performs the following actions:

Retrieves the aliases for the full day of the event time.
Builds a backstory.Asset message from the aliasing response.
Maps each noun type and indicator to a backstory.Asset message and merges all related protos.
Sets the top-level asset fields and asset proto message using the merged backstory.Asset message.
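
The indicator extraction and the full-day lookup window can be sketched as follows. This is an illustration under stated assumptions: the noun is a plain dictionary, the day boundary is taken in the timezone of the event timestamp, and the helper names are hypothetical. Building and merging the backstory.Asset messages is not shown.

```python
from datetime import datetime, timedelta, timezone

def _as_list(value):
    """Normalize a missing, scalar, or repeated field to a list."""
    if value is None or value == "":
        return []
    return list(value) if isinstance(value, (list, tuple)) else [value]

def asset_indicators(noun, namespace=""):
    """Return (namespace, indicator type, value) tuples for one noun."""
    found = []
    for field, kind in (("hostname", "HOSTNAME"),
                        ("asset_id", "PRODUCT_SPECIFIC_ID"),
                        ("mac", "MAC")):
        for value in _as_list(noun.get(field)):
            found.append((namespace, kind, value))
    # The IP address is used as an indicator only when asset_id is empty.
    if not noun.get("asset_id"):
        for value in _as_list(noun.get("ip")):
            found.append((namespace, "IP", value))
    return found

def day_window(event_time):
    """Return the [start, end) bounds of the day that contains event_time."""
    start = event_time.replace(hour=0, minute=0, second=0, microsecond=0)
    return start, start + timedelta(days=1)

with_id = asset_indicators({"hostname": "alex-macbook", "asset_id": "edr-42", "ip": ["192.0.2.10"]})
without_id = asset_indicators({"hostname": "alex-macbook", "ip": ["192.0.2.10"]})
window = day_window(datetime(2024, 5, 1, 12, 30, tzinfo=timezone.utc))
```

The empty string is the default namespace here, which mirrors the statement that the empty namespace is treated as valid.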

User enrichment

For each UDM event, the pipeline extracts the following UDM fields from principal, src, and target:

UDM field	Indicator type
email_addresses	`EMAIL`
userid	`USERNAME`
windows_sid	`WINDOWS_SID`
employee_id	`EMPLOYEE_ID`
product_object_id	`PRODUCT_OBJECT_ID`

For each indicator, the pipeline performs the following actions:

Retrieves a list of user entities. For example, the entities of principal.email_address and principal.userid might be the same, or they might be different.
Chooses the aliases from the best indicator type, using this priority order: WINDOWS_SID, EMAIL, USERNAME, EMPLOYEE_ID, and PRODUCT_OBJECT_ID.
Populates noun.user with the entity whose validity interval intersects with the event time.
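
The selection steps above can be sketched as follows. The sketch assumes that "best indicator type" means the highest-priority type that returned any aliases, and that each alias entity carries hypothetical `valid_from` and `valid_to` timestamps forming a half-open interval.

```python
from datetime import datetime, timezone

PRIORITY = ["WINDOWS_SID", "EMAIL", "USERNAME", "EMPLOYEE_ID", "PRODUCT_OBJECT_ID"]

def choose_user_entity(entities_by_type, event_time):
    """Pick the entity to populate noun.user, or None if nothing matches."""
    for indicator_type in PRIORITY:
        entities = entities_by_type.get(indicator_type)
        if not entities:
            continue
        # This is the best indicator type that returned aliases; keep the
        # entity whose validity interval contains the event time.
        for entity in entities:
            if entity["valid_from"] <= event_time < entity["valid_to"]:
                return entity
        return None
    return None

def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)

entities_by_type = {
    "USERNAME": [{"title": "from username", "valid_from": utc(2024, 1, 1), "valid_to": utc(2025, 1, 1)}],
    "EMAIL": [
        {"title": "old role", "valid_from": utc(2023, 1, 1), "valid_to": utc(2024, 1, 1)},
        {"title": "current role", "valid_from": utc(2024, 1, 1), "valid_to": utc(2025, 1, 1)},
    ],
}
chosen = choose_user_entity(entities_by_type, utc(2024, 5, 1))
```

EMAIL outranks USERNAME, so the result comes from the email aliases, and the interval check selects the entity that was valid on the event date.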

Process enrichment

For each UDM event, the pipeline extracts process.product_specific_process_id (PSPI) from the following fields:

principal
src
target
principal.process.parent_process
src.process.parent_process
target.process.parent_process

The pipeline then finds the actual process from the PSPI using process aliasing, which also returns information about the parent process. It merges this data into the related noun.process field in the enriched message.
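
Extracting the PSPI from the six locations listed above can be sketched as follows. The event is modeled as nested dictionaries and the helper names are hypothetical; the alias lookup and the merge into noun.process are not shown.

```python
NOUNS = ("principal", "src", "target")

# The process of each noun, followed by the parent process of each noun.
PROCESS_PATHS = [f"{noun}.process" for noun in NOUNS] + [
    f"{noun}.process.parent_process" for noun in NOUNS
]

def get_path(data, path):
    """Follow a dotted path through nested dicts; return None if any part is missing."""
    for key in path.split("."):
        if not isinstance(data, dict):
            return None
        data = data.get(key)
    return data

def extract_pspis(event):
    """Map each process path to its product_specific_process_id, if present."""
    found = {}
    for path in PROCESS_PATHS:
        pspi = get_path(event, path + ".product_specific_process_id")
        if pspi:
            found[path] = pspi
    return found

event = {
    "principal": {
        "process": {
            "product_specific_process_id": "EDR:child-1",
            "parent_process": {"product_specific_process_id": "EDR:parent-1"},
        }
    },
    "target": {"process": {"pid": "4242"}},
}
pspis = extract_pspis(event)
```

Keeping the path alongside each PSPI lets a later step merge the resolved process back into the matching noun.process field.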

Artifact enrichment

Artifact enrichment adds file hash metadata from VirusTotal and IP locations from geolocation data. For each UDM event, the pipeline extracts and queries context data for these artifact indicators from the principal, src, and target entities:

IP address: Queries data only if it's public or routable.
File hashes: Queries hashes in the following order:
- file.sha256
- file.sha1
- file.md5
- process.file.sha256
- process.file.sha1
- process.file.md5
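
The indicator selection above can be sketched as follows. Using `ipaddress.ip_address(...).is_global` from the Python standard library to approximate "public or routable" is an assumption of this sketch, as are the helper names and the dictionary-shaped noun.

```python
import ipaddress

# Query order for file hashes, as listed above.
HASH_ORDER = [
    "file.sha256", "file.sha1", "file.md5",
    "process.file.sha256", "process.file.sha1", "process.file.md5",
]

def is_public_ip(value):
    """Approximate 'public or routable' with the standard library's is_global."""
    try:
        return ipaddress.ip_address(value).is_global
    except ValueError:
        return False

def get_path(data, path):
    """Follow a dotted path through nested dicts; return None if any part is missing."""
    for key in path.split("."):
        if not isinstance(data, dict):
            return None
        data = data.get(key)
    return data

def hashes_in_query_order(noun):
    """Return (field path, hash) pairs for the hashes present, in query order."""
    pairs = []
    for path in HASH_ORDER:
        value = get_path(noun, path)
        if value:
            pairs.append((path, value))
    return pairs

noun = {"file": {"md5": "md5-placeholder"}, "process": {"file": {"sha256": "sha256-placeholder"}}}
ordered = hashes_in_query_order(noun)
```

Private ranges such as 10.0.0.0/8 fail the `is_global` check, so addresses in them would not be sent for geolocation in this sketch.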

The pipeline uses UNIX epoch and event hour to define the time range for the file artifact queries. If the geolocation data is available, the pipeline overwrites the following UDM fields for the respective principal, src, and target, based on where the geolocation data originates:

artifact.ip
artifact.location
artifact.network (only if the data includes IP network context)
location (only if the original data doesn't include this field)

If the pipeline finds file hash metadata, it adds that metadata to the file or process.file fields, depending on where the indicator comes from. The pipeline keeps any existing values that don't overlap with the new data.
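
One reading of this merge rule is sketched below: fetched metadata is written over matching fields, and existing fields that the fetched data does not provide are kept. The function name and field values are hypothetical.

```python
def merge_file_metadata(existing, fetched):
    """Overlay fetched metadata on the existing file fields."""
    merged = dict(existing)
    # Ignore empty fetched values so they never erase existing data.
    merged.update({key: value for key, value in fetched.items()
                   if value not in (None, "", [])})
    return merged

existing = {"sha256": "sha256-placeholder", "full_path": "/tmp/sample.bin"}
fetched = {"sha256": "sha256-placeholder", "md5": "md5-placeholder", "size": 1024, "sha1": ""}
merged = merge_file_metadata(existing, fetched)
```

In this example `full_path` survives because the fetched metadata has no value for it, while `md5` and `size` are added.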
