Overview of aliasing and UDM enrichment in Google Security Operations
This document provides an overview of aliasing and UDM enrichment in Google Security Operations. It outlines common use cases and explains how aliasing and enrichment work within the platform.
Aliasing and UDM enrichment are key concepts in Google SecOps. They work together but serve different purposes.
- Aliasing identifies the different names and additional context data that describe an indicator.
- Enrichment uses aliasing to add context to a UDM event.
For example, a UDM event includes the hostname alex-macbook and indicates that a malicious file hash was executed by user alex. Using aliasing, we find that the hostname alex-macbook was assigned the IP address 192.0.2.0 at the time of the event, and that alex is leaving the company in 2 weeks. Stitching these aliases into the original UDM event adds context.
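The example above can be sketched in a few lines of Python. This is only an illustration of the idea: the ALIASES store, the flat event dictionary, and the enrich function are hypothetical and do not reflect the Google SecOps data model or API.

```python
from copy import deepcopy

# Hypothetical alias store keyed by (indicator type, value). The context
# fields are illustrative only.
ALIASES = {
    ("HOSTNAME", "alex-macbook"): {"ip": "192.0.2.0"},
    ("USERNAME", "alex"): {"employment_status": "leaving in 2 weeks"},
}


def enrich(event: dict) -> dict:
    """Return a copy of the event with alias context stitched in."""
    enriched = deepcopy(event)
    # Aliasing: look up what else is known about each indicator.
    host_context = ALIASES.get(("HOSTNAME", event.get("hostname")), {})
    user_context = ALIASES.get(("USERNAME", event.get("user")), {})
    # Enrichment: attach the returned context to the event.
    enriched["asset_context"] = host_context
    enriched["user_context"] = user_context
    return enriched


event = {"hostname": "alex-macbook", "user": "alex", "file_verdict": "malicious"}
enriched_event = enrich(event)
```

The original event is left unchanged; the enriched copy carries the IP address and the employment context alongside the original fields.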
Supported aliasing and enrichment capabilities
Google SecOps supports aliasing and enrichment for the following:
- Assets
- Users
- Processes
- File hash metadata
- Geographic locations
- Cloud resources
How aliasing works
Aliasing enables enrichment. For example, using aliasing, you can find other IP addresses and MAC addresses associated with a hostname, or the job title and employment status associated with a user ID.
Like other features in Google SecOps, aliasing requires data to be ingested and indexed. Aliasing is organized into three main categories:
- Customer-specific data: Data unique to a customer. For example, only Aristocrat can provide data for amal@aristocrat.com. Customer-specific aliasing types include assets, users, and processes.
- Global data: Ingested and indexed data that applies to all customers. For example, a globally-sourced indication about a malicious file can be used to check for the presence of that file in your enterprise.
- Third-party service: Aliasing done by a third-party service provider. Google SecOps uses geographical services to find the physical location of IP addresses.
These types of aliasing are used together to generate asset aliasing results.
Asset aliasing
Asset aliasing links hostnames, IP addresses, MAC addresses, asset IDs, and other metadata. It involves the following steps:
- EDR aliasing: Maps product IDs (asset IDs) to hostnames. EDR mapping fields are derived exclusively from the CS_EDR log type.
- DHCP aliasing: Uses DHCP events to link hostnames, MAC addresses, and IP addresses.
- Asset context aliasing: Associates an asset indicator with entity data, such as hostname, IP address, MAC address, software version, and deployment status.
EDR mapping indexed fields
Google SecOps indexes EDR mapping fields to generate aliases that link hostnames and product-specific IDs.
The following table lists the UDM fields and their corresponding indicator types:
UDM field | Indicator type |
---|---|
principal.hostname and principal.asset.hostname | HOSTNAME |
principal.asset_id and principal.asset.asset_id | PRODUCT_SPECIFIC_ID |
DHCP indexed fields
Google SecOps indexes DHCP records to generate aliases that link hostnames, IP addresses, and MAC addresses.
The following table lists the UDM fields and their corresponding indicator types used for asset aliasing:
UDM field | Indicator type |
---|---|
principal.ip and principal.asset.ip | ASSET_IP_ADDRESS |
principal.mac and principal.asset.mac | MAC |
principal.hostname and principal.asset.hostname | HOSTNAME |
principal.asset_id and principal.asset.asset_id | PRODUCT_SPECIFIC_ID |
network.dhcp.yiaddr on ACK, OFFER, WIN_DELETED, and WIN_EXPIRED | ASSET_IP_ADDRESS |
network.dhcp.ciaddr on INFORM, RELEASE, and REQUEST | ASSET_IP_ADDRESS |
network.dhcp.requested_address on DECLINE | ASSET_IP_ADDRESS |
network.dhcp.chaddr | MAC |
network.dhcp.client_hostname | HOSTNAME |
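The DHCP-specific rows of the table depend on the DHCP message type, which can be easier to follow as a lookup. The following is a minimal sketch, assuming a plain dictionary stands in for network.dhcp and that the message type is stored under a "type" key; the function name and the dictionary shape are illustrative, not part of the product.

```python
# DHCP message type -> network.dhcp field that carries the asset IP address,
# transcribed from the table above.
IP_FIELD_BY_MESSAGE_TYPE = {
    "ACK": "yiaddr",
    "OFFER": "yiaddr",
    "WIN_DELETED": "yiaddr",
    "WIN_EXPIRED": "yiaddr",
    "INFORM": "ciaddr",
    "RELEASE": "ciaddr",
    "REQUEST": "ciaddr",
    "DECLINE": "requested_address",
}


def dhcp_indicators(dhcp: dict) -> list:
    """Return (indicator_type, value) pairs for one network.dhcp record.

    `dhcp` is a plain dict standing in for network.dhcp; the "type" key that
    holds the message type is an assumption of this sketch.
    """
    indicators = []
    ip_field = IP_FIELD_BY_MESSAGE_TYPE.get(dhcp.get("type"))
    if ip_field and dhcp.get(ip_field):
        indicators.append(("ASSET_IP_ADDRESS", dhcp[ip_field]))
    if dhcp.get("chaddr"):
        indicators.append(("MAC", dhcp["chaddr"]))
    if dhcp.get("client_hostname"):
        indicators.append(("HOSTNAME", dhcp["client_hostname"]))
    return indicators
```

Message types that are not listed in the table contribute no IP address indicator in this sketch, even if an address field is populated.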
Asset context indexed fields
Google SecOps ingests ASSET_CONTEXT events as entity context events, rather than UDM events.
The following table lists the entity fields and their corresponding indicator types:
Entity field | Indicator type |
---|---|
entity.asset.product_object_id | PRODUCT_OBJECT_ID |
entity.metadata.product_entity_id (if the product object ID for the asset is missing) | PRODUCT_OBJECT_ID |
entity.asset.asset_id | PRODUCT_SPECIFIC_ID |
entity.asset.hostname | HOSTNAME |
entity.asset.ip | ASSET_IP_ADDRESS |
entity.asset.mac | MAC |
entity.namespace | NAMESPACE |
User aliasing
User aliasing finds information through a user indicator. For example, using an
employee email address, you can find more details about that employee, such as
their name, job title, and employment status.
User aliasing uses the USER_CONTEXT event batch type for aliasing.
User context indexed fields
Google SecOps ingests USER_CONTEXT events as entity context events, rather than UDM events.
The following table lists the entity fields and their corresponding indicator types:
Entity field | Indicator type |
---|---|
entity.user.product_object_id | PRODUCT_OBJECT_ID |
entity.metadata.product_entity_id (if user product object ID is missing) | PRODUCT_OBJECT_ID |
entity.user.userid | USERNAME |
entity.user.email_addresses | EMAIL |
entity.user.windows_sid | WINDOWS_SID |
entity.user.employee_id | EMPLOYEE_ID |
entity.namespace | NAMESPACE |
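The fallback in the second row (using entity.metadata.product_entity_id when the user product object ID is missing) can be illustrated as follows. This is a sketch under assumptions: the entity is represented as a nested dictionary that mirrors the field paths in the table, and user_context_indicators is a hypothetical helper, not a product function.

```python
def user_context_indicators(entity: dict) -> list:
    """Return (indicator_type, value) pairs for one USER_CONTEXT entity.

    `entity` is a nested dict mirroring the field paths in the table above.
    """
    user = entity.get("user", {})
    metadata = entity.get("metadata", {})
    indicators = []

    # Prefer the user's own product object ID; fall back to the entity ID.
    object_id = user.get("product_object_id") or metadata.get("product_entity_id")
    if object_id:
        indicators.append(("PRODUCT_OBJECT_ID", object_id))
    if user.get("userid"):
        indicators.append(("USERNAME", user["userid"]))
    for email in user.get("email_addresses", []):
        indicators.append(("EMAIL", email))
    if user.get("windows_sid"):
        indicators.append(("WINDOWS_SID", user["windows_sid"]))
    if user.get("employee_id"):
        indicators.append(("EMPLOYEE_ID", user["employee_id"]))
    if entity.get("namespace"):
        indicators.append(("NAMESPACE", entity["namespace"]))
    return indicators
```

The same fallback pattern appears in the asset, file, and resource context tables, with the asset product object ID, the file sha256, and the resource product object ID as the preferred values.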
Process aliasing
Process aliasing maps a product-specific process ID (product_specific_process_id) to the actual process, and retrieves information about the parent process.
Process aliasing uses the EDR event batch type for aliasing.
EDR indexed fields for process aliasing
When a process is launched, metadata such as command lines, file hashes, and parent process details are collected. The EDR software running on the machine assigns a vendor-specific process UUID.
The following table lists the fields that are indexed during a process launch event:
UDM field | Indicator type |
---|---|
target.product_specific_process_id | PROCESS_ID |
target.process | Whole process; not just the indicator |
In addition to the target.process field from the normalized event, Google SecOps also collects and indexes parent process information.
File hash metadata aliasing
File hash metadata aliasing identifies file metadata, such as other file hashes
or file sizes, based on a given file hash (sha256, sha1, or md5).
File hash metadata aliasing uses the FILE_CONTEXT event batch type for aliasing.
File context indexed fields
Google SecOps ingests FILE_CONTEXT events from VirusTotal as entity context events. These events are global and not customer-specific.
The following table lists the indexed entity fields and their corresponding indicator types:
Entity field | Indicator type |
---|---|
entity.file.sha256 | PRODUCT_OBJECT_ID |
entity.metadata.product_entity_id (if the file sha256 is missing) | PRODUCT_OBJECT_ID |
entity.file.md5 | HASH_MD5 |
entity.file.sha1 | HASH_SHA1 |
entity.file.sha256 | HASH_SHA256 |
entity.namespace | NAMESPACE |
IP geolocation aliasing
Geographic aliasing provides geolocation-enriched data for external IP addresses.
For each IP address in the principal, target, or src field for a UDM event, if the address is unaliased, an ip_geo_artifact subproto is created with the associated location and ASN information.
Geographic aliasing does not use lookback or caching. Due to the high volume of events, Google SecOps maintains an index in memory. The index is sourced from the IPGeo simple server MPM and is updated every two weeks.
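The flow described above (external address, not already aliased, looked up in an in-memory index) might look roughly like the following. Everything here is an assumption made for illustration: GEO_INDEX and its placeholder record stand in for the real index, ip_geo_artifact is a hypothetical function name, and Python's ipaddress is_global property is used as an approximation of "external".

```python
import ipaddress
from typing import Optional

# Hypothetical in-memory index of network -> geolocation record. The values
# are placeholders (AS64496 is reserved for documentation), not lookup data.
GEO_INDEX = {
    ipaddress.ip_network("8.8.8.0/24"): {"location": "PLACEHOLDER", "asn": 64496},
}


def ip_geo_artifact(ip: str, already_aliased: bool = False) -> Optional[dict]:
    """Return a geolocation record for an external, unaliased IP address."""
    address = ipaddress.ip_address(ip)
    # Skip addresses that already resolved to an asset, and addresses that
    # are private, reserved, or otherwise not globally routable.
    if already_aliased or not address.is_global:
        return None
    for network, record in GEO_INDEX.items():
        if address in network:
            return {"ip": ip, **record}
    return None
```

Because the index lives in memory, each lookup is a local scan with no lookback window or cache in front of it, which matches the description above.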
Resource aliasing
Resource aliasing returns cloud resource information for a given resource ID. For example, it can return information for a Bigtable instance using its Google Cloud URI. It does not use lookback or caching.
Resource aliasing does not enrich UDM events. However, some products, such as Alert Graph, use resource aliasing. Cloud resource aliasing uses the RESOURCE_CONTEXT event batch type.
Resource context indexed fields
Cloud resource metadata context events are ingested as RESOURCE_CONTEXT events.
The following table lists the entity fields and their corresponding indicator types:
Entity field | Indicator type |
---|---|
entity.resource.product_object_id | PRODUCT_OBJECT_ID |
entity.metadata.product_entity_id (if the product object ID for the resource is missing) | PRODUCT_OBJECT_ID |
entity.resource.name | CLOUD_RESOURCE_NAME |
entity.namespace | NAMESPACE |
Enrichment
Enrichment uses aliasing to add context to a UDM indicator or event in the following ways:
- Identifies alias entities that describe an indicator, typically a UDM field.
- Populates the related parts of the UDM message with enriched values linked to the returned aliases or entities.
Asset enrichment
For each UDM event, the pipeline extracts the following UDM fields from the principal, src, and target entities:
UDM field | Indicator type |
---|---|
hostname | HOSTNAME |
asset_id | PRODUCT_SPECIFIC_ID |
mac | MAC |
ip (only if asset_id is empty) | IP |
Each asset indicator is namespaced. The empty namespace is treated as valid. For each asset indicator, the pipeline performs the following actions:
- Retrieves the aliases for the full day of the event time.
- Builds a backstory.Asset message from the aliasing response.
- Maps each noun type and indicator to a backstory.Asset message and merges all related protos.
- Sets the top-level asset fields and asset proto message using the merged backstory.Asset message.
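The extraction step, including the rule that ip is used only when asset_id is empty, can be sketched as follows. The event is modeled as a nested dictionary and asset_indicators is a hypothetical helper; the sketch covers only indicator extraction, not the alias lookup or the backstory.Asset merge.

```python
NOUNS = ("principal", "src", "target")


def asset_indicators(event: dict) -> list:
    """Return (noun, namespace, indicator_type, value) tuples for an event.

    `event` is a nested dict standing in for a UDM event.
    """
    indicators = []
    for noun_name in NOUNS:
        noun = event.get(noun_name) or {}
        # Every indicator is namespaced; the empty namespace is valid.
        namespace = noun.get("namespace", "")
        if noun.get("hostname"):
            indicators.append((noun_name, namespace, "HOSTNAME", noun["hostname"]))
        if noun.get("asset_id"):
            indicators.append((noun_name, namespace, "PRODUCT_SPECIFIC_ID", noun["asset_id"]))
        for mac in noun.get("mac", []):
            indicators.append((noun_name, namespace, "MAC", mac))
        # IP addresses are used as indicators only when asset_id is empty.
        if not noun.get("asset_id"):
            for ip in noun.get("ip", []):
                indicators.append((noun_name, namespace, "IP", ip))
    return indicators
```

In this sketch, a noun that carries both an asset_id and an IP address yields no IP indicator, which follows the table above.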
User enrichment
For each UDM event, the pipeline extracts the following UDM fields from principal, src, and target:
UDM field | Indicator type |
---|---|
email_addresses | EMAIL |
userid | USERNAME |
windows_sid | WINDOWS_SID |
employee_id | EMPLOYEE_ID |
product_object_id | PRODUCT_OBJECT_ID |
For each indicator, the pipeline performs the following actions:
- Retrieves a list of user entities. For example, the entities of principal.email_address and principal.userid might be the same, or they might be different.
- Chooses the aliases from the best indicator type, using this priority order: WINDOWS_SID, EMAIL, USERNAME, EMPLOYEE_ID, and PRODUCT_OBJECT_ID.
- Populates noun.user with the entity whose validity interval intersects with the event time.
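The selection logic in the last two steps can be sketched as follows. The priority order comes from the list above; the rest is assumed for illustration: entities are dictionaries with "start" and "end" datetimes, the validity interval is treated as half-open, and the sketch does not fall through to a lower-priority type when the best type has no entity valid at the event time.

```python
from datetime import datetime, timezone
from typing import Optional

PRIORITY = ("WINDOWS_SID", "EMAIL", "USERNAME", "EMPLOYEE_ID", "PRODUCT_OBJECT_ID")


def choose_user_entity(entities_by_type: dict, event_time: datetime) -> Optional[dict]:
    """Pick the user entity to populate noun.user with.

    `entities_by_type` maps an indicator type to the list of entities that
    aliasing returned for that indicator (a hypothetical shape).
    """
    for indicator_type in PRIORITY:
        candidates = entities_by_type.get(indicator_type)
        if not candidates:
            continue
        # Best available indicator type: keep the entity valid at event time.
        for entity in candidates:
            if entity["start"] <= event_time < entity["end"]:
                return entity
        return None
    return None
```

For example, if both EMAIL and USERNAME return entities, only the EMAIL entities are considered, and the one whose interval contains the event time is used.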
Process enrichment
For each UDM event, the pipeline extracts process.product_specific_process_id (PSPI) from the following fields:
- principal
- src
- target
- principal.process.parent_process
- src.process.parent_process
- target.process.parent_process
The pipeline then finds the actual process from the PSPI using process aliasing, which also returns information about the parent process. It merges this data into the related noun.process field in the enriched message.
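A rough sketch of this lookup-and-merge step follows. It assumes the event is a nested dictionary and that process aliasing is available as a dictionary keyed by PSPI; process_index, get_path, and enrich_processes are hypothetical names. The merge here only fills in fields that are missing from the event, which is an assumption, because the exact merge rules are not described above.

```python
from copy import deepcopy

# Locations in the event that may hold a process with a PSPI.
PROCESS_PATHS = (
    ("principal", "process"),
    ("src", "process"),
    ("target", "process"),
    ("principal", "process", "parent_process"),
    ("src", "process", "parent_process"),
    ("target", "process", "parent_process"),
)


def get_path(obj, path):
    """Walk nested dicts along `path`; return None if any step is missing."""
    for key in path:
        if not isinstance(obj, dict):
            return None
        obj = obj.get(key)
    return obj


def enrich_processes(event: dict, process_index: dict) -> dict:
    """Return a copy of the event with aliased process data merged in."""
    enriched = deepcopy(event)
    for path in PROCESS_PATHS:
        process = get_path(enriched, path)
        if not isinstance(process, dict):
            continue
        aliased = process_index.get(process.get("product_specific_process_id"))
        if aliased:
            for key, value in aliased.items():
                # Assumption: values already on the event are kept.
                process.setdefault(key, deepcopy(value))
    return enriched
```

Because the parent process paths are visited after the process paths, a parent process that was filled in from the alias can itself be resolved in the same pass.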
Artifact enrichment
Artifact enrichment adds file hash metadata from VirusTotal and IP locations from geolocation data. For each UDM event, the pipeline extracts and queries context data for these artifact indicators from the principal, src, and target entities:
- IP address: Queries data only if it's public or routable.
- File hashes: Queries hashes in the following order:
  - file.sha256
  - file.sha1
  - file.md5
  - process.file.sha256
  - process.file.sha1
  - process.file.md5
The pipeline uses UNIX epoch and event hour to define the time range for the file artifact queries. If the geolocation data is available, the pipeline overwrites the following UDM fields for the respective principal, src, and target, based on where the geolocation data originates:
- artifact.ip
- artifact.location
- artifact.network (only if the data includes IP network context)
- location (only if the original data doesn't include this field)
If the pipeline finds file hash metadata, it adds that metadata to the file or process.file fields, depending on where the indicator comes from. The pipeline keeps any existing values that don't overlap with the new data.
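The hash query order and the merge rule can be illustrated with a short sketch. It assumes each noun (principal, src, or target) is a nested dictionary; hash_queries and merge_file_metadata are hypothetical helpers, and the hash values in the usage note are placeholders. The sketch lists every hash that is present, in order, because the text above gives the order but does not say whether the pipeline stops at the first match.

```python
# (container path, hash field) pairs in the documented query order.
HASH_QUERY_ORDER = (
    (("file",), "sha256"),
    (("file",), "sha1"),
    (("file",), "md5"),
    (("process", "file"), "sha256"),
    (("process", "file"), "sha1"),
    (("process", "file"), "md5"),
)


def get_path(obj, path):
    """Walk nested dicts along `path`; return None if any step is missing."""
    for key in path:
        if not isinstance(obj, dict):
            return None
        obj = obj.get(key)
    return obj


def hash_queries(noun: dict) -> list:
    """Return (container, field, value) for each hash present, in query order."""
    queries = []
    for container, field in HASH_QUERY_ORDER:
        holder = get_path(noun, container)
        if isinstance(holder, dict) and holder.get(field):
            queries.append((container, field, holder[field]))
    return queries


def merge_file_metadata(noun: dict, container: tuple, metadata: dict) -> None:
    """Add metadata to the file or process.file the indicator came from.

    Existing values that don't overlap with the new data are kept;
    overlapping values are replaced by the new data.
    """
    get_path(noun, container).update(metadata)
```

For a noun such as {"file": {"md5": "m", "size": 10}}, calling merge_file_metadata(noun, ("file",), {"sha256": "abc", "size": 12}) keeps md5, adds sha256, and replaces size with the new value.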