Collect AWS VPC Flow logs
This document describes how you can collect AWS VPC Flow logs by using a Google Security Operations forwarder.
For more information, see Data ingestion to Google SecOps.
An ingestion label identifies the parser that normalizes raw log data into structured UDM format. The information in this document applies to the parser with the AWS_VPC_FLOW ingestion label.
Supported AWS VPC Flow Log Formats
Google SecOps supports the ingestion of AWS VPC Flow Logs in two primary text formats:
- JSON Format: The AWS_VPC_FLOW log type parses logs in JSON format. In this format, each log entry includes both a key and its corresponding value, making the data self-describing.
- CSV Format: Google SecOps also provides a parser for AWS VPC Flow Logs in CSV format. This format lists field keys only once in the header row, with subsequent rows containing only comma-separated values.
To ingest AWS VPC Flow Logs in CSV format, specify the log type as AWS_VPC_FLOW_CSV when configuring your forwarder. For setup instructions, see Configure Google SecOps forwarder and syslog to ingest AWS VPC Flow logs.
Because the CSV format doesn't include field keys in each log entry, the AWS_VPC_FLOW_CSV parser relies on a strict, predefined order of values. Your CSV files must adhere to the following field order for correct parsing:
Version,Account_id,Interface_id,Srcaddr,Dstaddr,Srcport,Dstport,Protocol,Packets,Bytes,Start,End,Action,Log_status,Vpc_id,Subnet_id,Instance_id,Tcp_flags,Type,Pkt_srcaddr,Pkt_dstaddr,Region,Az_id,Sublocation_type,Sublocation_id,Pkt_src_aws_service,Pkt_dst_aws_service,Flow_direction,Traffic_path,Ecs_cluster_arn,Ecs_cluster_name,Ecs_container_instance_arn,Ecs_container_instance_id,Ecs_container_id,Ecs_second_container_id,Ecs_service_name,Ecs_task_definition_arn,Ecs_task_arn,Ecs_task_id
The following is an example of a CSV log line:
7,369096419186,eni-0520bb5efed19d33a,10.119.32.34,10.119.223.3,51256,16020,6,14,3881,1723542839,1723542871,ACCEPT,OK,vpc-0769a6844ce873a6a,subnet-0cf9b2cb32f49f258,i-088d6080f45f5744f,0,IPv4,10.119.32.34,10.119.223.3,ap-northeast-1,apne1-az4,-,-,-,-,ingress,,-,-,-,-,-,-,-,-,-,-
For fields where no value is available, pass an empty value (for example, , ,) to maintain the correct positional order within the CSV row.
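Because parsing is purely positional, it can help to check a line against the field order before sending it. The following is a minimal Python sketch, not the parser's actual implementation: it maps one CSV line onto the field names listed above and rejects lines with the wrong number of values. The function name parse_flow_csv_line is a hypothetical helper introduced here for illustration.

```python
import csv

# Field order copied from the list in this document (39 fields).
FIELD_ORDER = (
    "Version,Account_id,Interface_id,Srcaddr,Dstaddr,Srcport,Dstport,Protocol,"
    "Packets,Bytes,Start,End,Action,Log_status,Vpc_id,Subnet_id,Instance_id,"
    "Tcp_flags,Type,Pkt_srcaddr,Pkt_dstaddr,Region,Az_id,Sublocation_type,"
    "Sublocation_id,Pkt_src_aws_service,Pkt_dst_aws_service,Flow_direction,"
    "Traffic_path,Ecs_cluster_arn,Ecs_cluster_name,Ecs_container_instance_arn,"
    "Ecs_container_instance_id,Ecs_container_id,Ecs_second_container_id,"
    "Ecs_service_name,Ecs_task_definition_arn,Ecs_task_arn,Ecs_task_id"
).split(",")


def parse_flow_csv_line(line: str) -> dict:
    """Map one CSV flow log line onto the expected field names (hypothetical helper)."""
    values = next(csv.reader([line]))
    if len(values) != len(FIELD_ORDER):
        raise ValueError(
            f"expected {len(FIELD_ORDER)} values, got {len(values)}"
        )
    return dict(zip(FIELD_ORDER, values))


# The example CSV log line from this document.
SAMPLE = (
    "7,369096419186,eni-0520bb5efed19d33a,10.119.32.34,10.119.223.3,51256,16020,"
    "6,14,3881,1723542839,1723542871,ACCEPT,OK,vpc-0769a6844ce873a6a,"
    "subnet-0cf9b2cb32f49f258,i-088d6080f45f5744f,0,IPv4,10.119.32.34,"
    "10.119.223.3,ap-northeast-1,apne1-az4,-,-,-,-,ingress,,-,-,-,-,-,-,-,-,-,-"
)

record = parse_flow_csv_line(SAMPLE)
print(record["Srcaddr"], record["Dstaddr"], record["Action"])
# prints: 10.119.32.34 10.119.223.3 ACCEPT
```

In the sample line, the Traffic_path position is empty rather than omitted, which is what keeps the ECS fields after it aligned.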
Before you begin
Ensure that the Amazon S3 bucket is created. For more information, see Create your first S3 bucket.
Ensure that the Amazon CloudWatch log group is created. For more information, see Working with log groups and log streams.
Configure AWS VPC Flow
Configure AWS VPC Flow based on whether you forward the logs to Amazon S3 or to Amazon CloudWatch.
For information about forwarding logs to the Amazon S3 bucket, see the Configure flow logs to forward logs to Amazon S3 section of this document.
For information about forwarding logs to Amazon CloudWatch, see the Configure flow logs to Amazon CloudWatch section of this document.
Configure flow logs to forward logs to Amazon S3
After you create and configure the Amazon S3 bucket, you can create flow logs for your network interfaces, subnets, and VPCs.
Create a flow log for a network interface
- Sign in to the Amazon EC2 console.
- In the navigation pane, select Network Interfaces.
- Select one or more network interfaces.
- Select Actions > Create flow log.
- Configure the flow log settings. For more information, see the Configure flow log settings section of this document.
Create a flow log for a subnet
- Sign in to the Amazon VPC console.
- In the navigation pane, select Subnets.
- Select one or more subnets.
- Select Actions > Create flow log.
- Configure the flow log settings. For more information, see the Configure flow log settings section of this document.
Create a flow log for a VPC
- Sign in to the Amazon VPC console.
- In the navigation pane, select VPCs.
- Select one or more VPCs.
- Select Actions > Create flow log.
- Configure the flow log settings. For more information, see the Configure flow log settings section of this document.
Configure flow log settings
In the Filter section, specify the IP traffic to log:
Accept: log only accepted traffic.
Reject: log only rejected traffic.
All: log accepted and rejected traffic.
In the Maximum aggregation interval section, select 1 minute.
In the Destination section, select Send to an Amazon S3 bucket.
In the S3 bucket ARN section, specify the ARN of an Amazon S3 bucket.
In the Log record format section, specify the following formats for the flow log record:
- To use the default flow log record format, select AWS default format.
- To create a custom format, select Custom format.
Configure the VPC flow log with the custom AWS log format to use MSS true IP features.
In the Log format list, select all the attributes.
In the Format preview section, review the custom format.
In the Log file format section, select Text (default).
In the Hive-compatible S3 prefix section, keep the Enable checkbox unchecked.
In the Partition logs by time section, select Every 1 hour (60 mins).
To add a tag to the flow log, select Add new tag and specify the tag key and value.
Select Create flow log. For more information, see Publish flow logs to Amazon S3.
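The console settings above can also be expressed as a request to the EC2 CreateFlowLogs API. The following Python sketch is an illustration under stated assumptions, not a required part of this procedure: it builds the keyword arguments that, to the best of our understanding, correspond to the console choices (1 minute aggregation, S3 destination, text file format, Hive-compatible prefix disabled, hourly partitions). The helper names, resource ID, and bucket ARN are hypothetical placeholders, and the call itself requires the third-party boto3 package and AWS credentials.

```python
# Console "Filter" choices mapped to the API's TrafficType values.
TRAFFIC_TYPES = {"Accept": "ACCEPT", "Reject": "REJECT", "All": "ALL"}
RESOURCE_TYPES = {"VPC", "Subnet", "NetworkInterface"}


def build_s3_flow_log_request(resource_ids, resource_type, bucket_arn,
                              traffic_filter="All", log_format=None):
    """Build create_flow_logs keyword arguments (hypothetical helper)."""
    if resource_type not in RESOURCE_TYPES:
        raise ValueError(f"unsupported resource type: {resource_type}")
    request = {
        "ResourceIds": list(resource_ids),
        "ResourceType": resource_type,
        "TrafficType": TRAFFIC_TYPES[traffic_filter],
        "LogDestinationType": "s3",
        "LogDestination": bucket_arn,
        "MaxAggregationInterval": 60,  # seconds, the "1 minute" option
        "DestinationOptions": {
            "FileFormat": "plain-text",         # Text (default)
            "HiveCompatiblePartitions": False,  # checkbox left unchecked
            "PerHourPartition": True,           # Every 1 hour (60 mins)
        },
    }
    if log_format:
        # Custom format string, for example "${version} ${srcaddr} ...".
        request["LogFormat"] = log_format
    return request


def create_flow_log(request):
    """Send the request with boto3 (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the builder runs without it
    return boto3.client("ec2").create_flow_logs(**request)


# Placeholder VPC ID and bucket ARN, for illustration only.
request = build_s3_flow_log_request(
    ["vpc-0123456789abcdef0"], "VPC", "arn:aws:s3:::example-flow-log-bucket"
)
print(request["LogDestinationType"], request["MaxAggregationInterval"])
```

Keeping the request builder separate from the API call makes it possible to review or log the exact settings before anything is created.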
Configure flow logs to Amazon CloudWatch
You can configure flow logs from VPCs, subnets, or network interfaces.
In the Filter section, specify the type of IP traffic to log:
Accept: log only accepted traffic.
Reject: log only rejected traffic.
All: log accepted and rejected traffic.
In the Maximum aggregation interval section, select 1 minute.
In the Destination section, select Send to CloudWatch Logs.
In the Destination log group section, provide the destination log group name that you created.
In the IAM role list, select the role name. The selected role must have permissions to publish logs to CloudWatch Logs.
The IAM role must include the following permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogGroups",
        "logs:DescribeLogStreams"
      ],
      "Resource": "*"
    }
  ]
}
In the Log record format section, select Custom format for the flow log record.
To add a tag to the flow log, select Add new tag and specify the tag key and value.
Select Create flow log. For more information, see Publish flow logs to CloudWatch Logs.
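Before selecting a role, you may want to confirm that its policy document grants the five permissions listed above. The following Python sketch is one simple way to do that, assuming you have the policy JSON as text. The helper name missing_actions is hypothetical, and the check is deliberately literal: it does not expand wildcards such as logs:* and does not evaluate Deny statements or conditions.

```python
import json

# The permissions this document lists for the flow log IAM role.
REQUIRED_ACTIONS = {
    "logs:CreateLogGroup",
    "logs:CreateLogStream",
    "logs:PutLogEvents",
    "logs:DescribeLogGroups",
    "logs:DescribeLogStreams",
}


def missing_actions(policy_json: str) -> set:
    """Return required actions not explicitly allowed (hypothetical helper)."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be an object
        statements = [statements]
    allowed = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # a single action may be a string
            actions = [actions]
        allowed.update(actions)
    return REQUIRED_ACTIONS - allowed


policy_text = """
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow",
  "Action": [ "logs:CreateLogGroup", "logs:CreateLogStream",
              "logs:PutLogEvents", "logs:DescribeLogGroups",
              "logs:DescribeLogStreams" ],
  "Resource": "*" } ] }
"""
print(missing_actions(policy_text))  # prints: set()
```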
Amazon S3 can be configured to send the event notifications to Amazon SQS. For more information, see Configuring a bucket for notifications (SNS topic or SQS queue).
If you use Amazon SQS (Amazon S3 with Amazon SQS) as the log collection method, IAM user policies are required for both Amazon S3 and Amazon SQS. For more information, see Using IAM policies with AWS KMS.
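As an illustration of the S3 to SQS notification setup mentioned above, the following Python sketch builds a notification configuration that sends object-created events to a queue. It is a sketch under assumptions, not a prescribed configuration: the helper names, queue ARN, bucket name, and key prefix are hypothetical placeholders, and applying the configuration requires the third-party boto3 package, AWS credentials, and a queue policy that allows Amazon S3 to send messages.

```python
def build_sqs_notification(queue_arn, prefix=""):
    """Build an S3 notification configuration for one SQS queue (hypothetical helper)."""
    queue_config = {
        "QueueArn": queue_arn,
        "Events": ["s3:ObjectCreated:*"],  # notify on every new log object
    }
    if prefix:
        # Optionally limit notifications to keys under a prefix.
        queue_config["Filter"] = {
            "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
        }
    return {"QueueConfigurations": [queue_config]}


def apply_notification(bucket, configuration):
    """Apply the configuration with boto3 (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the builder runs without it
    return boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket, NotificationConfiguration=configuration
    )


# Placeholder queue ARN and prefix, for illustration only.
configuration = build_sqs_notification(
    "arn:aws:sqs:ap-northeast-1:111122223333:example-flow-log-queue",
    prefix="AWSLogs/",
)
print(configuration["QueueConfigurations"][0]["Events"])
```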
Based on the service and region, identify the endpoints for connectivity by referring to the following AWS documentation:
For information about any logging sources, see AWS Identity and Access Management endpoints and quotas.
For information about Amazon S3 logging sources, see Amazon Simple Storage Service endpoints and quotas.
For information about Amazon SQS logging sources, see Amazon Simple Queue Service endpoints and quotas.
For information about Amazon CloudWatch logging sources, see Amazon CloudWatch logs endpoints and quotas.
Configure Google SecOps forwarder and syslog to ingest AWS VPC Flow logs
- Select SIEM Settings > Forwarders.
- Click Add new forwarder.
- Enter a unique name for the Forwarder name.
- Click Submit and then click Confirm. The forwarder is added and the Add collector configuration window appears.
- In the Collector name field, type a name.
- In the Log type field, select AWS VPC Flow or AWS VPC Flow (CSV), depending on your log format.
- In the Collector type field, select Syslog.
- Configure the following mandatory input parameters:
- Protocol: specify the connection protocol the collector will use to listen for syslog data.
- Address: specify the target IP address or hostname where the collector resides and listens for the syslog data.
- Port: specify the target port where the collector resides and listens for syslog data.
- Click Submit and then click Confirm.
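After the collector is configured, one way to confirm that it is reachable is to send it a single test line. The following Python sketch sends one newline-terminated log line over UDP or TCP to the address and port set in the collector. It assumes the collector accepts raw newline-terminated lines; whether a syslog header is also required depends on your setup. The function names, host, and port are hypothetical placeholders.

```python
import socket


def frame_line(message: str) -> bytes:
    """Encode one log line with exactly one trailing newline (hypothetical helper)."""
    return (message.rstrip("\r\n") + "\n").encode("utf-8")


def send_log_line(message: str, host: str, port: int, protocol: str = "udp") -> None:
    """Send one log line to a syslog collector over UDP or TCP (hypothetical helper)."""
    data = frame_line(message)
    if protocol == "udp":
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(data, (host, port))
    elif protocol == "tcp":
        with socket.create_connection((host, port), timeout=5) as sock:
            sock.sendall(data)
    else:
        raise ValueError(f"unsupported protocol: {protocol}")


if __name__ == "__main__":
    # Placeholder collector address and port; replace with the values
    # configured in the Address and Port fields.
    send_log_line("7,369096419186,eni-0520bb5efed19d33a", "192.0.2.10", 10514)
```

The protocol argument should match the Protocol value chosen for the collector, because a UDP listener does not receive data sent over TCP and the reverse.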
For more information about Google SecOps forwarders, see Google Security Operations forwarders documentation. For information about requirements for each forwarder type, see Forwarder configuration by type.
If you encounter issues when you create forwarders, contact Google Security Operations support.
Field mapping reference
This parser transforms raw AWS VPC Flow logs, in either JSON or CSV format, into structured UDM format. It extracts relevant fields, maps them to the UDM schema, and enriches the data with additional context such as resource type, cloud provider, and labels to support deeper analysis. The mapping logic is consistent across both formats: the CSV parser relies on a predefined field order to align values with the same UDM fields used in the JSON format.
UDM Mapping table for AWS EC2 VPC Parser
Log Field | UDM Mapping | Logic |
---|---|---|
CidrBlock | event.idm.entity.entity.resource.attribute.labels.cidr_block | Directly mapped from the "CidrBlock" field in the raw log. |
CidrBlock | event.idm.entity.entity.network.ip_subnet_range | Directly mapped from the "CidrBlock" field in the raw log. |
CidrBlockAssociation.AssociationID | event.idm.entity.entity.resource.attribute.labels.cidr_block_association_association_id | Directly mapped from the "AssociationID" field within the "CidrBlockAssociation" array in the raw log. |
CidrBlockAssociation.CidrBlockState.State | event.idm.entity.entity.resource.attribute.labels.cidr_block_association_cidr_block_state_state | Directly mapped from the "State" field within the "CidrBlockState" object of the "CidrBlockAssociation" array in the raw log. |
CidrBlockAssociation.CidrBlockState.StatusMessage | event.idm.entity.entity.resource.attribute.labels.cidr_block_association_cidr_block_state_status_message | Directly mapped from the "StatusMessage" field within the "CidrBlockState" object of the "CidrBlockAssociation" array in the raw log. |
DhcpOptionsID | event.idm.entity.entity.resource.attribute.labels.dhcp_options_id | Directly mapped from the "DhcpOptionsID" field in the raw log. |
ID | event.idm.entity.entity.resource.product_object_id | Directly mapped from the "ID" field in the raw log, which is renamed to "VpcID" in the parser. |
ID | event.idm.entity.metadata.product_entity_id | Directly mapped from the "ID" field in the raw log, which is renamed to "VpcID" in the parser. |
InstanceTenancy | event.idm.entity.entity.resource.attribute.labels.instance_tenancy | Directly mapped from the "InstanceTenancy" field in the raw log. |
IsDefault | event.idm.entity.entity.resource.attribute.labels.is_default | Directly mapped from the "IsDefault" field in the raw log. |
Ipv6CidrBlockAssociationSet.AssociationID | event.idm.entity.entity.resource.attribute.labels.ipv6_cidr_block_association_set_association_id | Directly mapped from the "AssociationID" field within the "Ipv6CidrBlockAssociationSet" array in the raw log. |
Ipv6CidrBlockAssociationSet.Ipv6CidrBlock | event.idm.entity.entity.resource.attribute.labels.ipv6_cidr_block_association_set_ipv6_cidr_block | Directly mapped from the "Ipv6CidrBlock" field within the "Ipv6CidrBlockAssociationSet" array in the raw log. |
Ipv6CidrBlockAssociationSet.Ipv6CidrBlockState.State | event.idm.entity.entity.resource.attribute.labels.ipv6_cidr_block_association_set_ipv6_cidr_block_state_state | Directly mapped from the "State" field within the "Ipv6CidrBlockState" object of the "Ipv6CidrBlockAssociationSet" array in the raw log. |
Ipv6CidrBlockAssociationSet.Ipv6CidrBlockState.StatusMessage | event.idm.entity.entity.resource.attribute.labels.ipv6_cidr_block_association_set_ipv6_cidr_block_state_status_message | Directly mapped from the "StatusMessage" field within the "Ipv6CidrBlockState" object of the "Ipv6CidrBlockAssociationSet" array in the raw log. |
Ipv6CidrBlockAssociationSet.Ipv6Pool | event.idm.entity.entity.resource.attribute.labels.ipv6_cidr_block_association_set_ipv6_pool | Directly mapped from the "Ipv6Pool" field within the "Ipv6CidrBlockAssociationSet" array in the raw log. |
Ipv6CidrBlockAssociationSet.NetworkBorderGroup | event.idm.entity.entity.resource.attribute.labels.ipv6_cidr_block_association_set_network_border_group | Directly mapped from the "NetworkBorderGroup" field within the "Ipv6CidrBlockAssociationSet" array in the raw log. |
OwnerID | event.idm.entity.entity.resource.attribute.labels.owner_id | Directly mapped from the "OwnerID" field in the raw log. |
State | event.idm.entity.entity.resource.attribute.labels.state | Directly mapped from the "State" field in the raw log. |
TagSet.Key | event.idm.entity.entity.resource.attribute.labels.key | Directly mapped from the "Key" field within the "TagSet" array in the raw log. This creates a new label for each tag in the "TagSet". |
TagSet.Value | event.idm.entity.entity.resource.attribute.labels.value | Directly mapped from the "Value" field within the "TagSet" array in the raw log. This populates the value for each corresponding label created from the "Key" field. |
N/A | event.idm.entity.entity.resource.attribute.cloud.environment | Hardcoded to "AMAZON_WEB_SERVICES" in the parser code. |
N/A | event.idm.entity.entity.resource.resource_type | Hardcoded to "VPC_NETWORK" in the parser code. |
N/A | event.idm.entity.metadata.collected_timestamp | Populated with the event timestamp, which is derived from the "collection_time" field in the raw log. |
N/A | event.idm.entity.metadata.entity_type | Hardcoded to "RESOURCE" in the parser code. |
N/A | event.idm.entity.metadata.product_name | Hardcoded to "Amazon VPC" in the parser code. |
N/A | event.idm.entity.metadata.vendor_name | Hardcoded to "AWS" in the parser code. |
N/A | events.timestamp | Populated with the event timestamp, which is derived from the "collection_time" field in the raw log. |
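To make the table concrete, the following Python sketch applies a subset of these mappings to a sample record. It is an illustration only, not the parser's code: the input record is hypothetical, the nested dictionary stands in for the structure under event.idm.entity, and representing labels as key and value pairs is an assumption about how the labels are stored.

```python
def vpc_to_udm_entity(vpc: dict) -> dict:
    """Apply a subset of the table's mappings to one VPC record (hypothetical helper)."""
    # Direct field-to-label mappings taken from the table.
    label_fields = {
        "CidrBlock": "cidr_block",
        "DhcpOptionsID": "dhcp_options_id",
        "InstanceTenancy": "instance_tenancy",
        "IsDefault": "is_default",
        "OwnerID": "owner_id",
        "State": "state",
    }
    labels = [
        {"key": label, "value": str(vpc[field])}
        for field, label in label_fields.items()
        if field in vpc
    ]
    # Each tag in TagSet becomes its own label.
    for tag in vpc.get("TagSet", []):
        labels.append({"key": tag["Key"], "value": tag["Value"]})

    return {
        "metadata": {
            "vendor_name": "AWS",          # hardcoded per the table
            "product_name": "Amazon VPC",  # hardcoded per the table
            "entity_type": "RESOURCE",     # hardcoded per the table
            "product_entity_id": vpc["ID"],
        },
        "entity": {
            "network": {"ip_subnet_range": vpc.get("CidrBlock")},
            "resource": {
                "product_object_id": vpc["ID"],
                "resource_type": "VPC_NETWORK",
                "attribute": {
                    "cloud": {"environment": "AMAZON_WEB_SERVICES"},
                    "labels": labels,
                },
            },
        },
    }


# Hypothetical input record, for illustration only.
sample_vpc = {
    "ID": "vpc-0123456789abcdef0",
    "CidrBlock": "10.0.0.0/16",
    "State": "available",
    "IsDefault": False,
    "TagSet": [{"Key": "env", "Value": "test"}],
}
entity = vpc_to_udm_entity(sample_vpc)
print(entity["entity"]["resource"]["product_object_id"])
# prints: vpc-0123456789abcdef0
```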