Collect AWS VPC Flow logs

This document describes how you can collect AWS VPC Flow logs by using a Google Security Operations forwarder.

For more information, see Data ingestion to Google SecOps.

An ingestion label identifies the parser that normalizes raw log data into structured UDM format. The information in this document applies to the parser with the AWS_VPC_FLOW ingestion label.

Supported AWS VPC Flow Log Formats

Google SecOps supports the ingestion of AWS VPC Flow Logs in two primary text formats:

  • JSON Format: The AWS_VPC_FLOW log type parses logs in JSON format. In this format, each log entry includes both a key and its corresponding value, making the data self-describing.

  • CSV Format: Google SecOps also provides a parser for AWS VPC Flow Logs in CSV format. This format lists field keys only once in the header row, with subsequent rows containing only comma-separated values.

To ingest AWS VPC Flow Logs in CSV format, specify the log type as AWS_VPC_FLOW_CSV when configuring your forwarder. For setup instructions, see Configure Google SecOps forwarder and syslog to ingest AWS VPC Flow logs.

Because the CSV format doesn't include field keys in each log entry, the AWS_VPC_FLOW_CSV parser relies on a strict, predefined order of values. Your CSV files must adhere to the following field order for correct parsing:

   Version,Account_id,Interface_id,Srcaddr,Dstaddr,Srcport,Dstport,Protocol,Packets,Bytes,Start,End,Action,Log_status,Vpc_id,Subnet_id,Instance_id,Tcp_flags,Type,Pkt_srcaddr,Pkt_dstaddr,Region,Az_id,Sublocation_type,Sublocation_id,Pkt_src_aws_service,Pkt_dst_aws_service,Flow_direction,Traffic_path,Ecs_cluster_arn,Ecs_cluster_name,Ecs_container_instance_arn,Ecs_container_instance_id,Ecs_container_id,Ecs_second_container_id,Ecs_service_name,Ecs_task_definition_arn,Ecs_task_arn,Ecs_task_id

The following is an example of a CSV log line:

   7,369096419186,eni-0520bb5efed19d33a,10.119.32.34,10.119.223.3,51256,16020,6,14,3881,1723542839,1723542871,ACCEPT,OK,vpc-0769a6844ce873a6a,subnet-0cf9b2cb32f49f258,i-088d6080f45f5744f,0,IPv4,10.119.32.34,10.119.223.3,ap-northeast-1,apne1-az4,-,-,-,-,ingress,,-,-,-,-,-,-,-,-,-,-

If a field has no value, pass an empty value (for example, ,,) so that the remaining values keep their positions in the CSV row.
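Because parsing is purely positional, it can help to check a line against the field order before you forward it. The following is a minimal sketch, not part of Google SecOps: the helper name check_csv_line is hypothetical, and it only confirms that a line carries one value per expected field.

```python
import csv
import io

# Field order copied from the list above (39 fields).
FIELD_ORDER = (
    "Version,Account_id,Interface_id,Srcaddr,Dstaddr,Srcport,Dstport,Protocol,"
    "Packets,Bytes,Start,End,Action,Log_status,Vpc_id,Subnet_id,Instance_id,"
    "Tcp_flags,Type,Pkt_srcaddr,Pkt_dstaddr,Region,Az_id,Sublocation_type,"
    "Sublocation_id,Pkt_src_aws_service,Pkt_dst_aws_service,Flow_direction,"
    "Traffic_path,Ecs_cluster_arn,Ecs_cluster_name,Ecs_container_instance_arn,"
    "Ecs_container_instance_id,Ecs_container_id,Ecs_second_container_id,"
    "Ecs_service_name,Ecs_task_definition_arn,Ecs_task_arn,Ecs_task_id"
).split(",")


def check_csv_line(line: str) -> dict:
    """Pair each value with its field name; raise if the column count is off."""
    values = next(csv.reader(io.StringIO(line)))
    if len(values) != len(FIELD_ORDER):
        raise ValueError(
            f"expected {len(FIELD_ORDER)} values, got {len(values)}"
        )
    return dict(zip(FIELD_ORDER, values))


# The example log line from above; the ten trailing "-" placeholders
# (one per Ecs_* field) are generated to keep the literal short.
example = (
    "7,369096419186,eni-0520bb5efed19d33a,10.119.32.34,10.119.223.3,51256,16020,"
    "6,14,3881,1723542839,1723542871,ACCEPT,OK,vpc-0769a6844ce873a6a,"
    "subnet-0cf9b2cb32f49f258,i-088d6080f45f5744f,0,IPv4,10.119.32.34,"
    "10.119.223.3,ap-northeast-1,apne1-az4,-,-,-,-,ingress,,"
    + ",".join(["-"] * 10)
)
record = check_csv_line(example)
print(record["Srcaddr"], record["Action"], repr(record["Traffic_path"]))
```

A line with a missing or extra value raises ValueError instead of silently shifting every later field by one position.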

Before you begin

Configure AWS VPC Flow

Configure AWS VPC Flow based on whether you forward the logs to Amazon S3 or to Amazon CloudWatch.

Configure flow logs to forward logs to Amazon S3

After you create and configure the Amazon S3 bucket, you can create flow logs for your network interfaces, subnets, and VPCs.

Create a flow log for a network interface

  1. Sign in to the Amazon EC2 console.
  2. In the navigation pane, select Network Interfaces.
  3. Select one or more network interfaces.
  4. Select Actions > Create flow log.
  5. Configure the flow log settings. For more information, see the Configure flow log settings section of this document.

Create a flow log for a subnet

  1. Sign in to the Amazon VPC console.
  2. In the navigation pane, select Subnets.
  3. Select one or more subnets.
  4. Select Actions > Create flow log.
  5. Configure the flow log settings. For more information, see the Configure flow log settings section of this document.

Create a flow log for a VPC

  1. Sign in to the Amazon VPC console.
  2. In the navigation pane, select VPCs.
  3. Select one or more VPCs.
  4. Select Actions > Create flow log.
  5. Configure the flow log settings. For more information, see the Configure flow log settings section of this document.

Configure flow log settings

  1. In the Filter section, specify the IP traffic to log:

    • Accept: log only accepted traffic.

    • Reject: log only rejected traffic.

    • All: log accepted and rejected traffic.

  2. In the Maximum aggregation interval section, select 1 minute.

  3. In the Destination section, select Send to an Amazon S3 bucket.

  4. In the S3 bucket ARN section, specify the ARN of an Amazon S3 bucket.

  5. In the Log record format section, select one of the following formats for the flow log record:

    1. To use the default flow log record format, select AWS default format.
    2. To create a custom format, select Custom format.

  6. Configure the VPC flow log with the custom AWS log format to use MSS true IP features.

  7. In the Log format list, select all the attributes.

  8. In the Format preview section, review the custom format.

  9. In the Log file format section, select Text (default).

  10. In the Hive-compatible S3 prefix section, keep the Enable checkbox unchecked.

  11. In the Partition logs by time section, select Every 1 hour (60 mins).

  12. To add a tag to the flow log, select Add new tag and specify the tag key and value.

  13. Select Create flow log. For more information, see Publish flow logs to Amazon S3.
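The console settings above correspond to parameters of the EC2 CreateFlowLogs API. The following is a sketch under assumptions, not the procedure this document prescribes: the function name, resource ID, and bucket ARN are hypothetical placeholders, and the short field list stands in for the full set of attributes that you select in the console.

```python
def build_s3_flow_log_request(resource_ids, resource_type, bucket_arn,
                              fields, traffic_type="ALL"):
    """Build keyword arguments for the EC2 CreateFlowLogs API (S3 destination)."""
    return {
        "ResourceIds": list(resource_ids),
        "ResourceType": resource_type,          # "VPC", "Subnet", or "NetworkInterface"
        "TrafficType": traffic_type,            # "ACCEPT", "REJECT", or "ALL"
        "MaxAggregationInterval": 60,           # 1 minute, expressed in seconds
        "LogDestinationType": "s3",
        "LogDestination": bucket_arn,
        # Custom format: each attribute is written as ${name}, space-separated.
        "LogFormat": " ".join("${" + name + "}" for name in fields),
        "DestinationOptions": {
            "FileFormat": "plain-text",         # Text (default)
            "HiveCompatiblePartitions": False,  # Hive-compatible prefix left disabled
            "PerHourPartition": True,           # partition logs every 1 hour
        },
    }


# Hypothetical identifiers, for illustration only.
request = build_s3_flow_log_request(
    resource_ids=["vpc-0123456789abcdef0"],
    resource_type="VPC",
    bucket_arn="arn:aws:s3:::example-flow-log-bucket",
    fields=["version", "account-id", "srcaddr", "dstaddr"],
)
print(request["LogFormat"])
```

With boto3 installed and credentials configured, the dictionary could be passed as boto3.client("ec2").create_flow_logs(**request). The sketch stops short of that call so that it runs without AWS access.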

Configure flow logs to forward logs to Amazon CloudWatch

You can configure flow logs from VPCs, subnets, or network interfaces.

  1. In the Filter section, specify the type of IP traffic to log:

    • Accept: log only accepted traffic.

    • Reject: log only rejected traffic.

    • All: log accepted and rejected traffic.

  2. In the Maximum aggregation interval section, select 1 minute.

  3. In the Destination section, select Send to CloudWatch Logs.

  4. In the Destination log group section, provide the destination log group name that you created.

  5. In the IAM role list, select the role name. The selected role must have permissions to publish logs to CloudWatch Logs.

    The IAM role must include the following permissions:

       {
         "Version": "2012-10-17",
         "Statement": [
           {
             "Effect": "Allow",
             "Action": [
               "logs:CreateLogGroup",
               "logs:CreateLogStream",
               "logs:PutLogEvents",
               "logs:DescribeLogGroups",
               "logs:DescribeLogStreams"
             ],
             "Resource": "*"
           }
         ]
       }
    
  6. In the Log record format section, select Custom format for the flow log record.

  7. To add a tag to the flow log, select Add new tag and specify the tag key and value.

  8. Select Create flow log. For more information, see Publish flow logs to CloudWatch Logs.
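Before you select the IAM role in step 5, you may want to confirm that its policy grants the five listed actions. The following is a simplified sketch with a hypothetical helper name, missing_actions. It only inspects Allow statements and their Action entries, and it ignores Deny statements, Resource scoping, and Condition blocks, so treat it as a quick sanity check and not as an IAM policy evaluator.

```python
import json
from fnmatch import fnmatchcase

# The actions listed in step 5 above.
REQUIRED_ACTIONS = [
    "logs:CreateLogGroup",
    "logs:CreateLogStream",
    "logs:PutLogEvents",
    "logs:DescribeLogGroups",
    "logs:DescribeLogStreams",
]


def missing_actions(policy_json: str) -> list:
    """Return the required actions that no Allow statement covers."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    allowed = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        # Action names are compared case-insensitively.
        allowed.extend(action.lower() for action in actions)
    return [
        required for required in REQUIRED_ACTIONS
        if not any(fnmatchcase(required.lower(), pattern) for pattern in allowed)
    ]


policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": REQUIRED_ACTIONS, "Resource": "*"}
    ],
})
print(missing_actions(policy))  # prints []
```

Wildcard entries such as logs:* are matched as patterns, so a broader policy also passes the check.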

Amazon S3 can be configured to send the event notifications to Amazon SQS. For more information, see Configuring a bucket for notifications (SNS topic or SQS queue).
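When Amazon S3 publishes event notifications to an Amazon SQS queue, each message body is a JSON document with a Records array that names the bucket and the object key. The following sketch shows how a consumer might read those locations. The function name and the sample message are hypothetical, and the sketch assumes that S3 delivers directly to SQS (a notification routed through an SNS topic is wrapped in an additional envelope).

```python
import json
from urllib.parse import unquote_plus


def object_locations(message_body: str) -> list:
    """Return (bucket, key) pairs for newly created objects in one message."""
    event = json.loads(message_body)
    locations = []
    # Messages without "Records" (for example, test events) yield nothing.
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys are URL-encoded in notifications; decode before use.
        key = unquote_plus(record["s3"]["object"]["key"])
        locations.append((bucket, key))
    return locations


# Hypothetical message body, trimmed to the fields that the sketch reads.
sample = json.dumps({
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-flow-log-bucket"},
                "object": {"key": "AWSLogs/123456789012/vpcflowlogs/flow+log.log.gz"},
            },
        }
    ]
})
print(object_locations(sample))
```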

If you use Amazon SQS (Amazon S3 with Amazon SQS) as the log collection method, IAM user policies are required for both Amazon S3 and Amazon SQS. For more information, see Using IAM policies with AWS KMS.

Based on the service and region, identify the endpoints for connectivity by referring to the AWS documentation.

Configure Google SecOps forwarder and syslog to ingest AWS VPC Flow logs

  1. Select SIEM Settings > Forwarders.
  2. Click Add new forwarder.
  3. Enter a unique name for the Forwarder name.
  4. Click Submit and then click Confirm. The forwarder is added and the Add collector configuration window appears.
  5. In the Collector name field, type a name.
  6. In the Log type field, select AWS VPC Flow or AWS VPC Flow (CSV), depending on your log format.
  7. In the Collector type field, select Syslog.
  8. Configure the following mandatory input parameters:
    • Protocol: specify the connection protocol the collector will use to listen for syslog data.
    • Address: specify the target IP address or hostname where the collector resides and listens for syslog data.
    • Port: specify the target port where the collector resides and listens for syslog data.
  9. Click Submit and then click Confirm.
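After the collector is configured, you can send it a test line to confirm that the Protocol, Address, and Port values work together. The following is a minimal sketch using only the Python standard library. The function name is hypothetical, and sending each record as a newline-terminated payload is an assumption, so adjust the framing to whatever your syslog pipeline expects.

```python
import socket


def send_flow_log_line(line: str, address: str, port: int,
                       protocol: str = "udp") -> None:
    """Send one flow log record to the collector's syslog listener."""
    payload = (line.rstrip("\n") + "\n").encode("utf-8")
    if protocol == "udp":
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(payload, (address, port))
    elif protocol == "tcp":
        with socket.create_connection((address, port), timeout=5) as sock:
            sock.sendall(payload)
    else:
        raise ValueError(f"unsupported protocol: {protocol!r}")
```

For example, send_flow_log_line(csv_line, "192.0.2.10", 10514, "tcp") would send one CSV record over TCP. The address and port here are placeholders for the values that you entered in step 8.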

For more information about Google SecOps forwarders, see Google Security Operations forwarders documentation. For information about requirements for each forwarder type, see Forwarder configuration by type.

If you encounter issues when you create forwarders, contact Google Security Operations support.

Field mapping reference

This parser transforms raw AWS VPC Flow logs—in either JSON or CSV format—into structured UDM format. It extracts relevant fields, maps them to match the UDM schema, and enriches the data with additional context like resource type, cloud provider, and labels to support a deeper analysis. The mapping logic is consistent across both formats: the CSV parser relies on a predefined field order to align values with the same UDM fields used in the JSON format.

UDM Mapping table for AWS EC2 VPC Parser

Log Field | UDM Mapping | Logic
--- | --- | ---
CidrBlock | event.idm.entity.entity.resource.attribute.labels.cidr_block | Directly mapped from the "CidrBlock" field in the raw log.
CidrBlock | event.idm.entity.entity.network.ip_subnet_range | Directly mapped from the "CidrBlock" field in the raw log.
CidrBlockAssociation.AssociationID | event.idm.entity.entity.resource.attribute.labels.cidr_block_association_association_id | Directly mapped from the "AssociationID" field within the "CidrBlockAssociation" array in the raw log.
CidrBlockAssociation.CidrBlockState.State | event.idm.entity.entity.resource.attribute.labels.cidr_block_association_cidr_block_state_state | Directly mapped from the "State" field within the "CidrBlockState" object of the "CidrBlockAssociation" array in the raw log.
CidrBlockAssociation.CidrBlockState.StatusMessage | event.idm.entity.entity.resource.attribute.labels.cidr_block_association_cidr_block_state_status_message | Directly mapped from the "StatusMessage" field within the "CidrBlockState" object of the "CidrBlockAssociation" array in the raw log.
DhcpOptionsID | event.idm.entity.entity.resource.attribute.labels.dhcp_options_id | Directly mapped from the "DhcpOptionsID" field in the raw log.
ID | event.idm.entity.entity.resource.product_object_id | Directly mapped from the "ID" field in the raw log, which is renamed to "VpcID" in the parser.
ID | event.idm.entity.metadata.product_entity_id | Directly mapped from the "ID" field in the raw log, which is renamed to "VpcID" in the parser.
InstanceTenancy | event.idm.entity.entity.resource.attribute.labels.instance_tenancy | Directly mapped from the "InstanceTenancy" field in the raw log.
IsDefault | event.idm.entity.entity.resource.attribute.labels.is_default | Directly mapped from the "IsDefault" field in the raw log.
Ipv6CidrBlockAssociationSet.AssociationID | event.idm.entity.entity.resource.attribute.labels.ipv6_cidr_block_association_set_association_id | Directly mapped from the "AssociationID" field within the "Ipv6CidrBlockAssociationSet" array in the raw log.
Ipv6CidrBlockAssociationSet.Ipv6CidrBlock | event.idm.entity.entity.resource.attribute.labels.ipv6_cidr_block_association_set_ipv6_cidr_block | Directly mapped from the "Ipv6CidrBlock" field within the "Ipv6CidrBlockAssociationSet" array in the raw log.
Ipv6CidrBlockAssociationSet.Ipv6CidrBlockState.State | event.idm.entity.entity.resource.attribute.labels.ipv6_cidr_block_association_set_ipv6_cidr_block_state_state | Directly mapped from the "State" field within the "Ipv6CidrBlockState" object of the "Ipv6CidrBlockAssociationSet" array in the raw log.
Ipv6CidrBlockAssociationSet.Ipv6CidrBlockState.StatusMessage | event.idm.entity.entity.resource.attribute.labels.ipv6_cidr_block_association_set_ipv6_cidr_block_state_status_message | Directly mapped from the "StatusMessage" field within the "Ipv6CidrBlockState" object of the "Ipv6CidrBlockAssociationSet" array in the raw log.
Ipv6CidrBlockAssociationSet.Ipv6Pool | event.idm.entity.entity.resource.attribute.labels.ipv6_cidr_block_association_set_ipv6_pool | Directly mapped from the "Ipv6Pool" field within the "Ipv6CidrBlockAssociationSet" array in the raw log.
Ipv6CidrBlockAssociationSet.NetworkBorderGroup | event.idm.entity.entity.resource.attribute.labels.ipv6_cidr_block_association_set_network_border_group | Directly mapped from the "NetworkBorderGroup" field within the "Ipv6CidrBlockAssociationSet" array in the raw log.
OwnerID | event.idm.entity.entity.resource.attribute.labels.owner_id | Directly mapped from the "OwnerID" field in the raw log.
State | event.idm.entity.entity.resource.attribute.labels.state | Directly mapped from the "State" field in the raw log.
TagSet.Key | event.idm.entity.entity.resource.attribute.labels.key | Directly mapped from the "Key" field within the "TagSet" array in the raw log. This creates a new label for each tag in the "TagSet".
TagSet.Value | event.idm.entity.entity.resource.attribute.labels.value | Directly mapped from the "Value" field within the "TagSet" array in the raw log. This populates the value for each corresponding label created from the "Key" field.
N/A | event.idm.entity.entity.resource.attribute.cloud.environment | Hardcoded to "AMAZON_WEB_SERVICES" in the parser code.
N/A | event.idm.entity.entity.resource.resource_type | Hardcoded to "VPC_NETWORK" in the parser code.
N/A | event.idm.entity.metadata.collected_timestamp | Populated with the event timestamp, which is derived from the "collection_time" field in the raw log.
N/A | event.idm.entity.metadata.entity_type | Hardcoded to "RESOURCE" in the parser code.
N/A | event.idm.entity.metadata.product_name | Hardcoded to "Amazon VPC" in the parser code.
N/A | event.idm.entity.metadata.vendor_name | Hardcoded to "AWS" in the parser code.
N/A | events.timestamp | Populated with the event timestamp, which is derived from the "collection_time" field in the raw log.
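To make the mapping concrete, the following sketch applies a subset of the rows above to a raw record and builds a nested dictionary that mirrors the UDM paths. It is an illustration only and not the parser's implementation: the function name is hypothetical, labels are simplified to a plain dictionary, each TagSet entry is assumed to become one label keyed by its "Key", and the nested association fields and timestamps are omitted.

```python
# Raw fields that map one-to-one onto resource labels (subset of the table).
SIMPLE_LABELS = {
    "CidrBlock": "cidr_block",
    "DhcpOptionsID": "dhcp_options_id",
    "InstanceTenancy": "instance_tenancy",
    "IsDefault": "is_default",
    "OwnerID": "owner_id",
    "State": "state",
}


def vpc_to_udm_entity(raw: dict) -> dict:
    """Build a UDM-shaped entity dictionary from a raw VPC record."""
    labels = {}
    for field, label in SIMPLE_LABELS.items():
        if field in raw:
            labels[label] = str(raw[field])
    # Assumption: every tag becomes its own label (Key -> Value).
    for tag in raw.get("TagSet", []):
        labels[tag["Key"]] = tag["Value"]
    return {
        "metadata": {
            "product_entity_id": raw.get("ID"),
            "entity_type": "RESOURCE",       # hardcoded values from the table
            "product_name": "Amazon VPC",
            "vendor_name": "AWS",
        },
        "entity": {
            "network": {"ip_subnet_range": raw.get("CidrBlock")},
            "resource": {
                "product_object_id": raw.get("ID"),
                "resource_type": "VPC_NETWORK",
                "attribute": {
                    "cloud": {"environment": "AMAZON_WEB_SERVICES"},
                    "labels": labels,
                },
            },
        },
    }


# Hypothetical raw record, for illustration only.
entity = vpc_to_udm_entity({
    "ID": "vpc-0123456789abcdef0",
    "CidrBlock": "10.0.0.0/16",
    "IsDefault": False,
    "TagSet": [{"Key": "env", "Value": "prod"}],
})
print(entity["entity"]["resource"]["attribute"]["labels"])
```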
