Collect Proofpoint On-Demand logs

This parser extracts fields from Proofpoint On-Demand JSON logs and maps them to the Unified Data Model (UDM). It handles two primary log formats: one containing email metadata and one containing SMTP transaction details. Conditional logic selects the parsing path and populates UDM fields based on the data available. The parser also cleans the data, for example by removing extraneous characters and converting timestamps.
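As a rough illustration of the conditional logic described above, the following Python sketch tells the two log formats apart. It is not the parser's actual code: the function name `detect_format` is hypothetical, and the rule that SMTP transaction records carry an `sm` object is an assumption inferred from the `sm.*` rows in the mapping table.

```python
from typing import Any, Dict


def detect_format(record: Dict[str, Any]) -> str:
    """Return "smtp" for SMTP transaction records, "message" otherwise."""
    # Assumption: SMTP transaction details arrive under an "sm" object
    # (sm.from, sm.to, sm.relay, ...). Everything else is treated as an
    # email metadata record.
    if isinstance(record.get("sm"), dict):
        return "smtp"
    return "message"


# Hypothetical sample records, trimmed to a few fields.
message_log = {"guid": "abc123", "envelope": {"from": "a@example.com"}}
smtp_log = {"sm": {"from": "a@example.com", "proto": "ESMTP"}}

print(detect_format(message_log))
print(detect_format(smtp_log))
```

A real parser would likely branch on more than one field, but a single discriminating key keeps the idea visible.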

Before you begin

  • Ensure that you have a Google Security Operations instance.
  • Ensure that you have the Proofpoint On Demand Remote Syslog license.
  • Ensure that you have privileged access to Proofpoint.

Configure Proofpoint on Demand API access

  1. Sign in to the Proofpoint Admin portal.
  2. Copy your Cluster ID, which is displayed in your management interface next to the release number.
  3. Select Settings > API Key Management.
  4. Click Create New to display the Create New API Key dialog.
  5. Enter a unique Name (for example, Google SecOps).
  6. Generate the API Key.
  7. Select View Details from the menu on the new API Key.
  8. Copy the API Key.

Configure a feed in Google SecOps to ingest Proofpoint On Demand (PoD) logs

  1. Go to SIEM Settings > Feeds.
  2. Click Add new.
  3. In the Feed name field, enter a name for the feed (for example, PoD Logs).
  4. Select Third Party API as the Source type.
  5. Select Proofpoint On Demand as the log type.
  6. Click Next.
  7. Specify values for the following input parameters:
    • Authentication HTTP headers: enter the Proofpoint API Key in a key:<value> format.
    • Cluster ID: enter the Proofpoint Cluster ID that you copied earlier.
    • Asset namespace: the asset namespace.
    • Ingestion labels: the label applied to the events from this feed.
  8. Click Next.
  9. Review the feed configuration in the Finalize screen, and then click Submit.

UDM Mapping Table

Log Field UDM Mapping Logic
classification security_result.detection_fields.classification The value comes directly from the classification field in the raw log.
cluster security_result.detection_fields.cluster The value comes directly from the cluster field in the raw log.
completelyRewritten security_result.detection_fields.completelyRewritten The value comes directly from the completelyRewritten field in the raw log.
connection.country principal.location.country_or_region The value comes directly from the connection.country field in the raw log, unless it is "**".
connection.host principal.hostname The value comes directly from the connection.host field in the raw log.
connection.ip principal.ip The value comes directly from the connection.ip field in the raw log, if it is a valid IPv4 address. It is also merged with senderIP if present.
connection.protocol network.application_protocol The protocol part before the colon in connection.protocol is extracted using gsub and mapped. For example, "smtp:smtp" becomes "SMTP".
connection.tls.inbound.cipher network.tls.cipher The value comes directly from the connection.tls.inbound.cipher field in the raw log, unless it is "NONE".
connection.tls.inbound.version network.tls.version The value comes directly from the connection.tls.inbound.version field in the raw log, unless the cipher is "NONE".
envelope.from network.email.from The value comes directly from the envelope.from field in the raw log. It is also replaced by sm.from or fromAddress if present.
envelope.rcpts network.email.to The email addresses in envelope.rcpts are merged into the network.email.to field if they are valid email addresses. It is also merged with sm.to or toAddresses if present.
envelope.rcptsHashed read_only_udm.additional.fields The hashed email addresses in envelope.rcptsHashed are added as additional fields with keys like "toHashed_0", "toHashed_1", etc.
eventTime @timestamp The value is parsed from the eventTime field using the ISO8601 or RFC 3339 format.
eventType security_result.summary The value comes directly from the eventType field in the raw log.
filter.disposition security_result.action_details The value comes directly from the filter.disposition field in the raw log, unless tls.verify is present.
filter.modules.av.virusNames.0 security_result.threat_name The value comes directly from the filter.modules.av.virusNames.0 field in the raw log.
filter.modules.dmarc.authResults read_only_udm.additional.fields The method and result from each entry in filter.modules.dmarc.authResults are added as additional fields with keys like "authResultsMethod_0", "authResults_result_0", "authResultsMethod_1", etc.
filter.modules.spam.langs read_only_udm.additional.fields Each language in filter.modules.spam.langs is added as an additional field with keys like "lang_0", "lang_1", etc.
filter.modules.spam.safeBlockedListMatches.0.listType security_result.detection_fields.safeBlockedListMatches_listType The value comes directly from the filter.modules.spam.safeBlockedListMatches.0.listType field in the raw log.
filter.modules.spam.safeBlockedListMatches.0.rule security_result.detection_fields.safeBlockedListMatches_rule The value comes directly from the filter.modules.spam.safeBlockedListMatches.0.rule field in the raw log.
filter.modules.spam.scores.classifiers.adult security_result.detection_fields.adult The value comes directly from the filter.modules.spam.scores.classifiers.adult field in the raw log, if it is not empty or 0.
filter.modules.spam.scores.classifiers.bulk security_result.detection_fields.bulk The value comes directly from the filter.modules.spam.scores.classifiers.bulk field in the raw log, if it is not empty or 0.
filter.modules.spam.scores.classifiers.impostor security_result.detection_fields.impostor The value comes directly from the filter.modules.spam.scores.classifiers.impostor field in the raw log, if it is not empty or 0.
filter.modules.spam.scores.classifiers.lowpriority security_result.detection_fields.lowpriority The value comes directly from the filter.modules.spam.scores.classifiers.lowpriority field in the raw log, if it is not empty or 0.
filter.modules.spam.scores.classifiers.malware security_result.detection_fields.malware The value comes directly from the filter.modules.spam.scores.classifiers.malware field in the raw log, if it is not empty or 0.
filter.modules.spam.scores.classifiers.mlx security_result.detection_fields.mlx The value comes directly from the filter.modules.spam.scores.classifiers.mlx field in the raw log, if it is not empty or 0.
filter.modules.spam.scores.classifiers.mlxlog security_result.detection_fields.mlxlog The value comes directly from the filter.modules.spam.scores.classifiers.mlxlog field in the raw log, if it is not empty or 0.
filter.modules.spam.scores.classifiers.phish security_result.detection_fields.phish The value comes directly from the filter.modules.spam.scores.classifiers.phish field in the raw log, if it is not empty or 0.
filter.modules.spam.scores.classifiers.spam security_result.detection_fields.spam The value comes directly from the filter.modules.spam.scores.classifiers.spam field in the raw log, if it is not empty or 0.
filter.modules.spam.scores.classifiers.suspect security_result.detection_fields.suspect The value comes directly from the filter.modules.spam.scores.classifiers.suspect field in the raw log, if it is not empty or 0.
filter.modules.spam.scores.engine security_result.detection_fields.engine The value comes directly from the filter.modules.spam.scores.engine field in the raw log, if it is not empty or 0.
filter.modules.spam.scores.overall security_result.detection_fields.overall The value comes directly from the filter.modules.spam.scores.overall field in the raw log, if it is not empty or 0.
filter.modules.spam.version.definitions security_result.summary The value comes directly from the filter.modules.spam.version.definitions field in the raw log.
filter.modules.spam.version.engine metadata.product_version The value comes directly from the filter.modules.spam.version.engine field in the raw log.
filter.modules.urldefense.counts.rewritten read_only_udm.additional.fields.urldefenseCountsRewritten The value comes directly from the filter.modules.urldefense.counts.rewritten field in the raw log.
filter.modules.urldefense.counts.total security_result.detection_fields.urldefense_total The value comes directly from the filter.modules.urldefense.counts.total field in the raw log.
filter.modules.zerohour.score read_only_udm.additional.fields.zeroHourScore The value comes directly from the filter.modules.zerohour.score field in the raw log.
filter.origGuid read_only_udm.additional.fields.origGuid The value comes directly from the filter.origGuid field in the raw log.
filter.qid read_only_udm.additional.fields.filterQid The value comes directly from the filter.qid field in the raw log.
filter.quarantine.folder security_result.detection_fields.filter_quarantine_folder The value comes directly from the filter.quarantine.folder field in the raw log.
filter.quarantine.folderId security_result.detection_fields.filter_quarantine_folderId The value comes directly from the filter.quarantine.folderId field in the raw log.
filter.quarantine.module security_result.detection_fields.filter_quarantine_module The value comes directly from the filter.quarantine.module field in the raw log.
filter.quarantine.rule security_result.detection_fields.filter_quarantine_rule The value comes directly from the filter.quarantine.rule field in the raw log.
filter.quarantine.type security_result.detection_fields.filter_quarantine_type The value comes directly from the filter.quarantine.type field in the raw log.
filter.routeDirection network.direction If filter.routeDirection is "inbound", network.direction is set to "INBOUND". If filter.routeDirection is "outbound", network.direction is set to "OUTBOUND".
filter.routes read_only_udm.additional.fields Each route in filter.routes is added as an additional field with keys like "filterRoutes_0", "filterRoutes_1", etc.
fromAddress network.email.from Valid email addresses in fromAddress replace the value of the network.email.from field.
guid metadata.product_log_id The value comes directly from the guid field in the raw log.
GUID metadata.product_log_id The value comes directly from the GUID field in the raw log.
headerFrom network.email.from The value comes directly from the headerFrom field in the raw log.
impostorScore security_result.detection_fields.impostorScore The value comes directly from the impostorScore field in the raw log.
malwareScore security_result.detection_fields.malwareScore The value comes directly from the malwareScore field in the raw log.
messageID network.email.mail_id The value comes directly from the messageID field in the raw log.
messageSize security_result.detection_fields.messageSize The value comes directly from the messageSize field in the raw log.
messageTime @timestamp The value is parsed from the messageTime field using the ISO8601 or RFC 3339 format.
metadata.customerId principal.labels.customerId The value comes directly from the metadata.customerId field in the raw log.
metadata.origin.data.agent network.http.user_agent The value comes directly from the metadata.origin.data.agent field in the raw log.
metadata.origin.data.cid principal.user.userid The value comes directly from the metadata.origin.data.cid field in the raw log.
metadata.origin.data.version metadata.product_version The value comes directly from the metadata.origin.data.version field in the raw log.
msg.header.from read_only_udm.additional.fields.msgHeaderFrom The value comes directly from the msg.header.from.0 field in the raw log.
msg.header.reply-to network.email.reply_to The email address enclosed in <> in msg.header.reply-to.0 is extracted and mapped.
msg.header.subject network.email.subject The value comes directly from the msg.header.subject field in the raw log.
msg.header.to read_only_udm.additional.fields.msgHeaderTo The value comes directly from the msg.header.to field in the raw log.
msg.normalizedHeader.subject network.email.subject The value comes directly from the msg.normalizedHeader.subject field in the raw log.
msg.parsedAddresses.cc network.email.cc The email addresses in msg.parsedAddresses.cc are merged into the network.email.cc field if they are valid email addresses.
msg.parsedAddresses.ccHashed read_only_udm.additional.fields The hashed email addresses in msg.parsedAddresses.ccHashed are added as additional fields with keys like "ccHashed_0", "ccHashed_1", etc.
msg.parsedAddresses.from read_only_udm.additional.fields.msgParsedAddressesFrom The value comes directly from the msg.parsedAddresses.from.0 field in the raw log.
msg.parsedAddresses.from.0 principal.user.email_addresses The value comes directly from the msg.parsedAddresses.from.0 field in the raw log.
msg.parsedAddresses.fromHashed read_only_udm.additional.fields.fromHashed The value comes directly from the msg.parsedAddresses.fromHashed.0 field in the raw log.
msg.parsedAddresses.to target.user.email_addresses The email addresses in msg.parsedAddresses.to are merged into the target.user.email_addresses field if they are valid email addresses.
msgParts read_only_udm.about Multiple about objects are created, one for each entry in msgParts. File hashes, MIME type, size, and other metadata are extracted.
QID security_result.detection_fields.QID The value comes directly from the QID field in the raw log.
recipient target.user.email_addresses The email addresses in recipient are merged into the target.user.email_addresses field if they are valid email addresses.
replyToAddress network.email.reply_to Valid email addresses in replyToAddress replace the value of the network.email.reply_to field.
sender principal.user.email_addresses The value comes directly from the sender field in the raw log, if it is a valid email address.
senderIP principal.ip The value comes directly from the senderIP field in the raw log.
sm.from network.email.from The value comes directly from the sm.from field in the raw log.
sm.msgid network.email.mail_id The value comes directly from the sm.msgid field in the raw log, after removing "<" and ">".
sm.proto network.application_protocol The value comes directly from the sm.proto field in the raw log.
sm.qid security_result.detection_fields.QUID The value comes directly from the sm.qid field in the raw log.
sm.relay intermediary.hostname, intermediary.ip The hostname and IP address are extracted from sm.relay using grok.
sm.stat security_result.detection_fields.Stat The first word of sm.stat is extracted using grok and mapped.
sm.to network.email.to The email addresses in sm.to are merged into the network.email.to field if they are valid email addresses.
spamScore security_result.detection_fields.spamScore The value comes directly from the spamScore field in the raw log.
subject network.email.subject The value comes directly from the subject field in the raw log.
threat security_result.detection_fields.threat The value comes directly from the threat field in the raw log.
threatsInfoMap security_result.detection_fields Key-value pairs from each entry in threatsInfoMap are added as detection fields.
threatType security_result.detection_fields.threatType The value comes directly from the threatType field in the raw log.
tls.cipher network.tls.cipher The value comes directly from the tls.cipher field in the raw log, unless it is "NONE".
tls.verify security_result.action_details The value comes directly from the tls.verify field in the raw log.
tls.version network.tls.version The value comes directly from the tls.version field in the raw log, unless the cipher is "NONE".
toAddresses network.email.to The email addresses in toAddresses are merged into the network.email.to field if they are valid email addresses.
ts @timestamp The value is parsed from the ts field using the ISO8601 or RFC 3339 format, after some preprocessing to handle extra fractional seconds.
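
Several rows in the table describe small transformations rather than direct copies. The Python sketch below illustrates a few of them under stated assumptions. It is not the parser's implementation: every function name is hypothetical, the output dictionaries only loosely mirror UDM structures, and `parse_ts` assumes that the timestamp carries a UTC offset or a trailing "Z".

```python
import re
from datetime import datetime
from typing import Dict, List, Optional


def map_protocol(value: str) -> str:
    # Keep the part before the colon and upper-case it: "smtp:smtp" -> "SMTP".
    return value.split(":", 1)[0].upper()


def map_country(value: str) -> Optional[str]:
    # "**" is treated as "no country" and is left unmapped.
    return None if value == "**" else value


def map_tls(cipher: str, version: str) -> Dict[str, str]:
    # When the cipher is "NONE", neither the cipher nor the version is mapped.
    if cipher == "NONE":
        return {}
    return {"cipher": cipher, "version": version}


def indexed_fields(prefix: str, values: List[str]) -> List[Dict[str, str]]:
    # Produces keys such as "toHashed_0", "toHashed_1", and so on.
    return [{"key": f"{prefix}_{i}", "value": v} for i, v in enumerate(values)]


def parse_ts(value: str) -> datetime:
    # Trim fractional seconds to six digits, the most that %f accepts.
    trimmed = re.sub(r"(\.\d{6})\d+", r"\1", value)
    fmt = "%Y-%m-%dT%H:%M:%S.%f%z" if "." in trimmed else "%Y-%m-%dT%H:%M:%S%z"
    return datetime.strptime(trimmed, fmt)


print(map_protocol("smtp:smtp"))
print(map_country("**"))
print(map_tls("NONE", "TLSv1.2"))
print(indexed_fields("toHashed", ["hash-a", "hash-b"]))
print(parse_ts("2024-07-22T10:15:30.123456789+0000"))
```

The timestamp helper only trims digits beyond microsecond precision. The preprocessing that the parser applies to ts may differ.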

Changes

2024-11-28

  • Enhancement:
  • Mapped "msgParts.metadata.company" and "msgParts.metadata.author" to "security_result.detection_fields".
  • Removed duplicate mapping of "email.subject".
  • Changed mapping of "filter.modules.dmarc.authResults." fields from "additional.fields" to "security_result.detection_fields".

2024-08-28

  • Enhancement:
  • Changed "security_result.detection_fields" key from "filterQid" to "qid".

2024-08-21

  • Enhancement:
  • Mapped "metadata.origin.data.cid" to "additional.fields".

2024-07-22

  • Enhancement:
  • Mapped "msgPart.detectedSizeBytes" to "about.file.size" only when the value is a valid unsigned integer.
  • Added support for new pattern of SYSLOG logs.
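
The unsigned-integer guard on the file size can be pictured with the short Python sketch below. The helper name `file_size_from` is hypothetical and the logic is an assumption about how such a check might look, not the parser's code. It also rejects "-1".

```python
from typing import Any, Optional


def file_size_from(detected_size: Any) -> Optional[int]:
    """Return the size as an int if it is a valid unsigned integer, else None."""
    text = str(detected_size).strip()
    # isdigit() rejects signs, so "-1" and "" both fall through to None.
    # isascii() excludes non-ASCII digit characters that int() cannot parse.
    if text.isascii() and text.isdigit():
        return int(text)
    return None


print(file_size_from("2048"))
print(file_size_from(-1))
```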

2024-07-09

  • Enhancement:
  • Mapped "msg.header.x-mailer" to "additional.fields".

2023-11-13

  • Mapped "subject" to "network.email.subject".
  • Mapped "messageID" to "network.email.mail_id".
  • Mapped "fromAddress" to "network.email.from".
  • Mapped "ccAddresses" to "network.email.cc".
  • Mapped "replyToAddress" to "network.email.reply_to".
  • Mapped "toAddresses" to "network.email.to".
  • Mapped "sender" to "principal.user.email_addresses".
  • Mapped "senderIP" to "principal.ip".
  • Mapped "recipient" to "target.user.email_addresses".
  • Mapped "spamScore", "phishScore", "threatsInfoMap", "impostorScore", and "malwareScore" to "security_result.detection_fields".

2023-10-26

  • Mapped "msg.parsedAddresses.from.0" to "principal.user.email_addresses".
  • Modified mappings from using deprecated UDM fields to alternative fields.
  • Added mapping from "about.labels" to "about.resource.attribute.labels".
  • Added mapping from "principal.labels" to "principal.resource.attribute.labels".

2023-06-05

  • Added a check on "msg.header.reply-to.0" before mapping it to UDM to determine whether it is an array of email addresses.
  • Added a check that "msgPart.detectedSizeBytes" is not "-1" before mapping it to UDM.
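
The reply-to handling can be sketched in Python as follows. This is a hedged illustration only: `reply_to_addresses` and `ANGLE_ADDR` are hypothetical names, and the regular expression is a simplified stand-in for whatever pattern the parser uses to pull the address out of the angle brackets.

```python
import re
from typing import Any, List

# Simplified pattern: anything of the form <local@domain> with no spaces.
ANGLE_ADDR = re.compile(r"<([^<>\s]+@[^<>\s]+)>")


def reply_to_addresses(header_value: Any) -> List[str]:
    """Extract addresses enclosed in <> from a string or an array of strings."""
    values = header_value if isinstance(header_value, list) else [header_value]
    found: List[str] = []
    for value in values:
        # Non-string entries (for example None) are skipped.
        if isinstance(value, str):
            found.extend(ANGLE_ADDR.findall(value))
    return found


print(reply_to_addresses('"Help Desk" <help@example.com>'))
print(reply_to_addresses(["<a@example.com>", "B <b@example.com>"]))
```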

2022-07-14

  • Enhancement to map the following elements to UDM elements:
  • Mapped langs to additional.fields.
  • Mapped definitions to security_result.summary.
  • Mapped engine to metadata.product_version.

2022-06-29

  • Enhancement - Added gsub to remove '<>' from the fields 'sm.msgid' and 'msg.header.message-id.0' mapped to 'network.email.mail_id'.
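
In Python terms, the cleanup amounts to something like the sketch below. The helper name `clean_message_id` is hypothetical, and the parser itself uses gsub rather than this code.

```python
def clean_message_id(raw: str) -> str:
    """Remove every "<" and ">" character from a message ID."""
    return raw.replace("<", "").replace(">", "")


print(clean_message_id("<20220629.abc123@mail.example.com>"))
```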

2022-05-20

  • Enhancement to map the following elements to UDM elements:
  • Mapped 'tls.verify', 'filter.disposition' to 'security_result.action_details'.
  • Mapped 'filter.modules.dmarc.authResults.result' to 'additional.fields'.
  • Mapped 'filter.quarantine.module', 'filter.quarantine.folder', 'filter.quarantine.type', 'filter.quarantine.folderId', 'filter.modules.spam.scores.overall', 'filter.modules.spam.scores.engine', 'filter.modules.spam.scores.classifiers.spam', 'filter.modules.spam.scores.classifiers.mlxlog', 'filter.modules.spam.scores.classifiers.phish', 'filter.modules.spam.scores.classifiers.impostor', 'filter.modules.spam.scores.classifiers.lowpriority', 'filter.modules.spam.scores.classifiers.mlx', 'filter.modules.spam.scores.classifiers.bulk', 'filter.modules.spam.scores.classifiers.suspect', 'filter.modules.spam.scores.classifiers.malware', 'filter.modules.spam.scores.classifiers.adult' to 'security_result.detection_fields'.
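
The mapping table notes that spam classifier scores are mapped only when they are not empty or 0. A minimal Python sketch of that rule follows. The function name `score_detection_fields` is hypothetical, and the key/value dictionaries are an assumed simplification of UDM detection fields.

```python
from typing import Any, Dict, List


def score_detection_fields(classifiers: Dict[str, Any]) -> List[Dict[str, str]]:
    """Turn classifier scores into key/value pairs, skipping empty or zero scores."""
    fields: List[Dict[str, str]] = []
    for name, score in classifiers.items():
        # Skip missing values, empty strings, and zero in numeric or string form.
        if score in (None, "", 0, "0"):
            continue
        fields.append({"key": name, "value": str(score)})
    return fields


# Hypothetical contents of filter.modules.spam.scores.classifiers.
sample = {"spam": 87, "phish": 0, "bulk": "", "malware": 12}
print(score_detection_fields(sample))
```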

2022-04-13

  • Enhancement to map the following element to a UDM element:
  • Mapped filter.modules.av.virusNames to 'security_result.threat_name'.