Collect Proofpoint On-Demand logs
This parser extracts fields from Proofpoint On-Demand JSON logs and maps them to the UDM format. It handles two primary log formats, one containing email metadata and the other containing SMTP transaction details, and uses conditional logic to populate UDM fields based on which data is available. The parser also cleans the data, for example by removing extraneous characters and converting timestamps.
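As a rough illustration of the conditional handling described above, the following Python sketch classifies a single JSON log line. It is not the parser's actual logic. The function name `classify_record` and the rule that a top-level `sm` object marks an SMTP transaction record are assumptions, inferred only from the field names listed in the mapping table.

```python
import json


def classify_record(raw_line: str) -> str:
    """Classify one JSON log line (hypothetical helper, not part of the parser)."""
    record = json.loads(raw_line)
    # Assumption: SMTP transaction records carry a top-level "sm" object
    # (sm.from, sm.to, sm.relay, and so on).
    if isinstance(record.get("sm"), dict):
        return "smtp_transaction"
    # Assumption: email metadata records carry "msg", "envelope", or "filter".
    if any(key in record for key in ("msg", "envelope", "filter")):
        return "email_metadata"
    return "unknown"


if __name__ == "__main__":
    print(classify_record('{"sm": {"from": "a@example.com"}}'))
    print(classify_record('{"envelope": {"from": "a@example.com"}}'))
```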
Before you begin
- Ensure that you have a Google Security Operations instance.
- Ensure that you have the Proofpoint On Demand Remote Syslog license.
- Ensure that you have privileged access to Proofpoint.
Configure Proofpoint On Demand API access
- Sign in to the Proofpoint Admin portal.
- Copy your Cluster ID, which is displayed in your management interface next to the release number.
- Select Settings > API Key Management.
- Click Create New to display the Create New API Key dialog.
- Enter a unique Name (for example, Google SecOps).
- Generate the API Key.
- Select View Details from the menu on the new API Key.
- Copy the API Key.
Configure a feed in Google SecOps to ingest Proofpoint On Demand (PoD) logs
- Go to SIEM Settings > Feeds.
- Click Add new.
- In the Feed name field, enter a name for the feed (for example, PoD Logs).
- Select Third Party API as the Source type.
- Select Proofpoint On Demand as the log type.
- Click Next.
- Specify values for the following input parameters:
  - Authentication HTTP headers: enter the Proofpoint API Key in a key:<value> format.
  - Cluster ID: enter the Proofpoint Cluster ID that you copied earlier.
  - Asset namespace: the asset namespace.
  - Ingestion labels: the label applied to the events from this feed.
- Click Next.
- Review the feed configuration in the Finalize screen, and then click Submit.
UDM Mapping Table
Log Field | UDM Mapping | Logic |
---|---|---|
classification | security_result.detection_fields.classification | The value comes directly from the classification field in the raw log. |
cluster | security_result.detection_fields.cluster | The value comes directly from the cluster field in the raw log. |
completelyRewritten | security_result.detection_fields.completelyRewritten | The value comes directly from the completelyRewritten field in the raw log. |
connection.country | principal.location.country_or_region | The value comes directly from the connection.country field in the raw log, unless it is "**". |
connection.host | principal.hostname | The value comes directly from the connection.host field in the raw log. |
connection.ip | principal.ip | The value comes directly from the connection.ip field in the raw log, if it is a valid IPv4 address. It is also merged with senderIP if present. |
connection.protocol | network.application_protocol | The protocol part before the colon in connection.protocol is extracted using gsub and mapped. For example, "smtp:smtp" becomes "SMTP". |
connection.tls.inbound.cipher | network.tls.cipher | The value comes directly from the connection.tls.inbound.cipher field in the raw log, unless it is "NONE". |
connection.tls.inbound.version | network.tls.version | The value comes directly from the connection.tls.inbound.version field in the raw log, unless the cipher is "NONE". |
envelope.from | network.email.from | The value comes directly from the envelope.from field in the raw log. It is replaced by sm.from or fromAddress if either is present. |
envelope.rcpts | network.email.to | The email addresses in envelope.rcpts are merged into the network.email.to field if they are valid email addresses. It is also merged with sm.to or toAddresses if present. |
envelope.rcptsHashed | read_only_udm.additional.fields | The hashed email addresses in envelope.rcptsHashed are added as additional fields with keys like "toHashed_0", "toHashed_1", etc. |
eventTime | @timestamp | The value is parsed from the eventTime field using the ISO8601 or RFC 3339 format. |
eventType | security_result.summary | The value comes directly from the eventType field in the raw log. |
filter.disposition | security_result.action_details | The value comes directly from the filter.disposition field in the raw log, unless tls.verify is present. |
filter.modules.av.virusNames.0 | security_result.threat_name | The value comes directly from the filter.modules.av.virusNames.0 field in the raw log. |
filter.modules.dmarc.authResults | read_only_udm.additional.fields | The method and result from each entry in filter.modules.dmarc.authResults are added as additional fields with keys like "authResultsMethod_0", "authResults_result_0", "authResultsMethod_1", etc. |
filter.modules.spam.langs | read_only_udm.additional.fields | Each language in filter.modules.spam.langs is added as an additional field with keys like "lang_0", "lang_1", etc. |
filter.modules.spam.safeBlockedListMatches.0.listType | security_result.detection_fields.safeBlockedListMatches_listType | The value comes directly from the filter.modules.spam.safeBlockedListMatches.0.listType field in the raw log. |
filter.modules.spam.safeBlockedListMatches.0.rule | security_result.detection_fields.safeBlockedListMatches_rule | The value comes directly from the filter.modules.spam.safeBlockedListMatches.0.rule field in the raw log. |
filter.modules.spam.scores.classifiers.adult | security_result.detection_fields.adult | The value comes directly from the filter.modules.spam.scores.classifiers.adult field in the raw log, if it is not empty or 0. |
filter.modules.spam.scores.classifiers.bulk | security_result.detection_fields.bulk | The value comes directly from the filter.modules.spam.scores.classifiers.bulk field in the raw log, if it is not empty or 0. |
filter.modules.spam.scores.classifiers.impostor | security_result.detection_fields.impostor | The value comes directly from the filter.modules.spam.scores.classifiers.impostor field in the raw log, if it is not empty or 0. |
filter.modules.spam.scores.classifiers.lowpriority | security_result.detection_fields.lowpriority | The value comes directly from the filter.modules.spam.scores.classifiers.lowpriority field in the raw log, if it is not empty or 0. |
filter.modules.spam.scores.classifiers.malware | security_result.detection_fields.malware | The value comes directly from the filter.modules.spam.scores.classifiers.malware field in the raw log, if it is not empty or 0. |
filter.modules.spam.scores.classifiers.mlx | security_result.detection_fields.mlx | The value comes directly from the filter.modules.spam.scores.classifiers.mlx field in the raw log, if it is not empty or 0. |
filter.modules.spam.scores.classifiers.mlxlog | security_result.detection_fields.mlxlog | The value comes directly from the filter.modules.spam.scores.classifiers.mlxlog field in the raw log, if it is not empty or 0. |
filter.modules.spam.scores.classifiers.phish | security_result.detection_fields.phish | The value comes directly from the filter.modules.spam.scores.classifiers.phish field in the raw log, if it is not empty or 0. |
filter.modules.spam.scores.classifiers.spam | security_result.detection_fields.spam | The value comes directly from the filter.modules.spam.scores.classifiers.spam field in the raw log, if it is not empty or 0. |
filter.modules.spam.scores.classifiers.suspect | security_result.detection_fields.suspect | The value comes directly from the filter.modules.spam.scores.classifiers.suspect field in the raw log, if it is not empty or 0. |
filter.modules.spam.scores.engine | security_result.detection_fields.engine | The value comes directly from the filter.modules.spam.scores.engine field in the raw log, if it is not empty or 0. |
filter.modules.spam.scores.overall | security_result.detection_fields.overall | The value comes directly from the filter.modules.spam.scores.overall field in the raw log, if it is not empty or 0. |
filter.modules.spam.version.definitions | security_result.summary | The value comes directly from the filter.modules.spam.version.definitions field in the raw log. |
filter.modules.spam.version.engine | metadata.product_version | The value comes directly from the filter.modules.spam.version.engine field in the raw log. |
filter.modules.urldefense.counts.rewritten | read_only_udm.additional.fields.urldefenseCountsRewritten | The value comes directly from the filter.modules.urldefense.counts.rewritten field in the raw log. |
filter.modules.urldefense.counts.total | security_result.detection_fields.urldefense_total | The value comes directly from the filter.modules.urldefense.counts.total field in the raw log. |
filter.modules.zerohour.score | read_only_udm.additional.fields.zeroHourScore | The value comes directly from the filter.modules.zerohour.score field in the raw log. |
filter.origGuid | read_only_udm.additional.fields.origGuid | The value comes directly from the filter.origGuid field in the raw log. |
filter.qid | read_only_udm.additional.fields.filterQid | The value comes directly from the filter.qid field in the raw log. |
filter.quarantine.folder | security_result.detection_fields.filter_quarantine_folder | The value comes directly from the filter.quarantine.folder field in the raw log. |
filter.quarantine.folderId | security_result.detection_fields.filter_quarantine_folderId | The value comes directly from the filter.quarantine.folderId field in the raw log. |
filter.quarantine.module | security_result.detection_fields.filter_quarantine_module | The value comes directly from the filter.quarantine.module field in the raw log. |
filter.quarantine.rule | security_result.detection_fields.filter_quarantine_rule | The value comes directly from the filter.quarantine.rule field in the raw log. |
filter.quarantine.type | security_result.detection_fields.filter_quarantine_type | The value comes directly from the filter.quarantine.type field in the raw log. |
filter.routeDirection | network.direction | If filter.routeDirection is "inbound", network.direction is set to "INBOUND". If filter.routeDirection is "outbound", network.direction is set to "OUTBOUND". |
filter.routes | read_only_udm.additional.fields | Each route in filter.routes is added as an additional field with keys like "filterRoutes_0", "filterRoutes_1", etc. |
fromAddress | network.email.from | The email addresses in fromAddress replace the value of the network.email.from field if they are valid email addresses. |
guid | metadata.product_log_id | The value comes directly from the guid field in the raw log. |
GUID | metadata.product_log_id | The value comes directly from the GUID field in the raw log. |
headerFrom | network.email.from | The value comes directly from the headerFrom field in the raw log. |
impostorScore | security_result.detection_fields.impostorScore | The value comes directly from the impostorScore field in the raw log. |
malwareScore | security_result.detection_fields.malwareScore | The value comes directly from the malwareScore field in the raw log. |
messageID | network.email.mail_id | The value comes directly from the messageID field in the raw log. |
messageSize | security_result.detection_fields.messageSize | The value comes directly from the messageSize field in the raw log. |
messageTime | @timestamp | The value is parsed from the messageTime field using the ISO8601 or RFC 3339 format. |
metadata.customerId | principal.labels.customerId | The value comes directly from the metadata.customerId field in the raw log. |
metadata.origin.data.agent | network.http.user_agent | The value comes directly from the metadata.origin.data.agent field in the raw log. |
metadata.origin.data.cid | principal.user.userid | The value comes directly from the metadata.origin.data.cid field in the raw log. |
metadata.origin.data.version | metadata.product_version | The value comes directly from the metadata.origin.data.version field in the raw log. |
msg.header.from | read_only_udm.additional.fields.msgHeaderFrom | The value comes directly from the msg.header.from.0 field in the raw log. |
msg.header.reply-to | network.email.reply_to | The email address enclosed in <> in msg.header.reply-to.0 is extracted and mapped. |
msg.header.subject | network.email.subject | The value comes directly from the msg.header.subject field in the raw log. |
msg.header.to | read_only_udm.additional.fields.msgHeaderTo | The value comes directly from the msg.header.to field in the raw log. |
msg.normalizedHeader.subject | network.email.subject | The value comes directly from the msg.normalizedHeader.subject field in the raw log. |
msg.parsedAddresses.cc | network.email.cc | The email addresses in msg.parsedAddresses.cc are merged into the network.email.cc field if they are valid email addresses. |
msg.parsedAddresses.ccHashed | read_only_udm.additional.fields | The hashed email addresses in msg.parsedAddresses.ccHashed are added as additional fields with keys like "ccHashed_0", "ccHashed_1", etc. |
msg.parsedAddresses.from | read_only_udm.additional.fields.msgParsedAddressesFrom | The value comes directly from the msg.parsedAddresses.from.0 field in the raw log. |
msg.parsedAddresses.from.0 | principal.user.email_addresses | The value comes directly from the msg.parsedAddresses.from.0 field in the raw log. |
msg.parsedAddresses.fromHashed | read_only_udm.additional.fields.fromHashed | The value comes directly from the msg.parsedAddresses.fromHashed.0 field in the raw log. |
msg.parsedAddresses.to | target.user.email_addresses | The email addresses in msg.parsedAddresses.to are merged into the target.user.email_addresses field if they are valid email addresses. |
msgParts | read_only_udm.about | Multiple about objects are created, one for each entry in msgParts. File hashes, MIME type, size, and other metadata are extracted. |
QID | security_result.detection_fields.QID | The value comes directly from the QID field in the raw log. |
recipient | target.user.email_addresses | The email addresses in recipient are merged into the target.user.email_addresses field if they are valid email addresses. |
replyToAddress | network.email.reply_to | The email addresses in replyToAddress replace the value of the network.email.reply_to field if they are valid email addresses. |
sender | principal.user.email_addresses | The value comes directly from the sender field in the raw log, if it is a valid email address. |
senderIP | principal.ip | The value comes directly from the senderIP field in the raw log. |
sm.from | network.email.from | The value comes directly from the sm.from field in the raw log. |
sm.msgid | network.email.mail_id | The value comes directly from the sm.msgid field in the raw log, after removing "<" and ">". |
sm.proto | network.application_protocol | The value comes directly from the sm.proto field in the raw log. |
sm.qid | security_result.detection_fields.QUID | The value comes directly from the sm.qid field in the raw log. |
sm.relay | intermediary.hostname, intermediary.ip | The hostname and IP address are extracted from sm.relay using grok. |
sm.stat | security_result.detection_fields.Stat | The first word of sm.stat is extracted using grok and mapped. |
sm.to | network.email.to | The email addresses in sm.to are merged into the network.email.to field if they are valid email addresses. |
spamScore | security_result.detection_fields.spamScore | The value comes directly from the spamScore field in the raw log. |
subject | network.email.subject | The value comes directly from the subject field in the raw log. |
threat | security_result.detection_fields.threat | The value comes directly from the threat field in the raw log. |
threatsInfoMap | security_result.detection_fields | Key-value pairs from each entry in threatsInfoMap are added as detection fields. |
threatType | security_result.detection_fields.threatType | The value comes directly from the threatType field in the raw log. |
tls.cipher | network.tls.cipher | The value comes directly from the tls.cipher field in the raw log, unless it is "NONE". |
tls.verify | security_result.action_details | The value comes directly from the tls.verify field in the raw log. |
tls.version | network.tls.version | The value comes directly from the tls.version field in the raw log, unless the cipher is "NONE". |
toAddresses | network.email.to | The email addresses in toAddresses are merged into the network.email.to field if they are valid email addresses. |
ts | @timestamp | The value is parsed from the ts field using the ISO8601 or RFC 3339 format, after some preprocessing to handle extra fractional seconds. |
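A few of the transformations in the table can be approximated in Python to show what the mapped values look like. This is a minimal sketch, not the parser's implementation: the parser uses gsub and grok, while the code below uses plain string operations. All function names and sample values are hypothetical, and the exact preprocessing applied to `ts` is assumed to be a truncation of the fractional seconds to six digits.

```python
import re
from datetime import datetime, timezone


def application_protocol(connection_protocol: str) -> str:
    """Keep the part before the colon and uppercase it ("smtp:smtp" -> "SMTP")."""
    return connection_protocol.split(":", 1)[0].upper()


def clean_mail_id(msgid: str) -> str:
    """Remove "<" and ">" as described for sm.msgid."""
    return msgid.replace("<", "").replace(">", "")


def indexed_fields(prefix: str, values: list) -> dict:
    """Build keys such as toHashed_0, toHashed_1 for list-valued fields."""
    return {f"{prefix}_{index}": value for index, value in enumerate(values)}


def keep_score(value) -> bool:
    """Classifier scores are mapped only if they are not empty or 0."""
    return value not in ("", None, 0, "0")


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp that may carry extra fractional digits."""
    ts = ts.strip()
    if ts.endswith("Z"):
        ts = ts[:-1] + "+00:00"
    # Assumption: normalize the fraction to exactly six digits so that
    # datetime.fromisoformat accepts it on older Python versions too.
    ts = re.sub(r"\.(\d+)", lambda m: "." + m.group(1)[:6].ljust(6, "0"), ts, count=1)
    return datetime.fromisoformat(ts)


if __name__ == "__main__":
    print(application_protocol("smtp:smtp"))
    print(clean_mail_id("<abc123@mail.example.com>"))
    print(indexed_fields("toHashed", ["hash-a", "hash-b"]))
    print(parse_ts("2024-07-22T10:15:30.123456789Z"))
```

The indexed-key helper covers the same pattern used for "lang_0", "ccHashed_0", and "filterRoutes_0" in the table; only the prefix changes.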
Changes
2024-11-28
- Enhancement:
- Mapped "msgParts.metadata.company" and "msgParts.metadata.author" to "security_result.detection_fields".
- Removed duplicate mapping of "email.subject".
- Changed mapping of "filter.modules.dmarc.authResults." fields from "additional.fields" to "security_result.detection_fields".
2024-08-28
- Enhancement:
- Changed "security_result.detection_fields" key from "filterQid" to "qid".
2024-08-21
- Enhancement:
- Mapped "metadata.origin.data.cid" to "additional.fields".
2024-07-22
- Enhancement:
- If "about.file.size" is a valid Unsigned Integer, then mapped "msgPart.detectedSizeBytes" to "about.file.size".
- Added support for new pattern of SYSLOG logs.
2024-07-09
- Enhancement:
- Mapped "msg.header.x-mailer" to "additional.fields".
2023-11-13
- Mapped "subject" to "network.email.subject".
- Mapped "messageID" to "network.email.mail_id".
- Mapped "fromAddress" to "network.email.from".
- Mapped "ccAddresses" to "network.email.cc".
- Mapped "replyToAddress" to "network.email.reply_to".
- Mapped "toAddresses" to "network.email.to".
- Mapped "sender" to "principal.user.email_addresses".
- Mapped "senderIP" to "principal.ip".
- Mapped "recipient" to "target.user.email_addresses".
- Mapped "spamScore", "phishScore", "threatsInfoMap", "impostorScore", "malwareScore", "" to "security_result.detection_fields".
2023-10-26
- Mapped "msg.parsedAddresses.from.0" to "principal.user.email_addresses".
- Modified mappings from using deprecated UDM fields to alternative fields.
- Added mapping from "about.labels" to "about.resource.attribute.labels".
- Added mapping from "principal.labels" to "principal.resource.attribute.labels".
2023-06-05
- Added a check on "msg.header.reply-to.0" before mapping it to UDM to determine whether it is an array of emails.
- Added a check that "msgPart.detectedSizeBytes" is not "-1" before mapping it to UDM.
2022-07-14
- Enhancement to map the following elements to UDM elements:
- Mapped langs to additional.fields.
- Mapped definitions to security_result.summary.
- Mapped engine to metadata.product_version.
2022-06-29
- Enhancement - Added gsub to remove '<>' from the fields 'sm.msgid' and 'msg.header.message-id.0' mapped to 'network.email.mail_id'.
2022-05-20
- Enhancement to map the following elements to UDM elements:
- Mapped 'tls.verify', 'filter.disposition' to 'security_result.action_details'.
- Mapped 'filter.modules.dmarc.authResults.result' to 'additional.fields'.
- Mapped 'filter.quarantine.module', 'filter.quarantine.folder', 'filter.quarantine.type', 'filter.quarantine.folderId', 'filter.modules.spam.scores.overall', 'filter.modules.spam.scores.engine', 'filter.modules.spam.scores.classifiers.spam', 'filter.modules.spam.scores.classifiers.mlxlog', 'filter.modules.spam.scores.classifiers.phish', 'filter.modules.spam.scores.classifiers.impostor', 'filter.modules.spam.scores.classifiers.lowpriority', 'filter.modules.spam.scores.classifiers.mlx', 'filter.modules.spam.scores.classifiers.bulk', 'filter.modules.spam.scores.classifiers.suspect', 'filter.modules.spam.scores.classifiers.malware', 'filter.modules.spam.scores.classifiers.adult' to 'security_result.detection_fields'.
2022-04-13
- Enhancement to map the following element to a UDM element:
- Mapped filter.modules.av.virusNames to 'security_result.threat_name'.