Collect Box Collaboration JSON logs
This document explains how to ingest Box Collaboration JSON logs into Google Security Operations using an Amazon S3 bucket together with a Lambda function and an EventBridge schedule. The parser processes Box event logs in JSON format and maps them to the Unified Data Model (UDM). It extracts relevant fields from the raw logs, performs transformations such as renaming and merging, and enriches the data with intermediary information before outputting structured event data.
Before you begin
- Google SecOps instance
- Privileged access to Box (Admin + Developer Console)
- Privileged access to AWS (S3, IAM, Lambda, EventBridge) in the same Region where you plan to store the logs
Configure Box Developer Console (Client Credentials)
- Sign in to Box Developer Console.
- Create a Custom App with Server Authentication (Client Credentials Grant).
- Set Application Access = App + Enterprise Access.
- In Application Scopes, enable Manage enterprise properties.
- In Admin Console > Apps > Custom Apps Manager, Authorize the app by Client ID.
- Copy and save the Client ID and Client Secret in a secure location.
- Go to Admin Console > Account & Billing > Account Information.
- Copy and save the Enterprise ID in a secure location.
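To confirm the app is authorized before building anything in AWS, you can request a token with the Client Credentials Grant. The following is a minimal sketch assuming the Client ID, Client Secret, and Enterprise ID you saved above; the placeholder values are yours to replace:

```python
# Minimal sketch: request an enterprise access token from Box.
import json, urllib.parse
from urllib.request import Request, urlopen

body = urllib.parse.urlencode({
    "grant_type": "client_credentials",
    "client_id": "YOUR_CLIENT_ID",           # placeholder
    "client_secret": "YOUR_CLIENT_SECRET",   # placeholder
    "box_subject_type": "enterprise",
    "box_subject_id": "YOUR_ENTERPRISE_ID",  # placeholder
}).encode()
req = Request("https://api.box.com/oauth2/token", data=body, method="POST")
req.add_header("Content-Type", "application/x-www-form-urlencoded")
with urlopen(req, timeout=30) as r:
    token = json.loads(r.read())["access_token"]
print(token[:8] + "...")  # a token prefix prints on success
```

A 400 response here usually means the app has not yet been authorized in the Admin Console.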
Configure AWS S3 bucket and IAM for Google SecOps
- Create an Amazon S3 bucket following this user guide: Creating a bucket.
- Save the bucket Name and Region for future reference (for example, `box-collaboration-logs`).
- Create a user following this user guide: Creating an IAM user.
- Select the created User.
- Select the Security credentials tab.
- Click Create Access Key in the Access Keys section.
- Select Third-party service as the Use case.
- Click Next.
- Optional: add a description tag.
- Click Create access key.
- Click Download CSV file to save the Access Key and Secret Access Key for later use.
- Click Done.
- Select the Permissions tab.
- In the Permissions policies section, click Add permissions > Add permissions.
- Select Attach policies directly.
- Search for and select the AmazonS3FullAccess policy.
- Click Next.
- Click Add permissions.
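To verify the new access key can reach the bucket, a quick boto3 round trip works. This is an optional sketch, assuming the key pair from the downloaded CSV, the example bucket name, and a hypothetical test object key:

```python
# Optional check: write and read back a test object with the new key pair.
import boto3

s3 = boto3.client(
    "s3",
    aws_access_key_id="AKIA...",   # Access Key from the downloaded CSV
    aws_secret_access_key="...",   # Secret Access Key from the CSV
    region_name="us-east-1",       # the Region you chose for the bucket
)
s3.put_object(Bucket="box-collaboration-logs",
              Key="box/collaboration/connectivity-check.json", Body=b"{}")
print(s3.get_object(Bucket="box-collaboration-logs",
                    Key="box/collaboration/connectivity-check.json")["Body"].read())
```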
Configure the IAM policy and role for S3 uploads
- In the AWS console, go to IAM > Policies > Create policy > JSON tab.
Enter the following policy:
{ "Version": "2012-10-17", "Statement": [ { "Sid": "AllowPutBoxObjects", "Effect": "Allow", "Action": ["s3:PutObject"], "Resource": "arn:aws:s3:::box-collaboration-logs/*" }, { "Sid": "AllowGetStateObject", "Effect": "Allow", "Action": ["s3:GetObject"], "Resource": "arn:aws:s3:::box-collaboration-logs/box/collaboration/state.json" } ] }
- Replace `box-collaboration-logs` if you entered a different bucket name.
- Click Next > Create policy.
- Go to IAM > Roles > Create role > AWS service > Lambda.
- Attach the newly created policy.
- Name the role `WriteBoxToS3Role` and click Create role.
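If you prefer scripting these console steps, the same policy and role can be created with boto3. A sketch, assuming the policy document above is saved locally as policy.json and using a hypothetical policy name:

```python
# Sketch: create the IAM policy and the Lambda execution role programmatically.
import json
import boto3

iam = boto3.client("iam")

with open("policy.json") as f:  # the policy document shown above
    policy = iam.create_policy(PolicyName="WriteBoxToS3Policy",  # hypothetical name
                               PolicyDocument=f.read())

trust = {  # allow the Lambda service to assume this role
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow",
                   "Principal": {"Service": "lambda.amazonaws.com"},
                   "Action": "sts:AssumeRole"}],
}
iam.create_role(RoleName="WriteBoxToS3Role",
                AssumeRolePolicyDocument=json.dumps(trust))
iam.attach_role_policy(RoleName="WriteBoxToS3Role",
                       PolicyArn=policy["Policy"]["Arn"])
```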
Create the Lambda function
- In the AWS Console, go to Lambda > Functions > Create function.
- Click Author from scratch.
Provide the following configuration details:

| Setting | Value |
|---|---|
| Name | box_collaboration_to_s3 |
| Runtime | Python 3.13 |
| Architecture | x86_64 |
| Execution role | WriteBoxToS3Role |
After the function is created, open the Code tab, delete the stub, and enter the following code (box_collaboration_to_s3.py):

```python
#!/usr/bin/env python3
# Lambda: Pull Box Enterprise Events to S3 (no transform)
import os, json, time, urllib.parse
from urllib.request import Request, urlopen
from urllib.error import HTTPError, URLError
import boto3

TOKEN_URL = "https://api.box.com/oauth2/token"
EVENTS_URL = "https://api.box.com/2.0/events"

CID = os.environ["BOX_CLIENT_ID"]
CSECRET = os.environ["BOX_CLIENT_SECRET"]
ENT_ID = os.environ["BOX_ENTERPRISE_ID"]
STREAM_TYPE = os.environ.get("STREAM_TYPE", "admin_logs_streaming")
LIMIT = int(os.environ.get("LIMIT", "500"))

BUCKET = os.environ["S3_BUCKET"]
PREFIX = os.environ.get("S3_PREFIX", "box/collaboration/")
STATE_KEY = os.environ.get("STATE_KEY", "box/collaboration/state.json")

s3 = boto3.client("s3")


def get_state():
    """Read the last saved stream position from S3, if any."""
    try:
        obj = s3.get_object(Bucket=BUCKET, Key=STATE_KEY)
        data = json.loads(obj["Body"].read())
        return data.get("stream_position")
    except Exception:
        return None


def put_state(pos):
    """Persist the stream position so the next run resumes where this one stopped."""
    body = json.dumps({"stream_position": pos}, separators=(",", ":")).encode("utf-8")
    s3.put_object(Bucket=BUCKET, Key=STATE_KEY, Body=body,
                  ContentType="application/json")


def get_token():
    """Obtain an enterprise access token via the Client Credentials Grant."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": CID,
        "client_secret": CSECRET,
        "box_subject_type": "enterprise",
        "box_subject_id": ENT_ID,
    }).encode()
    req = Request(TOKEN_URL, data=body, method="POST")
    req.add_header("Content-Type", "application/x-www-form-urlencoded")
    with urlopen(req, timeout=30) as r:
        tok = json.loads(r.read().decode())
    return tok["access_token"]


def fetch_events(token, stream_position=None, timeout=60, max_retries=5):
    """Fetch one page of enterprise events, retrying on 429 and 5xx responses."""
    params = {"stream_type": STREAM_TYPE, "limit": LIMIT,
              "stream_position": stream_position or "now"}
    qs = urllib.parse.urlencode(params)
    attempt, backoff = 0, 1.0
    while True:
        try:
            req = Request(f"{EVENTS_URL}?{qs}", method="GET")
            req.add_header("Authorization", f"Bearer {token}")
            with urlopen(req, timeout=timeout) as r:
                return json.loads(r.read().decode())
        except HTTPError as e:
            if e.code == 429 and attempt < max_retries:
                ra = e.headers.get("Retry-After")
                delay = int(ra) if (ra and ra.isdigit()) else int(backoff)
                time.sleep(max(1, delay))
                attempt += 1
                backoff *= 2
                continue
            if 500 <= e.code <= 599 and attempt < max_retries:
                time.sleep(backoff)
                attempt += 1
                backoff *= 2
                continue
            raise
        except URLError:
            if attempt < max_retries:
                time.sleep(backoff)
                attempt += 1
                backoff *= 2
                continue
            raise


def write_chunk(data):
    """Optional helper (not used by the handler below): write one page to S3."""
    ts = time.strftime("%Y/%m/%d/%H%M%S", time.gmtime())
    key = f"{PREFIX.rstrip('/')}/{ts}-box-events.json"  # avoid a double slash in the key
    s3.put_object(Bucket=BUCKET, Key=key,
                  Body=json.dumps(data, separators=(",", ":")).encode("utf-8"),
                  ContentType="application/json")
    return key


def lambda_handler(event=None, context=None):
    token = get_token()
    pos = get_state()
    total, idx = 0, 0
    while True:
        page = fetch_events(token, pos)
        entries = page.get("entries") or []
        if not entries:
            next_pos = page.get("next_stream_position") or pos
            if next_pos and next_pos != pos:
                put_state(next_pos)
            break
        # Unique object key per page
        ts = time.strftime("%Y/%m/%d/%H%M%S", time.gmtime())
        key = f"{PREFIX.rstrip('/')}/{ts}-box-events-{idx:03d}.json"
        s3.put_object(Bucket=BUCKET, Key=key,
                      Body=json.dumps(page, separators=(",", ":")).encode("utf-8"),
                      ContentType="application/json")
        idx += 1
        total += len(entries)
        pos = page.get("next_stream_position") or pos
        if pos:
            put_state(pos)
        if len(entries) < LIMIT:
            break
    return {"ok": True, "written": total, "next_stream_position": pos}
```
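Before configuring the environment variables in the console, you can optionally smoke-test the handler from a workstation that has boto3 installed and AWS credentials configured. A hypothetical sketch, assuming the code above is saved locally as box_collaboration_to_s3.py; the module reads its configuration at import time, so the variables must be set first:

```python
# Hypothetical local smoke test; set configuration before importing the module.
import os

os.environ.update({
    "BOX_CLIENT_ID": "YOUR_CLIENT_ID",          # placeholder
    "BOX_CLIENT_SECRET": "YOUR_CLIENT_SECRET",  # placeholder
    "BOX_ENTERPRISE_ID": "YOUR_ENTERPRISE_ID",  # placeholder
    "S3_BUCKET": "box-collaboration-logs",
})

from box_collaboration_to_s3 import lambda_handler

# On success, prints something like:
# {'ok': True, 'written': 123, 'next_stream_position': '...'}
print(lambda_handler())
```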
Go to Configuration > Environment variables > Edit > Add new environment variable.
Enter the following environment variables, replacing the examples with your values:

| Key | Example |
|---|---|
| S3_BUCKET | box-collaboration-logs |
| S3_PREFIX | box/collaboration/ |
| STATE_KEY | box/collaboration/state.json |
| BOX_CLIENT_ID | Enter Box Client ID |
| BOX_CLIENT_SECRET | Enter Box Client Secret |
| BOX_ENTERPRISE_ID | Enter Box Enterprise ID |
| STREAM_TYPE | admin_logs_streaming |
| LIMIT | 500 |
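The same variables, plus the 600-second timeout set in the next steps, can also be applied in a single boto3 call if you prefer scripting over the console. A sketch with placeholder values:

```python
# Sketch: apply environment variables and timeout in one call.
import boto3

lmb = boto3.client("lambda")
lmb.update_function_configuration(
    FunctionName="box_collaboration_to_s3",
    Timeout=600,  # the 10-minute timeout described below
    Environment={"Variables": {
        "S3_BUCKET": "box-collaboration-logs",
        "S3_PREFIX": "box/collaboration/",
        "STATE_KEY": "box/collaboration/state.json",
        "BOX_CLIENT_ID": "YOUR_CLIENT_ID",          # placeholder
        "BOX_CLIENT_SECRET": "YOUR_CLIENT_SECRET",  # placeholder
        "BOX_ENTERPRISE_ID": "YOUR_ENTERPRISE_ID",  # placeholder
        "STREAM_TYPE": "admin_logs_streaming",
        "LIMIT": "500",
    }},
)
```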
After the function is created, stay on its page (or open Lambda > Functions > your-function).
Select the Configuration tab.
In the General configuration panel, click Edit.
Change Timeout to 10 minutes (600 seconds) and click Save.
Schedule the Lambda function (EventBridge Scheduler)
- Go to Amazon EventBridge > Scheduler > Create schedule.
- Provide the following configuration details:
- Recurring schedule: Rate (15 min).
- Target: your Lambda function.
- Name: box-collaboration-schedule-15min.
- Click Create schedule.
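The equivalent schedule can also be created with boto3. In the sketch below the ARNs are placeholders, and the RoleArn must reference a role that allows scheduler.amazonaws.com to invoke the function:

```python
# Sketch: create the EventBridge Scheduler schedule programmatically.
import boto3

sched = boto3.client("scheduler")
sched.create_schedule(
    Name="box-collaboration-schedule-15min",
    ScheduleExpression="rate(15 minutes)",
    FlexibleTimeWindow={"Mode": "OFF"},
    Target={
        # Placeholder ARNs: substitute your account, Region, and role.
        "Arn": "arn:aws:lambda:us-east-1:123456789012:function:box_collaboration_to_s3",
        "RoleArn": "arn:aws:iam::123456789012:role/SchedulerInvokeRole",
    },
)
```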
Configure a feed in Google SecOps to ingest Box logs
- Go to SIEM Settings > Feeds.
- Click Add New Feed.
- In the Feed name field, enter a name for the feed (for example, Box Collaboration).
- Select Amazon S3 V2 as the Source type.
- Select Box as the Log type.
- Click Next.
- Specify values for the following input parameters:
- S3 URI: The bucket URI (the format should be: s3://box-collaboration-logs/box/collaboration/). Replace box-collaboration-logs with the actual name of the bucket.
- Source deletion options: Select the deletion option according to your preference.
- Maximum File Age: Include files modified in the last number of days. Default is 180 Days.
- Access Key ID: User access key with access to the S3 bucket.
- Secret Access Key: User secret key with access to the S3 bucket.
- Asset namespace: The asset namespace.
- Ingestion labels: The label to be applied to the events from this feed.
- Click Next.
- Review your new feed configuration in the Finalize screen, and then click Submit.
UDM Mapping Table
| Log field | UDM mapping | Logic |
|---|---|---|
| additional_details.ekm_id | additional.fields | Value taken from additional_details.ekm_id |
| additional_details.service_id | additional.fields | Value taken from additional_details.service_id |
| additional_details.service_name | additional.fields | Value taken from additional_details.service_name |
| additional_details.shared_link_id | additional.fields | Value taken from additional_details.shared_link_id |
| additional_details.size | target.file.size | Value taken from additional_details.size |
| additional_details.version_id | additional.fields | Value taken from additional_details.version_id |
| created_at | metadata.event_timestamp | Value taken from created_at |
| created_by.id | principal.user.userid | Value taken from created_by.id |
| created_by.login | principal.user.email_addresses | Value taken from created_by.login |
| created_by.name | principal.user.user_display_name | Value taken from created_by.name |
| event_id | metadata.product_log_id | Value taken from event_id |
| event_type | metadata.product_event_type | Value taken from event_type |
| ip_address | principal.ip | Value taken from ip_address |
| source.item_id | target.file.product_object_id | Value taken from source.item_id |
| source.item_name | target.file.full_path | Value taken from source.item_name |
| source.item_type | Not mapped | |
| source.login | target.user.email_addresses | Value taken from source.login |
| source.name | target.user.user_display_name | Value taken from source.name |
| source.owned_by.id | target.user.userid | Value taken from source.owned_by.id |
| source.owned_by.login | target.user.email_addresses | Value taken from source.owned_by.login |
| source.owned_by.name | target.user.user_display_name | Value taken from source.owned_by.name |
| source.parent.id | Not mapped | |
| source.parent.name | Not mapped | |
| source.parent.type | Not mapped | |
| source.type | Not mapped | |
| type | metadata.log_type | Value taken from type |
| | metadata.vendor_name | Hardcoded value |
| | metadata.product_name | Hardcoded value |
| | security_result.action | Derived from event_type. If event_type is FAILED_LOGIN then BLOCK, if event_type is USER_LOGIN then ALLOW, otherwise UNSPECIFIED. |
| | extensions.auth.type | Derived from event_type. If event_type is USER_LOGIN or ADMIN_LOGIN then MACHINE, otherwise UNSPECIFIED. |
| | extensions.auth.mechanism | Derived from event_type. If event_type is USER_LOGIN or ADMIN_LOGIN then USERNAME_PASSWORD, otherwise UNSPECIFIED. |
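To make the mappings concrete, the following is an abridged, hypothetical Box event. The field names follow the table above, but every value is invented for illustration: here type becomes metadata.log_type, created_by.* populates the principal user, and source.item_* populates the target file.

```json
{
  "type": "event",
  "event_id": "example-event-id-0001",
  "event_type": "COLLABORATION_INVITE",
  "created_at": "2024-01-15T10:00:00-08:00",
  "created_by": {
    "id": "123456789",
    "name": "Example Admin",
    "login": "admin@example.com"
  },
  "ip_address": "198.51.100.10",
  "source": {
    "item_id": "987654321",
    "item_type": "file",
    "item_name": "quarterly-report.docx",
    "owned_by": {
      "id": "123456789",
      "name": "Example Admin",
      "login": "admin@example.com"
    }
  },
  "additional_details": {
    "size": 52428,
    "service_id": "1234",
    "service_name": "Box Sync",
    "version_id": "5678",
    "shared_link_id": "abcd1234"
  }
}
```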
Need more help? Get answers from Community members and Google SecOps professionals.