Collect Box Collaboration JSON logs

This document explains how to ingest Box Collaboration JSON logs into Google Security Operations using Amazon S3, with an AWS Lambda function and an EventBridge schedule. The parser processes Box event logs in JSON format and maps them to the Unified Data Model (UDM). It extracts relevant fields from the raw logs, performs data transformations such as renaming and merging, and enriches the data with intermediary information before outputting the structured event data.

Before you begin

  • Google SecOps instance
  • Privileged access to Box (Admin + Developer Console)
  • Privileged access to AWS (S3, IAM, Lambda, EventBridge) in the same Region where you plan to store the logs

Configure Box Developer Console (Client Credentials)

  1. Sign in to Box Developer Console.
  2. Create a Custom App with Server Authentication (Client Credentials Grant).
  3. Set Application Access = App + Enterprise Access.
  4. In Application Scopes, enable Manage enterprise properties.
  5. In Admin Console > Apps > Custom Apps Manager, Authorize the app by Client ID.
  6. Copy and save the Client ID and Client Secret in a secure location.
  7. Go to Admin Console > Account & Billing > Account Information.
  8. Copy and save the Enterprise ID in a secure location.
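
To sanity-check the credentials before building the AWS side, you can request a token with the same Client Credentials Grant that the Lambda function below uses. This is an optional sketch; the three placeholder values are the Client ID, Client Secret, and Enterprise ID you just saved.

    # Optional smoke test: request a Box access token via the Client Credentials Grant.
    import json, urllib.parse
    from urllib.request import Request, urlopen

    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": "YOUR_CLIENT_ID",           # placeholder
        "client_secret": "YOUR_CLIENT_SECRET",   # placeholder
        "box_subject_type": "enterprise",
        "box_subject_id": "YOUR_ENTERPRISE_ID",  # placeholder
    }).encode()

    req = Request("https://api.box.com/oauth2/token", data=body, method="POST")
    req.add_header("Content-Type", "application/x-www-form-urlencoded")
    with urlopen(req, timeout=30) as r:
        print("token ok:", "access_token" in json.loads(r.read().decode()))

If the request fails with a 400-level error, confirm that the app was authorized in the Admin Console (step 5).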

Configure AWS S3 bucket and IAM for Google SecOps

  1. Create an Amazon S3 bucket following this user guide: Creating a bucket
  2. Save the bucket Name and Region for future reference (for example, box-collaboration-logs).
  3. Create a user following this user guide: Creating an IAM user.
  4. Select the created User.
  5. Select the Security credentials tab.
  6. Click Create Access Key in the Access Keys section.
  7. Select Third-party service as the Use case.
  8. Click Next.
  9. Optional: add a description tag.
  10. Click Create access key.
  11. Click Download CSV file to save the Access Key and Secret Access Key for later use.
  12. Click Done.
  13. Select the Permissions tab.
  14. Click Add permissions in the Permissions policies section.
  15. Select Attach policies directly.
  16. Search for and select the AmazonS3FullAccess policy.
  17. Click Next.
  18. Click Add permissions.
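
Optionally, you can confirm the new key pair can write to the bucket before moving on; this boto3 sketch assumes the key values, Region, and bucket name you saved earlier (all placeholders below).

    # Optional check: confirm the access key can upload to the bucket.
    import boto3

    s3 = boto3.client(
        "s3",
        aws_access_key_id="YOUR_ACCESS_KEY_ID",          # placeholder
        aws_secret_access_key="YOUR_SECRET_ACCESS_KEY",  # placeholder
        region_name="us-east-1",                         # placeholder Region
    )

    s3.put_object(Bucket="box-collaboration-logs",
                  Key="box/collaboration/connectivity-check.txt",
                  Body=b"ok")
    resp = s3.list_objects_v2(Bucket="box-collaboration-logs", Prefix="box/collaboration/")
    print("objects under prefix:", resp.get("KeyCount", 0))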

Configure the IAM policy and role for S3 uploads

  1. In the AWS console, go to IAM > Policies > Create policy > JSON tab.
  2. Enter the following policy:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "AllowPutBoxObjects",
          "Effect": "Allow",
          "Action": ["s3:PutObject"],
          "Resource": "arn:aws:s3:::box-collaboration-logs/*"
        },
        {
          "Sid": "AllowGetStateObject",
          "Effect": "Allow",
          "Action": ["s3:GetObject"],
          "Resource": "arn:aws:s3:::box-collaboration-logs/box/collaboration/state.json"
        }
      ]
    }
    
    
    • Replace box-collaboration-logs if you entered a different bucket name.
  3. Click Next > Create policy.

  4. Go to IAM > Roles > Create role > AWS service > Lambda.

  5. Attach the newly created policy.

  6. Name the role WriteBoxToS3Role and click Create role.
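
If you prefer to script this section, the same policy and role can be created with boto3. This is a sketch rather than the console flow itself; the policy name WriteBoxToS3Policy is an assumption (the steps above never name the policy), and your credentials must be allowed to manage IAM.

    # Optional: create the upload policy and Lambda execution role with boto3.
    import json
    import boto3

    iam = boto3.client("iam")

    policy_doc = {
        "Version": "2012-10-17",
        "Statement": [
            {"Sid": "AllowPutBoxObjects", "Effect": "Allow",
             "Action": ["s3:PutObject"],
             "Resource": "arn:aws:s3:::box-collaboration-logs/*"},
            {"Sid": "AllowGetStateObject", "Effect": "Allow",
             "Action": ["s3:GetObject"],
             "Resource": "arn:aws:s3:::box-collaboration-logs/box/collaboration/state.json"},
        ],
    }
    trust_doc = {  # allows the Lambda service to assume the role
        "Version": "2012-10-17",
        "Statement": [{"Effect": "Allow",
                       "Principal": {"Service": "lambda.amazonaws.com"},
                       "Action": "sts:AssumeRole"}],
    }

    policy = iam.create_policy(PolicyName="WriteBoxToS3Policy",  # assumed name
                               PolicyDocument=json.dumps(policy_doc))
    iam.create_role(RoleName="WriteBoxToS3Role",
                    AssumeRolePolicyDocument=json.dumps(trust_doc))
    iam.attach_role_policy(RoleName="WriteBoxToS3Role",
                           PolicyArn=policy["Policy"]["Arn"])

If you create the role this way rather than through the console, also attach the AWS managed policy AWSLambdaBasicExecutionRole so the function can write CloudWatch logs.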

Create the Lambda function

  1. In the AWS Console, go to Lambda > Functions > Create function.
  2. Click Author from scratch.
  3. Provide the following configuration details:

    Setting        | Value
    Name           | box_collaboration_to_s3
    Runtime        | Python 3.13
    Architecture   | x86_64
    Execution role | WriteBoxToS3Role
  4. After the function is created, open the Code tab, delete the stub and enter the following code (box_collaboration_to_s3.py):

    #!/usr/bin/env python3
    # Lambda: Pull Box Enterprise Events to S3 (no transform)
    
    import os, json, time, urllib.parse
    from urllib.request import Request, urlopen
    from urllib.error import HTTPError, URLError
    import boto3
    
    TOKEN_URL = "https://api.box.com/oauth2/token"
    EVENTS_URL = "https://api.box.com/2.0/events"
    
    CID         = os.environ["BOX_CLIENT_ID"]
    CSECRET     = os.environ["BOX_CLIENT_SECRET"]
    ENT_ID      = os.environ["BOX_ENTERPRISE_ID"]
    STREAM_TYPE = os.environ.get("STREAM_TYPE", "admin_logs_streaming")
    LIMIT       = int(os.environ.get("LIMIT", "500"))
    BUCKET      = os.environ["S3_BUCKET"]
    PREFIX      = os.environ.get("S3_PREFIX", "box/collaboration/")
    STATE_KEY   = os.environ.get("STATE_KEY", "box/collaboration/state.json")
    
    s3 = boto3.client("s3")
    
    def get_state():
        # Read the saved stream position from S3; returns None on the first run.
        try:
            obj = s3.get_object(Bucket=BUCKET, Key=STATE_KEY)
            data = json.loads(obj["Body"].read())
            return data.get("stream_position")
        except Exception:
            return None

    def put_state(pos):
        # Persist the stream position so the next run resumes where this one stopped.
        body = json.dumps({"stream_position": pos}, separators=(",", ":")).encode("utf-8")
        s3.put_object(Bucket=BUCKET, Key=STATE_KEY, Body=body, ContentType="application/json")
    
    def get_token():
        # Obtain an enterprise access token via the Client Credentials Grant.
        body = urllib.parse.urlencode({
            "grant_type": "client_credentials",
            "client_id": CID,
            "client_secret": CSECRET,
            "box_subject_type": "enterprise",
            "box_subject_id": ENT_ID,
        }).encode()
        req = Request(TOKEN_URL, data=body, method="POST")
        req.add_header("Content-Type", "application/x-www-form-urlencoded")
        with urlopen(req, timeout=30) as r:
            tok = json.loads(r.read().decode())
        return tok["access_token"]
    
    def fetch_events(token, stream_position=None, timeout=60, max_retries=5):
        # Fetch one page of enterprise events, retrying 429/5xx with exponential backoff.
        params = {"stream_type": STREAM_TYPE, "limit": LIMIT, "stream_position": stream_position or "now"}
        qs = urllib.parse.urlencode(params)
        attempt, backoff = 0, 1.0
        while True:
            try:
                req = Request(f"{EVENTS_URL}?{qs}", method="GET")
                req.add_header("Authorization", f"Bearer {token}")
                with urlopen(req, timeout=timeout) as r:
                    return json.loads(r.read().decode())
            except HTTPError as e:
                if e.code == 429 and attempt < max_retries:
                    ra = e.headers.get("Retry-After")
                    delay = int(ra) if (ra and ra.isdigit()) else int(backoff)
                    time.sleep(max(1, delay)); attempt += 1; backoff *= 2; continue
                if 500 <= e.code <= 599 and attempt < max_retries:
                    time.sleep(backoff); attempt += 1; backoff *= 2; continue
                raise
            except URLError:
                if attempt < max_retries:
                    time.sleep(backoff); attempt += 1; backoff *= 2; continue
                raise
    
    def write_chunk(page, idx):
        # Write one page of events to S3 under a unique, time-based key.
        ts = time.strftime("%Y/%m/%d/%H%M%S", time.gmtime())
        key = f"{PREFIX.rstrip('/')}/{ts}-box-events-{idx:03d}.json"
        s3.put_object(Bucket=BUCKET, Key=key,
                      Body=json.dumps(page, separators=(",", ":")).encode("utf-8"),
                      ContentType="application/json")
        return key
    
    def lambda_handler(event=None, context=None):
        # Page through the event stream, writing each page to S3 and checkpointing.
        token = get_token()
        pos = get_state()
        total, idx = 0, 0
        while True:
            page = fetch_events(token, pos)
            entries = page.get("entries") or []
            if not entries:
                next_pos = page.get("next_stream_position") or pos
                if next_pos and next_pos != pos:
                    put_state(next_pos)
                break
    
            # Write this page under a unique object key
            write_chunk(page, idx)
            idx += 1
            total += len(entries)
    
            pos = page.get("next_stream_position") or pos
            if pos:
                put_state(pos)
    
            if len(entries) < LIMIT:
                break
    
        return {"ok": True, "written": total, "next_stream_position": pos}
    
    
  5. Go to Configuration > Environment variables > Edit > Add new environment variable.

  6. Enter the following environment variables, replacing with your values:

    Key               | Example
    S3_BUCKET         | box-collaboration-logs
    S3_PREFIX         | box/collaboration/
    STATE_KEY         | box/collaboration/state.json
    BOX_CLIENT_ID     | Your Box Client ID
    BOX_CLIENT_SECRET | Your Box Client Secret
    BOX_ENTERPRISE_ID | Your Box Enterprise ID
    STREAM_TYPE       | admin_logs_streaming
    LIMIT             | 500
  7. After the function is created, stay on its page (or open Lambda > Functions > your-function).

  8. Select the Configuration tab.

  9. In the General configuration panel, click Edit.

  10. Change Timeout to 10 minutes (600 seconds) and click Save.
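
The same configuration can be applied and smoke-tested with boto3; the sketch below assumes the function name and environment values from this section (replace the Box placeholders with your own).

    # Optional: set the environment and timeout, then run a test invocation.
    import json
    import boto3

    lam = boto3.client("lambda")

    lam.update_function_configuration(
        FunctionName="box_collaboration_to_s3",
        Timeout=600,  # 10 minutes
        Environment={"Variables": {
            "S3_BUCKET": "box-collaboration-logs",
            "S3_PREFIX": "box/collaboration/",
            "STATE_KEY": "box/collaboration/state.json",
            "BOX_CLIENT_ID": "YOUR_CLIENT_ID",          # placeholder
            "BOX_CLIENT_SECRET": "YOUR_CLIENT_SECRET",  # placeholder
            "BOX_ENTERPRISE_ID": "YOUR_ENTERPRISE_ID",  # placeholder
            "STREAM_TYPE": "admin_logs_streaming",
            "LIMIT": "500",
        }},
    )

    resp = lam.invoke(FunctionName="box_collaboration_to_s3", Payload=b"{}")
    print(json.loads(resp["Payload"].read()))

A successful run returns {"ok": true, ...}; once Box has new events, you should also see dated objects under box/collaboration/ in the bucket.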

Schedule the Lambda function (EventBridge Scheduler)

  1. Go to Amazon EventBridge > Scheduler > Create schedule.
  2. Provide the following configuration details:
    • Recurring schedule: Rate (15 min).
    • Target: your Lambda function.
    • Name: box-collaboration-schedule-15min.
  3. Click Create schedule.
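
The schedule can also be created with the EventBridge Scheduler API. The sketch below assumes your function's ARN and an execution role that EventBridge Scheduler can assume to call lambda:InvokeFunction (the role name here is hypothetical).

    # Optional: create the 15-minute schedule with boto3.
    import boto3

    scheduler = boto3.client("scheduler")

    scheduler.create_schedule(
        Name="box-collaboration-schedule-15min",
        ScheduleExpression="rate(15 minutes)",
        FlexibleTimeWindow={"Mode": "OFF"},
        Target={
            # Placeholders: your Lambda ARN and a scheduler execution role.
            "Arn": "arn:aws:lambda:us-east-1:123456789012:function:box_collaboration_to_s3",
            "RoleArn": "arn:aws:iam::123456789012:role/BoxSchedulerInvokeRole",
        },
    )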

Configure a feed in Google SecOps to ingest Box logs

  1. Go to SIEM Settings > Feeds.
  2. Click Add New Feed.
  3. In the Feed name field, enter a name for the feed (for example, Box Collaboration).
  4. Select Amazon S3 V2 as the Source type.
  5. Select Box as the Log type.
  6. Click Next.
  7. Specify values for the following input parameters:
    • S3 URI: The bucket URI in the format s3://box-collaboration-logs/box/collaboration/. Replace box-collaboration-logs with the actual name of your bucket.
    • Source deletion options: Select the deletion option according to your preference.
    • Maximum File Age: Include files modified within the last number of days. The default is 180 days.
    • Access Key ID: User access key with access to the S3 bucket.
    • Secret Access Key: User secret key with access to the S3 bucket.
    • Asset namespace: The asset namespace.
    • Ingestion labels: The label to be applied to the events from this feed.
  8. Click Next.
  9. Review your new feed configuration in the Finalize screen, and then click Submit.

UDM Mapping Table

Log field | UDM mapping | Logic
additional_details.ekm_id | additional.fields | Value taken from additional_details.ekm_id
additional_details.service_id | additional.fields | Value taken from additional_details.service_id
additional_details.service_name | additional.fields | Value taken from additional_details.service_name
additional_details.shared_link_id | additional.fields | Value taken from additional_details.shared_link_id
additional_details.size | target.file.size | Value taken from additional_details.size
additional_details.version_id | additional.fields | Value taken from additional_details.version_id
created_at | metadata.event_timestamp | Value taken from created_at
created_by.id | principal.user.userid | Value taken from created_by.id
created_by.login | principal.user.email_addresses | Value taken from created_by.login
created_by.name | principal.user.user_display_name | Value taken from created_by.name
event_id | metadata.product_log_id | Value taken from event_id
event_type | metadata.product_event_type | Value taken from event_type
ip_address | principal.ip | Value taken from ip_address
source.item_id | target.file.product_object_id | Value taken from source.item_id
source.item_name | target.file.full_path | Value taken from source.item_name
source.item_type | Not mapped |
source.login | target.user.email_addresses | Value taken from source.login
source.name | target.user.user_display_name | Value taken from source.name
source.owned_by.id | target.user.userid | Value taken from source.owned_by.id
source.owned_by.login | target.user.email_addresses | Value taken from source.owned_by.login
source.owned_by.name | target.user.user_display_name | Value taken from source.owned_by.name
source.parent.id | Not mapped |
source.parent.name | Not mapped |
source.parent.type | Not mapped |
source.type | Not mapped |
type | metadata.log_type | Value taken from type
| metadata.vendor_name | Hardcoded value
| metadata.product_name | Hardcoded value
| security_result.action | Derived from event_type. If event_type is FAILED_LOGIN then BLOCK, if event_type is USER_LOGIN then ALLOW, otherwise UNSPECIFIED.
| extensions.auth.type | Derived from event_type. If event_type is USER_LOGIN or ADMIN_LOGIN then MACHINE, otherwise UNSPECIFIED.
| extensions.auth.mechanism | Derived from event_type. If event_type is USER_LOGIN or ADMIN_LOGIN then USERNAME_PASSWORD, otherwise UNSPECIFIED.
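
As a worked example of the table, a hypothetical USER_LOGIN event (all values below are fabricated for illustration) would map as follows:

    # Illustrative only: a fabricated Box event and the UDM fields
    # the mapping table implies for it.
    box_event = {
        "type": "event",                       # -> metadata.log_type
        "event_id": "a1b2c3d4",                # -> metadata.product_log_id
        "event_type": "USER_LOGIN",            # -> metadata.product_event_type
        "created_at": "2024-05-01T12:00:00Z",  # -> metadata.event_timestamp
        "ip_address": "203.0.113.10",          # -> principal.ip
        "created_by": {
            "id": "1234567",                   # -> principal.user.userid
            "login": "jdoe@example.com",       # -> principal.user.email_addresses
            "name": "Jane Doe",                # -> principal.user.user_display_name
        },
    }
    # Because event_type is USER_LOGIN, the parser also derives:
    #   security_result.action    = ALLOW
    #   extensions.auth.type      = MACHINE
    #   extensions.auth.mechanism = USERNAME_PASSWORD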

Need more help? Get answers from Community members and Google SecOps professionals.