Deploying a Cloud CDN origin authentication proxy

Last reviewed 2021-09-16 UTC

This tutorial shows you how to deploy Cloud CDN with a private Amazon Simple Storage Service (S3) origin bucket. The deployment uses Cloud Run and an authentication proxy to sign CDN cache fill requests and forward them to the Amazon S3 origin bucket. This tutorial is intended for people who deploy and manage Cloud CDN infrastructure. It assumes that you're familiar with CDN concepts, Cloud CDN, Amazon S3, and Terraform.

The following architectural diagram shows the components used in this tutorial. An external Application Load Balancer is created with a Cloud CDN-enabled backend service. The backend service is connected to a Cloud Run service running the authentication proxy. The authentication proxy performs AWS Signature Version 4 signing, using credentials stored in Secret Manager. The authentication proxy then forwards signed requests to the origin bucket and returns the response to the Cloud CDN backend, which delivers the response to the client.
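The signing step can be illustrated in isolation. The following is a minimal sketch, not the authentication proxy's actual code (which this tutorial doesn't show), of how AWS Signature Version 4 derives a signing key from a secret access key by chaining HMAC-SHA256 over the date, region, and service. The function names are hypothetical, and the sketch assumes that bash and the openssl command-line tool are available. A complete signature also requires a canonical request and a string to sign, which are omitted here.

```bash
# Print the lowercase hex HMAC-SHA256 of DATA.
# KEY_SPEC is an openssl -macopt value: "key:<string>" or "hexkey:<hex>".
hmac_sha256_hex() {
  local key_spec="$1" data="$2"
  printf '%s' "$data" |
    openssl dgst -sha256 -mac HMAC -macopt "$key_spec" |
    sed 's/^.*= *//'   # strip the "(stdin)= " style prefix
}

# Derive the SigV4 signing key (hex) for one day, region, and service.
# Usage: sigv4_signing_key SECRET_ACCESS_KEY YYYYMMDD REGION SERVICE
sigv4_signing_key() {
  local secret="$1" date="$2" region="$3" service="$4"
  local k_date k_region k_service
  k_date="$(hmac_sha256_hex "key:AWS4${secret}" "$date")"
  k_region="$(hmac_sha256_hex "hexkey:${k_date}" "$region")"
  k_service="$(hmac_sha256_hex "hexkey:${k_region}" "$service")"
  hmac_sha256_hex "hexkey:${k_service}" "aws4_request"
}
```

Calling sigv4_signing_key with a secret access key, a date such as 20210401, the bucket's region, and the service name s3 prints a 64-character hexadecimal key. Because the key is scoped to a date, region, and service, it can't be reused outside that scope.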

Diagram of the architecture that supports object caching.

The following request flow diagram shows the data flow of a client performing two requests for image.png from the CDN endpoint. In the first client request, the CDN doesn't have image.png in its cache and performs a cache fill using the Cloud Run authentication proxy. In the second client request, the CDN has image.png in its cache and returns it to the client.

Path diagram of an object request before and after the object is cached.

Objectives

  • Provision the Cloud CDN authentication proxy.
  • Test the Cloud CDN authentication proxy.
  • Validate the Cloud CDN cache.

Costs

In this document, you use the following billable components of Google Cloud:

To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.

When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.

The solution also requires an Amazon S3 bucket to serve as the CDN origin. Optionally, for testing, the deployment can provide a Cloud Storage bucket to simulate an Amazon S3 bucket.

Before you begin

  1. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.


  2. Make sure that billing is enabled for your Google Cloud project.

Preparing the environment

You can complete this tutorial using Cloud Shell or your local host. Cloud Shell has Terraform pre-installed and authenticates with Google Cloud.

Cloud Shell

  • Clone the GitHub source repository in Cloud Shell.

    Open in Cloud Shell

    Cloud Shell is launched in a separate browser tab, and the GitHub source repository is cloned into the $HOME/cloudshell_open directory of your Cloud Shell environment.

Local host

Complete the following steps:

  1. Install Terraform version 0.14.11 or later.

  2. Clone the GitHub source repository.

    git clone https://github.com/GoogleCloudPlatform/cdn-auth-proxy.git

  3. To change to the directory that contains the cloned source code for this solution, use the following command:

    cd cdn-auth-proxy

Provision the CDN authentication proxy

In this tutorial, you use Terraform to configure and provision the solution. In this section, you configure Terraform variables and have Terraform provision the following resources:

This solution deploys an external Application Load Balancer. A production deployment would deploy an HTTPS frontend and the required SSL certificates. For more information, see SSL certificates overview.

Create the terraform.tfvars file

To demonstrate and test the authentication proxy, you need a storage bucket to serve as the CDN content origin. You can configure the deployment to use an Amazon S3 bucket, or to create a Cloud Storage bucket to simulate an Amazon S3 bucket. You configure the deployment by setting Terraform variables in the terraform.tfvars file.

To use an Amazon S3 bucket as the origin, follow the instructions in the Use an Amazon S3 bucket as the CDN content origin section. To have Terraform provision a Cloud Storage bucket that simulates an Amazon S3 origin bucket, follow the instructions in the Use Cloud Storage as the CDN content origin section.

Choose one method or the other—not both.

Use an Amazon S3 bucket as the CDN content origin

  1. In Cloud Shell, copy the Amazon S3 Terraform template:

    cp docs/terraform.tfvars.example-s3 terraform.tfvars
    
  2. Edit the terraform.tfvars file and replace the following variables:

    • PROJECT_ID: The Google Cloud project ID used to deploy the authentication proxy
    • REGION: The Google Cloud region in which the authentication proxy is deployed
    • S3_ORIGIN_BUCKET_NAME: Your Amazon S3 origin bucket name
    • S3_ORIGIN_BUCKET_REGION: The Amazon S3 region in which your origin bucket resides
    • ACCESS_KEY and ACCESS_SECRET: AWS IAM credentials with s3:GetObject permission on objects in the Amazon S3 origin bucket

  3. To save your changes, press Control+S (or Command+S on Mac).

  4. Upload the reference image to the Amazon S3 bucket and name the object image.png.

    If you are unfamiliar with uploading files to an Amazon S3 bucket, see Uploading objects.
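Before you provision resources, it can help to confirm that no template placeholders were left unedited. The following is a minimal sketch, assuming the template uses the literal uppercase tokens listed in step 2 as placeholder values. The function name check_tfvars_placeholders is hypothetical, and a token that appears in a comment is flagged too.

```bash
# Report placeholder tokens left in a tfvars file.
# Returns 0 when none remain, 1 when at least one is found,
# and 2 when the file doesn't exist.
check_tfvars_placeholders() {
  local file="${1:-terraform.tfvars}"
  local placeholders='PROJECT_ID|REGION|S3_ORIGIN_BUCKET_NAME|S3_ORIGIN_BUCKET_REGION|ACCESS_KEY|ACCESS_SECRET'
  if [ ! -f "$file" ]; then
    echo "File not found: ${file}" >&2
    return 2
  fi
  # -n prints line numbers, -E enables alternation, -w matches whole words.
  if grep -nEw "$placeholders" "$file"; then
    echo "Placeholders remain in ${file}" >&2
    return 1
  fi
  echo "No placeholders found in ${file}"
}
```

Run check_tfvars_placeholders from the directory that contains terraform.tfvars. Any line that it prints still contains a placeholder that needs a real value.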

Continue to the Provision Terraform resources section.

Use Cloud Storage as the CDN content origin

If you are using an Amazon S3 bucket, skip this section.

  1. In Cloud Shell, copy the Cloud Storage Terraform template:

    cp docs/terraform.tfvars.example-gcs terraform.tfvars
    
  2. Edit the terraform.tfvars file and set the following variables. Leave the gcs_interoperability = true setting unchanged.

    • PROJECT_ID: The Google Cloud project ID used to deploy the authentication proxy
    • REGION: The Google Cloud region in which the authentication proxy is deployed

  3. To save your changes, press Control+S (or Command+S on Mac).

Provision Terraform resources

In this section, you initialize Terraform, plan the Terraform deployment, and apply the deployment plan.

  1. In Cloud Shell, initialize Terraform:

    terraform init
    

    The previous command instructs Terraform to parse the configuration and install the necessary Terraform provider plugins. The output resembles the following:

    Initializing provider plugins...
    - Finding hashicorp/google versions matching "~> 3.61"...
    …
    - Installed hashicorp/time v0.7.2 (signed by HashiCorp)
    …
    Terraform has been successfully initialized!
    …
    
  2. Plan the Terraform deployment:

    terraform plan -out=tfplan
    

    The terraform plan command does the following:

    1. Parses the Terraform configuration, building a list of resources to provision.
    2. Refreshes the current state of resources already provisioned in Google Cloud.
    3. Creates a plan to make the currently provisioned resources match the parsed configuration.

    The output is similar to the following. It shows what resources to add, change, or destroy.

    …
    
    Plan: 26 to add, 0 to change, 0 to destroy.
    
    ------------------------------------------------------------------------
    
    This plan was saved to: tfplan
    
    To perform exactly these actions, run the following command to apply:
        terraform apply "tfplan"
    
  3. Apply the Terraform plan:

    terraform apply tfplan
    

    The terraform apply command tells Terraform to execute the plan and provision Google Cloud resources. This step takes several minutes to complete. When the terraform apply command finishes, it outputs details of the deployment along with an Outputs section. The Outputs section includes useful configuration values.

    The output is similar to the following:

    Apply complete! Resources: 26 added, 0 changed, 0 destroyed.
    
    The state of your infrastructure has been saved to the path
    below. This state is required to modify and destroy your
    infrastructure, so keep it safe. To inspect the complete state
    use the `terraform show` command.
    
    State path: terraform.tfstate
    
    Outputs:
    
    authn_proxy_url = https://authn-proxy-u1af2bc3de-fg.a.run.app
    cdn_public_ip = 34.120.78.255
    forwarding_rule_name = authn-proxy-lb
    project_id = cdn-test-12345
    project_number = 123456789012
    region = us-central1
    

Test the Cloud CDN authentication proxy

In this section, you test the Cloud CDN authentication proxy. To perform the test, you run multiple curl commands to request an object from Cloud CDN.

Set environment variables

In Cloud Shell, set the S3_OBJECT environment variable to the name of the object you stored in the origin bucket. If you are using the Terraform-created Cloud Storage bucket, set the variable to image.png.

export S3_OBJECT="image.png"

Use curl to make an object request

The curl command generates HTTP requests to the public IP address assigned to the provisioned external Application Load Balancer.

  1. In Cloud Shell, request the object:

    curl -w 'Time:\t%{time_total} s\n' \
      -s -O -D - "http://$(terraform output -raw cdn_public_ip)/$S3_OBJECT"
    
    • The -w 'Time:\t%{time_total} s\n' option prints the total request time in seconds.
    • The -s option silences the download progress meter.
    • The -O option writes the response body to a local file named after the remote object.
    • The -D - option writes the HTTP response headers to stdout.

    The output resembles the following:

    HTTP/1.1 404 Not Found
    Content-Type: text/html; charset=UTF-8
    Referrer-Policy: no-referrer
    Content-Length: 1570
    Date: Thu, 01 Apr 2021 05:40:50 GMT
    
    Time:   0.004059 s
    

    This example returned an HTTP/1.1 404 Not Found response. Although Terraform had finished provisioning the external Application Load Balancer, the load balancer had not yet begun serving traffic.

  2. Repeat the curl command until you receive an HTTP/1.1 200 OK response.

    curl -w 'Time:\t%{time_total} s\n' \
      -s -O -D - "http://$(terraform output -raw cdn_public_ip)/$S3_OBJECT"
    

    After receiving an HTTP/1.1 200 OK response, the output resembles the following:

    HTTP/1.1 200 OK
    last-modified: Wed, 24 Mar 2021 02:08:33 GMT
    etag: "383d9cd6cafb4c40e15356ad54689d1f"
    accept-ranges: bytes
    content-type: image/png
    X-Cloud-Trace-Context: 0ad8875a80df7ca9bd2e0c99e3786a72;o=1
    Date: Thu, 01 Apr 2021 05:42:58 GMT
    Content-Length: 2641
    Via: 1.1 google
    Cache-Control: public,max-age=86400
    
    Time:   2.092146 s
    

The first successful response doesn't have an Age header. It's also a Cloud CDN cache miss.

To fulfill the request, Cloud CDN requested the object from the authentication proxy. The request triggered Cloud Run to start the authentication proxy. The authentication proxy signed the request, forwarded the request to the origin bucket, and returned the response to Cloud CDN. This request took ~2 seconds to complete.
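Repeating the request by hand until the load balancer returns HTTP/1.1 200 OK can be automated with a small polling loop. The following is a minimal sketch. The function name wait_for_200 and its default values of 30 attempts and a 10-second delay are assumptions, not part of the tutorial's repository.

```bash
# Poll URL until it returns HTTP 200 or MAX_ATTEMPTS is reached.
# Usage: wait_for_200 URL [MAX_ATTEMPTS] [DELAY_SECONDS]
wait_for_200() {
  local url="$1" max_attempts="${2:-30}" delay="${3:-10}"
  local attempt status
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    # -o /dev/null discards the body; -w prints only the status code.
    # curl prints 000 when it receives no HTTP response.
    status="$(curl -s -o /dev/null -w '%{http_code}' "$url")" || true
    echo "attempt ${attempt}: HTTP ${status}"
    if [ "$status" = "200" ]; then
      return 0
    fi
    if ((attempt < max_attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Example (requires the deployed load balancer):
# wait_for_200 "http://$(terraform output -raw cdn_public_ip)/$S3_OBJECT"
```

The successful poll is itself a request through Cloud CDN, so it can trigger the first cache fill. If you use this loop, the timings of your later requests might differ from the sample outputs in this tutorial.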

Investigate the CDN cache

  1. In Cloud Shell, make another request for the object:

    curl -w 'Time:\t%{time_total} s\n' \
      -s -O -D - "http://$(terraform output -raw cdn_public_ip)/$S3_OBJECT"
    

    The output resembles the following:

    HTTP/1.1 200 OK
    last-modified: Wed, 24 Mar 2021 02:08:33 GMT
    etag: "383d9cd6cafb4c40e15356ad54689d1f"
    accept-ranges: bytes
    content-type: image/png
    X-Cloud-Trace-Context: f11fa095c8e78a1fb906f421aa631b86;o=1
    Date: Thu, 01 Apr 2021 05:44:11 GMT
    Server: Google Frontend
    Content-Length: 2641
    Via: 1.1 google
    Cache-Control: public,max-age=86400
    
    Time:   0.251068 s
    

    In this example, there was no Age header. Your response might have one. Google routes client requests to the Cloud CDN serving infrastructure closest to the client. The Cloud CDN serving infrastructure is made up of multiple serving instances. In this example, the Cloud CDN instance serving the request was different from the instance that served the request in the previous example. That means it required a cache fill.

    If your response contained the Age header, it means Cloud CDN served the request from the same instance that served the first request.

    Even though the request displayed in the previous output sample was a cache miss, the request was faster, taking only ~0.25 s. The response was faster because the authentication proxy was already running.

  2. Continue requesting the object until you get a response that contains an Age header.

    curl -w 'Time:\t%{time_total} s\n' \
      -s -O -D - "http://$(terraform output -raw cdn_public_ip)/$S3_OBJECT"
    

    The output contains an Age header and resembles the following:

    HTTP/1.1 200 OK
    Last-Modified: Wed, 24 Mar 2021 02:08:33 GMT
    ETag: "383d9cd6cafb4c40e15356ad54689d1f"
    accept-ranges: bytes
    Content-Type: image/png
    X-Cloud-Trace-Context: 3ca83775cf581b643ed128a50e1f2bcf
    Date: Thu, 01 Apr 2021 05:46:21 GMT
    Server: Google Frontend
    Content-Length: 2641
    Via: 1.1 google
    Age: 5
    Cache-Control: public,max-age=86400
    
    Time:   0.041623 s
    

    The request was served in ~0.04 seconds. The Age: 5 response header shows that the object was served from the Cloud CDN cache, and that it had been in the cache for 5 seconds.
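Checking for the Age header can also be scripted. The following is a minimal sketch of a hypothetical helper, cache_age, that reads response headers from stdin, prints the Age value if one is present, and signals a likely cache miss through its exit status. It assumes a POSIX awk and matches the header name case-insensitively, because the sample outputs show both lowercase and capitalized header names.

```bash
# Read HTTP response headers on stdin and print the Age value if present.
# Returns 0 when an Age header exists (served from cache), 1 otherwise.
cache_age() {
  awk -F': *' '
    tolower($1) == "age" {
      sub(/\r$/, "", $2)   # drop the trailing carriage return
      print $2
      found = 1
    }
    END { exit !found }
  '
}

# Example (requires the deployed load balancer):
# curl -s -o /dev/null -D - \
#   "http://$(terraform output -raw cdn_public_ip)/$S3_OBJECT" | cache_age
```

A zero exit status together with a printed number indicates a cached response. A nonzero exit status means that no Age header was found, which in this tutorial corresponds to a cache fill.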

View requests with Cloud CDN logging

The client can determine whether the response is cached by looking at the Age header. To view the disposition of requests to Cloud CDN, you can use Cloud CDN logging.

  1. In Cloud Shell, generate thousands of requests. Those requests generate Cloud CDN logs.

    hey -n 5000 "http://$(terraform output -raw cdn_public_ip)/$S3_OBJECT"
    

    The output resembles the following:

    Summary:
      Total:        0.2857 secs
      Slowest:      0.1829 secs
      Fastest:      0.0005 secs
      Average:      0.0020 secs
      Requests/sec: 17503.7307
    
      Total data:   13205000 bytes
      Size/request: 2641 bytes
    
    Response time histogram:
      0.000 [1]     |
      0.019 [4972]  |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
      0.037 [0]     |
      0.055 [0]     |
      0.073 [0]     |
      0.092 [0]     |
      0.110 [0]     |
      0.128 [0]     |
      0.146 [0]     |
      0.165 [0]     |
      0.183 [27]    |
    
    Latency distribution:
      10% in 0.0006 secs
      25% in 0.0007 secs
      50% in 0.0009 secs
      75% in 0.0011 secs
      90% in 0.0017 secs
      95% in 0.0024 secs
      99% in 0.0060 secs
    
    Details (average, fastest, slowest):
      DNS+dialup:   0.0000 secs, 0.0005 secs, 0.1829 secs
      DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0000 secs
      req write:    0.0000 secs, 0.0000 secs, 0.0036 secs
      resp wait:    0.0018 secs, 0.0004 secs, 0.1804 secs
      resp read:    0.0001 secs, 0.0000 secs, 0.0072 secs
    
    Status code distribution:
      [200] 5000 responses
    

    The hey HTTP load generation tool generated 5,000 object requests.

    The hey command collected request statistics and created a response-time histogram and latency distribution.

    The hey command also indicated that all 5,000 requests completed successfully, that all requests were processed in ~0.29 seconds, and that the ninety-ninth percentile latency was 0.0060 seconds.

    To retrieve details on the most recent Cloud CDN cache hit, you use Cloud Logging in the next step.

  2. Get the most recent Cloud CDN request log for the authn-proxy-lb forwarding rule:

    gcloud logging read \
      "resource.type=http_load_balancer AND \
      resource.labels.forwarding_rule_name=authn-proxy-lb AND \
      jsonPayload.statusDetails=response_from_cache" \
      --limit=1 --format=json --project="$(terraform output -raw project_id)"
    

    For a cache hit, Cloud CDN logging sets the jsonPayload.statusDetails field to either byte_range_caching or response_from_cache. Because the hey command requested the whole object, you can filter on the response_from_cache value.

    The command also passes the --limit=1 parameter to gcloud logging read, so the output contains only the most recent Cloud CDN log entry that was generated by authn-proxy-lb with a response_from_cache status.

    The output resembles the following:

    [
      {
        "httpRequest": {
          "cacheHit": true,
          "cacheLookup": true,
          "latency": "0.000142s",
          "remoteIp": "35.238.224.141",
          "requestMethod": "GET",
          "requestSize": "120",
          "requestUrl": "http://34.117.126.218/image.png",
          "responseSize": "2998",
          "status": 200,
          "userAgent": "hey/0.0.1"
        },
        "insertId": "e5kteqfen7kh2",
        "jsonPayload": {
          "@type": "type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry",
          "cacheId": "CBF-6041a7",
          "statusDetails": "response_from_cache"
        },
        "logName": "projects/cdn-test-306000/logs/requests",
        "receiveTimestamp": "2021-04-01T06:00:11.829756617Z",
        "resource": {
          "labels": {
            "backend_service_name": "",
            "forwarding_rule_name": "authn-proxy-lb",
            "project_id": "cdn-test-306000",
            "target_proxy_name": "authn-proxy-http-proxy",
            "url_map_name": "authn-proxy-urlmap",
            "zone": "global"
          },
          "type": "http_load_balancer"
        },
        "severity": "INFO",
        "spanId": "8430d1b40bd5a872",
        "timestamp": "2021-04-01T06:0