Viewing Airflow logs


This page describes how to access and view the Apache Airflow logs for Cloud Composer.

Log types

Cloud Composer has the following Airflow logs:

  • Airflow logs: These logs are associated with individual DAG tasks. You can view the task logs in the Cloud Storage logs folder associated with the Cloud Composer environment. You can also view the logs in the Airflow web interface.
  • Streaming logs: These logs are a superset of the logs in Airflow. To access streaming logs, go to the Logs tab of the Environment details page in the Google Cloud console, use Cloud Logging, or use Cloud Monitoring.

    Logging and Monitoring quotas apply.

    To learn about Cloud Logging and Cloud Monitoring for your Cloud Composer environment, see Monitoring environments.

Logs in Cloud Storage

When you create an environment, Cloud Composer creates a Cloud Storage bucket and associates the bucket with your environment. Cloud Composer stores logs for individual DAG tasks in the logs folder in the bucket.

Log folder directory structure

The logs folder includes a folder for each workflow that has run in the environment. Each workflow folder includes a folder for its DAGs and sub-DAGs. Each of these folders contains the log files for each task. The log filename indicates when the task started.

The following example shows the logs directory structure for an environment.

BUCKET
   └───dags
   |   │
   |   |   dag_1
   |   |   dag_2
   |   |   ...
   |
   └───logs
       │
       └───dag_1
       |   │
       |   └───task_1
       |   |   │   datefile_1
       |   |   │   datefile_2
       |   |   │   ...
       |   |
       |   └───task_2
       |       │   datefile_1
       |       │   datefile_2
       |       │   ...
       |
       └───dag_2
           │   ...
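
The layout described above (the logs folder, then a DAG folder, then a task folder, then a log file) can be sketched as a small path parser. This is a minimal illustration that assumes the simplified three-level layout from the example; the names `TaskLogPath` and `parse_task_log_path` are hypothetical, and actual object names in a bucket may differ depending on the Airflow version.

```python
from typing import NamedTuple


class TaskLogPath(NamedTuple):
    """Parts of a task log object name (hypothetical helper type)."""

    dag: str
    task: str
    log_file: str


def parse_task_log_path(object_name: str) -> TaskLogPath:
    """Split 'logs/<dag>/<task>/<log file>' into its parts.

    Assumption: the simplified layout from the example tree. Real
    environments may nest log files differently.
    """
    parts = object_name.strip("/").split("/")
    if len(parts) < 4 or parts[0] != "logs":
        raise ValueError(f"not a task log path: {object_name!r}")
    # Everything after the task folder is kept as the log file name,
    # so deeper nesting is preserved rather than silently dropped.
    return TaskLogPath(dag=parts[1], task=parts[2], log_file="/".join(parts[3:]))
```

For example, `parse_task_log_path("logs/dag_1/task_1/datefile_1")` yields the DAG folder, task folder, and log file name as separate fields, which can help when grouping a bucket listing by task.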

Log retention

To prevent data loss, logs saved in the environment's bucket are not deleted when you delete your environment. You must manually delete logs from your environment's bucket.

Logs stored in the environment's bucket use the policy of the bucket. Cloud Composer creates buckets with the default policy, which keeps data forever.

For logs stored in Cloud Logging, Cloud Composer uses the retention periods of the _Default and user-defined log buckets.

Before you begin

You must have a role that can view objects in environment buckets. For more information, see Access control.

Viewing task logs in Cloud Storage

To view the log files for DAG tasks:

  1. To list the log files, enter the following command, replacing BUCKET with the name of your environment's bucket:

    gsutil ls -r gs://BUCKET/logs

  2. (Optional) To copy a single log or a subfolder, enter the following command, replacing the VARIABLES with appropriate values:
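
The command for the optional copy step does not appear on this page. As a hedged sketch, both steps can be scripted by building `gsutil` argument lists in Python. The copy step below assumes the standard recursive copy, `gsutil cp -r`; the function names, the log path, and the destination are hypothetical placeholders, not values taken from this page.

```python
import shlex


def gsutil_ls_logs(bucket: str) -> list[str]:
    """Argument list for recursively listing the logs folder (step 1)."""
    return ["gsutil", "ls", "-r", f"gs://{bucket}/logs"]


def gsutil_cp_log(bucket: str, log_path: str, destination: str) -> list[str]:
    """Argument list for copying one log file or subfolder (step 2).

    Assumptions: `gsutil cp -r` performs the copy, and log_path is given
    relative to the logs folder in the environment's bucket.
    """
    source = f"gs://{bucket}/logs/{log_path.strip('/')}"
    return ["gsutil", "cp", "-r", source, destination]


if __name__ == "__main__":
    # Placeholder values; substitute your own bucket, log path, and target.
    for argv in (
        gsutil_ls_logs("BUCKET"),
        gsutil_cp_log("BUCKET", "dag_1/task_1", "./task_1_logs"),
    ):
        print(shlex.join(argv))
```

Each returned list can be passed to `subprocess.run(argv, check=True)` on a machine where `gsutil` is installed and authenticated. Building a list rather than a single string avoids shell quoting problems in bucket or folder names.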


Viewing streaming logs in the Google Cloud console

Cloud Composer produces the following logs:

  • airflow: The uncategorized logs that Airflow pods generate.
  • airflow-upgrade-db: The logs that the Airflow database initialization job generates (previously airflow-database-init-job).
  • airflow-scheduler: The logs the Airflow scheduler generates.
  • dag-processor-manager: The logs of the DAG processor manager (the part of the scheduler that processes DAG files).
  • airflow-triggerer: The logs the Airflow triggerer generates.
  • airflow-webserver: The logs the Airflow web interface generates.
  • airflow-worker: The logs generated as part of workflow and DAG execution.
  • The logs Admin Activity generates.
  • composer-agent: The logs generated as part of create and update environment operations.
  • gcs-syncd: The logs generated by the file syncing processes.
  • build-log-worker-scheduler: The logs from the local build of the Airflow worker image (during upgrades and Python package installation).
  • build-log-webserver: The logs from the build of the Airflow webserver image (during upgrades and Python package installation).
  • airflow-monitoring: The logs that Airflow monitoring generates.

You can view these logs on the Logs tab of the Environment details page or in Cloud Logging.

To view the streaming logs on the Environment details page:

  1. In the Google Cloud console, go to the Environments page.

  2. Find the name of the environment you want to inspect in the list. Click the environment name to open the Environment details page, then select the Logs tab.

  3. Select the subcategory of the logs you want to see and choose the time interval to inspect with the time-range selector in the upper-left corner.

To view the streaming logs in Cloud Logging:

  1. Go to the Logs Explorer in the Google Cloud console.

  2. Select the logs you want to see.

    You can filter by properties such as log file and level, predefined label, task name, workflow, and execution date. For more information about selecting and filtering logs, see Using the Logs Explorer.

    To learn about exporting logs, see Configure and manage sinks.
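
As an illustration of filtering in the Logs Explorer, the following sketch builds a query string for one of the logs listed earlier. It assumes that Cloud Composer entries use the `cloud_composer_environment` monitored resource type with an `environment_name` label, and that the Logging query language `log_id()` function selects the log by name. The function name and the example environment name are hypothetical, so verify the fields against your own log entries before relying on them.

```python
from typing import Optional


def composer_log_filter(
    environment_name: str,
    log_name: str,
    min_severity: Optional[str] = None,
) -> str:
    """Build a Logs Explorer query for one Cloud Composer log.

    Assumptions: the resource type and label names below match the
    entries that your environment writes to Cloud Logging.
    """
    clauses = [
        'resource.type="cloud_composer_environment"',
        f'resource.labels.environment_name="{environment_name}"',
        f'log_id("{log_name}")',
    ]
    if min_severity:
        # Severity names such as WARNING or ERROR are written in upper case.
        clauses.append(f"severity>={min_severity.upper()}")
    return " AND ".join(clauses)


if __name__ == "__main__":
    # Hypothetical environment name; airflow-worker is one of the logs above.
    print(composer_log_filter("example-environment", "airflow-worker", "error"))
```

The resulting string can be pasted into the Logs Explorer query field and then narrowed further with the time-range selector.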

What's next