Use Cloud Storage FUSE file caching

The Cloud Storage FUSE file cache is a client-based read cache that lets repeated file reads be served from faster cache storage of your choice. This page describes how to enable and use Cloud Storage FUSE file caching.

Benefits of file caching

File caching provides the following benefits:

  • Improves performance for small and random I/Os: to learn more about file caching and its benefits, see Benefits of file caching. In all cases, the data, or each individual large file, must fit within the file cache directory's available capacity, which is controlled by the max-size-mb property.

  • Leverages parallel downloads: parallel downloads are enabled automatically on Cloud Storage FUSE versions 2.12 and later when the file cache is enabled. Parallel downloads utilize multiple workers to download a file in parallel using the file cache directory as a prefetch buffer, which can result in up to nine times faster model load time. We recommend that you use parallel downloads for single-threaded read scenarios that load large files such as model serving and checkpoint restores.

Parallel downloads

Parallel downloads can improve read performance by using multiple workers to download multiple parts of a file in parallel using the file cache directory as a prefetch buffer. The enable-parallel-downloads property is automatically set to true when you enable file caching. We recommend using parallel downloads for read scenarios that load large files such as model serving, checkpoint restores, and training on large objects.

Use cases for enabling file caching with parallel downloads include the following:

  • Training: enable file caching if the data you want to access is read multiple times, whether the same file is read repeatedly or different offsets of the same file are read. If the dataset is larger than the file cache's capacity, keep the file cache disabled and enable cache-file-for-range-read instead.

  • Serving model weights and checkpoint reads: enable file caching with parallel downloads, which loads large files much faster than when file caching and parallel downloads are disabled.
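As a rough sketch of the serving use case above, a minimal Cloud Storage FUSE configuration file might look like the following. The path is a placeholder, and you should verify field names against your Cloud Storage FUSE version:

```yaml
# Illustrative sketch: file cache with parallel downloads for model serving.
cache-dir: /mnt/localssd/gcsfuse-cache  # placeholder path on fast local storage
file-cache:
  max-size-mb: -1                       # use all available capacity in cache-dir
  enable-parallel-downloads: true      # default when the file cache is enabled (v2.12+)
```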

Before you begin

The file cache requires a directory path in which to cache files. You can create a new directory on an existing file system, or create a new file system on newly provisioned storage. If you're provisioning new storage, use the following instructions to create a new file system:

  1. For Google Cloud Hyperdisk, see Create a new Google Cloud Hyperdisk volume.

  2. For Persistent Disk, see Create a new Persistent Disk volume.

  3. For Local SSDs, see Add a Local SSD to your VM.

  4. For in-memory RAM disks, see Creating in-memory RAM disks.
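For the RAM disk option, one common approach is a tmpfs mount, sketched below. The size and mount point are illustrative placeholders, and the commands require root privileges:

```shell
# Sketch: back the file cache with an in-memory RAM disk (tmpfs).
# Data in tmpfs is lost on unmount or reboot, which is acceptable for a cache.
sudo mkdir -p /mnt/gcsfuse-ram-cache
sudo mount -t tmpfs -o size=16G tmpfs /mnt/gcsfuse-ram-cache
```

You can then point cache-dir at the tmpfs mount point.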

Enable and configure file caching behavior

Enable and configure file caching using either a Cloud Storage FUSE configuration file or Cloud Storage FUSE CLI options. For sample configurations, see Sample configuration for enabling file caching and parallel downloads.

  1. Specify the cache directory by using the file-cache:cache-dir field or the --cache-dir option; either one enables the file cache for non-Google Kubernetes Engine deployments. If you're using a Google Kubernetes Engine deployment, specify the file-cache:max-size-mb property instead.

  2. To limit the total capacity the Cloud Storage FUSE cache can use within its mounted directory, adjust the max-size-mb property, which is automatically set to -1 (no limit) when you set the cache-dir property. You can also specify a value in MiB to limit the cache size.

  3. Set the metadata-cache:ttl-secs option to -1 to bypass the TTL expiration of cached entries and serve file metadata from the cache whenever it's available. The default is 60 seconds, and a value of -1 sets the TTL to unlimited. You can also specify a high value based on your requirements. We recommend setting ttl-secs as high as your workload allows. For more information about setting a TTL for cached entries, see Time to live.

  4. Set the file-cache:cache-file-for-range-read option to true to asynchronously load the entire file into the cache when a file's first read starts from an offset other than 0. Subsequent reads at different offsets of the same file can then be served from the cache.

  5. Optional: configure stat caching and type caching using the metadata-cache properties. To learn more about stat and type caches, see Overview of stat caching or Overview of type caching.

  6. Before you run your workload, manually run the ls -R command on your mounted bucket to pre-populate metadata in a faster, batched way, ensuring the type cache is populated ahead of the first read. For more information about how to improve first-time read performance, see Improve first-time reads.

Once you enable file caching, parallel downloads are enabled automatically on Cloud Storage FUSE versions 2.12 and later. If you're using an older version of Cloud Storage FUSE, set the enable-parallel-downloads option to true.
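The steps above can be collected into one sample configuration file. All paths and sizes below are illustrative, and you should verify field names against your Cloud Storage FUSE version:

```yaml
# Illustrative configuration file combining the steps above.
cache-dir: /mnt/localssd/gcsfuse-cache  # step 1: cache directory (placeholder path)
file-cache:
  max-size-mb: 51200                    # step 2: cap the cache at 50 GiB (-1 = no limit)
  cache-file-for-range-read: true       # step 4: cache whole files on non-zero-offset first reads
metadata-cache:
  ttl-secs: -1                          # step 3: never expire cached metadata
```

You would then mount with something like `gcsfuse --config-file=config.yaml BUCKET MOUNT_POINT` and run `ls -R` on the mount point once before the workload (step 6).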

Configure supporting properties for parallel downloads

You can optionally configure the following supporting properties for parallel downloads:

  • parallel-downloads-per-file: the maximum number of workers that can be spawned per file to download the object from Cloud Storage into the file cache. The default value is 16.

  • max-parallel-downloads: the maximum number of workers that can be spawned at any given time across all file download jobs. The default is twice the number of CPU cores on your machine. To specify no limit, enter a value of -1.

  • download-chunk-size-mb: the size of each read request in MiB that each worker makes to Cloud Storage when downloading the object into the file cache. The default size is 200 MiB. Note that a parallel download is triggered only if the file being read is at least the specified chunk size.
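Putting the three properties together, a tuning sketch in the configuration file might look like this (values are the stated defaults, shown only for illustration):

```yaml
# Illustrative sketch: tuning parallel downloads in the file-cache section.
file-cache:
  enable-parallel-downloads: true
  parallel-downloads-per-file: 16  # max workers per file (default)
  max-parallel-downloads: -1       # no global worker limit (default: 2 x CPU cores)
  download-chunk-size-mb: 200      # per-request size in MiB (default)
```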

Disable parallel downloads

To disable parallel downloads, set the enable-parallel-downloads property to false in your Cloud Storage FUSE configuration file.
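For example, the relevant fragment of the configuration file would be:

```yaml
# Keep the file cache enabled but turn off parallel downloads.
file-cache:
  enable-parallel-downloads: false
```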

What's next