The Cloud Storage FUSE file cache is a client-based read cache that lets repeated file reads be served from faster cache storage of your choice. This page describes how to enable and use Cloud Storage FUSE file caching.
Benefits of file caching
File caching provides the following benefits:
- Improves performance for small and random I/Os: to learn more about file caching and its benefits, see Benefits of file caching. In all cases, the data, or an individual large file, must fit within the file cache directory's available capacity, which is controlled by the `max-size-mb` property.
- Leverages parallel downloads: parallel downloads are enabled automatically on Cloud Storage FUSE versions 2.12 and later when the file cache is enabled. Parallel downloads use multiple workers to download a file in parallel, using the file cache directory as a prefetch buffer, which can result in up to nine times faster model load times. We recommend parallel downloads for single-threaded read scenarios that load large files, such as model serving and checkpoint restores.
Parallel downloads
Parallel downloads can improve read performance by using multiple workers to download multiple parts of a file in parallel, using the file cache directory as a prefetch buffer. The `enable-parallel-downloads` property is automatically set to `true` when you enable file caching. We recommend using parallel downloads for read scenarios that load large files, such as model serving, checkpoint restores, and training on large objects.
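As an illustrative sketch, a minimal Cloud Storage FUSE configuration file that enables the file cache (and, on versions 2.12 and later, parallel downloads automatically) might look like the following. The cache path is a placeholder, and key placement follows recent gcsfuse config-file layouts, which can differ between versions:

```yaml
# config.yaml -- minimal sketch; adjust the path for your environment.
cache-dir: /mnt/local-ssd/gcsfuse-cache   # fast local storage for cached files
file-cache:
  max-size-mb: -1   # -1 = use all available capacity in cache-dir
```

You would then pass this file at mount time with the `--config-file` option, for example `gcsfuse --config-file=config.yaml BUCKET MOUNT_POINT`.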
Use cases for enabling file caching with parallel downloads include the following:
| Use case type | Description |
|---|---|
| Training | Enable file caching if the data you want to access is read multiple times, whether the same file is read multiple times or different offsets of the same file are read. If the dataset is larger than the file cache, leave the file cache disabled and enable `cache-file-for-range-read` instead. |
| Serving model weights and checkpoint reads | Enable file caching with parallel downloads, which loads large files much faster than when file caching and parallel downloads aren't used. |
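For the range-read case, a sketch of the relevant configuration-file fragment, using the `file-cache:cache-file-for-range-read` property described later on this page:

```yaml
# Fragment of config.yaml -- illustrative only.
file-cache:
  cache-file-for-range-read: true  # asynchronously cache the whole file when the
                                   # first read starts at a non-zero offset
```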
Before you begin
The file cache requires a directory path to use for caching files. You can create a new directory on an existing file system, or create a new file system on provisioned storage. If you're provisioning new storage, use the following instructions to create a new file system:
For Google Cloud Hyperdisk, see Create a new Google Cloud Hyperdisk volume.
For Persistent Disk, see Create a new Persistent Disk volume.
For Local SSDs, see Add a Local SSD to your VM.
For in-memory RAM disks, see Creating in-memory RAM disks.
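As a sketch, creating a cache directory on an existing file system takes a single command; the path shown is a placeholder, and the commented-out alternative backs the cache with an in-memory RAM disk (tmpfs), which requires root and consumes RAM up to the given size:

```shell
# Create a dedicated cache directory on an existing file system.
# The path is an example; point it at your fastest available storage.
mkdir -p /tmp/gcsfuse-cache

# Alternative: back the cache directory with a tmpfs RAM disk.
# sudo mount -t tmpfs -o size=8G tmpfs /tmp/gcsfuse-cache
```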
Enable and configure file caching behavior
Enable and configure file caching using either a Cloud Storage FUSE configuration file or Cloud Storage FUSE CLI options. For sample configurations, see Sample configuration for enabling file caching and parallel downloads.
1. Specify the cache directory you want to use with the `file-cache:cache-dir` field or the `--cache-dir` option, which enable the file cache for non-Google Kubernetes Engine deployments. If you're using a Google Kubernetes Engine deployment, specify the `file-cache:max-size-mb` property.
2. To limit the total capacity the Cloud Storage FUSE cache can use within its mounted directory, adjust the `max-size-mb` property, which is automatically set to `-1` (unlimited) when you set the `cache-dir` property. You can also specify a value in MiB to limit the cache size.
3. Set the `metadata-cache:ttl-secs` option to `-1` to bypass the TTL expiration of cached entries and serve file metadata from the cache whenever it's available. The default is 60 seconds, and a value of `-1` sets the TTL to unlimited. You can also specify a high value based on your requirements; we recommend setting `ttl-secs` as high as your workload allows. For more information about setting a TTL for cached entries, see Time to live.
4. Set the `file-cache:cache-file-for-range-read` option to `true` to asynchronously load the entire file into the cache when a file's first read operation starts from anywhere other than offset `0`, so that subsequent reads of different offsets of the same file can be served from the cache.
5. Optional: configure stat caching and type caching using the `metadata-cache` properties. To learn more about stat and type caches, see Overview of type caching or Overview of stat caching.
6. Manually run the `ls -R` command on your mounted bucket before you run your workload to pre-populate metadata, so that the type cache is populated ahead of the first read in a faster, batched way. For more information about improving first-time read performance, see Improve first-time reads.
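Taken together, the steps above can be sketched as a single configuration file. Paths and values are illustrative, and key placement follows recent gcsfuse config-file layouts:

```yaml
# config.yaml -- illustrative sketch, not a definitive template.
cache-dir: /mnt/local-ssd/gcsfuse-cache
file-cache:
  max-size-mb: 100000              # cap the cache at ~100 GiB (value in MiB)
  cache-file-for-range-read: true  # cache whole files on non-zero-offset first reads
metadata-cache:
  ttl-secs: -1                     # never expire cached metadata entries
```

Pre-populating metadata before the workload might then look like `ls -R MOUNT_POINT > /dev/null`.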
Once you enable file caching, parallel downloads are enabled automatically on Cloud Storage FUSE versions 2.12 and later. If you're using an older version of Cloud Storage FUSE, set the `enable-parallel-downloads` option to `true`.
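On versions earlier than 2.12, a sketch of enabling parallel downloads explicitly in the configuration file:

```yaml
# Fragment of config.yaml -- only needed on gcsfuse versions before 2.12,
# where parallel downloads aren't enabled automatically with the file cache.
file-cache:
  enable-parallel-downloads: true
```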
Configure supporting properties for parallel downloads
You can optionally configure the following supporting properties for parallel downloads:

- `parallel-downloads-per-file`: the maximum number of workers that can be spawned per file to download the object from Cloud Storage into the file cache. The default value is `16`.
- `max-parallel-downloads`: the maximum number of workers that can be spawned at any given time across all file download jobs. The default is twice the number of CPU cores on your machine. To specify no limit, enter a value of `-1`.
- `download-chunk-size-mb`: the size in MiB of each read request that each worker makes to Cloud Storage when downloading the object into the file cache. The default size is 200 MiB. Note that a parallel download is triggered only if the file being read is at least the specified size.
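A sketch of these supporting properties in the configuration file, shown with the documented defaults (and `-1` for no global worker limit):

```yaml
# Fragment of config.yaml -- illustrative values.
file-cache:
  parallel-downloads-per-file: 16  # workers per file (documented default)
  max-parallel-downloads: -1       # no global limit; default is 2 x CPU cores
  download-chunk-size-mb: 200      # MiB per worker read request (documented default)
```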
Disable parallel downloads
To disable parallel downloads, set the `enable-parallel-downloads` property to `false` in your Cloud Storage FUSE configuration file.
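As a minimal sketch, the corresponding configuration-file fragment:

```yaml
# Fragment of config.yaml.
file-cache:
  enable-parallel-downloads: false
```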
What's next
Learn how to improve Cloud Storage FUSE performance.