This page provides guidance on configuring your Parallelstore environment to obtain the best performance.
General recommendations
Remove any alias from `ls` for improved default performance. On many systems, `ls` is aliased to `ls --color=auto`, which is much slower with the default Parallelstore configuration.
If the performance of list operations is slow, consider enabling caching for the dfuse mount.
Interception library
The `libioil` library can be used to improve the performance of read and write
operations to DFuse from applications that use libc. The library bypasses the
kernel by intercepting POSIX read and write calls from the application and
servicing them directly in user space. See
Interception library for
more details.
In most cases, we recommend using the interception library on a per-process or per-application invocation.
Situations in which you may not want or need to use the interception library include the following:
- Only applications built with libc can use the interception library.
- If you have a workload that benefits from caching, such as accessing the same files repeatedly, we recommend not using the interception library.
- If your workload is metadata-intensive, such as working with many small files, or a very large directory listing, the interception library likely won't improve performance.
`LD_PRELOAD` can be set as an environment variable in your
shell environment, but doing so can sometimes cause problems. We recommend
instead specifying it with each command.
Alternatively, it's possible to link the interception library into your
application at compile time with the -lioil
flag.
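The per-command form looks like the sketch below. The library path and application name are assumptions; adjust them for your installation. The runnable part demonstrates why per-command assignment is safe: the variable is scoped to that single command and does not leak into the surrounding shell.

```shell
# Recommended form: set LD_PRELOAD for a single invocation only.
# (The path to libioil.so and the application are illustrative.)
#   LD_PRELOAD=/usr/lib64/libioil.so ./my_app
#
# Generic demonstration of per-command scoping:
FOO=bar sh -c 'echo "$FOO"'   # prints: bar
echo "${FOO:-unset}"          # prints: unset
```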
dfuse caching
Caching is enabled in `dfuse` by default.
There are two cache-related flags used by `dfuse` when mounting a
Parallelstore instance:
- `--disable-wb-cache` uses write-through rather than write-back caching.
- `--disable-caching` disables all caching.
The following suggestions apply to caching and performance:
- If you're using the interception library, write-back caching is bypassed. We recommend specifying `--disable-wb-cache` when using the interception library.
- If your workload involves reading many files once, you should disable caching.
- For workloads that involve many clients modifying files, and the updates need to be available immediately to other clients, you must disable caching.
- If your workload is reading the same files repeatedly, caching can improve
performance. This is particularly true if the files fit into your clients'
memory.
`dfuse` uses the Linux page cache for its caching. For workloads which consist of small I/Os to large files, increasing dfuse read-ahead in addition to enabling caching may be beneficial. To increase dfuse read-ahead, run the following commands after `dfuse` has been mounted:

```shell
echo 4096 > /sys/class/bdi/$(mountpoint -d /mnt)/read_ahead_kb
echo 100 > /sys/class/bdi/$(mountpoint -d /mnt)/max_ratio
```
If your workloads involve a mixture of the preceding scenarios, you can mount the same Parallelstore instance to multiple mount points with different caching settings.
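As a sketch, two dfuse mounts of the same instance with different caching behavior might look like the following. The mount points, pool, and container names are assumptions for illustration.

```shell
# Mount with default caching for read-heavy, repeated-access workloads.
dfuse --mountpoint=/mnt/ps-cached \
      --pool=default-pool --container=default-container

# Mount the same instance with caching disabled for workloads that need
# updates visible immediately across clients.
dfuse --mountpoint=/mnt/ps-uncached \
      --pool=default-pool --container=default-container \
      --disable-caching
```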
Thread count and event queue count
When mounting your Parallelstore instance, we recommend the following values
for `--thread-count` and `--eq-count`:
- The thread count value should not exceed the number of vCPU cores.
- The maximum recommended thread count value is between 16 and 20. Beyond this number, there is little or no performance benefit, regardless of the number of available cores.
- The event queue value should be half of the thread count value.
If your workload involves a very high number of small file operations and heavy metadata access, you can experiment with increasing the numbers beyond these recommendations.
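These recommendations can be expressed as a small calculation — a sketch that assumes the client's vCPU count is available from `nproc`:

```shell
# Derive recommended dfuse tuning values: cap the thread count at
# min(vCPUs, 16), and use half as many event queues as threads.
cores=$(nproc)
threads=$(( cores < 16 ? cores : 16 ))
eq=$(( threads / 2 ))
echo "dfuse --thread-count=$threads --eq-count=$eq ..."
```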
File striping setting
File striping is a data storage technique where a file is divided into blocks, or stripes, and distributed across multiple storage targets. File striping can increase performance by allowing parallel reads and writes to more than one storage target backing the instance.
When creating your Parallelstore instance, you can specify one of three file striping settings:
- Minimum
- Balanced
- Maximum
These settings can have a significant impact on your Parallelstore performance. For most workloads, we recommend the balanced setting, which is a reasonable compromise. If performance with the balanced setting is not acceptable:
- The minimum setting may improve performance for workloads with many small files, particularly when the average file size is less than 256KB.
- The maximum setting may improve performance for workloads with very large files, generally greater than 8GB, especially when many clients share access to the same files.
For advanced tuning, the daos
tool provides per-file or per-directory
settings. Experimenting with advanced tuning comes with performance-related
risks and is generally not recommended. See
Understanding Data Redundancy and Sharding in DAOS for more
details.
Directory striping setting
When creating your Parallelstore instance, you can specify one of three directory striping settings:
- Minimum
- Balanced
- Maximum
For most workloads, we recommend the maximum setting.
For workloads which involve lots of listing of large directories, the balanced or minimum settings can result in better list performance. However, the performance of other operations, particularly file creation, may suffer.
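Both striping levels are chosen at instance creation time. With the gcloud CLI, the creation command might look like the sketch below; the instance name, location, capacity, and network are placeholders, and the flag names should be verified against `gcloud parallelstore instances create --help`.

```shell
# Sketch: create an instance with balanced file striping and maximum
# directory striping. All values are illustrative.
gcloud parallelstore instances create my-instance \
  --location=us-central1-a \
  --capacity-gib=12000 \
  --network=my-network \
  --file-stripe-level=file-stripe-level-balanced \
  --directory-stripe-level=directory-stripe-level-max
```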
multi-user
When using the dfuse
tool to mount your Parallelstore instance, we recommend
specifying the --multi-user
flag. This flag tells the kernel to make the file
system available to all users on a client, rather than only the user running
the DFuse process. DFuse then appears like a generic multi-user file
system and the standard chown
and chgrp
calls are enabled. All file
system entries are owned by the user that created them, as is normal in a
POSIX file system.
When specifying the --multi-user
flag, you must also update /etc/fuse.conf
as root by adding the following line:
user_allow_other
Mounting your instance as multi-user doesn't appear to have any performance impact.
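A mount invocation with multi-user enabled might look like the following sketch; the mount point and pool/container names are assumptions.

```shell
# Make the mount visible to all users on the client, not just the user
# running the DFuse process. Requires user_allow_other in /etc/fuse.conf.
dfuse --mountpoint=/mnt/parallelstore \
      --pool=default-pool \
      --container=default-container \
      --multi-user
```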
Erasure coding setting
Erasure coding is set to 2+1 and cannot be changed. Any I/O that doesn't use EC 2+1 is rejected.
Google Kubernetes Engine sidecar container resource allocation
In most cases, unsatisfactory performance with Google Kubernetes Engine and Parallelstore is caused by insufficient CPU or memory allocated to the Parallelstore sidecar container. To properly allocate resources, consider the following suggestions:
Read the considerations highlighted in Configure resources for the sidecar container, which explains why you might need to increase the resource allocation and how to configure it using Pod annotations.
You can use the value `0` to turn off resource limits or requests on Standard clusters. For example, setting `gke-parallelstore/cpu-limit: 0` and `gke-parallelstore/memory-limit: 0` leaves the sidecar container's CPU and memory limits empty, and the default requests are used. This setting is useful when you don't know how many resources dfuse needs for your workloads and want it to use all available resources on a node. Once you've determined how many resources dfuse needs based on your workload metrics, you can set appropriate limits.
On Autopilot clusters, you cannot use the value `0` to unset the sidecar container resource limits and requests. Instead, explicitly set a larger resource limit for the sidecar container, and rely on Google Cloud metrics to decide whether the limit needs to be increased.