Performance considerations

This page provides guidance on configuring your Parallelstore environment to obtain the best performance.

General recommendations

  • Remove any alias on ls to improve default performance. On many systems, ls is aliased to ls --color=auto, which is much slower with the default Parallelstore configuration (see the example after this list).

  • If the performance of list operations is slow, consider enabling caching for the dfuse mount.
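
For example, a minimal way to bypass or remove the ls alias in the shell session where you run listings; the mount point shown is a placeholder:

    # Bypass the alias for a single invocation
    \ls /mnt/parallelstore/mydir

    # Or remove the alias for the current shell session
    unalias ls 2>/dev/null || true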

Interception library

The libioil library can be used to improve the performance of read and write operations to dfuse from applications that use libc. The library bypasses the kernel by intercepting POSIX read and write calls from the application and servicing them directly in user space. See Interception library for more details.

In most cases, we recommend enabling the interception library on a per-process or per-application basis.

Situations in which you may not want or need to use the interception library include the following:

  • Only applications built with libc can use the interception library; applications that don't use libc won't benefit.
  • If you have a workload that benefits from caching, such as accessing the same files repeatedly, we recommend not using the interception library.
  • If your workload is metadata-intensive, such as working with many small files, or a very large directory listing, the interception library likely won't improve performance.

LD_PRELOAD can be exported as an environment variable in your shell, but doing so applies the library to every process started from that shell, which can cause problems. We recommend instead specifying it with each command.

Alternatively, it's possible to link the interception library into your application at compile time with the -lioil flag.
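
For example, a per-command invocation looks like the following sketch; the library path and application name are assumptions and will vary with your client image:

    # Preload the interception library for this command only
    # (library path and application are placeholders; adjust for your system)
    LD_PRELOAD=/usr/lib64/libioil.so ./my_app /mnt/parallelstore/input.dat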

dfuse caching

Caching is enabled in dfuse by default.

There are two cache-related flags used by dfuse when mounting a Parallelstore instance:

  • --disable-wb-cache uses write-through rather than write-back caching.
  • --disable-caching disables all caching.
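
For example, a mount that keeps caching enabled but switches writes to write-through might look like the following sketch; the mount point, pool, and container names are placeholders, and any other flags your deployment needs are omitted:

    # Keep read caching, but use write-through rather than write-back writes
    # (mount point, pool, and container names are placeholders)
    dfuse --mountpoint=/mnt/parallelstore \
        --pool=default-pool --container=default-container \
        --disable-wb-cache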

The following suggestions apply to caching and performance:

  • The interception library bypasses write-back caching, so we recommend specifying --disable-wb-cache when you use it.
  • If your workload involves reading many files once, you should disable caching.
  • For workloads that involve many clients modifying files, and the updates need to be available immediately to other clients, you must disable caching.
  • If your workload is reading the same files repeatedly, caching can improve performance. This is particularly true if the files fit into your clients' memory. dfuse uses the Linux page cache for its caching.
  • For workloads that consist of small I/Os to large files, in addition to enabling caching, increasing dfuse read-ahead may be beneficial. To increase dfuse read-ahead after dfuse has been mounted, run the following commands as root:

    # Replace /mnt with your dfuse mount point
    echo 4096 > /sys/class/bdi/$(mountpoint -d /mnt)/read_ahead_kb
    echo 100 > /sys/class/bdi/$(mountpoint -d /mnt)/max_ratio

If your workloads involve a mixture of the preceding scenarios, you can mount the same Parallelstore instance to multiple mount points with different caching settings.
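
For example, the following sketch mounts the same instance twice, once with caching for read-heavy jobs and once with caching disabled for files that must be immediately visible to other clients; all names are placeholders:

    # Cached mount for workloads that re-read the same files
    dfuse --mountpoint=/mnt/parallelstore-cached \
        --pool=default-pool --container=default-container

    # Uncached mount for files whose updates must be visible to other clients
    dfuse --mountpoint=/mnt/parallelstore-uncached \
        --pool=default-pool --container=default-container \
        --disable-caching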

Thread count and event queue count

When mounting your Parallelstore instance, we recommend the following values for --thread-count and --eq-count:

  • The thread count value should not exceed the number of vCPU cores.
  • The maximum recommended thread count value is between 16 and 20. Beyond this number, there is little or no performance benefit, regardless of the number of available cores.
  • The event queue value should be half of the thread count value.

If your workload involves a very high number of small file operations and heavy metadata access, you can experiment with increasing the numbers beyond these recommendations.
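
For example, the following sketch derives both values from the client's vCPU count, capping the thread count at 16 and using half of it for the event queue count; the mount point, pool, and container names are placeholders:

    # Thread count: number of vCPUs, capped at 16
    THREADS=$(nproc)
    if [ "$THREADS" -gt 16 ]; then THREADS=16; fi
    # Event queue count: half the thread count
    EQS=$(( THREADS / 2 ))

    dfuse --mountpoint=/mnt/parallelstore \
        --pool=default-pool --container=default-container \
        --thread-count="$THREADS" --eq-count="$EQS"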

File striping setting

File striping is a data storage technique where a file is divided into blocks, or stripes, and distributed across multiple storage targets. File striping can increase performance by allowing parallel reads and writes to more than one storage target backing the instance.

When creating your Parallelstore instance, you can specify one of three file striping settings:

  • Minimum
  • Balanced
  • Maximum

These settings can have a significant impact on your Parallelstore performance. For most workloads, we recommend the balanced setting, which is a reasonable compromise between the other two. If performance with the balanced setting is not acceptable:

  • The minimum setting may improve performance for workloads with many small files, particularly when the average file size is less than 256KB.

  • The maximum setting may improve performance for workloads with very large files, generally greater than 8GB, especially when many clients are sharing access to the same files.

For advanced tuning, the daos tool provides per-file or per-directory settings. Experimenting with advanced tuning comes with performance-related risks and is generally not recommended. See Understanding Data Redundancy and Sharding in DAOS for more details.

Directory striping setting

When creating your Parallelstore instance, you can specify one of three directory striping settings:

  • Minimum
  • Balanced
  • Maximum

For most workloads, we recommend the maximum setting.

For workloads that involve frequent listing of large directories, the balanced or minimum settings can result in better list performance. However, the performance of other operations, particularly file creation, may suffer.

multi-user

When using the dfuse tool to mount your Parallelstore instance, we recommend specifying the --multi-user flag. This flag tells the kernel to make the file system available to all users on a client, rather than only the user running the dfuse process. dfuse then appears like a generic multi-user file system, and the standard chown and chgrp calls are enabled. All file system entries are owned by the user that created them, as is normal in a POSIX file system.

When specifying the --multi-user flag, you must also update /etc/fuse.conf as root by adding the following line:

    user_allow_other
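
For example, the following sketch adds the line if it isn't already present and then mounts the instance with --multi-user; the mount point, pool, and container names are placeholders:

    # Allow non-root users to access FUSE mounts (add the line only once)
    grep -qx user_allow_other /etc/fuse.conf || \
        echo user_allow_other | sudo tee -a /etc/fuse.conf

    # Mount the instance so it is available to all users on the client
    dfuse --mountpoint=/mnt/parallelstore \
        --pool=default-pool --container=default-container \
        --multi-user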

There doesn't appear to be a performance implication to mounting your instance as multi-user.

Erasure coding setting

Erasure coding is set to 2+1 and cannot be changed. Any I/O that doesn't use EC 2+1 is rejected.

Google Kubernetes Engine sidecar container resource allocation

In most cases, unsatisfactory performance with Google Kubernetes Engine and Parallelstore is caused by insufficient CPU or memory allocated to the Parallelstore sidecar container. To properly allocate resources, consider the following suggestions:

  • Read the considerations highlighted in Configure resources for the sidecar container to learn why you might need to increase the resource allocation, and how to configure the sidecar container's resources using Pod annotations.

  • You can use the value 0 to turn off resource limits and requests on Standard clusters. For example, setting gke-parallelstore/cpu-limit: 0 and gke-parallelstore/memory-limit: 0 leaves the sidecar container's CPU and memory limits empty and uses the default requests. This setting is useful when you don't know how many resources dfuse needs for your workloads and want it to use all available resources on a node. Once you've determined how many resources dfuse needs based on your workload metrics, you can set appropriate limits. See the Pod sketch after this list.

  • On Autopilot clusters, you can't use the value 0 to unset the sidecar container's resource limits and requests. Instead, explicitly set a larger resource limit for the sidecar container, and rely on Google Cloud metrics to decide whether an increase is needed.
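
For example, the following sketch shows a Pod on a Standard cluster with the annotations that unset the sidecar limits; the Pod name, image, and workload are hypothetical, and the Parallelstore volume configuration itself is omitted:

    # Create a Pod whose annotations unset the Parallelstore sidecar
    # resource limits (Standard clusters only; Pod name, image, and
    # workload are hypothetical).
    kubectl apply -f - <<'EOF'
    apiVersion: v1
    kind: Pod
    metadata:
      name: parallelstore-workload
      annotations:
        gke-parallelstore/cpu-limit: "0"
        gke-parallelstore/memory-limit: "0"
    spec:
      containers:
      - name: app
        image: busybox
        command: ["sleep", "3600"]
    EOF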