Last updated (UTC): 2025-08-18.

# Best practices for file system transfers

This page describes best practices for file system transfers.

Performance best practices
--------------------------

The following are best practices for ensuring good transfer performance:

- [Maximize your transfer agent performance](#sizing-agents).

- Benchmark your performance by transferring a large data corpus, typically at
  least 100 GB in size.

  Storage Transfer Service is a large-scale, throughput-optimized service, so
  your performance on very small test data sets is not indicative of your
  performance on large data sets in production.

- Limit individual source folders to 1 million files. Directories containing
  millions of files can slow down the entire transfer.

- Run agents in separate virtual machines (VMs) so that you can scale your
  resource consumption more effectively.

- Verify that the network interface on the agent machines is sized for the
  read/write bandwidth you need.

  For example, to fully utilize a 20 Gbps wide-area network (WAN), your agent
  machine's network interface must support 20 Gbps to read data from your
  networked file system and another 20 Gbps to transfer data to Cloud Storage,
  for 40 Gbps of total bandwidth.

- Monitor the CPU, memory, and network on agent machines to ensure that the
  machines aren't overwhelmed by other workloads, which can degrade transfer
  performance.
  Refer to the
  [agent hardware requirements](/storage-transfer/docs/on-prem-set-up#requirements)
  for suggested memory and CPU figures.

Multipart uploads
-----------------

If your transfer is from a POSIX file system to Cloud Storage, or between
POSIX file systems, consider enabling
[multipart uploads](/storage/docs/multipart-uploads).
Multipart uploads can speed up transfers that include large files by up to
300% by breaking large files (>1 GiB) into smaller parts and uploading those
parts in parallel.

HDFS and S3-compatible file systems do not support multipart uploads.

### Enable multipart uploads

To enable multipart uploads:

- You must
  [grant the required permissions](/storage-transfer/docs/file-system-permissions#multipart_uploads)
  to the account authorizing transfer agents, either a user account or a
  service account.

- The destination or intermediate bucket must not have a retention policy or
  object hold.

Once enabled, Storage Transfer Service uses multipart uploading automatically
when doing so is likely to speed up a transfer.

### Configure multipart object lifecycle rules

You can use Cloud Storage Object Lifecycle Management to abort an incomplete
multipart upload and delete the associated parts.
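As a minimal sketch, such a rule can be written to a lifecycle configuration file and applied with the `gcloud storage` CLI. The bucket name `gs://my-transfer-bucket` is a placeholder:

```shell
# Lifecycle rule that aborts multipart uploads still incomplete after
# 7 days (the age recommended in this guide) and deletes their parts.
cat > lifecycle.json <<'EOF'
{
  "rule": [
    {
      "action": {"type": "AbortIncompleteMultipartUpload"},
      "condition": {"age": 7}
    }
  ]
}
EOF

# Apply the configuration to the destination bucket (placeholder name):
# gcloud storage buckets update gs://my-transfer-bucket --lifecycle-file=lifecycle.json
```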
See
[Abort incomplete multipart uploads](/storage/docs/lifecycle#abort-mpu) in the
Cloud Storage documentation.

We recommend setting an `age` value of 7 days.

### Disable multipart uploads

To disable multipart uploads, reinstall the transfer agents using `docker run`
and pass `--enable-multipart=false`:

```
sudo docker run --ulimit memlock=64000000 -d --rm \
-v /usr/local/research:/usr/local/research \
gcr.io/cloud-ingest/tsop-agent:latest \
--project-id=PROJECT_ID \
--agent-pool=AGENT_POOL \
--creds-file=CREDENTIAL_FILE \
--hostname=$(hostname) \
--enable-multipart=false
```

Replace the following:

- `PROJECT_ID`: the ID of the project that is hosting the transfer.
- `AGENT_POOL`: the name of the agent pool that the agents belong to.
- `CREDENTIAL_FILE`: if the transfer agent is using a service account for
  authentication, the path to a JSON-formatted service account credential
  file.

Alternatively,
[revoke the required permissions](/storage-transfer/docs/file-system-permissions#multipart_uploads)
from the account authorizing transfer agents, either a user account or a
service account.

Maximize transfer agent performance
-----------------------------------

Your transfer performance is affected by the following variables:

- File system capabilities.

- Underlying hardware limitations.

  The hard drive media type, input/output bus, and local area network (LAN)
  connectivity all affect performance.

- WAN throughput and utilization.

  A slower or highly utilized WAN slows performance.

- File characteristics.

  For example, transfers of many large files achieve higher network throughput
  than transfers of many small files, due to per-file networking overhead.

Because of these variables, we can't predict actual performance or provide an
optimal number of agents to use.

At a minimum, we recommend that you use three agents, across different
machines if possible, so that your transfer remains fault-tolerant.
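One way to stage this minimum three-agent layout is to generate the launch command for each agent VM into a review file, then run the matching command on each machine. This is a sketch; the hostnames are placeholders, and `PROJECT_ID`, `AGENT_POOL`, and `CREDENTIAL_FILE` are the same values described for agent installation:

```shell
# Write one agent launch command per VM (placeholder hostnames) into a
# review file; run the matching command on each machine.
: > agent-commands.txt
for host in agent-vm-1 agent-vm-2 agent-vm-3; do
  {
    echo "# on $host:"
    echo 'sudo docker run --ulimit memlock=64000000 -d --rm \'
    echo '  gcr.io/cloud-ingest/tsop-agent:latest \'
    echo '  --project-id=PROJECT_ID \'
    echo '  --agent-pool=AGENT_POOL \'
    echo '  --creds-file=CREDENTIAL_FILE \'
    echo '  --hostname=$(hostname)'
  } >> agent-commands.txt
done
```

Keeping one agent per VM, as generated here, also matches the earlier recommendation to run agents in separate machines so resource consumption scales independently.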
You can add transfer agents while transfers are running, and performance
increases dynamically as agents join the pool.

To observe the impact of adding agents, and to choose the number of agents
that works best for your environment, do the following:

1. Start a large transfer that takes at least 1 hour to run. For example,
   start a transfer that contains at least 100,000 files and is at least
   100 GB in total size.

2. Use
   [Cloud Monitoring to observe the overall agent throughput](https://cloud.google.com/storage-transfer/docs/managing-on-prem-agents#monitor-agent-stackdriver).

3. Wait for the throughput to level off, and determine whether you are
   limited by your WAN capacity or your bandwidth cap.

4. If you haven't saturated your WAN capacity, and you haven't reached your
   desired transfer limit, add another agent. The additional agent
   automatically increases transfer throughput. Wait approximately 3 minutes
   for the throughput to stabilize in Cloud Monitoring.

Repeat steps 3 and 4, adding one agent at a time, until you reach your desired
limit. As long as computational, file system, and network resources are
available, you can run up to 100 agents concurrently per agent pool.

If you saturate your outbound bandwidth before you reach your desired limit,
you can do either of the following:

- [Control the bandwidth used by Storage Transfer Service](/storage-transfer/docs/obtaining-bandwidth-on-prem).
- [Obtain more network bandwidth](/storage-transfer/docs/obtaining-bandwidth-on-prem#moar_bandwidth).

If you've added agents but the throughput isn't increasing and your WAN isn't
saturated,
[investigate the file system throughput](/storage-transfer/docs/troubleshooting-on-prem#slow-transfer-speed).
In rare cases, the file system throughput itself is saturated, limiting your
ability to increase transfer performance.

Naming agents
-------------

When naming agents, we recommend that you do the following:

- Always include the hostname in your agent name.
  This helps you find the machine an agent is running on. We recommend that
  you pass `--hostname=$(hostname)` to the Docker `run` command.

- Choose an agent prefix scheme that helps you identify agents in the context
  of your monitoring and infrastructure organization. For example:

  - If you have three separate transfer projects, you may want to include the
    team name in your agent prefix, for example, `logistics`.

  - If you are running two transfer projects for two different data centers,
    you may want to include the data center name in the agent prefix, for
    example, `omaha`.
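The prefix schemes above can be combined with the hostname in a small naming helper. This is a minimal sketch; the function name and the composed format are our own convention, not part of the product:

```shell
# Hypothetical helper: compose an agent name from a team or data-center
# prefix plus the local hostname, for use in the value you pass to
# --hostname or in your own monitoring labels.
agent_name() {
  local prefix=$1
  printf '%s-%s\n' "$prefix" "$(hostname)"
}

agent_name logistics   # team-based prefix
agent_name omaha       # data-center-based prefix
```

A consistent scheme like this makes it easy to filter agents by team or data center in Cloud Monitoring while still identifying the machine each agent runs on.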