Last updated (UTC): 2025-08-11.

TPU v3
======

This document describes the architecture and supported configurations of
Cloud TPU v3.

System architecture
-------------------

Each TPU v3 chip contains two TensorCores. Each TensorCore has two
matrix-multiply units (MXUs), a vector unit, and a scalar unit. The following
table shows the key specifications and their values for a v3 TPU Pod.

The following diagram illustrates a TPU v3 chip.

Architectural details and performance characteristics of TPU v3 are available in
[A Domain Specific Supercomputer for Training Deep Neural Networks](https://dl.acm.org/doi/pdf/10.1145/3360307).

### Performance benefits of TPU v3 over v2

The increased FLOPS per TensorCore and memory capacity in TPU v3 configurations
can improve the performance of your models in the following ways:

- TPU v3 configurations provide significant per-TensorCore performance benefits
  for compute-bound models. Models that are memory-bound on TPU v2
  configurations might not achieve the same improvement if they remain
  memory-bound on TPU v3 configurations.

- In cases where data does not fit into memory on TPU v2 configurations, TPU v3
  can provide improved performance and reduce the recomputation of intermediate
  values (rematerialization).

- TPU v3 configurations can run new models with batch sizes that did not fit
  on TPU v2 configurations.
  For example, TPU v3 might allow deeper ResNet models and larger images with
  RetinaNet.

Models that are nearly input-bound ("infeed") on TPU v2, because training steps
are waiting for input, might also be input-bound with Cloud TPU v3. The
pipeline performance guide can help you resolve infeed issues.

Configurations
--------------

A TPU v3 Pod is composed of 1024 chips interconnected with high-speed links. To
create a TPU v3 device or slice, use the `--accelerator-type` flag in the TPU
creation command (`gcloud compute tpus tpu-vm`). The accelerator type specifies
the TPU version and the number of TensorCores. For example, for a single v3
TPU, use `--accelerator-type=v3-8`. For a v3 slice with 128 TensorCores, use
`--accelerator-type=v3-128`.

The following table lists the supported v3 TPU types:

The following command shows how to create a v3 TPU slice with 128 TensorCores:

```bash
$ gcloud compute tpus tpu-vm create tpu-name \
    --zone=europe-west4-a \
    --accelerator-type=v3-128 \
    --version=tpu-ubuntu2204-base
```

For more information about managing TPUs, see [Manage TPUs](/tpu/docs/managing-tpus-tpu-vm).
For more information about the system architecture of Cloud TPU, see
[System architecture](/tpu/docs/system-architecture).
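As a quick illustration of the accelerator-type naming described above, the following sketch derives TensorCore, chip, and host counts from a v3 accelerator type. The `v3_topology` helper is hypothetical (not part of any Google API), and it assumes the v3 layout of two TensorCores per chip and four chips (eight TensorCores) per TPU VM host.

```python
def v3_topology(accelerator_type: str) -> dict:
    """Derive TensorCore, chip, and host counts from a v3 accelerator type.

    Hypothetical helper; assumes two TensorCores per chip and four chips
    (eight TensorCores) per TPU VM host, as in TPU v3.
    """
    version, _, cores = accelerator_type.partition("-")
    if version != "v3" or not cores.isdigit():
        raise ValueError(f"expected a type like 'v3-8', got {accelerator_type!r}")
    tensorcores = int(cores)
    return {
        "tensorcores": tensorcores,
        "chips": tensorcores // 2,          # 2 TensorCores per chip
        "hosts": max(1, tensorcores // 8),  # 8 TensorCores per host
    }

print(v3_topology("v3-8"))    # single device: 4 chips on 1 host
print(v3_topology("v3-128"))  # slice: 64 chips across 16 hosts
```

For example, `v3-128` maps to 64 chips spread across 16 TPU VM hosts, which is why multi-host slices require running your training script on every host.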