Troubleshooting PyTorch - TPU
=============================

This guide provides troubleshooting information to help you identify and resolve
problems you might encounter while training PyTorch models
on Cloud TPU. For a more general guide to getting started with
Cloud TPU, see the
[PyTorch quickstart](/tpu/docs/run-calculation-pytorch).
Last updated 2025-09-04 UTC.

**Note:** If you aren't able to resolve your issue using this guide, see [Getting Support](/tpu/docs/getting-support) for further assistance.

Troubleshooting slow training performance
-----------------------------------------

If your model trains slowly, [generate and review a metrics report](https://pytorch.org/xla/release/r2.6/learn/troubleshoot.html#get-a-metrics-report).

To have the metrics report analyzed automatically and summarized, run
your workload with `PT_XLA_DEBUG=1`.

For more information about issues that might cause your model to train slowly,
see [Known performance caveats](https://pytorch.org/xla/release/r2.6/learn/troubleshoot.html#known-performance-caveats).

Performance profiling
---------------------

To profile your workload in depth and find bottlenecks, review these resources:

- [PyTorch/XLA performance profiling](https://cloud.google.com/tpu/docs/pytorch-xla-performance-profiling-tpu-vm)
- [Sample MNIST training script with profiling](https://github.com/pytorch/xla/blob/master/test/test_profile_mp_mnist.py)

More debugging tools
--------------------

You can specify [environment variables](https://pytorch.org/xla/release/r2.6/learn/troubleshoot.html#environment-variables)
to control the behavior of the PyTorch/XLA software stack.

If you encounter an unexpected bug and need help, [file a GitHub issue](https://github.com/pytorch/xla).

Managing XLA tensors
--------------------

[XLA tensor quirks](https://pytorch.org/xla/release/r2.6/learn/troubleshoot.html#xla-tensor-quirks)
describes what you should and shouldn't do when working with XLA tensors and
shared weights.
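
As a rough illustration of the slow-training checks described in this guide, the following sketch shows one possible way to launch a workload with `PT_XLA_DEBUG=1` and to fetch a metrics report from Python. The helper names `run_with_xla_debug` and `metrics_report_or_none` are hypothetical and are not part of PyTorch/XLA. The sketch assumes that `torch_xla.debug.metrics.metrics_report()` is available when PyTorch/XLA is installed, and it returns `None` otherwise.

```python
import os
import subprocess
import sys


def run_with_xla_debug(script_args):
    """Run a Python workload in a child process with PT_XLA_DEBUG=1 set.

    Hypothetical helper: script_args are the arguments passed to the Python
    interpreter, for example ["train.py"].
    """
    env = dict(os.environ)       # copy so the parent environment is untouched
    env["PT_XLA_DEBUG"] = "1"    # ask PyTorch/XLA for the debug summary
    return subprocess.run(
        [sys.executable, *script_args],
        env=env,
        capture_output=True,
        text=True,
    )


def metrics_report_or_none():
    """Return the PyTorch/XLA metrics report, or None if torch_xla is missing."""
    try:
        import torch_xla.debug.metrics as met
    except ImportError:
        return None
    return met.metrics_report()
```

A call such as `run_with_xla_debug(["train.py"])` (where `train.py` stands in for your own training script) would run the script with the variable set. Inside the training script itself, printing `metrics_report_or_none()` after a few training steps is one way to capture the report for review.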