coredump-debug
Debug segfaults and crashes in JAX/XLA/ROCm training workloads using coredump analysis. Use when the user has a coredump file, SIGSEGV, segfault, crash dump, or core file to analyze. Covers GDB backtrace extraction, identifying the crash cause from registers and disassembly, finding and cloning the correct source code versions, and reading the relevant code to determine the root cause.
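As a rough illustration of the first steps described above (backtrace, registers, disassembly), here is a minimal Python sketch that assembles a non-interactive GDB invocation. The function names `build_gdb_command` and `dump_crash_info` and the example paths are hypothetical and not taken from the skill itself. The sketch uses only standard GDB options (`--batch`, `-ex`) and commands.

```python
import shutil
import subprocess
from typing import List, Optional

# GDB commands executed in order after the core file is loaded.
GDB_COMMANDS = [
    "set pagination off",    # avoid interactive paging in batch output
    "thread apply all bt",   # backtrace of every thread
    "info registers",        # registers of the thread GDB selected (usually the crashing one)
    "x/16i $pc",             # disassemble 16 instructions starting at the program counter
]


def build_gdb_command(binary: str, core: str) -> List[str]:
    """Return the argv for a batch-mode GDB run against a core file."""
    cmd = ["gdb", "--batch"]
    for gdb_cmd in GDB_COMMANDS:
        cmd += ["-ex", gdb_cmd]
    cmd += [binary, core]
    return cmd


def dump_crash_info(binary: str, core: str) -> Optional[str]:
    """Run GDB and return its output, or None if gdb is not installed."""
    if shutil.which("gdb") is None:
        return None
    result = subprocess.run(
        build_gdb_command(binary, core),
        capture_output=True,
        text=True,
        check=False,  # GDB may exit non-zero on partial failures; keep the output anyway
    )
    return result.stdout
```

For example, `dump_crash_info("/usr/bin/python3", "core.1234")` (both paths are placeholders) would return the combined backtrace, register dump, and disassembly as text. Those are the inputs for identifying the crash cause and the source versions to inspect.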
Changelog: Source: GitHub https://github.com/AMD-AGI/maxtext-slurm