If you run into the following error:
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
Try changing the performance setting: switch it from the default, "Try to use xFormers", to "Auto", then restart and try again.
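
If the error persists after the change, the hint in the message itself (CUDA_LAUNCH_BLOCKING=1) can help pin down which call actually fails, because it makes kernel launches synchronous so the stack trace points at the real culprit. The variable has to be set before CUDA is initialized, so the simplest way is to set it in the environment of the process being launched. A minimal sketch in Python, assuming the program is started as a child process; the helper name `blocking_env` and the stand-in child command are hypothetical and would be replaced by your actual launch command:

```python
import os
import subprocess
import sys


def blocking_env(base=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base is None else base)
    # Must be in place before the child process initializes CUDA.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


if __name__ == "__main__":
    # Hypothetical stand-in for the real program: the child only echoes the
    # variable to show that it was inherited.
    child = [sys.executable, "-c",
             "import os; print(os.environ['CUDA_LAUNCH_BLOCKING'])"]
    result = subprocess.run(child, env=blocking_env(),
                            capture_output=True, text=True, check=True)
    print(result.stdout.strip())
```

Running with this setting is slower, so it is best used only while debugging and removed afterwards.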