To be explicit: this waits on the mentioned nvfuser PR.
2ab964e to 080d391
@jjsjann123 what would be the nvfuser version that ships the SM89 support?
You caught me! I forgot to bump the nvfuser version in the PR. I'll go back and do that. So if you are making a version guard, make it 0.2.24 (nvfuser is currently at 0.2.23, and the PR is already in).
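A minimal sketch of such a version guard, assuming the nvfuser version is available as a dotted string. The helper names below are hypothetical and not part of Thunder or nvFuser; only the 0.2.24 threshold comes from the comment above.

```python
import re

# Threshold taken from the comment above: SM89 fp8 support ships in 0.2.24.
MIN_NVFUSER_FOR_SM89_FP8 = (0, 2, 24)


def parse_version(version: str) -> tuple:
    """Return the leading numeric components of a version string as ints.

    Any suffix after the dotted numbers (for example "+gitabc") is ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def nvfuser_ships_sm89_fp8(nvfuser_version: str) -> bool:
    """Hypothetical guard: True if this nvfuser version is at least 0.2.24."""
    return parse_version(nvfuser_version) >= MIN_NVFUSER_FOR_SM89_FP8
```

Comparing integer tuples rather than raw strings avoids the case where "0.2.9" would sort after "0.2.24" lexicographically.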
thunder/executors/nvfuserex_impl.py
-    cuda_major, _ = torch.cuda.get_device_capability()
-    return cuda_major > 8
+    cuda_major, cuda_minor = torch.cuda.get_device_capability()
+    return (cuda_major, cuda_minor) >= (8, 9)
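To illustrate why the comparison changes, here is a small sketch of both checks written as pure functions over a capability tuple, so they can be compared without a GPU. The function names are hypothetical and exist only for this illustration.

```python
def fp8_allowed_major_only(capability: tuple) -> bool:
    """Major-only check: requires a major version above 8, so it rejects (8, 9)."""
    major, _ = capability
    return major > 8


def fp8_allowed_major_minor(capability: tuple) -> bool:
    """Tuple check: lexicographic comparison accepts (8, 9) and anything newer."""
    return tuple(capability) >= (8, 9)


# Print both results for a few (major, minor) capability pairs.
for cap in [(8, 0), (8, 6), (8, 9), (9, 0)]:
    print(cap, fp8_allowed_major_only(cap), fp8_allowed_major_minor(cap))
```

The two checks differ only for capability (8, 9), which is exactly the sm89 case this PR targets.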
I realize that this is an ugly bit...
I think the full logic here should copy this: https://github.com/NVIDIA/Fuser/blob/6fa084312d7eec5c69d59f3eb3cbdd9fa72a1600/csrc/device_lower/analysis/device_version.cpp#L24-L39
But that's a lot... We should have a generic API on the nvfuser side, something like is_dtype_support_on_device(dtype, device_index).
A Python function exposed by nvFuser for this logic would be great!
(Even if we don't do that in this PR, an issue for it would be great)
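A rough sketch of what the shape of such a helper could look like. This is not an existing nvFuser API: the function body, the dtype-name strings, and the lookup table are assumptions made for illustration, and only the (8, 9) threshold for fp8 comes from this PR. The capability lookup is injected so the sketch runs without a GPU.

```python
from typing import Callable, Dict, Tuple

# Assumed minimum compute capability per dtype name. Only the fp8 threshold
# (8, 9) is taken from this PR; the table itself is illustrative.
MIN_CAPABILITY: Dict[str, Tuple[int, int]] = {
    "float8_e4m3fn": (8, 9),
    "float8_e5m2": (8, 9),
}


def is_dtype_support_on_device(
    dtype_name: str,
    device_index: int,
    get_capability: Callable[[int], Tuple[int, int]],
) -> bool:
    """Hypothetical helper: True if the device meets the dtype's minimum capability.

    Dtypes missing from the table are treated as unrestricted in this sketch.
    """
    required = MIN_CAPABILITY.get(dtype_name)
    if required is None:
        return True
    return tuple(get_capability(device_index)) >= required
```

On a machine with CUDA, `torch.cuda.get_device_capability` could be passed as `get_capability`, since it accepts a device argument and returns a `(major, minor)` tuple.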
080d391 to 28af668
Signed-off-by: Masaki Kozuki <[email protected]>
for more information, see https://pre-commit.ci
26b1582 to c99e048
What does this PR do?
With NVIDIA/Fuser#3624, devices with compute capability sm89 or higher are allowed to use the nvfuser executor for fp8.