
Error at the fused softmax step of the tutorial, not sure what the problem is #12

Open
@naonao-cola

Description


The code was copied directly from the tutorial. The error message is below:

RuntimeError Traceback (most recent call last)
Cell In[9], line 1
----> 1 benchmark.run(show_plots=True, print_data=True)

File ~/anaconda3/envs/qqp-env/lib/python3.10/site-packages/triton/testing.py:346, in Mark.run(self, show_plots, print_data, save_path, return_df, **kwargs)
344 html.write("\n")
345 for bench in benchmarks:
--> 346 result_dfs.append(self._run(bench, save_path, show_plots, print_data, **kwargs))
347 if save_path:
348 html.write(f"<image src="{bench.plot_name}.png"/>\n")

File ~/anaconda3/envs/qqp-env/lib/python3.10/site-packages/triton/testing.py:289, in Mark._run(self, bench, save_path, show_plots, print_data, diff_col, save_precision, **kwrags)
287 row_mean, row_min, row_max = [], [], []
288 for y in bench.line_vals:
--> 289 ret = self.fn(**x_args, **{bench.line_arg: y}, **bench.args, **kwrags)
290 try:
291 y_mean, y_min, y_max = ret

Cell In[8], line 29
27 ms = triton.testing.do_bench(lambda: torch.softmax(x, axis=-1))
28 if provider == 'triton':
---> 29 ms = triton.testing.do_bench(lambda: softmax(x))
30 gbps = lambda ms: 2 * x.nelement() * x.element_size() * 1e-9 / (ms * 1e-3)
31 return gbps(ms)
...
File ~/anaconda3/envs/qqp-env/lib/python3.10/site-packages/triton/backends/nvidia/driver.py:365, in CudaLauncher.__call__(self, *args, **kwargs)
364 def __call__(self, *args, **kwargs):
--> 365 self.launch(*args, **kwargs)

RuntimeError: Triton Error [CUDA]: out of memory
Output is truncated.
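
For context, here is a minimal sketch of what the benchmark cell (Cell In[9] calling Cell In[8]) looks like, reconstructed from the fragments visible in the traceback and the upstream Triton fused-softmax tutorial; the exact x_vals sizes, M=4096, and the softmax(x) wrapper defined earlier in the notebook are assumptions rather than a copy of my actual cell:

import torch
import triton
import triton.language as tl

# ... the fused-softmax Triton kernel and its Python wrapper `softmax(x)`
#     are defined earlier in the notebook, as in the tutorial ...

@triton.testing.perf_report(
    triton.testing.Benchmark(
        x_names=['N'],                            # vary the number of columns
        x_vals=[128 * i for i in range(2, 100)],  # column counts (assumed, per the tutorial)
        line_arg='provider',
        line_vals=['torch', 'triton'],
        line_names=['Torch', 'Triton'],
        styles=[('green', '-'), ('blue', '-')],
        ylabel='GB/s',
        plot_name='softmax-performance',
        args={'M': 4096},                         # number of rows (assumed, per the tutorial)
    ))
def benchmark(M, N, provider):
    x = torch.randn(M, N, device='cuda', dtype=torch.float32)
    if provider == 'torch':
        ms = triton.testing.do_bench(lambda: torch.softmax(x, axis=-1))
    if provider == 'triton':
        ms = triton.testing.do_bench(lambda: softmax(x))
    # effective bandwidth: one read plus one write of the whole tensor
    gbps = lambda ms: 2 * x.nelement() * x.element_size() * 1e-9 / (ms * 1e-3)
    return gbps(ms)

benchmark.run(show_plots=True, print_data=True)

The traceback shows the failure happens inside do_bench when the Triton kernel is launched for the 'triton' provider, i.e. the CUDA out-of-memory error is raised during the kernel launch itself rather than when allocating x.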
