Description
The code was copied directly from the tutorial. The error message is below:
RuntimeError Traceback (most recent call last)
Cell In[9], line 1
----> 1 benchmark.run(show_plots=True, print_data=True)
File ~/anaconda3/envs/qqp-env/lib/python3.10/site-packages/triton/testing.py:346, in Mark.run(self, show_plots, print_data, save_path, return_df, **kwargs)
344 html.write("\n")
345 for bench in benchmarks:
--> 346 result_dfs.append(self._run(bench, save_path, show_plots, print_data, **kwargs))
347 if save_path:
348 html.write(f"<image src=\"{bench.plot_name}.png\"/>\n")
File ~/anaconda3/envs/qqp-env/lib/python3.10/site-packages/triton/testing.py:289, in Mark._run(self, bench, save_path, show_plots, print_data, diff_col, save_precision, **kwrags)
287 row_mean, row_min, row_max = [], [], []
288 for y in bench.line_vals:
--> 289 ret = self.fn(**x_args, **{bench.line_arg: y}, **bench.args, **kwrags)
290 try:
291 y_mean, y_min, y_max = ret
Cell In[8], line 29
27 ms = triton.testing.do_bench(lambda: torch.softmax(x, axis=-1))
28 if provider == 'triton':
---> 29 ms = triton.testing.do_bench(lambda: softmax(x))
30 gbps = lambda ms: 2 * x.nelement() * x.element_size() * 1e-9 / (ms * 1e-3)
31 return gbps(ms)
...
File ~/anaconda3/envs/qqp-env/lib/python3.10/site-packages/triton/backends/nvidia/driver.py:365, in CudaLauncher.call(self, *args, **kwargs)
364 def call(self, *args, **kwargs):
--> 365 self.launch(*args, **kwargs)
RuntimeError: Triton Error [CUDA]: out of memory
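To help judge whether the benchmark input by itself could plausibly exhaust GPU memory, here is a minimal pure-Python sketch of the sizes involved. It reuses the bandwidth formula shown at line 30 of the cell in the traceback. The shape (4096 x 1024), the 4-byte float32 element size, the 0.1 ms timing, and the helper names `tensor_bytes` and `gbps` are assumptions for illustration only, not values taken from the failing run.

```python
def tensor_bytes(n_rows, n_cols, element_size=4):
    """Bytes needed to store an n_rows x n_cols tensor (element_size in bytes)."""
    return n_rows * n_cols * element_size


def gbps(n_elements, element_size, ms):
    """Effective bandwidth in GB/s, same formula as line 30 of the cell:
    the factor 2 counts one read plus one write of every element."""
    return 2 * n_elements * element_size * 1e-9 / (ms * 1e-3)


if __name__ == "__main__":
    # Hypothetical shape and timing, chosen only to illustrate the arithmetic.
    rows, cols = 4096, 1024
    size = tensor_bytes(rows, cols)
    print(f"input tensor: {size / 2**20:.1f} MiB")
    print(f"bandwidth at 0.1 ms: {gbps(rows * cols, 4, 0.1):.2f} GB/s")
```

If the footprint computed this way is small compared with the GPU's total memory, the out-of-memory error may come from something other than the input tensor, for example memory already held by other processes or earlier notebook cells.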