CuDNN library mismatch when running docker #316

Open · aghadjip opened this issue Dec 16, 2022 · 0 comments
Looks like there is a cuDNN library mismatch in the Docker setup.

2022-12-16 16:11:58.828108: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:421] Loaded runtime CuDNN library: 8.4.0 but source was compiled with: 8.6.0. CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2022-12-16 16:11:58.830602: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:421] Loaded runtime CuDNN library: 8.4.0 but source was compiled with: 8.6.0. CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
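For reference, here is a minimal sketch (not from the repo) for confirming which cuDNN the container actually loads at runtime; it assumes `libcudnn.so.8` is on the dynamic loader path, as it is in the stock CUDA base images:

```python
import ctypes

# Ask the cuDNN shared library that the dynamic loader resolves inside the
# container for its version. cudnnGetVersion() returns e.g. 8400 for 8.4.0
# and 8600 for 8.6.0 (cuDNN 8.x version encoding).
cudnn = ctypes.CDLL("libcudnn.so.8")
cudnn.cudnnGetVersion.restype = ctypes.c_size_t
v = cudnn.cudnnGetVersion()
print(f"runtime cuDNN: {v // 1000}.{(v % 1000) // 100}.{v % 100}")
```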

This results in a "DNN library is not found" error:

```
XlaRuntimeError Traceback (most recent call last)
Cell In[14], line 25
23 encoded_images = encoded_images.sequences[..., 1:]
24 # decode images
---> 25 decoded_images = p_decode(encoded_images, vqgan_params)
26 decoded_images = decoded_images.clip(0.0, 1.0).reshape((-1, 256, 256, 3))
27 for decoded_img in decoded_images:

[... skipping hidden 11 frame]

File /usr/local/lib/python3.8/dist-packages/jax/_src/dispatch.py:1014, in backend_compile(backend, built_c, options, host_callbacks)
1009 return backend.compile(built_c, compile_options=options,
1010 host_callbacks=host_callbacks)
1011 # Some backends don't have host_callbacks option yet
1012 # TODO(sharadmv): remove this fallback when all backends allow compile
1013 # to take in host_callbacks
-> 1014 return backend.compile(built_c, compile_options=options)

XlaRuntimeError: UNKNOWN: Failed to determine best cudnn convolution algorithm for:
%cudnn-conv-bias-activation.219 = (f32[2,256,16,16]{3,2,1,0}, u8[0]{0}) custom-call(f32[2,256,16,16]{3,2,1,0} %bitcast.256, f32[256,256,1,1]{3,2,1,0} %bitcast.263, f32[256]{0} %get-tuple-element.341), window={size=1x1}, dim_labels=bf01_oi01->bf01, custom_call_target="__cudnn$convBiasActivationForward", metadata={op_name="pmap(p_decode)/jit(main)/VQModule.decode_code/VQModule.decode/post_quant_conv/conv_general_dilated[window_strides=(1, 1) padding=((0, 0), (0, 0)) lhs_dilation=(1, 1) rhs_dilation=(1, 1) dimension_numbers=ConvDimensionNumbers(lhs_spec=(0, 3, 1, 2), rhs_spec=(3, 2, 0, 1), out_spec=(0, 3, 1, 2)) feature_group_count=1 batch_group_count=1 lhs_shape=(2, 16, 16, 256) rhs_shape=(1, 1, 256, 256) precision=None preferred_element_type=None]" source_file="/usr/local/lib/python3.8/dist-packages/flax/linen/linear.py" source_line=438}, backend_config="{"conv_result_scale":1,"activation_mode":"0","side_input_scale":0}"

Original error: UNIMPLEMENTED: DNN library is not found.

To ignore this failure and try to use a fallback algorithm (which may have suboptimal performance), use XLA_FLAGS=--xla_gpu_strict_conv_algorithm_picker=false. Please also file a bug for the root cause of failing autotuning.
```
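As a stopgap, the flag suggested in the error message itself can be set before JAX is first imported (a sketch only; this lets XLA fall back to a default convolution algorithm but does not fix the underlying cuDNN mismatch, which still needs 8.6 in the image):

```python
import os

# Must be set before jax is first imported so XLA picks it up.
os.environ["XLA_FLAGS"] = "--xla_gpu_strict_conv_algorithm_picker=false"

import jax
```

The same environment variable can also be passed into the container, e.g. with `docker run -e XLA_FLAGS=--xla_gpu_strict_conv_algorithm_picker=false ...`.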
