Tests failing on BMG Linux with driver 26.01.36711.4 #21096

@sarnex

Describe the bug


Timed Out Tests (1):
  SYCL :: Scheduler/ReleaseResourcesTest.cpp

********************
Failed Tests (2):
  SYCL :: Matrix/joint_matrix_bfloat16_accumulator.cpp
  SYCL :: WorkGroupMemory/basic_usage.cpp


  FAIL: SYCL :: Matrix/joint_matrix_bfloat16_accumulator.cpp (1215 of 1840)
  ******************** TEST 'SYCL :: Matrix/joint_matrix_bfloat16_accumulator.cpp' FAILED ********************
  Exit Code: -6
  
  Command Output (stdout):
  --
  # RUN: at line 21
  env env UR_LOADER_USE_LEVEL_ZERO_V2=0 ONEAPI_DEVICE_SELECTOR=level_zero:gpu  /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bfloat16_accumulator.cpp.tmp.out
  # executed command: env env UR_LOADER_USE_LEVEL_ZERO_V2=0 ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bfloat16_accumulator.cpp.tmp.out
  # .---command stdout------------
  # | B row major:
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | B packed:
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # `-----------------------------
  # RUN: at line 21
  env env UR_LOADER_USE_LEVEL_ZERO_V2=1 ONEAPI_DEVICE_SELECTOR=level_zero:gpu  /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bfloat16_accumulator.cpp.tmp.out
  # executed command: env env UR_LOADER_USE_LEVEL_ZERO_V2=1 ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bfloat16_accumulator.cpp.tmp.out
  # .---command stdout------------
  # | B row major:
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | B packed:
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # `-----------------------------
  # RUN: at line 22
  env IGC_JointMatrixLoadStoreOpt=2 env env UR_LOADER_USE_LEVEL_ZERO_V2=0 ONEAPI_DEVICE_SELECTOR=level_zero:gpu  /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bfloat16_accumulator.cpp.tmp.out
  # executed command: env IGC_JointMatrixLoadStoreOpt=2 env env UR_LOADER_USE_LEVEL_ZERO_V2=0 ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bfloat16_accumulator.cpp.tmp.out
  # .---command stdout------------
  # | B row major:
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | B packed:
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 16 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 1 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 16 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # | Testing: 32 x 64 x 32 [TM x TN x TK]
  # `-----------------------------
  # RUN: at line 22
  env IGC_JointMatrixLoadStoreOpt=2 env env UR_LOADER_USE_LEVEL_ZERO_V2=1 ONEAPI_DEVICE_SELECTOR=level_zero:gpu  /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bfloat16_accumulator.cpp.tmp.out
  # executed command: env IGC_JointMatrixLoadStoreOpt=2 env env UR_LOADER_USE_LEVEL_ZERO_V2=1 ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bfloat16_accumulator.cpp.tmp.out
  # .---command stdout------------
  # | B row major:
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 8 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # | Testing: 16 x 16 x 16 [TM x TN x TK]
  # `-----------------------------
  # .---command stderr------------
  # | terminate called after throwing an instance of 'sycl::_V1::exception'
  # |   what():  level_zero backend failed with error: 40 (UR_RESULT_ERROR_OUT_OF_RESOURCES)
  # `-----------------------------
  # error: command failed with exit status: -6
  
  --
  
  ********************
  FAIL: SYCL :: WorkGroupMemory/basic_usage.cpp (1221 of 1840)
  ******************** TEST 'SYCL :: WorkGroupMemory/basic_usage.cpp' FAILED ********************
  Exit Code: -6
  
  Command Output (stdout):
  --
  # RUN: at line 6
  env env UR_LOADER_USE_LEVEL_ZERO_V2=0 ONEAPI_DEVICE_SELECTOR=level_zero:gpu  /__w/llvm/llvm/build-e2e/WorkGroupMemory/Output/basic_usage.cpp.tmp.out
  # executed command: env env UR_LOADER_USE_LEVEL_ZERO_V2=0 ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/WorkGroupMemory/Output/basic_usage.cpp.tmp.out
  # .---command stderr------------
  # | basic_usage.cpp.tmp.out: /__w/llvm/llvm/llvm/sycl/test-e2e/WorkGroupMemory/basic_usage.cpp:48: void swap_scalar(T &, T &) [T = float *]: Assertion `a == old_b && b == old_a && "Incorrect swap!"' failed.
  # `-----------------------------
  # error: command failed with exit status: -6
  
  --
  
  ********************
  TIMEOUT: SYCL :: Scheduler/ReleaseResourcesTest.cpp (1840 of 1840)
  ******************** TEST 'SYCL :: Scheduler/ReleaseResourcesTest.cpp' FAILED ********************
  Exit Code: -9
  Timeout: Reached timeout of 300 seconds
  
  Command Output (stdout):
  --
  # RUN: at line 2
  env SYCL_UR_TRACE=2 env env UR_LOADER_USE_LEVEL_ZERO_V2=0 ONEAPI_DEVICE_SELECTOR=level_zero:gpu  /__w/llvm/llvm/build-e2e/Scheduler/Output/ReleaseResourcesTest.cpp.tmp.out 2>&1 | /__w/llvm/llvm/toolchain/bin/FileCheck /__w/llvm/llvm/llvm/sycl/test-e2e/Scheduler/ReleaseResourcesTest.cpp --check-prefix=CHECK-RELEASE
  # executed command: env SYCL_UR_TRACE=2 env env UR_LOADER_USE_LEVEL_ZERO_V2=0 ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/Scheduler/Output/ReleaseResourcesTest.cpp.tmp.out
  # note: command had no output on stdout or stderr
  # executed command: /__w/llvm/llvm/toolchain/bin/FileCheck /__w/llvm/llvm/llvm/sycl/test-e2e/Scheduler/ReleaseResourcesTest.cpp --check-prefix=CHECK-RELEASE
  # note: command had no output on stdout or stderr
  # RUN: at line 2
  env SYCL_UR_TRACE=2 env env UR_LOADER_USE_LEVEL_ZERO_V2=1 ONEAPI_DEVICE_SELECTOR=level_zero:gpu  /__w/llvm/llvm/build-e2e/Scheduler/Output/ReleaseResourcesTest.cpp.tmp.out 2>&1 | /__w/llvm/llvm/toolchain/bin/FileCheck /__w/llvm/llvm/llvm/sycl/test-e2e/Scheduler/ReleaseResourcesTest.cpp --check-prefix=CHECK-RELEASE
  # executed command: env SYCL_UR_TRACE=2 env env UR_LOADER_USE_LEVEL_ZERO_V2=1 ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/Scheduler/Output/ReleaseResourcesTest.cpp.tmp.out
  # note: command had no output on stdout or stderr
  # error: command failed with exit status: -9
  # error: command reached timeout: True
  # executed command: /__w/llvm/llvm/toolchain/bin/FileCheck /__w/llvm/llvm/llvm/sycl/test-e2e/Scheduler/ReleaseResourcesTest.cpp --check-prefix=CHECK-RELEASE
  # note: command had no output on stdout or stderr
  # error: command failed with exit status: -9
  # error: command reached timeout: True
  
  --
  

https://github.com/intel/llvm/actions/runs/21223296366/job/61065496627?pr=21082

To reproduce

  1. Include a code snippet that is as short as possible
  2. Specify the command which should be used to compile the program
  3. Specify the command which should be used to launch the program
  4. Indicate what is wrong and what was expected
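A rough sketch of how the three failing invocations from the log above could be re-run by hand, outside of lit. It is not a confirmed reproducer. It assumes the e2e tests are already built, that `BUILD_E2E` points at that build tree (the CI path from the log is used as the default), and that GNU coreutils `timeout` is available. `run_repro` is a hypothetical helper name. The environment variables and binary paths are copied from the log.

```shell
# Assumption: BUILD_E2E is the local e2e build tree; the CI log used /__w/llvm/llvm/build-e2e.
BUILD_E2E="${BUILD_E2E:-/__w/llvm/llvm/build-e2e}"

# Hypothetical helper: run_repro <binary relative to BUILD_E2E> <VAR=value>...
# Runs the binary with the given environment, capped at 300 seconds like the CI job.
run_repro() {
  bin="$BUILD_E2E/$1"
  shift
  if [ -x "$bin" ]; then
    timeout 300 env "$@" "$bin"
    echo "exit status: $?"
  else
    echo "test binary not found: $bin" >&2
  fi
}

# Aborted with UR_RESULT_ERROR_OUT_OF_RESOURCES in the log (RUN line 22, v2 adapter).
run_repro Matrix/Output/joint_matrix_bfloat16_accumulator.cpp.tmp.out \
  IGC_JointMatrixLoadStoreOpt=2 UR_LOADER_USE_LEVEL_ZERO_V2=1 ONEAPI_DEVICE_SELECTOR=level_zero:gpu

# Hit the "Incorrect swap!" assertion in the log (RUN line 6, v1 adapter).
run_repro WorkGroupMemory/Output/basic_usage.cpp.tmp.out \
  UR_LOADER_USE_LEVEL_ZERO_V2=0 ONEAPI_DEVICE_SELECTOR=level_zero:gpu

# Timed out after 300 seconds in the log (RUN line 2, v2 adapter).
run_repro Scheduler/Output/ReleaseResourcesTest.cpp.tmp.out \
  SYCL_UR_TRACE=2 UR_LOADER_USE_LEVEL_ZERO_V2=1 ONEAPI_DEVICE_SELECTOR=level_zero:gpu
```

In the log, the other adapter setting passed for the Matrix and Scheduler tests, so flipping `UR_LOADER_USE_LEVEL_ZERO_V2` in the first and third calls may help narrow things down. For `basic_usage.cpp` only the `UR_LOADER_USE_LEVEL_ZERO_V2=0` run appears in the log.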

Environment

  • OS: Linux
  • Target device and vendor: Intel GPU (BMG), driver 26.01.36711.4
  • DPC++ version: [e.g. commit hash or output of clang++ --version]
  • Dependencies version: [e.g. the output of sycl-ls --verbose]
