
Minor improvements in Orochi 2.00 for Zen4 Raphael APU support and other suggestions.. #83

@oscarbg


Hi,
first, congrats on getting the Orochi 2.0 release out!
Some comments:

  1. I tested the new Orochi 2.0 release on a Zen4 Raphael iGPU and got some errors.

1a) The first RadixSort test fails on Zen4 because number_of_blocks becomes 0, since the APU has only 2 compute units.
Fix (ParallelPrimitives/RadixSort.cpp):

258c258,259
< 		number_of_blocks = ( number_of_blocks / base ) * base;
---
> 		number_of_blocks = ( number_of_blocks / base ) * base;
> 		if( number_of_blocks == 0 ) number_of_blocks = 4;

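The rounding step in 1a can be isolated as a small function to show why the count drops to 0 and how the guard avoids it. This is only a sketch: `alignBlockCount` and its `fallback` parameter are hypothetical names, not part of Orochi, and the value 4 is simply the one used in the diff above.

```cpp
#include <cassert>

// Hypothetical helper (not an Orochi function): round a block count down to a
// multiple of `base`, but never return 0.
int alignBlockCount( int numberOfBlocks, int base, int fallback )
{
	assert( base > 0 && fallback > 0 );
	// Integer division truncates, so any numberOfBlocks < base yields 0 here.
	const int aligned = ( numberOfBlocks / base ) * base;
	// Guard from the diff above: fall back to a small non-zero count.
	return aligned == 0 ? fallback : aligned;
}
```

For example, `alignBlockCount( 2, 4, 4 )` returns 4 instead of 0, while `alignBlockCount( 9, 4, 4 )` still rounds down to 8.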
1b) The pciBusID check in VulkanComputeSimple is incorrect for APUs, as Vulkan reports 0 and ROCm reports 0x6b, or vice versa. I fixed it in main.cpp with:

< 			if (physicalDevicePCIBusInfoProperties.pciDomain == props.pciDomainID &&
< 				physicalDevicePCIBusInfoProperties.pciBus == props.pciBusID &&
< 				physicalDevicePCIBusInfoProperties.pciDevice == props.pciDeviceID) {
< 				physicalDeviceIndex = i;
< 				break;
---
> 			if( physicalDevicePCIBusInfoProperties.pciDomain == props.pciDomainID &&
> 				// physicalDevicePCIBusInfoProperties.pciBus == props.pciBusID &&
> 				physicalDevicePCIBusInfoProperties.pciDevice == props.pciDeviceID )
> 			{
> 				if( physicalDevicePCIBusInfoProperties.pciBus > 16 ) // integrated!!
> 				{
> 					if( props.pciBusID == 0 )
> 					{
> 						physicalDeviceIndex = i;
> 						break;
> 					}
> 				}
> 				else
> 				{
> 					if( physicalDevicePCIBusInfoProperties.pciBus == props.pciBusID )
> 					{
> 						physicalDeviceIndex = i;
> 						break;
> 					}
> 				}
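The matching logic from the diff above can also be written as a standalone predicate, which is easier to test outside the sample. This is a sketch under assumptions: `VulkanPciInfo` and `ComputePciInfo` are hypothetical plain structs that only mirror the field names used in the diff, and the "bus number above 16 means integrated" rule is the heuristic from this issue, not a documented guarantee.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the fields read in the diff above.
struct VulkanPciInfo { std::uint32_t pciDomain, pciBus, pciDevice; };
struct ComputePciInfo { int pciDomainID, pciBusID, pciDeviceID; };

// Returns true if the Vulkan device and the compute device look like the same GPU.
bool pciMatches( const VulkanPciInfo& vk, const ComputePciInfo& props )
{
	if( vk.pciDomain != static_cast<std::uint32_t>( props.pciDomainID ) ||
		vk.pciDevice != static_cast<std::uint32_t>( props.pciDeviceID ) )
		return false;
	// Heuristic from the issue: a high Vulkan bus number is treated as an
	// integrated GPU, for which the compute runtime is expected to report bus 0.
	if( vk.pciBus > 16 ) return props.pciBusID == 0;
	// Discrete GPU: bus numbers must agree.
	return vk.pciBus == static_cast<std::uint32_t>( props.pciBusID );
}

// Index of the first matching Vulkan device, or -1 if none matches.
int findPhysicalDeviceIndex( const std::vector<VulkanPciInfo>& devices, const ComputePciInfo& props )
{
	for( std::size_t i = 0; i < devices.size(); ++i )
		if( pciMatches( devices[i], props ) ) return static_cast<int>( i );
	return -1;
}
```

Returning -1 when nothing matches makes the failure case explicit, so the caller can report it instead of silently using a default index.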
  2. Can you make the samples run correctly by default after a premake5 build?
    I mean running the executables directly from the Orochi-2.00/dist/bin/Debug or Orochi-2.00/dist/bin/Release folders.
    They seem to expect to find kernels and sample data such as textures in the "../" folder, so basically you have two options:

2a) copy or move the Debug or Release folder into the Orochi-2.00/ folder (out of dist/bin) so the samples find the required data, or

2b) recreate/copy the needed files from the Orochi-2.00 folder to Orochi-2.00/dist/bin, for example copying:

Orochi-2.00/Test/Texture/texture_test_kernel.hpp
to:
Orochi-2.00/dist/bin/Test/Texture/texture_test_kernel.hpp
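One possible way to avoid both workarounds would be for the samples to look for their data in parent directories instead of only in "../". The following is a minimal sketch of that idea, not how Orochi currently resolves paths: `findUpwards` is a hypothetical helper, and it assumes C++17 for `std::filesystem`.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>

namespace fs = std::filesystem;

// Hypothetical helper: starting at startDir, check startDir/relative, then the
// same relative path under each parent directory, for at most maxLevels parents.
std::optional<fs::path> findUpwards( fs::path startDir, const fs::path& relative, int maxLevels )
{
	assert( maxLevels >= 0 );
	for( int level = 0; level <= maxLevels; ++level )
	{
		const fs::path candidate = startDir / relative;
		if( fs::exists( candidate ) ) return candidate;
		// Stop at the filesystem root, where parent_path() no longer changes.
		if( !startDir.has_parent_path() || startDir.parent_path() == startDir ) break;
		startDir = startDir.parent_path();
	}
	return std::nullopt;
}
```

With this, an executable in Orochi-2.00/dist/bin/Debug asking for "Test/Texture/texture_test_kernel.hpp" with `maxLevels` of 3 would find the file in the Orochi-2.00 folder without any copying.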

3) Add Linux support to the VulkanComputeSimple and RadixSort tests (Orochi-2.00/Test/VulkanComputeSimple and Orochi-2.00/Test/RadixSort).
In fact, the support is already there; it only needs a few minor fixes, which I made and which are very simple.
For the Vulkan premake5.lua:

--    buildoptions { "/wd4244" }
      buildoptions { "--std=c++14" }
--      links{ "Pop" }
--      links{ "kernel32", "user32", "gdi32", "winspool", "comdlg32", "advapi32", "shell32", "ole32", "oleaut32", "uuid", "odbc32", "odbccp32", "version" }

  4. Finally, it would be nice if the WMMA sample got NVIDIA support; it seems to be AMD-only at the moment.
    I started by adding "#ifdef __CUDA_ARCH__" code to the kernel:
#ifdef __CUDA_ARCH__
#include <mma.h>
//https://developer.nvidia.com/blog/programming-tensor-cores-cuda-9/
using namespace nvcuda;
...
wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::col_major> a_frag2;
	wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag2;
	wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc_frag2;
	wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag2;
..
#ifdef __CUDA_ARCH__
	nvcuda::wmma::mma_sync( c_frag2, a_frag2, b_frag2, c_frag2 );
#else
	c_frag = __builtin_amdgcn_wmma_f16_16x16x16_f16_w32( a_frag, b_frag, c_frag, false );
#endif

#endif
