CROSSTOOL setup differences between Bazel versions earlier and later than 0.17.1  #94

@pint1022


Hi, Hugh
I am working on porting OpenCL-coriander to TensorFlow v1.1. BTW, the 0.11 version (tf-coriander, currently under 'pint1022') should now work with cudnn.
I am trying to address an issue in the compile process. Comparing the call traces of the two builds shows some differences, summarized here:
tf_gpu_kernel_library rule:

(tf-coriander)

  1. gather_functor_gpu-hostraw.ll (cocl)
  2. gather_functor_gpu-hostpatched.ll (patch_hostside)
  3. gather_functor_gpu.cu.pic.o (llvm-4.0/bin/clang++)

(tensorflow r1.1)

  1. gather_functor_gpu-hostraw.ll (llvm-4.0/bin/clang++)
  2. gather_functor_gpu-hostpatched.ll (patch_hostside)
  3. gather_functor_gpu.cu.pic.o (llvm-4.0/bin/clang++)

One calls cocl, the other calls clang++ in step one.

The real issue is a missing-symbol error from the linker. For example, this symbol is in libcwiseop.lo:
ArgMax<Eigen::GpuDevice, float, int>::Reduce2(Eigen::GpuDevice const&, Eigen::TensorMap<Eigen::Tensor<float const, 2, 1, int>
while the caller in python.so is looking for:
ArgMax<Eigen::GpuDevice, float, int>::Reduce2(Eigen::GpuDevice const&, Eigen::TensorMap<Eigen::Tensor<float const, 2, 1, long>,

The only difference is the tensor index type: 'int' vs 'long'.
My theory is that the two-stage compilation causes Eigen::Index to resolve to a different type in each stage. The build traces show that one build calls 'cocl' in the first step while the other calls 'clang++'.
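
To make the 'int' vs 'long' point concrete, here is a minimal sketch of why the two symbols cannot match. The `Tensor`, `TensorMap`, `Input`, and `Reduce2UsesLongIndex` names below are hypothetical stand-ins written for this illustration, not Eigen's or TensorFlow's actual classes. It only assumes that the index type is a template parameter of the tensor type, as the symbols above suggest.

```cpp
#include <cassert>
#include <type_traits>
#include <typeinfo>

// Hypothetical stand-ins for Eigen::Tensor and Eigen::TensorMap. Only the
// trailing index-type parameter matters for this illustration.
template <typename Scalar, int Rank, int Options, typename IndexType>
struct Tensor {};

template <typename TensorType>
struct TensorMap {};

// Argument type shaped like the one in the linker error above.
template <typename IndexType>
using Input = TensorMap<Tensor<const float, 2, 1, IndexType>>;

// Hypothetical template shaped like Reduce2: every IndexType produces a
// separate function, and therefore a separately mangled symbol.
template <typename IndexType>
bool Reduce2UsesLongIndex(const Input<IndexType>&) {
  return std::is_same<IndexType, long>::value;
}

// True only if the int-indexed and long-indexed argument types are identical.
inline bool IndexedTypesMatch() {
  return typeid(Input<int>) == typeid(Input<long>);
}

// Checks that the two instantiations are distinct types.
inline void CheckIndexTypesDiffer() {
  static_assert(!std::is_same<Input<int>, Input<long>>::value,
                "int- and long-indexed tensor maps are different types");
  assert(!IndexedTypesMatch());
  assert(Reduce2UsesLongIndex(Input<long>{}));
  assert(!Reduce2UsesLongIndex(Input<int>{}));
}
```

Calling `CheckIndexTypesDiffer()` from a test `main` passes silently. If the theory is right, a definition compiled with one index type can never satisfy a reference compiled with the other, so comparing what the index type expands to in each compile step (cocl vs clang++) might confirm or rule it out.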

Do you have any clues or suggestions about what is going on, or where to look for the root cause?

thanks,
steven
