ARM cross-compilation (tl;dr: use proper SPIR target) #117
Description
As mentioned in mozilla/DeepSpeech#1346, I'm currently investigating how much we can rely on the VC4CL OpenCL driver [https://github.com/doe300/VC4CL#opencl-support] to leverage the RPi3's GPU.
So far, I have successfully built the driver with a Linaro cross-compiler, and vc4c's test suite more or less works. I could also verify that computecpp_info can at least see the device.
Now I am facing a dumb issue: how to cross-compile for ARM from the SYCL branches. We already have a setup to cross-compile for ARM and ARMv8 on https://github.com/mozilla/tensorflow, so I blindly ran a configure step referencing the ARM version of ComputeCpp:
echo "" | TF_NEED_GCP=0 TF_NEED_GDR=0 TF_NEED_HDFS=0 TF_NEED_S3=0 TF_NEED_JEMALLOC=1 TF_ENABLE_XLA=0 TF_NEED_MKL=0 TF_NEED_VERBS=0 TF_NEED_MPI=0 TF_NEED_CUDA=0 TF_NEED_OPENCL_SYCL=1 TF_NEED_COMPUTECPP=1 COMPUTECPP_TOOLKIT_PATH=../ComputeCpp-CE-0.7.0-Ubuntu-14.04-ARM_32/ TF_USE_DOUBLE_SYCL=0 TF_USE_HALF_SYCL=0 ./configure
And then, I built it:
bazel build --config=sycl -s -j 96 --config=monolithic --config=rpi3 --config=rpi3_opt -c opt --copt=-fvisibility=hidden --copt=-DCTC_DISABLE_OMP --verbose_failures //native_client:libdeepspeech.so //native_client:deepspeech_utils
This does build an ARM library, linked against libComputeCpp.so. But at runtime, it does not seem to run OpenCL.
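One way to check whether the OpenCL runtime on the board exposes any platform at all, independently of TensorFlow and ComputeCpp, is to call clGetPlatformIDs through ctypes. This is only a minimal sketch: it assumes Python is available on the RPi3 and that ctypes.util.find_library can locate the OpenCL library, and opencl_platform_count is a hypothetical helper name, not part of any of the projects above.

```python
import ctypes
import ctypes.util


def opencl_platform_count():
    """Return the number of OpenCL platforms, or None if no OpenCL library loads.

    Hypothetical helper for illustration; assumes the library is discoverable
    under the name "OpenCL".
    """
    name = ctypes.util.find_library("OpenCL")
    if name is None:
        return None
    try:
        lib = ctypes.CDLL(name)
    except OSError:
        return None

    # cl_int clGetPlatformIDs(cl_uint num_entries,
    #                         cl_platform_id *platforms,
    #                         cl_uint *num_platforms)
    lib.clGetPlatformIDs.argtypes = [
        ctypes.c_uint,
        ctypes.c_void_p,
        ctypes.POINTER(ctypes.c_uint),
    ]
    lib.clGetPlatformIDs.restype = ctypes.c_int

    count = ctypes.c_uint(0)
    status = lib.clGetPlatformIDs(0, None, ctypes.byref(count))
    # CL_SUCCESS is 0; any error (e.g. no platform found) is reported as 0 platforms.
    return count.value if status == 0 else 0


if __name__ == "__main__":
    print("OpenCL platforms:", opencl_platform_count())
```

A result of None or 0 would point at the driver or loader setup rather than at the TensorFlow build.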
Now, I've also stumbled upon TF_SYCL_CROSS_TOOLCHAIN and TF_SYCL_CROSS_TOOLCHAIN_NAME, but they lack documentation, and trying to use them fails:
$ echo "" | TF_NEED_GCP=0 TF_NEED_GDR=0 TF_NEED_HDFS=0 TF_NEED_S3=0 TF_NEED_JEMALLOC=1 TF_ENABLE_XLA=0 TF_NEED_MKL=0 TF_NEED_VERBS=0 TF_NEED_MPI=0 TF_NEED_CUDA=0 TF_NEED_OPENCL_SYCL=1 TF_NEED_COMPUTECPP=1 COMPUTECPP_TOOLKIT_PATH=ComputeCpp-CE-0.7.0-Ubuntu-14.04-ARM_32/ TF_SYCL_CROSS_TOOLCHAIN=gcc-linaro-4.9.4-2017.01-x86_64_arm-linux-gnueabihf/bin/ TF_SYCL_CROSS_TOOLCHAIN_NAME=arm-linux-gnueabihf- TF_USE_DOUBLE_SYCL=0 TF_USE_HALF_SYCL=0 ./configure
$ bazel build --config=sycl -s -j 96 --config=monolithic -c opt --copt=-fvisibility=hidden --copt=-DCTC_DISABLE_OMP --verbose_failures //native_client:libdeepspeech.so //native_client:deepspeech_utils
[...]
ComputeCpp-CE-0.7.0-Ubuntu-14.04-ARM_32/bin/compute -ffunction-sections -fdata-sections -fPIE -fno-omit-frame-pointer -Wall -g0 -O2 -DNDEBUG -ffunction-sections -fdata-sections -DGEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK '-DDISABLE_SKINNY=1' '-fvisibility=hidden' -DCTC_DISABLE_OMP '-std=c++11' -fsycl-ih-last -sycl-driver -Xclang -cl-denorms-are-zero -Xclang -cl-fp32-correctly-rounded-divide-sqrt -Xclang -cl-mad-enable -sycl-target spir64 '-DTENSORFLOW_USE_SYCL=1' '-DEIGEN_USE_SYCL=1' '-DEIGEN_HAS_C99_MATH=1' '-DEIGEN_HAS_CXX11_MATH=1' -Wno-unused-variable -Wno-unused-const-variable '-DTENSORFLOW_SYCL_NO_HALF=1' '-DTENSORFLOW_SYCL_NO_DOUBLE=1' -MD -MF bazel-out/k8-opt/bin/external/double_conversion/_objs/double-conversion/external/double_conversion/double-conversion/diy-fp.pic.d '-frandom-seed=bazel-out/k8-opt/bin/external/double_conversion/_objs/double-conversion/external/double_conversion/double-conversion/diy-fp.pic.o' -fPIC -iquote external/double_conversion -iquote bazel-out/k8-opt/genfiles/external/double_conversion -iquote external/bazel_tools -iquote bazel-out/k8-opt/genfiles/external/bazel_tools -isystem external/double_conversion -isystem bazel-out/k8-opt/genfiles/external/double_conversion -isystem external/bazel_tools/tools/cpp/gcc3 -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -no-canonical-prefixes -c external/double_conversion/double-conversion/diy-fp.cc -o bazel-out/k8-opt/bin/external/double_conversion/_objs/double-conversion/external/double_conversion/double-conversion/diy-fp.pic.o)
ComputeCpp-CE-0.7.0-Ubuntu-14.04-ARM_32/bin/compute: 1 ComputeCpp-CE-0.7.0-Ubuntu-14.04-ARM_32/bin/compute: Syntax error: word unexpected (expecting ")")
which seems expected, since the ARM package ships ARM binaries that the x86_64 build host cannot execute:
ComputeCpp-CE-0.7.0-Ubuntu-14.04-ARM_32/bin/compute++: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3, for GNU/Linux 2.6.32, BuildID[sha1]=df02ab122bb64fc87724de838f7d5a45b8e3f1a5, not stripped
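The architecture mismatch can also be checked programmatically by reading the e_machine field of the ELF header, which is the same information that file prints above. This is a minimal sketch: elf_machine and elf_machine_of are hypothetical helper names, and the table covers only the architectures relevant here.

```python
import struct

# e_machine values defined by the ELF specification (subset).
ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def elf_machine(header: bytes) -> str:
    """Return the architecture named in an ELF header (first 20 bytes suffice)."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5): 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18, after e_ident (16 bytes) and e_type.
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown ({machine})")


def elf_machine_of(path: str) -> str:
    """Read the ELF header of the file at `path` and return its architecture."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20))
```

Calling elf_machine_of on the compute++ binary from the ARM package and on the host's own compiler shows whether the two architectures match. A compiler driver that bazel invokes during the build has to be runnable on the build host.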
So, what step am I missing to be able to cross-compile?