running on GPU #158

Closed
elbaro opened this issue Apr 15, 2018 · 4 comments
@elbaro

elbaro commented Apr 15, 2018

My code doesn't run on GPU.

I installed libtensorflow with

 TF_TYPE="gpu"
 OS="linux"
 TARGET_DIRECTORY="/usr/local"
 curl -L \
   "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-${OS}-x86_64-1.7.0.tar.gz" |
   sudo tar -C $TARGET_DIRECTORY -xz

The Python installation uses the GPU, but my Rust code only prints

2018-04-15 19:12:56.959342: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA

Is there an extra step I need to configure?

@adamcrume
Contributor

That log line doesn't say anything about whether it's using GPU or not. It just says you're using a binary that wasn't compiled with SSE, AVX, or FMA, which your CPU supports. This message is normal if you're using a precompiled binary on a modern CPU.

Since you're installing the shared libraries manually, you might want to double-check that Rust is using the libraries you installed and not downloading its own copies, especially since it uses CPU by default (unless you enable the tensorflow_gpu feature). You can do this by running ldd $YOUR_BINARY and checking that libtensorflow.so points to your installed version.
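As a rough sketch of the ldd check described above: the helper below filters ldd output for the libtensorflow.so entry and prints the path it resolves to. The function name resolved_tensorflow_lib is made up for illustration, and the parsing assumes the usual "name => path (address)" layout of ldd output on Linux.

```shell
# Hypothetical helper: read `ldd` output on stdin and print the path that
# libtensorflow.so resolves to. Prints "not found" if the loader cannot
# resolve it, and prints nothing if the library is not listed at all.
resolved_tensorflow_lib() {
  awk '$1 ~ /^libtensorflow\.so/ && $2 == "=>" {
    print ($3 == "not" ? "not found" : $3)
  }'
}

# Usage ($YOUR_BINARY is a placeholder for your compiled binary):
#   ldd "$YOUR_BINARY" | resolved_tensorflow_lib
```

If the printed path is not the copy you installed (for example /usr/local/lib/libtensorflow.so), the binary is picking up a different library at load time.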

By the way, we now have a new mailing list for questions like this.

@elbaro
Author

elbaro commented Apr 16, 2018

Thanks. I will use the mailing list in the future.

The message complaining about optimization flags is not the problem. The problem is that it's the only output.
When the GPU is used, TensorFlow should print information about the GPU, like below.

2018-04-16 21:42:23.331449: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-04-16 21:42:23.331725: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties: 
name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.7465
pciBusID: 0000:01:00.0
totalMemory: 5.93GiB freeMemory: 5.33GiB
2018-04-16 21:42:23.331751: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-04-16 21:42:23.859552: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-04-16 21:42:23.859576: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917]      0 
2018-04-16 21:42:23.859581: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0:   N 
2018-04-16 21:42:23.859765: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5106 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-04-16 21:42:23.943425: E tensorflow/core/grappler/clusters/utils.cc:127] Not found: TF GPU device with id 0 was not registered

I don't think this log is specific to Python (the log format is the same as in the Rust version, so I guess the log comes from the C backend).
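A minimal sketch for checking a captured log for that GPU output, assuming the "Created TensorFlow device (... device:GPU:N ...)" line shown above is what a GPU-enabled run prints. The function name log_shows_gpu and the file name run.log are hypothetical.

```shell
# Hypothetical helper: succeed (exit status 0) if the log on stdin contains
# a "Created TensorFlow device" line that names a GPU device.
log_shows_gpu() {
  grep -q 'Created TensorFlow device (.*device:GPU:[0-9]'
}

# Usage: capture the run to a file first, then inspect it.
#   cargo test -- --nocapture > run.log 2>&1
#   if log_shows_gpu < run.log; then echo "GPU used"; else echo "no GPU lines"; fi
```

The absence of this line only suggests that no GPU device was created; it does not say why.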

I did a little experiment and found some strange behavior:

  1. ldd /usr/local/lib/libtensorflow.so correctly points to libcublas.so.9.1 and so on.
     (The downloaded one is for CUDA 9.0, so I rebuilt from source.)
  2. ldd target/release/my_binary-20cb1 points to /usr/local/lib/libtensorflow.so.

         cargo test                                   # uses CPU
         ./target/release/my_binary-20cb1             # manual run uses GPU
         LD_LIBRARY_PATH=/usr/local/lib cargo test    # uses CPU

  3. Remove /usr/local/lib/libtensorflow.so.

         cargo test                                   # uses CPU
         ./target/release/my_binary-20cb1             # complains about the missing .so
         LD_LIBRARY_PATH=/usr/local/lib cargo test    # uses CPU

Here is my lib.rs

#[cfg(test)]
mod tests {
	#[test]
	fn it_works() {
		// create session, graph, tensor, etc.
               ...
	}
}

So my test binary is built against /usr/local/lib/libtensorflow.so (the GPU version), but running it via cargo test doesn't use that library, even though the binary itself is built by cargo test.

@elbaro
Author

elbaro commented Apr 16, 2018

Correction:
the binary doesn't point to a specific libtensorflow.so; it uses whichever one is available at runtime.

The issue is that the tensorflow-sys crate ignores my LD_LIBRARY_PATH and downloads its own CPU libtensorflow.

❯ echo $LD_LIBRARY_PATH
:/usr/local/lib
❯ ls /usr/local/lib/libtensorflow*      
/usr/local/lib/libtensorflow.so  /usr/local/lib/libtensorflow_framework.so
❯ rm -rf ~/.cargo/registry/cache/github.com-1ecc6299db9ec823/tensorflow-*
❯ rm -rf ~/.cargo/registry/src/github.com-1ecc6299db9ec823/tensorflow-*
❯ rm -rf target                                                                     
❯ cargo test --release -- --nocapture
   ...
   Compiling tensorflow-sys v0.11.0
   ...
❯ ls ~/.cargo/registry/src/github.com-1ecc6299db9ec823/tensorflow-sys-0.11.0/target/
libtensorflow-cpu-linux-x86_64-1.6.0  libtensorflow-cpu-linux-x86_64-1.6.0.tar.gz
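To see every copy of the library that might be in play, something like the following sketch can help. It lists libtensorflow shared objects under whichever directories you pass in; the function name find_tensorflow_copies is hypothetical, and the directories in the usage line are only the ones mentioned in this thread.

```shell
# Hypothetical helper: list every libtensorflow shared object found under
# the given directories, sorted so the output is stable between runs.
find_tensorflow_copies() {
  find "$@" -name 'libtensorflow*.so*' 2>/dev/null | sort
}

# Usage:
#   find_tensorflow_copies target ~/.cargo/registry/src /usr/local/lib
```

More than one hit outside /usr/local/lib suggests the build fetched its own copy rather than using the installed one.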

@elbaro
Author

elbaro commented Apr 16, 2018

Solved:
tensorflow-sys looks for TensorFlow through pkg-config.

If you built from source and copied lib*.so, run the following:
tensorflow/tensorflow/c/generate-pc.sh --prefix=/usr/local --version=1.7.0
This generates tensorflow.pc. Copy it into a directory listed in your PKG_CONFIG_PATH.

To test:

extern crate pkg_config;
fn main() {
    println!("{:?}", pkg_config::find_library("tensorflow"));
}

If you downloaded the pre-built binary, you may have to generate tensorflow.pc yourself.
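For the pre-built case, here is a hand-written sketch of what such a file could look like. This is an assumption modeled on ordinary pkg-config files, not a copy of what generate-pc.sh emits; the function name write_tensorflow_pc and its variable names are hypothetical. The two -l flags match the two libraries listed under /usr/local/lib earlier in this thread.

```shell
# Hypothetical helper: write a minimal tensorflow.pc into a directory.
#   $1 = install prefix (e.g. /usr/local)
#   $2 = TensorFlow version (e.g. 1.7.0)
#   $3 = output directory, which should be on PKG_CONFIG_PATH
write_tensorflow_pc() {
  pc_prefix=$1
  pc_version=$2
  pc_outdir=$3
  # \${...} is escaped so pkg-config, not the shell, expands it later.
  cat > "$pc_outdir/tensorflow.pc" <<EOF
prefix=$pc_prefix
libdir=\${prefix}/lib
includedir=\${prefix}/include

Name: TensorFlow
Description: TensorFlow C library (hand-written sketch)
Version: $pc_version
Libs: -L\${libdir} -ltensorflow -ltensorflow_framework
Cflags: -I\${includedir}
EOF
}

# Usage (writing to a system directory may need sudo):
#   write_tensorflow_pc /usr/local 1.7.0 /usr/local/lib/pkgconfig
#   PKG_CONFIG_PATH=/usr/local/lib/pkgconfig pkg-config --libs tensorflow
```

If the pkg-config command prints the linker flags, the build script should be able to find the same entry, provided PKG_CONFIG_PATH is also set when cargo runs.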

ramon-garcia pushed a commit to ramon-garcia/tensorflow-rust that referenced this issue May 20, 2023