Description
TensorFlow seems to be leaking memory, but I have not yet figured out where it is happening. It's not leaking Julia objects, because `whos()` can't account for the memory usage. A graph of the free memory in my system is shown here. You can see my system starting to swap out to disk around 20:00; I killed the Julia process around 20:44.
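For what it's worth, the growth is easier to quantify than by eyeballing free memory if you log the process's resident set size over time. A minimal sketch (the PID argument and the 10-second sampling interval are placeholders, not from my actual setup):

```shell
#!/bin/sh
# Sample the resident set size (KB) of a process every 10 seconds and
# append "<unix-timestamp> <rss-kb>" lines to rss.log for later plotting.
# Pass the target PID as the first argument (defaults to this shell's PID).
PID=${1:-$$}
while kill -0 "$PID" 2>/dev/null; do
    echo "$(date +%s) $(ps -o rss= -p "$PID")"
    sleep 10
done >> rss.log
```

A steadily climbing RSS with a flat Julia heap (per `whos()`) is what points at a leak below the Julia GC.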
My best guess is that we're leaking memory inside the TensorFlow C library within my train loop. I've tried reproducing this with a smaller example like `examples/logistic.jl`, but of course it doesn't happen there. Using `gdb` to look at the places where `mmap()` is being called, they are all either within Julia's array allocation routines during `feed_dict` construction, or within Eigen inside of TensorFlow.
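For context, this is roughly the `gdb` session I used to find the `mmap()` call sites. The PID and the size threshold are placeholders; the condition assumes x86-64 Linux, where the second `mmap` argument (the length) arrives in `$rsi`:

```
(gdb) attach 12345
(gdb) break mmap if $rsi > 16*1024*1024
(gdb) commands
> bt 15
> continue
> end
(gdb) continue
```

The backtraces from the large mappings are what showed only Julia's array allocator and Eigen.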
I would post my code, but there's so much of it that it would be unfair to you. Do you have any general debugging tips for tracking something like this down?