Segfault on Python cv2 calls from Caffe in a Docker container
I have a thorny issue: when I call cv2 functions from within Caffe in a Docker container, I hit the segfault shown in the trace below. It does not occur when running outside a container, even though the NVIDIA container is supposed to be trouble-free. The segfault happens on cv2 functions such as cv2.flip or cv2.getRotationMatrix2D. However, if I call those same functions from a Python command line inside the container, everything works.

I am not an expert at reading stack traces like these, so any pointers toward a solution would be appreciated.
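One thing I can compare between the working case (Python prompt) and the failing case (Caffe Python layer) is which OpenCV shared objects are actually mapped into the process. The trace below shows cv2.so coming from /opencv/build/lib while libopencv_core.so.2.4.8 comes from /usr/lib; I do not know whether that matters. Below is a minimal sketch of such a check, assuming Linux's /proc/self/maps is readable in the container. The helper names `loaded_libs` and `report` are my own, not part of cv2 or Caffe.

```python
from __future__ import print_function  # keeps print() working on Python 2.7


def loaded_libs(maps_text, needle="opencv"):
    """Return the sorted, unique mapped file paths that contain `needle`.

    `maps_text` is the content of /proc/<pid>/maps, whose lines look like:
    address perms offset dev inode pathname
    """
    paths = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        # Anonymous mappings have no pathname field, so they are skipped.
        if len(fields) == 6 and needle in fields[5]:
            paths.add(fields[5].strip())
    return sorted(paths)


def report(needle="opencv"):
    """Print every mapped file of the current process that matches `needle`."""
    with open("/proc/self/maps") as f:
        for path in loaded_libs(f.read(), needle):
            print(path)
```

The idea would be to call `report()` right after `import cv2` at the Python prompt, and again from the Python layer's `reshape` method just before the cv2 call, and then compare the two lists. If the paths or versions differ between the two runs, that would at least be a lead.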
==15165== Syscall param msync(start) points to uninitialised byte(s)
==15165== at 0x721892D: ??? (syscall-template.S:81)
==15165== by 0x121D7123: ??? (in /usr/lib/x86_64-linux-gnu/libunwind.so.8.0.1)
==15165== by 0x121D9EF6: ??? (in /usr/lib/x86_64-linux-gnu/libunwind.so.8.0.1)
==15165== by 0x121DB151: ??? (in /usr/lib/x86_64-linux-gnu/libunwind.so.8.0.1)
==15165== by 0x121DB4E8: ??? (in /usr/lib/x86_64-linux-gnu/libunwind.so.8.0.1)
==15165== by 0x121D7A30: _ULx86_64_step (in /usr/lib/x86_64-linux-gnu/libunwind.so.8.0.1)
==15165== by 0x5ABA442: google::GetStackTrace(void**, int, int) (in /usr/lib/x86_64-linux-gnu/libglog.so.0.0.0)
==15165== by 0x5ABFB31: ??? (in /usr/lib/x86_64-linux-gnu/libglog.so.0.0.0)
==15165== by 0x715ACAF: ??? (in /lib/x86_64-linux-gnu/libc-2.19.so)
==15165== by 0x11220C45: cv::flip(cv::_InputArray const&, cv::_OutputArray const&, int) (in /usr/lib/x86_64-linux-gnu/libopencv_core.so.2.4.8)
==15165== by 0x82BDEDC3: pyopencv_cv_flip(_object*, _object*, _object*) (in /opencv/build/lib/cv2.so)
==15165== by 0x65E60D3: PyEval_EvalFrameEx (in /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0)
==15165== Address 0xffeffd000 is on thread 1's stack
==15165== in frame #6, created by google::GetStackTrace(void**, int, int) (???:)
==15165==
*** SIGSEGV (@0x1010000) received by PID 15165 (TID 0x406abc0) from PID 16842752; stack trace: ***
@ 0x715acb0 (unknown)
@ 0x11220c46 (unknown)
@ 0x82bdedc4 pyopencv_cv_flip()
@ 0x65e60d4 (unknown)
@ 0x65e6059 (unknown)
@ 0x65e754d (unknown)
@ 0x65e5dd8 (unknown)
@ 0x65e6059 (unknown)
@ 0x65e754d (unknown)
@ 0x661c6d0 (unknown)
@ 0x6588d43 (unknown)
@ 0x65147bd (unknown)
@ 0x6588d43 (unknown)
@ 0x6601577 (unknown)
@ 0x6544617 (unknown)
@ 0x715a34d5 caffe::PythonLayer<>::Reshape()
@ 0x50d24b5 caffe::Net<>::Init()
@ 0x50d3345 caffe::Net<>::Net()
@ 0x508066a caffe::Solver<>::InitTrainNet()
@ 0x508187c caffe::Solver<>::Init()
@ 0x5081baa caffe::Solver<>::Solver()
@ 0x5103053 caffe::Creator_SGDSolver<>()
@ 0x411fc6 caffe::SolverRegistry<>::CreateSolver()
@ 0x40af42 train()
@ 0x40897c main
@ 0x7145f45 (unknown)
@ 0x409283 (unknown)
@ 0x0 (unknown)
==15165==
==15165== Process terminating with default action of signal 11 (SIGSEGV)
==15165== at 0x11220C46: cv::flip(cv::_InputArray const&, cv::_OutputArray const&, int) (in /usr/lib/x86_64-linux-gnu/libopencv_core.so.2.4.8)
==15165== by 0x82BDEDC3: pyopencv_cv_flip(_object*, _object*, _object*) (in /opencv/build/lib/cv2.so)
==15165== by 0x65E60D3: PyEval_EvalFrameEx (in /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0)
==15165== by 0x65E6058: PyEval_EvalFrameEx (in /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0)
==15165== by 0x65E754C: PyEval_EvalCodeEx (in /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0)
==15165== by 0x65E5DD7: PyEval_EvalFrameEx (in /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0)
==15165== by 0x65E6058: PyEval_EvalFrameEx (in /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0)
==15165== by 0x65E754C: PyEval_EvalCodeEx (in /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0)
==15165== by 0x661C6CF: ??? (in /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0)
==15165== by 0x6588D42: PyObject_Call (in /usr/lib ...