- NVIDIA Jetson Xavier NX
- OpenCV 4.4 built with CUDA support
- Python 3.6.9
I'm having trouble with the Python binding to the HostMem class and can't create PAGE_LOCKED or SHARED host memory. I'm not sure I'm using it correctly and haven't been able to find any examples. I've tried two different ways of creating page-locked host memory so that I can call cv2.cuda image processing methods. Here's what I've tried:
a_mem = cv2.cuda_HostMem(cv2.cuda.HostMem_PAGE_LOCKED)
a_mem.create(num_rows, num_cols, cv2.CV_8UC1)
a_host = a_mem.createMatHeader()
a_dev = cv2.cuda_GpuMat(a_host)
or
a_mem = cv2.cuda_HostMem(num_rows, num_cols, cv2.CV_8UC1, cv2.cuda.HostMem_PAGE_LOCKED)
a_host = a_mem.createMatHeader()
a_dev = cv2.cuda_GpuMat(a_host)
In both cases I get Mat and GpuMat references that I can successfully use to make CUDA calls:
a_dev.upload(a_host)
cv2.cuda.add(a_dev, b_dev, c_dev)
c_dev.download(c_host)
But when I use NVIDIA Visual Profiler to examine the uploads and downloads, it tells me that my host memory is Pageable and not Pinned, as I would expect for page-locked host memory. I have been able to use cv2.cuda.registerPageLocked() to create (much faster) Pinned host memory, so I believe what Visual Profiler is telling me. I've tried the same test with cv2.cuda.HostMem_SHARED and I get the same results.
Can someone please tell me if I'm creating the host memory and the Mat and GpuMat references incorrectly?
Also, when I do succeed in creating SHARED host memory, how do I get a GpuMat reference to it? I feel like I should be using HostMem's createGpuMatHeader() method for this but it doesn't have a Python binding.
Thanks for any help I can get. I've been stuck on this for three days.