Parallelizing GPU processing of multiple images

For each frame of a video, I apply some transformations and then write the frame out to an image file. I am using OpenCV's CUDA API for this, so it looks something like this, in a loop:

# read frame from video
ok, frame = video.read()

# upload frame to GPU
gpu_frame = cv2.cuda_GpuMat(frame)

# create a CUDA stream
stream = cv2.cuda_Stream()

# do things to the frame (each operation is given the stream)
# ...

# queue the download to CPU memory on the same stream
result = gpu_frame.download(stream=stream)

# wait for the stream to complete (CPU memory available)
stream.waitForCompletion()

# save frame out to disk
# ...

Since I send a single frame to the GPU, and then wait for its completion at the end of the loop, I can only process one frame at a time.

What I would like to do is send multiple frames (in multiple streams) to the GPU to be processed at the same time, then save them to disk as the work gets finished.
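
To make the idea concrete, here is a rough sketch of the pattern I have in mind, not a tested solution. The helper names `submit_cuda`, `finish_cuda` and `pipeline`, and the `depth` parameter, are made up for illustration. I am also assuming that `upload` and `download` accept a stream as shown, and I understand that real overlap of copies and kernels may additionally need page-locked host memory, which this sketch does not handle.

```python
from collections import deque


def submit_cuda(frame):
    """Queue upload, processing and download of one frame on its own stream."""
    import cv2  # imported here so the generic helper below runs without OpenCV

    stream = cv2.cuda_Stream()
    gpu = cv2.cuda_GpuMat()
    gpu.upload(frame, stream)
    # ... per-frame CUDA operations would go here, each given `stream` ...
    host = gpu.download(stream=stream)  # contents not valid until the stream completes
    return stream, host


def finish_cuda(job):
    """Block until the frame's stream is done, then return the CPU-side result."""
    stream, host = job
    stream.waitForCompletion()
    return host


def pipeline(frames, submit, finish, depth=4):
    """Keep up to `depth` frames in flight and yield results in submission order."""
    in_flight = deque()
    for frame in frames:
        if len(in_flight) == depth:
            # The oldest frame has had the most time to finish, so wait on it first.
            yield finish(in_flight.popleft())
        in_flight.append(submit(frame))
    # Drain whatever is still queued once the input runs out.
    while in_flight:
        yield finish(in_flight.popleft())
```

The loop would then iterate over `pipeline(frames, submit_cuda, finish_cuda)` and write each result to disk as it comes out, so only the oldest frame is ever waited on while the newer ones are still being processed.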

What is the best way to do this?
