[Python] VideoCapture .read() is way too CPU intensive

asked 2018-07-11 22:56:36 -0500

schmidtbag

updated 2018-07-12 22:15:18 -0500

So I'm trying to process 720p@30FPS video live from a webcam, on a 1.7GHz quad-core ARM CPU. Right now, the only thing holding me back is the .read() call. In my loop, I have nothing but "camera.read()" (not even assigning the result to a variable) and some time.time() counters to tell me the framerate. Nothing else is happening, and yet this maxes out a CPU core. Best case scenario, I get 21FPS.

This doesn't make sense to me - how is it so expensive to grab a frame from the camera and literally do nothing with it? Can anything be done about this?

EDIT: I just came to realize that this problem doesn't appear to be specific to VideoCapture or read() - just using imread() on a 720p JPEG is equally slow. This leads me to believe the slowdown is specific to the decoding process.

I tried a little experimentation: I took the same JPEG image and converted it to BMP and PNG, using the same imread() command in my loop. The PNG dropped to around 14FPS, while the BMP was in the mid 60s.

My webcam only supports MJPG for 720p@30FPS. Is there any way at all I can speed up this process? I tried using PIL along with the multiprocessing library, but they only slow things down further.
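One workaround sometimes used for this situation is to move the read()/decode work onto a background thread, so the processing loop only picks up the most recent decoded frame. This doesn't reduce the total CPU cost of decoding, but on a quad-core it keeps that cost off the processing thread. The class below is a sketch; the name ThreadedCapture and the idea of wrapping an already-opened capture object are my own choices:

```python
import threading

class ThreadedCapture:
    """Continuously call read() on a capture object in a background
    thread and keep only the latest frame."""

    def __init__(self, cap):
        # cap is an already-opened capture-like object, e.g.
        # cv2.VideoCapture(0)
        self.cap = cap
        self.lock = threading.Lock()
        self.frame = None
        self.running = True
        self.thread = threading.Thread(target=self._loop, daemon=True)
        self.thread.start()

    def _loop(self):
        while self.running:
            ok, frame = self.cap.read()
            if ok:
                with self.lock:
                    self.frame = frame

    def read(self):
        # Returns the most recent frame, or None if nothing grabbed yet.
        with self.lock:
            return self.frame

    def stop(self):
        self.running = False
        self.thread.join()
        self.cap.release()
```

Usage would look like `tc = ThreadedCapture(cv2.VideoCapture(0))` followed by `tc.read()` in the processing loop. Note that this drops frames when the consumer is slower than the camera, which may or may not be acceptable.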



Please use time.clock() or cv2.getTickCount(), NOT time.time(), for this.

berak ( 2018-07-12 01:28:08 -0500 )

Frankly, the method of measuring framerate is beside the point. The CPU is maxed out at 100% without doing any processing, and I know for a fact I'm not getting all of the frames. If I run the same code on a much faster x86 PC, I get the ~30FPS I'm looking for. But even on that faster x86 PC, just calling read() is extremely CPU intensive. This is the problem I want to address.

schmidtbag ( 2018-07-12 07:36:34 -0500 )

Here's the thing. You can get raw images (such as BMP) or encoded images (such as JPEG, PNG, or MJPG). Raw images take no processing to read. The time for the BMP is just the time to read it from disk plus a tiny bit of overhead.

The encoded images, meanwhile, take time to be decoded. It's entirely possible that it does in fact take that long to decode. If you got a webcam that didn't output MJPG and instead gave raw frames, perhaps that would work better.

Are you just initializing and calling read, or do you have settings such that it's resizing, altering color spaces, etc.?

Tetragramm ( 2018-07-13 17:38:08 -0500 )

Yes, I discovered last night how taxing the decoding of JPEGs is.

The webcam does have raw image support, but only at 640x480 if I want to retain 30FPS. Beyond that, the frame rate drops. I don't think most webcams support 720p or higher at 30FPS with raw data.

I am strictly initializing (outside of the loop) and calling read (inside the loop). During the loop, I'm not doing any kind of alterations.

schmidtbag ( 2018-07-13 18:50:00 -0500 )