
ANDROID: Using CameraBridgeViewBase to capture camera frames. Only getting ~3 frames per second when detecting objects with classifier...

asked 2013-08-28 10:30:02 -0500

Max Muster

updated 2013-08-28 10:39:38 -0500

hello there again.

what i did so far:

in the link above you can see that i am trying to write an android app to detect and recognize buttons in a car. so far i have successfully trained two CascadeClassifiers, and they run smoothly in the desktop program i wrote to test them. the desktop program uses VideoCapture to grab frames from my webcam and then detects the objects in these frames. i get a smooth and fast video stream there, so everything is fine.

the problem:

now i've been trying to write that android app, so i looked at the source of the face-detection example that comes with opencv for android. i basically copied most of the code from the FdActivity class into my own project, made it load my cascade files and let it run. everything works, and it detects my buttons reasonably well, but i only get around 3 frames per second. i'm using the CameraBridgeViewBase just like in the example, and i'm doing all of the detection-related work in the onCameraFrame() method. if i comment out that code within the method, i get between 20 and 30 frames per second. i did not copy any code from the DetectionBasedTracker class, since that is native code and i would prefer not to use any of that. i also did not copy any code from the FdActivity class that is related to the DetectionBasedTracker class.


  • the frames received have a size of 800x480 pixels.
  • i tried to downsize them, but the lowest resolution my camera supports is 640x382 pixels, so i can't go any lower without the app saying "it seems that your device does not support camera or it is locked" on startup.
  • i have tried to downsize the Mat in which i store the frame in the onCameraFrame() method to an even smaller size. that works, but the Mat is returned at the end of the method and then triggers an exception: it has to be converted back to a Bitmap at that point (that's what i read from the exception), and that fails because the Bitmap has to be the size of the original frame (800x480).
  • therefore i tried to scale the Mat up to 800x480 again before returning it, but that of course gave me a blurred and ugly-looking picture.
  • i set the minimum size within the detection method to 60x60 pixels to prevent it from detecting too many small objects.

so my question:

how is it possible that the face-detection app runs so smoothly on my device, while my own app only gets around 3 frames per second? @StevenPuttemans since you helped me so much with my last question, maybe you have an idea?

btw, i will add my code here when i can access it again, which will be in about 12 hours.



Can you please add your optimized code?

jaredsburrows ( 2014-01-06 09:36:41 -0500 )

1 answer


answered 2013-08-28 11:40:05 -0500

Some suggestions for limiting the enormous search space, i.e. the number of windows in your image that each have to pass through the classifier:

  1. Use the lowest resolution your application allows, 640x382 in your case.
  2. Like you said, downsample the matrix to half its size in both dimensions.
  3. Use a detection range: just as there is a minSize, there is also a maxSize parameter. Take about 10 shots, use your computer to determine the average button size in your setup, and then use a range from 0.85 x to 1.15 x that average size.
  4. If you can, first segment the image and discard all areas that you are certain do not contain the object. Then create a binary mask and only run the detector on windows that lie inside the smallest bounding box of the remaining candidate blobs.

These tactics will drastically reduce your search space and give you a speed advantage. However, reaching 20-30 frames per second will be hard. I think 15 fps should be possible, which is almost real time.
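As a rough illustration of suggestion 3, the size range could be derived like this. It is only a sketch in plain Java: SizeRange and fromSamples are hypothetical names (not OpenCV API), and the sample widths are made up. In the app, the resulting values would be passed as the minSize and maxSize arguments of CascadeClassifier.detectMultiScale.

```java
import java.util.Arrays;

// Hypothetical helper, not part of OpenCV: derives a min/max detection size
// (in pixels) from a handful of measured button sizes.
final class SizeRange {
    final int min;
    final int max;

    SizeRange(int min, int max) {
        this.min = min;
        this.max = max;
    }

    // tolerance = 0.15 gives the 0.85 x .. 1.15 x band around the average size.
    static SizeRange fromSamples(int[] measuredSizes, double tolerance) {
        if (measuredSizes.length == 0) {
            throw new IllegalArgumentException("need at least one measured size");
        }
        double mean = Arrays.stream(measuredSizes).average().getAsDouble();
        int min = (int) Math.round(mean * (1.0 - tolerance));
        int max = (int) Math.round(mean * (1.0 + tolerance));
        return new SizeRange(min, max);
    }

    public static void main(String[] args) {
        // Made-up button widths measured in 10 test shots.
        int[] samples = {96, 100, 104, 98, 102, 100, 97, 103, 99, 101};
        SizeRange range = fromSamples(samples, 0.15);
        System.out.println("minSize=" + range.min + " maxSize=" + range.max);
    }
}
```

If the frame is downsampled before detection, the measured sizes have to be taken at (or scaled to) that same resolution.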




@StevenPuttemans thanks, i'm glad you replied to me that fast :) alright, step 1 was done already. for step 2, if i downsample the matrix to half its size in both dimensions, then i have to upsample it again before i return it at the end of the method, otherwise i will get that exception again, since the bitmap to be created is expected to be the same size as the input frames coming from the CameraBridgeViewBase. or am i getting that wrong? for step 3, alright, so i should set a maxSize as well, i will do that. for step 4, sounds fair, i will be expecting the user to center the button in the middle of the picture, so i guess i could ignore about 50% of the picture before doing the detection. i will try your suggestions as soon as i can access my code again, thanks :)

Max Muster ( 2013-08-28 13:25:18 -0500 )

one more question to you though: you know the code from the facedetection-example i am talking about, right? do you see the reason why their code is getting me more frames per second even without doing any downsampling or segmentation / area removing? just curious how they do it. thanks in advance :)

Max Muster ( 2013-08-28 13:30:04 -0500 )

OK, to answer your new questions. 1) Yes, if you want to output the detections on a bitmap, you need to resize your Mat again. Alternatively, you could use the downsampled image for detection, calculate the size of the bounding boxes in the original image from the detected ones, and use native code to draw directly on the bitmap. I guess that should be possible. 2) I have no experience with Android programming, only with PC programming, but I think they do the same as suggested. It is a common approach to speed up processing.

StevenPuttemans ( 2013-08-28 15:25:56 -0500 )
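The second option from the comment above (detect on the downsampled image, then scale the boxes back to the original frame) comes down to a small coordinate mapping. The sketch below is plain Java so it runs without OpenCV: Box is a hypothetical stand-in for org.opencv.core.Rect, and DetectionScaling and toFullResolution are made-up names. It assumes the small image was produced by shrinking the frame by the same factor in both dimensions.

```java
// Hypothetical stand-in for org.opencv.core.Rect, so the sketch runs without OpenCV.
final class Box {
    final int x;
    final int y;
    final int width;
    final int height;

    Box(int x, int y, int width, int height) {
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }
}

final class DetectionScaling {
    // Maps a box found on a frame that was shrunk by `scale` (e.g. 0.5)
    // back to the coordinate system of the full-size frame.
    static Box toFullResolution(Box small, double scale) {
        if (scale <= 0.0 || scale > 1.0) {
            throw new IllegalArgumentException("scale must be in (0, 1]");
        }
        double inv = 1.0 / scale;
        return new Box(
                (int) Math.round(small.x * inv),
                (int) Math.round(small.y * inv),
                (int) Math.round(small.width * inv),
                (int) Math.round(small.height * inv));
    }

    public static void main(String[] args) {
        // Made-up detection on a half-size frame.
        Box onSmallFrame = new Box(100, 60, 40, 40);
        Box onFullFrame = toFullResolution(onSmallFrame, 0.5);
        System.out.println(onFullFrame.x + "," + onFullFrame.y + ","
                + onFullFrame.width + "," + onFullFrame.height);
    }
}
```

With this mapping the downsampled Mat is only used for detection. The boxes are drawn on the untouched full-size frame, which can then be returned from onCameraFrame() without upsampling, so the size mismatch and the blur should both be avoided.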

@StevenPuttemans thanks :) i now set an upper bound on the resolution that can be used for the frames, which prevents overly large images. i also create a submat of the center of every frame i get, so that more than 50% of the picture is not even considered for detection. this gave me a boost up to about 15 fps on my htc desire hd. on newer devices, like a google nexus, it even gives more than 20 fps. so the conclusion is that lowering the frame size and cropping the region which will be used for detection helped me solve the problem so far :) thank you very much for your help.

Max Muster ( 2013-08-29 02:10:51 -0500 )
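The centered crop described in the comment above might be computed along these lines. This is a sketch, not the code actually used in the app: Region is a hypothetical stand-in for org.opencv.core.Rect, CenterRoi and its methods are made-up names, and the 0.7 keep factor is an assumption (keeping 70% of each dimension leaves about 49% of the pixels, so slightly more than half the picture is skipped). In the app, the region would be handed to Mat.submat() before detection.

```java
// Hypothetical stand-in for org.opencv.core.Rect, so the sketch runs without OpenCV.
final class Region {
    final int x;
    final int y;
    final int width;
    final int height;

    Region(int x, int y, int width, int height) {
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }
}

final class CenterRoi {
    // Returns a centered region that keeps the fraction `keep` of each frame dimension.
    static Region centered(int frameWidth, int frameHeight, double keep) {
        if (keep <= 0.0 || keep > 1.0) {
            throw new IllegalArgumentException("keep must be in (0, 1]");
        }
        int w = (int) Math.round(frameWidth * keep);
        int h = (int) Math.round(frameHeight * keep);
        return new Region((frameWidth - w) / 2, (frameHeight - h) / 2, w, h);
    }

    // Detections inside the crop are relative to its top-left corner;
    // this shifts them back into full-frame coordinates for drawing.
    static Region toFrame(Region detectionInRoi, Region roi) {
        return new Region(detectionInRoi.x + roi.x, detectionInRoi.y + roi.y,
                detectionInRoi.width, detectionInRoi.height);
    }

    public static void main(String[] args) {
        Region roi = centered(800, 480, 0.7);
        Region hit = toFrame(new Region(10, 20, 60, 60), roi); // made-up detection
        System.out.println("roi=" + roi.x + "," + roi.y + "," + roi.width + "," + roi.height);
        System.out.println("hit=" + hit.x + "," + hit.y);
    }
}
```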

Actually, if you think about the concept behind a cascade classifier, which uses the principle of a scale pyramid, like illustrated here, then it is quite normal that reducing the scale by half has a drastic influence ;)

StevenPuttemans ( 2013-08-29 02:53:02 -0500 )
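To get a feeling for the scale-pyramid argument, the number of windows a sliding-window detector has to evaluate can be estimated with a simplified model. The sketch below is an approximation only: it assumes a fixed 24x24 detection window, a step of one pixel and a scale factor of 1.1, and PyramidWindowCount and countWindows are made-up names. The real detectMultiScale uses the window size of the trained cascade and its own stepping, so absolute numbers will differ.

```java
final class PyramidWindowCount {
    // Counts window positions over all pyramid levels: the image is shrunk by
    // `scaleFactor` per level until the fixed-size window no longer fits.
    static long countWindows(int width, int height, int window, double scaleFactor) {
        if (scaleFactor <= 1.0) {
            throw new IllegalArgumentException("scaleFactor must be > 1");
        }
        long total = 0;
        for (double s = 1.0; ; s *= scaleFactor) {
            int w = (int) (width / s);
            int h = (int) (height / s);
            if (w < window || h < window) {
                break; // window does not fit into this level any more
            }
            total += (long) (w - window + 1) * (h - window + 1);
        }
        return total;
    }

    public static void main(String[] args) {
        long full = countWindows(800, 480, 24, 1.1);
        long half = countWindows(400, 240, 24, 1.1);
        System.out.println("800x480: " + full + " windows");
        System.out.println("400x240: " + half + " windows");
        System.out.println("ratio:   " + (double) full / half);
    }
}
```

In this model, halving both dimensions removes the largest pyramid levels entirely and shrinks every remaining one, so the window count drops by more than a factor of four.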
