What algorithm would be optimal in this situation?

asked 2015-05-21 17:27:14 -0500

archer71

updated 2015-05-22 08:37:52 -0500

Please suggest which algorithms or tutorials I could use to achieve the following steps:

  1. Extract features from a (one) high-quality image on the web.
  2. Transform them into a .xml or .dat file.
  3. Port the file to ARM, iOS or Android.
  4. Obtain video frames.
  5. Apply image recognition, feature extraction etc. to the detected object.
  6. Get the coordinates of the objects on every frame scanned.

Out of scope of OpenCV, but maybe someone can help:

  7. Render a video on top of the coordinates.

1 answer

answered 2015-05-22 06:49:07 -0500

thdrksdfthmn

You can start like this:

  1. Detect features (possibly on the GPU), then extract descriptors; the choice of descriptors (and of features) depends on your application (what you are trying to do). Here you have some explanations of features and descriptors.
  2. Use FileStorage for saving to .xml (and for loading from .xml too).
  3. XML should be portable, so there is no problem using it in different environments.
  4. For reading video frames you can use VideoCapture.
  5. To detect the object you can take inspiration from this example. For this you also need to match the descriptors.
  6. You might also use tracking, so that you search only a small region instead of the whole frame every time.
  7. For playing a video inside another I have no example, but you can use two VideoCapture objects and put the frame from one into the detected area of the frame from the other. For deforming the inner frame you can use warpAffine (or whatever other geometric transformation you need).

Then you can save the new video or play it directly... Hope this helps. You can ask again once you have started something, and say what doesn't work.


From what I have understood from reading the documentation, I should use the same algorithm for extracting the features, creating the descriptors and then detecting the object. So my first question is: which algorithm is best suited to this use case? I will only be using the iPhone 5S and newer phones, so I am expecting very good FPS, but I would want the same algorithm to be used throughout because I understood that gives the best results. So what do you recommend? SURF/SIFT/FAST/ORB/some deep learning?

archer71 (2015-05-22 07:59:43 -0500)

I have tried some of these, and it seems that the more information the descriptors carry, the slower they are; so I would suggest you start with SIFT and SURF, and if the FPS is not enough, then try ORB. FAST has no descriptor extractor... Also, if you are using C++, you can use the GPU as well, and it will be much faster!

thdrksdfthmn (2015-05-22 09:22:56 -0500)
