# Logo detection / identification on TV screens

Hi,

I want to do "TV channel" detection by identifying the "channel logo".

Suppose I need to identify two channels: A and B. I grab some sample screenshots and mark the regions of these logos in their respective screenshots as ROI_A and ROI_B.

I'm trying OpenCV's features2d framework (SurfFeatureDetector/SurfDescriptorExtractor and FlannBasedMatcher), but the results are wrong (details follow).

The detection task is currently performed as follows:

1. Given a screenshot of an unknown channel, the image SCENE.
2. For logo A:
• Get A's pre-calculated SURF data (keypoints, descriptors).
• Calculate the SURF data of the SCENE(ROI_A) region.
• Run FlannBasedMatcher on these descriptors to get some "match" data.
• Repeat this process for the other logos (here, B).
3. I have posted a result picture: the right side is the sample scene and the left side shows two different logos. The upper one is a random logo; the lower one is the same as the logo in the scene. Yet the upper one clearly has a lot of "matches", while the lower one has hardly any, i.e., the result is totally wrong.

So my questions are:

1. For my task, each logo's region is fixed and no rotation or scaling needs to be considered. Is the features2d framework appropriate for this, or is there another OpenCV feature that fits better?
2. How do I get a good logo image to use as the object? Currently I look for clean scenes (those with a plain-colored background) and use the logo from such a scene as the target object.
3. I thought I could rely on the number of "matches" to pick the correct channel (the logo with the most matches would be taken as the channel), but the experiment shows this is not reliable. Since I only want to know whether a logo matches, not where the object is, how should I do this?

I expect to identify about 70 channels, but currently, with only two sample channels, the result is misleading most of the time. Any help is appreciated, and thanks for your time.

Test result picture:



I would try LBP, but this is only my opinion.

( 2013-07-31 12:09:35 -0500 )

Doesn't need to be LBP, but indeed, a CascadeClassifier in conjunction with LBP or HOG would have been my first choice here as well...

( 2013-07-31 16:02:26 -0500 )

@zeroxia, I am also looking to implement logo identification from screenshots. I am grabbing the images with ffmpeg and need to identify the channels. I am completely new to this. How can I implement it? I am working on Linux.

( 2013-09-02 11:24:19 -0500 )

@suku.raghav: Please go through the OpenCV tutorials and try it yourself; for specific problems, ask a new question!

( 2013-09-13 05:48:55 -0500 )


Several steps could improve your attempt:

• Produce more keypoints, e.g. by lowering the SURF threshold or by choosing a different keypoint detector.
• Filter your matches with a cross-check or ratio check (you'll find both in this forum if you search for them).
• Finally, and most importantly: compute the transformation matrix between the matched points (i.e. a homography matrix via RANSAC) and check whether the transformation is valid at all. In your case there should be almost no rotation, scale or shear.
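The last point can be sketched in a much simpler form than a full RANSAC homography. Since the logo is not expected to rotate, scale or shear, all correct matches should agree on one common shift. The sketch below is plain Python with no OpenCV; it takes the median shift and counts the matches that agree with it. The function name, the `tol` value and the sample coordinates are assumptions made up for illustration, and this is a translation-only stand-in, not a homography estimate.

```python
from statistics import median

def translation_inliers(matches, tol=2.0):
    """Count matches that agree on one common (dx, dy) shift.

    matches: list of ((x_logo, y_logo), (x_scene, y_scene)) point pairs.
    tol: allowed deviation in pixels (hypothetical default).
    Returns (dx, dy, inlier_count).
    """
    if not matches:
        return 0.0, 0.0, 0
    # Displacement of every matched pair, logo -> scene.
    dxs = [scene[0] - logo[0] for logo, scene in matches]
    dys = [scene[1] - logo[1] for logo, scene in matches]
    # The median is robust against a minority of wrong matches.
    dx, dy = median(dxs), median(dys)
    inliers = sum(
        1
        for ddx, ddy in zip(dxs, dys)
        if abs(ddx - dx) <= tol and abs(ddy - dy) <= tol
    )
    return dx, dy, inliers

# Three matches share the shift (3, 1); the fourth one is an outlier.
sample = [
    ((10, 10), (13, 11)),
    ((20, 15), (23, 16)),
    ((30, 12), (33, 13)),
    ((5, 5), (40, 2)),
]
print(translation_inliers(sample))
```

A logo could then be accepted only if the inlier count is high enough and the shift is small, which is a stricter test than the raw number of matches.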

If this still doesn't help, you can also try simple template matching (see http://docs.opencv.org/modules/imgproc/doc/object_detection.html?highlight=matchtemplate#matchtemplate) or, as a more advanced option, train a classifier, e.g. a cascade classifier with a texture feature like HOG/LBP/HAAR. OpenCV provides all the necessary tools to create your training set and so on; see http://docs.opencv.org/doc/user_guide/ug_traincascade.html and http://docs.opencv.org/doc/tutorials/objdetect/cascade_classifier/cascade_classifier.html#cascade-classifier and the various topics in this forum.

Good Luck!


For the template matching method, the normalized "result matrix" only contains values between 0 and 1, and the "min" or "max" point is selected as the "match" whatever the template is, so I don't know how to detect a "negative" result. Can you provide some hints on this?

And thanks a lot for all above input. I will try your suggestions and update back.

( 2013-07-31 20:33:50 -0500 )

If you use CV_TM_CCOEFF_NORMED, then the closer your max value is to 1, the more probable it is that you found your match. So, for each of your logos, run matchTemplate and pick the one that gives the highest value. Of course, this may still give false positives due to different backgrounds.

( 2013-08-01 03:00:07 -0500 )
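To make the "pick the highest value" idea concrete, here is a minimal pure-Python sketch of the zero-mean normalized correlation that CV_TM_CCOEFF_NORMED is based on, applied to tiny grayscale grids. It is not OpenCV code. The function names, the toy images and the `threshold` of 0.8 are assumptions for illustration only. The threshold is one possible way to report a "negative" result when no logo scores high enough.

```python
from math import sqrt

def ccoeff_normed(patch, templ):
    """Zero-mean normalized correlation of two equal-sized 2D lists, in [-1, 1]."""
    p = [v for row in patch for v in row]
    t = [v for row in templ for v in row]
    mp, mt = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - mp) * (b - mt) for a, b in zip(p, t))
    den = sqrt(sum((a - mp) ** 2 for a in p) * sum((b - mt) ** 2 for b in t))
    return num / den if den else 0.0  # flat patch or template: no evidence

def best_score(region, templ):
    """Slide templ over region; return (max score, (x, y)) of the best position."""
    th, tw = len(templ), len(templ[0])
    best = (-1.0, (0, 0))
    for y in range(len(region) - th + 1):
        for x in range(len(region[0]) - tw + 1):
            patch = [row[x:x + tw] for row in region[y:y + th]]
            best = max(best, (ccoeff_normed(patch, templ), (x, y)))
    return best

def identify(region, templates, threshold=0.8):
    """Return (name or None, scores); None means no logo reached the threshold."""
    scores = {name: best_score(region, t)[0] for name, t in templates.items()}
    name = max(scores, key=scores.get)
    return (name if scores[name] >= threshold else None), scores

# Toy data: logo A is embedded at (1, 1); logo B is not present.
logos = {"A": [[0, 255], [255, 0]], "B": [[255, 255], [0, 0]]}
scene_roi = [
    [10, 10, 10, 10],
    [10, 0, 255, 10],
    [10, 255, 0, 10],
    [10, 10, 10, 10],
]
blank_roi = [[10] * 4 for _ in range(4)]

print(identify(scene_roi, logos)[0])  # logo with the highest score above threshold
print(identify(blank_roi, logos)[0])  # nothing matches a flat region
```

In a real setup the threshold would have to be tuned on screenshots of known channels, since backgrounds behind the logo can pull the score up or down.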


Hi all,

This is not an answer. I'm posting it as feedback and an update on my previous attempt, since the "comment" function cannot accommodate this much text.

As described above, the features2d experiment gave wrong results. I think the TV channel logo is too small to be detected by the features2d algorithms, so I resorted to template matching.

First I assumed the logo position was fixed, so the exact logo region was cropped out and matched against a prepared template. This gave promising identification results.

But the logo position turns out NOT to be fixed, i.e., there is some shift. I have to match the template against a larger area, and this does not guarantee a correct result.

Now I come back to this question and have to consider the features2d algorithms again, as they seem a more natural solution for my problem.

@GilLevi & @Guanta: Thank you for the suggestions. I have not come across the term "LBP" while digging through the OpenCV docs. Can you point me to a link? Something like the OpenCV tutorials would be great.

As for the other terms, "cross-check", "ratio-check", "RANSAC", and "transformation matrix", pardon my ignorance. I think I have to dig much deeper, but if you can point me to specific resources I'd much appreciate it.

Thank you again for your time and help.


Ratio check: calculate the distances from a descriptor to its two nearest neighbors and denote them d1 and d2 respectively. If d1/d2 < 0.8 (or some other threshold), then the descriptor and its nearest neighbor are a match.

( 2013-10-28 09:20:06 -0500 )
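The ratio check described in the comment above can be written down in a few lines. This is a minimal sketch in plain Python with brute-force nearest neighbors and Euclidean distance, not the OpenCV matcher API. The function name and the two-dimensional toy "descriptors" are made up for illustration (real SURF descriptors have 64 or 128 dimensions).

```python
from math import sqrt

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ratio_test_matches(query, train, ratio=0.8):
    """Return (query_idx, train_idx) pairs that pass the ratio check.

    A query descriptor is matched to its nearest train descriptor only if
    d1 < ratio * d2, where d1 and d2 are the distances to the nearest and
    second-nearest neighbors. Written as a product to avoid dividing by zero.
    """
    matches = []
    for qi, q in enumerate(query):
        ranked = sorted((euclidean(q, t), ti) for ti, t in enumerate(train))
        if len(ranked) < 2:
            continue  # need two neighbors to form a ratio
        (d1, ti), (d2, _) = ranked[0], ranked[1]
        if d1 < ratio * d2:
            matches.append((qi, ti))
    return matches

# Query 0 has one clearly closest neighbor and is kept.
# Query 1 is equally close to two train descriptors, so it is ambiguous and dropped.
query_desc = [(0.0, 0.0), (5.0, 5.0)]
train_desc = [(0.1, 0.0), (4.0, 4.0), (6.0, 6.0)]
print(ratio_test_matches(query_desc, train_desc))
```

The point of the check is that an ambiguous descriptor, one that is almost as close to a second candidate as to the first, says little about which logo is present, so it is better discarded than counted.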

Thank you. I see LBP is used with the "cascade classifier", which requires a training stage, and that takes a lot of time. I'm currently focusing on object detection techniques (like SURF), but I still suspect that the logos are too small to provide enough keypoints to justify a positive match (the screen captures are 720p or even 1080p, but they're scaled up from PAL/NTSC pictures, i.e., 576p or 480p). Anyway, I'm trying the additional optimizations you have mentioned, like cross-check and ratio-check, to refine the matching. @GilLevi thanks again for your fast reply and the clarification on "ratio check".

( 2013-10-29 21:56:17 -0500 )

I'm not sure keypoints and descriptors are the right approach. The other suggestions in this thread seem more appropriate for the problem, in my opinion.

( 2013-10-30 04:51:54 -0500 )


## Stats

Asked: 2013-07-31 05:56:39 -0500

Seen: 4,775 times

Last updated: Oct 28 '13