ML OpenCV Question [closed]

asked 2019-10-24 05:29:22 -0600

holger

updated 2019-10-25 09:33:52 -0600

Hello,

I hope my question is not too off topic, but it is at least a bit OpenCV related. In my current task I need to determine which label belongs to which input field. I have the bounding boxes of all text and input elements, along with their types (text box or input box, and for inputs the kind: radio, input, etc.). (I use the OpenCV dnn module, along with other sources, to get this information.)

So my idea is to use linear regression (trained on some collected data) to predict the top-left (tl) and bottom-right (br) points of the label for a given input element.

Now my question is: does this approach make sense? Linear regression only outputs a single variable, from what I remember. Do I need to train multiple "regressors" (two for tl and two for br; that sounds a bit strange to me)? Is this solvable as a classification problem (I would say no at this point, but maybe I am wrong)? Can I do linear regression with OpenCV (it is more a computer vision library, but it has that nice dnn module, even with CUDA support now on master)?

I read that most neural networks use regressors for finding the correct bounding box (instead of using a sliding window approach (no one does this) or anchor boxes (used in YOLO, for example)).

Thank you very much, Holger


Closed for the following reason: the question is answered, right answer was accepted by holger
close date 2019-10-25 07:51:01.562652

Comments

to determine the top left(tl) and bottom right(br) points for a label for a given input element.

you're not alone with that idea

Can i do linear regression with opencv

Sure, using some ml class. The dnn code unfortunately does not cover any network training (you'd need Caffe, TensorFlow, Torch or the like).

berak ( 2019-10-24 05:41:57 -0600 )
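
On the "multiple regressors" question: a minimal sketch, assuming NumPy is available (it is not used anywhere in the thread), of how one least-squares fit can produce all four label coordinates at once. The weight matrix has one column per output, so nothing has to be trained separately. The function names and the sample boxes are made up for illustration, and the synthetic labels follow an exactly linear rule, which real form layouts will not.

```python
import numpy as np

def fit_box_regressor(X, Y):
    """Least-squares fit of Y ~ [X, 1] @ W.

    X: (n, d) input-box features, Y: (n, 4) label boxes (x1, y1, x2, y2).
    Returns W with shape (d + 1, 4): one column of weights per output.
    """
    X = np.asarray(X, dtype=np.float64)
    Y = np.asarray(Y, dtype=np.float64)
    A = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    W, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return W

def predict_box(W, x):
    """Predict the four label coordinates for one input box."""
    x = np.asarray(x, dtype=np.float64)
    return np.append(x, 1.0) @ W

# Synthetic training data (made up): input boxes as (x1, y1, x2, y2).
inputs = np.array([
    [200,  50, 400,  80],
    [200, 120, 420, 150],
    [300, 200, 500, 230],
    [250, 300, 380, 340],
    [180, 400, 300, 430],
    [220, 480, 460, 520],
])
# Assumed layout rule: the label is 100 px wide, ends 20 px left of the
# input, and shares the input's vertical extent.
labels = np.column_stack([
    inputs[:, 0] - 120, inputs[:, 1], inputs[:, 0] - 20, inputs[:, 3],
])

W = fit_box_regressor(inputs, labels)
predicted = predict_box(W, [320, 600, 520, 630])
```

Here `W` has shape (5, 4): four input coordinates plus a bias, mapped to four outputs. The same kind of least-squares system can also be passed to `cv2.solve` with the `cv2.DECOMP_SVD` flag, if staying inside OpenCV matters.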

Hmm, I just looked into this. It's not really what I need, I think. I need to determine a relationship between two "boxes": is box A (text) related to box B (input element)?

My idea is to let an ML algorithm determine this relationship for me. Hmm, while writing I had this idea: how about I introduce an arbitrary id feature for each of the boxes, and the task for the regressor is to give me the id of the related box, which will be a single number? My only concern here is that the algorithm may learn some pattern on the id feature instead of the bbox and type information. But if I keep the id unique and random...

I should maybe just try this out. (I will try it in OpenCV once I have trained my model.) Thank you again for making me think a bit :-)

Greetings, Holger

holger ( 2019-10-24 06:02:57 -0600 )
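
The "is box A related to box B" framing can be sketched without any id feature: compute features for every (label, input) pair and keep the best-scoring label per input. This is only an illustration. The scoring function and its 200 px penalty are hypothetical hand-set values standing in for a trained binary classifier, and the assumption that labels sit to the left of or above their input is mine, not from the thread.

```python
import math

def center(box):
    """Center point of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def pair_features(label_box, input_box):
    """Geometric features describing one (label, input) candidate pair."""
    lx, ly = center(label_box)
    ix, iy = center(input_box)
    dx, dy = ix - lx, iy - ly
    return {"dx": dx, "dy": dy, "dist": math.hypot(dx, dy)}

def pair_score(features):
    """Hypothetical hand-set score; higher means 'more likely related'.

    A trained classifier over the same features would replace this.
    """
    penalty = 0.0
    if features["dx"] < 0:   # label center is to the right of the input
        penalty += 200.0
    if features["dy"] < 0:   # label center is below the input
        penalty += 200.0
    return -(features["dist"] + penalty)

def match_labels(label_boxes, input_boxes):
    """Map each input index to the index of its best-scoring label."""
    matches = {}
    for i, input_box in enumerate(input_boxes):
        scores = [pair_score(pair_features(label_box, input_box))
                  for label_box in label_boxes]
        matches[i] = max(range(len(scores)), key=scores.__getitem__)
    return matches

# Made-up two-row form: each label sits directly left of its input.
label_boxes = [(20, 50, 120, 80), (20, 120, 120, 150)]
input_boxes = [(140, 50, 340, 80), (140, 120, 340, 150)]
matches = match_labels(label_boxes, input_boxes)
```

Framed this way the task becomes a classification (or ranking) problem over pairs, and the output is an index into the existing label boxes rather than a predicted number.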

I had a "bad hair / brain" day yesterday. I wrote "How about i introduce an arbitrary id feature". How are you supposed to predict something that has no correlation at all with the rest of the features? This is just plain b*shit. The prediction will also be random, and this is not how linear regression works. But what you can do is introduce a new feature that is a "summary" of all the other features and also acts as a kind of id. This you can predict. If you know how you built your summary feature, you can also use it to decode width, height, and so on, I think.

Sigh, Holger... And about the corner net: it's related, and I can maybe take some inspiration from there. Let's see.

holger ( 2019-10-25 07:49:50 -0600 )
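
One possible reading of the "summary" feature that can be decoded again is a reversible packing of a box into a single integer. The sketch below is an assumption about what is meant, not something stated in the thread: `BASE`, `encode_box`, and `decode_box` are hypothetical names, and it assumes integer coordinates below 4096.

```python
BASE = 4096  # assumption: every value is an integer in [0, 4095]

def encode_box(x, y, w, h):
    """Pack (x, y, w, h) into one integer using base-4096 digits."""
    for value in (x, y, w, h):
        if not 0 <= value < BASE:
            raise ValueError(f"value {value} outside [0, {BASE - 1}]")
    return ((x * BASE + y) * BASE + w) * BASE + h

def decode_box(code):
    """Recover (x, y, w, h) from an integer produced by encode_box."""
    code, h = divmod(code, BASE)
    code, w = divmod(code, BASE)
    x, y = divmod(code, BASE)
    return x, y, w, h

code = encode_box(140, 50, 200, 30)
box = decode_box(code)
```

The round trip is exact for integers, but as a regression target such a number is fragile: a small relative error in the prediction changes the low-order fields (here `w` and `h`) completely.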