What does BOWTrainer::clear() exactly do?

asked 2014-07-03 03:34:51 -0600

thdrksdfthmn

updated 2014-07-03 04:58:22 -0600

berak

I ran into a problem in my application while trying to find the best classifier on a database: I create the BOWTrainer before the loop in which I train the classifier, so I was always adding each new set of descriptors to the same object. I am not sure what BOWTrainer::clear() does: does it just clear the stored descriptor set, or does it delete the whole object, so that I need to declare another one?

In fact, which is better: declaring the BOWTrainer object inside the loop (so once per iteration), or just calling its clear() function inside the loop?


1 answer


answered 2014-07-03 05:38:32 -0600

Guanta
  1. clear() only removes the internally saved descriptors; the trainer object itself stays usable, so you just need to add new descriptors before training it again.

  2. Why do you need to train it several times? You only need to train your BOWTrainer once with the whole set of features; then, for each image, you compute its BoW descriptor and train a classifier (e.g. SVM), which gives you the class.



And what if I have many images, like 1000 per class? Isn't it better to use just the best 500? I choose the images randomly and get a different error rate every time, so I just pick the best classifier. Thanks for the advice.

thdrksdfthmn ( 2014-07-03 05:58:27 -0600 )

In general: the more images you have, the better for training the classifier (e.g. SVM) on the BoW descriptors. What do you mean by using just the best 500? And what do you mean by randomly choosing images and the error rate? You have one BOWTrainer, which trains the vocabulary you need to compute the BoW descriptors (one for each image); finally, you classify them.

Guanta ( 2014-07-03 06:16:04 -0600 )

"The more photos, the better the training" is false. That is why the term overfitting was invented. Or am I wrong: with too many images, won't you find a lot of variation?

thdrksdfthmn ( 2014-07-03 06:43:48 -0600 )

You can overfit your model if you don't test it on a separate test set. So yes, you can of course overfit if your classifier doesn't generalize well to unseen data, but I wouldn't worry about it if you have separate train/validation/test sets.

Guanta ( 2014-07-03 06:51:55 -0600 )

I do not have separate sets. For each training run I pick the images randomly (the remaining ones I use for testing), and I get a different error rate each time.

thdrksdfthmn ( 2014-07-03 07:02:23 -0600 )

Okay, that would be one option, but then you need many runs and have to average the results. However, I'd rather either do cross-validation or define a fixed train/(validation/)test split, e.g. randomly select 60% of each class for training, 20% for validation (the set you use to calibrate the best classifier parameters), and the rest as a test set for the final results.

Guanta ( 2014-07-03 07:15:40 -0600 )
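
The per-class 60/20/20 split suggested above could look roughly like the sketch below. It uses only the C++ standard library and works on integer class labels; `Split` and `stratifiedSplit` are hypothetical names made up for this illustration, not part of OpenCV, and the percentages are passed as whole numbers to keep the arithmetic exact.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <random>
#include <vector>

// Indices of the samples assigned to each subset (hypothetical helper type).
struct Split {
    std::vector<std::size_t> train, validation, test;
};

// Shuffle the samples of every class separately and cut each class into
// train / validation / test parts, so each subset keeps the class balance.
// trainPct and valPct are percentages; whatever remains goes to the test set.
Split stratifiedSplit(const std::vector<int>& labels,
                      unsigned trainPct, unsigned valPct, unsigned seed)
{
    assert(trainPct + valPct <= 100);

    // Group sample indices by class label.
    std::map<int, std::vector<std::size_t>> byClass;
    for (std::size_t i = 0; i < labels.size(); ++i)
        byClass[labels[i]].push_back(i);

    std::mt19937 rng(seed);
    Split out;
    for (auto& entry : byClass) {
        std::vector<std::size_t>& idx = entry.second;
        std::shuffle(idx.begin(), idx.end(), rng);

        // Integer division rounds down, so nTrain + nVal never exceeds idx.size().
        const std::size_t nTrain = idx.size() * trainPct / 100;
        const std::size_t nVal   = idx.size() * valPct / 100;

        const auto trainEnd = idx.begin() + static_cast<std::ptrdiff_t>(nTrain);
        const auto valEnd   = trainEnd + static_cast<std::ptrdiff_t>(nVal);

        out.train.insert(out.train.end(), idx.begin(), trainEnd);
        out.validation.insert(out.validation.end(), trainEnd, valEnd);
        out.test.insert(out.test.end(), valEnd, idx.end());
    }
    return out;
}
```

Calling `stratifiedSplit(labels, 60, 20, seed)` once and reusing the resulting index lists for every experiment keeps the error rates comparable between runs, because the test images never change.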

Yes, I was using a fixed number of images for training and the rest for testing.

thdrksdfthmn ( 2014-07-03 07:34:11 -0600 )
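
As an alternative to a single fixed split, the cross-validation mentioned earlier in the thread might be organised as in the following sketch. It only builds the folds and averages the per-fold error rates; training and evaluating the classifier on each fold is left to the caller. `makeFolds` and `meanError` are hypothetical names for this illustration, not OpenCV functions, and the folds here are not stratified by class.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Assign each of n samples to one of k folds after a random shuffle.
// In round f, fold f is the test set and all other folds form the training set.
std::vector<std::vector<std::size_t>> makeFolds(std::size_t n, std::size_t k,
                                                unsigned seed)
{
    assert(k >= 2 && k <= n);

    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});

    std::mt19937 rng(seed);
    std::shuffle(order.begin(), order.end(), rng);

    std::vector<std::vector<std::size_t>> folds(k);
    for (std::size_t i = 0; i < n; ++i)
        folds[i % k].push_back(order[i]);  // fold sizes differ by at most one
    return folds;
}

// Average the per-fold error rates into a single cross-validation estimate.
double meanError(const std::vector<double>& foldErrors)
{
    assert(!foldErrors.empty());
    const double sum = std::accumulate(foldErrors.begin(), foldErrors.end(), 0.0);
    return sum / static_cast<double>(foldErrors.size());
}
```

Because every image is used for testing exactly once, the averaged error depends far less on one lucky or unlucky random pick than the error rate of a single random split does.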
