Bag-of-words training: image conditions

I'm planning to use the BoW approach. How much should the angle/view differ between two images of the same scene/object of interest? In other words, considering these two images:

Are both images good candidates for the training phase, or is one of them sufficient when using BoW with the SURF feature detector and descriptor extractor? Note: to see the difference between the two images, look at the upper right corner.

One of the images should suffice: the local descriptors of the two images won't differ much, so the global one (the BoW descriptor) won't differ much either. For training, choose images with larger differences (distance, angle of view) and different lighting conditions. See, for example, the Oxford5k dataset (although it also contains several pictures with only subtle changes).
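
To make the argument concrete, here is a minimal sketch in plain Python of why near-duplicate views add little. It does not use SURF or OpenCV: the descriptors are synthetic 4-D vectors, the vocabulary is a hand-made set of four "visual words", and the names `nearest_word`, `bow_histogram`, `cosine_similarity` and `perturb` are hypothetical helpers introduced only for this illustration. In a real pipeline the descriptors would come from SURF and the vocabulary from clustering (e.g. k-means) over the training descriptors.

```python
import math

def nearest_word(descriptor, vocabulary):
    """Return the index of the visual word closest to the descriptor (squared Euclidean distance)."""
    return min(
        range(len(vocabulary)),
        key=lambda i: sum((d - w) ** 2 for d, w in zip(descriptor, vocabulary[i])),
    )

def bow_histogram(descriptors, vocabulary):
    """Quantize each local descriptor to its nearest word and return an L1-normalized histogram."""
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        hist[nearest_word(d, vocabulary)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def cosine_similarity(a, b):
    """Cosine similarity between two histograms (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def perturb(vec, eps):
    """Shift every component by eps, standing in for a small change in the local descriptor."""
    return [v + eps for v in vec]

# Assumption: a toy vocabulary of four well-separated visual words.
vocabulary = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Image A: synthetic descriptors that fall near words 0 and 1.
image_a = [perturb(vocabulary[0], 0.01)] * 6 + [perturb(vocabulary[1], 0.01)] * 4
# Image B: the same scene from almost the same view, so descriptors shift only slightly.
image_b = [perturb(d, 0.02) for d in image_a]
# Image C: a clearly different view, whose descriptors fall near other words.
image_c = [perturb(vocabulary[2], 0.01)] * 5 + [perturb(vocabulary[3], 0.01)] * 5

hist_a = bow_histogram(image_a, vocabulary)
hist_b = bow_histogram(image_b, vocabulary)
hist_c = bow_histogram(image_c, vocabulary)

sim_ab = cosine_similarity(hist_a, hist_b)  # near-duplicate pair
sim_ac = cosine_similarity(hist_a, hist_c)  # clearly different pair

print("A vs B:", sim_ab)
print("A vs C:", sim_ac)
```

In this toy setup the slightly shifted descriptors of image B are quantized to the same words as those of image A, so the two histograms coincide and B contributes nothing new, whereas image C produces a different histogram. Real SURF descriptors are noisier, so treat this as an illustration of the reasoning rather than a measurement.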

Thanks, Mr. Guanta.

