Using HOG features to update bag of words background model is too computationally intensive
I am trying to build a visual bag-of-words (BoW) model using HOG descriptors. This is part of a patch-level background model for foreground-background selection in cluttered video. I am following this paper here. The paper describes 32 * 32 patches, but it is vague about the rest of the HOG properties (see below).
I am using the OpenCV 3.2 binaries on Windows with Python 3.5.
I first create a BOW class with a dictionary size 128:
self.BOW=cv2.BOWKMeansTrainer(128)
I'll be using HOG descriptors with the following parameters (but see below).
#HOG descriptor
winSize = (32,32)
blockSize = (16,16)
blockStride = (8,8)
cellSize = (8,8)
nbins = 9
derivAperture = 1
winSigma = 4.
histogramNormType = 0
L2HysThreshold = 0.2
gammaCorrection = 0
nlevels = 32
self.calc_HOG = cv2.HOGDescriptor(winSize,blockSize,blockStride,cellSize,nbins,derivAperture,winSigma,
histogramNormType,L2HysThreshold,gammaCorrection,nlevels)
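As a sanity check on these parameters, the length of the HOG descriptor for a single 32 * 32 window follows directly from the window, block, stride, and cell sizes. The sketch below is plain arithmetic; the helper name `hog_descriptor_length` is made up for illustration, and the result should agree with what `getDescriptorSize()` reports for the same settings.

```python
def hog_descriptor_length(win_size, block_size, block_stride, cell_size, nbins):
    """Number of floats HOG produces for ONE detection window."""
    # How many block positions fit in the window along each axis.
    blocks_x = (win_size[0] - block_size[0]) // block_stride[0] + 1
    blocks_y = (win_size[1] - block_size[1]) // block_stride[1] + 1
    # Each block holds a grid of cells, and each cell has one histogram.
    cells_per_block = (block_size[0] // cell_size[0]) * (block_size[1] // cell_size[1])
    return blocks_x * blocks_y * cells_per_block * nbins


# Same values as winSize, blockSize, blockStride, cellSize, nbins above.
length = hog_descriptor_length((32, 32), (16, 16), (8, 8), (8, 8), 9)
print(length)  # 3 * 3 blocks * 4 cells * 9 bins = 324
```

So one window contributes 324 values; anything much longer than that means many windows were concatenated.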
I am using MOG background subtraction to generate my initial background proposal, which I then feed into the BOW model for patch-level features.
An image has the following shape
self.bg_image.shape
(509, 905, 3)
So for each frame, if it has been classified as background by MOG, I add the HOG feature to the BOW model:
self.BOW.add(self.calc_HOG.compute(self.bg_image))
and I'll go along like that until a potential foreground object needs to be checked.
When a frame is potentially foreground, I cluster the BOW descriptors
self.background_vocab=self.BOW.cluster()
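For reference, here is a rough numpy-only sketch of what a k-means vocabulary step does to a stack of descriptors. It is not OpenCV's implementation (this one uses a simplified random initialisation and a fixed iteration count), and the function name `kmeans_vocabulary` and the random data are stand-ins. The point is the shapes: one row per visual word, one column per descriptor dimension.

```python
import numpy as np


def kmeans_vocabulary(descriptors, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, dim) array of cluster centres."""
    rng = np.random.default_rng(seed)
    data = np.asarray(descriptors, dtype=np.float64)
    # Simplified init: pick k distinct rows as the starting centres.
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every descriptor to every centre: (n, k).
        dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # leave an empty cluster's centre unchanged
                centers[j] = members.mean(axis=0)
    return centers


# Stand-in data: 200 descriptors of 324 values each (one per 32x32 window).
descriptors = np.random.default_rng(1).random((200, 324))
vocab = kmeans_vocabulary(descriptors, k=8)
print(vocab.shape)  # (8, 324)
```

The cost of each iteration grows with the number of rows times the number of words, so the row count handed to the trainer matters a great deal.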
Generate an extractor
self.extract_bow = cv2.BOWImgDescriptorExtractor(self.calc_HOG, self.matcher)
set its vocabulary
self.extract_bow.setVocabulary(self.background_vocab)
and compare the histogram of the current crop to the corresponding part of the background image using the HOG-clustered visual words.
current_BOWhist=self.extract_bow.compute(current)
print("Extract background HOG feature")
background_BOWhist=self.extract_bow.compute(background)
print("Compare Background and Foreground Masks")
BOW_dist = cv2.compareHist(current_BOWhist, background_BOWhist, method=cv2.HISTCMP_CHISQR)
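To make the comparison step concrete, here is a small numpy-only sketch of the two ideas involved: building a BoW histogram by counting the nearest visual word for each descriptor, and the chi-square distance as documented for `HISTCMP_CHISQR`, i.e. the sum of (H1 - H2)^2 / H1 over the bins. The function names and the toy 2-D data are invented for illustration; this is not the OpenCV code path.

```python
import numpy as np


def bow_histogram(descriptors, vocabulary):
    """Fraction of descriptors assigned to each visual word (nearest centre)."""
    d = np.asarray(descriptors, dtype=np.float64)
    v = np.asarray(vocabulary, dtype=np.float64)
    dists = ((d[:, None, :] - v[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(v)).astype(np.float64)
    return hist / len(d)  # normalise by the number of descriptors


def chi_square_distance(h1, h2, eps=1e-12):
    """Sum of (h1 - h2)^2 / h1, skipping bins where h1 is (near) zero."""
    h1 = np.asarray(h1, dtype=np.float64).ravel()
    h2 = np.asarray(h2, dtype=np.float64).ravel()
    mask = np.abs(h1) > eps
    return float(np.sum((h1[mask] - h2[mask]) ** 2 / h1[mask]))


# Toy vocabulary with two words, and two small descriptor sets.
vocabulary = np.array([[0.0, 0.0], [10.0, 10.0]])
current = np.array([[1.0, 0.0], [0.0, 1.0], [9.0, 9.0], [0.0, 0.0]])
background = np.array([[10.0, 9.0], [9.0, 10.0], [1.0, 1.0], [10.0, 10.0]])

current_hist = bow_histogram(current, vocabulary)        # [0.75, 0.25]
background_hist = bow_histogram(background, vocabulary)  # [0.25, 0.75]
distance = chi_square_distance(current_hist, background_hist)
print(current_hist, background_hist, distance)
```

A distance of 0 means identical histograms; larger values mean the crop uses the visual words in different proportions than the background.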
The memory use and performance of this often-cited strategy seem completely unusable. The BOW.cluster() method, documented here, is incredibly memory intensive and basically locks up the computer. Even on just a few descriptors it takes 10 to 20 seconds per call. This is especially perplexing because the above paper, and many similar papers, specify that the background model is continuously updated. That implies many calls to cluster(), generating a new vocabulary every time a frame is classified as background and its HOG features are added.
So in my example file, I go through 30 frames of adding HOG features
a=self.BOW.getDescriptors()
len(a)
30
But this holds a huge number of descriptors:
self.BOW.descriptorsCount()
576979200
I think this is the problem: that seems like far too many points for k-means to handle. Is there something wrong with my HOG descriptor properties? From the paper cited above:
"The BoW dictionary size is 128. During the verification step, the MSDPs are resized according to their aspect ratios to make sure all patches contain roughly the same amount of pixels. In order to tackle with the various aspect ratios, which is inevitable during segmentation, we let ...
amazing that you even got that far. opencv's BOW functions are meant to be used with features2d extractors/descriptors, not with HOG at all.
it's an absolute mystery to me, why this did not crash (LOUD):
problem 1 might be here:
that's the (509, 905, 3) image ? (not a 32x32 grayscale patch ?)
if so, it will generate insanely long features (also in the wrong shape)
what's the shape of self.background_vocab ?
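To illustrate the shape problem: as far as I know, HOGDescriptor.compute() returns all windows concatenated into one flat vector (a single column in the 3.x Python bindings), and the BOW trainer counts every row it is handed as one descriptor. That would be consistent with the count in the question, since 576979200 is exactly 30 * 324 * 59360, i.e. a whole number of 324-value window descriptors per frame with each float counted on its own. A minimal numpy-only sketch of the reshape that would be needed before add(); the helper name `hog_output_to_rows` is made up, and the zero array only stands in for real HOG output:

```python
import numpy as np


def hog_output_to_rows(flat, descriptor_size):
    """Turn a flat HOG output into one row per window: (n_windows, descriptor_size)."""
    flat = np.asarray(flat, dtype=np.float32).ravel()
    if flat.size % descriptor_size != 0:
        raise ValueError("output length is not a multiple of the descriptor size")
    return flat.reshape(-1, descriptor_size)


# Stand-in for a column vector holding 5 windows of 324 values each.
fake_hog_output = np.zeros((5 * 324, 1), dtype=np.float32)
rows = hog_output_to_rows(fake_hog_output, 324)

print(fake_hog_output.shape)  # (1620, 1): 1620 "descriptors" of length 1
print(rows.shape)             # (5, 324): 5 descriptors of length 324
```

Clustering rows of length 324 is a very different (and much smaller) problem than clustering hundreds of millions of single floats.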
Thanks @berak, would it be best for me to just write my own BOW routine for HOG features? And yes, you've hit on an ongoing question I have. The authors refer to 32 * 32 "patches", but it is unclear whether those "patches" are actually the "windows" of the HOG descriptor, or whether you first cut the image into patches and compute HOG on each patch.
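For what it's worth, the second reading (cut first, then describe each patch) could be sketched roughly as follows. This is only an assumption about what the paper means, not something it confirms; the helper name `split_into_patches` is invented, and edge pixels that do not fill a whole patch are simply dropped here. With a 32 * 32 window, each such grayscale patch should then yield a single 324-value HOG descriptor.

```python
import numpy as np


def split_into_patches(gray, patch=32):
    """Cut a 2-D image into non-overlapping patch x patch tiles, row by row."""
    h, w = gray.shape[:2]
    rows, cols = h // patch, w // patch
    # Drop the right/bottom remainder that does not fill a whole tile.
    cropped = gray[:rows * patch, :cols * patch]
    tiles = cropped.reshape(rows, patch, cols, patch).swapaxes(1, 2)
    return tiles.reshape(-1, patch, patch)


# Stand-in for a grayscale version of the (509, 905, 3) frame.
gray = np.zeros((509, 905), dtype=np.uint8)
patches = split_into_patches(gray)
print(patches.shape)  # (420, 32, 32): 15 rows x 28 columns of tiles
```

That gives 420 descriptors per frame rather than one enormous vector, which is a far more reasonable input for k-means.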