BOWKMeansTrainer vocabulary has incorrect dimensions?

I am porting some code from the Python scikit library to OpenCV, and I'm trying to use the BOWKMeansTrainer class to cluster SIFT feature descriptors into a vocabulary. However, the vocabulary returned does not seem to have the correct dimensions.

I have added a set of descriptors to my trainer and called the cluster function with K = 50 (which gave the best performance on my dataset in my tests with scikit). What puzzles me is that the vocabulary returned is 50x1. Shouldn't it be 50x128?

I am using OpenCV 3.0.0-dev.

Here's some code so you can see what I'm doing:

bow_trainer = cv2.BOWKMeansTrainer(50)
sift = cv2.DescriptorExtractor_create("SIFT")
dense = cv2.FeatureDetector_create("Dense")

# read SIFT descriptors from image set and add to BoW trainer
for img_filename in image_name_set:
    # read and prepare image for SIFT
    img = cv2.imread(img_filename)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img = cv2.equalizeHist(img)

    # extract keypoints using dense keypoint detector
    keypoints = dense.detect(img)
    keypoints, descriptors = sift.compute(img, keypoints)

    # add descriptors to BoW trainer
    for desc in descriptors:
        bow_trainer.add(desc)

# cluster descriptors and create BoW vocabulary
vocab = bow_trainer.cluster()  # returns a 50x1 ndarray instead of the expected 50x128 ndarray
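
A possible explanation, offered as a sketch rather than a confirmed diagnosis: iterating over `descriptors` yields 1-D rows of length 128, and the OpenCV Python bindings appear to treat a 1-D array of length N as an N x 1 matrix. If so, each `bow_trainer.add(desc)` call would add 128 one-dimensional samples rather than one 128-dimensional descriptor, which would match the 50x1 vocabulary. The NumPy-only snippet below illustrates just the shapes involved; the random array is a stand-in for real SIFT output, and the N x 1 interpretation is an assumption about the bindings.

```python
import numpy as np

# Stand-in for the SIFT output of one image: 200 descriptors of 128 floats each.
descriptors = np.random.rand(200, 128).astype(np.float32)

# Iterating over a 2-D array yields 1-D rows, which is what the loop passes to add().
row = next(iter(descriptors))
print(row.shape)

# Assumed interpretation by the bindings: a 1-D array of length N becomes N x 1,
# i.e. 128 samples with a single feature each.
as_column = row.reshape(-1, 1)
print(as_column.shape)

# Keeping the descriptor 2-D preserves "1 sample with 128 features".
as_row = row.reshape(1, -1)
print(as_row.shape)
```

Under that assumption, passing the whole 2-D array with `bow_trainer.add(descriptors)` instead of looping over its rows should keep every descriptor 128-dimensional, so the clustered vocabulary would come out as 50x128.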