Is multithreading handled differently in Python and C++?

asked 2020-10-02 02:39:32 -0600

tim-hilt
16 ●1 ●1

Hi there! I recently built a simple OpenCV-application in Python, that uses ORB and a BruteForce-Matcher to compare image-pairs that live in two separate datasets. The code for that was as follows:

def compute_descriptors(path):
    nfeatures = 500
    files = [x for x in path.iterdir() if x.is_file()]
    orb = cv2.ORB_create(nfeatures)
    descs = np.empty((nfeatures, 32, len(files)), dtype=np.uint8)

    for i, file in enumerate(files):
        img = cv2.imread(str(file), 0)
        img = cv2.equalizeHist(img)  # Histogram equalization to bring out more detail
        kpts, desc = orb.detectAndCompute(img, None)
        descs[:,:,i] = desc

query_path = pathlib.Path("/PATH/TO/QUERY-DATASET")
train_path = pathlib.Path("/PATH/TO/TRAIN-DATASET")
query_descriptors = compute_descriptors(query_path)
train_descriptors = compute_descriptors(train_path)

nfeatures = query_descriptors.shape[0]
scores = np.zeros((query_descriptors.shape[-1], train_descriptors.shape[-1]))

matcher = cv2.BFMatcher_create(cv2.NORM_HAMMING)
for i in range(scores.shape[0]):
    for j in range(scores.shape[1]):
        matches = matcher.knnMatch(query_descs[:,:,i], train_descs[:,:,j], k=2)
        good = 0
        # Apply ratio-test
        for m, n in matches:
            if m.distance < threshold * n.distance:
                good += 1
        scores[i,j] = good / nfeatures

The result is a 2d-array of shape (n, m), where n is the count of images in the query-dataset and m is the count of images in the train-dataset. The matching image for the current query-image is chosen by finding the maximum proportion of good features to the total nfeatures.

When i run this code i see two active threads. Looking at the source code for BFMatcher there is batch_distance that calculates the actual distances of two descriptors. Furthermore, there is parallel_for_(https://github.com/opencv/opencv/blob...), which (to my understanding) is able to run the function in parallel, i.e. in multiple threads which would explain the two active threads.

When i ran cv2.getNumThreads() however, i got back 4, even though only 2 were run.

The issue got even more weird, when porting the above code to C++. The code was roughly the following:

#include <vector>
#include <string>
#include <filesystem>

#include <opencv2/features2d.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/imgcodecs.hpp>


using namespace cv;

std::vector<Mat> computeDescriptors(const std::string &path, const int _nfeatures=500) {
    // Creating ORB-object
    Ptr<Feature2D> orb {ORB::create(_nfeatures)};

    // Declaring needed variables
    Mat tmp, descriptors;
    std::vector<KeyPoint> kpts;
    std::vector<Mat> descriptors_vec;

    // Computation
    for (const auto & entry : std::filesystem::directory_iterator(path)) {
        Mat img {imread(entry.path(), IMREAD_GRAYSCALE)};
        equalizeHist(img, tmp);
        orb->detectAndCompute(tmp, noArray(), kpts, descriptors);
        descriptors_vec.push_back(descriptors);
    }
    return descriptors_vec;
}

Mat computeScores(const std::vector<Mat> &queryDescriptors_vec,  const std::vector<Mat> &trainDescriptors_vec) {

    Mat_<float> scores(queryDescriptors_vec.size(), trainDescriptors_vec.size(), 0.0f);
    Ptr<DescriptorMatcher> matcher;
    auto nfeatures {queryDescriptors_vec[0].rows};
    matcher = BFMatcher::create(NORM_HAMMING);

    std::vector<std::vector<DMatch>> knn_matches;

    for (size_t i=0; i < queryDescriptors_vec.size(); ++i) {
        for (size_t j = 0; j < trainDescriptors_vec.size(); ++j) {
            matcher->knnMatch(queryDescriptors_vec[i], trainDescriptors_vec[j], knn_matches, 2);
            // Filter matches using Lowes ratio test
            unsigned int good_matches{0};
            for (auto &matches : knn_matches)
                if (matches[0].distance < 0.6f * matches[1].distance)
                    good_matches++;
            // Write score to scores-Mat
            const float ratio {(float) good_matches / (float) nfeatures};
            scores.at<float>(i, j ...

(more)

edit retag flag offensive close merge delete

add a comment

answered 2020-10-05 09:33:02 -0600

tim-hilt
16 ●1 ●1

There were two bugs in the above code, that i found in the meantime:

When i calculate the vector of feature-descriptors, i do descriptors_vec.push_back(descriptors). However i should write descriptors_vec.push_back(descriptors.clone()), otherwise the reference get's appended and all descriptors in descriptors_vec have the same content.
Having std::vector<std::vector<DMatch>> knn_matches declared outside of the for-loop makes the matcher append to it every time new matches are computed.

Resolving these two issues made the code run as i intended it to. As a bonus the code now runs on all 4 threads all by itself. :-)

edit flag offensive delete link

add a comment

Is multithreading handled differently in Python and C++?

1 answer

Links

Question Tools

Stats

Related questions

Is multithreading handled differently in Python and C++? edit

1 answer

Links

Question Tools

Stats

Related questions

Is multithreading handled differently in Python and C++?