Your code is quite complicated; I didn't test it, and I don't understand all of it either. This is really more of a comment, but it's too long for one, so I'm posting it as an answer. A few ideas:

  • keep your code simple; bugs are much easier to find that way.
  • use OpenCV routines to copy an image (frm.copyTo(dst) or frm.clone()) instead of std::move; these make a real copy of the pixel data.
  • capturing from a webcam doesn't take much time, so it isn't necessary to parallelize it*.
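
To illustrate the second point, here is a minimal sketch of the difference between sharing, moving and deep-copying an image. `Frame` is a hypothetical stand-in written only for this example; it is not OpenCV code and only imitates the general idea of a small header that points at a shared pixel buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an image class: a small header that shares
// ownership of one pixel buffer (an assumption made for illustration).
struct Frame {
    std::shared_ptr<std::vector<std::uint8_t>> pixels;

    Frame(std::size_t n, std::uint8_t value)
        : pixels(std::make_shared<std::vector<std::uint8_t>>(n, value)) {}

    // Deep copy: allocates a new buffer and duplicates every pixel.
    Frame clone() const {
        Frame out(0, 0);
        out.pixels = std::make_shared<std::vector<std::uint8_t>>(*pixels);
        return out;
    }

    // True once the header no longer refers to any buffer (e.g. after a move).
    bool empty() const { return !pixels; }
};

// True if both headers refer to the very same pixel buffer.
bool sharesPixels(const Frame& a, const Frame& b) {
    return a.pixels == b.pixels;
}
```

With this stand-in, `Frame b = a;` gives a second header on the same pixels, `Frame c = a.clone();` gives independent pixels, and `Frame d = std::move(a);` leaves `a` empty without copying anything. If another thread still expects `a` to hold the frame, only the deep copy is safe.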

It will be much easier to capture an image, process it in parallel, and capture the next frame only once the processing has finished. Here's some pseudocode:

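Rect roi[10];   //fill in the 10 regions to process; keep them disjoint if the threads write to them
Mat frm(480,640,CV_8UC3);
Mat subfrm[10];
std::thread thread[10];   //needs #include <thread>
for(;;){
    cap>>frm;   //get the image
    if(frm.empty())break;   //stop when no frame arrives
    for(int i=0;i<10;i++)subfrm[i]=frm(roi[i]);  //define 10 ROIs on the image (views, no pixel copy); done after the capture because cap>>frm may reallocate frm
    for(int i=0;i<10;i++)thread[i]=std::thread(process,subfrm[i]);  //launch a thread for each ROI; process(Mat) is a placeholder for your own function
    for(int i=0;i<10;i++)thread[i].join();  //wait for all the threads to finish
    imshow("Result",frm); //display the result
    if(waitKey(1)>=0)break;   //imshow needs waitKey to refresh the window
}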

This code solves the problem of combining the frames, because every thread works directly on its own part of frm, and it saves some time because it never copies the image. As long as the ROIs don't overlap, you won't need global variables or concurrency checks (mutexes).
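
If you want to try the pattern without a camera, here is a minimal sketch in plain C++ with no OpenCV at all. `Image`, `invertRows` and `processInBands` are hypothetical names made up for this example, and inverting the pixels is just a placeholder for your real processing. Each thread gets its own band of rows in one shared buffer, so the result is already "combined" when the threads are joined.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for an image: one contiguous 8-bit grayscale buffer.
struct Image {
    int rows, cols;
    std::vector<std::uint8_t> data;
    Image(int r, int c, std::uint8_t value)
        : rows(r), cols(c), data(static_cast<std::size_t>(r) * c, value) {}
};

// Placeholder per-region work: invert the pixels of rows [rowBegin, rowEnd) in place.
void invertRows(Image& img, int rowBegin, int rowEnd) {
    auto first = img.data.begin() + static_cast<std::ptrdiff_t>(rowBegin) * img.cols;
    auto last = img.data.begin() + static_cast<std::ptrdiff_t>(rowEnd) * img.cols;
    std::transform(first, last, first,
                   [](std::uint8_t p) { return static_cast<std::uint8_t>(255 - p); });
}

// Split the image into `bands` non-overlapping horizontal bands, process each
// band in its own thread, and wait for all threads before returning.
void processInBands(Image& img, int bands) {
    assert(bands > 0 && bands <= img.rows);
    std::vector<std::thread> workers;
    workers.reserve(static_cast<std::size_t>(bands));
    for (int i = 0; i < bands; ++i) {
        int rowBegin = img.rows * i / bands;      // first row of band i
        int rowEnd = img.rows * (i + 1) / bands;  // one past the last row of band i
        workers.emplace_back(invertRows, std::ref(img), rowBegin, rowEnd);
    }
    for (auto& w : workers) w.join();
}
```

Calling `processInBands(img, 10)` on a 480x640 image plays the role of the two thread loops above. No mutex is needed because no two threads ever touch the same pixel.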

*Of course, this isn't true anymore if you want to capture a large quantity of data (e.g. high-resolution RAW images at a high frame rate) in a time-critical application.