If your goal is speed, you shouldn't be making a copy. Nor should you be using non-contiguous allocations like a vector of vectors.
Instead, invert this problem. Create a storage vector up-front and then create cv::Mat on top of that.
You want a single flat, contiguous allocation for cache locality, and you want to avoid the copy as well.
In essence, just do:
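auto storage = std::vector<uchar>(480 * 640 * 3); // rows * cols * channels; adjust to however many channels you have
auto mat     = cv::Mat(480, 640, CV_8UC3, storage.data()); // wraps the buffer without copying; storage must outlive mat and must not be resized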
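
To make the layout of that flat buffer concrete, here is a minimal sketch of how a (row, column, channel) position maps to an offset in it. It assumes a tightly packed, row-major, interleaved image with no row padding. The names flat_index and make_storage are hypothetical helpers for illustration, not part of OpenCV.

```cpp
#include <cstddef>
#include <vector>

using uchar = unsigned char;  // same underlying type OpenCV's uchar aliases

// Hypothetical helper (not an OpenCV function): offset of one channel value
// in a tightly packed, row-major, interleaved buffer with no row padding.
constexpr std::size_t flat_index(std::size_t row, std::size_t col, std::size_t ch,
                                 std::size_t cols, std::size_t channels) {
    return (row * cols + col) * channels + ch;
}

// Hypothetical helper: one zero-initialized allocation for the whole image,
// instead of one allocation per row as with a vector of vectors.
inline std::vector<uchar> make_storage(std::size_t rows, std::size_t cols,
                                       std::size_t channels) {
    return std::vector<uchar>(rows * cols * channels);
}
```

When a cv::Mat is constructed on top of such a buffer with the default step, mat.at<cv::Vec3b>(row, col)[ch] and storage[flat_index(row, col, ch, 640, 3)] should refer to the same byte, so you can write through either view without copying.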