copyTo is oddly slow (at least for planes of 3D Mats)

I know this isn't really a question, though I would like to know why this happens. I think it's important to share, and I didn't know where else to post it.

In short, copyTo runs about 7x slower than I would expect. I base this on a simple equivalent copy in Matlab (code below), which was that much faster. It's just a memory copy, not any "fancy" Matlab function.

Basic setup: reading frames from a file and inserting each one into a preallocated plane (aka slice) of a 3D matrix. After taking some timings I realized that nearly all the time was being spent in copyTo rather than in the read. So I stripped my code down to just a copyTo speed test and compared it to the equivalent Matlab test.

This was done on Windows 7 x64. The OpenCV test was compiled as x64 Release (FYI: about 2x faster than Debug).

Note that Matlab's memory is contiguous with respect to its first index, whereas OpenCV's is contiguous with respect to its last. To make the comparison fair, I reversed the order of the indices in my Matlab test, so the non-contiguous memory spacing is the same for Matlab's first-dimension slice and OpenCV's third-dimension slice. Since the Matlab test is so much shorter, I will give it first, with its times.
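To make the layout argument concrete, here is a minimal stdlib-only sketch of the two index-to-offset mappings, assuming densely packed 8-bit arrays with no padding. Both helper names are hypothetical and exist only for this illustration. With the index order reversed as described, an element lands at the same byte offset in both layouts, and neighbouring elements of one plane are `frames` bytes apart.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: byte offset of element (r, c, f) in a densely packed
// rows x cols x frames array of 8-bit values whose LAST index is contiguous
// (the layout assumed here for the OpenCV test).
std::size_t offsetLastIndexContiguous(std::size_t r, std::size_t c, std::size_t f,
                                      std::size_t cols, std::size_t frames) {
    assert(c < cols && f < frames);
    return (r * cols + c) * frames + f;
}

// Hypothetical helper: byte offset of element (f, c, r) in a densely packed
// frames x cols x rows array whose FIRST index is contiguous
// (column-major order, as in the Matlab test).
std::size_t offsetFirstIndexContiguous(std::size_t f, std::size_t c, std::size_t r,
                                       std::size_t frames, std::size_t cols) {
    assert(f < frames && c < cols);
    return f + frames * (c + cols * r);
}
```

For the sizes used in this post (448 columns, 1000 frames), two horizontally adjacent pixels of the same plane are 1000 bytes apart in either layout.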

1) Matlab test. This ran in about 0.19 ms/frame (i.e. to copy F into A(i,:,:)).

F=ones(1,448,48,'uint8');
A = zeros(1000,448,48,'uint8');
tic; 
for i=1:1000 
     A(i,:,:) = F; 
end; 
T=toc  % = 0.19 sec. Given 1000 frames also == ms/frame

2) OpenCV equivalent. This takes about 1.4 ms/frame. Note that copyTo doesn't treat 2D matrices and the equivalent 3D slices the same way, so extra variables are needed to give both the 2D source and the 3D slice target the same 3D slice view and make copyTo "happy". The views point to the same matrix data, or a portion thereof, as their original variables.

int nReadFrames = (int)cap.get(CV_CAP_PROP_FRAME_COUNT);
int nFrameCols = (int)cap.get(CV_CAP_PROP_FRAME_WIDTH); // 448
int nFrameRows = (int)cap.get(CV_CAP_PROP_FRAME_HEIGHT); // 48

int a3DMsizes[3] = {nFrameRows, nFrameCols, nReadFrames};
int aslicesizes[3] = {nFrameRows, nFrameCols, 1};
Mat MovieFrames3D = Mat::zeros(3, a3DMsizes, CV_8UC1); // target 3D array
Mat FramePlane; // 3D slice (plane) view of the 2D source frame
Range MovieFrames3DRanges[] = {Range::all(), Range::all(), Range(0, 1)}; // range of the first slice
Mat MovieFrames3DPlane; // slice of the 3D target array

// For the copyTo test only. Comment out for the read test.
Mat Frame = Mat::ones(nFrameRows, nFrameCols, CV_8UC1); // constant source frame
FramePlane = Mat(3, aslicesizes, CV_8UC1, Frame.data); // 3D view sharing Frame's data

MovieFrames3DPlane = MovieFrames3D(MovieFrames3DRanges); // plane 0; reassigned per frame in the loop
int iFrame;
double t = (double)getTickCount();
for (iFrame = 0; iFrame < nReadFrames; iFrame++) {
    // For the read test only. Leave commented out for the copyTo-only test.
    // bool bSuccess = cap.read(Frame); // for timing the read
    // FramePlane = Mat(3, aslicesizes, CV_8UC1, Frame.data); // source slice of Frame

    MovieFrames3DRanges[2] = Range(iFrame, iFrame + 1); // select target plane iFrame
    MovieFrames3DPlane = MovieFrames3D(MovieFrames3DRanges); // slice (plane) of MovieFrames3D
    FramePlane.copyTo(MovieFrames3DPlane); // copy Frame into MovieFrames3D(:,:,iFrame)
}
t = ((double)getTickCount() - t) / ((double)nReadFrames * getTickFrequency()); // seconds per frame
cout << "Time in ms per frame: " << t * 1000.0 << endl;
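The memory traffic behind a copy into MovieFrames3D(:,:,iFrame) can be modelled without OpenCV. The sketch below assumes a densely packed volume whose last (frame) index is contiguous, so the bytes of one plane sit `frames` apart and each write touches a single byte. The function name and the std::vector buffers are assumptions made for illustration. This is not how copyTo is implemented internally, only a model of the writes it has to produce for such a slice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: copy one frame (rows*cols bytes) into plane f of a
// volume holding rows*cols*frames bytes, with the frame index contiguous.
void copyFrameIntoPlane(const std::vector<std::uint8_t>& frame,
                        std::vector<std::uint8_t>& volume,
                        std::size_t frames, std::size_t f) {
    assert(f < frames);
    assert(volume.size() == frame.size() * frames);
    for (std::size_t i = 0; i < frame.size(); ++i) {
        // Consecutive source bytes land `frames` bytes apart in the target,
        // so no two of them can be moved with a single block copy.
        volume[i * frames + f] = frame[i];
    }
}
```

Timing this loop for a 48 x 448 frame and 1000 frames would give a baseline for what a strided copy costs on the same machine, independent of any library overhead.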

EDIT - since I first posted this, to confirm that copyTo really is the culprit, I tried another revision in which I removed everything from the loop except the copyTo statement (FramePlane.copyTo(MovieFrames3DPlane);). To make this work, I inserted MovieFrames3DPlane = MovieFrames3D(MovieFrames3DRanges); right before the loop, so the frame is written only into the first slice (plane) of MovieFrames3D. This didn't change the times significantly: still about 7x slower than Matlab.

int nReadFrames = (int)cap.get(CV_CAP_PROP_FRAME_COUNT);
int nFrameCols = (int)cap.get(CV_CAP_PROP_FRAME_WIDTH); // 448
int nFrameRows = (int)cap.get(CV_CAP_PROP_FRAME_HEIGHT); // 48

int a3DMsizes[3] = {nFrameRows, nFrameCols, nReadFrames};
int aslicesizes[3] = {nFrameRows, nFrameCols, 1};
Mat MovieFrames3D = Mat::zeros(3, a3DMsizes, CV_8UC1); // target 3D array
Mat FramePlane; // 3D slice (plane) view of the 2D source frame
Range MovieFrames3DRanges[] = {Range::all(), Range::all(), Range(0, 1)}; // range of the first slice
Mat MovieFrames3DPlane; // slice of the 3D target array

Mat Frame = Mat::ones(nFrameRows, nFrameCols, CV_8UC1); // constant source frame
FramePlane = Mat(3, aslicesizes, CV_8UC1, Frame.data); // 3D view sharing Frame's data

MovieFrames3DPlane = MovieFrames3D(MovieFrames3DRanges); // slice (aka plane) 0
int iFrame;
double t = (double)getTickCount();
for (iFrame = 0; iFrame < nReadFrames; iFrame++) {
    FramePlane.copyTo(MovieFrames3DPlane); // copy Frame into MovieFrames3D(:,:,0), for this test only
}
t = ((double)getTickCount() - t) / ((double)nReadFrames * getTickFrequency()); // seconds per frame
cout << "Time in ms per frame: " << t * 1000.0 << endl;
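One way to check whether the strided slice, rather than copyTo as such, accounts for the time would be to compare against a layout in which the frame index comes first, so that every frame occupies one contiguous block. The sketch below models that case with plain std::vector buffers and a single std::memcpy per frame. The function name and the frames-first layout are assumptions introduced for the comparison and are not part of the test above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical model: the volume is stored frame by frame, so frame f
// occupies bytes [f * frameSize, (f + 1) * frameSize) and can be written
// with one block copy.
void copyFrameContiguous(const std::vector<std::uint8_t>& frame,
                         std::vector<std::uint8_t>& volume,
                         std::size_t frames, std::size_t f) {
    assert(f < frames);
    assert(volume.size() == frame.size() * frames);
    std::memcpy(volume.data() + f * frame.size(), frame.data(), frame.size());
}
```

If a contiguous copy like this turns out to be much faster for the same frame size, that would point at the access pattern of the last-dimension slice rather than at a fixed per-call cost.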
