MASSIVELY slow transfer between GPU and host memory

I'm doing some simple benchmarking to compare the cost of transferring data from host to GPU and back. Here's a paraphrase of the snippet that's acting up:

    Mat lImage( 720, 1280, CV_8UC3, Scalar( 100, 250, 30 ) );
    UMat lUImage;
    UMat lUDestImage; /* destination for cvtColor */
    lUImage = lImage.getUMat(ACCESS_READ); /* This is fast */
    // lImage.copyTo( lUImage ); /* This is SLOW */
    cvtColor( lUImage, lUDestImage, COLOR_BGR2YCrCb );
    lNumGpuCopyConverts++;
    // lImage = lUImage.getMat( ACCESS_READ ); /* This is SLOW */
    // lUImage.copyTo( lImage ); /* This is SLOW */

When I say slow, I mean it literally takes over a minute to do the copy from GPU to CPU. This is an AMD FirePro card. For perspective, with a 720p image and no explicit copies, cvtColor runs 170k-180k times per second. With just the one-way copy it drops to 22k conversions per second. If I also copy back to the CPU, I don't get even one per second.
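For a rough sense of scale, those rates can be turned into implied data rates. This is only a sketch: it assumes a tightly packed 1280x720, 3-channel, 8-bit image (no row padding) and that every iteration touches the whole image once. The helper names `imageBytes` and `impliedGBPerSec` are made up for illustration and are not part of OpenCV.

```cpp
#include <cassert>
#include <cstddef>

// Bytes in one tightly packed 8-bit image (assumption: no row padding).
inline std::size_t imageBytes(std::size_t rows, std::size_t cols, std::size_t channels) {
    return rows * cols * channels;
}

// Implied data rate in GB/s (1 GB = 1e9 bytes), assuming each iteration
// moves or processes exactly one full image. Hypothetical helper.
inline double impliedGBPerSec(std::size_t bytesPerImage, double imagesPerSec) {
    assert(imagesPerSec >= 0.0);
    return static_cast<double>(bytesPerImage) * imagesPerSec / 1e9;
}
```

With the image from the snippet, `imageBytes(720, 1280, 3)` is 2,764,800 bytes, so 22k iterations per second would imply roughly 61 GB/s and 170k would imply roughly 470 GB/s. Both look too high for real transfers over a PCIe link, so the fast cases may only be queuing work asynchronously rather than completing it. That is only a guess, not a measurement.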

I tested this on a variety of other machines and laptops, where the two-way copy is slow-ish but not terrible, so I'm assuming there must be either something odd about this card or some misconfiguration on my machine. Does anyone have any ideas about what I could check?

cheers,

Chris