pwuertz's profile - activity


2020-02-05 14:45:46 -0600	received badge	● Necromancer
2020-02-05 04:24:34 -0600	received badge	● Supporter
2020-02-05 04:13:51 -0600	received badge	● Editor
2020-02-05 04:13:51 -0600	edited answer	No effect from using cuda::Stream? I think the real issue is that CUDA needs pinned / pagelocked host-memory to do asynchronous transfers to the GPU. If yo
2020-02-05 04:02:50 -0600	answered a question	No effect from using cuda::Stream? I think the real issue is that CUDA needs pinned / pagelocked host-memory to do asynchronous transfers to the GPU. If yo
2020-01-31 08:15:02 -0600	commented question	i need to detect human body (dead,alive,injured),what method do i use? A stethoscope maybe?
2017-01-25 06:59:44 -0600	commented question	OpenCL TAPI mixed performance The operations mean and sum should be almost identical (see CPU results for reference). The "backend" is the NVidia GPU driver, which provides almost identical performance on Windows and Linux (see popular benchmarks). I doubt that the GPU "is somehow configured incorrectly", but I'm open to suggestions.
2017-01-25 02:19:10 -0600	received badge	● Student (source)
2017-01-25 02:18:38 -0600	asked a question	OpenCL TAPI mixed performance I'm getting mixed performance results when using the OpenCL transparent API, so I wrote a simple test application that measures the execution time of a few OpenCV methods. I'm testing the methods `Sobel`, `mean` and `sum` with 5000x5000 matrices on CPU and GPU. Each method is called 10 times, with one additional untimed call before the measurement starts, as advised in another Q&A post. The code is here: `pastebin.com/wa0yvu30`

The following results were obtained on the same machine, using a GTX 980 GPU with the latest drivers on both Linux and Windows, built with OpenCV 3.2.

OpenCL devices: GeForce GTX 980

| Test         | Ubuntu | Windows |
|--------------|--------|---------|
| sobel on cpu | 224ms  | 234ms   |
| sobel on gpu | 44ms   | 593ms   |
| mean on cpu  | 85ms   | 78ms    |
| mean on gpu  | 399ms  | 500ms   |
| sum on cpu   | 86ms   | 78ms    |
| sum on gpu   | 7ms    | 15ms    |

The CPU results are comparable across both systems. On Linux the GPU results for `Sobel` and `sum` are reasonable, I guess: the speedup over the CPU is about 5x for `Sobel` and about 12x for `sum`. On Windows, however, `Sobel` on the GPU is very slow (about 2.5x slower than on the CPU) and `sum` takes twice as long as on Linux. What could be the reason for this?

Also note that on both systems the GPU performance of `mean` is very poor, which is why I'm currently using `sum` as a workaround. Is this because OpenCV functions must be explicitly optimized for OpenCL and `mean` just hasn't received that attention yet? If so, is the current state of OpenCL support in OpenCV documented somewhere, or do we have to write tests like this?
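
The measurement procedure described in the question (one untimed warm-up call, then 10 timed calls) can be sketched as follows. This is not the code from the pastebin link, which is not reproduced here. It is a minimal, library-independent sketch: `time_ms` and `speedup` are hypothetical helper names, and the callable passed in stands for whatever OpenCV call is being measured. For an asynchronous backend such as a GPU, the callable is assumed to block until the work has actually finished, otherwise only the enqueue time is measured.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper (not from the original post): calls `op` once untimed
// as a warm-up, then returns the total wall-clock time in milliseconds
// spent on `runs` further calls.
double time_ms(const std::function<void()>& op, int runs = 10) {
    op();  // warm-up call, excluded from the measurement
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < runs; ++i) {
        op();  // assumed to block until the operation has completed
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical helper: how many times faster the candidate is than the
// baseline, given two timings in the same unit.
double speedup(double baseline_ms, double candidate_ms) {
    return baseline_ms / candidate_ms;
}
```

With the Ubuntu numbers from the table, `speedup(224.0, 44.0)` gives roughly 5.1 for `Sobel` and `speedup(86.0, 7.0)` gives roughly 12.3 for `sum`. The warm-up call matters here because the first use of an OpenCL code path typically includes one-time setup costs that would otherwise be counted as execution time.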