Because you are measuring it wrong :)
A GPU needs to be initialized each time a process wants to access it, before it can start calculating. Addition is one of the most basic operations and runs extremely fast on both CPU and GPU. This means that the initialization time of the GPU is far larger than the actual processing time, so the initialization dominates your measurement.
To see any difference, use a computationally heavy algorithm, for example matching features against a database of 1000 images, and you will see an immense increase in processing speed.
Another approach is to run the GPU loop twice in a row and time only the second run; the initialization should no longer be part of that measurement.
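To make the "run it twice" idea concrete, here is a minimal sketch of the timing pattern. It does not touch a real GPU: `FakeDevice`, its `add` method and the 0.2 s setup delay are hypothetical stand-ins I made up to mimic a one-time initialization cost, so only the measurement pattern carries over to your own code.

```python
import time


class FakeDevice:
    """Hypothetical stand-in for a GPU: the first call pays a one-time setup cost."""

    def __init__(self, init_seconds=0.2):
        self.init_seconds = init_seconds  # assumed setup cost, not a real figure
        self.initialized = False

    def add(self, a, b):
        if not self.initialized:
            time.sleep(self.init_seconds)  # simulated initialization
            self.initialized = True
        return [x + y for x, y in zip(a, b)]


def time_call(fn, *args):
    """Return the result of fn(*args) and the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


device = FakeDevice()
a = list(range(1000))
b = list(range(1000))

# First call: initialization plus the addition itself.
result_first, first_run = time_call(device.add, a, b)
# Second call: the device is already set up, so only the addition is timed.
result_second, second_run = time_call(device.add, a, b)

print(f"first run:  {first_run:.4f} s (includes initialization)")
print(f"second run: {second_run:.4f} s (processing only)")
```

In your real benchmark, replace the `device.add` calls with your GPU loop and compare the second timing against the CPU version.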