I apply MOG2 to a 640x480 image and it executes in 1 ms. If I take that same image and copy it 4 times, the execution time becomes 3.2 ms. (This is GPU time only, measured while applying the extractor; it does not include memory transfer.) In my opinion the time shouldn't jump from 1 ms to 3.2 ms, because the processing is parallel. My system has CUDA 8.0, OpenCV 3.3.0 and an 850M graphics card.
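
To put the two timings side by side, here is a rough back-of-the-envelope sketch. It assumes that "copied 4 times" means the GPU processes four times as many pixels in one call; the variable and function names are only illustrative and nothing here calls OpenCV.

```python
# Timings and frame size are taken from the post above.
# Assumption: the 4x case means 4x as many pixels in a single call.
WIDTH, HEIGHT = 640, 480
T_SINGLE_MS = 1.0   # one 640x480 frame
T_QUAD_MS = 3.2     # four copies of that frame

pixels_single = WIDTH * HEIGHT       # 307200 pixels
pixels_quad = 4 * pixels_single      # 1228800 pixels


def throughput(pixels, time_ms):
    """Pixels processed per millisecond."""
    return pixels / time_ms


tp_single = throughput(pixels_single, T_SINGLE_MS)
tp_quad = throughput(pixels_quad, T_QUAD_MS)

work_ratio = pixels_quad / pixels_single   # how much more work
time_ratio = T_QUAD_MS / T_SINGLE_MS       # how much more time
speedup = tp_quad / tp_single              # throughput gain on the larger input

if __name__ == "__main__":
    print(f"work ratio:  {work_ratio:.2f}x")
    print(f"time ratio:  {time_ratio:.2f}x")
    print(f"throughput gain on the larger input: {speedup:.2f}x")
```

Under that assumption the work grows 4x while the time grows 3.2x, so the measured scaling is somewhat better than linear in pixel count, though far from constant.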