Big GPU matrix division

asked 2015-03-18 02:56:09 -0600

Johnny Chien

updated 2015-03-18 03:02:52 -0600

Hi there,

I tried to perform a per-element division of two long 1D GPU matrices, and it failed with the following exception:

Invalid Configuration Argument - This error means that the dimension of either the specified grid of blocks (dimGrid) , or number of threads in a block (dimBlock), is incorrect. In such a case, the dimension is either zero or the dimension is larger than it should be. This error will only occur if you dynamically determine the dimensions.

After tracing this down to the source, I found:

const dim3 block(Policy::block_size_x, Policy::block_size_y);

const dim3 grid(divUp(cols, block.x), divUp(rows, block.y));

Since there are 729,632 rows and 1 column in each of the GPU matrices, the computed grid size is 1 by 91,204 by 1 according to the policy

struct DefaultTransformPolicy
{
    enum {
        block_size_x = 32,
        block_size_y = 8,
        shift = 4
    };
};

which does not fit my case, because 91,204 exceeds the grid dimension limit of 65,535.
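To make the arithmetic explicit, here is a minimal host-side sketch of the grid computation. It assumes divUp is plain round-up integer division and uses the CUDA limit of 65,535 on the grid's y dimension. GridSize, gridFor and fitsGridLimit are names made up for this illustration and are not part of the library.

```cpp
#include <cassert>

// Round-up integer division, the role divUp plays in the quoted source
// (assumed behaviour, not copied from the library).
inline int divUp(int total, int grain) {
    assert(grain > 0);
    return (total + grain - 1) / grain;
}

// Block dimensions taken from DefaultTransformPolicy as quoted in the question.
const int kBlockX = 32;
const int kBlockY = 8;

// Maximum y dimension of a CUDA grid.
const int kMaxGridY = 65535;

// Hypothetical stand-in for the x and y fields of dim3.
struct GridSize {
    int x;
    int y;
};

// Hypothetical helper: the grid the quoted code would request for a
// rows-by-cols matrix.
inline GridSize gridFor(int rows, int cols) {
    GridSize g;
    g.x = divUp(cols, kBlockX);
    g.y = divUp(rows, kBlockY);
    return g;
}

// True when the y dimension stays within the CUDA limit.
inline bool fitsGridLimit(const GridSize& g) {
    return g.y <= kMaxGridY;
}
```

Calling gridFor(729632, 1) gives x = 1 and y = 91204, so fitsGridLimit returns false, which matches the exception above.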

I was wondering how this policy is decided. Is it possible to override it within my own code, without rebuilding the library?
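One idea that would not touch the policy is to view the same elements as a matrix with more columns and fewer rows, so that the y dimension of the grid shrinks. The sketch below only does the arithmetic: it searches for the smallest column count that divides the element count evenly and keeps the grid within the limit. Whether the library can actually view the data in that shape (for example through a reshape call on a continuous matrix) is an assumption this thread does not confirm, and Shape and foldToFit are hypothetical names.

```cpp
#include <cassert>

// Round-up integer division (assumed behaviour of divUp in the quoted source).
inline int divUp(int total, int grain) {
    assert(grain > 0);
    return (total + grain - 1) / grain;
}

// Hypothetical description of a 2D view over the same elements.
struct Shape {
    int rows;
    int cols;
};

// Hypothetical helper: returns the shape with the smallest column count that
// divides `elements` evenly and keeps divUp(rows, blockY) <= maxGridY.
// Returns {0, 0} when `elements` is not positive.
inline Shape foldToFit(int elements, int blockY, int maxGridY) {
    for (int cols = 1; cols <= elements; ++cols) {
        if (elements % cols != 0) {
            continue;  // this width would leave a partial last row
        }
        const int rows = elements / cols;
        if (divUp(rows, blockY) <= maxGridY) {
            Shape s = {rows, cols};
            return s;
        }
    }
    Shape none = {0, 0};
    return none;
}
```

For the sizes in this question, foldToFit(729632, 8, 65535) yields 364,816 rows by 2 columns, for which the grid's y dimension would be 45,602. Per-element division does not depend on the shape, so the result would be the same elements in the same order if such a view is possible.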

Same problem here. How can you modify the Policy? I have not found a solution.

luisruiz ( 2018-04-05 00:09:06 -0600 )