I want to accelerate my code using SSE or NEON. OpenCV has a uniform interface for different devices, so I'd like to use OpenCV's SIMD features. However, I find that the macro CV_SIMD128 is not defined outside OpenCV. Is there a safe way to use types such as v_int16x8 and functions such as v_reduce_sum in my own code?