CARMA DevKit Tegra 3 NEON optimizations
Hello,
We are using the CARMA (CUDA on ARM) Development Kit (Tegra T30 Q7 module onboard with ARM Cortex-A9, 1.3 GHz, Ubuntu Linux 3.1, gcc 4.6.3) for some experiments on the performance benefits of NEON instructions. We've modified some OpenCV 2.4.2 routines with NEON intrinsics, but observe very little performance gain. We get much better gains on a Samsung Exynos 4412 (also ARM Cortex-A9, 1.4 GHz, Android NDK gcc 4.6.x) with the same routines.
Is there a way for us to make use of the Tegra 3 optimized OpenCV routines (binary pack) on Ubuntu Linux? Or are there any specific flags/methods for the Tegra 3 that we are missing?
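For reference, here is a minimal sketch of the kind of routine we mean. It is not our actual OpenCV change: the names `add_sat_u8` and `neon_enabled` are made up for illustration. It assumes GCC's predefined `__ARM_NEON__` macro and build flags along the lines of `-O3 -mcpu=cortex-a9 -mfpu=neon` (plus whichever `-mfloat-abi` setting matches the toolchain). If those flags are not in effect, only the scalar fallback gets compiled, which might be one explanation for small gains.

```c
#include <stddef.h>
#include <stdint.h>

/* GCC defines __ARM_NEON__ (and newer toolchains __ARM_NEON) when NEON is enabled. */
#if defined(__ARM_NEON__) || defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: reports whether the NEON path was compiled in. */
int neon_enabled(void) { return HAVE_NEON; }

/* Hypothetical example routine: dst[i] = min(a[i] + b[i], 255). */
void add_sat_u8(const uint8_t *a, const uint8_t *b, uint8_t *dst, size_t n)
{
    size_t i = 0;
#if HAVE_NEON
    /* Process 16 bytes per iteration with a saturating vector add. */
    for (; i + 16 <= n; i += 16) {
        uint8x16_t va = vld1q_u8(a + i);
        uint8x16_t vb = vld1q_u8(b + i);
        vst1q_u8(dst + i, vqaddq_u8(va, vb));
    }
#endif
    /* Scalar tail, or the whole array when NEON is not enabled. */
    for (; i < n; ++i) {
        unsigned sum = (unsigned)a[i] + b[i];
        dst[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
}
```

Calling `neon_enabled()` at startup is a quick way to confirm on each board which path the binary actually contains.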
Any help is much appreciated!
Thanks, Gaurav