open-source NEON optimizations

asked 2013-07-30 14:52:40 -0600

adrians

Hello!

I saw in the ICVS materials that there would be open-source code for the NEON optimizations in the OpenCV library (maybe in the 3.0 release), but at the moment it's uncertain.

I want to know more about that because I'd like to present what effect the optimization level (prefetching, vectorization, etc.) has on the runtime performance of computer vision algorithms on embedded platforms. If the optimized versions are not available, I'll have to write them myself (but that would cover only a few operations, just enough to build a working demo and gather statistics).

I'd like to mention that I'll be using a dev board with an ARM Cortex-A8 core (with a Mali-400 GPU, but I'm not aiming to use the GPU).

Thanks a lot!


1 answer


answered 2013-07-31 06:09:41 -0600

Kirill Kornyakov

Well, it is true that OpenCV may get open-source NEON optimizations someday. But they would have to be contributed by somebody (a company or an individual), and the core team doesn't have plans (and, more importantly, resources) to work on these optimizations. So, no guarantees, but the possibility exists...

So, if you need open-source NEON code for your demos, I wouldn't wait if I were you... It may take years to get them into OpenCV. Instead, you can work on your own and become the first contributor =) Prefetches and vectorization can give you a good speedup, typically 3x in comparison to the original code. But of course it depends on the algorithm.
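To make "prefetches and vectorization" concrete, here is a minimal sketch of what a hand-written kernel might look like. It is not OpenCV code: the function names, the choice of operation (saturating add of two 8-bit images) and the prefetch distance of 64 bytes are all assumptions for illustration. The NEON path uses the standard `arm_neon.h` intrinsics and falls back to the scalar loop when NEON is not available, so the same file builds on a desktop for comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define DEMO_HAVE_NEON 1
#else
#define DEMO_HAVE_NEON 0
#endif

// Scalar reference: saturating add of two 8-bit images stored as flat arrays.
// (Hypothetical helper, not an OpenCV function.)
void add_saturate_scalar(const uint8_t* a, const uint8_t* b,
                         uint8_t* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        unsigned sum = unsigned(a[i]) + unsigned(b[i]);
        dst[i] = static_cast<uint8_t>(sum > 255u ? 255u : sum);
    }
}

// Same operation, 16 pixels per iteration when NEON is available.
// (Hypothetical helper, not an OpenCV function.)
void add_saturate_optimized(const uint8_t* a, const uint8_t* b,
                            uint8_t* dst, std::size_t n) {
    assert(n == 0 || (a && b && dst));
    std::size_t i = 0;
#if DEMO_HAVE_NEON
    for (; i + 16 <= n; i += 16) {
#if defined(__GNUC__)
        // Hint the cache about data needed a few iterations ahead.
        // The distance (64 bytes) is a guess and should be tuned per board.
        if (i + 64 < n) {
            __builtin_prefetch(a + i + 64);
            __builtin_prefetch(b + i + 64);
        }
#endif
        uint8x16_t va = vld1q_u8(a + i);       // load 16 bytes
        uint8x16_t vb = vld1q_u8(b + i);
        vst1q_u8(dst + i, vqaddq_u8(va, vb));  // saturating add, store
    }
#endif
    // Tail (or the whole array when NEON is not available).
    add_saturate_scalar(a + i, b + i, dst + i, n - i);
}
```

Checking the optimized version against the scalar one on odd-sized inputs (so the tail loop is exercised) is a cheap way to catch mistakes before measuring anything.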

If you don't need the source, you can use the NEON optimizations that are available in binary form as part of OpenCV for Tegra (official docs, introductory talk, sample app, relevant article). This way you can try to estimate what speedup to expect in your application, but you'll need a Tegra 3 device for these experiments. Alternatively, you can try FastCV, but again, you need a Qualcomm device, and you'll have to rewrite your application.

That's basically all I have to say. A NEON code contribution is not going to happen in the short term, but there is a chance we will see it in 3.0. In the meantime, I would recommend experimenting with your own code, or trying optimized libraries provided in binary form, like OpenCV for Tegra.
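For the "experiment with your own code" route, the statistics can be gathered with a very small timing harness. The sketch below is only one way to do it; `best_time_ms` and `speedup` are hypothetical names, and taking the best of several runs is an assumption about how to reduce noise from the OS and caches, not an established OpenCV practice.

```cpp
#include <cassert>
#include <chrono>
#include <limits>

// Run `kernel` `repeats` times and return the fastest wall-clock time in
// milliseconds. (Hypothetical helper for illustration.)
template <typename Fn>
double best_time_ms(Fn&& kernel, int repeats) {
    assert(repeats > 0);
    double best = std::numeric_limits<double>::max();
    for (int r = 0; r < repeats; ++r) {
        auto t0 = std::chrono::steady_clock::now();
        kernel();
        auto t1 = std::chrono::steady_clock::now();
        double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
        if (ms < best) best = ms;
    }
    return best;
}

// Speedup of the optimized version relative to the baseline.
inline double speedup(double baseline_ms, double optimized_ms) {
    assert(optimized_ms > 0.0);
    return baseline_ms / optimized_ms;
}
```

Usage would be along the lines of `double t = best_time_ms([&] { my_kernel(src, dst, n); }, 20);` for each variant (`my_kernel` being whatever you are measuring), then `speedup(t_scalar, t_neon)`. Make sure the kernel's output is actually used afterwards, otherwise the compiler may optimize the work away and the numbers become meaningless.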



It is also worth taking a look at the disassembly of the binary code. When compiled with GCC with auto-vectorization enabled, some parts of OpenCV may see speed-ups, although currently there is little information on which parts get faster and which don't. Hand-tuned NEON will always beat auto-vectorization, so it is still important to keep it on the roadmap, even if it is someone else's roadmap (that is, the hardware vendors').

rwong (2014-08-11 20:20:25 -0600)
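As a rough illustration of the auto-vectorization check mentioned in the comment above: the loop below is written in a form that GCC's vectorizer can usually handle (simple counted loop, integer data, no aliasing thanks to `__restrict`). The kernel itself (`absdiff_u8`) is a hypothetical example, not OpenCV code, and the cross-compiler name in the comments is an assumption that depends on your toolchain.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical kernel: per-pixel absolute difference of two 8-bit images.
// `__restrict` (a GCC/Clang extension) promises the buffers do not overlap,
// which removes one common obstacle to auto-vectorization.
void absdiff_u8(const uint8_t* __restrict a, const uint8_t* __restrict b,
                uint8_t* __restrict dst, std::size_t n) {
    assert(n == 0 || (a && b && dst));
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = static_cast<uint8_t>(a[i] > b[i] ? a[i] - b[i] : b[i] - a[i]);
    }
}

// To inspect what the compiler produced for a Cortex-A8 target, something
// like the following should work (toolchain prefix is an assumption):
//
//   arm-linux-gnueabi-g++ -O3 -mcpu=cortex-a8 -mfpu=neon -mfloat-abi=softfp \
//       -ftree-vectorize -S absdiff.cpp -o absdiff.s
//
// or, for an already built binary:
//
//   arm-linux-gnueabi-objdump -d libmylib.so | less
//
// In the output, loads/stores such as vld1/vst1 and arithmetic on q or d
// registers inside the loop suggest that the loop was vectorized.
```

Integer loops are the easier case here; whether a given loop ends up vectorized still depends on the GCC version and flags, so looking at the assembly is the only reliable check.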


Last updated: Jul 31 '13