Does OpenCV support the use of vector reciprocal on ARM NEON?

asked 2014-09-22 14:18:39 -0600

rwong

Some ARM NEON architectures have no native floating-point division instruction for vector data. Instead, the operation must be composed from a sequence of native instructions that together implement an iterative reciprocal-estimate algorithm (most likely the Newton-Raphson method).
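To make the sequence concrete, here is a minimal sketch of how such a reciprocal is usually composed with NEON intrinsics: `vrecpeq_f32` produces a low-precision estimate, and each `vmulq_f32(x, vrecpsq_f32(d, x))` is one Newton-Raphson step, since `vrecpsq_f32(d, x)` returns `2 - d*x`. The names `refine_reciprocal` and `reciprocal_f32` are hypothetical helpers for illustration, not OpenCV functions, and the choice of two refinement steps is an assumption about the accuracy wanted. On builds without NEON the sketch falls back to plain scalar division.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Scalar form of one Newton-Raphson step for 1/d: x' = x * (2 - d * x).
// Each step roughly doubles the number of correct bits in x.
inline float refine_reciprocal(float d, float x) {
    return x * (2.0f - d * x);
}

// Hypothetical helper (not an OpenCV API): dst[i] ~= 1 / src[i].
// Zero and infinite inputs are not handled specially in this sketch.
void reciprocal_f32(const float* src, float* dst, std::size_t n) {
    assert(src != nullptr && dst != nullptr);
    std::size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    for (; i + 4 <= n; i += 4) {
        float32x4_t d = vld1q_f32(src + i);
        float32x4_t x = vrecpeq_f32(d);        // coarse initial estimate
        x = vmulq_f32(x, vrecpsq_f32(d, x));   // step 1: x * (2 - d*x)
        x = vmulq_f32(x, vrecpsq_f32(d, x));   // step 2 (assumed sufficient)
        vst1q_f32(dst + i, x);
    }
#endif
    // Scalar tail, and the whole loop when NEON is unavailable.
    for (; i < n; ++i) {
        dst[i] = 1.0f / src[i];
    }
}
```

A division `a / b` then becomes `a * reciprocal(b)`, at the cost of a result that may differ from a correctly rounded division in the last bit or so. Note that AArch64 does provide a vector divide (`vdivq_f32`), so this composition matters mainly on 32-bit ARMv7 NEON.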

C++ compilers targeting ARM NEON should automatically generate such instructions for scalar floating-point source code, or defer to a standard math library function call. However, if the library code explicitly loops over each element and performs its own non-trivial approximation, it is doubtful that a C++ compiler (even with auto-vectorization enabled) would override that hard-coded logic.

It appears that unless the library contains NEON-specific code for floating-point matrix division, it will fall back to scalar C++ code, resulting in one math library function call per matrix element.

I see that OpenCV contains a very nifty vector-of-four elementwise division algorithm, but I doubt that it could beat the natively implemented instructions.

Has anyone performed a benchmark on mobile ARM NEON processors to evaluate the performance of the native NEON vector reciprocal estimate operations?
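For anyone who wants to try this on their own device, a rough timing harness could look like the sketch below. It is only a starting point: `divide_scalar`, `DivideFn` and `time_divide` are hypothetical names, the buffer contents are arbitrary, and a NEON reciprocal-based routine with the same signature would be passed in alongside the scalar baseline for comparison. No measured numbers are implied here.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Baseline under test: plain scalar division, left to the compiler
// to vectorize or not.
void divide_scalar(const float* num, const float* den, float* out,
                   std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = num[i] / den[i];
    }
}

// Any candidate implementation is assumed to share this signature.
using DivideFn = void (*)(const float*, const float*, float*, std::size_t);

// Returns the average time in nanoseconds per element over `repeats` passes.
double time_divide(DivideFn fn, std::size_t n, int repeats) {
    assert(n > 0 && repeats > 0);
    std::vector<float> num(n, 1.0f), den(n), out(n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        den[i] = 1.0f + static_cast<float>(i % 97);  // non-zero divisors
    }
    fn(num.data(), den.data(), out.data(), n);  // warm-up pass
    volatile float sink = 0.0f;  // keeps the results observable
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        fn(num.data(), den.data(), out.data(), n);
        sink = sink + out[static_cast<std::size_t>(r) % n];
    }
    const auto stop = std::chrono::steady_clock::now();
    const double ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return ns / (static_cast<double>(n) * repeats);
}
```

Calling `time_divide(divide_scalar, 1 << 16, 100)` and then the same with a NEON variant gives two per-element figures to compare. Results will depend heavily on compiler flags (for example `-O2` versus `-O3`, and on 32-bit ARM whether NEON is enabled at all) and on whether the buffers fit in cache.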

Comments

Hey Rwong! I am very interested in this topic... would you like to exchange some e-mails? I have some questions I'd love to ask you :) Please email me @ [email protected]

Pedro Batista ( 2015-02-11 12:15:07 -0600 )

@PedroBatista If your question is related to OpenCV you can post a question on this site. If your question is about source code sharing, unfortunately all of my work is done for my employer (due to the "work for hire" contract), therefore I cannot share any source code unless that sharing is explicitly permitted by and deemed beneficial to my employer.

rwong ( 2015-04-03 19:30:25 -0600 )

Nope, the question was related to NEON programming and how it works. Thanks anyway :)

Pedro Batista ( 2015-04-06 05:07:51 -0600 )