
historical performance test case data in build pipeline?

asked 2020-12-03 09:34:41 -0600

diablodale

Hi all. Are there organized data/results kept for the performance tests that are automatically run/collected by the OpenCV build pipeline, so that I can inspect and compare performance changes across builds/releases? My intention is to see the performance effect of a code fix and compare it historically across all the platforms the pipeline/buildbot makes.

I can see at e.g. https://pullrequest.opencv.org/buildb... that performance tests were run: the raw stdio was collected, and it shows that 44 seconds elapsed for 2400+ test cases. However, I do not see where the individual test results are collected, verified, or compared across builds. There is no visibility into how a single test within those 2400+ test cases changes over time, the way summary.py provides for isolated builds.

Naturally, I run performance tests on my own computers and analyze individual test cases for changed code. But my tests cover only a minority of the hardware/OS platforms that the pipeline already exercises today, which could lead to performance increasing on some platforms and decreasing on others -- which is undesired.
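To illustrate the kind of comparison I mean, below is roughly the before/after check I do by hand. It assumes each perf run has been exported to a plain text file with one "test-name median-ms" pair per line; that export format is only an illustration for this sketch, not something the OpenCV test harness produces on its own.

    // Compare two exported perf runs and flag per-test changes beyond a tolerance.
    #include <fstream>
    #include <iostream>
    #include <map>
    #include <string>

    static std::map<std::string, double> loadRun(const std::string& path)
    {
        std::map<std::string, double> times;
        std::ifstream in(path);
        std::string name;
        double ms;
        while (in >> name >> ms)
            times[name] = ms;            // median time per test case, in ms
        return times;
    }

    int main(int argc, char** argv)
    {
        if (argc != 3) { std::cerr << "usage: compare before.txt after.txt\n"; return 1; }
        const double tolerance = 0.05;   // ignore changes smaller than 5%
        std::map<std::string, double> before = loadRun(argv[1]), after = loadRun(argv[2]);
        for (const auto& kv : before)
        {
            auto it = after.find(kv.first);
            if (it == after.end())
                continue;                // test case missing in the new run
            double ratio = it->second / kv.second;  // > 1 means slower than before
            if (ratio > 1.0 + tolerance)
                std::cout << "REGRESSION " << kv.first << " x" << ratio << "\n";
            else if (ratio < 1.0 - tolerance)
                std::cout << "IMPROVED   " << kv.first << " x" << ratio << "\n";
        }
        return 0;
    }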


Comments

I doubt it. Performance depends on the machine the test is run on; you'd need a reference machine to make values comparable. Even trying to normalize for processor speed won't do it. Performance depends on many factors, the bandwidths of the machine's memory hierarchy being critical.

crackwitz ( 2020-12-03 11:13:40 -0600 )

Naturally, I doubt it also... but I wanted to check, and to raise the concern that there is no testing automation checking performance. So all code changes to OpenCV rely on the individual developer always checking for performance regressions. That is neither an easy task nor a quick one, since it requires stable testing harnesses across many platforms, running performance tests on the before code and the after code, and then comparing the two results while allowing for slight variations.

diablodale ( 2020-12-04 08:29:22 -0600 )

Ah, then I misunderstood. It sounded to me as if there was performance testing in place but without expected performance values. I don't know if there is any performance testing; I believe I saw some "performance"-titled things in OpenCV's repository before, but I can't be sure.

crackwitz ( 2020-12-04 09:22:15 -0600 )

1 answer


answered 2020-12-05 05:14:19 -0600

Eduardo

I am not aware of any historical performance data stored anywhere for the OpenCV buildbots. And I doubt performance data from these CI instances would be meaningful. Dedicated hardware would be needed (but it is costly, time consuming to maintain, ...), and that is not the job of the buildbot CI. OpenCV human resources are already limited.

So all code changes to OpenCV rely on the individual developer always checking for performance regressions.

Yes, it is. See the optimization label for some examples/discussions.


I would also like to add that OpenCV relies on external libraries for "performance critical" functions, like Intel IPP for a lot of image processing functions, or cuDNN/OpenVINO/Tengine for deep learning.

I think there are few "micro optimizations" to be had, either in contributed new code or in the OpenCV code base.

I mean, a large part of the pull requests for performance optimization are "obvious optimizations": e.g. adding a SIMD code path for a specific function, parallelizing an algorithm, adding an OpenCL code path, ...
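For illustration only, a patch of the "parallelize an algorithm" kind usually boils down to wrapping the per-row work in cv::parallel_for_ so it runs on OpenCV's thread pool; the brightness adjustment below is just a stand-in for the real kernel being optimized.

    #include <opencv2/core.hpp>
    #include <opencv2/core/utility.hpp>   // cv::parallel_for_

    // Add a constant to every pixel of a CV_8UC1 image, one stripe of rows per task.
    static void addScalarParallel(cv::Mat& img, uchar value)
    {
        CV_Assert(img.type() == CV_8UC1);
        cv::parallel_for_(cv::Range(0, img.rows), [&](const cv::Range& range)
        {
            for (int r = range.start; r < range.end; ++r)
            {
                uchar* row = img.ptr<uchar>(r);
                for (int c = 0; c < img.cols; ++c)
                    row[c] = cv::saturate_cast<uchar>(row[c] + value);
            }
        });
    }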

Another topic is the conversion of old code from native intrinsics to universal intrinsics / wide-universal intrinsics (more information). For a library of this size, it is unmaintainable to have native intrinsics for all the supported architectures (x86, ARM, PowerPC, RISC-V, WebAssembly, ...).
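As a rough sketch of why that conversion is mostly mechanical, this is what a loop written once against the universal intrinsics looks like (using the classic fixed-width v_float32x4 API; the element-wise addition is only a toy example, and recent OpenCV versions spell the arithmetic as v_add instead of the overloaded operator):

    #include <opencv2/core.hpp>
    #include <opencv2/core/hal/intrin.hpp>

    // dst[i] = a[i] + b[i], written once against the universal intrinsic types;
    // the same source maps to SSE, NEON, VSX, ... depending on the build target.
    static void add_f32(const float* a, const float* b, float* dst, int n)
    {
        int i = 0;
    #if CV_SIMD128
        for (; i <= n - 4; i += 4)
        {
            cv::v_float32x4 va = cv::v_load(a + i);
            cv::v_float32x4 vb = cv::v_load(b + i);
            cv::v_store(dst + i, va + vb);
        }
    #endif
        for (; i < n; ++i)               // scalar tail for the remaining elements
            dst[i] = a[i] + b[i];
    }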

In summary, and in my opinion:

  • the performance of OpenCV code is in general really good
  • but for critical performance, OpenCV relies on external libraries, e.g. for the deep learning hot topic see OpenVINO, cuDNN, Tengine
  • or the user should rely on external libraries (e.g. Intel MKL, OpenBLAS for BLAS operations)
  • contributed new SIMD code must rely on the universal intrinsics / wide-universal intrinsics (WUI) framework
  • for the conversion of native intrinsics to WUI, the requirement is no perf regression (most of the time it is a conversion from x86 (or ARM NEON) intrinsics to universal intrinsics, where you can safely expect similar results since it is more or less a mapping of intrinsics; so there is no need for rigorous regression benchmarks on different CPU models, ...)
  • contributed new performance code should bring a noticeable performance gain; most of the time it is obvious since you add a SIMD code path and quantify the perf gain (see the timing sketch after this list)
  • OpenCV remains a general computer vision library; I am sure you can outperform OpenCV if you are dead serious and write code targeting a specific architecture, a specific memory cache size, etc.
  • for instance it is likely that the Simd Library outperforms OpenCV for the image processing functions it targets, but if you look at its source code, it is a maintenance nightmare in my opinion
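Below is the timing sketch mentioned above: quantifying the effect of a parallel code path by comparing a single-threaded baseline against OpenCV's default thread pool. The blur call is only a placeholder for whichever function was actually changed, and the numbers are of course machine dependent.

    #include <iostream>
    #include <opencv2/core.hpp>
    #include <opencv2/core/utility.hpp>   // cv::TickMeter, cv::setNumThreads
    #include <opencv2/imgproc.hpp>

    // Average time in milliseconds of one blur call over `iterations` runs.
    static double timeBlurMs(const cv::Mat& src, cv::Mat& dst, int iterations)
    {
        cv::TickMeter tm;
        tm.start();
        for (int i = 0; i < iterations; ++i)
            cv::blur(src, dst, cv::Size(7, 7));
        tm.stop();
        return tm.getTimeMilli() / iterations;
    }

    int main()
    {
        cv::Mat src(2160, 3840, CV_8UC3), dst;
        cv::randu(src, cv::Scalar::all(0), cv::Scalar::all(255));

        cv::setNumThreads(1);                      // single-threaded baseline
        double serialMs = timeBlurMs(src, dst, 20);

        cv::setNumThreads(cv::getNumberOfCPUs());  // parallel code path
        double parallelMs = timeBlurMs(src, dst, 20);

        std::cout << "serial: " << serialMs << " ms, parallel: " << parallelMs
                  << " ms, speedup: x" << serialMs / parallelMs << std::endl;
        return 0;
    }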

My intention is to see the performance effect of a code fix and compare it historically across all the platforms the pipeline/buildbot makes.

Maybe you can add more information about which function you are targeting and what the optimization is?

Is the expected performance gain significant?


Note:

  • I am not affiliated with the OpenCV organisation
  • nor does my job involve HPC
