Fastest way to convert BGR <-> RGB! Aka: Do NOT use Numpy magic "tricks".
IMPORTANT: This article is very long. Remember to click the (more) at the bottom of this post to read the whole article!
I was reading this question: https://answers.opencv.org/question/1... and it didn't explain things very well at all. So here's a deep examination and explanation for everyone's future reference!
Converting RGB to BGR, and vice versa, is one of the most important operations you can do in OpenCV if you're interoperating with other libraries, raw system memory, etc. And each imaging library expects its own specific channel order.
There are many ways to achieve the conversion, and cv2.cvtColor() is often frowned upon because there are "much faster" ways to do it via numpy "view" manipulation.
Whenever you attempt to convert colors in OpenCV, you actually invoke a huge machinery:
https://github.com/opencv/opencv/blob... https://github.com/opencv/opencv/blob...
As you can see, internally, OpenCV creates an "OpenCL Kernel" with the instructions for the data transformation, and then runs it. This creates brand new (re-arranged) image data in memory, which is of course a pretty slow operation, involving new memory allocation and data-copying.
However, there is another way to flip between RGB and BGR channel orders, which is very popular - and very bad (as you'll find out soon). And that is: Using numpy's built-in methods for manipulating the array data.
Note that there are two ways to manipulate data in Numpy:
- One of the ways, the bad way, just changes the "view" of the Numpy array and is therefore instant (O(1)), but does NOT transform the underlying img.data in RAM/memory. This means that the raw memory does NOT contain the new channel order, and Numpy instead "fakes" it by creating a "view" that simply says "when we read this data from RAM, view it as R=B, G=G, B=R" basically... (Technically speaking, it changes the ".strides" property of the Numpy object, which instead of saying "read R then G then B" (stride "1" aka going forwards in RAM when reading the color channels) changes it to say "read B, then G, then R" (stride "-1" aka going backwards in RAM when reading the color channels)).
- The second way, which is totally fine, is to always ensure that we arrange the pixel data properly in memory too, which is a lot slower but is almost always necessary, depending on what library/API your data is intended to be sent to!
To determine whether a numpy array manipulation has also changed the underlying MEMORY, you can look at the img.flags['C_CONTIGUOUS'] value. If True, it means that the data in RAM is in the correct order (that's great!). If False, it means that the data in RAM is in the wrong order and that we are "cheating" via a numpy View instead (that's BAD!).
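As a quick illustration of that flag check, here is a small sketch using a hypothetical random 4x5 image. The names (img, view, fixed) are placeholders, and np.ascontiguousarray() is used here as one possible way to get the data physically reordered.

```python
import numpy as np

# Hypothetical 4x5 image with 3 channels, filled with random pixel values.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)

view = img[:, :, ::-1]             # view-based channel flip
fixed = np.ascontiguousarray(view) # forces a properly ordered copy in RAM

print(img.flags['C_CONTIGUOUS'])    # True: original data is laid out in order
print(view.flags['C_CONTIGUOUS'])   # False: numpy is "cheating" with a view
print(fixed.flags['C_CONTIGUOUS'])  # True: memory now matches what numpy shows
```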
Whenever you use the "View-based" methods to flip channels in an ndarray (such as RGB -> ...
Stop using Python in general.
@sjhalayka Python is the standard language for machine learning/AI/computer vision. You'll find Python code in most research papers. The reason is that it's super easy to iterate on algorithm ideas in Python, TONS of libraries, and you don't write tedious C++ or compile any code between executions. If you decide that something is slow in Python, do code profiling to find the bottleneck. Then try to optimize that part in pure Python (or perhaps via some pre-written library/module). If you still can't get the code to be fast enough, you write your own Python module in C++ which handles that part of the algorithm, and then call it from Python. That's how the ML/AI/CV world works right now. I am not a big fan of Python either, but it has a huge ecosystem and always gets the job done.
@twmnd2 ^^ he's pulling your leg ...
You should be in sales, not development. LOL
@berak Joke's on him... Pythons don't have legs. :-P
LOLOLOLOLOL