
Fastest way to convert BGR <-> RGB! Aka: Do NOT use Numpy magic "tricks".

asked 2019-09-30 14:20:45 -0600

tjwmmd2

updated 2020-09-25 19:46:18 -0600


IMPORTANT: This article is very long. Remember to click the (more) at the bottom of this post to read the whole article!


I was reading this question: https://answers.opencv.org/question/1... and it didn't explain things very well at all. So here's a deep examination and explanation for everyone's future reference!

Converting RGB to BGR, and vice versa, is one of the most important operations you can do in OpenCV if you're interoperating with other libraries, raw system memory, etc. Each imaging library expects its own channel order.

There are many ways to achieve the conversion, and cv2.cvtColor() is often frowned upon because there are "much faster" ways to do it via numpy "view" manipulation.

Whenever you attempt to convert colors in OpenCV, you actually invoke a huge machinery:

https://github.com/opencv/opencv/blob... https://github.com/opencv/opencv/blob...

As you can see, internally, OpenCV dispatches to a dedicated conversion routine (an "OpenCL Kernel" when OpenCL is in use, otherwise optimized CPU code) and then runs it. This creates brand new (re-arranged) image data in memory, which is of course a pretty slow operation, involving new memory allocation and data-copying.

However, there is another way to flip between RGB and BGR channel orders, which is very popular - and very bad (as you'll find out soon). And that is: Using numpy's built-in methods for manipulating the array data.

Note that there are two ways to manipulate data in Numpy:

  • One way, the bad way, just changes the "view" of the Numpy array and is therefore instant (O(1)), but it does NOT transform the underlying img.data in RAM. The raw memory does NOT contain the new channel order; Numpy "fakes" it by creating a view that basically says "when we read this data from RAM, treat R as B, G as G, and B as R". (Technically speaking, it changes the ".strides" property of the Numpy object. For 8-bit image data, the channel stride goes from "1", meaning step forwards through RAM when reading the color channels, to "-1", meaning step backwards through RAM when reading them.)
  • The second way, which is totally fine, is to actually rearrange the pixel data in memory too. That is a lot slower, but it is almost always necessary, depending on which library/API your data is going to be sent to!
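The difference between the two approaches can be seen directly in the strides. The following is only a minimal sketch, assuming a tiny made-up 2x2 uint8 image; the variable names are hypothetical and not taken from the original post.

```python
import numpy as np

# Hypothetical 2x2 BGR image: uint8, 3 channels per pixel.
bgr = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)

# "View" approach: no pixel data is moved, only the strides change.
rgb_view = bgr[..., ::-1]

# "Real" approach: the reversed channels are written into a new buffer.
rgb_copy = bgr[..., ::-1].copy()

print(bgr.strides)       # (6, 3, 1)
print(rgb_view.strides)  # (6, 3, -1)
print(rgb_copy.strides)  # (6, 3, 1)

# The view still points at the original buffer; the copy does not.
print(np.shares_memory(bgr, rgb_view))  # True
print(np.shares_memory(bgr, rgb_copy))  # False
```

Both results compare equal element by element; only the layout in RAM differs, which is exactly what matters once the buffer is handed to another library.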

To determine whether a numpy array manipulation has also changed the underlying MEMORY, look at the img.flags['C_CONTIGUOUS'] value. If True, the data in RAM is laid out in exactly the order the array presents it (that's great!). If False, the data in RAM is in a different order and we are "cheating" via a numpy view instead (that's BAD!).
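As a small illustration of that check, here is a sketch that inspects the flag and then forces a contiguous layout with np.ascontiguousarray. The blank test image is an assumption made for the example.

```python
import numpy as np

# Hypothetical blank 4x4 image with 3 channels.
img = np.zeros((4, 4, 3), dtype=np.uint8)

# Channel flip via a view: the memory itself is untouched.
flipped = img[..., ::-1]
print(flipped.flags['C_CONTIGUOUS'])  # False

# Force the data into a real, contiguous layout in RAM.
fixed = np.ascontiguousarray(flipped)
print(fixed.flags['C_CONTIGUOUS'])    # True
```

np.ascontiguousarray only copies when it has to, so calling it on data that is already contiguous should not cost an extra allocation.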

Whenever you use the "View-based" methods to flip channels in an ndarray (such as RGB -> ... (more)


Comments

Stop using Python in general.

sjhalayka ( 2019-09-30 19:09:01 -0600 )

@sjhalayka Python is the standard language for machine learning/AI/computer vision. You'll find Python code in most research papers. The reason is that it's super easy to iterate on algorithm ideas in Python, TONS of libraries, and you don't write tedious C++ or compile any code between executions. If you decide that something is slow in Python, do code profiling to find the bottleneck. Then try to optimize that part in pure Python (or perhaps via some pre-written library/module). If you still can't get the code to be fast enough, you write your own Python module in C++ which handles that part of the algorithm, and then call it from Python. That's how the ML/AI/CV world works right now. I am not a big fan of Python either, but it has a huge ecosystem and always gets the job done.

tjwmmd2 ( 2019-09-30 19:17:05 -0600 )

@tjwmmd2 ^^ he's pulling your leg ...

berak ( 2019-10-01 01:11:32 -0600 )

You should be in sales, not development. LOL

sjhalayka ( 2019-10-01 10:33:49 -0600 )

@berak Joke's on him... Pythons don't have legs. :-P

tjwmmd2 ( 2019-10-02 09:00:05 -0600 )

LOLOLOLOLOL

sjhalayka ( 2019-10-02 10:27:50 -0600 )

1 answer


answered 2020-01-03 23:53:51 -0600

Winston

@tjwmmd2 fantastic article~!!!!

I was looking for how to convert BGR to RGB and found so many different answers. Your article unifies all of them and draws a conclusion, which was very much needed.

Just 1 question.

In my use case, I get the image from OpenCV and feed it to my trained PyTorch AI model to predict whether there are birds in the image or not.

So my understanding is that I can use the "dirty numpy tricks", which is

x = x[...,::-1]

Am I correct?


Comments

Hi Winston, I haven't visited this question in almost a year and only just saw your message. Whenever a library (such as OpenCV or PyTorch) is involved, the answer depends on how THAT library deals with non-contiguous Numpy data, so you will have to benchmark it. Create a loop that runs a PyTorch function a few times on an image that you've used "numpy tricks" on. Then do x = x.copy() to force Numpy to create REAL, contiguous data in RAM, and run the loop again. In most cases, with most libraries, the x = x.copy() version will be the fastest. In my post I describe how both OpenCV and Matplotlib handle "faked numpy trick data": OpenCV always makes a copy, while Matplotlib has code to handle the trick data without copies, BUT it runs SLOWER than it does on real, contiguous Numpy data.
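A benchmark along those lines could look roughly like the sketch below. The process() function is a hypothetical stand-in for whatever library call you actually care about (it just serializes the pixels here), and the image size and repeat count are arbitrary assumptions. Also note that, as far as I know, torch.from_numpy refuses arrays with negative strides, so with PyTorch the copy may be required rather than merely faster.

```python
import timeit
import numpy as np

def process(image):
    # Hypothetical stand-in for the real library call being benchmarked.
    # Serializing to bytes forces every pixel to be read in logical order.
    return image.tobytes()

# Random test image (assumed size: 640x480, 3 channels, uint8).
rng = np.random.default_rng(0)
bgr = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)

rgb_view = bgr[..., ::-1]   # "numpy trick": a view with a negative stride
rgb_real = rgb_view.copy()  # real, contiguous data in RAM

# Time the same call on both variants.
t_view = timeit.timeit(lambda: process(rgb_view), number=20)
t_real = timeit.timeit(lambda: process(rgb_real), number=20)

print(f"view:       {t_view:.4f} s")
print(f"contiguous: {t_real:.4f} s")
```

Swap process() for your own function and keep the one-time cost of the .copy() in mind: it pays off when the converted image is used more than once, or when the library would have copied it internally anyway.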

tjwmmd2 ( 2020-09-25 19:55:19 -0600 )

Stats

Asked: 2019-09-30 14:19:49 -0600

Seen: 13,433 times

Last updated: Sep 25 '20