How to marshal OpenCV objects

asked 2020-03-18 06:25:44 -0500

stefano


I am working on a project where we need to marshal Python objects at runtime, without knowing in advance the types of the objects that need to be saved to disk (and then loaded back in a different Python context). We use callbacks keyed on the object type whenever we need specialised functions (e.g. the HDF5 format for TensorFlow models), and otherwise fall back to dill.

I am having a hard time finding a general way to save OpenCV objects. In my specific case, I am trying to save a CascadeClassifier object to disk, but I notice that it implements only load and read, not write or save. What can I do? I noticed that some other cv objects do implement save or write. Why doesn't CascadeClassifier implement them as well?

And more generally, what is the best approach to save OpenCV objects to disk and then load them back? Thanks!!
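The type-based callback scheme described in the question can be sketched roughly as follows. This is only an illustration: the registry, the `register`/`marshal`/`unmarshal` names and the `FileBackedModel` stand-in are hypothetical, and the standard-library `pickle` is used in place of dill (which exposes the same `dumps`/`loads` calls) so that the sketch has no third-party dependency.

```python
import pickle
from typing import Any, Callable, Dict, Tuple, Type

Saver = Callable[[Any], bytes]
Loader = Callable[[bytes], Any]

# Hypothetical registry: tag -> (handled type, saver, loader).
_HANDLERS: Dict[str, Tuple[Type, Saver, Loader]] = {}


def register(tag: str, tp: Type, saver: Saver, loader: Loader) -> None:
    """Register a specialised save/load pair for one object type."""
    _HANDLERS[tag] = (tp, saver, loader)


def marshal(obj: Any) -> bytes:
    """Use a registered saver if one matches, else fall back to pickle."""
    for tag, (tp, saver, _) in _HANDLERS.items():
        if isinstance(obj, tp):
            return pickle.dumps((tag, saver(obj)))
    return pickle.dumps(("generic", pickle.dumps(obj)))


def unmarshal(blob: bytes) -> Any:
    """Read the tag from the envelope and dispatch to the matching loader."""
    tag, payload = pickle.loads(blob)
    if tag == "generic":
        return pickle.loads(payload)
    _, _, loader = _HANDLERS[tag]
    return loader(payload)


class FileBackedModel:
    """Stand-in for a wrapper object that cannot be pickled directly."""

    def __init__(self, path: str) -> None:
        self.path = path

    def __reduce__(self):
        raise TypeError("cannot pickle a FileBackedModel")


# The specialised handler stores only the path the object was built from.
register(
    "file_backed",
    FileBackedModel,
    lambda m: m.path.encode("utf-8"),
    lambda b: FileBackedModel(b.decode("utf-8")),
)
```

With a scheme like this, each unpicklable type needs its own handler. The generic fallback only covers objects the pickler can already handle.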



why would you save something immutable (like a cascade) ?

(and no, this is c++ code, basically. you don't "marshal" or pickle objects. you create them, and (re)load the data)

berak (2020-03-18 06:37:28 -0500)

@berak I used the CascadeClassifier example because that is what we bumped into. We need to be able to potentially marshal any Python object, from one Python context to another. So the general question is: what is the best we can do with OpenCV objects? Which interfaces are available to save things?

stefano (2020-03-18 06:42:13 -0500)

"We need to be able to potentially marshal any python object,"

you can't. it's not a "pure python" lib, but c++ code with python wrappers

(and you can't marshal the underlying c++ objects)

berak (2020-03-18 06:45:42 -0500)

@berak OK, I understand your argument and I agree completely. There is no way of saving a general C++ object without a specific save/load interface. Since this is critical to our efforts in building a data science platform, I would like to ask a couple of further questions. I would really appreciate it if you could help me get a better understanding of this.

stefano (2020-03-19 07:30:44 -0500)

I see that many OpenCV objects support the save/load interface, to save the state of an object to xml/yaml and then load it back (see for example FaceRecognizer). Which objects support this interface? Why isn't it all of them? Why isn't this the case for CascadeClassifier?

stefano (2020-03-19 07:30:51 -0500)

Also, other libraries like TensorFlow provide a generic save/load interface for any tensor, model, or other object, even though these are actually C++ objects down the line. I understand that OpenCV does not support this generic paradigm. Is there a specific reason?

stefano (2020-03-19 07:31:00 -0500)

Lastly, the CascadeClassifier example I am working on was taken from here. Specifically, they are creating a CascadeClassifier from the xml file haarcascades/haarcascade_frontalface_alt.xml (cv2.CascadeClassifier("haarcascade_frontalface_alt.xml")). I also see that CascadeClassifier implements the load method to load these xml files, but how are they produced in the first place? I would expect to be able to write them as well at this point.

stefano (2020-03-19 07:31:06 -0500)

the cascades are trained from a suite of external programs (the CascadeClassifier is a read-only application)

similar problem with opencv's dnn -- you're expected to use some external framework, like tf, pytorch to train & save it, then load it into dnn::Net for inference

imho, the only classes that work like you expect it are the cv::ml models, which produce a serialized class state (e.g. an instance Ptr<SVM> from the load() method)

berak (2020-03-19 08:18:49 -0500)
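Since a cascade is read-only and is always built from a model file, one possible workaround is to marshal the file rather than the object, and rebuild the object on the other side. The sketch below is a minimal illustration under assumptions: `pack_model_file`, `unpack_model_file` and the `loaders` mapping are hypothetical names, and it assumes the loader reads the file eagerly at construction time (as `cv2.CascadeClassifier(path)` is expected to do), so the temporary copy can be deleted afterwards.

```python
import base64
import json
import os
import tempfile
from typing import Any, Callable, Dict


def pack_model_file(kind: str, path: str) -> str:
    """Serialise the model file (not the in-memory object) as a JSON string."""
    with open(path, "rb") as fh:
        data = fh.read()
    return json.dumps({
        "kind": kind,  # key used to pick a loader when unpacking
        "filename": os.path.basename(path),
        "data": base64.b64encode(data).decode("ascii"),
    })


def unpack_model_file(blob: str, loaders: Dict[str, Callable[[str], Any]]) -> Any:
    """Write the file to a temporary directory and rebuild the object from it."""
    record = json.loads(blob)
    loader = loaders[record["kind"]]
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, record["filename"])
        with open(path, "wb") as fh:
            fh.write(base64.b64decode(record["data"]))
        # Assumption: the loader finishes reading the file before returning.
        return loader(path)


# Hypothetical usage, assuming opencv-python is installed:
#   import cv2
#   blob = pack_model_file("cascade", "haarcascade_frontalface_alt.xml")
#   clf = unpack_model_file(blob, {"cascade": cv2.CascadeClassifier})
```

The JSON string is plain data, so it can pass through dill or any other generic serializer, and the receiving Python context only needs a loader registered for each `kind`.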