Most efficient format to save and load file

asked 2015-10-19 20:31:54 -0600

cuongvt

Hi all,

I'm wondering whether YML/XML is the most efficient format for saving and loading files in OpenCV. I will have a very large SIFT feature matrix (say, 10 million features, i.e. 10 million rows and 128 columns), and I want to load it (read it from disk) as fast as possible. Slow saving is OK, but reading needs to be fast.

Thanks


Comments

The most efficient/fastest way to read and write data in C++ is as binary files. However, I do not think this option is built into OpenCV. Therefore, if you still want to do it with OpenCV, you will have to either go with the YML/XML approach or use the format() function as described here and save the data in another format like .csv, then read it back with the method described in the above link or with another parser, which might be faster. Actually, you could run some tests, measure the performance, and tell us which is faster.

theodore ( 2015-10-20 08:11:23 -0600 )

hmmm, for such large things, i'd roll my own binary serialization using fopen() or such.

in fact, all you need to save to reconstruct a cv::Mat is rows,cols,type and data

using opencv's FileStorage has some serious drawbacks, e.g. for xml, a dom parser is used, meaning it has to read the whole thing into memory and build the real model from that. at that point you need about 2x the memory of your model, which gets you close to e.g. 32bit limits fast.

berak ( 2015-10-20 10:30:53 -0600 )
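
A minimal sketch of the "rows, cols, type and data" idea from the comment above, using only the C++ standard library. FloatMatrix, saveBinary and loadBinary are hypothetical names, not OpenCV API. The struct stands in for a continuous CV_32F cv::Mat, whose fields would map to mat.rows, mat.cols, mat.type() and mat.data. The file is a raw memory dump, so it is assumed to be read on a machine with the same endianness and float layout as the one that wrote it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a continuous single-channel float cv::Mat.
struct FloatMatrix {
    std::int32_t rows = 0;
    std::int32_t cols = 0;
    std::int32_t type = 5;    // 5 is the numeric value of CV_32F in OpenCV
    std::vector<float> data;  // row-major, rows * cols elements
};

// Write a 12-byte header (rows, cols, type) followed by the raw float block.
bool saveBinary(const std::string& path, const FloatMatrix& m) {
    assert(m.rows >= 0 && m.cols >= 0);
    assert(m.data.size() ==
           static_cast<std::size_t>(m.rows) * static_cast<std::size_t>(m.cols));
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(&m.rows), sizeof m.rows);
    out.write(reinterpret_cast<const char*>(&m.cols), sizeof m.cols);
    out.write(reinterpret_cast<const char*>(&m.type), sizeof m.type);
    out.write(reinterpret_cast<const char*>(m.data.data()),
              static_cast<std::streamsize>(m.data.size() * sizeof(float)));
    return static_cast<bool>(out);
}

// Read the header, allocate once, then read the whole data block in one call.
bool loadBinary(const std::string& path, FloatMatrix& m) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    in.read(reinterpret_cast<char*>(&m.rows), sizeof m.rows);
    in.read(reinterpret_cast<char*>(&m.cols), sizeof m.cols);
    in.read(reinterpret_cast<char*>(&m.type), sizeof m.type);
    if (!in || m.rows < 0 || m.cols < 0) return false;
    m.data.resize(static_cast<std::size_t>(m.rows) *
                  static_cast<std::size_t>(m.cols));
    in.read(reinterpret_cast<char*>(m.data.data()),
            static_cast<std::streamsize>(m.data.size() * sizeof(float)));
    return static_cast<bool>(in);
}
```

Loading is a single allocation plus one bulk read, with no parsing step, which is why this kind of format tends to read faster than a text format. Whether it is fast enough for a given disk and matrix size is something to measure.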

i did not test it, but look at this SO answer, maybe it helps.

sturkmen ( 2015-10-20 11:04:09 -0600 )

we definitely need to convince Miki, that SO is a rotten place, and that he should spread his wisdom here ;)

berak ( 2015-10-20 11:08:12 -0600 )

@berak count me in if you decide on a kidnapping plan :-p... @sturkmen nice hit ;-)

theodore ( 2015-10-20 11:26:27 -0600 )

What about imwrite() with a lossless format?

sturkmen ( 2015-10-20 13:01:58 -0600 )

10 million SIFT (float) features.

berak ( 2015-10-20 13:53:31 -0600 )
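
For scale, the size of the matrix from the question can be worked out with a few lines of arithmetic. This is only a back-of-the-envelope sketch. matrixBytes and exceeds32BitAddressSpace are hypothetical helper names, and 4 bytes per element assumes 32-bit floats.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed for a dense rows x cols matrix with elemSize bytes per element.
std::uint64_t matrixBytes(std::uint64_t rows, std::uint64_t cols,
                          std::uint64_t elemSize) {
    assert(elemSize > 0);
    return rows * cols * elemSize;
}

// True if the data alone is larger than a 32-bit address space (2^32 bytes).
bool exceeds32BitAddressSpace(std::uint64_t bytes) {
    return bytes > (std::uint64_t{1} << 32);
}
```

With 10,000,000 rows, 128 columns and 4 bytes per float, matrixBytes gives 5,120,000,000 bytes (about 4.77 GiB), which is already beyond a 32-bit address space before any parser overhead. At one byte per element the same matrix would be 1,280,000,000 bytes.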

I'm confused. what did you say about Miki's code? i tried to do with imwrite() what Miki did.

sturkmen ( 2015-10-20 14:05:26 -0600 )

i think, Miki had the appropriate idea there.

think of it: pgm, ppm or such just write a (very short) text header with rows, cols, type, followed by a large binary data block. that is pretty close to a plain fopen() approach (there won't be much difference in timing).

unfortunately, your imwrite() example below restricts the type to CV_8U, while CV_32F would be required to save/load SIFT features.

so, imho, better bypass any opencv builtin functionality that cannot handle this (atm).

berak ( 2015-10-20 14:17:31 -0600 )
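
The "short text header, then a binary block" layout described above can be sketched for float data as follows. This is not PGM/PPM itself and not an OpenCV format. The "F32" tag and both function names are made up for illustration, and the data block is assumed to be read on a machine with the same float layout as the writer.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Write a one-line text header ("F32 rows cols"), then the raw float block.
bool saveWithTextHeader(const std::string& path, int rows, int cols,
                        const std::vector<float>& data) {
    assert(rows >= 0 && cols >= 0);
    assert(data.size() ==
           static_cast<std::size_t>(rows) * static_cast<std::size_t>(cols));
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    out << "F32 " << rows << ' ' << cols << '\n';
    out.write(reinterpret_cast<const char*>(data.data()),
              static_cast<std::streamsize>(data.size() * sizeof(float)));
    return static_cast<bool>(out);
}

// Parse the text header, skip its terminating newline, then bulk-read the data.
bool loadWithTextHeader(const std::string& path, int& rows, int& cols,
                        std::vector<float>& data) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    std::string tag;
    in >> tag >> rows >> cols;
    if (!in || tag != "F32" || rows < 0 || cols < 0) return false;
    in.get();  // consume exactly one character: the newline ending the header
    data.resize(static_cast<std::size_t>(rows) * static_cast<std::size_t>(cols));
    in.read(reinterpret_cast<char*>(data.data()),
            static_cast<std::streamsize>(data.size() * sizeof(float)));
    return static_cast<bool>(in);
}
```

The text header costs a few bytes of parsing and keeps the file easy to inspect by hand, while the bulk of the time is still spent in the single read of the data block.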

OK. sorry, it seems i didn't understand the question and Miki's code well.

sturkmen ( 2015-10-20 14:25:21 -0600 )