Reading large number of Images

asked 2018-04-25 11:15:28 -0600

Akash

updated 2018-04-27 14:40:13 -0600

I have a dataset containing 101,000 images that I want to use for an image classification task, but I run out of memory whenever I try to read them all using imread. Is there a way to do this? I am using Python.

Comments

Use a neural network, and train it consecutively on small batches of images?

berak ( 2018-04-25 11:17:52 -0600 )
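
The batching idea in this comment could look roughly like the sketch below: keep only the list of file paths in memory and call imread on one small batch at a time. The names iter_batches and load_batch are hypothetical helpers, not part of OpenCV, and the batch size is left to the caller.

```python
def iter_batches(paths, batch_size):
    """Yield successive lists of at most batch_size paths."""
    for start in range(0, len(paths), batch_size):
        yield paths[start:start + batch_size]


def load_batch(paths):
    """Read one batch of images; only this batch is held in memory."""
    import cv2  # imported lazily so iter_batches works without OpenCV

    images = []
    for path in paths:
        img = cv2.imread(path)
        if img is None:  # imread returns None for files it cannot read
            continue
        images.append(img)
    return images
```

A training loop would then iterate over iter_batches(all_paths, n), call load_batch on each batch, train on it, and let the images go out of scope before the next batch is read.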

But my previously trained model will be overwritten every time I train on a different batch.

Akash ( 2018-04-25 12:03:05 -0600 )

There's an UPDATE flag that can be passed to the train method on the 2nd (and any further) run. I will make a more detailed post tomorrow.

berak ( 2018-04-25 12:13:15 -0600 )

Here is some code where noisy images are fed into the neural network:

https://github.com/sjhalayka/opencv_i...

sjhalayka ( 2018-04-25 13:02:06 -0600 )

@sjhalayka, why not make a complete answer here, mentioning (and explaining the use of) ANN_MLP::TrainFlags::UPDATE_WEIGHTS and so on?

berak ( 2018-04-25 13:16:44 -0600 )
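
For Python users, a minimal sketch of how the UPDATE_WEIGHTS flag might be used is given here. It assumes ann is an already configured cv2.ml.ANN_MLP (layer sizes, activation function and train method set) and that batches yields pairs of float32 NumPy arrays, one row per sample. The helper names flags_for_batch and train_in_batches are hypothetical. The first batch is trained without the flag so the weights get initialized; every later batch passes the flag so the existing weights are updated rather than reset.

```python
def flags_for_batch(batch_index, update_flag):
    """Train from scratch on the first batch, then only update the weights."""
    return 0 if batch_index == 0 else update_flag


def train_in_batches(ann, batches):
    """Feed (samples, responses) pairs to an ANN_MLP one batch at a time."""
    import cv2  # imported lazily so flags_for_batch works without OpenCV

    for i, (samples, responses) in enumerate(batches):
        # Wrap the arrays in a TrainData object so that flags can be passed.
        data = cv2.ml.TrainData_create(samples, cv2.ml.ROW_SAMPLE, responses)
        ann.train(data, flags_for_batch(i, cv2.ml.ANN_MLP_UPDATE_WEIGHTS))
    return ann
```

Whether the input and output scaling flags (cv2.ml.ANN_MLP_NO_INPUT_SCALE, cv2.ml.ANN_MLP_NO_OUTPUT_SCALE) should also be set depends on how the data is normalized, which this sketch does not cover.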

@Akash... how about you generate a smaller set of images and zip them up and upload them to GitHub?

Also, while you're at it, make a text file with each line containing an image name and its classification. How many classifications are there in total?

Once I have that I can write you a full example.

sjhalayka ( 2018-04-25 13:27:28 -0600 )

Thank you for your assistance, @sjhalayka. I have 101 classes, each with 1000 images. Here is the link: https://github.com/akashjoshi123/Imag... It is a very small subset of my dataset; the meta folder has all 101 class labels of the original dataset.

Akash ( 2018-04-25 13:50:53 -0600 )

Alright. In the meantime, check out the source code that I linked to.

Also, have you tried your own C++ code?

sjhalayka ( 2018-04-25 14:04:43 -0600 )

Ahh, too bad... you're using Python. I have no idea how to do a neural network using Python. Sorry about that!

sjhalayka ( 2018-04-25 14:12:17 -0600 )

Thank you for your valuable time, @sjhalayka. I'm extremely sorry, I missed the python tag when posting the question.

Akash ( 2018-04-26 00:40:09 -0600 )

1 answer

answered 2018-04-28 20:48:49 -0600

sjhalayka

updated 2018-05-02 19:55:42 -0600

I have written some code that takes a file list and trains the neural network. Be careful when changing any of the network initialization parameters:

https://github.com/sjhalayka/python_o...

See get_files.py for the code to generate the two file lists (training data and testing data).

See ann_image.py for the main code that reads in the file lists and trains / tests the network.
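
As a rough illustration of the file-list step (this is not the linked get_files.py, whose contents are not reproduced here), the sketch below assumes a hypothetical layout of one subdirectory per class under a root folder and splits each class into training and testing entries. The function names, the 20% test fraction and the tab-separated output format are all assumptions.

```python
import os
import random


def build_file_lists(root, test_fraction=0.2, seed=0):
    """Return (train, test) lists of (path, class_id) pairs.

    Assumes root/<class_name>/<image files>, one subdirectory per class.
    """
    rng = random.Random(seed)
    train, test = [], []
    class_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    for class_id, name in enumerate(class_names):
        class_dir = os.path.join(root, name)
        files = sorted(
            f for f in os.listdir(class_dir)
            if f.lower().endswith((".jpg", ".jpeg", ".png"))
        )
        rng.shuffle(files)
        n_test = int(len(files) * test_fraction)
        for i, f in enumerate(files):
            target = test if i < n_test else train
            target.append((os.path.join(class_dir, f), class_id))
    return train, test


def write_list(pairs, out_path):
    """Write one 'path<TAB>class_id' line per image."""
    with open(out_path, "w") as fh:
        for path, class_id in pairs:
            fh.write(f"{path}\t{class_id}\n")
```

Only the paths and labels are stored, so the lists stay small even for 101,000 images; the images themselves are read later, batch by batch.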

Comments

Yeah sorry, there can be no spaces in the directory / file name. It's primitive code.

sjhalayka ( 2018-04-29 19:18:31 -0600 )

Since you have such a large number of input image files, you may not need to iterate 1000 times. I'm jealous of your data set.

sjhalayka ( 2018-04-29 19:59:22 -0600 )

... and @berak is right, 512x512x3 is too many input neurons. I haven’t the physical memory to store that many neurons (and connection weights, etc).

I edited the main code to resize the images from 512x512 to 64x64. It's super easy, just call the cv2.resize function.

sjhalayka ( 2018-04-30 15:58:31 -0600 )
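
To make the size argument concrete, the sketch below counts input neurons and first-layer connection weights for both resolutions, and shows one way the resize step might be written. The hidden layer size of 64 is purely hypothetical (the thread does not state it), and load_as_row is an illustrative helper, not the code from the linked repository.

```python
def input_neurons(width, height, channels=3):
    """One input neuron per pixel channel."""
    return width * height * channels


def first_layer_weights(n_inputs, n_hidden):
    """Connection weights between the input layer and one hidden layer."""
    return n_inputs * n_hidden


def load_as_row(path, size=64):
    """Read an image, shrink it to size x size, and flatten it to one row."""
    import cv2  # imported lazily so the arithmetic above runs without OpenCV

    img = cv2.imread(path)
    if img is None:
        raise IOError("could not read " + path)
    small = cv2.resize(img, (size, size), interpolation=cv2.INTER_AREA)
    return small.astype("float32").ravel() / 255.0  # scale pixels to [0, 1]


HIDDEN = 64  # hypothetical hidden layer size, for illustration only
for side in (512, 64):
    n = input_neurons(side, side)  # 786432 for 512x512, 12288 for 64x64
    print(side, n, first_layer_weights(n, HIDDEN))
```

Resizing from 512x512 to 64x64 cuts the number of inputs, and therefore the first-layer weights, by a factor of 64.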

Yeah sorry for all the edits.

sjhalayka ( 2018-05-02 19:36:37 -0600 )

The network uses bit encoding to represent the answer. Sometimes the network predicts a number that is larger than the maximum class ID being encoded. You could avoid this by using one-hot encoding.

sjhalayka ( 2018-05-06 11:51:58 -0600 )
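
The difference between the two encodings can be sketched as follows for the 101 classes mentioned in the question. The sketch assumes target values of 0 and 1 and a decision threshold of 0.5; the actual network may use a different output range, and the function names are illustrative.

```python
N_CLASSES = 101
N_BITS = (N_CLASSES - 1).bit_length()  # 7 bits can represent 0..127


def encode_bits(class_id, n_bits=N_BITS):
    """Binary code of class_id, least significant bit first."""
    return [(class_id >> i) & 1 for i in range(n_bits)]


def decode_bits(outputs, threshold=0.5):
    """Threshold each output neuron and rebuild the integer."""
    return sum(1 << i for i, v in enumerate(outputs) if v > threshold)


def encode_one_hot(class_id, n_classes=N_CLASSES):
    """One output neuron per class; only the target neuron is 1."""
    return [1.0 if i == class_id else 0.0 for i in range(n_classes)]


def decode_one_hot(outputs):
    """Index of the strongest output; always a valid class ID."""
    return max(range(len(outputs)), key=lambda i: outputs[i])


# With 7 output bits, codes 101..127 do not correspond to any class.
noisy_bits = [0.9] * N_BITS
print(decode_bits(noisy_bits))  # 127, outside the valid range 0..100

# With one-hot outputs, the decoded ID is an index into the output vector.
print(decode_one_hot(encode_one_hot(42)))  # 42
```

The trade-off is output size: one-hot encoding needs 101 output neurons here instead of 7, but any output vector decodes to a class ID between 0 and 100.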
