Passing parameters to cv2.kmeans in Python
I have a dataset of images and I want to cluster it. I have read the OpenCV documentation for kmeans, but I do not fully understand it. Below is my code. I do not know how to pass the images to kmeans() or how to write each cluster into its own folder.
import cv2, sys, numpy, os
import matplotlib.pyplot as plt
from matplotlib import style
style.use("ggplot")
from sklearn.cluster import KMeans
fn_dir2 = 'unknown'
path2='/home/irum/Desktop/Face-Recognition/thakarrecog/unknown/UNKNOWNS'
# Create a list of images and a list of corresponding names
(images, lables, names, id) = ([], [], {}, 0)
#reading images from dataset
for (subdirs, dirs, files) in os.walk(fn_dir2):
    for subdir in dirs:
        names[id] = subdir
        subjectpath = os.path.join(fn_dir2, subdir)
        for filename in os.listdir(subjectpath):
            path = subjectpath + '/' + filename
            lable = id
            images.append(cv2.imread(path, 0))
            lables.append(int(lable))
        id += 1
#converting images and lables to numpy arrays
(images, lables) = [numpy.array(lis) for lis in [images, lables]]
print "length images", len(images)
print type(images)
print "length lables", len(lables)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
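# flatten every image into one float32 row: shape (N, height*width)
images = numpy.asarray(images, numpy.float32)
N = len(images)
images = images.reshape(N, -1)
k = 2  # number of clusters
ret,label,center=cv2.kmeans(images,k,None,criteria,10,cv2.KMEANS_RANDOM_CENTERS)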
Now I am getting this error:
Traceback (most recent call last):
  File "Kmeans2.py", line 39, in <module>
    ret,label,center=cv2.kmeans(images,2,None,criteria,10,cv2.KMEANS_RANDOM_CENTERS)
TypeError: an integer is required
What is the purpose of the clustering (in the context of face recognition)?
I will use the clustered data for face recognition. Basically, I am clustering for face recognition.
Again, why? You're trying to solve some problem there. Which is?
I am doing detection and recognition. Any face that is detected but not recognized is saved to a dataset. After a day, my algorithm clusters that dataset so the faces can be added to a database of known people, and each cluster is then given a specific name. When I tried clustering with OpenCV's kmeans, I got various errors, possibly because I am passing the wrong samples. My question is: how can I pass the images NumPy array to the cv2.kmeans function?
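On the question of what to pass: cv2.kmeans takes a float32 matrix with one sample per row, so each face has to become one flat row of the same length. A minimal sketch of that step follows, assuming all faces are grayscale and already have the same size (if they do not, they would need resizing first). images_to_samples is a hypothetical helper name. Note that cv2.imread returns None for files it cannot read, so the sketch rejects those as well.

```python
import numpy as np


def images_to_samples(images):
    """Stack equally sized images into an (N, pixels) float32 matrix.

    Hypothetical helper: `images` is a list of 2-D NumPy arrays, for example
    the results of cv2.imread(path, 0).
    """
    # cv2.imread yields None for unreadable files; fail early on those.
    if any(img is None for img in images):
        raise ValueError("at least one image failed to load")
    # Rows can only be stacked if every image has the same shape.
    shapes = set(img.shape for img in images)
    if len(shapes) != 1:
        raise ValueError("all images must share one shape; resize them first")
    stacked = np.stack(images).astype(np.float32)
    # One row per image, one column per pixel.
    return stacked.reshape(len(images), -1)
```

The explicit shape check matters because numpy.array on images of different sizes does not produce a regular 3-D array, and the later reshape(N, -1) then does not give one pixel row per image.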
(images, lables) = [numpy.array(lis) for lis in [images, lables]]
These are two NumPy arrays: one holds the faces and the other holds the labels. I need to cluster the image data so that I can tell one face from another and get the same faces in one cluster. Yes, determining k is another problem, so for now I thought I could use k=10, because there are not that many different people in the dataset. Also, if there is any other technique that would be more useful, I will be very happy to try it.
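For the other part of the question, sending each cluster to its own folder, one possible approach is sketched here. It assumes a list of file paths was collected in the same order as the images (the posted loop does not keep one, so that list is an assumption), and that the cluster labels are in that same order. copy_into_cluster_folders and the cluster_<n> folder naming are hypothetical choices.

```python
import os
import shutil


def copy_into_cluster_folders(paths, labels, out_dir):
    """Copy each file into out_dir/cluster_<label> (hypothetical helper).

    `paths` and `labels` must be aligned: labels[i] is the cluster of paths[i].
    Files with the same name in one cluster overwrite each other.
    """
    for path, label in zip(paths, labels):
        cluster_dir = os.path.join(out_dir, "cluster_%d" % int(label))
        # Create the cluster folder the first time a label is seen.
        if not os.path.isdir(cluster_dir):
            os.makedirs(cluster_dir)
        shutil.copy(path, cluster_dir)
```

Copying rather than moving keeps the original dataset intact, which makes it easier to rerun the clustering with a different k.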