How to use SVM (Support Vector Machine) to detect hand gestures in real time using Python and OpenCV?

asked 2017-04-24 00:33:20 -0600

I want to detect hand gestures in real time using a dataset of about 100 images (10 images for each of 10 gestures).

What I have done so far:

  1. I have created a dataset of 100 images.
  2. I have code to detect the hand.
  3. I'm using OpenCV 3.2.0 and Python 2.7 on Linux.

What I want to do next:

  1. I want to load my folder of images into an SVM.
  2. I want to give SIFT descriptors as input to the SVM.
  3. I want to predict hand gestures in real time.

I have written some code to understand SVM and tested it successfully:

import cv2
import numpy as np
from sklearn.svm import SVC

img = cv2.imread("1_1.jpg")
img = cv2.resize(img, (600,400))
sift = cv2.xfeatures2d.SIFT_create()
(kps1_1, descs1_1) = sift.detectAndCompute(img, None)
a1=np.mean(descs1_1)

img = cv2.imread("1_2.jpg")
img = cv2.resize(img, (600,400))
sift = cv2.xfeatures2d.SIFT_create()
(kps1_2, descs1_2) = sift.detectAndCompute(img, None)
a2=np.mean(descs1_2)

img = cv2.imread("1_3.jpg")
img = cv2.resize(img, (600,400))
sift = cv2.xfeatures2d.SIFT_create()
(kps1_3, descs1_3) = sift.detectAndCompute(img, None)
a3=np.mean(descs1_3)

img = cv2.imread("1_4.jpg")
img = cv2.resize(img, (600,400))
sift = cv2.xfeatures2d.SIFT_create()
(kps1_4, descs1_4) = sift.detectAndCompute(img, None)
a4=np.mean(descs1_4)

img = cv2.imread("1_5.jpg")
img = cv2.resize(img, (600,400))
sift = cv2.xfeatures2d.SIFT_create()
(kps1_5, descs1_5) = sift.detectAndCompute(img, None)
a5=np.mean(descs1_5)

img = cv2.imread("3_1.jpg")
img = cv2.resize(img, (600,400))
sift = cv2.xfeatures2d.SIFT_create()
(kps3_1, descs3_1) = sift.detectAndCompute(img, None)
b1=np.mean(descs3_1)

img = cv2.imread("3_2.jpg")
img = cv2.resize(img, (600,400))
sift = cv2.xfeatures2d.SIFT_create()
(kps3_2, descs3_2) = sift.detectAndCompute(img, None)
b2=np.mean(descs3_2)

img = cv2.imread("3_3.jpg")
img = cv2.resize(img, (600,400))
sift = cv2.xfeatures2d.SIFT_create()
(kps3_3, descs3_3) = sift.detectAndCompute(img, None)
b3=np.mean(descs3_3)

img = cv2.imread("3_4.jpg")
img = cv2.resize(img, (600,400))
sift = cv2.xfeatures2d.SIFT_create()
(kps3_4, descs3_4) = sift.detectAndCompute(img, None)
b4=np.mean(descs3_4)

img = cv2.imread("3_5.jpg")
img = cv2.resize(img, (600,400))
sift = cv2.xfeatures2d.SIFT_create()
(kps3_5, descs3_5) = sift.detectAndCompute(img, None)
b5=np.mean(descs3_5)


X = np.array([[a1], [a2], [a3], [a4], [a5], [b1], [b2], [b3] , [b4], [b5]])
y = np.array([1, 1,1, 1,1, 2, 2, 2, 2, 2])

img = cv2.imread("3_16.jpg")
img = cv2.resize(img, (600,400))
sift = cv2.xfeatures2d.SIFT_create()
(kps3_16, descs3_16) = sift.detectAndCompute(img, None)
v=np.mean(descs3_16)


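clf = SVC()
clf.fit(X, y)
# fit() returns the estimator; an interactive session echoes it as:
# SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
#     decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
#     max_iter=-1, probability=False, random_state=None, shrinking=True,
#     tol=0.001, verbose=False)
# predict() expects a 2-D array of shape (n_samples, n_features)
print(clf.predict([[v]]))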

It gives the following output:

[2]

These are my problems:

  1. I do not want to manually load a hundred images and label them.
  2. How can I compare my real-time frame with my dataset?
  3. Is this the correct way to give features as input into SVM ...

Comments

  • We probably cannot help much with sklearn (somewhat off-topic).
  • You're using the mean of the SIFT descriptors? That won't work.
berak ( 2017-04-24 01:32:35 -0600 )
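
A minimal sketch of why the mean is a problem: called without an `axis` argument, `np.mean` collapses the whole N x 128 SIFT descriptor matrix into a single number, so every image ends up described by one scalar. The arrays below are random stand-ins for SIFT descriptors (an assumption, so that the snippet runs without any images); real descriptors would come from `sift.detectAndCompute`.

```python
import numpy as np

rng = np.random.RandomState(0)

# Stand-ins for SIFT output: one row per keypoint, 128 values per row.
# Different images generally yield different numbers of keypoints.
descs_a = rng.randint(0, 256, size=(350, 128)).astype(np.float32)
descs_b = rng.randint(0, 256, size=(120, 128)).astype(np.float32)

# What the code in the question does: one scalar per image.
scalar_a = np.mean(descs_a)
scalar_b = np.mean(descs_b)

# Averaging over keypoints only (axis=0) keeps a fixed-length 128-D vector.
# This is still a crude summary and is shown only to contrast the shapes.
vector_a = np.mean(descs_a, axis=0)
vector_b = np.mean(descs_b, axis=0)

print("scalar ndim: %d, vector shape: %s" % (np.ndim(scalar_a), vector_a.shape))
```

Either way the per-keypoint detail is averaged away, which is consistent with the warning above; the sketch does not suggest that the 128-D mean is a good gesture feature.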

Then what should I use?

sauravdhakad ( 2017-04-24 01:36:47 -0600 )

Shape recognition, maybe.

berak ( 2017-04-24 02:19:37 -0600 )

Can you tell me how to load multiple images with labels into the SVM?

sauravdhakad ( 2017-04-24 05:27:35 -0600 )
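
One way to avoid labelling by hand, sketched under the assumption that every file follows the `<label>_<index>.jpg` naming seen in the question (`1_1.jpg`, `3_16.jpg`): read the label from the filename and loop over the folder. The names `label_from_filename`, `build_dataset` and `extract_features` are hypothetical; `extract_features` stands for whatever maps an image path to a fixed-length feature vector.

```python
import glob
import os

import numpy as np

def label_from_filename(path):
    """'gestures/3_16.jpg' -> 3 (the part before the first underscore)."""
    name = os.path.basename(path)
    return int(name.split("_")[0])

def build_dataset(folder, extract_features, pattern="*.jpg"):
    """Return (X, y) for every file in `folder` that matches `pattern`.

    `extract_features` is a caller-supplied callable: path -> 1-D feature
    vector of fixed length, or None if the image cannot be used. It would
    typically wrap cv2.imread plus the chosen descriptor.
    """
    X, y = [], []
    for path in sorted(glob.glob(os.path.join(folder, pattern))):
        features = extract_features(path)
        if features is None:  # unreadable image or no features found
            continue
        X.append(features)
        y.append(label_from_filename(path))
    return np.array(X, dtype=np.float32), np.array(y)
```

With such a helper, training would reduce to `X, y = build_dataset("gestures", extract)` followed by `SVC().fit(X, y)`, where the folder name `gestures` and the callable `extract` are placeholders.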

Oh, and by the way: you can simply forget about robust gesture classification based on 10 images per gesture. State-of-the-art systems are trained on many thousands to millions of images.

StevenPuttemans ( 2017-04-24 08:28:23 -0600 )

I just want to check whether it works or not; I'm not concerned about robustness.

sauravdhakad ( 2017-04-24 08:53:01 -0600 )

But it won't work :D You will get misclassifications all the time, which will give you the wrong impression of the algorithm.

StevenPuttemans ( 2017-04-24 09:09:12 -0600 )
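
A single test image cannot show how often misclassification happens. One common way to measure it is cross-validation, sketched below with scikit-learn's `cross_val_score`. The data is synthetic (random clusters standing in for 10 gestures with 10 samples each), so the scores say nothing about real gesture images; the feature dimension, noise level and linear kernel are arbitrary assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.RandomState(0)

# Synthetic stand-in: 10 classes x 10 samples, 16-D features per sample.
n_classes, per_class, dim = 10, 10, 16
centers = rng.randn(n_classes, dim)
X = np.vstack([c + 0.8 * rng.randn(per_class, dim) for c in centers])
y = np.repeat(np.arange(n_classes), per_class)

# With an integer cv and a classifier, scikit-learn uses stratified folds,
# so each of the 5 folds holds out 2 samples per class.
scores = cross_val_score(SVC(kernel="linear"), X, y, cv=5)
print("accuracy per fold: %s" % scores)
```

On a real dataset, `X` and `y` would be the extracted features and gesture labels; a wide spread between folds is a hint that there are too few images per class.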

Then how many images per gesture should I take?

sauravdhakad ( 2017-04-26 03:29:14 -0600 )

250-1000 to start with, I guess.

StevenPuttemans ( 2017-04-26 05:01:06 -0600 )

Thanks.

sauravdhakad ( 2017-04-27 02:42:22 -0600 )