"Generally please give me some suggestions: what is your impression of such a system?"
It depends on a lot of things:
How many different gestures do you want to detect?
How much time do you have?
How robust does the final application need to be?
Can you gather gesture data?
Are there speed requirements?
If it is a small number of different gestures (2-5), there is no need to resort to neural networks or other rather complex machine learning algorithms. Given that you can already segment the hand, it should be easy to process the segmented region so that you can differentiate between your small set of gestures. For example, if you need to differentiate between a fist and an open hand, you can do that by counting the number of extended fingers, measuring the size of the hand, etc.
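As a rough illustration of such a hand-crafted rule, here is a minimal sketch in plain Python. It assumes the segmented hand is available as a binary mask (a list of rows of 0/1 values), and it uses the fraction of the bounding box covered by hand pixels as the distinguishing feature, since a fist tends to fill its bounding box more than a hand with spread fingers. The function names, the toy masks and the 0.75 threshold are all hypothetical; a real threshold would have to be tuned on your own images.

```python
def hand_extent(mask):
    """Return the fraction of the hand's bounding box covered by hand pixels.

    `mask` is a list of rows, each a list of 0/1 values (1 = hand pixel).
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        raise ValueError("mask contains no hand pixels")
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    area = sum(1 for row in mask for value in row if value)
    return area / (width * height)


def classify_fist_or_open(mask, threshold=0.75):
    """Label a mask as 'fist' or 'open hand' from its extent.

    The threshold is an assumed value for illustration only.
    """
    return "fist" if hand_extent(mask) >= threshold else "open hand"


# Tiny made-up masks, only to show the idea.
FIST = [
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0],
]
OPEN_HAND = [
    [1, 0, 1, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0],
]

print(classify_fist_or_open(FIST))       # fist
print(classify_fist_or_open(OPEN_HAND))  # open hand
```

With real images you would probably compute a sturdier feature (for example counting fingers from the contour), but the structure stays the same: one or two measurements and a threshold.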
If your system needs to handle many different gestures and you can gather gesture data, then a machine learning approach is the better choice: record examples of each gesture and use them as training data for a classifier. You need a machine learning algorithm that can handle multiple classes (one per gesture). For a simple gesture problem I think kNN or a decision tree will suffice and is a good way to start, since both are simple algorithms that handle multiple classes natively. For more complex, faster and more accurate solutions, you can start looking at neural networks and eventually deep learning, keeping in mind that these solutions demand progressively more machine learning knowledge and more training data.
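To make the kNN suggestion concrete, here is a minimal sketch written in plain Python so that it runs without any library. It assumes each recorded gesture has already been reduced to a small feature vector; the two features used here (bounding-box extent and height/width ratio), the gesture labels and all the numbers are invented toy values, not measurements.

```python
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Predict a label by majority vote among the k nearest training samples."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Sort the training samples by distance to the query, nearest first.
    ranked = sorted(
        zip(train_features, train_labels),
        key=lambda pair: squared_distance(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training set: (extent, height/width ratio) per recorded example.
TRAIN_X = [
    (0.82, 1.0), (0.80, 1.1), (0.78, 0.9),  # fist
    (0.55, 1.4), (0.50, 1.5), (0.58, 1.3),  # open hand
    (0.35, 2.4), (0.30, 2.6), (0.38, 2.2),  # pointing
]
TRAIN_Y = ["fist"] * 3 + ["open hand"] * 3 + ["pointing"] * 3

print(knn_predict(TRAIN_X, TRAIN_Y, (0.79, 1.05)))  # fist
print(knn_predict(TRAIN_X, TRAIN_Y, (0.52, 1.45)))  # open hand
print(knn_predict(TRAIN_X, TRAIN_Y, (0.33, 2.50)))  # pointing
```

In practice you would scale the features to comparable ranges before measuring distances, and you would likely use an existing implementation from a machine learning library rather than writing your own. Adding a new gesture only means recording more labelled examples.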