The more I think about it, the more complex it seems to me. What you describe does indeed consist of the three steps you mention. To make the problem more tractable, I'd suggest that you start from something simpler:
For this simpler problem, you need to locate the cube in the image, or rather the nine stickers of each face. The steps are:
Having solved this easier problem, you can then work on assumption 1 and detect whether there is a cube at all. For that you can use a cascade classifier with e.g. LBP or HOG features, which should work very well, since the Rubik's Cube is an easy target (a keypoint-based method via feature detection/matching should work as well).
Assumptions 2 and 3 are imho the most difficult ones, since you need to either a) build a model of the Rubik's Cube while one camera moves over the whole cube, or b) estimate which face the camera is currently looking at and then run the detection process as before. In the latter case, finding the transitions between the faces (i.e. when the face changes) is the difficult part. Maybe some simple heuristics could help here, such as a) checking when the nine stickers of one face are clearly visible and b) checking that their configuration hasn't been captured yet.
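The two heuristics could look roughly like the sketch below. It assumes your sticker detection already gives you nine color labels per frame in row-major order, with `None` for a sticker it could not classify. The names `FaceCollector`, `offer` and `canonical` are made up for this example, and treating rotated views of a face as the same configuration is my own assumption, not something you have to do.

```python
def rotate_cw(face):
    """Rotate a row-major 3x3 face by 90 degrees clockwise."""
    return tuple(face[3 * (2 - c) + r] for r in range(3) for c in range(3))


def canonical(face):
    """Pick one representative among the four in-plane rotations of a face."""
    rotations = [tuple(face)]
    for _ in range(3):
        rotations.append(rotate_cw(rotations[-1]))
    return min(rotations)


class FaceCollector:
    """Collects faces until six distinct configurations have been seen."""

    def __init__(self):
        self.seen = set()

    def offer(self, stickers):
        """Return True if this frame contributed a new, fully visible face."""
        # Heuristic a): all nine stickers must have been detected.
        if len(stickers) != 9 or any(s is None for s in stickers):
            return False
        # Heuristic b): skip configurations that were captured before.
        key = canonical(stickers)
        if key in self.seen:
            return False
        self.seen.add(key)
        return True

    def complete(self):
        return len(self.seen) == 6


collector = FaceCollector()
face = ("R", "G", "B", "W", "Y", "O", "R", "R", "G")
print(collector.offer(face))             # new face
print(collector.offer(rotate_cw(face)))  # same face, rotated view
```

Feeding every frame through `offer` and stopping once `complete()` returns True avoids having to detect the face transitions explicitly, at the cost of trusting the color classification.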
Good luck with your nice project! Let us know when you are done or if you run into any issues.