Yesterday I was asked this question at an interview. The interviewer said that a friend of his had set up a video camera on the ceiling of his office, looking straight down at the floor. He tried to detect and count humans using HoG features, but it failed. Why did it fail? I am studying HoG features, but I can't work out what the answer should be. Please help.

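Probably because the cascade (or SVM) was trained on side-view data (the INRIA data set), where people appear upright. Seen from directly overhead, a person is mostly a head-and-shoulders blob, so the HoG descriptor looks nothing like what the detector learned.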
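To make the viewpoint problem concrete, here is a rough toy sketch in plain Python. It is not the real HoG pipeline (no cells, blocks, or block normalization), just a single magnitude-weighted histogram of unsigned gradient orientations. The two shapes, the window size, and all function names are made up for illustration: a tall bar stands in for a side-view silhouette and a soft disk for an overhead view.

```python
import math

ROWS, COLS = 128, 64  # hypothetical detection window (rows x cols)


def orientation_histogram(img, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (0-180 deg).

    Single-cell simplification of HoG, normalized so the bins sum to 1.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            # Central differences on interior pixels only.
            gx = (img[r][c + 1] - img[r][c - 1]) / 2.0
            gy = (img[r + 1][c] - img[r - 1][c]) / 2.0
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The extra modulo guards against an angle rounding to exactly 180.
            hist[int(angle // bin_width) % bins] += mag
    total = sum(hist)
    return [h / total for h in hist]


def upright_bar():
    """Tall rectangle: crude stand-in for a standing person seen from the side."""
    return [[1.0 if 14 <= r < 114 and 22 <= c < 42 else 0.0
             for c in range(COLS)] for r in range(ROWS)]


def overhead_blob(radius=20.0, softness=2.0):
    """Soft-edged disk: crude stand-in for head and shoulders seen from above."""
    cr, cc = ROWS / 2.0, COLS / 2.0
    return [[1.0 / (1.0 + math.exp((math.hypot(r - cr, c - cc) - radius) / softness))
             for c in range(COLS)] for r in range(ROWS)]


side_hist = orientation_histogram(upright_bar())
top_hist = orientation_histogram(overhead_blob())
print("side view:", [round(h, 2) for h in side_hist])
print("top view: ", [round(h, 2) for h in top_hist])
```

The bar puts most of its weight in the bin for horizontal gradients (its long vertical edges), while the disk spreads its weight almost evenly over all orientations. A classifier that learned the first kind of pattern has little reason to fire on the second, which is the same mismatch the overhead camera runs into.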

