Tracking humans with ceiling-mounted downwards-pointing webcams
I need to track people walking around inside a building. I have a downwards-pointing webcam mounted on the ceiling, 6 meters (20 ft) above the ground.
Performance is important, as the plan is to have it running in real time on several webcams. Currently I have a single camera mounted, and with background subtraction using ViBe on the GPU I am getting ~320 fps at 720p, which is great.
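For context, the per-frame pipeline looks roughly like the sketch below (Python/OpenCV 4). ViBe is not in stock OpenCV, so MOG2 stands in here for my actual GPU ViBe step, and the camera index is just a placeholder:

```python
import cv2

# MOG2 as a stand-in for the GPU ViBe background subtractor
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=False)

cap = cv2.VideoCapture(0)  # placeholder camera index
while True:
    ok, frame = cap.read()
    if not ok:
        break
    fg_mask = subtractor.apply(frame)  # binary foreground mask
    # clean up speckle noise before blob extraction
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    fg_mask = cv2.morphologyEx(fg_mask, cv2.MORPH_OPEN, kernel)
    cv2.imshow("foreground", fg_mask)
    if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```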
While ViBe performs well, I am struggling with consistency. If people walk too close to each other, they are merged into a single blob. I also need to ignore movement from non-human objects.
I really need some input and ideas from you on how to determine what is and is not a human in my ViBe output. What would be sensible approaches? I have tried to define humans as blobs covering a certain area in square pixels, but I feel there must be smarter ways to discriminate human from non-human motion.
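What I have tried so far is roughly the filter below. The area thresholds are made-up placeholders and would need calibrating for the 6 m camera height and lens:

```python
import cv2

def human_candidates(fg_mask, min_area=1500, max_area=20000):
    """Filter foreground blobs by area in square pixels.

    min_area / max_area are placeholder values, not calibrated numbers.
    """
    # OpenCV 4: findContours returns (contours, hierarchy)
    contours, _ = cv2.findContours(fg_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    candidates = []
    for c in contours:
        area = cv2.contourArea(c)
        if min_area <= area <= max_area:
            candidates.append(cv2.boundingRect(c))  # (x, y, w, h)
    return candidates
```

The obvious weakness is that two people walking close together exceed max_area and get dropped (or counted as one), which is exactly the merging problem above.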
If an alternative approach would make more sense, e.g. optical flow, a tracking-learning-detection algorithm (OpenTLD?), or a well-performing HOG detector, I would be very interested in hearing about it.
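For reference, this is the kind of stock HOG detector I mean. I suspect the bundled model, trained on upright side-view pedestrians, won't transfer well to a top-down view at 6 m, so this is just a rough sketch of the idea rather than something I expect to work as-is:

```python
import cv2

# OpenCV's built-in HOG + linear SVM person detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

def detect_people(frame):
    # returns bounding boxes (x, y, w, h) and their SVM confidence weights
    rects, weights = hog.detectMultiScale(frame, winStride=(8, 8), scale=1.05)
    return rects
```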
Any thoughts and input are appreciated :)