If you want to detect people, look at the people detector of HOGDescriptor (this is the GPU version, but it's almost the same). See the sample cpp/peopledetect.cpp for a full implementation. The sample draws a box around each detected person, and finding the center of that box is really basic math, so there is no need to use moments. You could use moments if you have subtracted the background: find the people (in the color image, not in the foreground), then apply the foreground as a mask on the pixels around your detection to get the center of mass of the person. I'm not sure it's useful, though.
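To make the "basic math" concrete, here is a minimal sketch in plain C++ with no OpenCV dependency. The `Box` struct is a hypothetical stand-in that mirrors the `x`, `y`, `width`, `height` fields of `cv::Rect`, and `boxCenter` / `maskedCentroid` are names I made up for illustration. The second function shows the optional mask-based center of mass, assuming you already have a binary foreground mask stored row by row.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for cv::Rect: top-left corner plus size, in pixels.
struct Box {
    int x, y, width, height;
};

struct Center {
    double x, y;
};

// Center of a detection box: plain arithmetic, no moments needed.
Center boxCenter(const Box& b) {
    return {b.x + b.width / 2.0, b.y + b.height / 2.0};
}

// Center of mass of the foreground pixels inside a box, i.e. (m10/m00, m01/m00).
// `mask` is a row-major binary image (non-zero = foreground) of cols x rows pixels.
// Falls back to the box center when the box contains no foreground pixel.
Center maskedCentroid(const std::vector<std::uint8_t>& mask, int cols, int rows,
                      const Box& b) {
    assert(cols > 0 && rows > 0);
    assert(mask.size() == static_cast<std::size_t>(cols) * static_cast<std::size_t>(rows));

    // Clip the box to the image so we never read outside the mask.
    const int x0 = std::max(b.x, 0), x1 = std::min(b.x + b.width, cols);
    const int y0 = std::max(b.y, 0), y1 = std::min(b.y + b.height, rows);

    double sumX = 0.0, sumY = 0.0;
    long count = 0;
    for (int y = y0; y < y1; ++y) {
        for (int x = x0; x < x1; ++x) {
            if (mask[static_cast<std::size_t>(y) * cols + x] != 0) {
                sumX += x;
                sumY += y;
                ++count;
            }
        }
    }
    if (count == 0) return boxCenter(b);
    return {sumX / count, sumY / count};
}
```

For each box returned by the detector you would call `boxCenter`, and only bother with `maskedCentroid` if the box center turns out to be too rough for your use.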

For the tracking, i.e. keeping the same label for each person across time, you could use the Kalman filter; see the sample cpp/kalman.cpp. A more complex approach could involve the blob detector, which works like the other feature detectors (see tutorial here). It could be a second stage after people detection.
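To give an idea of what the Kalman filter does for tracking, here is a rough, self-contained sketch of a constant-velocity filter for a single coordinate, written in plain C++. It is not the code from cpp/kalman.cpp and it does not use the library's own `cv::KalmanFilter` class; the class name `AxisKalman`, the simplified process noise and the initial covariance of 1000 are all assumptions picked for illustration.

```cpp
#include <cassert>

// Constant-velocity Kalman filter for ONE coordinate (run one for x, one for y).
// State: position p and velocity v; only the position is measured.
class AxisKalman {
public:
    // q: process noise added to each diagonal covariance term per step (simplified);
    // r: measurement noise variance. Both are tuning assumptions.
    AxisKalman(double q, double r) : q_(q), r_(r) { assert(r > 0.0); }

    // Predict the state one time step (dt) ahead and return the predicted position.
    double predict(double dt = 1.0) {
        p_ += dt * v_;
        // P = F P F^T + Q with F = [[1, dt], [0, 1]]
        const double a = P_[0][0], b = P_[0][1], c = P_[1][0], d = P_[1][1];
        P_[0][0] = a + dt * (b + c) + dt * dt * d + q_;
        P_[0][1] = b + dt * d;
        P_[1][0] = c + dt * d;
        P_[1][1] = d + q_;
        return p_;
    }

    // Correct the prediction with a measured position z (e.g. a box center).
    void correct(double z) {
        const double innovation = z - p_;
        const double s = P_[0][0] + r_;  // innovation variance, H = [1, 0]
        const double k0 = P_[0][0] / s;  // gain for the position
        const double k1 = P_[1][0] / s;  // gain for the velocity
        p_ += k0 * innovation;
        v_ += k1 * innovation;
        // P = (I - K H) P
        const double a = P_[0][0], b = P_[0][1];
        P_[0][0] = (1.0 - k0) * a;
        P_[0][1] = (1.0 - k0) * b;
        P_[1][0] -= k1 * a;
        P_[1][1] -= k1 * b;
    }

    double position() const { return p_; }
    double velocity() const { return v_; }

private:
    double p_ = 0.0, v_ = 0.0;
    double P_[2][2] = {{1000.0, 0.0}, {0.0, 1000.0}};  // large initial uncertainty
    double q_, r_;
};
```

You would keep one pair of filters (x and y) per labeled person, call `predict()` on every frame, and feed `correct()` with the center of the detection you decide belongs to that person, for example the one closest to the predicted position. That matching rule is my assumption; the filter itself does not do the association.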

I'm not sure you should use background subtraction, especially if your camera is moving (a PTZ camera, for example).