Basically, the speed is determined not by your webcam quality but by the scene in which it has to detect. Speaking from experience, any 640x480 pixel webcam, which isn't even HD, can do face tracking in near real time without real problems.
It all comes down to shrinking the search space for candidates: restrict the range of scales that is searched, use background-foreground subtraction, use histogram equalization, ...
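As a rough illustration of those ideas, here is a minimal Python sketch. The helper names (`size_limits`, `search_roi`, `detect_faces`) and the fractions and margin are my own assumptions, not tuned values; the OpenCV calls used are `cvtColor`, `equalizeHist` and `CascadeClassifier.detectMultiScale` with its `minSize`/`maxSize` arguments. Background subtraction is left out to keep it short.

```python
def size_limits(frame_height, min_frac=0.125, max_frac=0.75):
    """Return (minSize, maxSize) for detectMultiScale as square pixel sizes.

    The fractions are illustrative assumptions: faces smaller or larger
    than this share of the frame height are simply never searched for.
    """
    lo = int(frame_height * min_frac)
    hi = int(frame_height * max_frac)
    return (lo, lo), (hi, hi)


def search_roi(prev_box, frame_shape, margin=0.5):
    """Expand the previous face box by `margin` and clip it to the frame."""
    x, y, w, h = prev_box
    frame_h, frame_w = frame_shape[:2]
    dx, dy = int(w * margin), int(h * margin)
    x0, y0 = max(0, x - dx), max(0, y - dy)
    x1, y1 = min(frame_w, x + w + dx), min(frame_h, y + h + dy)
    return x0, y0, x1 - x0, y1 - y0


def detect_faces(frame, cascade, prev_box=None):
    """Detect faces in a BGR frame, searching only near prev_box if given."""
    import cv2  # imported here so the helpers above work without OpenCV

    # Histogram equalization reduces the influence of lighting changes.
    gray = cv2.equalizeHist(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))

    # Restrict the search window to the neighbourhood of the last face.
    ox, oy = 0, 0
    if prev_box is not None:
        ox, oy, rw, rh = search_roi(prev_box, gray.shape)
        gray = gray[oy:oy + rh, ox:ox + rw]

    # Restrict the scale range instead of scanning every possible size.
    min_size, max_size = size_limits(frame.shape[0])
    faces = cascade.detectMultiScale(
        gray, scaleFactor=1.2, minNeighbors=5,
        minSize=min_size, maxSize=max_size,
    )
    # Shift detections back into full-frame coordinates.
    return [(int(x) + ox, int(y) + oy, int(w), int(h)) for (x, y, w, h) in faces]
```

With a recent opencv-python install, the cascade could be loaded with `cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")` and `detect_faces` called once per captured frame, passing the previous detection as `prev_box`.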
However, always keep in mind that the face detector included in OpenCV, when used as the initializer for your tracker, isn't a very robust model. It works well in a constant environment, such as sitting behind a desk, but it fails rather quickly in a highly changing environment. If you can track using other techniques, like active shape models, which need a model initialization phase, then switch to them as early as you can. These can all be implemented in real time.
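The detector-as-initializer idea can be sketched as a simple loop in Python. This is only an outline under my own assumptions: `run_tracking`, `detect` and `track` are hypothetical names, and `track` stands in for whatever tracker you choose (an active shape model fit, for instance); no specific OpenCV tracker API is implied.

```python
def run_tracking(frames, detect, track):
    """Yield one box (or None) per frame.

    detect(frame) -> box or None
        the slow full-frame detector, used only to (re)initialize
    track(frame, prev_box) -> box or None
        the fast tracker; returns None when it loses the target
    """
    box = None
    for frame in frames:
        if box is not None:
            # Normal case: follow the target from its last known position.
            box = track(frame, box)
        if box is None:
            # No target yet, or the tracker lost it: fall back to detection.
            box = detect(frame)
        yield box
```

The point of this layout is that the detector only runs on the first frame and after a tracking failure, so its weaknesses matter less often.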