Key-Frame VO Key-Point Data Fusion
I have a key-point based visual odometry routine which accepts an RGB-D frame as input. Successive images are tracked to each other and a cumulative rotation and translation is maintained. In its current form, significant drift occurs. I intend to transition this routine to key-frames: new RGB-D frames are tracked to the most recent key-frame until enough displacement has accumulated to necessitate a new key-frame. Key-frames should significantly reduce drift and are useful for further processing if so desired (m-frame bundle adjustment, etc.).
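A minimal sketch of that key-frame switching rule, assuming R and t are the relative pose between the current frame and the current key-frame. The function name and both thresholds (0.3 m, 15 degrees) are placeholders chosen for illustration, not tuned values:

```python
import numpy as np

def needs_new_keyframe(R, t, max_translation=0.3, max_rotation_deg=15.0):
    """Return True when the displacement from the key-frame exceeds a threshold.

    R: 3x3 rotation matrix, t: 3-vector translation (relative to the key-frame).
    Both thresholds are hypothetical defaults and should be tuned per sensor/scene.
    """
    translation = np.linalg.norm(t)
    # Rotation angle from the trace of R; clip guards against round-off.
    cos_angle = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    angle_deg = np.degrees(np.arccos(cos_angle))
    return bool(translation > max_translation or angle_deg > max_rotation_deg)
```

Other criteria are possible as well, for example the fraction of key-frame key-points that are still matched in the current frame.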
My question is pretty fundamental. Assume I have performed tracking (key-point matching and PnP) and have [R|t] for the current frame relative to the current key-frame. Now, given a key-point pair, one in the key-frame and one in the current frame, each with a 3D position and an uncertainty/covariance, how can I fuse the new data into the key-frame data? Of course, there are many papers that dance around this and take it for granted, but as someone new to this sort of thing, I am having trouble finding a source that offers a good explanation (it might even come from the radar systems literature).
So, you're trying to update the R|t from the current frame to the key-frame? Why not just add the correspondence to the set you're using to get the R|t you have so far? Maybe I just don't understand what you mean when you say "fuse it into the key-frame data". Wouldn't the R|t be the current frame's data, because the key-frame is the reference?
By fuse, I mean combining the estimated position and uncertainty of the current key-point with those of its corresponding key-point in the key-frame. The fused measurement then reflects the most probable position and the combined uncertainty of all measurements fused into that key-point. I would expect this to mean that, as multiple frames are tracked to a single key-frame, the uncertainty in the key-point positions decreases and the estimated locations approach their true positions.
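One common way to do this kind of fusion is to treat each key-point as a static 3D point with Gaussian noise and apply a Kalman-style measurement update. The sketch below is not taken from a particular paper; it assumes that R and t map current-frame coordinates into key-frame coordinates (invert them if your PnP result goes the other way), that the two measurements are independent, and that the uncertainty of the pose estimate itself can be ignored. The function names are illustrative:

```python
import numpy as np

def to_keyframe(p_cur, cov_cur, R, t):
    """Express a current-frame point and its 3x3 covariance in key-frame coordinates.

    Assumes R, t map current-frame coordinates to key-frame coordinates and
    ignores the uncertainty of R and t themselves.
    """
    p_kf = R @ p_cur + t
    cov_kf = R @ cov_cur @ R.T
    return p_kf, cov_kf

def fuse(p_a, cov_a, p_b, cov_b):
    """Fuse two independent Gaussian estimates of the same 3D point.

    Equivalent to a Kalman update of a static state with identity
    measurement model: the result is the covariance-weighted mean.
    """
    K = cov_a @ np.linalg.inv(cov_a + cov_b)   # gain: how much to trust b
    p = p_a + K @ (p_b - p_a)                  # fused position
    cov = (np.eye(3) - K) @ cov_a              # fused covariance
    return p, cov
```

In use, the stored key-frame estimate would be replaced by the output of `fuse(p_key, cov_key, *to_keyframe(p_cur, cov_cur, R, t))` each time a new frame is tracked. For positive-definite inputs the fused covariance is never larger than either input, which matches the expectation that uncertainty decreases as more frames are fused. If the pose uncertainty is significant, it would need to be propagated into `cov_kf` as well, which this sketch does not attempt.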