
Curious as to why keypoints in video of static room seem to come and go

asked 2019-09-26 19:03:01 -0600

eric_engineer

updated 2019-09-26 20:18:46 -0600

I wrote a simple Python script to process an MP4 video that I took with my phone. The phone was held in a clamp and kept perfectly still. It's just looking at my room, there's no motion going on. I run every frame of that video through SIFT, draw the keypoints on it, and write the result out as a new video.

Why do some keypoints come and go if it's the same static room in the recording? Is there some uncertainty or randomness inherent in the detector algorithm? Or is it more likely compression artifacts introduced by the H.264 encoding? Maybe my lighting, which runs at 60 Hz, dims just enough to make the frames differ periodically? I'm not sure, but these are the things I'm speculating about.
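One way to put a number on the flicker, as a minimal sketch rather than the script from the question: detect SIFT keypoints in each frame and measure what fraction of one frame's keypoints reappear within a small pixel tolerance in the next frame. The function names, the 1.5 px tolerance, and the use of `cv2.SIFT_create()` (available in OpenCV 4.4 and later; older builds exposed it as `cv2.xfeatures2d.SIFT_create()`) are assumptions for illustration.

```python
import math


def persistent_fraction(prev_pts, curr_pts, tol=1.5):
    """Fraction of points in prev_pts that have a neighbour in curr_pts
    within tol pixels. Brute force, which is fine for a quick check."""
    if not prev_pts:
        return 0.0
    kept = 0
    for x, y in prev_pts:
        if any(math.hypot(x - u, y - v) <= tol for u, v in curr_pts):
            kept += 1
    return kept / len(prev_pts)


def sift_points_per_frame(video_path):
    """Yield a list of (x, y) SIFT keypoint positions for each video frame."""
    import cv2  # assumption: opencv-python 4.4+ is installed

    sift = cv2.SIFT_create()
    cap = cv2.VideoCapture(video_path)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            yield [kp.pt for kp in sift.detect(gray, None)]
    finally:
        cap.release()


def report(video_path):
    """Print the keypoint persistence between consecutive frames."""
    prev = None
    for i, pts in enumerate(sift_points_per_frame(video_path)):
        if prev is not None:
            print(f"frame {i}: {persistent_fraction(prev, pts):.1%} persisted")
        prev = pts
```

Calling `report("room.mp4")` (a hypothetical file name) prints one line per pair of consecutive frames. Values below 100% on a static scene mean the keypoint set really is changing from frame to frame.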

If I ran SIFT against the same JPEG hundreds of times, would you expect to get exactly the same keypoints, or would some of them come and go as well?

Thanks for any advice you can give, I'm just curious about why this sort of thing happens.

Edit: I took a jpeg picture of my room, ran it through SIFT feature detection a few hundred times, and wrote the output to an mp4. The features in that were rock solid :) So it's something that changes in my video stream, or maybe the lighting, I guess.
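The repeatability experiment described in the edit can be sketched roughly as follows. This is not the original test script: the helper names are made up for illustration, and `cv2.SIFT_create()` assumes OpenCV 4.4 or later. The image is decoded once and SIFT is run on the same pixel buffer several times, so the check exercises only the detector, not the JPEG decoder.

```python
def keypoint_signature(keypoints, ndigits=3):
    """Order-independent, hashable summary of keypoint tuples."""
    return frozenset(tuple(round(v, ndigits) for v in kp) for kp in keypoints)


def is_repeatable(detect, runs=5):
    """True if calling detect() `runs` times always yields the same keypoints."""
    signatures = {keypoint_signature(detect()) for _ in range(runs)}
    return len(signatures) == 1


def sift_detector_for(image_path):
    """Return a zero-argument function that runs SIFT on one decoded image."""
    import cv2  # assumption: opencv-python 4.4+ is installed

    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(image_path)
    sift = cv2.SIFT_create()

    def detect():
        # Each keypoint becomes (x, y, size, angle, response).
        return [(kp.pt[0], kp.pt[1], kp.size, kp.angle, kp.response)
                for kp in sift.detect(gray, None)]

    return detect
```

For example, `is_repeatable(sift_detector_for("room.jpg"), runs=100)` (with a hypothetical file name) would be expected to return `True` if the detector is deterministic on identical input, which matches the "rock solid" result reported above.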


Comments

About lighting conditions: http://perso.univ-lemans.fr/~berger/A...

1000 fps with artificial light

LBerger (2019-09-27 01:27:24 -0600)

Thank you that's a pretty cool video.

eric_engineer (2019-09-27 09:36:26 -0600)

1 answer


answered 2019-09-27 00:35:38 -0600

berak

It's just looking at my room, there's no motion going on.

Hehe, no motion visible to a human, but at the pixel level, whoa.
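To see how much a "static" video actually changes at the pixel level, one could compare consecutive grayscale frames. The sketch below is only an illustration: the function names are invented here, and plain Python lists are used for clarity rather than speed.

```python
def frame_change_stats(prev, curr):
    """Return (mean_abs_diff, max_abs_diff, changed_fraction) for two
    equally sized grayscale frames given as flat sequences of 0-255 ints."""
    if len(prev) != len(curr) or len(prev) == 0:
        raise ValueError("frames must be non-empty and the same size")
    diffs = [abs(a - b) for a, b in zip(prev, curr)]
    changed = sum(1 for d in diffs if d > 0)
    return sum(diffs) / len(diffs), max(diffs), changed / len(diffs)


def gray_frames(video_path, limit=10):
    """Yield up to `limit` frames of a video as flat lists of gray values."""
    import cv2  # assumption: opencv-python is installed

    cap = cv2.VideoCapture(video_path)
    try:
        count = 0
        while count < limit:
            ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            # tolist() gives Python ints, so a - b cannot wrap around
            # the way uint8 arithmetic would.
            yield gray.ravel().tolist()
            count += 1
    finally:
        cap.release()
```

Feeding two consecutive frames from `gray_frames("room.mp4", limit=2)` (a hypothetical file name) into `frame_change_stats` gives the mean and maximum per-pixel difference and the fraction of pixels that changed at all. Any nonzero result means the detector is not seeing identical input.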

Maybe my lighting

sure.

If I ran SIFT against the same JPEG

Don't use JPEG compression for computer vision; you'll never get back the same thing twice.

Edit: I took a jpeg picture of my room

Yeah, now try that on another machine, with a different libjpeg version, etc.


Comments

@berak Considering your remark about the JPEG format: do you think feeding PNGs instead of JPEGs to an object detector for training would give better results?

holger (2019-09-27 08:38:49 -0600)

@holger, no, I don't think there's much difference. (You even train that thing to be robust against small variations, no?)

It's another story with SIFT, where the OP somehow expected reproducible, pixel-exact results.

berak (2019-09-27 09:11:06 -0600)

Got it - thank you!

holger (2019-09-27 09:12:28 -0600)

Thanks for the advice as usual

eric_engineer (2019-09-27 09:34:58 -0600)
