I want to implement an image DB. If I've understood correctly, I need to use BoW (bag of words), and then I would be able to tell whether I have seen a new image before.
But all the examples I have seen so far (e.g. this) have some set of training images; once training is done, BoW is used to classify the rest.
My situation is different: I have to search through the BoW for an image, and if there is no match we consider it a new image; otherwise we don't add it and say "already exists". Skewed, rescaled and otherwise modified images should be considered the same, so I used SIFT as the feature detector.
So the problem in a nutshell: at the beginning I have an empty database of images (so I have no data to train on). I have a stream of images. I can calculate the features of each incoming image. Then I have to check whether there is a stored image whose descriptors match the descriptors of this image by more than some threshold. If so, I say "existing"; otherwise I add these descriptors as a new image and return "new image". If I get the same (or a similar) image in the future, I say "existing".
This requires rebuilding the BoW with every new image, which seems to be a very complicated solution. All search engines on the internet do this, so I'm surprised there is almost no information about it on the web.
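To make the intended workflow concrete, here is a rough sketch of the check-then-insert loop in plain Rust, with no training step and no OpenCV. Everything in it is an assumption made for illustration: the names `DescriptorIndex`, `Lookup`, `good_matches` and `check_and_insert` are hypothetical, descriptors are plain `Vec<f32>` standing in for SIFT output, and matching is a brute-force nearest-neighbour search with a ratio test rather than a real BoW vocabulary.

```rust
/// A local feature descriptor (SIFT uses 128 floats; any fixed length works here).
type Descriptor = Vec<f32>;

/// Squared Euclidean distance between two descriptors of equal length.
fn dist2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Counts query descriptors whose nearest neighbour in `stored` is clearly
/// closer than the second nearest one (ratio test).
fn good_matches(query: &[Descriptor], stored: &[Descriptor], ratio: f32) -> usize {
    query
        .iter()
        .filter(|q| {
            let (mut best, mut second) = (f32::INFINITY, f32::INFINITY);
            for s in stored {
                let d = dist2(q, s);
                if d < best {
                    second = best;
                    best = d;
                } else if d < second {
                    second = d;
                }
            }
            // Distances are squared, so the ratio is squared as well.
            second.is_finite() && best < ratio * ratio * second
        })
        .count()
}

#[derive(Debug, PartialEq)]
pub enum Lookup {
    New,
    AlreadyExists,
}

/// Hypothetical index: one descriptor set per stored image, scanned linearly.
pub struct DescriptorIndex {
    images: Vec<Vec<Descriptor>>,
    ratio: f32,        // ratio-test threshold, e.g. 0.75
    min_fraction: f32, // fraction of query descriptors that must match
}

impl DescriptorIndex {
    pub fn new(ratio: f32, min_fraction: f32) -> Self {
        Self { images: Vec::new(), ratio, min_fraction }
    }

    /// Reports `AlreadyExists` if any stored image matches well enough;
    /// otherwise stores the descriptors and reports `New`.
    pub fn check_and_insert(&mut self, query: Vec<Descriptor>) -> Lookup {
        let needed = ((self.min_fraction * query.len() as f32).ceil() as usize).max(1);
        let seen = self
            .images
            .iter()
            .any(|stored| good_matches(&query, stored, self.ratio) >= needed);
        if seen {
            return Lookup::AlreadyExists;
        }
        self.images.push(query);
        Lookup::New
    }
}
```

This scans every stored image for every query, so it only illustrates the logic; it says nothing about how to make the lookup scale, which is the part I am actually asking about.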
Okay, I tried to use PHash to perform the search, but it fails to recognize slightly different images. Here is some sample code (it's a thin Rust wrapper over a C++ interface, so don't be afraid of the alien language):
pub trait Database {
    fn save_image(&mut self, image: &[u8]);
    fn load_images(&self) -> Vec<Vec<u8>>;
}

// Debug and PartialEq are needed by the assert_eq!/assert_ne! calls in the test.
#[derive(Debug, PartialEq)]
pub enum ImageVariant {
    New,
    AlreadyExists,
}

pub struct Storage<T: Database> {
    database: T,
    hasher: PHash,
    // hashes of all images accepted so far
    images: Vec<Mat>,
}

impl<T: Database> Storage<T> {
    pub fn new(database: T) -> Self {
        Self {
            database,
            hasher: PHash::new(),
            images: Vec::new(),
        }
    }
}

// the main execution happens here
impl<T: Database> Storage<T> {
    pub fn save_image_if_new(&mut self, image: &[u8], filename: &str) -> ImageVariant {
        // maximum hash distance at which two images count as the same
        const DIFF: f64 = 0.5;
        let mat = Mat::image_decode(image, ImageReadMode::Grayscale);
        // from here on `mat` holds the hash, not the pixels
        let mat = self.hasher.compute(&mat);
        // find the smallest distance to any stored hash
        let mut last_diff = std::f64::INFINITY;
        for known in self.images.iter() {
            let diff = self.hasher.compare(&mat, known);
            if diff < last_diff {
                last_diff = diff;
            }
        }
        if last_diff < DIFF {
            return ImageVariant::AlreadyExists;
        }
        self.database.save_image(image);
        self.images.push(mat);
        ImageVariant::New
    }
}
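The test further down uses `imagedb::InMemoryDatabase`, which is not shown above. A minimal version might look like the following; the struct layout is an assumption for illustration (the real one lives in the linked code), and the `Database` trait is repeated only so that the snippet compiles on its own.

```rust
pub trait Database {
    fn save_image(&mut self, image: &[u8]);
    fn load_images(&self) -> Vec<Vec<u8>>;
}

/// Keeps the encoded bytes of every saved image in memory; nothing is persisted.
#[derive(Default)]
pub struct InMemoryDatabase {
    images: Vec<Vec<u8>>,
}

impl InMemoryDatabase {
    pub fn new() -> Self {
        Self::default()
    }
}

impl Database for InMemoryDatabase {
    fn save_image(&mut self, image: &[u8]) {
        // Copy the bytes so the caller keeps ownership of its buffer.
        self.images.push(image.to_vec());
    }

    fn load_images(&self) -> Vec<Vec<u8>> {
        self.images.clone()
    }
}
```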
Then I ran the test on the following images:
- Image 1: lenna.png
- Image 2: lenna_demotivator.png
- Image 3: Solvay_conference_1927.jpg
PHash.compare gives a distance of 28 for images 1 and 2 (which should get a small distance), while it gives 30 for images 1 and 3 (which should get a big distance). So in practice it doesn't let me tell whether I have already seen an image before.
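To put the two numbers on a common scale: assuming the wrapper forwards to OpenCV's `img_hash` PHash, which as far as I know produces a 64-bit hash and compares two hashes by Hamming distance (I have not verified this for the wrapper itself), the distance is the number of differing bits out of 64. The helper names `hamming` and `normalized` below are made up for this sketch.

```rust
/// Number of bits in which two 64-bit hashes differ.
fn hamming(a: u64, b: u64) -> u32 {
    (a ^ b).count_ones()
}

/// Hamming distance scaled to the range 0.0..=1.0 for a 64-bit hash.
fn normalized(distance: u32) -> f64 {
    f64::from(distance) / 64.0
}
```

Under that assumption, 28 corresponds to `normalized(28) == 0.4375` and 30 to `normalized(30) == 0.46875`. Two unrelated 64-bit hashes differ in 32 bits on average (0.5), so both pairs look close to "unrelated". It would also mean that `DIFF = 0.5`, compared against a raw bit count, only ever accepts identical hashes.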
The entire code is available here and here
Test code:
use std::fs;

#[test]
fn it_works() {
    let lenna = fs::read(get_asset_path("lenna.png")).unwrap();
    let lenna_demotivator = fs::read(get_asset_path("lenna_demotivator.png")).unwrap();
    let solvay_conference = fs::read(get_asset_path("Solvay_conference_1927.jpg")).unwrap();

    let db = imagedb::InMemoryDatabase::new();
    let mut storage = imagedb::Storage::new(db);

    let result = storage.save_image_if_new(&lenna, "lenna");
    let result_demotivator = storage.save_image_if_new(&lenna_demotivator, "lenna demotivator");
    let result_solvay_conference = storage.save_image_if_new(&solvay_conference, "solvay_conference");

    // the first image is new, the demotivator should be recognized as a duplicate,
    // and the unrelated photo should be new again
    assert_eq!(result, imagedb::ImageVariant::New);
    assert_ne!(result_demotivator, imagedb::ImageVariant::New);
    assert_eq!(result_solvay_conference, imagedb::ImageVariant::New);
}