Group glyph images from a book page scan

asked 2015-11-13 16:03:40 -0600

updated 2015-11-14 13:30:13 -0600

Hello!

I am a type designer working on a type revival project. I have a bunch of high-quality scans from a book printed with movable type. My goal is to group all the images of the same glyph in a dedicated subfolder.

Using PIL I can easily split the original scan into single glyph images like these:

[images: sample glyphs "o" and "n"]

I started writing a script for the grouping task, based on these tutorials (1, 2).

Here you can find my draft.

The script partially works: I got some matches and I managed to group together a decent number of glyphs. But I am wondering if I can improve the results.

A significant number of glyphs don't get any match, and some of the matches are simply wrong. Given that the complexity of these images is very low, there may be room for improvement.

A few questions:

  • I started with the SURF detection algorithm. Is it a good choice? Is there an algorithm better suited to this kind of image?
  • The two criteria I am using to declare a match as true are:

    • a maximum distance below 0.58
    • a number of matches above 60

Should I take anything else into consideration?
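The acceptance rule described above can be sketched in a few lines. This is only an illustration of one reading of the two criteria: matches whose descriptor distance is below 0.58 are kept, and two glyphs count as the same when more than 60 matches survive. The function name `is_same_glyph` and its parameters are hypothetical, and the input is assumed to be the plain list of match distances returned by whatever matcher the script uses.

```python
def is_same_glyph(distances, max_distance=0.58, min_matches=60):
    """Decide whether two glyph images match.

    distances: descriptor distances of the candidate matches (floats).
    A candidate is kept when its distance is below max_distance;
    the glyphs are declared equal when more than min_matches are kept.
    """
    good = [d for d in distances if d < max_distance]
    return len(good) > min_matches


if __name__ == "__main__":
    # Hypothetical distances, for illustration only.
    sample = [0.2] * 70 + [0.9] * 30
    print(is_same_glyph(sample))
```

Keeping both thresholds as parameters makes it easy to try other values on the groups that are currently wrong.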

Any comment or suggestion would be really appreciated.

All the best

EDIT

–––––––

Extra images. They all come from a group collected by the script.

The right ones (lowercase 'o', 29/35):

[four sample images]

The wrong ones (6/35):

[four sample images]


Comments

Take a look at shape_example.cpp.

sturkmen (2015-11-14 03:50:25 -0600)

Sorry, but C++ is not my thing. Is there any Python resource on the topic?

Roberto Arista (2015-11-14 05:59:38 -0600)

Please provide two matched sample images and one different one.

sturkmen (2015-11-14 06:48:58 -0600)

Some images added at the bottom of the question!

Roberto Arista (2015-11-14 13:30:35 -0600)

Again, the ShapeContextDistanceExtractor class is a good fit (as I pointed out at first). Try to implement it in Python, or search for sample Python code.

sturkmen (2015-11-14 14:17:34 -0600)