
Please suggest how to do automatic image annotation

asked 2013-06-20 22:29:53 -0500

roman

updated 2015-09-30 07:49:28 -0500

Automatic image annotation is the process of automatically labelling the content of an image with terms such as sky, building, ground, human, water, hill, and so on.




It's great you know what annotation is. However, you should also know that what you want hasn't been done yet. It seems to fall in the same category as this post: Apart from the unfriendly tone, you should keep in mind that those labelled image collections that you can find on the internet were manually labelled, one by one, point by point.

sammy (2013-06-21 00:52:46 -0500)

@sammy: just wanted to note that there is research on that topic, see e.g. the work by Tighe and Lazebnik:

Guanta (2013-08-21 05:14:16 -0500)

Maybe it is good to point to the LabelMe software, which is an open source labeling project. However, it seems that some people made good open source software in the past, as Piotr Dollár did for example, but that it has since been bought by companies like Dropbox and Google, who will not share the code anymore. This is still an active research topic at all current computer vision conferences.

StevenPuttemans (2013-10-17 06:23:27 -0500)

3 answers


answered 2013-08-21 09:31:23 -0500

SR

It is definitely not true that this problem has not been tackled before. Have a look at the following papers (in no particular order and probably not up to date):

  • Monay, F., & Gatica-Perez, D. (2004). pLSA-based image auto-annotation: constraining the latent space. ACM Multimedia, 1–4.

  • Zhang, R., Zhang, L., Wang, X., & Guan, L. (2011). Multi-Feature pLSA for Combining Visual Features in Image Annotation. ACM Multimedia.

  • Li, L.-J., Socher, R., & Fei-Fei, L. (2009). Towards total scene understanding: Classification, annotation and segmentation in an automatic framework. IEEE Conference on Computer Vision and Pattern Recognition, 2036–2043.

  • Li, J., & Wang, J. Z. (2008). Real-time computerized annotation of pictures. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(6), 985–1002.

  • Wang, C., Blei, D., & Fei-Fei, L. (2009). Simultaneous image classification and annotation. IEEE Conference on Computer Vision and Pattern Recognition, 1903–1910.

  • Blei, D. M., & Jordan, M. I. (2003). Modeling annotated data. ACM SIGIR Conference on Research and Development in Information Retrieval, 127.

  • Barnard, K., Duygulu, P., Forsyth, D., de Freitas, N., Blei, D. M., & Jordan, M. I. (2003). Matching words and pictures. Journal of Machine Learning Research, 3, 1107–1135.

  • Cusano, C., Ciocca, G., & Schettini, R. (2003). Image annotation using SVM. Proceedings of SPIE, (1), 330–338.

  • Sinha, P., & Jain, R. (2008). Classification and annotation of digital photos using optical context data. Proceedings of the 2008 international conference on Content-based image and video retrieval, 309–318.

  • Pham, T., Maillot, N. E., Lim, J., & Chevallet, J. (2007). Latent Semantic Fusion Model for Image Retrieval and Annotation. ACM Conference on Information and Knowledge Management (CIKM), 439–443.


answered 2013-08-21 05:18:21 -0500

As @sammy pointed out, you must understand that what you are trying to accomplish here is a hard and open computer vision problem.

Having said that, if you are still interested in the problem, then I would look for a combination of scene categorization algorithms (for example [1]) and object detection algorithms (for example [2]) for the simple cases. Annotations such as "ground" or "sky" can be assigned by using the spatial location of the region and its color. Other annotations such as "hill" and "water" seem very hard, and it's not clear (at least, not to me) how to approach them.

[1] Lazebnik, Svetlana, Cordelia Schmid, and Jean Ponce. "Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories." Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on. Vol. 2. IEEE, 2006.

[2] Felzenszwalb, Pedro F., et al. "Object detection with discriminatively trained part-based models." Pattern Analysis and Machine Intelligence, IEEE Transactions on 32.9 (2010): 1627-1645.
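
To make the "spatial location plus color" idea for "sky" and "ground" concrete, here is a minimal sketch. It is only an illustration under assumptions of mine: the function names, the half-image split, and the color thresholds are made up and untuned, and the image is a plain list of rows of (r, g, b) tuples so that the example needs no libraries. With OpenCV you would work on image arrays instead.

```python
def mean_color(image, top, bottom):
    """Average (r, g, b) over rows top..bottom-1 of an image given as rows of RGB tuples."""
    pixels = [px for row in image[top:bottom] for px in row]
    n = len(pixels)
    return tuple(sum(px[c] for px in pixels) / n for c in range(3))


def annotate_sky_ground(image):
    """Guess 'sky' / 'ground' labels from position and color alone.

    The thresholds are illustrative assumptions, not tuned values.
    The image must have at least two rows.
    """
    height = len(image)
    upper = mean_color(image, 0, height // 2)
    lower = mean_color(image, height // 2, height)

    labels = set()
    r, g, b = upper
    # Upper half that is bright and blue-dominant: guess "sky".
    if b > r and b > g and b > 100:
        labels.add("sky")
    r, g, b = lower
    # Lower half where blue is the weakest channel (green or brown tones): guess "ground".
    if b < r and b < g:
        labels.add("ground")
    return labels


# Synthetic 4x4 image: blue upper half, green lower half.
sky_px, grass_px = (90, 150, 230), (60, 140, 50)
image = [[sky_px] * 4] * 2 + [[grass_px] * 4] * 2
print(sorted(annotate_sky_ground(image)))  # ['ground', 'sky']
```

A rule like this breaks quickly on real photos (sunsets, indoor scenes, snow), which is why it only covers the simple cases.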


answered 2013-10-15 22:41:49 -0500

updated 2013-10-16 18:39:52 -0500

SR

Automatic image annotation (also known as automatic image tagging or linguistic indexing) is the process by which a computer system automatically assigns metadata in the form of captioning or keywords to a digital image. This application of computer vision techniques is used in image retrieval systems to organize and locate images of interest from a database. This method can be regarded as a type of multi-class image classification with a very large number of classes - as large as the vocabulary size. Typically, image analysis in the form of extracted feature vectors and the training annotation words are used by machine learning techniques to attempt to automatically apply annotations to new images.
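
As a rough sketch of the "feature vectors plus training annotation words" pipeline described above, the following uses a coarse color histogram as the feature and transfers keywords from the nearest training images. This nearest-neighbor label transfer is a deliberately simple stand-in that I chose for illustration; it is not the method of any paper mentioned in this thread, and all function names and parameters are hypothetical. Images are plain lists of rows of (r, g, b) tuples so that the code runs without extra libraries.

```python
from collections import Counter


def color_histogram(image, bins=4):
    """Normalized joint RGB histogram with bins**3 cells, returned as a flat list."""
    hist = [0.0] * (bins ** 3)
    count = 0
    for row in image:
        for r, g, b in row:
            # Map each 0..255 channel value to a bin index in 0..bins-1.
            idx = (r * bins // 256) * bins * bins + (g * bins // 256) * bins + (b * bins // 256)
            hist[idx] += 1
            count += 1
    return [h / count for h in hist]


def distance(h1, h2):
    """L1 distance between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def annotate(query, training_set, k=3, n_words=2):
    """Return the most frequent keywords among the k nearest training images.

    training_set: list of (image, keywords) pairs with manually assigned keywords.
    """
    q = color_histogram(query)
    ranked = sorted(training_set, key=lambda item: distance(q, color_histogram(item[0])))
    votes = Counter(word for _, words in ranked[:k] for word in words)
    return [word for word, _ in votes.most_common(n_words)]


def solid(color, size=4):
    """Helper for the demo: a size x size image filled with one color."""
    return [[color] * size for _ in range(size)]


training_set = [
    (solid((90, 150, 230)), ["sky"]),
    (solid((100, 160, 240)), ["sky", "water"]),
    (solid((60, 140, 50)), ["grass"]),
]
print(annotate(solid((95, 155, 235)), training_set, k=2, n_words=1))  # ['sky']
```

Real systems replace the color histogram with much richer features and the neighbor vote with a trained model, but the structure (features in, vocabulary words out) is the same.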




Asked: 2013-06-20 22:29:53 -0500

Seen: 5,050 times

Last updated: Oct 16 '13