Summarization of compressed text images: an experience on Indic script documents
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Singapore, Singapore
POSTER SESSION: Posters group 3: multimedia and domain specific IR
Pages 803-804  
Year of Publication: 2008
ISBN: 978-1-60558-164-4
Author
Utpal Garain, Indian Statistical Institute, Kolkata, India
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1390334.1390512

ABSTRACT

This paper discusses automatic summarization of JBIG2-coded textual images. The compressed images are only partially decompressed to compute the relevant features, and the feature extraction method does not use any character recognition module. Summary sentences are ranked. The experiments consider documents in Indic scripts, for which no efficient OCR systems are available. The script-independent nature of the approach is demonstrated on the two most popular Indic scripts. A sentence selection efficiency of about 61% is achieved when judged against human-made summaries. A nonparametric (distribution-free) rank statistic gives a correlation coefficient of 0.33 as a measure of the (minimum) strength of association between the sentence rankings produced by machine and by human.
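
The abstract names neither the rank statistic nor the exact definition of sentence selection efficiency. As a rough illustration only, the sketch below assumes Spearman's rho (one common distribution-free rank statistic) for tie-free rankings, and treats selection efficiency as the fraction of human-selected sentences that the machine also selected. The function names and the sample data are hypothetical and are not taken from the paper.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of the same items.

    Assumption: the paper's statistic is not named in the abstract;
    Spearman's rho is used here purely as an example.
    """
    if len(rank_a) != len(rank_b):
        raise ValueError("rankings must have the same length")
    n = len(rank_a)
    if n < 2:
        raise ValueError("need at least two items")
    # Sum of squared rank differences, item by item.
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def selection_overlap(machine_selected, human_selected):
    """Fraction of human-selected sentences also selected by the machine.

    Assumption: one plausible reading of "sentence selection efficiency".
    """
    human = set(human_selected)
    if not human:
        raise ValueError("human selection is empty")
    return len(human & set(machine_selected)) / len(human)


# Hypothetical data: ranks given to five sentences (1 = most important).
machine_ranks = [1, 2, 3, 4, 5]
human_ranks = [2, 1, 4, 3, 5]
print(spearman_rho(machine_ranks, human_ranks))

# Hypothetical data: indices of sentences chosen for a three-sentence summary.
print(selection_overlap([0, 2, 5], [0, 2, 3]))
```

A value of rho near 1 would indicate that machine and human rank the sentences in nearly the same order; a value near 0 indicates little monotonic agreement.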

