| The handwritten trie: indexing electronic ink |
| Full text |
Pdf
(1.19 MB)
|
| Source
|
International Conference on Management of Data
archive
Proceedings of the 1995 ACM SIGMOD international conference on Management of data
table of contents
San Jose, California, United States
Pages: 151 - 162
Year of Publication: 1995
ISBN:0-89791-731-6
Also published in ...
|
|
Authors
|
|
Walid Aref
|
Matsushita Information Technology Laboratory, 2 Research Way, 3rd Floor, Princeton, N.J.
|
|
Daniel Barbará
|
Matsushita Information Technology Laboratory, 2 Research Way, 3rd Floor, Princeton, N.J.
|
|
Padmavathi Vallabhaneni
|
Matsushita Information Technology Laboratory, 2 Research Way, 3rd Floor, Princeton, N.J.
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 3, Downloads (12 Months): 28, Citation Count: 2
|
|
|
ABSTRACT
The emergence of the pen as the main interface device for personal digital assistants and pen-computers has made handwritten text, and more generally ink, a first-class object. As for any other type of data, the need of retrieval is a prevailing one. Retrieval of handwritten text is more difficult than that of conventional data since it is necessary to identify a handwritten word given slightly different variations in its shape. The current way of addressing this is by using handwriting recognition, which is prone to errors and limits the expressiveness of ink. Alternatively, one can retrieve from the database handwritten words that are similar to a query handwritten word using techniques borrowed from pattern and speech recognition. In particular, Hidden Markov Models (HMM) can be used as representatives of the handwritten words in the database. However, using HMM techniques to match the input against every item in the database (sequential searching) is unacceptably slow and does not scale up for large ink databases. In this paper, an indexing technique based on HMMs is proposed. The new index is a variation of the trie data structure that uses HMMs and a new search algorithm to provide approximate matching. Each node in the tree contains handwritten letters, where each letter is represented by an HMM. Branching in the trie is based on the ranking of matches given by the HMMs. The new search algorithm is parametrized so that it provides means for controlling the matching quality of the search process via a time-based budget. The index dramatically improves the search time in a database of handwritten words. Due to the variety of platforms for which this work is aimed, ranging from personal digital assistants to desktop computers, we implemented both main-memory and disk-based systems. The implementations are reported in this paper, along with performance results that show the practicality of the technique under a variety of conditions.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
W. Aref, D. Barbar~, and P. Vallabhaneni. The Handwritten Trie: Indexing Electronic Ink. Technical report, M.I.T.L, October 1994.
|
| |
2
|
Walid Aref and Daniel Barbar~. The Hidden Markov Model Tree Index: A Practical Approach to Fast Recognition of Handwritten Documents in Large Databases. Technical Report MITL-TR-84- 93, MITL, January 1994.
|
| |
3
|
Walid G. Aref, Padmavathi Vallabhaneni, and Daniel Barbar~. Towards a realization of handwritten databases: Training and recognition. Technical Report MITL-TR-98-94, Matsushita information Technology Laboratory, Princeton, NJ, March 1994.
|
| |
4
|
R. Bakis. Continuous speech word recognition via centisecond acoustic states. In Proc. ASA Meeting, Washington, DC, April 1976.
|
| |
5
|
Daniel Barbar~. Method to index electronic handwritten documents. Technical Report MITL-TR- 77-93, Matsushita Information Technology Laboratory, Princeton, N3, November 1993.
|
| |
6
|
|
| |
7
|
Rene de la Briandais. File searching using variable length keys. in Proceedings of the Western Joint Computer Conference, pages 295-298, 1959.
|
 |
8
|
|
| |
9
|
A. Kaltenmeier, T. Caesar, J. M. Gloger, and E. Mandler. Sophisticated topology of hidden markov models for cursive script recognition. In Proceeding8 of the Sad. International Confe,ence on Document Analysis and Recognition, pages 139- 142, Tsukuba Science City, October 1993.
|
| |
10
|
|
| |
11
|
K. Landau, S. Major, and C. Wiederhold. The Role of PDA in the Office. Notes of the Seminar at PC-Expo, June 1994.
|
| |
12
|
S. E. Levinson, L. R. Rabiner, and M. M. Sondhi. An introduction to the application of the theory of probabilistic functions of a markov proces to automatic speech recognition. Bell System Technical Journal, 62(4):1035-1074, April 1983.
|
| |
13
|
D. P. Lopresti and A. Tomkins. Approximate Matching of Hand-Drawn Pictograms. In Proceedings of the Third International Workshop or~ Frontiers in Handwriting Recognition, May 1993.
|
 |
14
|
|
| |
15
|
I-ice-Seen Park and Seong-Whan Lee. Large-set handwritten character recognition with multiple stochastic models. In Proceedings of ~he Znd. International Conference on Document Analysis and Recognition, pages 143-146, Tsukuba Science City, October 1993.
|
| |
16
|
L. R. Rabiner. A tutorial on hidden markov models and selected applications in speech recognition. Proceeding of the IEEE, 77(2):257-285, February 1989.
|
| |
17
|
|
| |
18
|
|
| |
19
|
Bong Kee Sin and Jim Hyuung Kim. A statistical approach with HMMs for on-line cursive I-iangui (Korean script) recognition. In Proceedings of the ~nd. International Conference on Document Analysis and Recognition, pages 147-150, Tsukuba Science City, October 1993.
|
| |
20
|
W. Smith. The Newton Application Architecture. In Proceedings of the IEEE Computer Conference, San Francisco, 1994.
|
|