Detecting word substitutions: PMI vs. HMM
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Amsterdam, The Netherlands
POSTER SESSION: Posters
Pages: 885-886
Year of Publication: 2007
ISBN: 978-1-59593-597-7
ABSTRACT
Those who want to conceal the content of their communications can do so by replacing words that might trigger attention. For example, instead of writing "The bomb is in position", a terrorist may choose to write "The flower is in position." The substituted sentence sounds slightly "odd" to a human reader, and prior research has shown that such oddity is detectable by text mining approaches. However, the importance of each component of the suggested oddity detection approach has not been thoroughly investigated, nor has the approach been compared with an obvious candidate for the task, Hidden Markov Models (HMM). In this work, we further explore the previously reported oddity detection algorithms, specifically those based on pointwise mutual information (PMI) and Hidden Markov Models (HMM).
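To make the PMI idea concrete, the following is a minimal illustrative sketch, not the method evaluated in the paper. It uses the standard definition PMI(x, y) = log2(p(x, y) / (p(x) p(y))), estimated from sentence-level co-occurrence in a tiny made-up corpus. The function names `pmi_table` and `pmi`, the toy sentences, and the choice to return negative infinity for unseen pairs are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(sentences):
    """Estimate sentence-level PMI for every word pair that co-occurs.

    Probabilities are relative frequencies over sentences: p(x) is the
    fraction of sentences containing x, p(x, y) the fraction containing both.
    """
    n = len(sentences)
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        # Use the set of distinct words, sorted so pair keys are canonical.
        words = sorted(set(sentence.lower().split()))
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))
    return {
        (a, b): math.log2((c / n) / ((word_counts[a] / n) * (word_counts[b] / n)))
        for (a, b), c in pair_counts.items()
    }


def pmi(table, a, b):
    """Look up PMI for an unordered pair; unseen pairs get -inf (an assumption)."""
    return table.get(tuple(sorted((a, b))), float("-inf"))


# Hypothetical toy corpus, far too small for real use.
corpus = [
    "the bomb exploded downtown",
    "the bomb was defused",
    "the flower bloomed early",
    "the flower was watered",
]
table = pmi_table(corpus)

print(pmi(table, "bomb", "defused"))    # co-occur more often than chance
print(pmi(table, "flower", "defused"))  # never co-occur in this corpus
```

In this toy setting, a word whose PMI with the rest of its sentence is low is the kind of candidate an oddity detector might flag as a possible substitution. A realistic system would need a large corpus and some smoothing for unseen pairs.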