Natural language processing and e-Government: crime information extraction from heterogeneous data sources
Source
dg.o; Vol. 289
Proceedings of the 2008 international conference on Digital government research
Montreal, Canada
SESSION: Research papers and management, case study & policy papers: public safety
Pages 162-170
Year of Publication: 2008
ISBN: 978-1-60558-099-9
Authors
Chih Hao Ku, Claremont Graduate University, Claremont, CA
Alicia Iriberri, Claremont Graduate University, Claremont, CA
Gondy Leroy, Claremont Graduate University, Claremont, CA
Sponsors
: Routledge
: Elsevier
: Springer
: Cefrio
NCDG : National Center for Digital Government
Publisher

ABSTRACT

Much information that could help solve and prevent crimes is never gathered because the reporting methods available to citizens and law enforcement personnel are not optimal. Detectives do not have sufficient time to interview crime victims and witnesses. Moreover, many victims and witnesses are too scared or embarrassed to report incidents. We are developing an interviewing system that will help collect such information. We report here on one component, the crime information extraction module, which uses natural language processing to extract crime information from police reports, newspaper articles, and victims' and witnesses' crime narratives. We tested our approach with two types of documents: police and witness narrative reports. Our algorithms extract crime-related information, namely weapons, vehicles, time, people, clothes, and locations. We achieved high precision (96%) and recall (83%) for police narrative reports and comparable precision (93%) but somewhat lower recall (77%) for witness narrative reports. The difference in recall was significant at p < .05. We then ran a spell checker on the witness narratives to evaluate whether spelling correction would help with their processing. We found that both precision (94%) and recall (79%) improved slightly.
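
The precision and recall figures above can be read as follows: precision is the share of extracted entities that are correct, and recall is the share of entities in the reference annotation that were found. The sketch below is illustrative only and is not the authors' evaluation code. The function name `precision_recall`, the `(entity type, text)` representation, the exact-match criterion, and the sample entities are all assumptions made for the example; the abstract does not state how matches were scored.

```python
from typing import Set, Tuple

# Assumed representation: an entity is a (type, surface text) pair.
Entity = Tuple[str, str]


def precision_recall(extracted: Set[Entity], gold: Set[Entity]) -> Tuple[float, float]:
    """Return (precision, recall) under exact matching of (type, text) pairs."""
    true_positives = len(extracted & gold)
    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical reference annotation for one narrative (made-up data).
gold = {
    ("weapon", "knife"),
    ("vehicle", "red truck"),
    ("time", "9 pm"),
    ("location", "Main Street"),
    ("clothes", "black hoodie"),
}

# Hypothetical system output: three correct entities and one spurious one.
extracted = {
    ("weapon", "knife"),
    ("vehicle", "red truck"),
    ("time", "9 pm"),
    ("person", "neighbor"),
}

p, r = precision_recall(extracted, gold)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this made-up case three of the four extracted entities are correct and three of the five reference entities are found. A pattern like the one reported for witness narratives, with similar precision but lower recall, would correspond to the system missing more reference entities rather than producing more spurious ones.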

