Domain adaptation of information extraction models
Source
ACM SIGMOD Record
Volume 37, Issue 4 (December 2008)
COLUMN: Special section on managing information extraction
Pages 35-40
Year of Publication: 2009
ISSN:0163-5808
Authors
Rahul Gupta  IIT Bombay
Sunita Sarawagi  IIT Bombay
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1519103.1519109

ABSTRACT

Domain adaptation refers to the process of adapting an extraction model trained in one domain to a related domain for which only unlabeled data is available. We present a brief survey of existing methods that retrain models to best exploit labeled data from a related domain. Because these approaches involve expensive model retraining, they are not practical when a large number of new domains have to be handled in an operational setting. We describe our approach for adapting record extraction models, which exploits the regularity within a domain to jointly label records without retraining any model.

