Multi-task text segmentation and alignment based on weighted mutual information
Source: Conference on Information and Knowledge Management
Proceedings of the 15th ACM international conference on Information and knowledge management
Arlington, Virginia, USA
POSTER SESSION: Posters
Pages: 846 - 847  
Year of Publication: 2006
ISBN:1-59593-433-2
Authors
Bingjun Sun  The Pennsylvania State University, University Park, PA
Ding Zhou  The Pennsylvania State University, University Park, PA
Hongyuan Zha  The Pennsylvania State University, University Park, PA
John Yen  The Pennsylvania State University, University Park, PA
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 11,   Downloads (12 Months): 45,   Citation Count: 1
DOI: http://doi.acm.org/10.1145/1183614.1183760

ABSTRACT

Text segmentation is important for text analysis, while text alignment determines the sub-topics shared among similar documents. Multi-task text segmentation and alignment extends single-task segmentation to use information from multi-source documents. In this paper we introduce a novel domain-independent unsupervised method for multi-task segmentation and alignment, based on the idea that the optimal segmentation and alignment maximizes weighted mutual information, that is, mutual information with term weights. The experimental results show that our approach works well.
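
The idea of choosing a segmentation that maximizes weighted mutual information can be illustrated with a minimal sketch. The abstract does not give the paper's objective function, so the code below assumes one plausible form, WMI = sum over terms t and segments s of w(t) p(t, s) log(p(t, s) / (p(t) p(s))), with probabilities estimated from raw term counts. The function names, the `term_weights` dictionary, and the exhaustive single-boundary search are all hypothetical choices made for illustration; the sketch handles one document only and does not cover alignment across documents.

```python
import math
from collections import Counter


def weighted_mutual_information(segments, term_weights=None):
    """Assumed objective: sum_t sum_s w(t) * p(t,s) * log(p(t,s) / (p(t) * p(s))).

    segments: list of segments, each a list of terms.
    term_weights: optional dict mapping term -> non-negative weight
                  (terms missing from the dict get weight 1.0).
    """
    counts = [Counter(seg) for seg in segments]
    total = sum(sum(c.values()) for c in counts)
    if total == 0:
        return 0.0

    # Marginal term counts over all segments.
    term_totals = Counter()
    for c in counts:
        term_totals.update(c)

    wmi = 0.0
    for c in counts:
        p_s = sum(c.values()) / total          # segment marginal
        for term, n in c.items():
            p_ts = n / total                   # joint probability
            p_t = term_totals[term] / total    # term marginal
            w = 1.0 if term_weights is None else term_weights.get(term, 1.0)
            wmi += w * p_ts * math.log(p_ts / (p_t * p_s))
    return wmi


def best_two_way_split(sentences, term_weights=None):
    """Try every boundary between sentences and keep the one with the highest score.

    sentences: list of tokenized sentences (each a list of terms).
    Returns (boundary_index, score), or None if there are fewer than two sentences.
    """
    best = None
    for b in range(1, len(sentences)):
        left = [t for s in sentences[:b] for t in s]
        right = [t for s in sentences[b:] for t in s]
        score = weighted_mutual_information([left, right], term_weights)
        if best is None or score > best[1]:
            best = (b, score)
    return best


if __name__ == "__main__":
    doc = [["cat", "dog"], ["dog", "cat"], ["stock", "bond"], ["bond", "stock"]]
    print(best_two_way_split(doc))
```

With uniform weights the score reduces to ordinary mutual information between terms and segments, so a boundary that separates vocabularies cleanly scores highest. Passing larger weights for topical terms and smaller ones for frequent function words would shift the preferred boundary toward topic changes, which is the role the abstract attributes to term weights.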



