Correlating multilingual documents via bipartite graph modeling
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Tampere, Finland
Session: Poster session
Pages: 443 - 444  
Year of Publication: 2002
ISBN: 1-58113-561-0
Authors
Hongyuan Zha, The Pennsylvania State University, University Park, PA
Xiang Ji, The Pennsylvania State University, University Park, PA
Sponsor
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 1,   Downloads (12 Months): 32,   Citation Count: 4
DOI: http://doi.acm.org/10.1145/564376.564485

ABSTRACT

There is an enormous number of multilingual documents, from various sources and possibly from different countries, describing a single event or a set of related events. It is desirable to construct text mining methods that can compare these multilingual documents and highlight their similarities and differences. We discuss our ongoing research, which seeks to model a pair of multilingual documents as a weighted bipartite graph with the edge weights computed by means of machine translation. We use a spectral method to identify dense subgraphs of the weighted bipartite graph, which can be considered as corresponding to sentences that correlate well in textual content. We illustrate our approach using English and German texts.
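
The abstract does not spell out which spectral method is used, so the following is only a minimal sketch of one common technique: splitting a weighted bipartite graph in two using the second singular vectors of its degree-normalized weight matrix. The function name bipartite_spectral_split and the toy weights are hypothetical; in the setting described above, the weights would come from comparing machine-translated sentences, which is not reproduced here.

```python
import numpy as np


def bipartite_spectral_split(W):
    """Split rows and columns of a nonnegative weight matrix into two groups.

    W[i, j] is the edge weight between sentence i of one document and
    sentence j of the other. Returns two boolean arrays (row side, column
    side); rows and columns sharing a value fall in the same group.
    """
    W = np.asarray(W, dtype=float)
    # Vertex degrees on each side, floored to avoid division by zero.
    d1 = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    d2 = 1.0 / np.sqrt(np.maximum(W.sum(axis=0), 1e-12))
    # Degree-normalized weight matrix D1^{-1/2} W D2^{-1/2}.
    Wn = d1[:, None] * W * d2[None, :]
    U, _, Vt = np.linalg.svd(Wn, full_matrices=False)
    # The leading singular pair is trivial; the second pair carries the split.
    x = d1 * U[:, 1]
    y = d2 * Vt[1, :]
    return x >= 0, y >= 0


# Toy weights (assumed, not from the paper): 4 sentences versus 5 sentences.
W = np.array([
    [0.9, 0.8, 0.1, 0.0, 0.1],
    [0.7, 0.9, 0.0, 0.1, 0.0],
    [0.1, 0.0, 0.8, 0.9, 0.7],
    [0.0, 0.1, 0.9, 0.7, 0.8],
])
rows, cols = bipartite_spectral_split(W)
print("row groups:   ", rows.astype(int))
print("column groups:", cols.astype(int))
```

Rows and columns that receive the same label form one candidate dense subgraph, that is, a group of sentences from the two documents that correlate well. Applying the split recursively to each group would yield finer clusters; which group gets which label depends on the arbitrary sign of the singular vectors.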


REFERENCES

 
1. Systran: Information and Translation Technologies. http://www.systransoft.com/
2. Translingual Information Detection, Extraction and Summarization. http://www.darpa.mil/ito/research/tides/
3