A multidimensional scaling approach for representing XML documents
Source: ACM Southeast Regional Conference
Proceedings of the 45th Annual Southeast Regional Conference
Winston-Salem, North Carolina
Session: Papers
Pages: 111 - 115  
Year of Publication: 2007
ISBN:978-1-59593-629-5
Authors
Zhonghang Xia  Western Kentucky University, Bowling Green, KY
Gugangming Xing  Western Kentucky University, Bowling Green, KY
Qi Li  Western Kentucky University, Bowling Green, KY
Sponsor
SIGAPP: ACM Special Interest Group on Applied Computing
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1233341.1233362

ABSTRACT

Storing documents with similar structures together has been shown to reduce fragmentation and improve query efficiency. Unlike a flat text document, a Web document has no standard vectorial representation, which most existing classification algorithms require. In this paper, we propose a vectorization method for XML documents based on multidimensional scaling (MDS), so that Web documents can be fed into an existing classification algorithm. Classical MDS embeds data points into a Euclidean space if the similarity matrix constructed from the data points is semidefinite. The semidefiniteness condition, however, may not hold because of the inference technique used in practice. We therefore find the semidefinite matrix that is closest to the distance matrix in the Euclidean space. Building on recent developments in strongly semismooth matrix-valued functions, we solve the nearest semidefinite matrix problem with a Newton-type method. Experimental studies show that the classification accuracy can be improved.
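
To make the embedding step in the abstract concrete, the following is a minimal sketch of classical MDS on a small dissimilarity matrix, not the authors' implementation. It assumes the usual double-centering construction, and it replaces the paper's Newton-type nearest-semidefinite solver with plain eigenvalue clipping (negative eigenvalues set to zero), a much simpler stand-in. The function names `jacobi_eigh` and `classical_mds` and the toy matrices are hypothetical, chosen only for illustration.

```python
import math

def jacobi_eigh(a, sweeps=100, tol=1e-22):
    """Eigen-decompose a small symmetric matrix with cyclic Jacobi rotations.

    Returns (eigenvalues, v); column k of v is the eigenvector for eigenvalue k.
    """
    n = len(a)
    a = [row[:] for row in a]  # work on a copy
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                d = a[q][q] - a[p][p]
                # Rotation angle that zeroes the (p, q) entry.
                theta = math.pi / 4 if d == 0 else 0.5 * math.atan(2 * a[p][q] / d)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p, q of A and of V
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
                for k in range(n):  # update rows p, q of A
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
    return [a[i][i] for i in range(n)], v

def classical_mds(dist, dim=2):
    """Embed n items in `dim` dimensions from a symmetric dissimilarity matrix.

    Returns (coords, eigvals); eigvals are the raw (unclipped) eigenvalues of
    the double-centered matrix, so negative entries reveal a non-Euclidean input.
    """
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    total = sum(row) / n
    # Double centering: B = -1/2 * J * D^2 * J, with J = I - (1/n) * ones.
    b = [[-0.5 * (sq[i][j] - row[i] - row[j] + total) for j in range(n)]
         for i in range(n)]
    eigvals, v = jacobi_eigh(b)
    # Assumption: clip negative eigenvalues instead of the paper's Newton-type solver.
    top = sorted(range(n), key=lambda k: eigvals[k], reverse=True)[:dim]
    coords = [[v[i][k] * math.sqrt(max(eigvals[k], 0.0)) for k in top]
              for i in range(n)]
    return coords, eigvals

if __name__ == "__main__":
    # Toy dissimilarities that violate the triangle inequality (5 > 1 + 1).
    coords, eigvals = classical_mds([[0, 1, 5], [1, 0, 1], [5, 1, 0]], dim=2)
    print("raw eigenvalues:", eigvals)
    print("2-D vectors:", coords)
```

The returned coordinate rows are the kind of fixed-length vectors that a standard classifier could consume. For a genuinely Euclidean input the clipping step changes nothing; for inferred, non-Euclidean dissimilarities it discards the negative part of the spectrum, which is where a more careful nearest-matrix method such as the one the abstract describes would differ.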


