ACM Home Page
Please provide us with feedback. Feedback
Data mining with sparse grids using simplicial basis functions
Full text PdfPdf (809 KB)
Source International Conference on Knowledge Discovery and Data Mining archive
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining table of contents
San Francisco, California
Pages: 87 - 96  
Year of Publication: 2001
ISBN:1-58113-391-X
Authors
Jochen Garcke  Universität Bonn, Wegelerstraβe 6, D-53115 Bonn, Germany
Michael Griebel  Universität Bonn, Wegelerstraβe 6, D-53115 Bonn, Germany
Sponsors
SIGMOD: ACM Special Interest Group on Management of Data
AAAI : American Association for Artificial Intelligence
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 6,   Downloads (12 Months): 56,   Citation Count: 4
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/502512.502528
What is a DOI?

ABSTRACT

Recently we presented a new approach [18] to the classification problem arising in data mining. It is based on the regularization network approach but, in contrast to other methods which employ ansatz functions associated to data points, we use a grid in the usually high-dimensional feature space for the minimization process. To cope with the curse of dimensionality, we employ sparse grids [49]. Thus, only O(hn-1nd-1) instead of O(hn-d) grid points and unknowns are involved. Here d denotes the dimension of the feature space and hn = 2-n gives the mesh size. We use the sparse grid combination technique [28] where the classification problem is discretized and solved on a sequence of conventional grids with uniform mesh sizes in each dimension. The sparse grid solution is then obtained by linear combination. In contrast to our former work, where d-linear functions were used, we now apply linear basis functions based on a simplicial discretization. This allows to handle more dimensions and the algorithm needs less operations per data point.We describe the sparse grid combination technique for the classification problem, give implementational details and discuss the complexity of the algorithm. It turns out that the method scales linearly with the number of given data points. Finally we report on the quality of the classifier built by our new method on data sets with up to 10 dimensions. It turns out that our new method achieves correctness rates which are competitive to that of the best existing methods.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
R. Balder. Adaptive Verfahren fiat elliptische und parabolische Differentialgleichungen auf diinnen Gittern. Dissertation, Technische Universitait Miinchen, 1994.
 
3
G. Baszenski. N -t h order polynomial spline blending. In W. Schempp and K. Zeller, editors, Multivariate Approximation Theory III, ISNM 75, pages 35-46. Birkhaiuser, Basel, 1985.
 
4
S. D. Bay. The UCI KDD archive. http://kdd.ics.uci.edu, 1999.
 
5
M. J. A. Berry and G. S. Linoff. Mastering Data Mining. Wiley, 2000.
 
6
C. L. Blake and C. J. Merz. UCI repository of machine learning databases, 1998. http://www.ics.uci.edu/amlearn/MLRepository.html.
 
7
D. Botkin, J. Janak, and J. Wallis. Some ecological consequences of a computer model of forest growth. J. Ecology, 60:849-872, 1972.
 
8
H.-J. Bungartz. Diinne Gitter und deren Anwendun 9 bei der adaptiven LSsung der dreidimensionalen Poisson-Gleichung. Dissertation, Institut ffir Informatik, Technische Universitait Miinchen, 1992.
 
9
H.-J. Bungartz, T. Dornseifer, and C. Zenger. Tensor product approximation spaces for the efficient numerical solution of partial differential equations. In Proc. Int. Workshop on Scientific Computations, Konya, 1996. Nova Science Publishers, 1997.
 
10
H.-J. Bungartz, M. Griebel, D. RSschke, and C. Zenger. Pointwise convergence of the combination technique for the Laplace equation. East-West J. Numer. Math., 2:21-45, 1994. also as SFB-Bericht 342/16/93A, Institut fiir Informatik, TU Miinchen, 1993.
 
11
 
12
 
13
 
14
 
15
H. Freudenthal. Simplizialzerlegungen yon beschra.nkter Flachheit. Annals of Mathematics, 43:580-582, 1942.
 
16
 
17
J. Garcke and M. Griebel. On the parallelization of the sparse grid approach for data mining. SFB 256 Preprint 721, Universitgt Bonn, 2001. http://wissrech.iam.unibonn.de/research/pub/garcke/psm.pdf.
 
18
J. Garcke, M. Griebel, and M. Thess. Data mining with sparse grids. 2000. Submitted, also as SFB 256 Preprint 675, Institut fiir Angewandte Mathematik, Universitait Bonn, 2000.
 
19
T. Gerstner and M. Griebel. Numerical Integration using Sparse Grids. Numer. Algorithms, 18:209-232, 1998. (also as SFB 256 preprint 553, Univ. Bonn, 1998).
 
20
 
21
 
22
G. Golub, M. Heath, and G. Wahba. Generalized cross validation as a method for choosing a good ridge parameter. Technometrics, 21:215-224, 1979.
 
23
M. Griebel. The combination technique for the sparse grid solution of PDEs on multiprocessor machines. Parallel Processing Letters, 2(1):61-70, 1992. also as SFB Bericht 342/14/91 A, Institut fiir Informatik, TU Mfinchen, 1991.
 
24
 
25
 
26
M. Griebel and S. Knapek. Optimized tensor-product approximation spaces. Constructive Approximation, 16(4):525-540, 2000.
 
27
M. Griebel, P. Oswald, and T. Schiekofer. Sparse grids for boundary integral equations. Numer. Mathematik, 83(2):279-312, 1999. also as SFB 256 report 554, Universitat Bonn.
 
28
M. Griebel, M. Schneider, and C. Zenger. A combination technique for the solution of sparse grid problems. In P. de Groen and R. Beauwens, editors, Iterative Methods in Linear Algebra, pages 263-281. IMACS, Elsevier, North Holland, 1992. also as SFB Bericht, 342/19/90 A, Institut fiir Informatik, TU Miinchen, 1990.
 
29
M. Griebel and V. Thurner. The efficient solution of fluid dynamics problems by the combination technique. Int. J. Num. Meth. for Heat and Fluid Flow, 5(3):251-269, 1995. also as SFB Bericht 342/1/93 A, Institut fiir Informatik. TU Miinchen, 1993.
 
30
M. Hegland, O. M. Nielsen, and Z. Shen. High dimensional smoothing based on multilevel analysis. Technical report, Data Mining Group, The Australian National University, Canberra, November 2000. Submitted to SIAM J. Scientific Computing.
 
31
J. Hoschek and D. Lasser. Grundlagen der goemetrischen Datenverarbeitung, chapter 9. Teubner, 1992.
 
32
H. W. Kuhn. Some combinatorial lemmas in topology. IBM J. Res. Develop., 4:518-524, 1960.
 
33
 
34
G. Melli. Datgen: A program that creates structured data. Website. http://www.datasetgenerator.com.
 
35
 
36
B. D. Ripley. Neural networks and related methods for classification. Journal of the Royal Statistical Society B, 56(3):409-456, 1994.
 
37
 
38
T. Schiekofer. Die Methode der Finiten Differenzen auf diinnen Gittern zur L5sung elliptischer und parabolischer partieller Differentialgleichungen. Doktorarbeit, Institut fiir Angewandte Mathematik, Universitait Bonn, 1999.
 
39
W. Sickel and F. Sprengel. Interpolation on sparse grids and Nikol'skij-Besov spaces of dominating mixed smoothness. J. Comput. Anal. Appl., 1:263-288, 1999.
 
40
 
41
S. A. Smolyak. Quadrature and interpolation formulas for tensor products of certain classes of functions. Dokl. Akad. Nauk SSSR, 148:1042-1043, 1963. Russian, Engl. Transl.: Soviet Math. Dokl. 4:240-243, 1963.
 
42
V. N. Temlyakov. Approximation of functions with bounded mixed derivative. Proc. Steklov Inst. Math,, 1, 1989.
 
43
A. N. Tikhonov and V. A. Arsenin. Solutios of ill-posed problems. W.H. Winston, Washington D.C., 1977.
 
44
F. Utreras. Cross-validation techniques for smoothing spline functions in one or two dimensions. In T. Gasser and M. Rosenblatt, editors, Smoothing techniques for curve estimation, pages 196-231. Springer-Verlag, Heidelberg, 1979.
 
45
 
46
 
47
G. Wahba. Spline models for observational data, volume 59 of Series in Applied Mathematics. SIAM, Philadelphia, 1990.
 
48
A. Wieland. Spiral data set. http://www.cs.cmu.edu/afs/cs.cmu.edu/project/airepository/ai/areas/neural/bench/cmu/0.html.
 
49
C. Zenger. Sparse grids. In W. Hackbusch, editor, Parallel Algorithms for Partial Differential Equations, Proceedings of the Sixth GAMM-Seminar, Kiel, 1990, volume 31 of Notes on Num. Fluid Mech. Vieweg-Vcrlag, 1991.


Collaborative Colleagues:
Jochen Garcke: colleagues
Michael Griebel: colleagues