ABSTRACT
Multivariate time series (MTS) datasets are common in various multimedia, medical and financial applications. We propose a similarity measure for MTS datasets, <i>Eros</i> (<i>E</i>xtended F<i>ro</i>beniu<i>s</i> norm), which is based on Principal Component Analysis (PCA). <i>Eros</i> applies PCA to MTS datasets represented as matrices to generate principal components and associated eigenvalues. These principal components and eigenvalues are then used to measure the similarity between MTS matrices. Though <i>Eros</i> in itself does not satisfy the triangle inequality, without which existing multidimensional indexing structures may not be utilized, we obtain lower and upper bounds that do satisfy it. To show the validity of <i>Eros</i> for similarity search on MTS datasets, we performed several experiments on three datasets (2 real-world and 1 synthetic). The results show that our approach outperforms traditional similarity measures for MTS datasets, such as Euclidean Distance (ED), Dynamic Time Warping (DTW), Weighted Sum SVD (WSSVD) and PCA similarity factor (S<sc>PCA</sc>), in terms of precision/recall.
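The abstract does not state the formula behind the measure, so the following is only a minimal sketch of the general idea, under stated assumptions: each MTS is a matrix with one row per time step and one column per variable, the principal components come from the covariance matrix, and the similarity is taken to be a weighted sum of absolute cosines between corresponding principal components, with weights derived from the eigenvalues. The function names (`principal_components`, `eigenvalue_weights`, `eros_like_similarity`) and the mean-based weighting are hypothetical choices for illustration, not necessarily the authors' definitions.

```python
import numpy as np


def principal_components(mts):
    """Return (eigenvalues, components) for one MTS matrix.

    mts has shape (time_steps, variables). The components are the rows
    of the returned matrix, ordered by decreasing eigenvalue.
    """
    cov = np.atleast_2d(np.cov(mts, rowvar=False))
    # For a symmetric positive semi-definite matrix, the singular values
    # equal the eigenvalues and arrive sorted in descending order.
    _, eigvals, vt = np.linalg.svd(cov)
    return eigvals, vt


def eigenvalue_weights(eigvals_list):
    """Assumed weighting: average the normalized eigenvalues over the
    dataset, then rescale so the weights sum to 1."""
    normalized = [ev / ev.sum() for ev in eigvals_list]
    w = np.mean(normalized, axis=0)
    return w / w.sum()


def eros_like_similarity(a, b, weights):
    """Weighted sum of |cosine| between corresponding principal components.

    Both inputs need the same number of variables, but may differ in
    length, because only the covariance matrices are compared.
    """
    _, va = principal_components(a)
    _, vb = principal_components(b)
    # Rows are unit vectors, so the row-wise dot product is the cosine.
    cosines = np.abs(np.sum(va * vb, axis=1))
    return float(np.dot(weights, cosines))
```

A short usage example on random data with three variables and different lengths:

```python
rng = np.random.default_rng(0)
a = rng.normal(size=(50, 3))
b = rng.normal(size=(80, 3))
w = eigenvalue_weights([principal_components(m)[0] for m in (a, b)])
print(eros_like_similarity(a, b, w))
```

Taking the absolute value of each cosine makes the score insensitive to the arbitrary sign of an eigenvector, and because the weights sum to 1 the score stays between 0 and 1.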