Finding regional co-location patterns for sets of continuous variables in spatial datasets
Source
Proceedings of the 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
Irvine, California
Session: OLAP and co-location mining
Article No. 30
Year of Publication: 2008
ISBN: 978-1-60558-323-5
Authors
Christoph F. Eick  University of Houston, Houston, TX
Rachana Parmar  University of Houston, Houston, TX
Wei Ding  University of Massachusetts, Boston, MA
Tomasz F. Stepinski  Lunar and Planetary Institute, Houston, TX
Jean-Philippe Nicot  University of Texas at Austin, Austin, TX
Sponsors
Google
Oak Ridge National Laboratory
ESRI
Microsoft
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1463434.1463472

ABSTRACT

This paper proposes a novel framework for mining regional co-location patterns with respect to sets of continuous variables in spatial datasets. The goal is to identify regions in which multiple continuous variables with values from the wings of their statistical distributions are co-located. The framework operates in the continuous domain and views regional co-location mining as a clustering problem in which an externally given fitness function has to be maximized. Interestingness of co-location patterns is assessed using products of z-scores of the relevant continuous variables. The proposed framework is evaluated by a domain expert in a case study that analyzes arsenic contamination in Texas water wells, centering on regional co-location patterns. Our approach is able to identify known and unknown regional co-location patterns, and different sets of algorithm parameters lead to the characterization of arsenic distribution at different scales. Moreover, inconsistent co-location sets are found for regions in South Texas and West Texas that can be clearly attributed to geological differences between the two regions, emphasizing the need for regional co-location mining techniques. In addition, a novel prototype-based region discovery algorithm named CLEVER is introduced that uses randomized hill climbing and searches over a variable number of clusters and larger neighborhood sizes.
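
The idea of scoring a region by products of z-scores can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything below is an assumption for illustration only: the function names, the clipping of z-scores to one wing of the distribution, and the averaging over region members are hypothetical choices, not the paper's definition.

```python
import numpy as np

def z_scores(values):
    """Standardize each column against the whole dataset.

    Assumes no column is constant (a zero standard deviation
    would cause a division by zero)."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean(axis=0)) / values.std(axis=0)

def region_interestingness(z, member_idx, directions):
    """Hypothetical score: mean over region members of the product
    of wing-clipped z-scores.

    directions holds +1 for a variable expected in the upper wing
    and -1 for one expected in the lower wing."""
    signed = z[member_idx] * np.asarray(directions, dtype=float)
    # Keep only the part of each z-score that lies in the requested wing.
    clipped = np.clip(signed, 0.0, None)
    return float(clipped.prod(axis=1).mean())

# Toy data: two variables measured at four locations.
data = [[1.0, 1.0], [1.0, 1.0], [3.0, 3.0], [3.0, 3.0]]
z = z_scores(data)
high_high = region_interestingness(z, [2, 3], (+1, +1))
low_low = region_interestingness(z, [0, 1], (-1, -1))
mismatch = region_interestingness(z, [0, 1], (+1, +1))
```

In this toy setting, a region whose members are jointly high (or jointly low) in both variables receives a positive score, while a region tested against the wrong wing scores zero. A clustering algorithm in the spirit of the framework would then search for a set of regions that maximizes a fitness function built from such scores.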


