Mining citizen science data to predict prevalence of wild bird species
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Philadelphia, PA, USA
POSTER SESSION: Industrial and government applications track posters
Pages: 909-915
Year of Publication: 2006
ISBN: 1-59593-339-5
Authors
Rich Caruana, Cornell University
Mohamed Elhawary, Cornell University
Art Munson, Cornell University
Mirek Riedewald, Cornell University
Daria Sorokina, Cornell University
Daniel Fink, Cornell Lab of Ornithology
Wesley M. Hochachka, Cornell Lab of Ornithology
Steve Kelling, Cornell Lab of Ornithology
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1150402.1150527

ABSTRACT

The Cornell Laboratory of Ornithology's mission is to interpret and conserve the earth's biological diversity through research, education, and citizen science focused on birds. Over the years, the Lab has accumulated one of the largest and longest-running collections of environmental data sets in existence. The data sets are not only large, but also have many attributes, contain many missing values, and potentially are very noisy. The ecologists are interested in identifying which features have the strongest effect on the distribution and abundance of bird species as well as describing the forms of these relationships. We show how data mining can be successfully applied, enabling the ecologists to discover unanticipated relationships. We compare a variety of methods for measuring attribute importance with respect to the probability of a bird being observed at a feeder and present initial results for the impact of important attributes on bird prevalence.

