ACM Home Page
Please provide us with feedback. Feedback
Sequential reservoir sampling with a nonuniform distribution
Full text PdfPdf (296 KB)
Source ACM Transactions on Mathematical Software (TOMS) archive
Volume 32 ,  Issue 2  (June 2006) table of contents
Pages: 257 - 273  
Year of Publication: 2006
ISSN:0098-3500
Authors
M. Kolonko  Technical University of Clausthal, Clausthal-Zellerfeld, Germany
D. Wäsch  Technical University of Clausthal, Clausthal-Zellerfeld, Germany
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 11,   Downloads (12 Months): 42,   Citation Count: 1
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1141885.1141891
What is a DOI?

ABSTRACT

We present a simple algorithm that allows sampling from a stream of data items without knowing the number of items in advance and without having to store all items in main memory. The sampling distribution may be general, that is, the probability of selecting a data item i may depend on the individual item. The main advantage of the algorithms is that they have to pass through the data items only once to produce a sample of arbitrary size n.We give different variants of the algorithm for sampling with and without replacement and analyze their complexity. We generalize earlier results of Knuth on reservoir sampling with a uniform sampling distribution. The general distribution considered here allows us to sample an item with a probability equal to the relative weight (or fitness) of the data item within the whole set of items. Applications include heuristic optimization procedures such as genetic algorithms where solutions are sampled from a population with probability proportional to their fitness.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
Devroye, L. 1986. Nonuniform Random Variate Generation. Springer-Verlag, New York, NY.
 
2
 
3
4
 
5
Reeves, C. 2003. Genetic algorithms. In Handbook of Metaheuristics, F. Glover and G. A. Kochenberger, Eds. Kluwer Academic Publishers, Boston, MA.
6