ABSTRACT
MapReduce is a programming model and an associated implementation for processing and generating large datasets that is amenable to a broad variety of real-world tasks. Users specify the computation in terms of a map and a reduce function, and the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks. Programmers find the system easy to use: more than ten thousand distinct MapReduce programs have been implemented internally at Google over the past four years, and an average of one hundred thousand MapReduce jobs are executed on Google's clusters every day, processing a total of more than twenty petabytes of data per day.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Hadoop: Open source implementation of MapReduce. http://lucene. apache.org/hadoop/.
|
| |
2
|
The Phoenix system for MapReduce programming. http://csl.stanford. edu/~christos/sw/phoenix/.
|
 |
3
|
Andrea C. Arpaci-Dusseau , Remzi H. Arpaci-Dusseau , David E. Culler , Joseph M. Hellerstein , David A. Patterson, High-performance sorting on networks of workstations, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.243-254, May 11-15, 1997, Tucson, Arizona, United States
|
| |
4
|
|
| |
5
|
John Bent , Douglas Thain , Andrea C. Arpaci-Dusseau , Remzi H. Arpaci-Dusseau , Miron Livny, Explicit control a batch-aware distributed file system, Proceedings of the 1st conference on Symposium on Networked Systems Design and Implementation, p.27-27, March 29-31, 2004, San Francisco, California
|
| |
6
|
|
| |
7
|
Chu, C.-T., Kim, S. K., Lin, Y. A., Yu, Y., Bradski, G., Ng, A., and Olukotun, K. 2006. Map-Reduce for machine learning on multicore. In Proceedings of Neural Information Processing Systems Conference (NIPS). Vancouver, Canada.
|
| |
8
|
|
 |
9
|
Armando Fox , Steven D. Gribble , Yatin Chawathe , Eric A. Brewer , Paul Gauthier, Cluster-based scalable network services, Proceedings of the sixteenth ACM symposium on Operating systems principles, p.78-91, October 05-08, 1997, Saint Malo, France
|
 |
10
|
|
| |
11
|
|
| |
12
|
Gray, J. Sort benchmark home page. http://research.microsoft.com/barc/SortBenchmark/.
|
| |
13
|
Larry Huston , Rahul Sukthankar , Rajiv Wickremesinghe , M. Satyanarayanan , Gregory R. Ganger , Erik Riedel , Anastassia Ailamaki, Diamond: A Storage Architecture for Early Discard in Interactive Search, Proceedings of the 3rd USENIX Conference on File and Storage Technologies, March 31-31, 2004, San Francisco, CA
|
 |
14
|
|
 |
15
|
|
| |
16
|
|
| |
17
|
|
CITED BY 32
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Tiankai Tu , Charles A. Rendleman , David W. Borhani , Ron O. Dror , Justin Gullingsrud , Morten Ø. Jensen , John L. Klepeis , Paul Maragakis , Patrick Miller , Kate A. Stafford , David E. Shaw, A scalable parallel framework for analyzing terascale molecular dynamics simulation trajectories, Proceedings of the 2008 ACM/IEEE conference on Supercomputing, November 15-21, 2008, Austin, Texas
|
|
|
|
|
|
Bingsheng He , Wenbin Fang , Qiong Luo , Naga K. Govindaraju , Tuyong Wang, Mars: a MapReduce framework on graphics processors, Proceedings of the 17th international conference on Parallel architectures and compilation techniques, October 25-29, 2008, Toronto, Ontario, Canada
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Abdallah Deeb I. Al Zain , Kevin Hammond , Jost Berthold , Phil Trinder , Greg Michaelson , Mustafa Aswad, Low-pain, high-gain multicore programming in Haskell: coordinating irregular symbolic computations on multicore architectures, Proceedings of the 4th workshop on Declarative aspects of multicore programming, January 20-20, 2009, Savannah, GA, USA
|
|
|
Kensuke Onuma , Hanghang Tong , Christos Faloutsos, TANGENT: a novel, 'Surprise me', recommendation algorithm, Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, June 28-July 01, 2009, Paris, France
|
|
|
|
|
|
Anastasios Gounaris , Jim Smith , Norman W. Paton , Rizos Sakellariou , Alvaro A. Fernandes , Paul Watson, Adaptive workload allocation in query processing in autonomous heterogeneous environments, Distributed and Parallel Databases, v.25 n.3, p.125-164, June 2009
|
|
|
|
|
|
Wen-Yen Chen , Jon-Chyuan Chu , Junyi Luan , Hongjie Bai , Yi Wang , Edward Y. Chang, Collaborative filtering for orkut communities: discovery of user latent behavior, Proceedings of the 18th international conference on World wide web, April 20-24, 2009, Madrid, Spain
|
|
|
|
|
|
Padmanabhan S. Pillai , Lily B. Mummert , Steven W. Schlosser , Rahul Sukthankar , Casey J. Helfrich, SLIPstream: scalable low-latency interactive perception on streaming data, Proceedings of the 18th international workshop on Network and operating systems support for digital audio and video, June 03-05, 2009, Williamsburg, VA, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
REVIEW
"Chris A Mattmann : Reviewer"
Google has revolutionized the way that large-scale data management is engineered and deployed, and evolves over time. In particular, it developed novel methods for file-based data management for rapid indexing and searching of Web pages (PageRank
more...
|