Active storage revisited: the case for power and performance benefits for unstructured data processing applications
Source
Conference On Computing Frontiers
Proceedings of the 5th conference on Computing frontiers
Ischia, Italy
SESSION: Systems
Pages 293-304  
Year of Publication: 2008
ISBN: 978-1-60558-077-7
Authors
Clinton Wills Smullen, IV  University of Virginia, Charlottesville, VA, USA
Shahrukh Rohinton Tarapore  Lockheed Martin, Cherry Hill, NJ, USA
Sudhanva Gurumurthi  University of Virginia, Charlottesville, VA, USA
Parthasarathy Ranganathan  Hewlett Packard Labs, Palo Alto, CA, USA
Mustafa Uysal  Hewlett Packard Labs, Palo Alto, CA, USA
Sponsors
ACM: Association for Computing Machinery
SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1366230.1366280

ABSTRACT

The proliferation of digital data has resulted in a mushrooming of data-intensive applications, especially in the area of unstructured data processing. Given the growing popularity of unstructured data processing applications (e.g., Flickr™, Google Maps™), it is important to rethink system architectures to efficiently run these applications, from both the performance and power viewpoints. In this paper, we revisit active storage, which proposed offloading computation to disk drive processors, as a possible system architecture for these applications. Unlike previous work, we evaluate the microarchitectural aspects of active storage and perform an in-depth examination of the design of the offload processors. Using a set of unstructured data processing benchmarks, we examine two choices along the I/O path where the computation can be offloaded in existing system architectures -- a disk drive processor and a disk array controller. Our evaluation demonstrates that there are interesting tradeoffs in the choice of each location and that microarchitectural enhancements to these processors can provide significant performance boosts. We show that active storage architectures can provide large power savings, by using lower-power processors along the I/O path, while exploiting the data-level parallelism on the storage side of the system.

