A low cost, multithreaded processing-in-memory system
Source: ACM International Conference Proceeding Series; Vol. 68
Proceedings of the 3rd Workshop on Memory Performance Issues: in conjunction with the 31st International Symposium on Computer Architecture
Munich, Germany
Pages: 16-22
Year of Publication: 2004
ISBN: 1-59593-040-X
Authors
Jay B. Brockman  University of Notre Dame, Notre Dame, IN
Shyamkumar Thoziyoor  University of Notre Dame, Notre Dame, IN
Shannon K. Kuntz  University of Notre Dame, Notre Dame, IN
Peter M. Kogge  University of Notre Dame, Notre Dame, IN
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1054943.1054946

ABSTRACT

This paper discusses die cost vs. performance tradeoffs for a processing-in-memory (PIM) system that could serve as the memory system of a host processor. For an increase of less than twice the cost of a commodity DRAM part, it is possible to realize a performance speedup of nearly a factor of 4 on irregular applications. This cost efficiency derives from developing a custom multithreaded processor architecture and implementation style that is well suited for embedding in a memory. Specifically, it takes advantage of the memory's low latency and high row bandwidth both to simplify processor design, which reduces area, and to improve processing throughput. To support our claims of cost and performance, we have used simulation and analysis of existing chips, and we have designed and fully implemented a prototype chip, PIM Lite.



