| Architectural support for SWAR text processing with parallel bit streams: the inductive doubling principle |
| Full text |
Pdf
(515 KB)
|
Source
|
Architectural Support for Programming Languages and Operating Systems
archive
Proceeding of the 14th international conference on Architectural support for programming languages and operating systems
table of contents
Washington, DC, USA
SESSION: Architectures
table of contents
Pages 337-348
Year of Publication: 2009
ISBN:978-1-60558-406-5
Also published in ...
|
|
Authors
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 15, Downloads (12 Months): 113, Citation Count: 0
|
|
|
ABSTRACT
Parallel bit stream algorithms exploit the SWAR (SIMD within a register) capabilities of commodity processors in high-performance text processing applications such as UTF-8 to UTF-16 transcoding, XML parsing, string search and regular expression matching. Direct architectural support for these algorithms in future SWAR instruction sets could further increase performance as well as simplifying the programming task. A set of simple SWAR instruction set extensions are proposed for this purpose based on the principle of systematic support for inductive doubling as an algorithmic technique. These extensions are shown to significantly reduce instruction count in core parallel bit stream algorithms, often providing a 3X or better improvement. The extensions are also shown to be useful for SWAR programming in other application areas, including providing a systematic treatment for horizontal operations. An implementation model for these extensions involves relatively simple circuitry added to the operand fetch components in a pipelined processor.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Krste Asanovic, Ras Bodik, Bryan Christopher Catanzaro, Joseph James Gebis, Parry Husbands, Kurt Keutzer, David A. Patterson, William Lester Plishker, John Shalf, Samuel Webb Williams, and Katherine A. Yelick. The landscape of parallel computing research: A view from berkeley. Technical Report UCB/EECS-2006-183, EECS Department, University of California, Berkeley, Dec 2006.
|
| |
2
|
Robert D. Cameron. u8u16 -- a high-speed UTF-8 to UTF-16 transcoder using parallel bit streams. Technical Report TR 2007-18, Simon Fraser University, Burnaby, BC, Canada, 2007.
|
 |
3
|
|
 |
4
|
|
| |
5
|
James R. Green, Hanan Mahmoud, and Michel Dumontier. Towards real time protein identification using the Cell BE. In Workshop on Cell BE and Heterogeneous Multicore Systems: Architectures and Applications at CASCON '08, Toronto, Ontario, Canada, 2008.
|
| |
6
|
Kenneth S. Herdy, David S. Burggraf, and Robert D. Cameron. High performance GML to SVG transformation for the visual presentation of geographic data in web-based mapping systems. In Proceedings of SVG Open 2008, Nuremburg, Germany, 2008.
|
| |
7
|
Donald E. Knuth. The Art of Computer Programming Volume 4 Pre-Fascicle 1A: Bitwise Tricks and Techniques. Draft of 22 December 2008, Stanford University.
|
| |
8
|
|
| |
9
|
Chester Rebeiro, A. David Selvakumar, and A. S. L. Devi. Bitslice implementation of AES. In David Pointcheval, Yi Mu, and Kefei Chen, editors, CANS, volume 4301 of Lecture Notes in Computer Science, pages 203--212. Springer, 2006.
|
| |
10
|
Leo Stocco and Günther Schrack. Integer dilation and contraction for quadtrees and octrees. In Proceedings of the IEEE Pacific Rim Conference on Communications, Computers, and Signal Processing, pages 426--428, Victoria, B.C., 1995.
|
| |
11
|
|
|