Chinese Journal of Computers
  Title: A Small Close-Coupled Fast Shared Data Pool for Multi-Core DSPs
  Authors: CHEN Shu-Ming, WANG Dong, CHEN Xiao-Wen, WAN Jiang-Hua
  Address: Microelectronics Institute, School of Computer, National University of Defense Technology, Changsha 410073
  Year: 2008
  Issue: No. 10 (1737-1744)
  Abstract & Background
Abstract: Improving the performance of Multi-Core Digital Signal Processors (MC-DSPs) for embedded applications requires higher memory bandwidth and more flexible memory structures. This paper proposes a new shared Scratch-Pad Memory (SPM) structure for MC-DSPs, the Fast Shared Data Pool (FSDP). FSDP sits at the same level of the memory hierarchy as the L1 cache and can be accessed directly by LOAD/STORE instructions. It is organized as a parallel multi-bank structure with an interleaved access strategy and an automatic synchronization scheme based on hardware semaphores ("signal-lamps"), and it supports high-speed parallel access and fast exchange of data words. FSDP is a close-coupled shared-memory structure that takes only four cycles to transmit a word between any two cores. The authors build a behavioral simulator of FSDP and implement it at the RTL level. Simulation with several typical benchmarks shows that FSDP is well suited to transmitting fine-grained shared data in MC-DSPs. It achieves computation speedups of 1.1 and 1.14 over traditional shared L2 caches and DMA units, respectively.
Keywords: SPM; shared memory; multi-core DSP; release consistency
Background: This paper addresses memory optimization for Multi-Core Digital Signal Processors (MC-DSPs), which are emerging embedded multi-core SoCs for high-performance multimedia and wireless applications. The performance of MC-DSPs is often limited by memory bandwidth and by the long access latency caused by the long data path between CPU cores and memories. Scratch-pad memory (SPM) is a small, high-speed on-chip memory mapped into the global address space, and it has proven to be an effective supplement to conventional cache hierarchies. Related work on SPMs mainly deals with memory space partitioning and management from a software optimization perspective, and mostly targets single-core processors. Exploration of SPM organization in multi-core processors with shared memories is still rare.
This paper proposes a new shared SPM structure for MC-DSPs, the Fast Shared Data Pool (FSDP). FSDP is organized as a parallel multi-bank structure with an interleaved access strategy and an automatic synchronization scheme based on hardware semaphores ("signal-lamps"). FSDP is a close-coupled shared-memory structure that takes only four cycles to transmit a word between any two cores. Simulation with typical benchmarks shows that FSDP is well suited to transmitting fine-grained shared data in MC-DSPs. It offers speedups of 1.1 and 1.14 over shared data transmission through traditional shared L2 caches and DMA units, respectively.
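The interleaved multi-bank organization mentioned above can be illustrated with a minimal sketch. This is not the authors' design: the bank count (`NUM_BANKS = 4`), the word-granularity low-order interleaving, and the helper names `bank_of` and `has_bank_conflict` are assumptions chosen for illustration only. The 32-bit word size follows the project's 32-bit DSP cores.

```python
NUM_BANKS = 4    # assumption: number of FSDP banks is not given in the abstract
WORD_BYTES = 4   # 32-bit words, matching the 32-bit DSP cores


def bank_of(byte_addr):
    """Map a byte address to (bank index, row within bank).

    Assumes low-order interleaving: consecutive words go to consecutive banks.
    """
    word = byte_addr // WORD_BYTES
    return word % NUM_BANKS, word // NUM_BANKS


def has_bank_conflict(byte_addrs):
    """Return True if two accesses issued in the same cycle hit the same bank."""
    banks = [bank_of(a)[0] for a in byte_addrs]
    return len(set(banks)) != len(banks)


if __name__ == "__main__":
    # Four cores reading four consecutive words: each access lands in its own bank.
    print(has_bank_conflict([0, 4, 8, 12]))
    # Two accesses exactly NUM_BANKS words apart: both map to bank 0.
    print(has_bank_conflict([0, 16]))
```

Under these assumptions, accesses to consecutive words from different cores can proceed in parallel, while accesses whose word indices differ by a multiple of the bank count would have to be serialized.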
This paper comes from a project funded by the National High Tech Project of China (No. 2007AA01Z108) and the National Science Foundation of China (No. 60473079). The goal of the project is to construct an MC-DSP consisting of four 32-bit floating-point DSP cores. This kind of MC-DSP is intended for base stations in next-generation wireless communication and for high-speed video processing applications. Our team's main research results include the YHFT-DSP series of high-performance 32-bit DSPs.