

# DSPACE Project Introduction and Core Architecture

#### ESA DSP Day, ESA/ESTEC, Noordwijk, August 28<sup>th</sup>, 2012

*Walter Errico, SITAEL S.p.A. Phone: +39 050 9912116. E-mail: walter.errico@sitael.com. URL: http://www.sitael.com*








#### Outline

- Project Overview
- Design Approach
- Core Architecture
- Conclusion















#### Requirements

Higher DSP performance required for feasibility of new space missions:

- Earth Observation missions (e.g. processing of IR sounder and SAR instruments data)
- Science and Robotic Exploration missions (e.g. optical navigation for descent and landing; camera image compression)
- Telecom applications

European strategic non-dependence for critical technologies

Key requirements for the next-generation space DSP:

- Processing power $\ge$ 1 GFLOPS
- Radiation hardness (TID > 100 krad) and protected memories (EDAC)
- Space-standard I/O interfaces
- High-quality SW development









# **Objectives**

Synthesizable DSP VHDL core for space applications:

- TMR
- EDAC
- Synchronous/asynchronous reset
- Optional low-power support with clock gating
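
As an illustration of the TMR item in the list above, the following Python sketch models a bitwise 2-out-of-3 majority voter over three copies of a register value. It is a generic behavioural model, not the DSPACE VHDL; the function name `tmr_vote` and the 32-bit width are assumptions made for this example.

```python
def tmr_vote(a: int, b: int, c: int, width: int = 32) -> int:
    """Bitwise 2-out-of-3 majority vote over three register copies.

    Hypothetical helper for illustration; not part of the DSPACE design.
    """
    mask = (1 << width) - 1
    # A result bit is 1 when at least two of the three copies hold a 1.
    return ((a & b) | (a & c) | (b & c)) & mask


# A single-event upset flips one bit in one copy; the vote masks it.
golden = 0xDEADBEEF
upset = golden ^ (1 << 7)
voted = tmr_vote(golden, upset, golden)
```

The voter masks any fault confined to a single copy, which is why TMR is normally combined with periodic refresh so that upsets do not accumulate in two copies.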

Software Development Environment including:

- Optimized 'C' compiler chain
- Instruction Level Simulator
- SW debugger

Benchmark-based core validation















## **Design approach**



1. Exploitation of the LISA methodology to automatically generate part of the SDK (assembler, linker, Instruction Level Simulator, ...).
2. Reuse of existing compilation chain tools, combined with a low-level glue SW layer and a code optimizer.







## **DSPACE** activities

- HW design
  - Architecture definition
  - Benchmarks
  - VHDL Core finalization
  - Demo Board
- SW design
  - Glue Layer
  - Code Optimizer
  - Benchmarks

*Diagram labels: Software Development Environment; Reuse; LISA; VHDL, ILS, Assembler; Glue Layer & Code Optimizer; Open-Source DSP optimized 'C' compiler; Ground DSP; Space DSP Core.*





## **Benchmarks**

- Adoption of the ESA DSP SW benchmarks specification: "Next Generation Space Digital Signal Processor Software Benchmark, TEC-EDP/2008.18/RT, December 2008"
- Application-oriented benchmarks
- Main kernel algorithms:
  - FIR filters, FFTs
  - Standard CCSDS Lossless Data Compression, Image Data Compression
- Benchmark code development in C and Assembly (for filters and FFTs)
- Execution for performance measurement on:
  - Cycle Accurate Simulator
  - FPGA demonstrator board
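
To make the FIR kernel named among the main kernel algorithms concrete, here is a minimal direct-form FIR filter in plain Python. It is only a reference sketch of the algorithm, not the ESA benchmark code; the function name `fir_filter` and the moving-average coefficients are assumptions chosen for the example.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR: y[n] = sum_k coeffs[k] * x[n - k], with x[n] = 0 for n < 0."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(coeffs):
            if n - k >= 0:
                acc += h * samples[n - k]  # one multiply-accumulate per tap
        out.append(acc)
    return out


# 4-tap moving average applied to a unit step (illustrative data).
step_response = fir_filter([1.0] * 6, [0.25] * 4)
```

The inner loop is one multiply-accumulate per tap, which is the operation a DSP benchmark of this kind is meant to stress.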














#### **DSPACE** Architecture









## **Building Blocks**

- VLIW 8-slot data processing unit
- Instruction Cache, 32 Kbyte + EDAC
- Data Cache, 64 Kbyte + EDAC
- DMA
- Memory controller for 72-bit ECC DDR2
- SpW-RMAP communication modules
- AHB/APB bus
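
The slides do not say which error-correcting code protects the 72-bit DDR2 interface. As a hedged illustration, the sketch below assumes the common split of 64 data bits plus 8 check bits and implements a generic extended Hamming SEC-DED code (single-error correction, double-error detection). The bit layout and the names `encode` and `decode` are assumptions for this example only.

```python
# Assumed layout: bit 0 = overall parity, bits 1..71 = Hamming code,
# with check bits at the power-of-two indices and data bits elsewhere.
DATA_POSITIONS = [p for p in range(1, 72) if p & (p - 1)]  # 64 slots
PARITY_POSITIONS = [1, 2, 4, 8, 16, 32, 64]


def _syndrome(word):
    """XOR of the indices (1..71) of all set bits."""
    s = 0
    for p in range(1, 72):
        if (word >> p) & 1:
            s ^= p
    return s


def encode(data):
    """Encode a 64-bit integer into a 72-bit SEC-DED codeword."""
    word = 0
    for i, p in enumerate(DATA_POSITIONS):
        word |= ((data >> i) & 1) << p
    s = _syndrome(word)
    for p in PARITY_POSITIONS:
        if s & p:
            word |= 1 << p  # force the codeword syndrome to zero
    word |= bin(word).count("1") & 1  # bit 0 makes the total weight even
    return word


def decode(word):
    """Return (data, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    s = _syndrome(word)
    odd_weight = bin(word).count("1") & 1
    status = "ok"
    if odd_weight:
        # Odd weight: a single-bit error at index s (0 means the parity bit).
        if s > 71:
            return None, "uncorrectable"
        word ^= 1 << s
        status = "corrected"
    elif s:
        # Even weight but non-zero syndrome: double-bit error detected.
        return None, "uncorrectable"
    data = 0
    for i, p in enumerate(DATA_POSITIONS):
        data |= ((word >> p) & 1) << i
    return data, status


payload = 0x0123456789ABCDEF
codeword = encode(payload)
recovered, status = decode(codeword ^ (1 << 40))  # inject a single-bit upset
```

Any single flipped bit in the 72-bit word is corrected, while two flipped bits are reported as uncorrectable rather than silently miscorrected.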









## **Data Processing Unit**









#### **Three Different Functional Units**

| Functional Unit | Instruction Functionality |
|-----------------|---------------------------|
| FPMUL: Integer and single-precision FP Multiplier Unit | <ul> <li>32x32-bit multiply</li> <li>Single-precision floating point multiplication</li> <li>16x16-bit multiply</li> </ul> |
| FPALU: Integer and single-precision FP Arithmetic Logic Unit | <ul> <li>32-bit logical operations</li> <li>32-bit arithmetic operations</li> <li>32-bit compare operations</li> <li>32-bit bit-field operations</li> <li>32-bit shift</li> <li>Min/Max operations</li> <li>Bit reversal</li> <li>Trailing bit count</li> <li>Single-precision floating point compare</li> <li>Single-precision floating point arithmetic operations</li> <li>Min/Max floating point operations</li> <li>Integer / floating point conversion</li> <li>Branch</li> <li>Interrupt handling</li> <li>Control register transfers to/from register file</li> <li>Input / Output transfer to/from external peripherals (I/O Space)</li> </ul> |
| AGU: Integer Address Generation Unit | <ul> <li>Load and store bytes, half-words, words and double-words</li> <li>32-bit add and sub in linear or circular address calculation</li> <li>Load and store with 5-bit and 15-bit constant offset using 18 different addressing options (base + register with address post/pre modify)</li> </ul> |
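
The circular address calculation listed for the AGU can be illustrated with a short model of a post-modify access into a circular buffer. This is a generic sketch: the exact DSPACE semantics (registers involved, alignment constraints on the buffer) are not given in the slides, and `circular_post_modify` with its parameters is a hypothetical name used only here.

```python
def circular_post_modify(addr, step, base, length):
    """Model one post-modify access in a circular buffer [base, base + length).

    Returns (address used by this access, address for the next access).
    """
    next_addr = base + (addr - base + step) % length
    return addr, next_addr


# Walk a 4-word delay line starting in the middle of the buffer.
addr = 0x102
visited = []
for _ in range(5):
    used, addr = circular_post_modify(addr, 1, base=0x100, length=4)
    visited.append(used)
```

With this addressing mode a FIR delay line can be traversed without explicit wrap-around tests in the inner loop.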













# **DPU Pipeline**

- Simple, textbook-style HW architecture
  - Strictly pipelined VLIW structure
  - No dynamic instruction scheduling, no stall management
  - Delayed branch
  - Same latency for all instructions except LOAD/STORE/JMP
- HW specification frozen early to allow full SW/HW integration
- HW optimizations/enhancements are postponed:
  - Data forwarding (register file bypass)
  - Instruction multi-latency
  - Complex operands support
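
With no stall management in hardware, dependent instructions presumably have to be spaced apart by the tool chain. The sketch below illustrates that idea on a simplified single-issue, in-order model with one uniform latency. The function `schedule_with_nops`, the register names and the latency value of 3 cycles are hypothetical and are not taken from the DSPACE specification.

```python
def schedule_with_nops(program, latency):
    """Assign an issue cycle to each instruction of an in-order program.

    program: list of (dest_register, [source_registers]).
    A result is assumed readable `latency` cycles after its producer issues.
    Gaps between consecutive issue cycles correspond to NOPs.
    """
    ready = {}  # register -> first cycle at which its new value may be read
    cycle = 0
    issue_cycles = []
    for dest, sources in program:
        earliest = max([ready.get(r, 0) for r in sources] + [cycle])
        issue_cycles.append(earliest)
        ready[dest] = earliest + latency
        cycle = earliest + 1
    return issue_cycles


program = [
    ("r1", ["r0"]),  # r1 = f(r0)
    ("r2", ["r1"]),  # depends on r1, must wait for the producer
    ("r3", ["r0"]),  # independent of r1 and r2
]
issue = schedule_with_nops(program, latency=3)
```

In this model the second instruction cannot issue before cycle 3, so two NOP slots follow the first one; a real VLIW scheduler would try to fill such slots with independent work instead.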







## **Target Technologies and Performances**

- The DSPACE design will incorporate radiation-hardening-by-design techniques (EDAC protection of on-chip and off-chip memories, TMR on registers)
- Atmel ATC18RHA
  - currently available, space-qualified 180 nm ASIC technology
  - → expected DSP performance: 750 MFLOPS (peak) @ 125 MHz
- ST Microelectronics DSM65
  - new space-qualified 65 nm ASIC technology (NDA signed)
  - → expected DSP performance ≥ 1 GFLOPS
- Xilinx Virtex5-QV XQR5VFX130 FPGA
  - the only space-qualified FPGA compatible with DSPACE features
  - a larger Kintex7 XC7K160T/XC7K325T device has been selected for the demonstrator board
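
The quoted figures can be cross-checked with simple arithmetic. The sketch below derives the peak floating point operations per clock cycle implied by 750 MFLOPS at 125 MHz, and then the clock frequency that would be needed to reach 1 GFLOPS. The second step assumes that the per-cycle throughput is unchanged on the 65 nm target, which the slides do not state.

```python
# Figures quoted for the 180 nm target.
peak_mflops = 750.0
clock_mhz = 125.0

# Peak floating point operations completed per clock cycle.
flops_per_cycle = peak_mflops / clock_mhz

# Assumption: same per-cycle throughput on the 65 nm target.
target_mflops = 1000.0
required_clock_mhz = target_mflops / flops_per_cycle

print(f"{flops_per_cycle:.1f} FLOP/cycle, {required_clock_mhz:.1f} MHz for 1 GFLOPS")
```

Under that assumption the 1 GFLOPS goal corresponds to a clock of roughly 167 MHz.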







#### **DSPACE Demonstrator Board**

- Compact PCI 3U + mezzanine communication board
- Xilinx Kintex7 FPGA
- Usable in C-PCI crates or as a standalone desktop board (with case and power adapter)
- Will be used in test and validation activities

















## Conclusions

The DSPACE project aims to deliver the core for a 1 GFLOPS space DSP component, together with a complete and reliable SDE.

- The VHDL core is complete and under test.
- Simple 'C' programs and Assembly tests are running on the Instruction Level Simulator.

Schedule:

- Demo board for FPGA emulation: planned for Q4 2012
- Complete C compiler chain: Q1 2013
- HW/SW test and benchmarking: Q2 2013

An Evaluation Kit based on the ILS can be released to beta testers in Q1 2013.







# Thank you for your attention!



Walter Errico, Design Responsible

Phone: +39 050 9912116, e-mail: walter.errico@sitael.com

SITAEL S.p.A., Via Livornese 1019, 56122 Pisa - San Piero a Grado (PI), Italy, www.sitael.com

