MaPU

An Open Source DSP Ecosystem

Simulator

Our simulator is based on Gem5.

usage

The simulator binary is located at <install-dir>/simulator/apc/gem5.opt. After compiling the source code into an ELF file, a single command runs the simulation. For example, in example/1.Hello_MaPU/, the following command generates the ELF file:

/home/xiesl/sdk/apc/bin/clang -target mspu Hello_MaPU.c -g -o Hello_MaPU.elf

and the following command then runs the simulation:

gem5.opt <install-dir>/simulator/apc/system/se.py -c Hello_MaPU.elf

source code base

The original source code is the stable Gem5 version tagged stable_2012_06_28, which can be downloaded using the following commands:

hg clone http://repo.gem5.org/gem5
cd gem5; hg checkout -r f75ee4849

Adding a new target in Gem5

For a general description of how to add a new ISA to Gem5, please refer to this tutorial: Defining ISA. In our implementation, we modified the following files in Gem5 to add the MaPU CPU model:

  1. MaPUSim/APC/build_opts/MAPU
    TARGET_ISA = 'mapu'
    CPU_MODELS = 'InOrderCPU'
    PROTOCOL = 'MI_example'
    

    PROTOCOL selects the cache coherence protocol, which is not actually used in our implementation.

  2. MaPUSim/APC/utils/compile

Added the mapu ISA alongside the other architectures.

  3. /src/cpu/BaseCPU.py

Added the TLB type in MaPU.

Implement the simulator

For how to construct a simulation system (including how to configure the system and how to add caches and memory controllers), please refer to this tutorial: Learning gem5

In this section, we focus only on how to define the ISA using the Gem5 ISA description language. Further reference material can be found on the gem5 wiki page: ISA Parser

decoding process

The main purpose of the simulator is to define the behavior for each instruction encoding. For example, for the instruction “R0 = R1 + R2”, the behavior is to add R1 and R2 and then write the result back to R0.

In the Gem5 ISA language, this is implemented using the decode keyword:

decode BIT_FIELD_NAME {
    format FORMAT_NAME {
        0x0 : add_INST( {{ Rc = Ra + Rb ; }} );
        0x1 : sub_INST( {{ Rc = Ra - Rb ; }} );
    }
}

The decoding process can be nested, for example:

0x1:decode OPCODE_LO {
  0x1:decode SD {
    format IntCIOp { 
      0x0: fixadd( {{ C++ Code }} ); 
    }
  }
}
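The nesting can be pictured as a tree of lookups, where each level switches on one bitfield and a leaf names the decoded instruction. The following is a minimal Python sketch of that idea, not Gem5 code; DECODE_TREE, the decode helper, and the "unknown" fallback are hypothetical names chosen for illustration.

```python
# Hypothetical model of the nested decode above: each level is a pair
# (bitfield name, {value: subtree or mnemonic}).
DECODE_TREE = ("OPCODE_LO", {
    0x1: ("SD", {
        0x0: "fixadd",
    }),
})

def decode(fields, node=DECODE_TREE, default="unknown"):
    """Walk the tree using already-extracted bitfield values."""
    while not isinstance(node, str):
        field_name, cases = node
        # Fall back to the default mnemonic when no case matches.
        node = cases.get(fields[field_name], default)
    return node

print(decode({"OPCODE_LO": 0x1, "SD": 0x0}))
```

Each level consults only the field it names, which is why the inner decode on SD is reached only after OPCODE_LO has matched 0x1.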

def bitfield

This keyword gives a name to a specific bit range of the instruction, which can then be referred to in the decoding process. For example:

def bitfield OPCODE <31:26>;
def bitfield IMM <12>;
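As an illustration of what such a definition selects, the sketch below extracts the same two ranges from a 32-bit word in Python. It assumes that bit 0 is the least significant bit and that both ends of the range are inclusive; the bits helper and the example instruction word are hypothetical.

```python
def bits(inst, hi, lo=None):
    """Return bits <hi:lo> of inst (inclusive, bit 0 = LSB).

    A single index, as in <12>, selects one bit.
    """
    if lo is None:
        lo = hi
    width = hi - lo + 1
    return (inst >> lo) & ((1 << width) - 1)

inst = 0xFC001000            # arbitrary example word, not a real encoding
opcode = bits(inst, 31, 26)  # corresponds to OPCODE <31:26>
imm = bits(inst, 12)         # corresponds to IMM <12>
print(hex(opcode), imm)
```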

def operands

This keyword defines the operands that are used in the behavior code.

 def operands {{
    'Ra': ('IntReg', 'uq', 'RA', 'IsInteger', 1),
    'Rb': ('IntReg', 'uq', 'RB', 'IsInteger', 2),
    'Rc': ('IntReg', 'uq', 'RC', 'IsInteger', 3),
}} 

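Each operand is described by a tuple of five fields: the operand class, the default type extension, the register index specifier, the instruction flags implied by the operand, and a sort priority. The type extensions (such as 'uq') are declared with def operand_types: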

 def operand_types {{
    'sb' : 'int8_t',
    'ub' : 'uint8_t',
    ...
}} 

def format

An instruction format is basically a Python function. It takes the arguments supplied by an instruction definition inside a decode block (as described above) and generates up to four pieces of C++ code.

An example of def format is shown below:

 def format IntOp(code, *opt_flags) {{
    iop = InstObjParams(name, Name, 'IntOp', code, opt_flags)
    header_output = BasicDeclare.subst(iop)
    decoder_output = BasicConstructor.subst(iop)
    decode_block = RegNopCheckDecode.subst(iop)
    exec_output = BasicExecute.subst(iop)
}}; 

InstObjParams is a Python class that holds the parameters (such as class_name) that the templates substitute into the generated code.

BasicDeclare, BasicConstructor, and BasicExecute are predefined templates.
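To make the flow concrete, the following plain-Python sketch mimics what a format does: it collects the parameters and fills four output strings by substitution. It is an approximation under stated assumptions, not the Gem5 implementation: the template strings are simplified stand-ins rather than the real contents of formats/basic.isa, a dictionary stands in for InstObjParams, and the class name Add_INST is hypothetical.

```python
# Simplified stand-in templates (assumptions, not the real Gem5 templates).
BASIC_DECLARE = "class %(class_name)s : public %(base_class)s { /* ... */ };"
BASIC_CONSTRUCTOR = ('%(class_name)s::%(class_name)s(ExtMachInst machInst)'
                     ' : %(base_class)s("%(mnemonic)s", machInst) {}')
BASIC_DECODE = "return new %(class_name)s(machInst);"
BASIC_EXECUTE = "Fault %(class_name)s::execute(...) const { %(code)s }"

def int_op(name, Name, code):
    """Mimic a format: build the parameters, then fill in four outputs."""
    # A dict stands in for InstObjParams in this sketch.
    iop = {"mnemonic": name, "class_name": Name,
           "base_class": "IntOp", "code": code}
    return {
        "header_output": BASIC_DECLARE % iop,
        "decoder_output": BASIC_CONSTRUCTOR % iop,
        "decode_block": BASIC_DECODE % iop,
        "exec_output": BASIC_EXECUTE % iop,
    }

out = int_op("add_INST", "Add_INST", "Rc = Ra + Rb;")
print(out["decode_block"])
```

The four keys mirror the four output variables assigned in the format above; the behavior code from the instruction definition ends up inside the execute method.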

def template

A template defines a framework for generating C++ code. It takes an InstObjParams object as its parameter and returns the C++ code pieces. For example:

 def template BasicDecode {{
    return new %(class_name)s(machInst);
}} 

The content of the template is actually a Python format string, so it can reference the parameters defined in InstObjParams. In the above example, if class_name is IntOp, the resulting string is:

    return new IntOp(machInst);

ISA description structure

The MaPU ISA description is organized as follows:

src/arch/mapu/isa/
   bitfield.isa : definitions of the bitfield
   operands.isa : definitions of the operands
   sdecoder.isa : definitions of the decode process
   formats/
        basic.isa: The template definitions
        int.isa  : The SCU format definition
        ...      : The format definitions of other FUs

Next we take a tour of how an SCU instruction (R0 = R1 + R2) is implemented in Gem5.

  1. Decoding defined in sdecoder.isa
    decode OPCODE_HI default Unknown::unknown() {
     ...
     0x1:decode OPCODE_LO {
       0x1:decode SD {
         format IntCIOp { 
           0x0: fixadd( {{ 
             if(SCU_U){ ... }
             else{
               uint64_t i, j, k;
               i = Rs;
               j = Rt;
               k = i + j;
               CI = (k >> 32) ;
               int64_t a, b, c;
               a = (int32_t)Rs;
               b = (int32_t)Rt;
               c = a + b;
               if(SCU_T) Rd = c;                             // Rs = Rm + Rn(T)
               else      Rd = c > MAX_INT32 ? MAX_INT32 :    // Rs = Rm + Rn
                              c < MIN_INT32 ? MIN_INT32 : c;
             }
           }} ); 
         }
       }
     }
    }
    
  2. OPCODE_HI, OPCODE_LO, SD, SCU_T, SCU_U: in bitfield.isa
     def bitfield OPCODE_HI  <30:28>;
     def bitfield OPCODE_LO  <27:23>;
     def bitfield SD         <20:19>;
     def bitfield SCU_T      <18:18>;
     def bitfield SCU_U      <21:21>;
    
  3. Rs, Rt, Rd, CI: in operands.isa
     'Rd': ('IntReg', 'uw', 'RD', 'IsInteger', 3),
     'Rs': ('IntReg', 'uw', 'RS', 'IsInteger', 2),
     'Rt': ('IntReg', 'uw', 'RT', 'IsInteger', 3),
     'CI': ('ControlReg', 'ud', 'MISCREG_CI', None,1),
    
  4. format IntCIOp: Defined in formats/int.isa
     def format IntCIOp(code, *opt_flags) {{
      iop = InstObjParams(name, Name, 'IntCIOp', code, opt_flags)
      header_output = BasicDeclare.subst(iop)
      decoder_output = BasicConstructor.subst(iop)
      decode_block = RegNopCheckDecode.subst(iop)
      exec_output = BasicExecute.subst(iop)
     }}; 
    

    BasicDeclare, BasicConstructor, and BasicExecute can be found in formats/basic.isa. RegNopCheckDecode can be found in formats/noop.isa and is the same as BasicDecode.

  5. Generated source files can be found in build/MAPU/arch/mapu/generated
    • decoder.hh: contains the class definition of fixadd
     class Fixadd : public IntCIOp {
     public:
         //Constructor.
         Fixadd(ExtMachInst machInst);
        
         Fault execute(CheckerCPU *, Trace::InstRecord *) const;
        
         Fault execute(InOrderDynInst *, Trace::InstRecord *) const;
        
     };
    
    • decoder.cc: contains the decoding process and the class implementation:
      inline Fixadd::Fixadd(ExtMachInst machInst)  : IntCIOp("fixadd", machInst, IntAluOp) {
      _destRegIdx[0] = MISCREG_CI + Ctrl_Base_DepTag;
      _srcRegIdx[0] = RS;
      _srcRegIdx[1] = RT;
      _destRegIdx[1] = RD;
      _numSrcRegs = 2;
      _numDestRegs = 2;
      _numFPDestRegs = 0;
      _numIntDestRegs = 1;
      flags[Is2cycle] = true;
      flags[IsInteger] = true;;
      }
    

    The IntCIOp class is derived from MpuStaticInst, which in turn is derived from StaticInst, the base class that Gem5 defines for all ISAs (src/cpu/static_inst.hh).

    • inorder_cpu_exec.cc: contains the execution code:
     Fault Fixadd::execute(InOrderDynInst *xc, Trace::InstRecord *traceData) const {
        Fault fault = NoFault;
        uint64_t CI = 0;
        uint32_t Rs = 0;
        uint32_t Rt = 0;
        uint32_t Rd = 0;
    
        Rs = xc->readIntRegOperand(this, 0);
        Rt = xc->readIntRegOperand(this, 1);
    
        if(fault == NoFault) {
           if(SCU_U){
             ...
           } else {
             uint64_t i, j, k;
             i = Rs;
             j = Rt;
             k = i + j;
             CI = (k >> 32) ;
             int64_t a, b, c;
             a = (int32_t)Rs;
             b = (int32_t)Rt;
             c = a + b;
             if(SCU_T) Rd = c;                             // Rs = Rm + Rn(T)
             else      Rd = c > MAX_INT32 ? MAX_INT32 :    // Rs = Rm + Rn
                            c < MIN_INT32 ? MIN_INT32 : c;
           }
        }
        ...
     }
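The arithmetic of the signed path can be checked outside the simulator with a small Python model. This is a sketch under assumptions, not part of MaPUSim: it covers only the SCU_U == 0 branch (the other branch is elided in the listing), it assumes MAX_INT32 and MIN_INT32 are the usual 32-bit signed limits, and the function names are hypothetical.

```python
MAX_INT32 = 2**31 - 1   # assumed value of MAX_INT32 in the C++ code
MIN_INT32 = -2**31      # assumed value of MIN_INT32 in the C++ code

def to_int32(x):
    """Reinterpret the low 32 bits of x as signed, like (int32_t)x."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def fixadd_signed(rs, rt, scu_t):
    """Model the SCU_U == 0 path of fixadd; returns (Rd, CI)."""
    rs &= 0xFFFFFFFF
    rt &= 0xFFFFFFFF
    ci = (rs + rt) >> 32             # carry out of the unsigned 32-bit add
    c = to_int32(rs) + to_int32(rt)  # exact signed sum (64-bit in the C++)
    if not scu_t:
        # Without the T option the result saturates to the int32 range.
        c = max(MIN_INT32, min(MAX_INT32, c))
    return c & 0xFFFFFFFF, ci        # Rd is held in a uint32_t

print(fixadd_signed(0x7FFFFFFF, 1, scu_t=False))
print(fixadd_signed(0x7FFFFFFF, 1, scu_t=True))
```

With the T option the sum wraps around as an ordinary 32-bit addition, whereas without it the result is clamped to the largest or smallest representable value.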
    
