Simulator

Our simulator is based on Gem5.

usage

The simulator binary is located at <install-dir>/simulator/apc/gem5.opt. After compiling the source code into an ELF file, we can run the simulator with a simple command line. For example, in example/1.Hello_MaPU/, the following command generates the ELF file:

/home/xiesl/sdk/apc/bin/clang -target mspu Hello_MaPU.c -g -o Hello_MaPU.elf

and the following command then runs the simulation:

gem5.opt <install-dir>/simulator/apc/system/se.py -c Hello_MaPU.elf

Here, <install-dir>/simulator/apc/system/se.py is the configuration file.

source code base

The original source code is the stable version tagged stable_2012_06_28, which can be downloaded using the following commands:

hg clone http://repo.gem5.org/gem5
cd gem5; hg checkout -r f75ee4849

Adding a new target in Gem5

For a general description of how to add a new ISA in Gem5, please refer to this tutorial: Defining ISA. In our implementation, we modified the following files in Gem5 to add the MaPU CPU model:

MaPUSim/APC/build_opts/MAPU
```
TARGET_ISA = 'mapu'
CPU_MODELS = 'InOrderCPU'
PROTOCOL = 'MI_example'
```
PROTOCOL is the protocol used for cache coherence, which is not actually used in our implementation.

MaPUSim/APC/utils/compile

Added the mapu ISA in the same way as the other architectures.

/src/cpu/BaseCPU.py

Added the TLB type in MaPU.

Implement the simulator

For how to construct a simulation system (including how to configure the system and how to add caches and memory controllers), please refer to this tutorial: Learning gem5.

In this section, we focus only on how to define the ISA using the Gem5 ISA description language. Further reference material can be found on the gem5 wiki page: ISA Parser.

decoding process

The main purpose of the simulator's ISA description is to define the behavior that corresponds to each instruction encoding. For example, for the instruction “R0 = R1 + R2”, the behavior is to add R1 and R2 and then write the result back to R0.

In the Gem5 ISA language, this is implemented using the decode keyword:

decode BIT_FIELD_NAME {
    format FORMAT_NAME {
        0x0 : add_INST( {{ Rc = Ra + Rb ; }} );
        0x1 : sub_INST( {{ Rc = Ra - Rb ; }} );
    }
}

BIT_FIELD_NAME: The bit range of the instruction encoding, defined with the def bitfield keyword.
FORMAT_NAME: The instruction format, which indicates a specific type of instruction (how many operands are needed, etc.). Defined with the def format keyword.
xx_INST: The name of the corresponding instruction.
{{ C++_CODE }}: The C++ code that describes the behavior of the instruction.

Decode blocks can be nested, for example:

0x1:decode OPCODE_LO {
  0x1:decode SD {
    format IntCIOp { 
      0x0: fixadd( {{ C++ Code }} ); 
    }
  }
}
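
Conceptually, a nested decode block behaves like a tree of lookups: each level extracts one bitfield and uses its value to select either a deeper level or an instruction. The following Python sketch only mimics that idea and is not how the Gem5 parser is implemented. The bit ranges are the ones listed for bitfield.isa further down in this section; the outer OPCODE_HI level, the tree layout, and the sample encoding are assumptions made for illustration.

```python
def bits(word, hi, lo):
    """Return bits hi..lo (inclusive) of word as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

# Bit ranges as listed for bitfield.isa in this document.
FIELDS = {"OPCODE_HI": (30, 28), "OPCODE_LO": (27, 23), "SD": (20, 19)}

# A decode tree: (field name, {field value: subtree or instruction name}).
DECODE_TREE = ("OPCODE_HI", {
    0x1: ("OPCODE_LO", {
        0x1: ("SD", {
            0x0: "fixadd",
        }),
    }),
})

def decode(word, node=DECODE_TREE):
    """Walk the tree until a leaf (an instruction name) is reached."""
    while not isinstance(node, str):
        field, table = node
        hi, lo = FIELDS[field]
        # Values missing from a table fall back to "unknown".
        node = table.get(bits(word, hi, lo), "unknown")
    return node

# Hypothetical encoding with OPCODE_HI = 1, OPCODE_LO = 1, SD = 0.
word = (0x1 << 28) | (0x1 << 23) | (0x0 << 19)
print(decode(word))
print(decode(0))
```

The fallback to "unknown" plays the role of a default clause in a decode block.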

def bitfield

This keyword gives a name to a specific bit range of the instruction, which can then be referred to in the decoding process. For example:

def bitfield OPCODE <31:26>;
def bitfield IMM <12>;
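
To make the bit-range notation concrete, the following Python sketch extracts the same two fields from a 32-bit word. The helper name bitfield and the sample instruction word are hypothetical; only the ranges <31:26> and <12> come from the example above.

```python
def bitfield(word, hi, lo=None):
    """Extract bits hi..lo (inclusive) of word; a single bit if lo is omitted."""
    if lo is None:
        lo = hi
    width = hi - lo + 1
    return (word >> lo) & ((1 << width) - 1)

# An arbitrary 32-bit value used only for illustration.
inst = 0x8C001000

opcode = bitfield(inst, 31, 26)  # corresponds to: def bitfield OPCODE <31:26>;
imm = bitfield(inst, 12)         # corresponds to: def bitfield IMM <12>;
print(hex(opcode), imm)
```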

def operands

This keyword defines the operands that are used in the behavior code.

 def operands {{
    'Ra': ('IntReg', 'uq', 'RA', 'IsInteger', 1),
    'Rb': ('IntReg', 'uq', 'RB', 'IsInteger', 2),
    'Rc': ('IntReg', 'uq', 'RC', 'IsInteger', 3),
}} 

Each operand contains the following fields:

name: for example, ‘Ra’
operand class: for example, ‘IntReg’. It can be one of the following:
- IntReg
- FloatReg
- Mem
- NPC
- ControlReg
default data type: one of the predefined data types, which are defined with def operand_types:

 def operand_types {{
    'sb' : 'int8_t',
    'ub' : 'uint8_t',
    ...
}} 

bitfield: The corresponding bitfield name
flags: a string or triple of strings indicating the instruction flags that can be inferred when the operand is used
priority: used in disassembly
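
The positional layout of an operand entry can be easier to read with the fields named. The Python sketch below maps the tuples from the def operands example onto the field list above. The Operand helper and its attribute names are hypothetical and are not part of Gem5.

```python
from collections import namedtuple

# Attribute names follow the field list above; "Operand" is a hypothetical helper.
Operand = namedtuple(
    "Operand", ["op_class", "default_type", "bitfield", "flags", "priority"]
)

# The same entries as in the def operands example.
operands = {
    "Ra": ("IntReg", "uq", "RA", "IsInteger", 1),
    "Rb": ("IntReg", "uq", "RB", "IsInteger", 2),
    "Rc": ("IntReg", "uq", "RC", "IsInteger", 3),
}

# Give each position of the tuple a name.
named = {name: Operand(*fields) for name, fields in operands.items()}

for name, op in named.items():
    print(f"{name}: class={op.op_class} type={op.default_type} "
          f"bitfield={op.bitfield} flags={op.flags} priority={op.priority}")
```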

def format

An instruction format is basically a Python function. It takes the arguments supplied by an instruction definition inside the aforementioned decode block and generates up to four pieces of C++ code:

header_output: goes to decoder.hh; typically the C++ class declarations of the instructions
decoder_output: goes to decoder.cc; contains code that does not need to be visible to the execute() functions
exec_output: contains the per-CPU-model definitions of execute()
decode_block: contains a statement or block of statements that goes into the decode function

An example of def format is shown below:

 def format IntOp(code, *opt_flags) {{
    iop = InstObjParams(name, Name, 'IntOp', code, opt_flags)
    header_output = BasicDeclare.subst(iop)
    decoder_output = BasicConstructor.subst(iop)
    decode_block = RegNopCheckDecode.subst(iop)
    exec_output = BasicExecute.subst(iop)
}}; 

InstObjParams is a Python class that contains the following fields:

mnemonic: the mnemonic of the instruction, ‘name’ in the above example
class_name: the C++ class name generated for the instruction, ‘Name’ in the above example
base_class: the C++ base class for this format, ‘IntOp’ in the above example
snippets: the C++ behavior code of the instruction as given in the decode block, ‘code’ in the above example
opt_args: additional parameters, ‘opt_flags’ in the above example

BasicDeclare, BasicConstructor, and BasicExecute are predefined templates.
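
As a rough mental model, a format fills several templates with the same set of parameters and hands back the resulting strings. The Python sketch below imitates this with plain format strings. The template texts, the function int_op_format, and the parameter names are simplified stand-ins invented for illustration; the real templates and InstObjParams are considerably richer.

```python
# Toy stand-ins for the predefined templates (not the real template texts).
TEMPLATES = {
    "header_output": "class %(class_name)s : public %(base_class)s { /* ... */ };",
    "decoder_output": "%(class_name)s::%(class_name)s(ExtMachInst machInst) { /* ... */ }",
    "decode_block": "return new %(class_name)s(machInst);",
    "exec_output": "Fault %(class_name)s::execute(/* ... */) const { %(code)s }",
}

def int_op_format(name, code):
    """Mimic 'def format IntOp': fill every template with the same parameters."""
    params = {
        "mnemonic": name,
        # Assumed naming rule: capitalize the first letter (fixadd -> Fixadd).
        "class_name": name[0].upper() + name[1:],
        "base_class": "IntOp",
        "code": code,
    }
    return {key: text % params for key, text in TEMPLATES.items()}

outputs = int_op_format("fixadd", "Rc = Ra + Rb;")
for key, text in outputs.items():
    print(f"{key}: {text}")
```

Each of the four resulting strings corresponds to one of the output pieces listed above.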

def template

A template defines a skeleton from which C++ code is generated. It takes an InstObjParams object as its parameter and returns the resulting piece of C++ code. For example:

 def template BasicDecode {{
    return new %(class_name)s(machInst);
}} 

The content of the template is expanded with Python string substitution and can use the parameters defined in InstObjParams. In the above example, if class_name is IntOp, the resulting string will be:

    return new IntOp(machInst);
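
The substitution itself can be reproduced with Python's %-style string formatting. In the sketch below, Template and Params are hypothetical miniature stand-ins for the Gem5 template object and for InstObjParams; only the template text is taken from the BasicDecode example.

```python
class Template:
    """Hypothetical miniature of a template: a string with %(name)s placeholders."""

    def __init__(self, text):
        self.text = text

    def subst(self, params):
        # Replace each %(field)s with the attribute of the same name.
        return self.text % vars(params)


class Params:
    """Hypothetical stand-in for InstObjParams with only two fields."""

    def __init__(self, mnemonic, class_name):
        self.mnemonic = mnemonic
        self.class_name = class_name


basic_decode = Template("return new %(class_name)s(machInst);")
result = basic_decode.subst(Params("intop", "IntOp"))
print(result)
```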

ISA description structure

The structure of the MaPU ISA description is shown below:

src/arch/mapu/isa/
   bitfield.isa : definitions of the bitfield
   operands.isa : definitions of the operands
   sdecoder.isa : definitions of the decode process
   formats/
        basic.isa : The template definitions
        int.isa   : The SCU format definitions
        ...       : The format definitions of other FUs

Next, we take a tour of how an SCU instruction (R0 = R1 + R2) is implemented in Gem5.

Decoding: defined in sdecoder.isa

decode OPCODE_HI default Unknown::unknown() {
 ...
 0x1:decode OPCODE_LO {
   0x1:decode SD {
     format IntCIOp { 
       0x0: fixadd( {{ 
         if(SCU_U){ ... }
         else{
           uint64_t i, j, k;
           i = Rs;
           j = Rt;
           k = i + j;
           CI = (k >> 32) ;
           int64_t a, b, c;
           a = (int32_t)Rs;
           b = (int32_t)Rt;
           c = a + b;
           if(SCU_T) Rd = c;                             // Rs = Rm + Rn(T)
           else      Rd = c > MAX_INT32 ? MAX_INT32 :    // Rs = Rm + Rn
                          c < MIN_INT32 ? MIN_INT32 : c;
         }
       }} ); 
     }
   }
 }
}

OPCODE_HI, OPCODE_LO, SD, SCU_T, SCU_U: defined in bitfield.isa

 def bitfield OPCODE_HI  <30:28>;
 def bitfield OPCODE_LO  <27:23>;
 def bitfield SD         <20:19>;
 def bitfield SCU_T      <18:18>;
 def bitfield SCU_U      <21:21>;

Rs, Rt, Rd, CI: in operands.isa

 'Rd': ('IntReg', 'uw', 'RD', 'IsInteger', 3),
 'Rs': ('IntReg', 'uw', 'RS', 'IsInteger', 2),
 'Rt': ('IntReg', 'uw', 'RT', 'IsInteger', 3),
 'CI': ('ControlReg', 'ud', 'MISCREG_CI', None,1),

format IntCIOp: Defined in formats/int.isa

 def format IntCIOp(code, *opt_flags) {{
  iop = InstObjParams(name, Name, 'IntCIOp', code, opt_flags)
  header_output = BasicDeclare.subst(iop)
  decoder_output = BasicConstructor.subst(iop)
  decode_block = RegNopCheckDecode.subst(iop)
  exec_output = BasicExecute.subst(iop)
 }}; 

BasicDeclare, BasicConstructor, and BasicExecute can be found in formats/basic.isa. RegNopCheckDecode can be found in formats/noop.isa and is the same as BasicDecode.

Generated source files: can be found in build/MAPU/arch/mapu/generated

decoder.hh: contains the class definition of fixadd

 class Fixadd : public IntCIOp {
 public:
     //Constructor.
     Fixadd(ExtMachInst machInst);
    
     Fault execute(CheckerCPU *, Trace::InstRecord *) const;
    
     Fault execute(InOrderDynInst *, Trace::InstRecord *) const;
    
 };

decoder.cc: contains the decoding process and the class implementation:

  inline Fixadd::Fixadd(ExtMachInst machInst)  : IntCIOp("fixadd", machInst, IntAluOp) {
  _destRegIdx[0] = MISCREG_CI + Ctrl_Base_DepTag;
  _srcRegIdx[0] = RS;
  _srcRegIdx[1] = RT;
  _destRegIdx[1] = RD;
  _numSrcRegs = 2;
  _numDestRegs = 2;
  _numFPDestRegs = 0;
  _numIntDestRegs = 1;
  flags[Is2cycle] = true;
  flags[IsInteger] = true;;
  }

The IntCIOp class is derived from MpuStaticInst, which in turn is derived from StaticInst, the base class that Gem5 defines for all ISAs (src/cpu/static_inst.hh).

inorder_cpu_exec.cc contains the executing code:

 Fault Fixadd::execute(InOrderDynInst *xc, Trace::InstRecord *traceData) const {
    Fault fault = NoFault;
    uint64_t CI = 0;
    uint32_t Rs = 0;
    uint32_t Rt = 0;
    uint32_t Rd = 0;

    Rs = xc->readIntRegOperand(this, 0);
    Rt = xc->readIntRegOperand(this, 1);

    if(fault == NoFault) {
       if(SCU_U){
         ...
       } else {
         uint64_t i, j, k;
         i = Rs;
         j = Rt;
         k = i + j;
         CI = (k >> 32) ;
         int64_t a, b, c;
         a = (int32_t)Rs;
         b = (int32_t)Rt;
         c = a + b;
         if(SCU_T) Rd = c;                             // Rs = Rm + Rn(T)
         else      Rd = c > MAX_INT32 ? MAX_INT32 :    // Rs = Rm + Rn
                        c < MIN_INT32 ? MIN_INT32 : c;
       }
    }
    ...
 }
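
To check the arithmetic of the SCU_U == 0 branch by hand, its behavior can be modelled in a few lines of Python. This is only a sketch of the code shown above: the values of MAX_INT32 and MIN_INT32 are assumed to be the usual 32-bit signed limits, and the function name, argument names, and return convention are hypothetical.

```python
MAX_INT32 = 2**31 - 1   # assumed value of the MAX_INT32 constant
MIN_INT32 = -2**31      # assumed value of the MIN_INT32 constant
MASK32 = 0xFFFFFFFF

def to_signed32(value):
    """Reinterpret the low 32 bits of value as a signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def fixadd(rs, rt, truncate):
    """Model the SCU_U == 0 branch of fixadd. Returns (Rd, CI)."""
    rs &= MASK32
    rt &= MASK32
    # Unsigned 64-bit sum; bit 32 is the carry written to CI.
    ci = (rs + rt) >> 32
    # Signed sum computed at 64-bit precision.
    c = to_signed32(rs) + to_signed32(rt)
    if not truncate:
        # SCU_T == 0: saturate to the signed 32-bit range.
        c = max(MIN_INT32, min(MAX_INT32, c))
    # Rd is a 32-bit register, so only the low 32 bits are kept.
    return c & MASK32, ci

for rs, rt, t in [(0x7FFFFFFF, 1, False), (0x7FFFFFFF, 1, True), (0xFFFFFFFF, 1, False)]:
    rd, ci = fixadd(rs, rt, t)
    print(f"Rs={rs:#010x} Rt={rt:#010x} T={int(t)} -> Rd={rd:#010x} CI={ci}")
```

With saturation enabled, 0x7FFFFFFF + 1 stays at 0x7FFFFFFF, whereas with truncation it wraps to 0x80000000.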
