
Template Generating Microop

Automatic class generation for microops

let {                                                                           
    class LdStOp(X86Microop):                                                   
        def __init__(self, data, segment, addr, disp,                           
                dataSize, addressSize, baseFlags, atCPL0, prefetch, nonSpec,    
                implicitStack, uncacheable):                                    
            self.data = data                                                    
            [self.scale, self.index, self.base] = addr                          
            self.disp = disp                                                    
            self.segment = segment                                              
            self.dataSize = dataSize                                            
            self.addressSize = addressSize                                      
            self.memFlags = baseFlags                                           
            if atCPL0:                                                          
                self.memFlags += " | (CPL0FlagBit << FlagShift)"                
            self.instFlags = ""                                                 
            if prefetch:                                                        
                self.memFlags += " | Request::PREFETCH"                         
                self.instFlags += " | (1ULL << StaticInst::IsDataPrefetch)"     
            if nonSpec:                                                         
                self.instFlags += " | (1ULL << StaticInst::IsNonSpeculative)"   
            if uncacheable:                                                     
                self.instFlags += " | (Request::UNCACHEABLE)"                   
            # For implicit stack operations, we should use *not* use the        
            # alternative addressing mode for loads/stores if the prefix is set 
            if not implicitStack:                                               
                self.memFlags += " | (machInst.legacy.addr ? " + \              
                                 "(AddrSizeFlagBit << FlagShift) : 0)"          
                                                                                
            ......                                                              
}                                                                               

Automatically defining CPP classes and associated methods for microops using templates

In the previous post, we looked at how Python classes are used to represent macroops and microops. We also saw that the Python class for a macroop is used to generate its CPP class counterpart, which can be compiled together with the rest of the CPP source code (GEM5 is a CPP-based project, not a Python one). Along the way, we saw that the getAllocator function of a microop Python class generates a CPP code snippet that instantiates the CPP microop class, which is the counterpart of the microop Python class. In this post, we will see how those CPP classes for microops are generated from templates.

defineMicroLoadOp: define micro-load operations using templates

To understand how the CPP class for a microop is implemented, we will look at the load-related microops of the x86 architecture. The most important piece of this class generation is the subst method provided by the Template object. GEM5 relies heavily on substitution to generate many instructions that share similar semantics.

gem5/src/arch/x86/isa/microops/ldstop.isa

434 let {{
435 
436     # Make these empty strings so that concatenating onto
437     # them will always work.
438     header_output = ""
439     decoder_output = ""
440     exec_output = ""
441 
442     segmentEAExpr = \
443         'bits(scale * Index + Base + disp, addressSize * 8 - 1, 0);'
444 
445     calculateEA = 'EA = SegBase + ' + segmentEAExpr
446 
447     debuggingEA = \
448         'DPRINTF(X86, "EA:%#x index:%#x base:%#x disp:%#x Segbase:%#x scale:%#x, addressSize:%#x, dataSize: %#x \\n", EA, Index, Base, disp, SegBase, scale, addressSize, dataSize)'
449 
450 
451     def defineMicroLoadOp(mnemonic, code, bigCode='',
452                           mem_flags="0", big=True, nonSpec=False,
453                           implicitStack=False):
454         global header_output
455         global decoder_output
456         global exec_output
457         global microopClasses
458         Name = mnemonic
459         name = mnemonic.lower()
460 
461         # Build up the all register version of this micro op
462         iops = [InstObjParams(name, Name, 'X86ISA::LdStOp',
463                               { "code": code,
464                                 "ea_code": calculateEA,
465                                 "memDataSize": "dataSize" })]
466         if big:
467             iops += [InstObjParams(name, Name + "Big", 'X86ISA::LdStOp',
468                                    { "code": bigCode,
469                                      "ea_code": calculateEA,
470                                      "memDataSize": "dataSize" })]
471         for iop in iops:
472             header_output += MicroLdStOpDeclare.subst(iop)
473             decoder_output += MicroLdStOpConstructor.subst(iop)
474             exec_output += MicroLoadExecute.subst(iop)
475             exec_output += MicroLoadInitiateAcc.subst(iop)
476             exec_output += MicroLoadCompleteAcc.subst(iop)
477 
478         if implicitStack:
479             # For instructions that implicitly access the stack, the address
480             # size is the same as the stack segment pointer size, not the
481             # address size if specified by the instruction prefix
482             addressSize = "env.stackSize"
483         else:
484             addressSize = "env.addressSize"
485 
486         base = LdStOp
487         if big:
488             base = BigLdStOp
489         class LoadOp(base):
490             def __init__(self, data, segment, addr, disp = 0,
491                     dataSize="env.dataSize",
492                     addressSize=addressSize,
493                     atCPL0=False, prefetch=False, nonSpec=nonSpec,
494                     implicitStack=implicitStack,
495                     uncacheable=False, EnTlb=False):
496                 super(LoadOp, self).__init__(data, segment, addr,
497                         disp, dataSize, addressSize, mem_flags,
498                         atCPL0, prefetch, nonSpec, implicitStack,
499                         uncacheable, EnTlb)
500                 self.className = Name
501                 self.mnemonic = name
502 
503         microopClasses[name] = LoadOp
504 
505     defineMicroLoadOp('Ld', 'Data = merge(Data, Mem, dataSize);',
506                             'Data = Mem & mask(dataSize * 8);')
507     defineMicroLoadOp('Ldis', 'Data = merge(Data, Mem, dataSize);',
508                               'Data = Mem & mask(dataSize * 8);',
509                                implicitStack=True)
510     defineMicroLoadOp('Ldst', 'Data = merge(Data, Mem, dataSize);',
511                               'Data = Mem & mask(dataSize * 8);',
512                       '(StoreCheck << FlagShift)')
513     defineMicroLoadOp('Ldstl', 'Data = merge(Data, Mem, dataSize);',
514                                'Data = Mem & mask(dataSize * 8);',
515                       '(StoreCheck << FlagShift) | Request::LOCKED_RMW',
516                       nonSpec=True)

As shown in lines 505-516, the various load microops are defined by invoking the defineMicroLoadOp Python function. Because these microops share similar semantics (they all load data from memory), defineMicroLoadOp generates each of them by substituting microop-specific code literals into generic templates. Inside defineMicroLoadOp, the subst method of several templates is invoked (lines 472-476) to generate the complete implementation of each microop.
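To make the substitution idea concrete, here is a minimal sketch in plain Python. It is not GEM5's Template class: MiniTemplate, MiniParams, and the template text are hypothetical stand-ins, and the only assumption carried over is that a template is text with Python-style %(name)s placeholders that get filled from the fields of a parameter object.

```python
class MiniParams:
    """Hypothetical stand-in for InstObjParams: just a bag of named fields."""
    def __init__(self, mnemonic, class_name, base_class, snippets):
        self.mnemonic = mnemonic
        self.class_name = class_name
        self.base_class = base_class
        self.snippets = snippets


class MiniTemplate:
    """Hypothetical stand-in for a template: text with %(key)s placeholders."""
    def __init__(self, text):
        self.text = text

    def subst(self, params):
        # Merge the object's plain fields with its code snippets, then let
        # Python's %-formatting fill in every %(key)s placeholder.
        values = dict(vars(params))
        values.update(params.snippets)
        return self.text % values


declare = MiniTemplate(
    "class %(class_name)s : public %(base_class)s\n"
    "{\n"
    "    // mnemonic: %(mnemonic)s\n"
    "    // body: %(code)s\n"
    "};\n"
)

# One template, two parameter objects, two generated class declarations.
header_output = ""
for name, code in [("Ld", "Data = merge(Data, Mem, dataSize);"),
                   ("LdBig", "Data = Mem & mask(dataSize * 8);")]:
    iop = MiniParams("ld", name, "X86ISA::LdStOp", {"code": code})
    header_output += declare.subst(iop)

print(header_output)
```

The loop mirrors the shape of lines 471-476: the same template is applied to every parameter object, and the results are concatenated into one output string.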

Operand and its child classes handle all operands in GEM5

Before we look at how the templates are used to generate the actual code for the microops, we should understand what InstObjParams is and why it is necessary for template substitution. To understand InstObjParams, we first need a deeper understanding of the parameter system used by GEM5. This includes the generic classes that represent the operands of microops and macroops, as well as architecture-specific operands and how they are parsed.

Generic classes representing various types of operands in GEM5

First of all, GEM5 provides common classes that can describe multiple types of operands regardless of the architecture. We will look at the class hierarchy that represents these operands.

gem5/src/arch/isa_parser.py

 396 class Operand(object):
 397     '''Base class for operand descriptors.  An instance of this class
 398     (or actually a class derived from this one) represents a specific
 399     operand for a code block (e.g, "Rc.sq" as a dest). Intermediate
 400     derived classes encapsulates the traits of a particular operand
 401     type (e.g., "32-bit integer register").'''
 402 
 403     def buildReadCode(self, func = None):
 404         subst_dict = {"name": self.base_name,
 405                       "func": func,
 406                       "reg_idx": self.reg_spec,
 407                       "ctype": self.ctype}
 408         if hasattr(self, 'src_reg_idx'):
 409             subst_dict['op_idx'] = self.src_reg_idx
 410         code = self.read_code % subst_dict
 411         return '%s = %s;\n' % (self.base_name, code)
 412 
 413     def buildWriteCode(self, func = None):
 414         subst_dict = {"name": self.base_name,
 415                       "func": func,
 416                       "reg_idx": self.reg_spec,
 417                       "ctype": self.ctype,
 418                       "final_val": self.base_name}
 419         if hasattr(self, 'dest_reg_idx'):
 420             subst_dict['op_idx'] = self.dest_reg_idx
 421         code = self.write_code % subst_dict
 422         return '''
 423         {
 424             %s final_val = %s;
 425             %s;
 426             if (traceData) { traceData->setData(final_val); }
 427         }''' % (self.dflt_ctype, self.base_name, code)
 428 
 429     def __init__(self, parser, full_name, ext, is_src, is_dest):
 430         self.full_name = full_name
 431         self.ext = ext
 432         self.is_src = is_src
 433         self.is_dest = is_dest
 434         # The 'effective extension' (eff_ext) is either the actual
 435         # extension, if one was explicitly provided, or the default.
 436         if ext:
 437             self.eff_ext = ext
 438         elif hasattr(self, 'dflt_ext'):
 439             self.eff_ext = self.dflt_ext
 440 
 441         if hasattr(self, 'eff_ext'):
 442             self.ctype = parser.operandTypeMap[self.eff_ext]
 443 
 444     # Finalize additional fields (primarily code fields).  This step
 445     # is done separately since some of these fields may depend on the
 446     # register index enumeration that hasn't been performed yet at the
 447     # time of __init__(). The register index enumeration is affected
 448     # by predicated register reads/writes. Hence, we forward the flags
 449     # that indicate whether or not predication is in use.
 450     def finalize(self, predRead, predWrite):
 451         self.flags = self.getFlags()
 452         self.constructor = self.makeConstructor(predRead, predWrite)
 453         self.op_decl = self.makeDecl()
 454 
 455         if self.is_src:
 456             self.op_rd = self.makeRead(predRead)
 457             self.op_src_decl = self.makeDecl()
 458         else:
 459             self.op_rd = ''
 460             self.op_src_decl = ''
 461 
 462         if self.is_dest:
 463             self.op_wb = self.makeWrite(predWrite)
 464             self.op_dest_decl = self.makeDecl()
 465         else:
 466             self.op_wb = ''
 467             self.op_dest_decl = ''
 468 
 469     def isMem(self):
 470         return 0
 471 
 472     def isReg(self):
 473         return 0
 474 
 475     def isFloatReg(self):
 476         return 0
 477 
 478     def isIntReg(self):
 479         return 0
 480 
 481     def isCCReg(self):
 482         return 0
 483 
 484     def isControlReg(self):
 485         return 0
 486 
 487     def isVecReg(self):
 488         return 0
 489 
 490     def isVecElem(self):
 491         return 0
 492 
 493     def isVecPredReg(self):
 494         return 0
 495 
 496     def isPCState(self):
 497         return 0
 498 
 499     def isPCPart(self):
 500         return self.isPCState() and self.reg_spec
 501 
 502     def hasReadPred(self):
 503         return self.read_predicate != None
 504 
 505     def hasWritePred(self):
 506         return self.write_predicate != None
 507 
 508     def getFlags(self):
 509         # note the empty slice '[:]' gives us a copy of self.flags[0]
 510         # instead of a reference to it
 511         my_flags = self.flags[0][:]
 512         if self.is_src:
 513             my_flags += self.flags[1]
 514         if self.is_dest:
 515             my_flags += self.flags[2]
 516         return my_flags
 517 
 518     def makeDecl(self):
 519         # Note that initializations in the declarations are solely
 520         # to avoid 'uninitialized variable' errors from the compiler.
 521         return self.ctype + ' ' + self.base_name + ' = 0;\n';
 522 
 523 
 524 src_reg_constructor = '\n\t_srcRegIdx[_numSrcRegs++] = RegId(%s, %s);'
 525 dst_reg_constructor = '\n\t_destRegIdx[_numDestRegs++] = RegId(%s, %s);'

The Operand class is a generic base class that provides various methods that can be overridden by its child classes. Only a handful of them are overridden, mainly to tell which type of operand the child class represents. Let’s take a look at the IntRegOperand class, which inherits from the base Operand class.

 528 class IntRegOperand(Operand):
 529     reg_class = 'IntRegClass'
 530 
 531     def isReg(self):
 532         return 1
 533 
 534     def isIntReg(self):
 535         return 1
 536 
 537     def makeConstructor(self, predRead, predWrite):
 538         c_src = ''
 539         c_dest = ''
 540 
 541         if self.is_src:
 542             c_src = src_reg_constructor % (self.reg_class, self.reg_spec)
 543             if self.hasReadPred():
 544                 c_src = '\n\tif (%s) {%s\n\t}' % \
 545                         (self.read_predicate, c_src)
 546 
 547         if self.is_dest:
 548             c_dest = dst_reg_constructor % (self.reg_class, self.reg_spec)
 549             c_dest += '\n\t_numIntDestRegs++;'
 550             if self.hasWritePred():
 551                 c_dest = '\n\tif (%s) {%s\n\t}' % \
 552                          (self.write_predicate, c_dest)
 553 
 554         return c_src + c_dest

The IntRegOperand class represents an integer register operand, so it overrides isReg and isIntReg to return 1. An operand can be stored in a register or presented as a constant. Note that IntRegOperand represents an integer operand stored in a register.
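The makeConstructor method in the listing above is easier to follow when run in isolation. The sketch below is a simplified, free-standing rewrite of it, not GEM5 code: the two format strings are copied from the isa_parser.py listing, while the function make_int_reg_constructor and the register index expression data_idx are made up for illustration.

```python
# Format strings copied from the isa_parser.py listing above.
src_reg_constructor = '\n\t_srcRegIdx[_numSrcRegs++] = RegId(%s, %s);'
dst_reg_constructor = '\n\t_destRegIdx[_numDestRegs++] = RegId(%s, %s);'


def make_int_reg_constructor(reg_spec, is_src, is_dest,
                             read_predicate=None, write_predicate=None):
    """Simplified, free-standing version of IntRegOperand.makeConstructor."""
    reg_class = 'IntRegClass'
    c_src = ''
    c_dest = ''
    if is_src:
        c_src = src_reg_constructor % (reg_class, reg_spec)
        if read_predicate is not None:
            # Wrap the registration in a C++ 'if' when a predicate is given.
            c_src = '\n\tif (%s) {%s\n\t}' % (read_predicate, c_src)
    if is_dest:
        c_dest = dst_reg_constructor % (reg_class, reg_spec)
        c_dest += '\n\t_numIntDestRegs++;'
        if write_predicate is not None:
            c_dest = '\n\tif (%s) {%s\n\t}' % (write_predicate, c_dest)
    return c_src + c_dest


# 'data_idx' is a hypothetical register index expression.
generated = make_int_reg_constructor('data_idx', is_src=True, is_dest=True)
print(generated)
```

The returned string is a fragment of a CPP constructor body: one statement registers the operand as a source register, and the next ones register it as a destination and bump the integer destination counter.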

The finalize method generates the actual code statements for an operand

One of the most important methods provided by the base class is finalize. Note that Operand, its child classes, and their methods are all written in Python. Therefore, we need a way to convert the Python representation into CPP code that can be compiled as part of GEM5. The finalize method does this! Although different finalize implementations exist depending on the operand type, we will look at the finalize of the Operand class, because most child classes of Operand don’t override it.

 450     def finalize(self, predRead, predWrite):
 451         self.flags = self.getFlags()
 452         self.constructor = self.makeConstructor(predRead, predWrite)
 453         self.op_decl = self.makeDecl()
 454
 455         if self.is_src:
 456             self.op_rd = self.makeRead(predRead)
 457             self.op_src_decl = self.makeDecl()
 458         else:
 459             self.op_rd = ''
 460             self.op_src_decl = ''
 461
 462         if self.is_dest:
 463             self.op_wb = self.makeWrite(predWrite)
 464             self.op_dest_decl = self.makeDecl()
 465         else:
 466             self.op_wb = ''
 467             self.op_dest_decl = ''

The finalize method mainly generates two kinds of code: the operand initialization code produced by makeConstructor, and the code that accesses the operand, such as a register read or write, produced by makeRead and makeWrite. makeRead is invoked when the operand is a source and makeWrite when it is a destination; an operand that is both gets both. As a result, the actual CPP statements that access the operand are generated. Let’s take a look at the makeRead and makeWrite definitions provided by the IntRegOperand class as an example.

 528 class IntRegOperand(Operand):
 ......
 556     def makeRead(self, predRead):
 557         if (self.ctype == 'float' or self.ctype == 'double'):
 558             error('Attempt to read integer register as FP')
 559         if self.read_code != None:
 560             return self.buildReadCode('readIntRegOperand')
 561 
 562         int_reg_val = ''
 563         if predRead:
 564             int_reg_val = 'xc->readIntRegOperand(this, _sourceIndex++)'
 565             if self.hasReadPred():
 566                 int_reg_val = '(%s) ? %s : 0' % \
 567                               (self.read_predicate, int_reg_val)
 568         else:
 569             int_reg_val = 'xc->readIntRegOperand(this, %d)' % self.src_reg_idx
 570 
 571         return '%s = %s;\n' % (self.base_name, int_reg_val)
 572 
 573     def makeWrite(self, predWrite):
 574         if (self.ctype == 'float' or self.ctype == 'double'):
 575             error('Attempt to write integer register as FP')
 576         if self.write_code != None:
 577             return self.buildWriteCode('setIntRegOperand')
 578 
 579         if predWrite:
 580             wp = 'true'
 581             if self.hasWritePred():
 582                 wp = self.write_predicate
 583 
 584             wcond = 'if (%s)' % (wp)
 585             windex = '_destIndex++'
 586         else:
 587             wcond = ''
 588             windex = '%d' % self.dest_reg_idx
 589 
 590         wb = '''
 591         %s
 592         {
 593             %s final_val = %s;
 594             xc->setIntRegOperand(this, %s, final_val);\n
 595             if (traceData) { traceData->setData(final_val); }
 596         }''' % (wcond, self.ctype, self.base_name, windex)
 597 
 598         return wb

Both methods first check that the operand's C type is compatible with IntRegOperand, raising an error if it is float or double. After that, they generate the CPP statements that access the operand and return them as a string.
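To see what these strings look like, the following sketch reproduces only the non-predicated paths of makeRead and makeWrite as free functions. The names make_read and make_write are hypothetical, and the operand name Data, the C type uint64_t, and the index 0 are illustrative values rather than anything GEM5 is guaranteed to produce for a particular microop.

```python
def make_read(base_name, src_reg_idx):
    """Non-predicated read path of IntRegOperand.makeRead, simplified."""
    int_reg_val = 'xc->readIntRegOperand(this, %d)' % src_reg_idx
    return '%s = %s;\n' % (base_name, int_reg_val)


def make_write(base_name, ctype, dest_reg_idx):
    """Non-predicated write path of IntRegOperand.makeWrite, simplified."""
    return '''
    {
        %s final_val = %s;
        xc->setIntRegOperand(this, %d, final_val);
        if (traceData) { traceData->setData(final_val); }
    }''' % (ctype, base_name, dest_reg_idx)


# Illustrative values: operand 'Data' of C type 'uint64_t' at index 0.
op_rd = make_read('Data', 0)
op_wb = make_write('Data', 'uint64_t', 0)
print(op_rd)
print(op_wb)
```

In finalize, strings like these end up in the op_rd and op_wb fields of the operand, ready to be pasted into the generated CPP methods.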

Populating proper operand class instances

We now understand that GEM5 uses various operand classes to represent different types of operands independently of the architecture. Then how can the ISA of each architecture use those classes to generate operand initialization code and the proper access code in CPP syntax? The answer is the finalize method we’ve just seen, but where and how are instances of those operand classes created?

InstObjParams containing all information required for substitutions

Now it is time to go back to InstObjParams again!

451     def defineMicroLoadOp(mnemonic, code, bigCode='',
452                           mem_flags="0", big=True, nonSpec=False,
453                           implicitStack=False):
......
461         # Build up the all register version of this micro op
462         iops = [InstObjParams(name, Name, 'X86ISA::LdStOp',
463                               { "code": code,
464                                 "ea_code": calculateEA,
465                                 "memDataSize": "dataSize" })]
466         if big:
467             iops += [InstObjParams(name, Name + "Big", 'X86ISA::LdStOp',
468                                    { "code": bigCode,
469                                      "ea_code": calculateEA,
470                                      "memDataSize": "dataSize" })]
471         for iop in iops:
472             header_output += MicroLdStOpDeclare.subst(iop)
473             decoder_output += MicroLdStOpConstructor.subst(iop)
474             exec_output += MicroLoadExecute.subst(iop)
475             exec_output += MicroLoadInitiateAcc.subst(iop)
476             exec_output += MicroLoadCompleteAcc.subst(iop)

You might remember that InstObjParams is used when substituting a template. As shown in lines 461-470 of defineMicroLoadOp, it builds iops, a list of InstObjParams instances. After the list is populated, each element is passed to the subst method of each template, as shown in lines 471-476. The subst method replaces the microop-specific parts of the template with the information provided by the InstObjParams instance it receives. Note that the code snippets, defined as a Python dictionary using { }, are passed to the constructor of the InstObjParams Python class. If you look up the code and calculateEA variables in defineMicroLoadOp, you can see that they are code snippets as well. Let’s take a look at the InstObjParams Python class.

gem5/src/arch/isa_parser.py

1413 class InstObjParams(object):
1414     def __init__(self, parser, mnem, class_name, base_class = '',
1415                  snippets = {}, opt_args = []):
1416         self.mnemonic = mnem
1417         self.class_name = class_name
1418         self.base_class = base_class
1419         if not isinstance(snippets, dict):
1420             snippets = {'code' : snippets}
1421         compositeCode = ' '.join(map(str, snippets.values()))
1422         self.snippets = snippets
1423
1424         self.operands = OperandList(parser, compositeCode)
1425
1426         # The header of the constructor declares the variables to be used
1427         # in the body of the constructor.
1428         header = ''
1429         header += '\n\t_numSrcRegs = 0;'
1430         header += '\n\t_numDestRegs = 0;'
1431         header += '\n\t_numFPDestRegs = 0;'
1432         header += '\n\t_numVecDestRegs = 0;'
1433         header += '\n\t_numVecElemDestRegs = 0;'
1434         header += '\n\t_numVecPredDestRegs = 0;'
1435         header += '\n\t_numIntDestRegs = 0;'
1436         header += '\n\t_numCCDestRegs = 0;'
1437
1438         self.constructor = header + \
1439                            self.operands.concatAttrStrings('constructor')
1440
1441         self.flags = self.operands.concatAttrLists('flags')
1442
1443         self.op_class = None
1444
1445         # Optional arguments are assumed to be either StaticInst flags
1446         # or an OpClass value.  To avoid having to import a complete
1447         # list of these values to match against, we do it ad-hoc
1448         # with regexps.
1449         for oa in opt_args:
1450             if instFlagRE.match(oa):
1451                 self.flags.append(oa)
1452             elif opClassRE.match(oa):
1453                 self.op_class = oa
1454             else:
1455                 error('InstObjParams: optional arg "%s" not recognized '
1456                       'as StaticInst::Flag or OpClass.' % oa)
1457
1458         # Make a basic guess on the operand class if not set.
1459         # These are good enough for most cases.
1460         if not self.op_class:
1461             if 'IsStore' in self.flags:
1462                 # The order matters here: 'IsFloating' and 'IsInteger' are
1463                 # usually set in FP instructions because of the base
1464                 # register
1465                 if 'IsFloating' in self.flags:
1466                     self.op_class = 'FloatMemWriteOp'
1467                 else:
1468                     self.op_class = 'MemWriteOp'
1469             elif 'IsLoad' in self.flags or 'IsPrefetch' in self.flags:
1470                 # The order matters here: 'IsFloating' and 'IsInteger' are
1471                 # usually set in FP instructions because of the base
1472                 # register
1473                 if 'IsFloating' in self.flags:
1474                     self.op_class = 'FloatMemReadOp'
1475                 else:
1476                     self.op_class = 'MemReadOp'
1477             elif 'IsFloating' in self.flags:
1478                 self.op_class = 'FloatAddOp'
1479             elif 'IsVector' in self.flags:
1480                 self.op_class = 'SimdAddOp'
1481             else:
1482                 self.op_class = 'IntAluOp'
1483
1484         # add flag initialization to contructor here to include
1485         # any flags added via opt_args
1486         self.constructor += makeFlagConstructor(self.flags)
1487
1488         # if 'IsFloating' is set, add call to the FP enable check
1489         # function (which should be provided by isa_desc via a declare)
1490         # if 'IsVector' is set, add call to the Vector enable check
1491         # function (which should be provided by isa_desc via a declare)
1492         if 'IsFloating' in self.flags:
1493             self.fp_enable_check = 'fault = checkFpEnableFault(xc);'
1494         elif 'IsVector' in self.flags:
1495             self.fp_enable_check = 'fault = checkVecEnableFault(xc);'
1496         else:
1497             self.fp_enable_check = ''

The main purpose of InstObjParams is to collect everything a template needs for substitution. It stores all the information passed to it, including the class name and the code snippets, which the subst method of the Template object later uses to replace the microop-specific parts of the implementation template. One important piece of information managed by InstObjParams is the operands field (line 1424). Note that the constructor of InstObjParams instantiates another object called OperandList.
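Two steps of the constructor above are simple enough to try out on their own: joining the snippets into one string (line 1421) and guessing a default op_class from the flags (lines 1460-1482). The sketch below is a simplified rewrite, not GEM5 code. The function name guess_op_class is hypothetical, and the flag list passed to it is an assumed example, since the real flags come from the parsed operands.

```python
def guess_op_class(flags):
    """Mirror the default op_class choice made in InstObjParams.__init__."""
    if 'IsStore' in flags:
        return 'FloatMemWriteOp' if 'IsFloating' in flags else 'MemWriteOp'
    if 'IsLoad' in flags or 'IsPrefetch' in flags:
        return 'FloatMemReadOp' if 'IsFloating' in flags else 'MemReadOp'
    if 'IsFloating' in flags:
        return 'FloatAddOp'
    if 'IsVector' in flags:
        return 'SimdAddOp'
    return 'IntAluOp'


# Snippets as passed by defineMicroLoadOp for the 'Ld' microop.
snippets = {
    "code": "Data = merge(Data, Mem, dataSize);",
    "ea_code": "EA = SegBase + "
               "bits(scale * Index + Base + disp, addressSize * 8 - 1, 0);",
    "memDataSize": "dataSize",
}

# All snippet values are joined into one string before operand parsing.
composite_code = ' '.join(map(str, snippets.values()))
print(composite_code)

# Assumed flag list for illustration only.
print(guess_op_class(['IsInteger', 'IsLoad']))
```

Note the order of the checks: a store or load that also carries IsFloating is classified as a floating-point memory operation before the plain IsFloating case is ever reached.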

OperandList parses operands from code snippets

OperandList parses the code snippets of a microop and generates Operand objects. This is one of the places where Operand objects are created. Each Operand provides information that is used to build the constructor and the other definitions required to implement a microop. OperandList creates instances of the Operand classes based on the operand keywords found in the code snippet. Note that the code passed to OperandList includes the second argument of the defineMicroLoadOp call.

defineMicroLoadOp('Ld', 'Data = merge(Data, Mem, dataSize);',

For example, in the above defineMicroLoadOp invocation, ‘Data = merge(Data, Mem, dataSize);’ is stored under the code key of the snippets dictionary. The InstObjParams constructor joins all snippet values (code, ea_code, and memDataSize) into compositeCode (line 1421), passes it to the OperandList constructor, and stores the result in its operands field. Note that the operand names appearing in this code snippet are the microop’s input and output operands. To understand the details, let’s take a look at the OperandList Python class.
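Before reading the full class, a stripped-down sketch of its core idea may help: scan the code for known operand names and treat an operand as a destination when it is directly followed by an assignment. This is not GEM5's implementation. The operand name list, both regular expressions, and the function scan_operands are assumptions made for this example, and compound assignments such as += are deliberately not handled.

```python
import re

# Hypothetical operand names; in GEM5 the real set comes from the ISA's
# operand definitions, not from a hard-coded list like this one.
OPERAND_NAMES = ['Data', 'Mem', 'Base', 'Index', 'SegBase', 'EA']
operands_re = re.compile(r'\b(' + '|'.join(OPERAND_NAMES) + r')\b')
# An '=' that is not part of '==' marks the preceding operand as a destination.
assign_re = re.compile(r'\s*=(?!=)')


def scan_operands(code):
    """Return {operand_name: {'is_src': bool, 'is_dest': bool}}."""
    found = {}
    for match in operands_re.finditer(code):
        name = match.group(1)
        # Try to match an assignment starting right after the operand name.
        is_dest = assign_re.match(code, match.end()) is not None
        entry = found.setdefault(name, {'is_src': False, 'is_dest': False})
        # An operand seen several times accumulates both roles.
        entry['is_dest'] = entry['is_dest'] or is_dest
        entry['is_src'] = entry['is_src'] or not is_dest
    return found


result = scan_operands('Data = merge(Data, Mem, dataSize);')
for name, roles in result.items():
    print(name, roles)
```

In this sketch Data ends up as both a source and a destination because it appears on both sides of the assignment, while Mem is only a source. dataSize is not in the operand name list, so it is ignored.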

1127 class OperandList(object):
1128     '''Find all the operands in the given code block.  Returns an operand
1129     descriptor list (instance of class OperandList).'''
1130     def __init__(self, parser, code):
1131         self.items = []
1132         self.bases = {}
1133         # delete strings and comments so we don't match on operands inside
1134         for regEx in (stringRE, commentRE):
1135             code = regEx.sub('', code)
1136         # search for operands
1137         next_pos = 0
1138         while 1:
1139             match = parser.operandsRE.search(code, next_pos)
1140             if not match:
1141                 # no more matches: we're done
1142                 break
1143             op = match.groups()
1144             # regexp groups are operand full name, base, and extension
1145             (op_full, op_base, op_ext) = op
1146             # If is a elem operand, define or update the corresponding
1147             # vector operand
1148             isElem = False
1149             if op_base in parser.elemToVector:
1150                 isElem = True
1151                 elem_op = (op_base, op_ext)
1152                 op_base = parser.elemToVector[op_base]
1153                 op_ext = '' # use the default one
1154             # if the token following the operand is an assignment, this is
1155             # a destination (LHS), else it's a source (RHS)
1156             is_dest = (assignRE.match(code, match.end()) != None)
1157             is_src = not is_dest
1158
1159             # see if we've already seen this one
1160             op_desc = self.find_base(op_base)
1161             if op_desc:
1162                 if op_ext and op_ext != '' and op_desc.ext != op_ext:
1163                     error ('Inconsistent extensions for operand %s: %s - %s' \
1164                             % (op_base, op_desc.ext, op_ext))
1165                 op_desc.is_src = op_desc.is_src or is_src
1166                 op_desc.is_dest = op_desc.is_dest or is_dest
1167                 if isElem:
1168                     (elem_base, elem_ext) = elem_op
1169                     found = False
1170                     for ae in op_desc.active_elems:
1171                         (ae_base, ae_ext) = ae
1172                         if ae_base == elem_base:
1173                             if ae_ext != elem_ext:
1174                                 error('Inconsistent extensions for elem'
1175                                       ' operand %s' % elem_base)
1176                             else:
1177                                 found = True
1178                     if not found:
1179                         op_desc.active_elems.append(elem_op)
1180             else:
1181                 # new operand: create new descriptor
1182                 op_desc = parser.operandNameMap[op_base](parser,
1183                     op_full, op_ext, is_src, is_dest)
1184                 # if operand is a vector elem, add the corresponding vector
1185                 # operand if not already done
1186                 if isElem:
1187                     op_desc.elemExt = elem_op[1]
1188                     op_desc.active_elems = [elem_op]
1189                 self.append(op_desc)
1190             # start next search after end of current match
1191             next_pos = match.end()
1192         self.sort()
1193         # enumerate source & dest register operands... used in building
1194         # constructor later
1195         self.numSrcRegs = 0
1196         self.numDestRegs = 0
1197         self.numFPDestRegs = 0
1198         self.numIntDestRegs = 0
1199         self.numVecDestRegs = 0
1200         self.numVecPredDestRegs = 0
1201         self.numCCDestRegs = 0
1202         self.numMiscDestRegs = 0
1203         self.memOperand = None
1204
1205         # Flags to keep track if one or more operands are to be read/written
1206         # conditionally.
1207         self.predRead = False
1208         self.predWrite = False
1209
1210         for op_desc in self.items:
1211             if op_desc.isReg():
1212                 if op_desc.is_src:
1213                     op_desc.src_reg_idx = self.numSrcRegs
1214                     self.numSrcRegs += 1
1215                 if op_desc.is_dest:
1216                     op_desc.dest_reg_idx = self.numDestRegs
1217                     self.numDestRegs += 1
1218                     if op_desc.isFloatReg():
1219                         self.numFPDestRegs += 1
1220                     elif op_desc.isIntReg():
1221                         self.numIntDestRegs += 1
1222                     elif op_desc.isVecReg():
1223                         self.numVecDestRegs += 1
1224                     elif op_desc.isVecPredReg():
1225                         self.numVecPredDestRegs += 1
1226                     elif op_desc.isCCReg():
1227                         self.numCCDestRegs += 1
1228                     elif op_desc.isControlReg():
1229                         self.numMiscDestRegs += 1
1230             elif op_desc.isMem():
1231                 if self.memOperand:
1232                     error("Code block has more than one memory operand.")
1233                 self.memOperand = op_desc
1234
1235             # Check if this operand has read/write predication. If true, then
1236             # the microop will dynamically index source/dest registers.
1237             self.predRead = self.predRead or op_desc.hasReadPred()
1238             self.predWrite = self.predWrite or op_desc.hasWritePred()
1239
1240         if parser.maxInstSrcRegs < self.numSrcRegs:
1241             parser.maxInstSrcRegs = self.numSrcRegs
1242         if parser.maxInstDestRegs < self.numDestRegs:
1243             parser.maxInstDestRegs = self.numDestRegs
1244         if parser.maxMiscDestRegs < self.numMiscDestRegs:
1245             parser.maxMiscDestRegs = self.numMiscDestRegs
1246
1247         # now make a final pass to finalize op_desc fields that may depend
1248         # on the register enumeration
1249         for op_desc in self.items:
1250             op_desc.finalize(self.predRead, self.predWrite)

OperandList parses a code snippet with regular expressions. Whenever an operand keyword matches, it first checks whether that operand has already been seen by calling the find_base method of the OperandList class. If a descriptor already exists, it is reused and its is_src and is_dest flags are updated (1161-1179). If there is no match, it looks up parser.operandNameMap, which maps each operand keyword to a particular Operand class, and instantiates a new descriptor (1180-1189). Note that the matched keyword can be of any type that is represented by a class inheriting from Operand. Each newly created descriptor is stored in self.items through self.append(op_desc) on line 1189.

After parsing the operands, it iterates over every operand stored in self.items and invokes the finalize function of each one (1249-1250). The finalize function translates each operand into code blocks that read or update a register, depending on whether the operand is a source or a destination and on its type.
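The scanning step described above can be sketched in a few lines of Python. This is a minimal illustration, not the gem5 code: the operand names, the two regular expressions, and the OpDesc class are hypothetical stand-ins, and the sorting and register enumeration that the real OperandList performs are left out.

```python
import re

# Hypothetical, simplified stand-ins for the parser's operand table and regexes.
OPERAND_NAMES = ("Data", "Mem", "Base", "Index")
operand_re = re.compile(r"\b(%s)\b" % "|".join(OPERAND_NAMES))
# An assignment is '=' that is not followed by another '=' (so '==' is skipped).
assign_re = re.compile(r"\s*=(?!=)")


class OpDesc:
    """Tiny descriptor: remembers whether an operand is read and/or written."""

    def __init__(self, name, is_src, is_dest):
        self.name = name
        self.is_src = is_src
        self.is_dest = is_dest


def scan_operands(code):
    """Return one descriptor per distinct operand, in order of first use."""
    items = []
    by_name = {}
    for match in operand_re.finditer(code):
        name = match.group(1)
        # Destination if an assignment follows the operand, source otherwise.
        is_dest = assign_re.match(code, match.end()) is not None
        is_src = not is_dest
        desc = by_name.get(name)
        if desc:
            # Already seen: merge the flags into the existing descriptor.
            desc.is_src = desc.is_src or is_src
            desc.is_dest = desc.is_dest or is_dest
        else:
            # New operand: create a descriptor and remember it.
            desc = OpDesc(name, is_src, is_dest)
            by_name[name] = desc
            items.append(desc)
    return items


ops = scan_operands("Data = merge(Data, Mem, dataSize);")
summary = [(d.name, d.is_src, d.is_dest) for d in ops]
print(summary)
```

For the snippet used by the Ld microop, Data ends up as both a source and a destination, while Mem is only a source.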

Operand parsing and operandNameMap

When a new keyword is found in the code snippet, the parser looks it up in operandNameMap to find the matching Operand class. So where and how is operandNameMap initialized with all the information needed to map keywords to Operand classes? The answer is in the parsing!

gem5/src/arch/x86/isa/operands.isa

def operands {{
        'SrcReg1':       foldInt('src1', 'foldOBit', 1),
        'SSrcReg1':      intReg('src1', 1),
        'SrcReg2':       foldInt('src2', 'foldOBit', 2),
        'SSrcReg2':      intReg('src2', 1),
        'Index':         foldInt('index', 'foldABit', 3),
        'Base':          foldInt('base', 'foldABit', 4),
        'DestReg':       foldInt('dest', 'foldOBit', 5),
        'SDestReg':      intReg('dest', 5),
        'Data':          foldInt('data', 'foldOBit', 6),
        'DataLow':       foldInt('dataLow', 'foldOBit', 6),
        'DataHi':        foldInt('dataHi', 'foldOBit', 6),
        'ProdLow':       impIntReg(0, 7),
        'ProdHi':        impIntReg(1, 8),
        'Quotient':      impIntReg(2, 9),
        'Remainder':     impIntReg(3, 10),
        'Divisor':       impIntReg(4, 11),
        'DoubleBits':    impIntReg(5, 11),
        'Rax':           intReg('(INTREG_RAX)', 12),
        'Rbx':           intReg('(INTREG_RBX)', 13),
        'Rcx':           intReg('(INTREG_RCX)', 14),
        'Rdx':           intReg('(INTREG_RDX)', 15),
        'Rsp':           intReg('(INTREG_RSP)', 16),
        'Rbp':           intReg('(INTREG_RBP)', 17),
        'Rsi':           intReg('(INTREG_RSI)', 18),
        'Rdi':           intReg('(INTREG_RDI)', 19),
...


As the def operands block above shows, each architecture defines the list of operands that its instructions can use. Although it looks like a Python function definition, note that the file extension is isa, not py, and that this is not valid Python function syntax anyway. So the parser needs to parse this Python-like block!

gem5/src/arch/isa_parser.py

    # Define the mapping from operand names to operand classes and
    # other traits.  Stored in operandNameMap.
    def p_def_operands(self, t):
        'def_operands : DEF OPERANDS CODELIT SEMI'
        if not hasattr(self, 'operandTypeMap'):
            error(t.lineno(1),
                  'error: operand types must be defined before operands')
        try:
            user_dict = eval('{' + t[3] + '}', self.exportContext)
        except Exception, exc:
            if debug:
                raise
            error(t.lineno(1), 'In def operands: %s' % exc)
        self.buildOperandNameMap(user_dict, t.lexer.lineno)

The def operands block is parsed by isa_parser like any other isa definition. As the grammar rule above shows, when a def operands block is found, the parser evaluates its body as a Python dictionary, passes it to buildOperandNameMap, and thereby generates operandNameMap. As a result, operandNameMap maps each operand keyword to the Operand class used for accessing that operand. For example, the Data keyword in the def operands block above is translated into an IntRegOperand object.
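The trick in the grammar rule is that the body of the block is just the inside of a Python dictionary literal, so wrapping it in braces and calling eval yields a real dictionary. Here is a small sketch of that idea in Python 3. The fold_int helper and the tuple layout it returns are assumptions made for illustration, and the final name mapping only mimics what buildOperandNameMap is described as doing.

```python
def fold_int(idx, fold_bit, op_id):
    # Hypothetical helper: returns a descriptor tuple whose first field names
    # the operand kind. The exact tuple layout is an assumption.
    return ("IntReg", "uqw", "INTREG_FOLDED(%s, %s)" % (idx, fold_bit),
            "IsInteger", op_id)


# Names that the evaluated block is allowed to call.
export_context = {"foldInt": fold_int}

# Body of a 'def operands {{ ... }}' block, as the lexer would hand it over.
body = """
    'Data': foldInt('data', 'foldOBit', 6),
    'Base': foldInt('base', 'foldABit', 4),
"""

# Wrapping the body in braces turns it into a dictionary literal.
user_dict = eval("{" + body + "}", export_context)

# Derive an operand class name from the first field of each descriptor.
operand_class_names = {name: desc[0] + "Operand"
                       for name, desc in user_dict.items()}
print(operand_class_names["Data"])
```

Because the block is evaluated rather than parsed by hand, the isa file can call helper functions such as foldInt, as long as they are present in the context handed to eval.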

finalize example.

    def finalize(self, predRead, predWrite):
        self.flags = self.getFlags()
        self.constructor = self.makeConstructor(predRead, predWrite)
        self.op_decl = self.makeDecl()

        if self.is_src:
            self.op_rd = self.makeRead(predRead)
            self.op_src_decl = self.makeDecl()
        else:
            self.op_rd = ''
            self.op_src_decl = ''

        if self.is_dest:
            self.op_wb = self.makeWrite(predWrite)
            self.op_dest_decl = self.makeDecl()
        else:
            self.op_wb = ''
            self.op_dest_decl = ''

After all arguments are translated into proper Operand objects, the finalize method of those objects is invoked to generate CPP statements. Let’s take a look at the IntRegOperand object, because the Data keyword is mapped to it. Because IntRegOperand does not override finalize, the finalize method of the base class (Operand) is invoked. As a consequence, makeRead, makeWrite, or both are invoked on the IntRegOperand, depending on how the operand is used. Because the Data keyword is located on the LHS of the statement, it is marked as a destination, and finalize therefore invokes makeWrite. The generated result is stored in the op_wb field of the IntRegOperand object. We will see how this field replaces a placeholder in the template of the Ld micro-load instruction. Also note that the other op_xx fields are generated in finalize as well (for example, op_decl for declaring variables and op_rd for read operations).

src_reg_constructor = '\n\t_srcRegIdx[_numSrcRegs++] = RegId(%s, %s);'
dst_reg_constructor = '\n\t_destRegIdx[_numDestRegs++] = RegId(%s, %s);'


class IntRegOperand(Operand):
    reg_class = 'IntRegClass'
......
    def makeConstructor(self, predRead, predWrite):
        c_src = ''
        c_dest = ''

        if self.is_src:
            c_src = src_reg_constructor % (self.reg_class, self.reg_spec)
            if self.hasReadPred():
                c_src = '\n\tif (%s) {%s\n\t}' % \
                        (self.read_predicate, c_src)

        if self.is_dest:
            c_dest = dst_reg_constructor % (self.reg_class, self.reg_spec)
            c_dest += '\n\t_numIntDestRegs++;'
            if self.hasWritePred():
                c_dest = '\n\tif (%s) {%s\n\t}' % \
                         (self.write_predicate, c_dest)
......
    def makeWrite(self, predWrite):
        if (self.ctype == 'float' or self.ctype == 'double'):
            error('Attempt to write integer register as FP')
        if self.write_code != None:
            return self.buildWriteCode('setIntRegOperand')

        if predWrite:
            wp = 'true'
            if self.hasWritePred():
                wp = self.write_predicate

            wcond = 'if (%s)' % (wp)
            windex = '_destIndex++'
        else:
            wcond = ''
            windex = '%d' % self.dest_reg_idx

        wb = '''
        %s
        {
            %s final_val = %s;
            xc->setIntRegOperand(this, %s, final_val);\n
            if (traceData) { traceData->setData(final_val); }
        }''' % (wcond, self.ctype, self.base_name, windex)

        return wb

As shown in the above code, makeWrite of the IntRegOperand class also relies on string substitution. The final_val local variable is declared with the operand's C type (self.ctype, which is uint64_t for Data), and self.base_name, the name of the keyword (Data), is assigned to it. After that, by invoking setIntRegOperand, it writes final_val to the destination register operand, which is accessible through the ExecContext (xc). The substituted string is returned by makeWrite and stored in op_wb by finalize, but note that it has not yet been emitted as a CPP statement in the automatically generated code. At this point the code has been parsed into an OperandList and stored in the operands field of the InstObjParams. Remember that InstObjParams is used to fill in the generic template and generate the microcode implementation!
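To make the interplay between finalize, makeRead, and makeWrite concrete, here is a stripped-down sketch. The class and method names are hypothetical, and predication, tracing, and declarations are left out; it only shows how the is_src and is_dest flags decide which C++ strings get produced.

```python
class SketchIntRegOperand:
    """Simplified, hypothetical stand-in for an integer register operand."""

    def __init__(self, base_name, ctype, is_src, is_dest,
                 src_reg_idx=0, dest_reg_idx=0):
        self.base_name = base_name
        self.ctype = ctype
        self.is_src = is_src
        self.is_dest = is_dest
        self.src_reg_idx = src_reg_idx
        self.dest_reg_idx = dest_reg_idx
        self.op_rd = ""
        self.op_wb = ""

    def make_read(self):
        # C++ statement that loads the register into a local variable.
        return "%s = xc->readIntRegOperand(this, %d);" % (
            self.base_name, self.src_reg_idx)

    def make_write(self):
        # C++ block that writes the local variable back to the register.
        return (
            "{\n"
            "    %s final_val = %s;\n"
            "    xc->setIntRegOperand(this, %d, final_val);\n"
            "}" % (self.ctype, self.base_name, self.dest_reg_idx)
        )

    def finalize(self):
        # Only generate the strings that match how the operand is used.
        self.op_rd = self.make_read() if self.is_src else ""
        self.op_wb = self.make_write() if self.is_dest else ""


# Data is both read and written; Base is only read.
data = SketchIntRegOperand("Data", "uint64_t", True, True,
                           src_reg_idx=2, dest_reg_idx=0)
base = SketchIntRegOperand("Base", "uint64_t", True, False, src_reg_idx=1)
for op in (data, base):
    op.finalize()
print(data.op_wb)
```

Nothing is written to a file here. The strings simply sit in op_rd and op_wb until a template asks for them.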

In a nutshell: generating CPP class for microop

Although we spent a lot of time covering details of the parser such as Template and Operands, one of the most important goals of this post is to understand how the CPP class associated with a microop can be generated automatically. In the previous post, we only found that getAllocator of the Python class associated with a microop generates the constructor call that instantiates the CPP class defined for the microop. However, to implement the CPP class, we also need the class definition and the member functions that implement the semantics of the microop, in addition to the constructor itself.

MicroLdStOpDeclare: generating CPP class for micro-load operations

Although there are several microops related to load operations, they share the same skeleton (represented as a Template) because load operations have common characteristics. First of all, the MicroLdStOpDeclare template is used to generate the CPP class declaration.

def template MicroLdStOpDeclare {{
    class %(class_name)s : public %(base_class)s
    {
      public:
        %(class_name)s(ExtMachInst _machInst,
                const char * instMnem, uint64_t setFlags,
                uint8_t _scale, InstRegIndex _index, InstRegIndex _base,
                uint64_t _disp, InstRegIndex _segment,
                InstRegIndex _data,
                uint8_t _dataSize, uint8_t _addressSize,
                Request::FlagsType _memFlags);

        Fault execute(ExecContext *, Trace::InstRecord *) const;
        Fault initiateAcc(ExecContext *, Trace::InstRecord *) const;
        Fault completeAcc(PacketPtr, ExecContext *, Trace::InstRecord *) const;
    };
}};

Based on the InstObjParams passed to defineMicroLoadOp, microop-specific strings fill in the placeholders of the template. Note that the generated class also declares the constructor we were looking for.

def template MicroLdStOpConstructor {{
    %(class_name)s::%(class_name)s(
            ExtMachInst machInst, const char * instMnem, uint64_t setFlags,
            uint8_t _scale, InstRegIndex _index, InstRegIndex _base,
            uint64_t _disp, InstRegIndex _segment,
            InstRegIndex _data,
            uint8_t _dataSize, uint8_t _addressSize,
            Request::FlagsType _memFlags) :
        %(base_class)s(machInst, "%(mnemonic)s", instMnem, setFlags,
                _scale, _index, _base,
                _disp, _segment, _data,
                _dataSize, _addressSize, _memFlags, %(op_class)s)
    {
        %(constructor)s;
    }
}};

The constructor’s implementation is also generated by another template substitution, MicroLdStOpConstructor, shown above.

MicroLoadExecute: template used to implement micro-load operation

More importantly, in addition to the constructor, each microop has to implement several member functions that give it its semantics.
Let’s take a look at the MicroLoadExecute template. The function generated by this template is called execute, and most of the Ld-style microops implement it. However, depending on the semantics of the micro-load instruction, a different implementation of execute is generated: different InstObjParams lead to different replacements in the template, and each produces its own implementation as a consequence.

def template MicroLoadExecute {{
    Fault %(class_name)s::execute(ExecContext *xc,
          Trace::InstRecord *traceData) const
    {
        Fault fault = NoFault;
        Addr EA;

        %(op_decl)s;
        %(op_rd)s;
        %(ea_code)s;
        DPRINTF(X86, "%s : %s: The address is %#x\n", instMnem, mnemonic, EA);

        fault = readMemAtomic(xc, traceData, EA, Mem, dataSize, memFlags);

        if (fault == NoFault) {
            %(code)s;
        } else if (memFlags & Request::PREFETCH) {
            // For prefetches, ignore any faults/exceptions.
            return NoFault;
        }
        if(fault == NoFault)
        {
            %(op_wb)s;
        }

        return fault;
    }
}};

The above template contains placeholders of the form %(name)s. When the subst function of the corresponding template object is invoked, all those placeholders are replaced. For this replacement, we built the myDict dictionary. Note that myDict is initialized from the information provided by the InstObjParams, such as its operands field. Therefore, whenever the substitution encounters a %(name)s placeholder, it looks up name in myDict to retrieve the proper replacement. For example, class_name is provided by the InstObjParams, and op_wb is the CPP statements translated from the keyword Data to write the result back to the output register. Let’s take a look at what the execute function looks like after substitution.
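The substitution itself can be imitated with plain Python string formatting against a dictionary. The sketch below is an assumption-laden miniature: the template is shortened, and my_dict is filled by hand rather than derived from an InstObjParams. One detail worth noticing is that with plain Python formatting, a literal percent sign in the C++ code (such as the %#x in the DPRINTF call) has to be doubled so that it is not mistaken for a placeholder.

```python
# Shortened, hypothetical version of an execute template.
template = """Fault %(class_name)s::execute(ExecContext *xc) const
{
    %(op_rd)s
    %(code)s
    DPRINTF(X86, "The address is %%#x", EA);
    %(op_wb)s
}"""

# Filled by hand here; in the post this information comes from InstObjParams.
my_dict = {
    "class_name": "Ld",
    "op_rd": "Data = xc->readIntRegOperand(this, 2);",
    "code": "Data = merge(Data, Mem, dataSize);",
    "op_wb": "xc->setIntRegOperand(this, 0, Data);",
}

# Every %(name)s placeholder is looked up in the dictionary.
generated = template % my_dict
print(generated)
```

Swapping in a different dictionary (another class name, another code snippet) yields a different execute function from the same template, which is exactly why one template can serve many micro-load variants.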

gem5/build/X86/arch/x86/generated/exec-ns.cc.inc

19101     Fault Ld::execute(ExecContext *xc,
19102           Trace::InstRecord *traceData) const
19103     {
19104         Fault fault = NoFault;
19105         Addr EA;
19106
19107         uint64_t Index = 0;
19108 uint64_t Base = 0;
19109 uint64_t Data = 0;
19110 uint64_t SegBase = 0;
19111 uint64_t Mem;
19112 ;
19113         Index = xc->readIntRegOperand(this, 0);
19114 Base = xc->readIntRegOperand(this, 1);
19115 Data = xc->readIntRegOperand(this, 2);
19116 SegBase = xc->readMiscRegOperand(this, 3);
19117 ;
19118         EA = SegBase + bits(scale * Index + Base + disp, addressSize * 8 - 1, 0);;
19119         DPRINTF(X86, "%s : %s: The address is %#x\n", instMnem, mnemonic, EA);
19120
19121         fault = readMemAtomic(xc, traceData, EA, Mem, dataSize, memFlags);
19122
19123         if (fault == NoFault) {
19124             Data = merge(Data, Mem, dataSize);;
19125         } else if (memFlags & Request::PREFETCH) {
19126             // For prefetches, ignore any faults/exceptions.
19127             return NoFault;
19128         }
19129         if(fault == NoFault)
19130         {
19131
19132
19133         {
19134             uint64_t final_val = Data;
19135             xc->setIntRegOperand(this, 0, final_val);
19136
19137             if (traceData) { traceData->setData(final_val); }
19138         };
19139         }
19140
19141         return fault;
19142     }

As shown in lines 19133-19138, the CPP statements translated from the Data keyword of the code snippet appear in the generated function as the result of replacing op_wb.

This post is licensed under CC BY 4.0 by the author.
