O3 Cpu Gem5
Python derives O3CPU through DerivO3CPU class
To understand a particular processor model in GEM5, it is easiest to start from the script that instantiates it. Many of the default scripts shipped with GEM5 use DerivO3CPU to attach the O3 CPU to the system.
gem5/src/cpu/o3/O3CPU.py
class DerivO3CPU(BaseCPU):
    type = 'DerivO3CPU'
    cxx_header = 'cpu/o3/deriv.hh'

    @classmethod
    def memory_mode(cls):
        return 'timing'

    @classmethod
    def require_caches(cls):
        return True

    @classmethod
    def support_take_over(cls):
        return True

    activity = Param.Unsigned(0, "Initial count")

    cacheStorePorts = Param.Unsigned(200, "Cache Ports. "
          "Constrains stores only.")
    cacheLoadPorts = Param.Unsigned(200, "Cache Ports. "
          "Constrains loads only.")

    decodeToFetchDelay = Param.Cycles(1, "Decode to fetch delay")
    renameToFetchDelay = Param.Cycles(1, "Rename to fetch delay")
    iewToFetchDelay = Param.Cycles(1, "Issue/Execute/Writeback to fetch "
                                   "delay")
    commitToFetchDelay = Param.Cycles(1, "Commit to fetch delay")
    fetchWidth = Param.Unsigned(8, "Fetch width")
    fetchBufferSize = Param.Unsigned(64, "Fetch buffer size in bytes")
    fetchQueueSize = Param.Unsigned(32, "Fetch queue size in micro-ops "
                                    "per-thread")
DerivO3CPU is the class that GEM5 run scripts use to instantiate the O3 CPU. Like the m5 objects of the other processors, it inherits from the BaseCPU m5 class. It also declares the parameters of the O3 CPU, which the C++ implementation of this class later reads through DerivO3CPUParams.
gem5/src/cpu/o3/deriv.hh
#ifndef __CPU_O3_DERIV_HH__
#define __CPU_O3_DERIV_HH__

#include "cpu/o3/cpu.hh"
#include "cpu/o3/impl.hh"
#include "params/DerivO3CPU.hh"

class DerivO3CPU : public FullO3CPU<O3CPUImpl>
{
  public:
    DerivO3CPU(DerivO3CPUParams *p)
        : FullO3CPU<O3CPUImpl>(p)
    { }
};

#endif // __CPU_O3_DERIV_HH__
Contrary to my expectation, the DerivO3CPU class does not define anything that emulates the O3 CPU; it only inherits from FullO3CPU, instantiated with O3CPUImpl as the template argument. We can therefore reasonably guess that the whole implementation lives in FullO3CPU. Before going deeper, let's take a look at the class hierarchy of this CPU.
gem5/src/cpu/o3/cpu.hh
class BaseO3CPU : public BaseCPU
{
    //Stuff that's pretty ISA independent will go here.
  public:
    BaseO3CPU(BaseCPUParams *params);

    void regStats();
};

/**
 * FullO3CPU class, has each of the stages (fetch through commit)
 * within it, as well as all of the time buffers between stages. The
 * tick() function for the CPU is defined here.
 */
template <class Impl>
class FullO3CPU : public BaseO3CPU
{
  public:
    // Typedefs from the Impl here.
    typedef typename Impl::CPUPol CPUPolicy;
    typedef typename Impl::DynInstPtr DynInstPtr;
    typedef typename Impl::O3CPU O3CPU;

    using VecElem = TheISA::VecElem;
    using VecRegContainer = TheISA::VecRegContainer;

    using VecPredRegContainer = TheISA::VecPredRegContainer;

    typedef O3ThreadState<Impl> ImplState;
    typedef O3ThreadState<Impl> Thread;

    typedef typename std::list<DynInstPtr>::iterator ListIt;

    friend class O3ThreadContext<Impl>;
......
FullO3CPU is the main class of the O3 CPU. It inherits from BaseO3CPU, which in turn inherits from BaseCPU. In GEM5, essentially every CPU class inherits from BaseCPU. Remember that the DerivO3CPU m5 object also inherits from the BaseCPU m5 object; that Python-side hierarchy generates the interfaces through which the C++ classes access the parameters of DerivO3CPU and BaseCPU. To understand how these two classes relate, it helps to look at their constructors.
gem5/src/cpu/o3/cpu.cc
BaseO3CPU::BaseO3CPU(BaseCPUParams *params)
    : BaseCPU(params)
{
}
......
template <class Impl>
FullO3CPU<Impl>::FullO3CPU(DerivO3CPUParams *params)
    : BaseO3CPU(params),
      itb(params->itb),
      dtb(params->dtb),
      tickEvent([this]{ tick(); }, "FullO3CPU tick",
                false, Event::CPU_Tick_Pri),
      threadExitEvent([this]{ exitThreads(); }, "FullO3CPU exit threads",
                      false, Event::CPU_Exit_Pri),
#ifndef NDEBUG
      instcount(0),
#endif
      removeInstsThisCycle(false),
      fetch(this, params),
      decode(this, params),
      rename(this, params),
      iew(this, params),
      commit(this, params),

      /* It is mandatory that all SMT threads use the same renaming mode as
       * they are sharing registers and rename */
      vecMode(RenameMode<TheISA::ISA>::init(params->isa[0])),
      regFile(params->numPhysIntRegs,
              params->numPhysFloatRegs,
              params->numPhysVecRegs,
              params->numPhysVecPredRegs,
              params->numPhysCCRegs,
              vecMode),

      freeList(name() + ".freelist", &regFile),

      rob(this, params),

      scoreboard(name() + ".scoreboard",
                 regFile.totalNumPhysRegs()),

      isa(numThreads, NULL),

      timeBuffer(params->backComSize, params->forwardComSize),
      fetchQueue(params->backComSize, params->forwardComSize),
      decodeQueue(params->backComSize, params->forwardComSize),
      renameQueue(params->backComSize, params->forwardComSize),
      iewQueue(params->backComSize, params->forwardComSize),
      activityRec(name(), NumStages,
                  params->backComSize + params->forwardComSize,
                  params->activity),

      globalSeqNum(1),
      system(params->system),
      lastRunningCycle(curCycle())
Note that FullO3CPU passes params to BaseO3CPU, which passes them on to the
BaseCPU constructor. Remember that every processor in GEM5 must build on
BaseCPU in addition to its own semantics. Because C++ constructs base classes before
members, the BaseCPU part is configured first from the passed parameters, and only then
does FullO3CPU initialize its own member fields (the stages, the register file, the ROB,
and so on) from the same params.
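To make the constructor chain concrete, here is a minimal sketch of the same pattern outside GEM5. All the `Toy*` names are hypothetical stand-ins I made up for illustration; the only thing borrowed from GEM5 is the idea that the derived params struct extends the base params struct, so the same pointer can travel up the chain.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a generated base params struct.
struct ToyBaseCPUParams {
    std::string name;
    unsigned numThreads;
};

// The derived params struct extends the base one, so a pointer to it
// converts implicitly to a pointer to the base params.
struct ToyDerivO3CPUParams : ToyBaseCPUParams {
    unsigned fetchWidth;
};

class ToyBaseCPU {
  public:
    explicit ToyBaseCPU(const ToyBaseCPUParams *p)
        : _name(p->name), _numThreads(p->numThreads) {}
    const std::string &name() const { return _name; }
    unsigned numThreads() const { return _numThreads; }
  private:
    std::string _name;
    unsigned _numThreads;
};

// Middle layer: simply forwards the params to the base class.
class ToyBaseO3CPU : public ToyBaseCPU {
  public:
    explicit ToyBaseO3CPU(const ToyBaseCPUParams *p) : ToyBaseCPU(p) {}
};

class ToyFullO3CPU : public ToyBaseO3CPU {
  public:
    explicit ToyFullO3CPU(const ToyDerivO3CPUParams *p)
        : ToyBaseO3CPU(p),            // base subobjects are built first
          _fetchWidth(p->fetchWidth)  // then this class's own members
    {}
    unsigned fetchWidth() const { return _fetchWidth; }
  private:
    unsigned _fetchWidth;
};
```

Filling a `ToyDerivO3CPUParams` and constructing `ToyFullO3CPU cpu(&params);` gives an object whose base part was configured from the base fields and whose own members were configured from the derived fields, all from a single params pointer.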
One interesting thing to note is that FullO3CPU is a class template that takes
another class, called Impl, as its template parameter; Impl must be replaced with a concrete class.
This is why DerivO3CPU inherits from FullO3CPU<O3CPUImpl>.
gem5/src/cpu/o3/impl.hh
38 // Forward declarations.
39 template <class Impl>
40 class BaseO3DynInst;
41
42 template <class Impl>
43 class FullO3CPU;
44
45 /** Implementation specific struct that defines several key types to the
46 * CPU, the stages within the CPU, the time buffers, and the DynInst.
47 * The struct defines the ISA, the CPU policy, the specific DynInst, the
48 * specific O3CPU, and all of the structs from the time buffers to do
49 * communication.
50 * This is one of the key things that must be defined for each hardware
51 * specific CPU implementation.
52 */
53 struct O3CPUImpl
54 {
55 /** The type of MachInst. */
56 typedef TheISA::MachInst MachInst;
57
58 /** The CPU policy to be used, which defines all of the CPU stages. */
59 typedef SimpleCPUPolicy<O3CPUImpl> CPUPol;
60
61 /** The DynInst type to be used. */
62 typedef BaseO3DynInst<O3CPUImpl> DynInst;
63
64 /** The refcounted DynInst pointer to be used. In most cases this is
65 * what should be used, and not DynInst *.
66 */
67 typedef RefCountingPtr<DynInst> DynInstPtr;
68 typedef RefCountingPtr<const DynInst> DynInstConstPtr;
69
70 /** The O3CPU type to be used. */
71 typedef FullO3CPU<O3CPUImpl> O3CPU;
72
73 /** Same typedef, but for CPUType. BaseDynInst may not always use
74 * an O3 CPU, so it's clearer to call it CPUType instead in that
75 * case.
76 */
77 typedef O3CPU CPUType;
78
79 enum {
80 MaxWidth = 8,
81 MaxThreads = 4
82 };
83 };
GEM5 defines a struct called O3CPUImpl that instantiates the class templates associated with the O3 CPU. One of them is FullO3CPU (Line 71): instantiating FullO3CPU with O3CPUImpl yields a complete class, which is given the alias O3CPU. The O3CPU type is used frequently in the rest of the O3 CPU implementation to refer to the CPU. The other class templates, such as SimpleCPUPolicy and BaseO3DynInst, are completed with O3CPUImpl in the same way. Let's revisit the FullO3CPU class to see how O3CPUImpl is used as the Impl template argument.
gem5/src/cpu/o3/cpu.hh
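The circular-looking part (a struct that names `FullO3CPU<O3CPUImpl>` while FullO3CPU itself reads types back out of the struct) is easier to see in a stripped-down sketch. The `Toy*` names below are hypothetical and only mirror the shape of the GEM5 code; forward declarations are enough for the struct because a typedef does not instantiate the template.

```cpp
#include <type_traits>

// Forward declarations: the traits struct only needs the template names.
template <class Impl> class ToyDynInst;
template <class Impl> class ToyFullCPU;

// Hypothetical analogue of O3CPUImpl: it passes itself as the template
// argument of every class template it names.
struct ToyImpl {
    typedef ToyDynInst<ToyImpl> DynInst;
    typedef ToyFullCPU<ToyImpl> O3CPU;
    enum { MaxWidth = 8, MaxThreads = 4 };
};

template <class Impl>
class ToyDynInst {
  public:
    // An instruction can refer back to the CPU type through Impl.
    typedef typename Impl::O3CPU CPUType;
};

template <class Impl>
class ToyFullCPU {
  public:
    // Pull the concrete types back out of Impl.
    typedef typename Impl::DynInst DynInst;
    typedef typename Impl::O3CPU O3CPU;
    static constexpr int maxWidth() { return Impl::MaxWidth; }
};

// Thin concrete class, in the spirit of DerivO3CPU.
class ToyDerivCPU : public ToyFullCPU<ToyImpl> {};
```

With these definitions, `ToyDerivCPU::O3CPU` and `ToyDerivCPU::DynInst::CPUType` both name `ToyFullCPU<ToyImpl>`, which is the property the real code relies on when different components refer to "the CPU type" through Impl.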
template <class Impl>
class FullO3CPU : public BaseO3CPU
{
  public:
    // Typedefs from the Impl here.
    typedef typename Impl::CPUPol CPUPolicy;
    typedef typename Impl::DynInstPtr DynInstPtr;
    typedef typename Impl::O3CPU O3CPU;
......
  protected:
    /** The fetch stage. */
    typename CPUPolicy::Fetch fetch;

    /** The decode stage. */
    typename CPUPolicy::Decode decode;

    /** The dispatch stage. */
    typename CPUPolicy::Rename rename;

    /** The issue/execute/writeback stages. */
    typename CPUPolicy::IEW iew;

    /** The commit stage. */
    typename CPUPolicy::Commit commit;
First of all, it defines new type names from the member types of the
Impl class. Note that those members are themselves typedefs of classes
obtained by instantiating class templates defined for the O3 CPU.
For example, CPUPolicy is an alias of Impl::CPUPol, which is a
typedef of SimpleCPUPolicy<O3CPUImpl>.
gem5/src/cpu/o3/cpu_policy.hh
/**
 * Struct that defines the key classes to be used by the CPU.  All
 * classes use the typedefs defined here to determine what are the
 * classes of the other stages and communication buffers.  In order to
 * change a structure such as the IQ, simply change the typedef here
 * to use the desired class instead, and recompile.  In order to
 * create a different CPU to be used simultaneously with this one, see
 * the alpha_impl.hh file for instructions.
 */
template<class Impl>
struct SimpleCPUPolicy
{
    /** Typedef for the freelist of registers. */
    typedef UnifiedFreeList FreeList;
    /** Typedef for the rename map. */
    typedef UnifiedRenameMap RenameMap;
    /** Typedef for the ROB. */
    typedef ::ROB<Impl> ROB;
    /** Typedef for the instruction queue/scheduler. */
    typedef InstructionQueue<Impl> IQ;
    /** Typedef for the memory dependence unit. */
    typedef ::MemDepUnit<StoreSet, Impl> MemDepUnit;
    /** Typedef for the LSQ. */
    typedef ::LSQ<Impl> LSQ;
    /** Typedef for the thread-specific LSQ units. */
    typedef ::LSQUnit<Impl> LSQUnit;

    /** Typedef for fetch. */
    typedef DefaultFetch<Impl> Fetch;
    /** Typedef for decode. */
    typedef DefaultDecode<Impl> Decode;
    /** Typedef for rename. */
    typedef DefaultRename<Impl> Rename;
    /** Typedef for Issue/Execute/Writeback. */
    typedef DefaultIEW<Impl> IEW;
    /** Typedef for commit. */
    typedef DefaultCommit<Impl> Commit;

    /** The struct for communication between fetch and decode. */
    typedef DefaultFetchDefaultDecode<Impl> FetchStruct;

    /** The struct for communication between decode and rename. */
    typedef DefaultDecodeDefaultRename<Impl> DecodeStruct;

    /** The struct for communication between rename and IEW. */
    typedef DefaultRenameDefaultIEW<Impl> RenameStruct;

    /** The struct for communication between IEW and commit. */
    typedef DefaultIEWDefaultCommit<Impl> IEWStruct;

    /** The struct for communication within the IEW stage. */
    typedef ::IssueStruct<Impl> IssueStruct;

    /** The struct for all backwards communication. */
    typedef TimeBufStruct<Impl> TimeStruct;

};
In the above code, I can see that each stage is defined as an instantiation
of a class template. For example, the Fetch type is defined as DefaultFetch<Impl>.
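The comment in cpu_policy.hh says a structure can be swapped by changing a typedef and recompiling. The sketch below illustrates that idea with two toy policies; every `Toy*` name, including the alternative fetch stage, is hypothetical and does not exist in GEM5.

```cpp
#include <cassert>
#include <string>

// Two interchangeable toy fetch stages and one toy decode stage.
template <class Impl>
struct ToyDefaultFetch {
    std::string name() const { return "DefaultFetch"; }
};

template <class Impl>
struct ToyAltFetch {
    std::string name() const { return "AltFetch"; }
};

template <class Impl>
struct ToyDefaultDecode {
    std::string name() const { return "DefaultDecode"; }
};

// A policy is just a bundle of typedefs that picks one class per stage.
template <class Impl>
struct ToySimplePolicy {
    typedef ToyDefaultFetch<Impl> Fetch;
    typedef ToyDefaultDecode<Impl> Decode;
};

// Same bundle, but with a different fetch stage plugged in.
template <class Impl>
struct ToyAltPolicy {
    typedef ToyAltFetch<Impl> Fetch;
    typedef ToyDefaultDecode<Impl> Decode;
};

// Each Impl selects a policy, as O3CPUImpl does with CPUPol.
struct ToyImplA { typedef ToySimplePolicy<ToyImplA> CPUPol; };
struct ToyImplB { typedef ToyAltPolicy<ToyImplB> CPUPol; };

template <class Impl>
class ToyCPU {
  public:
    typedef typename Impl::CPUPol CPUPolicy;

    // Report which stage classes were selected at compile time.
    std::string describe() const {
        return fetch.name() + " -> " + decode.name();
    }

  private:
    // Member types are resolved through the policy, never named directly.
    typename CPUPolicy::Fetch fetch;
    typename CPUPolicy::Decode decode;
};
```

Here `ToyCPU<ToyImplA>().describe()` yields "DefaultFetch -> DefaultDecode", while `ToyCPU<ToyImplB>().describe()` yields "AltFetch -> DefaultDecode". The CPU class itself is untouched; only the policy typedef changed, and the choice costs nothing at run time because it is resolved by the compiler.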
Fetch of the O3CPU
Tick
template <class Impl>
void
FullO3CPU<Impl>::tick()
{
    DPRINTF(O3CPU, "\n\nFullO3CPU: Ticking main, FullO3CPU.\n");
    assert(!switchedOut());
    assert(drainState() != DrainState::Drained);

    ++numCycles;
    updateCycleCounters(BaseCPU::CPU_STATE_ON);

//    activity = false;

    //Tick each of the stages
    fetch.tick();

    decode.tick();

    rename.tick();

    iew.tick();

    commit.tick();

    // Now advance the time buffers
    timeBuffer.advance();

    fetchQueue.advance();
    decodeQueue.advance();
    renameQueue.advance();
    iewQueue.advance();

    activityRec.advance();

    if (removeInstsThisCycle) {
        cleanUpRemovedInsts();
    }

    if (!tickEvent.scheduled()) {
        if (_status == SwitchedOut) {
            DPRINTF(O3CPU, "Switched out!\n");
            // increment stat
            lastRunningCycle = curCycle();
        } else if (!activityRec.active() || _status == Idle) {
            DPRINTF(O3CPU, "Idle!\n");
            lastRunningCycle = curCycle();
            timesIdled++;
        } else {
            schedule(tickEvent, clockEdge(Cycles(1)));
            DPRINTF(O3CPU, "Scheduling next tick!\n");
        }
    }

    if (!FullSystem)
        updateThreadPriority();

    tryDrain();
}
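The shape of tick() (run every stage once, then advance the buffers between them) can be modeled with a very small sketch. This is my own toy model under simplifying assumptions: a three-stage pipeline, a one-cycle latch between stages, and integer instruction ids. GEM5's real time buffers are more general, and none of the `Toy*` names come from GEM5.

```cpp
#include <cassert>
#include <vector>

// Toy one-cycle latch: a value written this cycle becomes visible to
// the consumer only after advance(). -1 stands for a bubble.
struct ToyLatch {
    int next = -1;     // written by the producer during this cycle
    int current = -1;  // read by the consumer during this cycle
    void advance() { current = next; next = -1; }
};

struct ToyPipeline {
    ToyLatch fetchToDecode;
    ToyLatch decodeToCommit;
    int nextInst = 0;            // id of the next instruction to fetch
    int numCycles = 0;
    std::vector<int> committed;  // ids in commit order

    void tick() {
        ++numCycles;

        // Tick each stage; every stage reads only what was latched
        // at the end of the previous cycle.
        fetchToDecode.next = nextInst++;              // fetch
        decodeToCommit.next = fetchToDecode.current;  // decode
        if (decodeToCommit.current >= 0)              // commit
            committed.push_back(decodeToCommit.current);

        // Now advance the buffers so that this cycle's writes become
        // visible in the next cycle.
        fetchToDecode.advance();
        decodeToCommit.advance();
    }
};
```

Because each stage sees only the previous cycle's output, the first instruction needs three ticks to commit: after two calls to `tick()` the `committed` vector is still empty, and after four calls it holds instructions 0 and 1. Separating "tick the stages" from "advance the buffers" is what makes the order of stage calls within one cycle harmless in this model.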