Early Stage Real-Time SoC Power Estimation Using RTL Instrumentation ∗
Jianlei Yang¹,², Liwei Ma², Kang Zhao², Yici Cai¹ and Tin-Fook Ngai²
¹ Department of Computer Science and Technology, Tsinghua University, 100084, Beijing, China
jerryyangs@gmail.com, caiyc@mail.tsinghua.edu.cn
² Intel Labs China, Intel Corporation, 100190, Beijing, China
{liwei.ma, kang.zhao, tin-fook.ngai}@intel.com
ABSTRACT
Early stage power estimation is critical for SoC architecture
exploration and validation in modern VLSI design, but estimation
that is at once real-time, accurate, and able to cover long time
intervals remains challenging for system-level estimation and
software/hardware tuning. This work proposes a machine-learning-style
model abstraction approach for real-time power estimation. The
singular value decomposition (SVD) technique is exploited to
abstract the principal components of the relationship between the
register toggling profile and an accurate power waveform. The
abstracted power model is automatically instrumented into the RTL
implementation and synthesized onto an FPGA platform, where it
estimates power in real time from the instrumented register
toggling profile. A prototype implementation on three IP cores
predicts cycle-by-cycle power dissipation within 5% accuracy loss
compared with a commercial power estimation tool.
Categories and Subject Descriptors
B.8.2 [Integrated Circuits]: Performance and Reliabil-
ity—Performance Analysis and Design Aids
General Terms
Algorithms, Design, Performance, Measurement
Keywords
Real-Time, Power Estimation, RTL Instrumentation, Singular
Value Decomposition (SVD)
1. INTRODUCTION
Power consumption has become one of the biggest challenges
in modern chip design. In fact, power dissipation is regarded
as a likely limiting factor for the increasing scales of
integration predicted by Moore's Law [1]. Early visibility
into power budgeting requires appropriate tool support for
power estimation and optimization at various design stages.
Extensive research has addressed power consumption at varying
levels of abstraction. At lower levels of the design hierarchy,
higher analysis accuracy can be achieved because more detail
about the circuit implementation is available. These techniques
have been incorporated into various commercial power estimation
tools, such as Synopsys PrimeTime PX [2][3][4][5]. Higher-level
approaches usually perform functional simulation to calculate
the power consumption and consequently have larger capacity in
terms of both transistor count and simulation time [6][7][8].
As advances in fabrication technologies have led to shrinking
device sizes, and consequently to increasing chip complexity,
larger-scale circuit design and verification have become
increasingly difficult and time consuming. The poor speed of
power estimation tools limits their utility in the design flow.
Clearly, such estimation tools cannot be used iteratively for
architectural exploration. Raising the level of abstraction to
the architecture level can lead to substantial efficiency
improvements, and many types of virtual platform technologies
have been proposed and are well known for early development and
validation of chip designs. In particular, accelerated
hardware/software co-emulation is the most popular virtual
platform for validating both functionality and performance,
which is essential for shortening turn-around time.
∗ This work was supported by the National Natural Science
Foundation of China (NSFC) under Grant No. 61274031 and
No. 61106030.
[Figure: diagram positioning Netlist Level (PTPX), RTL Instrumentation, and Post-Silicon; labels: Abstraction Level, Capacity, Accuracy, Observability, High, Decreasing]
Figure 1: Solution space of power estimation methods
However, raising the abstraction level is not a universal remedy,
because it noticeably decreases estimation accuracy. As
shown in Figure 1, netlist-level power estimation methods have
high accuracy and excellent observability, but their biggest
drawback is limited capacity. Post-silicon measurement has
perfect accuracy and can run long workloads, but it comes too
late to influence the SoC architecture and offers only limited
observability.
Many interesting techniques have been developed for early
estimation with reasonable accuracy [9][10][11][12].
In [11], a micro-architectural power model fed by activity
counters is programmed into the chip multiprocessors to
explore performance, power, and thermal issues. In [10], a
978-1-4799-7792-5/15/$31.00 ©2015 IEEE