# Lab Report
<center>陈鸿凯 202002001002</center>
## Experimental Data
## 1
| trace | LONG-1 | LONG-2 | LONG-3 | LONG-4 | SHORT-1 | SHORT-2 | SHORT-3 | SHORT-4 | SHORT-24 | SHORT-25 | SHORT-27 | SHORT-28 | SHORT-30 | average |
| --------------------------------------------- | --------------------------------------------- | ------- | ------- | ------- | ------- | ------- | -------- | ------- | -------- | -------- | -------- | -------- | -------- | ----------- |
| TAKEN | 22.1126 | 51.5666 | 56.1836 | 2.3293 | 34.346 | 53.3328 | 12.8561 | 12.35 | 0.0009 | 0.0035 | 2.2689 | 1.8285 | 91.6225 | 26.21548462 |
| NOT_TAKEN | 23.3825 | 36.1905 | 50.5379 | 11.5291 | 45.6506 | 95.0392 | 144.3065 | 33.2308 | 7.6923 | 28.5711 | 14.4918 | 14.5707 | 77.9273 | 44.85540769 |
| 1bit fsm | 5.7364 | 2.751 | 12.1055 | 4.6183 | 3.456 | 43.1704 | 8.6456 | 6.029 | 0.0009 | 0.0035 | 4.5423 | 3.6571 | 16.6571 | 8.567161538 |
| 2bit fsm | 2.9956 | 1.8347 | 9.5639 | 2.3103 | 2.7377 | 33.4623 | 4.6308 | 5.5712 | 0.0005 | 0.0018 | 2.287 | 1.8288 | 10.6065 | 5.987007692 |
| 3bit fsm | 3.1476 | 1.7714 | 8.645 | 2.3103 | 2.8295 | 32.1898 | 4.5109 | 5.5039 | 0.0005 | 0.0018 | 2.287 | 1.8288 | 10.9199 | 5.842030769 |
| 1bit fsm + 17bit global | 0.7118 | 2.143 | 9.4827 | 0.0086 | 3.3592 | 13.132 | 6.83 | 5.7608 | 0.0013 | 0.0038 | 1.0733 | 0.0169 | 9.3196 | 3.987923077 |
| 2bit fsm + 17bit global | 0.5654 | 1.3844 | 7.7714 | 0.0049 | 2.9097 | 11.3885 | 3.7173 | 5.1688 | 0.0008 | 0.0021 | 0.625 | 0.0098 | 6.3666 | 3.070361538 |
| 3bit fsm + 17bit global | 0.7129 | 1.3098 | 7.1236 | 0.005 | 3.1955 | 10.6267 | 3.6071 | 4.9068 | 0.0008 | 0.0021 | 0.625 | 0.0098 | 5.6986 | 2.909515385 |
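### Conclusion 1
> 1. The FSM predictors outperform the static ones (always TAKEN, always NOT_TAKEN).
>
> 2. Using global history outperforms not using it.
>
> 3. The 2bit fsm and 3bit fsm outperform the 1bit fsm.
>
> 4. The 2bit fsm and 3bit fsm differ only slightly (average 5.99 vs. 5.84), and the 2bit fsm uses less storage.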
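
Points 3 and 4 can be reproduced on a toy input. The sketch below is not the report's predictor: `CountMispredictions` and `LoopPattern` are hypothetical helpers, and the rule "predict taken when the counter is in the upper half" is an assumption. Only the initial value `max / 2` follows `PHT_CTR_INIT` in the code section.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: counts how many outcomes a single n-bit saturating
// counter mispredicts. outcomes[i] is true when the branch was taken.
inline std::uint32_t CountMispredictions(unsigned counterBits,
                                         const std::vector<bool>& outcomes)
{
    const std::uint32_t max = (1u << counterBits) - 1;  // saturation ceiling
    std::uint32_t ctr = max / 2;                        // same start as PHT_CTR_INIT
    std::uint32_t misses = 0;
    for (bool taken : outcomes) {
        const bool predictTaken = ctr > max / 2;        // assumed threshold: upper half
        if (predictTaken != taken) ++misses;
        if (taken) { if (ctr < max) ++ctr; }            // saturating increment
        else       { if (ctr > 0) --ctr; }              // saturating decrement
    }
    return misses;
}

// Hypothetical helper: a loop branch that is taken bodyLen times and then
// falls through once, repeated for the given number of iterations.
inline std::vector<bool> LoopPattern(unsigned bodyLen, unsigned iterations)
{
    std::vector<bool> outcomes;
    for (unsigned i = 0; i < iterations; ++i) {
        for (unsigned j = 0; j < bodyLen; ++j) outcomes.push_back(true);
        outcomes.push_back(false);
    }
    return outcomes;
}
```

With `LoopPattern(3, 100)`, a 1-bit counter mispredicts 200 of the 400 outcomes (the loop exit and the re-entry after it), while the 2-bit and 3-bit counters each mispredict 101. The extra bit removes the re-entry miss, and a third bit adds nothing on this pattern.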
## 2
| trace | LONG-1 | LONG-2 | LONG-3 | LONG-4 | SHORT-1 | SHORT-2 | SHORT-3 | SHORT-4 | SHORT-24 | SHORT-25 | SHORT-27 | SHORT-28 | SHORT-30 | average |
| --------------------------------------------- | --------------------------------------------- | ------- | ------- | ------- | ------- | ------- | -------- | ------- | -------- | -------- | -------- | -------- | -------- | ----------- |
| 2bit fsm + 17bit global | 0.5654 | 1.3844 | 7.7714 | 0.0049 | 2.9097 | 11.3885 | 3.7173 | 5.1688 | 0.0008 | 0.0021 | 0.625 | 0.0098 | 6.3666 | 3.070361538 |
| 2bit fsm + 30bit global | 0.1398 | 1.2367 | 7.4378 | 0.0054 | 2.1092 | 5.0625 | 3.1777 | 4.0446 | 0.0011 | 0.0023 | 0.2627 | 0.0114 | 3.0848 | 2.044307692 |
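### Conclusion 2
>30-bit global history beats 17-bit global history on 9 of the 13 traces (average 2.04 vs. 3.07), but the memory it requires grows sharply.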
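
The memory growth can be estimated with a short calculation. This is a sketch under two assumptions that the excerpt does not confirm: the table holds one counter per history pattern (2^n entries for n history bits), and the counters are bit-packed. `PackedTableBytes` is a hypothetical helper.

```cpp
#include <cstdint>

// Hypothetical helper: bytes needed for a table with one counter per
// history pattern, assuming bit-packed counters (rounded up to whole bytes).
constexpr std::uint64_t PackedTableBytes(unsigned historyBits, unsigned counterBits)
{
    return ((std::uint64_t{1} << historyBits) * counterBits + 7) / 8;
}

// 2^17 entries x 2 bits = 32 KiB
static_assert(PackedTableBytes(17, 2) == 32ull * 1024, "17-bit history");
// 2^30 entries x 2 bits = 256 MiB
static_assert(PackedTableBytes(30, 2) == 256ull * 1024 * 1024, "30-bit history");
```

Under these assumptions, going from 17 to 30 history bits multiplies the table size by 2^13 = 8192. If each counter instead occupies a full `UINT32`, as the `UINT32* pht` declaration in the code section may suggest, both figures grow by a further factor of 16.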
## 3
| trace | LONG-1 | LONG-2 | LONG-3 | LONG-4 | SHORT-1 | SHORT-2 | SHORT-3 | SHORT-4 | SHORT-24 | SHORT-25 | SHORT-27 | SHORT-28 | SHORT-30 | average |
| --------------------------------------------- | --------------------------------------------- | ------- | ------- | ------- | ------- | ------- | -------- | ------- | -------- | -------- | -------- | -------- | -------- | ----------- |
| 2bit fsm | 2.9956 | 1.8347 | 9.5639 | 2.3103 | 2.7377 | 33.4623 | 4.6308 | 5.5712 | 0.0005 | 0.0018 | 2.287 | 1.8288 | 10.6065 | 5.987007692 |
| 2bit fsm + 3bit local | 2.492 | 1.6446 | 8.1803 | 4.6119 | 1.3579 | 32.2575 | 4.4379 | 5.6087 | 0.0006 | 0.0018 | 1.8885 | 1.8297 | 9.0792 | 5.645430769 |
| 2bit fsm + 17bit global | 0.5654 | 1.3844 | 7.7714 | 0.0049 | 2.9097 | 11.3885 | 3.7173 | 5.1688 | 0.0008 | 0.0021 | 0.625 | 0.0098 | 6.3666 | 3.070361538 |
| 2bit fsm + 3bit local + 17bit global | 0.4088 | 1.2317 | 8.4514 | 0.0073 | 1.8418 | 11.4317 | 3.7906 | 5.0203 | 0.0019 | 0.003 | 0.6295 | 0.0173 | 2.5584 | 2.722592308 |
| 3bit fsm + 3bit local + 17bit global | 0.3998 | 1.1704 | 7.6918 | 0.0073 | 1.7973 | 10.0933 | 3.6793 | 4.6032 | 0.0019 | 0.003 | 0.6295 | 0.0173 | 2.4663 | 2.504646154 |
| 3bit fsm + 3bit local + 17bit global + hash | 0.387 | 1.1729 | 7.3673 | 0.0057 | 1.7277 | 9.6496 | 3.2273 | 4.9918 | 0.0007 | 0.002 | 0.2174 | 0.011 | 2.5704 | 2.410061538 |
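### Conclusion 3
>Local history alone brings little improvement (average 5.65 vs. 5.99 for the plain 2bit fsm), but it works well when combined with global history. After weighing accuracy against memory use in these experiments, the final choice is the 3bit fsm + 3bit local + 17bit global + hash scheme.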
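
The hash used in the final scheme is not shown in this excerpt. As an illustration only, the sketch below uses the common gshare index, which XORs the branch address with the global history. `GsharePredictor` and `CountAlternatingMisses` are hypothetical names, the local-history part of the scheme is left out, and the prediction threshold is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical gshare-style predictor: saturating counters indexed by
// (pc XOR global history). Requires historyBits < 32.
class GsharePredictor {
public:
    GsharePredictor(unsigned historyBits, unsigned counterBits)
        : historyMask_((1u << historyBits) - 1),
          counterMax_((1u << counterBits) - 1),
          pht_(std::size_t{1} << historyBits, counterMax_ / 2),
          ghr_(0) {}

    // Predict taken when the selected counter is in its upper half (assumed rule).
    bool Predict(std::uint32_t pc) const { return pht_[Index(pc)] > counterMax_ / 2; }

    // Train the selected counter, then shift the outcome into the history.
    void Update(std::uint32_t pc, bool taken) {
        std::uint32_t& ctr = pht_[Index(pc)];
        if (taken) { if (ctr < counterMax_) ++ctr; }
        else       { if (ctr > 0) --ctr; }
        ghr_ = ((ghr_ << 1) | (taken ? 1u : 0u)) & historyMask_;
    }

private:
    std::uint32_t Index(std::uint32_t pc) const { return (pc ^ ghr_) & historyMask_; }

    std::uint32_t historyMask_;       // keeps index and history within the table
    std::uint32_t counterMax_;        // saturation ceiling of each counter
    std::vector<std::uint32_t> pht_;  // pattern history table
    std::uint32_t ghr_;               // global history register
};

// Hypothetical helper: feeds one branch that alternates taken / not taken
// and returns the number of mispredictions over n outcomes.
inline unsigned CountAlternatingMisses(GsharePredictor& p, std::uint32_t pc, unsigned n)
{
    unsigned misses = 0;
    for (unsigned i = 0; i < n; ++i) {
        const bool taken = (i % 2 == 0);
        if (p.Predict(pc) != taken) ++misses;
        p.Update(pc, taken);
    }
    return misses;
}
```

On this alternating branch a history-indexed table mispredicts only during a short warm-up, because the two history patterns select different counters. A single 2-bit counter starting from the same initial value would mispredict every outcome, which is consistent with the gap between the rows with and without global history.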
### Experimental Results
| trace | LONG-1 | LONG-2 | LONG-3 | LONG-4 | SHORT-1 | SHORT-2 | SHORT-3 | SHORT-4 | SHORT-24 | SHORT-25 | SHORT-27 | SHORT-28 | SHORT-30 | average |
| ------------------------------------------- | ------- | ------- | ------- | ------ | ------- | ------- | ------- | ------- | -------- | -------- | -------- | -------- | -------- | ----------- |
| 2bit fsm + 17bit global | 0.5654 | 1.3844 | 7.7714 | 0.0049 | 2.9097 | 11.3885 | 3.7173 | 5.1688 | 0.0008 | 0.0021 | 0.625 | 0.0098 | 6.3666 | 3.070361538 |
| 3bit fsm + 3bit local + 17bit global + hash | 0.387 | 1.1729 | 7.3673 | 0.0057 | 1.7277 | 9.6496 | 3.2273 | 4.9918 | 0.0007 | 0.002 | 0.2174 | 0.011 | 2.5704 | 2.410061538 |
| change per trace | -0.1784 | -0.2115 | -0.4041 | 0.0008 | -1.182 | -1.7389 | -0.49 | -0.177 | -0.0001 | -0.0001 | -0.4076 | 0.0012 | -3.7962 | -0.6603 |
## Code
```C++
///////////////////////////////////////////////////////////////////////
//// Copyright 2020 by ChongKai. //
///////////////////////////////////////////////////////////////////////
#include <stdio.h>
#include <stdlib.h>
#include "common.h"
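// Saturating counter: increment by 1, stopping at max
static inline UINT32 SatIncrement(UINT32 x, UINT32 max)
{
    if (x < max) return x + 1;
    return x;
}
// Saturating counter: decrement by 1, stopping at 0
static inline UINT32 SatDecrement(UINT32 x)
{
    if (x > 0) return x - 1;
    return x;
}
// State of the gshare branch predictor
// Pattern history table (PHT)
UINT32* pht; // pattern history table
UINT32 numPhtEntries; // number of entries in the PHT
#define PHT_CTR_BIT 3 // bits per PHT counter
#define PHT_CTR_MAX ((1<<PHT_CTR_BIT) - 1) // maximum counter value
#define PHT_CTR_INIT (PHT_CTR_MAX / 2) // initial counter value
// Local history table (LHT)
UINT32* lht; // local history table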
UINT32 numLht
```