benchncnn can be used to test neural network inference performance.
Only the network definition files (ncnn param) are required.
The large model binary files (ncnn bin) are not loaded; the weights are generated randomly, since only speed is measured.
More model networks may be added later.
---
Build
```shell
# assume you have already built the ncnn library successfully
# uncomment the following line in <ncnn-root-dir>/CMakeLists.txt with your favorite editor
# add_subdirectory(benchmark)
cd <ncnn-root-dir>/<your-build-dir>
make -j4
# you can find benchncnn binary in <ncnn-root-dir>/<your-build-dir>/benchmark
```
Usage
```shell
# copy all param files to the current directory
./benchncnn [loop count] [num threads] [powersave] [gpu device] [cooling down]
```
Run benchncnn on an Android device
```shell
# to run on an android device, upload the binary and param files to the /data/local/tmp/ folder
adb push benchncnn /data/local/tmp/
adb push <ncnn-root-dir>/benchmark/*.param /data/local/tmp/
adb shell
# executed in android adb shell
cd /data/local/tmp/
./benchncnn [loop count] [num threads] [powersave] [gpu device] [cooling down]
```
Parameter
|param|options|default|
|---|---|---|
|loop count|1~N|4|
|num threads|1~N|max_cpu_count|
|powersave|0=all cores, 1=little cores only, 2=big cores only|0|
|gpu device|-1=cpu-only, 0=gpu0, 1=gpu1 ...|-1|
|cooling down|0=disable, 1=enable|1|
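As an illustration of how the positional arguments line up with the table above, the sketch below assembles a command line from named shell variables. The variable names are made up for readability and are not options of benchncnn itself, which only accepts the five positional values in the order shown under Usage.

```shell
# Hypothetical variable names; benchncnn only takes positional arguments.
LOOP_COUNT=8      # loop count: 8
NUM_THREADS=4     # num threads: 4
POWERSAVE=2       # powersave: 2 = big cores only
GPU_DEVICE=-1     # gpu device: -1 = cpu-only
COOLING_DOWN=1    # cooling down: 1 = enable

BENCH_CMD="./benchncnn $LOOP_COUNT $NUM_THREADS $POWERSAVE $GPU_DEVICE $COOLING_DOWN"
echo "$BENCH_CMD"

# To actually run it, from the directory that holds the param files:
# $BENCH_CMD
```

Keeping the values in variables makes it easier to repeat a run with only one setting changed, for example a different thread count.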
Tips: Disable the Android UI server and set the CPU and GPU to their maximum frequency
```shell
# stop the android ui server; it can be restarted later via adb shell start
adb root
adb shell stop
# executed in android adb shell
# set cpu performance mode
echo "performance" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
# set gpu performance mode (e.g. RK3399)
echo "performance" > /sys/class/misc/mali0/device/devfreq/ff9a0000.gpu/governor
# set gpu performance mode (e.g. Android Adreno)
echo 1 > /sys/class/kgsl/kgsl-3d0/force_clk_on
echo 10000000 > /sys/class/kgsl/kgsl-3d0/idle_timer
echo "performance" > /sys/class/kgsl/kgsl-3d0/devfreq/governor
echo <max freq> > /sys/class/kgsl/kgsl-3d0/gpuclk
```
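The per-core `echo` lines above assume six cores (cpu0 to cpu5). On devices with a different core count, a loop over whatever cpufreq entries exist may be more convenient. The following is a minimal sketch, not part of ncnn: the function name `set_cpu_governor` and the `CPU_SYSFS_ROOT` override are assumptions introduced here so the loop can be tried against a scratch directory before touching the real sysfs tree.

```shell
# Set the scaling governor for every CPU core that exposes cpufreq.
# set_cpu_governor and CPU_SYSFS_ROOT are hypothetical names used for illustration.
set_cpu_governor() {
    governor="$1"
    root="${CPU_SYSFS_ROOT:-/sys/devices/system/cpu}"
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip the unexpanded pattern when no core matches.
        [ -e "$f" ] || continue
        echo "$governor" > "$f"
    done
}

# executed in android adb shell, as root
# set_cpu_governor performance
```

The `cpu[0-9]*` pattern matches only the numbered core directories, so sibling entries such as `cpuidle` are left untouched.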
---
Typical output (executed in android adb shell)
### NVIDIA Jetson AGX Orin (Cortex-A78AE 2.2 GHz x 12 + Ampere@1.3 GHz Tensor Cores 64)
```
i@orin:~/projects/ncnn/benchmark$ ./benchncnn 64 1 0 -1 0
loop_count = 64
num_threads = 1
powersave = 0
gpu_device = -1
cooling_down = 0
squeezenet min = 11.66 max = 11.80 avg = 11.74
squeezenet_int8 min = 12.24 max = 12.39 avg = 12.31
mobilenet min = 19.56 max = 19.73 avg = 19.65
mobilenet_int8 min = 16.06 max = 16.25 avg = 16.14
mobilenet_v2 min = 13.20 max = 13.41 avg = 13.29
mobilenet_v3 min = 11.39 max = 11.57 avg = 11.48
shufflenet min = 8.07 max = 8.18 avg = 8.11
shufflenet_v2 min = 8.41 max = 8.51 avg = 8.45
mnasnet min = 12.74 max = 12.91 avg = 12.79
proxylessnasnet min = 15.18 max = 15.32 avg = 15.25
efficientnet_b0 min = 26.86 max = 26.96 avg = 26.90
efficientnetv2_b0 min = 35.99 max = 36.15 avg = 36.07
regnety_400m min = 16.81 max = 16.98 avg = 16.87
blazeface min = 4.25 max = 4.37 avg = 4.29
googlenet min = 48.73 max = 48.98 avg = 48.87
googlenet_int8 min = 47.39 max = 47.60 avg = 47.49
resnet18 min = 30.93 max = 31.24 avg = 31.08
resnet18_int8 min = 55.44 max = 55.70 avg = 55.56
alexnet min = 44.19 max = 44.43 avg = 44.33
vgg16 min = 173.94 max = 174.97 avg = 174.46
vgg16_int8 min = 475.10 max = 479.37 avg = 477.33
resnet50 min = 89.50 max = 90.11 avg = 89.80
resnet50_int8 min = 106.77 max = 107.14 avg = 106.96
squeezenet_ssd min = 37.78 max = 38.35 avg = 37.93
squeezenet_ssd_int8 min = 50.48 max = 50.88 avg = 50.74
mobilenet_ssd min = 45.62 max = 46.12 avg = 45.74
mobilenet_ssd_int8 min = 37.77 max = 38.00 avg = 37.88
mobilenet_yolo min = 90.23 max = 90.49 avg = 90.35
mobilenetv2_yolov3 min = 47.27 max = 47.48 avg = 47.33
yolov4-tiny min = 60.41 max = 60.75 avg = 60.57
nanodet_m min = 19.26 max = 19.43 avg = 19.35
yolo-fastest-1.1 min = 8.16 max = 8.31 avg = 8.20
yolo-fastestv2 min = 8.26 max = 8.39 avg = 8.32
i@orin:~/projects/ncnn/benchmark$ ./benchncnn 64 2 0 -1 0
loop_count = 64
num_threads = 2
powersave = 0
gpu_device = -1
cooling_down = 0
squeezenet min = 6.83 max = 6.98 avg = 6.90
squeezenet_int8 min = 7.39 max = 7.50 avg = 7.45
mobilenet min = 10.40 max = 10.50 avg = 10.45
mobilenet_int8 min = 8.92 max = 9.09 avg = 8.99
mobilenet_v2 min = 7.67 max = 7.80 avg = 7.74
mobilenet_v3 min = 6.86 max = 7.01 avg = 6.93
shufflenet min = 6.34 max = 6.44 avg = 6.39
shufflenet_v2 min = 5.71 max = 5.83 avg = 5.76
mnasnet min = 7.47 max = 7.58 avg = 7.53
proxylessnasnet min = 8.73 max = 8.83 avg = 8.78
efficientnet_b0 min = 14.93 max = 15.13 avg = 15.03
efficientnetv2_b0 min = 20.17 max = 20.70 avg = 20.29
regnety_400m min = 12.50 max = 12.62 avg = 12.57
blazeface min = 2.95 max = 3.06 avg = 3.00
googlenet min = 26.25 max = 26.53 avg = 26.37
googlenet_int8 min = 26.54 max = 26.79 avg = 26.66
resnet18 min = 16.69 max = 16.90 avg = 16.80
resnet18_int8 min = 29.70 max = 29.93 avg = 29.81
alexnet min = 22.96 max = 23.12 avg = 23.03
vgg16 min = 88.39 max = 89.16 avg = 88.79
vgg16_int8 min = 245.86 max = 247.55 avg = 246.62
resnet50 min = 46.55 max = 46.86 avg = 46.70
resnet50_int8 min = 56.28 max = 56.63 avg = 56.43
squeezenet_ssd min = 23.65 max = 24.29 avg = 23.81
squeezenet_ssd_int8 min = 30.86 max = 31.27 avg = 30.99
mobilenet_ssd min = 25.17 max = 25.31 avg = 25.24
mobilenet_ssd_int8 min = 21.77 max = 21.97 avg = 21.84
mobilenet_yolo min = 48.03 max = 48.33 avg = 48.14
mobilenetv2_yolov3 min = 26.58 max = 26.81 avg = 26.66
yolov4-tiny min = 35.31 max = 35.53 avg = 35.41
nanodet_m min = 12.93 max = 13.08 avg = 13.01
yolo-fastest-1.1 min = 6.00 max = 6.10 avg = 6.04
yolo-fastestv2 min = 6.46 max = 6.61 avg = 6.52
i@orin:~/projects/ncnn/benchmark$ ./benchncnn 64 4 0 -1 0
loop_count = 64
num_threads = 4
powersave = 0
gpu_device = -1
cooling_down = 0
squeezenet min = 4.54 max = 4.84 avg = 4.61
squeezenet_int8 min = 4.96 max = 5.41 avg = 5.05
mobilenet min = 5.96 max = 6.23 avg = 6.04
mobilenet_int8 min = 5.21 max = 5.50 avg = 5.30
mobilenet_v2 min = 5.05 max = 5.26 avg = 5.15
mobilenet_v3 min = 4.83 max = 5.14 avg = 4.90
shufflenet min = 5.11 max = 5.34 avg = 5.18
shufflenet_v2 min = 4.13 max = 4.44 avg = 4.18
mnasnet min = 4.93 max = 5.27 avg = 5.01
proxylessnasnet min = 5.64 max = 5.89 avg = 5.72
efficient
```