benchncnn can be used to test neural network inference performance.
Only the network definition files (ncnn param) are required.
The large model binary files (ncnn bin) are not loaded; random weights are generated instead for the speed test.
More network models may be added later.
---
Build
```shell
# assumes you have already built the ncnn library successfully
# uncomment the following line in <ncnn-root-dir>/CMakeLists.txt with your favorite editor
# add_subdirectory(benchmark)
cd <ncnn-root-dir>/<your-build-dir>
make -j4
# you can find benchncnn binary in <ncnn-root-dir>/<your-build-dir>/benchmark
```
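If you prefer not to edit CMakeLists.txt by hand, the benchmark can usually be enabled at configure time instead. The `NCNN_BUILD_BENCHMARK` option is an assumption here; verify it exists in your ncnn version's top-level CMakeLists.txt before relying on it.

```shell
# alternative: enable the benchmark at configure time instead of
# uncommenting add_subdirectory(benchmark) by hand
# (NCNN_BUILD_BENCHMARK is assumed; check your checkout's CMakeLists.txt)
cd <ncnn-root-dir>/<your-build-dir>
cmake -DNCNN_BUILD_BENCHMARK=ON ..
make -j4
```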
Usage
```shell
# copy all param files to the current directory
./benchncnn [loop count] [num threads] [powersave] [gpu device] [cooling down]
```
run benchncnn on android device
```shell
# for running on android device, upload to /data/local/tmp/ folder
adb push benchncnn /data/local/tmp/
adb push <ncnn-root-dir>/benchmark/*.param /data/local/tmp/
adb shell
# executed in android adb shell
cd /data/local/tmp/
./benchncnn [loop count] [num threads] [powersave] [gpu device] [cooling down]
```
Parameter
|param|options|default|
|---|---|---|
|loop count|1~N|4|
|num threads|1~N|max_cpu_count|
|powersave|0=all cores, 1=little cores only, 2=big cores only|0|
|gpu device|-1=cpu-only, 0=gpu0, 1=gpu1 ...|-1|
|cooling down|0=disable, 1=enable|1|
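As a concrete example, the five positional arguments can be spelled out with named variables so each one is labeled (the values below are arbitrary illustrations, not recommended settings):

```shell
LOOPS=8        # loop count: 8 iterations per model
THREADS=4      # num threads: 4 worker threads
POWERSAVE=2    # powersave: 2 = big cores only
GPU=-1         # gpu device: -1 = cpu-only
COOLDOWN=1     # cooling down: 1 = enabled
CMD="./benchncnn $LOOPS $THREADS $POWERSAVE $GPU $COOLDOWN"
echo "$CMD"    # prints: ./benchncnn 8 4 2 -1 1
```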
Tip: disable the Android UI server and set the CPU and GPU to maximum frequency
```shell
# stop the android ui server; it can be restarted later via adb shell start
adb root
adb shell stop
# executed in android adb shell
# set cpu performance mode
echo "performance" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
# set gpu performance mode (e.g. RK3399)
echo "performance" > /sys/class/misc/mali0/device/devfreq/ff9a0000.gpu/governor
# set gpu performance mode (e.g. Android Adreno)
echo 1 > /sys/class/kgsl/kgsl-3d0/force_clk_on
echo 10000000 > /sys/class/kgsl/kgsl-3d0/idle_timer
echo "performance" > /sys/class/kgsl/kgsl-3d0/devfreq/governor
echo <max freq> > /sys/class/kgsl/kgsl-3d0/gpuclk
```
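The per-core echo lines above can be generalized with a glob so the same snippet works on devices with any core count. This is a sketch: the function takes the sysfs root as a parameter purely so it can be tried against a fake directory tree without root; on a real device you would pass /sys and run as root.

```shell
set_performance_governor() {
    # $1 = sysfs root (normally /sys)
    root="$1"
    for gov in "$root"/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
        # if cpufreq is absent the glob stays unexpanded, so guard each path
        if [ -e "$gov" ]; then
            echo performance > "$gov"
        fi
    done
}

# on a real device (as root): set_performance_governor /sys
```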
---
Typical output (executed in android adb shell)
### NVIDIA Jetson AGX Orin (Cortex-A78AE 2.2 GHz x 12 + Ampere GPU, 64 Tensor Cores)
```
i@orin:~/projects/ncnn/benchmark$ ./benchncnn 64 1 0 -1 0
loop_count = 64
num_threads = 1
powersave = 0
gpu_device = -1
cooling_down = 0
squeezenet min = 11.66 max = 11.80 avg = 11.74
squeezenet_int8 min = 12.24 max = 12.39 avg = 12.31
mobilenet min = 19.56 max = 19.73 avg = 19.65
mobilenet_int8 min = 16.06 max = 16.25 avg = 16.14
mobilenet_v2 min = 13.20 max = 13.41 avg = 13.29
mobilenet_v3 min = 11.39 max = 11.57 avg = 11.48
shufflenet min = 8.07 max = 8.18 avg = 8.11
shufflenet_v2 min = 8.41 max = 8.51 avg = 8.45
mnasnet min = 12.74 max = 12.91 avg = 12.79
proxylessnasnet min = 15.18 max = 15.32 avg = 15.25
efficientnet_b0 min = 26.86 max = 26.96 avg = 26.90
efficientnetv2_b0 min = 35.99 max = 36.15 avg = 36.07
regnety_400m min = 16.81 max = 16.98 avg = 16.87
blazeface min = 4.25 max = 4.37 avg = 4.29
googlenet min = 48.73 max = 48.98 avg = 48.87
googlenet_int8 min = 47.39 max = 47.60 avg = 47.49
resnet18 min = 30.93 max = 31.24 avg = 31.08
resnet18_int8 min = 55.44 max = 55.70 avg = 55.56
alexnet min = 44.19 max = 44.43 avg = 44.33
vgg16 min = 173.94 max = 174.97 avg = 174.46
vgg16_int8 min = 475.10 max = 479.37 avg = 477.33
resnet50 min = 89.50 max = 90.11 avg = 89.80
resnet50_int8 min = 106.77 max = 107.14 avg = 106.96
squeezenet_ssd min = 37.78 max = 38.35 avg = 37.93
squeezenet_ssd_int8 min = 50.48 max = 50.88 avg = 50.74
mobilenet_ssd min = 45.62 max = 46.12 avg = 45.74
mobilenet_ssd_int8 min = 37.77 max = 38.00 avg = 37.88
mobilenet_yolo min = 90.23 max = 90.49 avg = 90.35
mobilenetv2_yolov3 min = 47.27 max = 47.48 avg = 47.33
yolov4-tiny min = 60.41 max = 60.75 avg = 60.57
nanodet_m min = 19.26 max = 19.43 avg = 19.35
yolo-fastest-1.1 min = 8.16 max = 8.31 avg = 8.20
yolo-fastestv2 min = 8.26 max = 8.39 avg = 8.32
i@orin:~/projects/ncnn/benchmark$ ./benchncnn 64 2 0 -1 0
loop_count = 64
num_threads = 2
powersave = 0
gpu_device = -1
cooling_down = 0
squeezenet min = 6.83 max = 6.98 avg = 6.90
squeezenet_int8 min = 7.39 max = 7.50 avg = 7.45
mobilenet min = 10.40 max = 10.50 avg = 10.45
mobilenet_int8 min = 8.92 max = 9.09 avg = 8.99
mobilenet_v2 min = 7.67 max = 7.80 avg = 7.74
mobilenet_v3 min = 6.86 max = 7.01 avg = 6.93
shufflenet min = 6.34 max = 6.44 avg = 6.39
shufflenet_v2 min = 5.71 max = 5.83 avg = 5.76
mnasnet min = 7.47 max = 7.58 avg = 7.53
proxylessnasnet min = 8.73 max = 8.83 avg = 8.78
efficientnet_b0 min = 14.93 max = 15.13 avg = 15.03
efficientnetv2_b0 min = 20.17 max = 20.70 avg = 20.29
regnety_400m min = 12.50 max = 12.62 avg = 12.57
blazeface min = 2.95 max = 3.06 avg = 3.00
googlenet min = 26.25 max = 26.53 avg = 26.37
googlenet_int8 min = 26.54 max = 26.79 avg = 26.66
resnet18 min = 16.69 max = 16.90 avg = 16.80
resnet18_int8 min = 29.70 max = 29.93 avg = 29.81
alexnet min = 22.96 max = 23.12 avg = 23.03
vgg16 min = 88.39 max = 89.16 avg = 88.79
vgg16_int8 min = 245.86 max = 247.55 avg = 246.62
resnet50 min = 46.55 max = 46.86 avg = 46.70
resnet50_int8 min = 56.28 max = 56.63 avg = 56.43
squeezenet_ssd min = 23.65 max = 24.29 avg = 23.81
squeezenet_ssd_int8 min = 30.86 max = 31.27 avg = 30.99
mobilenet_ssd min = 25.17 max = 25.31 avg = 25.24
mobilenet_ssd_int8 min = 21.77 max = 21.97 avg = 21.84
mobilenet_yolo min = 48.03 max = 48.33 avg = 48.14
mobilenetv2_yolov3 min = 26.58 max = 26.81 avg = 26.66
yolov4-tiny min = 35.31 max = 35.53 avg = 35.41
nanodet_m min = 12.93 max = 13.08 avg = 13.01
yolo-fastest-1.1 min = 6.00 max = 6.10 avg = 6.04
yolo-fastestv2 min = 6.46 max = 6.61 avg = 6.52
i@orin:~/projects/ncnn/benchmark$ ./benchncnn 64 4 0 -1 0
loop_count = 64
num_threads = 4
powersave = 0
gpu_device = -1
cooling_down = 0
squeezenet min = 4.54 max = 4.84 avg = 4.61
squeezenet_int8 min = 4.96 max = 5.41 avg = 5.05
mobilenet min = 5.96 max = 6.23 avg = 6.04
mobilenet_int8 min = 5.21 max = 5.50 avg = 5.30
mobilenet_v2 min = 5.05 max = 5.26 avg = 5.15
mobilenet_v3 min = 4.83 max = 5.14 avg = 4.90
shufflenet min = 5.11 max = 5.34 avg = 5.18
shufflenet_v2 min = 4.13 max = 4.44 avg = 4.18
mnasnet min = 4.93 max = 5.27 avg = 5.01
proxylessnasnet min = 5.64 max = 5.89 avg = 5.72
```