benchncnn can be used to test neural network inference performance.
Only the network definition files (ncnn param) are required.
The large model binary files (ncnn bin) are not loaded; random weights are generated instead, since only speed is measured.
More model networks may be added later.
---
## Build
```shell
# assume you have already built the ncnn library successfully
# uncomment the following line in <ncnn-root-dir>/CMakeLists.txt with your favorite editor
# add_subdirectory(benchmark)
cd <ncnn-root-dir>/<your-build-dir>
make -j4
# you can find benchncnn binary in <ncnn-root-dir>/<your-build-dir>/benchmark
```
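Alternatively, recent ncnn versions expose a CMake option for this instead of requiring a manual edit; the sketch below assumes the option is named `NCNN_BUILD_BENCHMARK` (check your version's CMakeLists.txt) and shows a typical out-of-source build:

```shell
# configure and build ncnn together with benchncnn in a separate build directory
# NCNN_BUILD_BENCHMARK is assumed from recent ncnn versions; the name may differ in yours
cd <ncnn-root-dir>
mkdir -p build && cd build
cmake -DCMAKE_BUILD_TYPE=Release -DNCNN_BUILD_BENCHMARK=ON ..
make -j4
# the binary ends up in <ncnn-root-dir>/build/benchmark/benchncnn
```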
## Usage
```shell
# copy all param files to the current directory
./benchncnn [loop count] [num threads] [powersave] [gpu device] [cooling down]
```
### Run benchncnn on an Android device
```shell
# for running on android device, upload to /data/local/tmp/ folder
adb push benchncnn /data/local/tmp/
adb push <ncnn-root-dir>/benchmark/*.param /data/local/tmp/
adb shell
# executed in android adb shell
cd /data/local/tmp/
./benchncnn [loop count] [num threads] [powersave] [gpu device] [cooling down]
```
## Parameters
|param|options|default|
|---|---|---|
|loop count|1~N|4|
|num threads|1~N|max_cpu_count|
|powersave|0=all cores, 1=little cores only, 2=big cores only|0|
|gpu device|-1=cpu-only, 0=gpu0, 1=gpu1 ...|-1|
|cooling down|0=disable, 1=enable|1|
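Running with no arguments uses the defaults from the table, i.e. the equivalent of `./benchncnn 4 <max_cpu_count> 0 -1 1`. A couple of explicit invocations (the values are chosen only for illustration):

```shell
# 8 timing loops, 4 threads, big cores only, CPU-only inference, cooling down enabled
./benchncnn 8 4 2 -1 1

# same loop/thread settings, but run on the first GPU device instead of the CPU
./benchncnn 8 4 0 0 1
```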
Tips: disable the Android UI server and set the CPU and GPU to their maximum frequency
```shell
# stop the android ui server; it can be restarted later with: adb shell start
adb root
adb shell stop
# executed in android adb shell
# set cpu performance mode
echo "performance" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
# set gpu performance mode (e.g. RK3399)
echo "performance" > /sys/class/misc/mali0/device/devfreq/ff9a0000.gpu/governor
# set gpu performance mode (e.g. Android Adreno)
echo 1 > /sys/class/kgsl/kgsl-3d0/force_clk_on
echo 10000000 > /sys/class/kgsl/kgsl-3d0/idle_timer
echo "performance" > /sys/class/kgsl/kgsl-3d0/devfreq/governor
echo <max freq> > /sys/class/kgsl/kgsl-3d0/gpuclk
```
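The per-core echo commands above can be generalized with a loop. The sketch below wraps it in a function; the function name and the optional sysfs-path argument are our additions (the argument exists so the loop can be exercised outside a rooted device):

```shell
# switch every cpu core's cpufreq governor to "performance"
# the optional first argument overrides the sysfs root (useful for dry runs)
set_perf_governor() {
    sysfs="${1:-/sys/devices/system/cpu}"
    for gov in "$sysfs"/cpu[0-9]*/cpufreq/scaling_governor; do
        # skip cores whose governor file is absent or not writable (needs root)
        [ -w "$gov" ] && echo "performance" > "$gov"
    done
}

# on the device, in a rooted adb shell:
# set_perf_governor
```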
---
## Typical output (executed in Android adb shell)
### NVIDIA Jetson AGX Orin (Cortex-A78AE 2.2 GHz x 12 + Ampere GPU with 64 Tensor Cores)
```
i@orin:~/projects/ncnn/benchmark$ ./benchncnn 64 1 0 -1 0
loop_count = 64
num_threads = 1
powersave = 0
gpu_device = -1
cooling_down = 0
squeezenet min = 11.66 max = 11.80 avg = 11.74
squeezenet_int8 min = 12.24 max = 12.39 avg = 12.31
mobilenet min = 19.56 max = 19.73 avg = 19.65
mobilenet_int8 min = 16.06 max = 16.25 avg = 16.14
mobilenet_v2 min = 13.20 max = 13.41 avg = 13.29
mobilenet_v3 min = 11.39 max = 11.57 avg = 11.48
shufflenet min = 8.07 max = 8.18 avg = 8.11
shufflenet_v2 min = 8.41 max = 8.51 avg = 8.45
mnasnet min = 12.74 max = 12.91 avg = 12.79
proxylessnasnet min = 15.18 max = 15.32 avg = 15.25
efficientnet_b0 min = 26.86 max = 26.96 avg = 26.90
efficientnetv2_b0 min = 35.99 max = 36.15 avg = 36.07
regnety_400m min = 16.81 max = 16.98 avg = 16.87
blazeface min = 4.25 max = 4.37 avg = 4.29
googlenet min = 48.73 max = 48.98 avg = 48.87
googlenet_int8 min = 47.39 max = 47.60 avg = 47.49
resnet18 min = 30.93 max = 31.24 avg = 31.08
resnet18_int8 min = 55.44 max = 55.70 avg = 55.56
alexnet min = 44.19 max = 44.43 avg = 44.33
vgg16 min = 173.94 max = 174.97 avg = 174.46
vgg16_int8 min = 475.10 max = 479.37 avg = 477.33
resnet50 min = 89.50 max = 90.11 avg = 89.80
resnet50_int8 min = 106.77 max = 107.14 avg = 106.96
squeezenet_ssd min = 37.78 max = 38.35 avg = 37.93
squeezenet_ssd_int8 min = 50.48 max = 50.88 avg = 50.74
mobilenet_ssd min = 45.62 max = 46.12 avg = 45.74
mobilenet_ssd_int8 min = 37.77 max = 38.00 avg = 37.88
mobilenet_yolo min = 90.23 max = 90.49 avg = 90.35
mobilenetv2_yolov3 min = 47.27 max = 47.48 avg = 47.33
yolov4-tiny min = 60.41 max = 60.75 avg = 60.57
nanodet_m min = 19.26 max = 19.43 avg = 19.35
yolo-fastest-1.1 min = 8.16 max = 8.31 avg = 8.20
yolo-fastestv2 min = 8.26 max = 8.39 avg = 8.32
i@orin:~/projects/ncnn/benchmark$ ./benchncnn 64 2 0 -1 0
loop_count = 64
num_threads = 2
powersave = 0
gpu_device = -1
cooling_down = 0
squeezenet min = 6.83 max = 6.98 avg = 6.90
squeezenet_int8 min = 7.39 max = 7.50 avg = 7.45
mobilenet min = 10.40 max = 10.50 avg = 10.45
mobilenet_int8 min = 8.92 max = 9.09 avg = 8.99
mobilenet_v2 min = 7.67 max = 7.80 avg = 7.74
mobilenet_v3 min = 6.86 max = 7.01 avg = 6.93
shufflenet min = 6.34 max = 6.44 avg = 6.39
shufflenet_v2 min = 5.71 max = 5.83 avg = 5.76
mnasnet min = 7.47 max = 7.58 avg = 7.53
proxylessnasnet min = 8.73 max = 8.83 avg = 8.78
efficientnet_b0 min = 14.93 max = 15.13 avg = 15.03
efficientnetv2_b0 min = 20.17 max = 20.70 avg = 20.29
regnety_400m min = 12.50 max = 12.62 avg = 12.57
blazeface min = 2.95 max = 3.06 avg = 3.00
googlenet min = 26.25 max = 26.53 avg = 26.37
googlenet_int8 min = 26.54 max = 26.79 avg = 26.66
resnet18 min = 16.69 max = 16.90 avg = 16.80
resnet18_int8 min = 29.70 max = 29.93 avg = 29.81
alexnet min = 22.96 max = 23.12 avg = 23.03
vgg16 min = 88.39 max = 89.16 avg = 88.79
vgg16_int8 min = 245.86 max = 247.55 avg = 246.62
resnet50 min = 46.55 max = 46.86 avg = 46.70
resnet50_int8 min = 56.28 max = 56.63 avg = 56.43
squeezenet_ssd min = 23.65 max = 24.29 avg = 23.81
squeezenet_ssd_int8 min = 30.86 max = 31.27 avg = 30.99
mobilenet_ssd min = 25.17 max = 25.31 avg = 25.24
mobilenet_ssd_int8 min = 21.77 max = 21.97 avg = 21.84
mobilenet_yolo min = 48.03 max = 48.33 avg = 48.14
mobilenetv2_yolov3 min = 26.58 max = 26.81 avg = 26.66
yolov4-tiny min = 35.31 max = 35.53 avg = 35.41
nanodet_m min = 12.93 max = 13.08 avg = 13.01
yolo-fastest-1.1 min = 6.00 max = 6.10 avg = 6.04
yolo-fastestv2 min = 6.46 max = 6.61 avg = 6.52
i@orin:~/projects/ncnn/benchmark$ ./benchncnn 64 4 0 -1 0
loop_count = 64
num_threads = 4
powersave = 0
gpu_device = -1
cooling_down = 0
squeezenet min = 4.54 max = 4.84 avg = 4.61
squeezenet_int8 min = 4.96 max = 5.41 avg = 5.05
mobilenet min = 5.96 max = 6.23 avg = 6.04
mobilenet_int8 min = 5.21 max = 5.50 avg = 5.30
mobilenet_v2 min = 5.05 max = 5.26 avg = 5.15
mobilenet_v3 min = 4.83 max = 5.14 avg = 4.90
shufflenet min = 5.11 max = 5.34 avg = 5.18
shufflenet_v2 min = 4.13 max = 4.44 avg = 4.18
mnasnet min = 4.93 max = 5.27 avg = 5.01
proxylessnasnet min = 5.64 max = 5.89 avg = 5.72
```