[Free] ncnn-master.zip resource - CSDN Library

2000 files in total

cpp: 835

h: 606

py: 528

Deep learning

Points required: 0 | 112 views | Uploaded 2023-11-22 15:22:09 | 15.08MB ZIP


ncnn-master.zip (2000 files)

mat_pixel_rotate.cpp 229KB

onnx2ncnn.cpp 207KB

gpu.cpp 194KB

gemm_arm.cpp 172KB

gemm_riscv.cpp 152KB

command.cpp 135KB

gemm_arm_asimdhp.cpp 99KB

ir.cpp 91KB

mat_pixel.cpp 87KB

ncnnoptimize.cpp 84KB

mxnet2ncnn.cpp 81KB

net.cpp 80KB

mat_pixel_affine.cpp 78KB

allocator.cpp 73KB

requantize_loongarch.cpp 71KB

requantize_mips.cpp 70KB

fuse_multiheadattention.cpp 69KB

requantize_arm.cpp 68KB

cpu.cpp 67KB

innerproduct_arm.cpp 64KB

dequantize_arm.cpp 60KB

packing_riscv.cpp 60KB

innerproduct_loongarch.cpp 59KB

innerproduct_mips.cpp 58KB

mlir2ncnn.cpp 58KB

gru_arm.cpp 57KB

packing_arm.cpp 55KB

eltwise_arm_asimdhp.cpp 55KB

ncnn2table.cpp 55KB

convolutiondepthwise_arm.cpp 54KB

dequantize_arm_asimdhp.cpp 53KB

convolution_arm.cpp 53KB

deconvolution_arm_asimdhp.cpp 51KB

softmax_arm.cpp 50KB

eltwise_arm.cpp 47KB

caffe2ncnn.cpp 45KB

deconvolution_arm.cpp 45KB

c_api.cpp 43KB

F_interpolate.cpp 42KB

binaryop_riscv.cpp 42KB

gru_arm_asimdhp.cpp 41KB

mat.cpp 40KB

softmax_arm_asimdhp.cpp 40KB

lstm_arm_asimdhp.cpp 40KB

binaryop_arm.cpp 39KB

mat_pixel_drawing.cpp 39KB

interp_riscv.cpp 38KB

lstm_arm.cpp 38KB

slice_arm.cpp 38KB

convolution_loongarch.cpp 37KB

convolution_mips.cpp 37KB

reduction.cpp 36KB

deconvolutiondepthwise_riscv.cpp 35KB

quantize_arm.cpp 35KB

pass_level2.cpp 35KB

gru_riscv.cpp 34KB

dequantize_mips.cpp 32KB

darknet2ncnn.cpp 32KB

dequantize_loongarch.cpp 32KB

innerproduct_riscv.cpp 32KB

convolutiondepthwise_loongarch.cpp 32KB

convolutiondepthwise_mips.cpp 32KB

quantize_arm_asimdhp.cpp 31KB

pooling_riscv.cpp 31KB

interp_arm_asimdhp.cpp 31KB

innerproduct_arm_asimdhp.cpp 31KB

concat_arm.cpp 31KB

rnn_arm.cpp 29KB

simpleomp.cpp 29KB

rnn_arm_asimdhp.cpp 29KB

interp_arm.cpp 29KB

pixelshuffle_arm.cpp 28KB

convolution1d_riscv.cpp 28KB

pooling_arm_asimdhp.cpp 28KB

fuse_dynamic_adaptive_pool.cpp 27KB

shufflechannel_arm.cpp 27KB

padding_arm.cpp 26KB

reshape_arm.cpp 26KB

deconvolutiondepthwise_arm.cpp 26KB

crop_arm.cpp 26KB

pooling_arm.cpp 25KB

fuse_expression.cpp 25KB

convolution_arm_asimdhp.cpp 25KB

F_local_response_norm.cpp 24KB

deconvolutiondepthwise_arm_asimdhp.cpp 23KB

binaryop_arm_asimdhp.cpp 23KB

padding_riscv.cpp 22KB

gemm_arm_vfpv4.cpp 22KB

gridsample.cpp 22KB

convolutiondepthwise_arm_asimdhp.cpp 22KB

flatten_arm.cpp 21KB

deconvolution_riscv.cpp 21KB

convolutiondepthwise.cpp 21KB

binaryop_loongarch.cpp 20KB

nn_MultiheadAttention.cpp 20KB

binaryop_mips.cpp 20KB

packing_mips.cpp 19KB

quantize_loongarch.cpp 19KB

packing_loongarch.cpp 19KB

quantize_mips.cpp 19KB


benchncnn can be used to test neural network inference performance.

Only the network definition files (ncnn param) are required.

The large model binary files (ncnn bin) are not loaded; the weights are generated randomly for the speed test.

If no model is specified, benchncnn runs its default list of models. More model networks may be added later.

---
Build

```shell
# assume you have already built the ncnn library successfully
# uncomment the following line in <ncnn-root-dir>/CMakeLists.txt with your favorite editor
# add_subdirectory(benchmark)

cd <ncnn-root-dir>/<your-build-dir>
make -j4

# you can find the benchncnn binary in <ncnn-root-dir>/<your-build-dir>/benchmark
```

Usage

```shell
# copy all param files to the current directory
./benchncnn [loop count] [num threads] [powersave] [gpu device] [cooling down] [(key=value)...]
  param=model.param
  shape=[227,227,3],..
```

run benchncnn on android device

```shell
# for running on android device, upload to /data/local/tmp/ folder
adb push benchncnn /data/local/tmp/
adb push <ncnn-root-dir>/benchmark/*.param /data/local/tmp/
adb shell

# executed in android adb shell
cd /data/local/tmp/
./benchncnn [loop count] [num threads] [powersave] [gpu device] [cooling down] [(key=value)...]
  param=model.param
  shape=[227,227,3],..
```

Parameter

|param|options|default|
|---|---|---|
|loop count|1~N|4|
|num threads|1~N|max_cpu_count|
|powersave|0=all cores, 1=little cores only, 2=big cores only|0|
|gpu device|-1=cpu-only, 0=gpu0, 1=gpu1 ...|-1|
|cooling down|0=disable, 1=enable|1|
|param|ncnn model.param filepath|-|
|shape|model input shapes, in whc (width, height, channels) format|-|

Tips: Disable the android UI server and set CPU and GPU to max frequency

```shell
# stopping android ui server, can be restarted later via adb shell start
adb root
adb shell stop

# executed in android adb shell
# set cpu performance mode
echo "performance" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor

# set gpu performance mode (eg. RK3399)
echo "performance" > /sys/class/misc/mali0/device/devfreq/ff9a0000.gpu/governor

# set gpu performance mode (eg. Android Adreno)
echo 1 > /sys/class/kgsl/kgsl-3d0/force_clk_on
echo 10000000 > /sys/class/kgsl/kgsl-3d0/idle_timer
echo "performance" > /sys/class/kgsl/kgsl-3d0/devfreq/governor
echo <max freq> > /sys/class/kgsl/kgsl-3d0/gpuclk
```

---

Typical output (executed in android adb shell)

### NVIDIA Jetson AGX Orin (Cortex-A78AE 2.2 GHz x 12 + [email protected] GHz Tensor Cores 64)

```
i@orin:~/projects/ncnn/benchmark$ ./benchncnn 64 1 0 -1 0
loop_count = 64
num_threads = 1
powersave = 0
gpu_device = -1
cooling_down = 0
squeezenet min = 11.66 max = 11.80 avg = 11.74
squeezenet_int8 min = 12.24 max = 12.39 avg = 12.31
mobilenet min = 19.56 max = 19.73 avg = 19.65
mobilenet_int8 min = 16.06 max = 16.25 avg = 16.14
mobilenet_v2 min = 13.20 max = 13.41 avg = 13.29
mobilenet_v3 min = 11.39 max = 11.57 avg = 11.48
shufflenet min = 8.07 max = 8.18 avg = 8.11
shufflenet_v2 min = 8.41 max = 8.51 avg = 8.45
mnasnet min = 12.74 max = 12.91 avg = 12.79
proxylessnasnet min = 15.18 max = 15.32 avg = 15.25
efficientnet_b0 min = 26.86 max = 26.96 avg = 26.90
efficientnetv2_b0 min = 35.99 max = 36.15 avg = 36.07
regnety_400m min = 16.81 max = 16.98 avg = 16.87
blazeface min = 4.25 max = 4.37 avg = 4.29
googlenet min = 48.73 max = 48.98 avg = 48.87
googlenet_int8 min = 47.39 max = 47.60 avg = 47.49
resnet18 min = 30.93 max = 31.24 avg = 31.08
resnet18_int8 min = 55.44 max = 55.70 avg = 55.56
alexnet min = 44.19 max = 44.43 avg = 44.33
vgg16 min = 173.94 max = 174.97 avg = 174.46
vgg16_int8 min = 475.10 max = 479.37 avg = 477.33
resnet50 min = 89.50 max = 90.11 avg = 89.80
resnet50_int8 min = 106.77 max = 107.14 avg = 106.96
squeezenet_ssd min = 37.78 max = 38.35 avg = 37.93
squeezenet_ssd_int8 min = 50.48 max = 50.88 avg = 50.74
mobilenet_ssd min = 45.62 max = 46.12 avg = 45.74
mobilenet_ssd_int8 min = 37.77 max = 38.00 avg = 37.88
mobilenet_yolo min = 90.23 max = 90.49 avg = 90.35
mobilenetv2_yolov3 min = 47.27 max = 47.48 avg = 47.33
yolov4-tiny min = 60.41 max = 60.75 avg = 60.57
nanodet_m min = 19.26 max = 19.43 avg = 19.35
yolo-fastest-1.1 min = 8.16 max = 8.31 avg = 8.20
yolo-fastestv2 min = 8.26 max = 8.39 avg = 8.32
i@orin:~/projects/ncnn/benchmark$ ./benchncnn 64 2 0 -1 0
loop_count = 64
num_threads = 2
powersave = 0
gpu_device = -1
cooling_down = 0
squeezenet min = 6.83 max = 6.98 avg = 6.90
squeezenet_int8 min = 7.39 max = 7.50 avg = 7.45
mobilenet min = 10.40 max = 10.50 avg = 10.45
mobilenet_int8 min = 8.92 max = 9.09 avg = 8.99
mobilenet_v2 min = 7.67 max = 7.80 avg = 7.74
mobilenet_v3 min = 6.86 max = 7.01 avg = 6.93
shufflenet min = 6.34 max = 6.44 avg = 6.39
shufflenet_v2 min = 5.71 max = 5.83 avg = 5.76
mnasnet min = 7.47 max = 7.58 avg = 7.53
proxylessnasnet min = 8.73 max = 8.83 avg = 8.78
efficientnet_b0 min = 14.93 max = 15.13 avg = 15.03
efficientnetv2_b0 min = 20.17 max = 20.70 avg = 20.29
regnety_400m min = 12.50 max = 12.62 avg = 12.57
blazeface min = 2.95 max = 3.06 avg = 3.00
googlenet min = 26.25 max = 26.53 avg = 26.37
googlenet_int8 min = 26.54 max = 26.79 avg = 26.66
resnet18 min = 16.69 max = 16.90 avg = 16.80
resnet18_int8 min = 29.70 max = 29.93 avg = 29.81
alexnet min = 22.96 max = 23.12 avg = 23.03
vgg16 min = 88.39 max = 89.16 avg = 88.79
vgg16_int8 min = 245.86 max = 247.55 avg = 246.62
resnet50 min = 46.55 max = 46.86 avg = 46.70
resnet50_int8 min = 56.28 max = 56.63 avg = 56.43
squeezenet_ssd min = 23.65 max = 24.29 avg = 23.81
squeezenet_ssd_int8 min = 30.86 max = 31.27 avg = 30.99
mobilenet_ssd min = 25.17 max = 25.31 avg = 25.24
mobilenet_ssd_int8 min = 21.77 max = 21.97 avg = 21.84
mobilenet_yolo min = 48.03 max = 48.33 avg = 48.14
mobilenetv2_yolov3 min = 26.58 max = 26.81 avg = 26.66
yolov4-tiny min = 35.31 max = 35.53 avg = 35.41
nanodet_m min = 12.93 max = 13.08 avg = 13.01
yolo-fastest-1.1 min = 6.00 max = 6.10 avg = 6.04
yolo-fastestv2 min = 6.46 max = 6.61 avg = 6.52
i@orin:~/projects/ncnn/benchmark$ ./benchncnn 64 4 0 -1 0
loop_count = 64
num_threads = 4
powersave = 0
gpu_device = -1
cooling_down = 0
squeezenet min = 4.54 max = 4.84 avg = 4.61
squeezenet_int8 min = 4.96 max = 5.41 avg = 5.05
mobilenet min = 5.96 max = 6.23 avg = 6.04
mobilenet_int8 min = 5.21 max = 5.50 avg = 5.30
mobilenet_v2 min = 5.05 max = 5.26 avg = 5.15
mobilenet_v3 min = 4.83 max = 5.14 avg = 4.90
shufflenet
```
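
To compare runs like the ones above (for example 1 thread against 2 threads), the per-model `avg` values can be divided pairwise. The following is only a minimal sketch, not part of benchncnn: the function name `bench_speedup` and the log file names are hypothetical, and it assumes each result line has the form `<model> min = X max = Y avg = Z` as in the sample output, with each run saved to its own file.

```shell
#!/bin/sh
# bench_speedup BASELINE_LOG OTHER_LOG   (hypothetical helper, not shipped with ncnn)
# Prints "<model> <ratio>" where ratio = baseline avg / other avg,
# so a value above 1 means the second run was faster.
# Assumes result lines look like: "<model> min = X max = Y avg = Z".
bench_speedup() {
    awk '
        # Only result lines have "min" as field 2 and "avg" as field 8;
        # header lines such as "loop_count = 64" are skipped.
        $2 == "min" && $8 == "avg" {
            if (FILENAME == ARGV[1]) {
                base[$1] = $10            # remember the baseline average
            } else if (($1 in base) && $10 > 0) {
                printf "%s %.2f\n", $1, base[$1] / $10
            }
        }
    ' "$1" "$2"
}
```

A possible way to use it, assuming the two runs were redirected to `t1.log` and `t2.log` (hypothetical names): `./benchncnn 64 1 0 -1 0 > t1.log`, `./benchncnn 64 2 0 -1 0 > t2.log`, then `bench_speedup t1.log t2.log`. With the Orin figures above, squeezenet goes from an average of 11.74 with one thread to 6.90 with two, a ratio of about 1.70.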
