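Artificial Intelligence - Project Practice - Face Detection - 1MB Lightweight Face Detection Model - Detects Multiple Faces Simultaneously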

229 files in total

py: 78

jpg: 43

md: 10

Artificial intelligence

Face recognition

Multi-object detection

Face detection

144 views · Uploaded 2022-03-23 16:58:36 · 44.63MB ZIP

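In today's digital era, artificial intelligence (AI) has become an important driver of technological development, and face recognition in particular is widely used. This project focuses on one specific area, face detection, and in particular on a lightweight 1MB face detection model that can efficiently detect multiple faces at once, which makes it far more practical on resource-constrained devices.

1. **Artificial intelligence and face recognition**: Artificial intelligence is the science of simulating human intelligence. One of its branches is computer vision, which includes face recognition. Face recognition is a biometric method that uses computers and image processing to identify or verify a person's identity. It relies on features such as the shape, texture, and color of the face, and is widely used in security surveillance, social media, mobile payment, and other fields.
2. **Multi-object detection**: Multi-object detection is a key task in computer vision whose goal is to identify and locate multiple objects in an image. For faces, this means the model must not only decide whether faces are present but also determine the position and size of each one. Multi-object detection is essential for complex scenes containing several people, such as surveillance video or group photos.
3. **Lightweight face detection model**: The main design goal of such a model is to keep accuracy high while reducing model size and compute requirements as much as possible. A model size of 1MB means it can run on low-power devices such as smartphones, embedded systems, and even IoT devices, which is especially useful for real-time applications such as live video stream processing.
4. **Ultra-Light-Fast-Generic-Face-Detector-1MB-master**: The file name suggests an open-source project that may contain trained model weights, source code, sample data, and documentation. "Ultra-Light" emphasizes the lightweight design, "Fast" indicates fast inference, "Generic" implies that the model generalizes to a wide variety of faces, and "1MB" refers to the model size of about 1MB. The "-master" suffix probably indicates that the archive was taken from the repository's master branch.
5. **Implementation and working principle**: Lightweight face detection models typically use deep learning algorithms such as YOLO (You Only Look Once), SSD (Single Shot MultiBox Detector), or MTCNN (Multi-Task Cascaded Convolutional Networks), the last of which is better suited to small models. These models learn facial features through multi-layer neural networks and use sliding-window or anchor-box strategies to detect faces at different scales and angles.
6. **Application scenarios**: This kind of model can be used in many practical scenarios, including but not limited to:
   - Security surveillance: automatically detecting and tracking faces in public places to improve security.
   - Mobile device unlocking: quickly unlocking a device by recognizing the user's face.
   - Social media: automatically tagging and identifying people in photos.
   - Online education: confirming student identity in remote video teaching.
   - Automated retail: face payment in unmanned convenience stores.
7. **Model optimization and evaluation**: To improve performance, developers usually optimize the model, for example by using quantization to reduce model size or transfer learning to speed up training. Common evaluation metrics include precision, recall, F1 score, and inference speed.

This 1MB lightweight face detection model shows how a model can stay lightweight and handle multi-object detection while keeping performance high, which makes it a solid technical foundation for practical applications.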
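The evaluation metrics named in item 7 (precision, recall, F1 score) can be illustrated with a minimal sketch. It assumes boxes in `(x1, y1, x2, y2)` form, an IoU threshold of 0.5, and greedy one-to-one matching in descending score order. The function names `iou` and `precision_recall_f1` are hypothetical and are not taken from the project.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(detections, ground_truth, iou_threshold=0.5):
    """detections: list of (score, box); ground_truth: list of boxes."""
    matched = set()  # indices of ground-truth boxes already claimed
    true_positives = 0
    # Visit detections from most to least confident.
    for _score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        # Assumption: a detection counts as correct at IoU >= threshold.
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    precision = true_positives / len(detections) if detections else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


# Illustrative data: two annotated faces, three detections (one spurious).
ground_truth = [(10, 10, 50, 50), (100, 100, 140, 140)]
detections = [
    (0.9, (12, 12, 50, 50)),
    (0.8, (200, 200, 240, 240)),
    (0.6, (100, 100, 140, 140)),
]
print(precision_recall_f1(detections, ground_truth))
```

In this example two of the three detections match a ground-truth face, so precision is 2/3, recall is 1.0, and F1 is 0.8. Greedy matching is only one possible matching rule; benchmark toolkits may define matches differently.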

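Artificial Intelligence - Project Practice - Face Detection - 1MB Lightweight Face Detection Model - Detects Multiple Faces Simultaneously (229 files)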

RFB-320.bin 1.04MB

slim_320.bin 1008KB

RFB-320.caffemodel 1.05MB

slim-320.caffemodel 1013KB

.clang-format 73B

UltraFace.cpp 8KB

UltraFace.cpp 6KB

cv_dnn_ultraface.cpp 5KB

main.cpp 1KB

data 7B

variables.data-00000-of-00001 1.17MB

variables.data-00000-of-00001 1.09MB

.gitignore 209B

.gitmodules 99B

Matrix.h 55KB

Rect.h 18KB

HalideRuntime.h 11KB

MNNDefine.h 2KB

cv_dnn_ultraface.h 2KB

MNNForwardType.h 2KB

MNNSharedContext.h 701B

Interpreter.hpp 8KB

Tensor.hpp 8KB

Backend.hpp 7KB

ImageProcess.hpp 4KB

UltraFace.hpp 2KB

AutoTime.hpp 824B

revertMNNModel.hpp 629B

ErrorCode.hpp 619B

NonCopyable.hpp 586B

variables.index 13KB

variables.index 10KB

img8.jpeg 529KB

img1.jpeg 439KB

img8.jpeg 285KB

img1.jpeg 271KB

img4.jpeg 231KB

img2.jpeg 189KB

img5.jpeg 179KB

img5.jpeg 121KB

img4.jpeg 119KB

img2.jpeg 112KB

4.jpg 746KB

test_output_RFB.jpg 522KB

test_output_slim.jpg 521KB

test_output_origin_slim.jpg 516KB

test_output_origin_RFB.jpg 516KB

17.jpg 460KB

26.jpg 281KB

test_input.jpg 256KB

27.jpg 239KB

1.jpg 191KB

result.jpg 177KB

result.jpg 176KB

2.jpg 173KB

img3.jpg 158KB

25.jpg 148KB

1.jpg 143KB

11.jpg 142KB

26.jpg 137KB

15.jpg 134KB

16.jpg 128KB

3.jpg 125KB

21.jpg 125KB

6.jpg 123KB

12.jpg 123KB

27.jpg 112KB

22.jpg 106KB

10.jpg 106KB

18.jpg 104KB

2.jpg 92KB

3.jpg 92KB

test.jpg 92KB

5.jpg 90KB

24.jpg 87KB

1.jpg 85KB

9.jpg 85KB

20.jpg 81KB

8.jpg 79KB

23.jpg 67KB

19.jpg 58KB

img3.jpg 53KB

2.jpg 48KB

4.jpg 18KB

13.jpg 15KB

rfb_320.json 13KB

slim_320.json 9KB

LICENSE 1KB

wider_hard_val.mat 414KB

wider_medium_val.mat 403KB

wider_easy_val.mat 399KB

wider_face_val.mat 388KB

README.md 12KB

README_CN.md 11KB

README.md 4KB

README.md 3KB

README.md 2KB

[English](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB) | [中文简体](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/blob/master/README_CN.md)

# Ultra-Light-Fast-Generic-Face-Detector-1MB
# Ultra-lightweight face detection model

![img1](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/blob/master/readme_imgs/27.jpg)

This model is a lightweight face detection model designed for edge computing devices.

- In terms of model size, the default FP32 precision (.pth) file size is **1.04~1.1MB**, and the size after int8 quantization by the inference framework is about **300KB**.
- In terms of computation, the model needs about **90~109 MFlops** at an input resolution of 320x240.
- There are two versions of the model: version-slim (simplified network backbone, slightly faster) and version-RFB (with the modified RFB module, higher precision).
- Models pre-trained on Widerface are provided for input resolutions of 320x240 and 640x480, to work better in different application scenarios.
- Support for onnx export for ease of migration and inference.
- [Provide NCNN C++ inference code](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/tree/master/ncnn).
- [Provide MNN C++ inference code](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/tree/master/MNN), [MNN Python inference code](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/tree/master/MNN/python), [FP32/INT8 quantized models](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/tree/master/MNN/model).
- [Provide Caffe model](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/tree/master/caffe/model) and [onnx2caffe conversion code](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/tree/master/caffe).
- [Caffe python inference code](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/blob/master/caffe/ultra_face_caffe_inference.py) and [OpencvDNN inference code](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/blob/master/caffe/ultra_face_opencvdnn_inference.py).

## Tested environments
- Ubuntu 16.04, Ubuntu 18.04, Windows 10 (for inference)
- Python3.6
- Pytorch1.2
- CUDA10.0 + CUDNN7.6

## Accuracy, speed, model size comparison
The training set is a VOC-format data set generated from the widerface data set together with the cleaned widerface labels provided by [Retinaface](https://github.com/deepinsight/insightface/blob/master/RetinaFace/README.md) (PS: I obtained the following test results myself, so they may be partially inconsistent).

### Widerface test
- Test accuracy on the WIDER FACE val set (single-scale input resolution: **320x240 or scaling by the maximum side length of 320**)

Model|Easy Set|Medium Set|Hard Set
------|--------|----------|--------
libfacedetection v1 (caffe)|0.65 |0.5 |0.233
libfacedetection v2 (caffe)|0.714 |0.585 |0.306
Retinaface-Mobilenet-0.25 (Mxnet) |0.745|0.553|0.232
version-slim|0.77 |0.671 |0.395
version-RFB|**0.787** |**0.698** |**0.438**

- Test accuracy on the WIDER FACE val set (single-scale input resolution: **VGA 640x480 or scaling by the maximum side length of 640**)

Model|Easy Set|Medium Set|Hard Set
------|--------|----------|--------
libfacedetection v1 (caffe)|0.741 |0.683 |0.421
libfacedetection v2 (caffe)|0.773 |0.718 |0.485
Retinaface-Mobilenet-0.25 (Mxnet) |**0.879**|0.807|0.481
version-slim|0.853 |0.819 |0.539
version-RFB|0.855 |**0.822** |**0.579**

> - This part mainly tests performance on the test set at medium and small resolutions.
> - RetinaFace-mnet (Retinaface-Mobilenet-0.25) comes from the great work [insightface](https://github.com/deepinsight/insightface). When testing this network, the original image is scaled so that its maximum side length is 320 or 640, so faces are not deformed; the other networks use a fixed-size resize. For reference, the best single-scale result of RetinaFace-mnet on the val set, at scale 1600, was 0.887 (Easy) / 0.87 (Medium) / 0.791 (Hard).

### Terminal device inference speed
- Raspberry Pi 4B MNN Inference Latency **(unit: ms)** (ARM/A72x4/1.5GHz/input resolution: **320x240**/int8 quantization)

Model|1 core|2 core|3 core|4 core
------|--------|----------|--------|--------
libfacedetection v1|**28** |**16**|**12**|9.7
Official Retinaface-Mobilenet-0.25 (Mxnet) |46|25|18.5|15
version-slim|29 |**16** |**12**|**9.5**
version-RFB|35 |19.6 |14.8| 11

- iPhone 6s Plus MNN (version tag: 0.2.1.5) Inference Latency (input resolution: **320x240**) [Data comes from MNN official](https://www.yuque.com/mnn/en/demo_zoo#bXsRY)

Model|Inference Latency(ms)
------|--------
[slim-320](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/blob/master/MNN/model/version-slim/slim-320.mnn) |6.33
[RFB-320](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/blob/master/MNN/model/version-RFB/RFB-320.mnn)|7.8

- [Kendryte K210](https://kendryte.com/) NNCase Inference Latency (RISC-V/400MHz/input resolution: **320x240**/int8 quantization) [Data comes from NNCase](https://github.com/kendryte/nncase/tree/master/examples/fast_facedetect)

Model|Inference Latency(ms)
------|--------
[slim-320](https://github.com/kendryte/nncase/tree/master/examples/fast_facedetect/k210/kpu_fast_facedetect_example/slim-320.kmodel)|65.6
[RFB-320](https://github.com/kendryte/nncase/tree/master/examples/fast_facedetect/k210/kpu_fast_facedetect_example/RFB-320.kmodel)|164.8

### Model size comparison
- Comparison of several open source lightweight face detection models:

Model|model file size (MB)
------|--------
libfacedetection v1 (caffe)| 2.58
libfacedetection v2 (caffe)| 3.34
Official Retinaface-Mobilenet-0.25 (Mxnet) | 1.68
version-slim| **1.04**
version-RFB| **1.11**

## Generate VOC format training data set and training process

1. Download the widerface dataset from the official website, or download the training set I provided, and extract it into the ./data folder:

   (1) The clean widerface data pack after filtering out faces smaller than 10px by 10px: [Baidu cloud disk (extraction code: cbiu)](https://pan.baidu.com/s/1MR0ZOKHUP_ArILjbAn03sw), [Google Drive](https://drive.google.com/open?id=1OBY-Pk5hkcVBX1dRBOeLI4e4OCvqJRnH)

   (2) Complete widerface data package without filtering small faces: [Baidu cloud disk (extraction code: ievk)](https://pan.baidu.com/s/1faHNz9ZrtEmr_yw48GW7ZA), [Google Drive](https://drive.google.com/open?id=1sbBrDRgctEkymIpCh1OZBrU5qBS-SnCP)

2. **(PS: If you download the filtered package in (1) above, you don't need to perform this step)** Because widerface contains many small and unclear faces, which hinders the convergence of efficient models, the data needs to be filtered before training. By default, faces smaller than 10 pixels by 10 pixels are filtered out. Run ./data/wider_face_2_voc_add_landmark.py:

   ```Shell
   python3 ./data/wider_face_2_voc_add_landmark.py
   ```

   After the program finishes, the **wider_face_add_lm_10_10** folder will be generated in the ./data directory. Its contents are the same as data package (1) after decompression. The complete directory structure is as follows:

   ```Shell
   data/
     retinaface_labels/
       test/
       train/
       val/
     wider_face/
       WIDER_test/
       WIDER_train/
       WIDER_val/
     wider_face_add_lm_10_10/
       Annotations/
       ImageSets/
       JPEGImages/
     wider_face_2_voc_add_landmark.py
   ```

3. At this point, the VOC training set is ready. There are two scripts, **train-version-slim.sh** and **train-version-RFB.sh**, in the root directory of the project.
The former is used to train the **slim version** model, and the latter is used to train the **RFB version** model. The default parameters have been set; if the par
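
The small-face filtering described in step 2 can be sketched as follows. This is not the code of `wider_face_2_voc_add_landmark.py`; it is a minimal illustration that assumes boxes in `(x, y, w, h)` form and assumes a face is dropped when either side is below the 10-pixel threshold. The function name `filter_small_faces` and the sample boxes are hypothetical.

```python
def filter_small_faces(boxes, min_width=10, min_height=10):
    """Return only the boxes that are at least min_width x min_height pixels.

    boxes: iterable of (x, y, w, h) tuples (assumed annotation layout).
    """
    kept = []
    for x, y, w, h in boxes:
        # Assumption: a face is too small if either side is under the limit.
        if w < min_width or h < min_height:
            continue
        kept.append((x, y, w, h))
    return kept


# Made-up boxes: one large face, one tiny face, one face that is too short.
faces = [(449, 330, 122, 149), (78, 221, 7, 8), (30, 40, 12, 9)]
print(filter_small_faces(faces))
```

Only the first box survives here. Whether a box that is narrow but tall should be kept is a design choice; the stricter "either side" rule is used in this sketch.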
