<p float="center">
<img src="assets/logo2.png?raw=true" width="99.1%" />
</p>
# Faster Segment Anything (MobileSAM) and Everything (MobileSAMv2)
:pushpin: MobileSAMv2, available at [ResearchGate](https://www.researchgate.net/publication/376579294_MobileSAMv2_Faster_Segment_Anything_to_Everything) and [arXiv](https://arxiv.org/abs/2312.09579), replaces the grid-search prompt sampling in SAM with object-aware prompt sampling for faster **segment everything (SegEvery)**.
:pushpin: MobileSAM, available at [ResearchGate](https://www.researchgate.net/publication/371851844_Faster_Segment_Anything_Towards_Lightweight_SAM_for_Mobile_Applications) and [arXiv](https://arxiv.org/pdf/2306.14289.pdf), replaces the heavyweight image encoder in SAM with a lightweight image encoder for faster **segment anything (SegAny)**.
**Support for ONNX model export**. Feel free to test it on your devices and share your results with us.
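As a sketch of how the export might be invoked, assuming MobileSAM mirrors the export script of the original SAM repository (the script path and flags below are assumptions; check the repo's `scripts` directory):

```
python scripts/export_onnx_model.py --checkpoint ./weights/mobile_sam.pt --model-type vit_t --output ./mobile_sam.onnx
```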
**A demo of MobileSAM** running on **CPU** is available at the [hugging face demo](https://huggingface.co/spaces/dhkim2810/MobileSAM). On a Mac i5 CPU, inference takes around 3s. On the hugging face demo, the interface and less performant CPUs make it slower, but it still works well. Stay tuned for a new version with more features! You can also run a demo of MobileSAM on [your local PC](https://github.com/ChaoningZhang/MobileSAM/tree/master/app).
:grapes: Media coverage and projects that adapt from the original SAM to MobileSAM (thank you all!)
* **2023/07/03**: [joliGEN](https://github.com/jolibrain/joliGEN) supports MobileSAM for faster and lightweight mask refinement for image inpainting with Diffusion and GAN.
* **2023/07/03**: [MobileSAM-in-the-Browser](https://github.com/akbartus/MobileSAM-in-the-Browser) shows a demo of running MobileSAM on the browser of your local PC or Mobile phone.
* **2023/07/02**: [Inpaint-Anything](https://github.com/qiaoyu1002/Inpaint-Anything) supports MobileSAM for faster and lightweight Inpaint Anything
* **2023/07/02**: [Personalize-SAM](https://github.com/qiaoyu1002/Personalize-SAM) supports MobileSAM for faster and lightweight Personalize Segment Anything with 1 Shot
* **2023/07/01**: [MobileSAM-in-the-Browser](https://github.com/akbartus/MobileSAM-in-the-Browser) makes an example implementation of MobileSAM in the browser.
* **2023/06/30**: [SegmentAnythingin3D](https://github.com/Jumpat/SegmentAnythingin3D) supports MobileSAM to segment anything in 3D efficiently.
* **2023/06/30**: MobileSAM has been featured by [AK](https://twitter.com/_akhaliq?lang=en) for the second time, see [AK's MobileSAM tweet](https://twitter.com/_akhaliq/status/1674410573075718145). Retweets are welcome.
* **2023/06/29**: [AnyLabeling](https://github.com/vietanhdev/anylabeling) supports MobileSAM for auto-labeling.
* **2023/06/29**: [SonarSAM](https://github.com/wangsssky/SonarSAM) supports MobileSAM for full fine-tuning of the image encoder.
* **2023/06/29**: [Stable Diffusion WebUI](https://github.com/continue-revolution/sd-webui-segment-anything) supports MobileSAM.
* **2023/06/28**: [Grounding-SAM](https://github.com/IDEA-Research/Grounded-Segment-Anything) supports MobileSAM with [Grounded-MobileSAM](https://github.com/IDEA-Research/Grounded-Segment-Anything/tree/main/EfficientSAM).
* **2023/06/27**: MobileSAM has been featured by [AK](https://twitter.com/_akhaliq?lang=en), see [AK's MobileSAM tweet](https://twitter.com/_akhaliq/status/1673585099097636864). Retweets are welcome.
![MobileSAM](assets/model_diagram.jpg?raw=true)
:star: **How is MobileSAM trained?** MobileSAM is trained on a single GPU with 100k images (1% of the original dataset) in less than a day. The training code will be available soon.
:star: **How to adapt from SAM to MobileSAM?** Since MobileSAM keeps exactly the same pipeline as the original SAM, it inherits the pre-processing, post-processing, and all other interfaces of the original SAM. Everything is identical except for a smaller image encoder, so those who already use the original SAM in their projects can **adapt to MobileSAM with almost zero effort**.
:star: **MobileSAM performs on par with the original SAM (at least visually)** and keeps exactly the same pipeline as the original SAM except for a change on the image encoder. Specifically, we replace the original heavyweight ViT-H encoder (632M) with a much smaller Tiny-ViT (5M). On a single GPU, MobileSAM runs around 12ms per image: 8ms on the image encoder and 4ms on the mask decoder.
* The comparison of the ViT-based image encoders is summarized as follows:
Image Encoder | Original SAM | MobileSAM
:-----------------------------------------:|:---------|:-----:
Parameters | 611M | 5M
Speed | 452ms | 8ms
* Original SAM and MobileSAM have exactly the same prompt-guided mask decoder:
Mask Decoder | Original SAM | MobileSAM
:-----------------------------------------:|:---------|:-----:
Parameters | 3.876M | 3.876M
Speed | 4ms | 4ms
* The comparison of the whole pipeline is summarized as follows:
Whole Pipeline (Enc+Dec) | Original SAM | MobileSAM
:-----------------------------------------:|:---------|:-----:
Parameters | 615M | 9.66M
Speed | 456ms | 12ms
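The parameter counts in the tables above can be checked with a generic PyTorch helper. This is a minimal sketch; `count_params` is a hypothetical name, not part of the repo:

```python
import torch.nn as nn

def count_params(model: nn.Module) -> int:
    # total number of learnable scalars in the model
    return sum(p.numel() for p in model.parameters())

# toy check on a small module: Linear(10 -> 5) has 10*5 weights + 5 biases = 55
print(count_params(nn.Linear(10, 5)))  # 55
```

Applied to a loaded MobileSAM, `count_params(mobile_sam.image_encoder)` would report the encoder size on its own.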
:star: **Original SAM and MobileSAM with a point as the prompt.**
<p float="left">
<img src="assets/mask_point.jpg?raw=true" width="99.1%" />
</p>
:star: **Original SAM and MobileSAM with a box as the prompt.**
<p float="left">
<img src="assets/mask_box.jpg?raw=true" width="99.1%" />
</p>
:muscle: **Is MobileSAM faster and smaller than FastSAM? Yes!**
MobileSAM is around 7 times smaller and around 5 times faster than the concurrent FastSAM.
The comparison of the whole pipeline is summarized as follows:
Whole Pipeline (Enc+Dec) | FastSAM | MobileSAM
:-----------------------------------------:|:---------|:-----:
Parameters | 68M | 9.66M
Speed | 64ms | 12ms
:muscle: **Does MobileSAM align better with the original SAM than FastSAM? Yes!**
FastSAM is suggested to work with multiple points, so we compare the mIoU with two prompt points (at different pixel distances) and show the results below. A higher mIoU indicates better alignment with the original SAM.
Point Distance | FastSAM mIoU | MobileSAM mIoU
:-----------------------------------------:|:---------|:-----:
100 | 0.27 | 0.73
200 | 0.33 | 0.71
300 | 0.37 | 0.74
400 | 0.41 | 0.73
500 | 0.41 | 0.73
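The mIoU above is the intersection-over-union between each model's predicted mask and the original SAM's mask, averaged over prompts. A minimal sketch of the underlying metric (`mask_iou` is a hypothetical helper, not from the repo):

```python
import numpy as np

def mask_iou(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Intersection-over-union of two boolean masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(inter) / float(union) if union > 0 else 1.0

# toy example: two 4x4 square masks on a 10x10 grid with a 2x2 overlap
a = np.zeros((10, 10), dtype=bool); a[2:6, 2:6] = True  # 16 px
b = np.zeros((10, 10), dtype=bool); b[4:8, 4:8] = True  # 16 px, 4 px shared
print(mask_iou(a, b))  # 4 / 28 ≈ 0.143
```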
## Installation
The code requires `python>=3.8`, as well as `pytorch>=1.7` and `torchvision>=0.8`. Please follow the instructions [here](https://pytorch.org/get-started/locally/) to install both PyTorch and TorchVision dependencies. Installing both PyTorch and TorchVision with CUDA support is strongly recommended.
Install Mobile Segment Anything:
```
pip install git+https://github.com/ChaoningZhang/MobileSAM.git
```
or clone the repository locally and install with
```
git clone git@github.com:ChaoningZhang/MobileSAM.git
cd MobileSAM; pip install -e .
```
## Demo
Once MobileSAM is installed, you can run the demo on your local PC or check out our [HuggingFace Demo](https://huggingface.co/spaces/dhkim2810/MobileSAM).
The demo requires the latest version of [gradio](https://gradio.app).
```
cd app
python app.py
```
## <a name="GettingStarted"></a>Getting Started
MobileSAM can be loaded in the following way:
```
import torch
from mobile_sam import sam_model_registry, SamAutomaticMaskGenerator, SamPredictor

model_type = "vit_t"
sam_checkpoint = "./weights/mobile_sam.pt"

device = "cuda" if torch.cuda.is_available() else "cpu"

mobile_sam = sam_model_registry[model_type](checkpoint=sam_checkpoint)
mobile_sam.to(device=device)
mobile_sam.eval()

predictor = SamPredictor(mobile_sam)
predictor.set_image(<your_image>)
masks, _, _ = predictor.predict(<input_prompts>)
```
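For the SegEvery use case, the same model can also be passed to `SamAutomaticMaskGenerator` to generate masks for an entire image. This is a sketch assuming the interface matches the original SAM; `<your_image>` is a placeholder as above:

```
from mobile_sam import sam_model_registry, SamAutomaticMaskGenerator

mobile_sam = sam_model_registry["vit_t"](checkpoint="./weights/mobile_sam.pt")
mobile_sam.eval()

mask_generator = SamAutomaticMaskGenerator(mobile_sam)
masks = mask_generator.generate(<your_image>)
```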