<p float="center">
<img src="assets/logo2.png?raw=true" width="99.1%" />
</p>
# Faster Segment Anything (MobileSAM) and Everything (MobileSAMv2)
:pushpin: MobileSAMv2, available at [ResearchGate](https://www.researchgate.net/publication/376579294_MobileSAMv2_Faster_Segment_Anything_to_Everything) and [arXiv](https://arxiv.org/abs/2312.09579), replaces the grid-search prompt sampling in SAM with object-aware prompt sampling for faster **segment everything (SegEvery)**.
:pushpin: MobileSAM, available at [ResearchGate](https://www.researchgate.net/publication/371851844_Faster_Segment_Anything_Towards_Lightweight_SAM_for_Mobile_Applications) and [arXiv](https://arxiv.org/pdf/2306.14289.pdf), replaces the heavyweight image encoder in SAM with a lightweight image encoder for faster **segment anything (SegAny)**.
**Support for ONNX model export**. Feel free to test it on your devices and share your results with us.
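As a sketch of how the export might be invoked, assuming MobileSAM mirrors the export script of the original SAM repository (the script path and flags below are assumptions; check the repo's `scripts` directory):

```
python scripts/export_onnx_model.py --checkpoint ./weights/mobile_sam.pt --model-type vit_t --output ./mobile_sam.onnx
```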
**A demo of MobileSAM** running on **CPU** is available at the [hugging face demo](https://huggingface.co/spaces/dhkim2810/MobileSAM). On a Mac i5 CPU, inference takes around 3s. On the hugging face demo, the interface and less performant CPUs make it slower, but it still works well. Stay tuned for a new version with more features! You can also run a demo of MobileSAM on [your local PC](https://github.com/ChaoningZhang/MobileSAM/tree/master/app).
:grapes: Media coverage and projects that adapt from the original SAM to MobileSAM (thank you all!)
* **2023/07/03**: [joliGEN](https://github.com/jolibrain/joliGEN) supports MobileSAM for faster and lightweight mask refinement for image inpainting with Diffusion and GAN.
* **2023/07/03**: [MobileSAM-in-the-Browser](https://github.com/akbartus/MobileSAM-in-the-Browser) shows a demo of running MobileSAM on the browser of your local PC or Mobile phone.
* **2023/07/02**: [Inpaint-Anything](https://github.com/qiaoyu1002/Inpaint-Anything) supports MobileSAM for faster and lightweight Inpaint Anything
* **2023/07/02**: [Personalize-SAM](https://github.com/qiaoyu1002/Personalize-SAM) supports MobileSAM for faster and lightweight Personalize Segment Anything with 1 Shot
* **2023/07/01**: [MobileSAM-in-the-Browser](https://github.com/akbartus/MobileSAM-in-the-Browser) makes an example implementation of MobileSAM in the browser.
* **2023/06/30**: [SegmentAnythingin3D](https://github.com/Jumpat/SegmentAnythingin3D) supports MobileSAM to segment anything in 3D efficiently.
* **2023/06/30**: MobileSAM has been featured by [AK](https://twitter.com/_akhaliq?lang=en) for the second time, see [AK's MobileSAM tweet](https://twitter.com/_akhaliq/status/1674410573075718145). Retweets are welcome.
* **2023/06/29**: [AnyLabeling](https://github.com/vietanhdev/anylabeling) supports MobileSAM for auto-labeling.
* **2023/06/29**: [SonarSAM](https://github.com/wangsssky/SonarSAM) supports MobileSAM for full fine-tuning of the image encoder.
* **2023/06/29**: [Stable Diffusion WebUI](https://github.com/continue-revolution/sd-webui-segment-anything) supports MobileSAM.
* **2023/06/28**: [Grounding-SAM](https://github.com/IDEA-Research/Grounded-Segment-Anything) supports MobileSAM with [Grounded-MobileSAM](https://github.com/IDEA-Research/Grounded-Segment-Anything/tree/main/EfficientSAM).
* **2023/06/27**: MobileSAM has been featured by [AK](https://twitter.com/_akhaliq?lang=en), see [AK's MobileSAM tweet](https://twitter.com/_akhaliq/status/1673585099097636864). Retweets are welcome.
![MobileSAM](assets/model_diagram.jpg?raw=true)
:star: **How is MobileSAM trained?** MobileSAM is trained on a single GPU with 100k images (1% of the original dataset) in less than a day. The training code will be available soon.
:star: **How to adapt from SAM to MobileSAM?** Since MobileSAM keeps exactly the same pipeline as the original SAM, it inherits the pre-processing, post-processing, and all other interfaces of the original SAM. Everything is identical except for a smaller image encoder, so those who already use the original SAM in their projects can **adapt to MobileSAM with almost zero effort**.
:star: **MobileSAM performs on par with the original SAM (at least visually)** and keeps exactly the same pipeline as the original SAM except for a change on the image encoder. Specifically, we replace the original heavyweight ViT-H encoder (632M) with a much smaller Tiny-ViT (5M). On a single GPU, MobileSAM runs around 12ms per image: 8ms on the image encoder and 4ms on the mask decoder.
* The comparison of the ViT-based image encoders is summarized as follows:
Image Encoder | Original SAM | MobileSAM
:-----------------------------------------:|:---------|:-----:
Parameters | 611M | 5M
Speed | 452ms | 8ms
* Original SAM and MobileSAM have exactly the same prompt-guided mask decoder:
Mask Decoder | Original SAM | MobileSAM
:-----------------------------------------:|:---------|:-----:
Parameters | 3.876M | 3.876M
Speed | 4ms | 4ms
* The comparison of the whole pipeline is summarized as follows:
Whole Pipeline (Enc+Dec) | Original SAM | MobileSAM
:-----------------------------------------:|:---------|:-----:
Parameters | 615M | 9.66M
Speed | 456ms | 12ms
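The parameter counts in the tables above can be checked with a generic PyTorch helper. This is a minimal sketch; `count_params` is a hypothetical name, not part of the repo:

```python
import torch.nn as nn

def count_params(model: nn.Module) -> int:
    # total number of learnable scalars in the model
    return sum(p.numel() for p in model.parameters())

# toy check on a small module: Linear(10 -> 5) has 10*5 weights + 5 biases = 55
print(count_params(nn.Linear(10, 5)))  # 55
```

Applied to a loaded MobileSAM, `count_params(mobile_sam.image_encoder)` would report the encoder size on its own.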
:star: **Original SAM and MobileSAM with a point as the prompt.**
<p float="left">
<img src="assets/mask_point.jpg?raw=true" width="99.1%" />
</p>
:star: **Original SAM and MobileSAM with a box as the prompt.**
<p float="left">
<img src="assets/mask_box.jpg?raw=true" width="99.1%" />
</p>
:muscle: **Is MobileSAM faster and smaller than FastSAM? Yes!**
MobileSAM is around 7 times smaller and around 5 times faster than the concurrent FastSAM.
The comparison of the whole pipeline is summarized as follows:
Whole Pipeline (Enc+Dec) | FastSAM | MobileSAM
:-----------------------------------------:|:---------|:-----:
Parameters | 68M | 9.66M
Speed | 64ms | 12ms
:muscle: **Does MobileSAM align better with the original SAM than FastSAM? Yes!**
FastSAM is suggested to work with multiple points, so we compare the mIoU with two prompt points (at different pixel distances) and show the results below. A higher mIoU indicates better alignment with the original SAM.
Point Distance | FastSAM mIoU | MobileSAM mIoU
:-----------------------------------------:|:---------|:-----:
100 | 0.27 | 0.73
200 | 0.33 | 0.71
300 | 0.37 | 0.74
400 | 0.41 | 0.73
500 | 0.41 | 0.73
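The mIoU above is the intersection-over-union between each model's predicted mask and the original SAM's mask, averaged over prompts. A minimal sketch of the underlying metric (`mask_iou` is a hypothetical helper, not from the repo):

```python
import numpy as np

def mask_iou(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Intersection-over-union of two boolean masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(inter) / float(union) if union > 0 else 1.0

# toy example: two 4x4 square masks on a 10x10 grid with a 2x2 overlap
a = np.zeros((10, 10), dtype=bool); a[2:6, 2:6] = True  # 16 px
b = np.zeros((10, 10), dtype=bool); b[4:8, 4:8] = True  # 16 px, 4 px shared
print(mask_iou(a, b))  # 4 / 28 ≈ 0.143
```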
## Installation
The code requires `python>=3.8`, as well as `pytorch>=1.7` and `torchvision>=0.8`. Please follow the instructions [here](https://pytorch.org/get-started/locally/) to install both PyTorch and TorchVision dependencies. Installing both PyTorch and TorchVision with CUDA support is strongly recommended.
Install Mobile Segment Anything:
```
pip install git+https://github.com/ChaoningZhang/MobileSAM.git
```
or clone the repository locally and install with
```
git clone git@github.com:ChaoningZhang/MobileSAM.git
cd MobileSAM; pip install -e .
```
## Demo
Once MobileSAM is installed, you can run the demo on your local PC or check out our [HuggingFace Demo](https://huggingface.co/spaces/dhkim2810/MobileSAM).
The demo requires the latest version of [gradio](https://gradio.app).
```
cd app
python app.py
```
## <a name="GettingStarted"></a>Getting Started
MobileSAM can be loaded in the following way:
```
import torch
from mobile_sam import sam_model_registry, SamAutomaticMaskGenerator, SamPredictor

model_type = "vit_t"
sam_checkpoint = "./weights/mobile_sam.pt"

device = "cuda" if torch.cuda.is_available() else "cpu"

mobile_sam = sam_model_registry[model_type](checkpoint=sam_checkpoint)
mobile_sam.to(device=device)
mobile_sam.eval()

predictor = SamPredictor(mobile_sam)
predictor.set_image(<your_image>)
masks, _, _ = predictor.predict(<input_prompts>)
```
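For the SegEvery use case, the same model can also be passed to `SamAutomaticMaskGenerator` to generate masks for an entire image. This is a sketch assuming the interface matches the original SAM; `<your_image>` is a placeholder as above:

```
from mobile_sam import sam_model_registry, SamAutomaticMaskGenerator

mobile_sam = sam_model_registry["vit_t"](checkpoint="./weights/mobile_sam.pt")
mobile_sam.eval()

mask_generator = SamAutomaticMaskGenerator(mobile_sam)
masks = mask_generator.generate(<your_image>)
```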