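Object tracking and segmentation: this project focuses on segmenting and tracking any object in videos, using both automatic and interactive methods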

177 files in total

py: 98

jpg: 26

md: 19

Object tracking

Image processing

Deep learning

197 views · uploaded 2023-12-14 15:38:15 · 95.41 MB ZIP

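Object tracking and segmentation: this project focuses on segmenting and tracking any object in videos, using both automatic and interactive methods (177 files)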

setup.cfg 371B

.DS_Store 6KB

.flake8 211B

top.gif 4.29MB

demo_3x2.gif 3.61MB

.gitattributes 66B

.gitignore 337B

.gitignore 160B

predictor_example.ipynb 7.99MB

automatic_mask_generator_example.ipynb 4.18MB

onnx_model_example.ipynb 22KB

demo_instseg.ipynb 13KB

demo.ipynb 11KB

start_tracking.jpg 2.05MB

input_video.jpg 1.41MB

click_segment.jpg 978KB

Drawing_board.jpg 966KB

click_segment_everything.jpg 735KB

enter_text.jpg 726KB

add_positive_points.jpg 714KB

add_positive_points_2.jpg 713KB

segment_everything_blackswan.jpg 695KB

detect_result.jpg 651KB

add_positive_base_on_everything.jpg 647KB

gradio.jpg 618KB

interactive_webui.jpg 602KB

new_object.jpg 506KB

second_object.jpg 500KB

truck.jpg 265KB

add_positive_base_on_everything_cxk.jpg 208KB

groceries.jpg 164KB

masks2.jpg 130KB

dog.jpg 98KB

switch2textT.jpg 36KB

switch2ImgSeq.jpg 35KB

click_input_video.jpg 32KB

upload_Image_seq.jpg 21KB

select_fps.jpg 20KB

use_exa4ImgSeq.jpg 7KB

LICENSE 11KB

LICENSE 1KB

licenses.md 35KB

README.md 17KB

MODEL_ZOO.md 16KB

README.md 10KB

README.md 8KB

tutorial for WebUI-1.0-Version.md 6KB

CODE_OF_CONDUCT.md 3KB

tutorial for WebUI-1.5-Version.md 1KB

tutorial for Image-Sequence input.md 1KB

CONTRIBUTING.md 1KB

README.md 168B

README.md 60B

README.md 50B

README.md 48B

README.md 27B

README.md 20B

cars.mp4 6.54MB

cell.mp4 4.51MB

blackswan.mp4 655KB

masks1.png 3.53MB

notebook2.png 1.17MB

notebook1.png 854KB

overview_deaot.png 595KB

model_diagram.png 568KB

overview.png 483KB

app.py 40KB

attention.py 32KB

trainer.py 29KB

swin_transformer.py 27KB

evaluator.py 25KB

aot_engine.py 25KB

transformer.py 24KB

train_datasets.py 24KB

video_transforms.py 23KB

image_transforms.py 19KB

resnet.py 16KB

automatic_mask_generator.py 15KB

image_encoder.py 14KB

eval_datasets.py 14KB

seg_track_anything.py 13KB

demo.py 13KB

amg.py 12KB

predictor.py 11KB

SegTracker.py 10KB

prompt_encoder.py 8KB

mobilenetv2.py 8KB

transformer.py 8KB

mobilenetv3.py 8KB

aot_tracker.py 8KB

sam.py 7KB

amg.py 7KB

loss.py 7KB

# Segment and Track Anything (SAM-Track)

**Online Demo:** [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1R10N70AJaslzADFqb-a5OihYkllWEVxB?usp=sharing)
**Technical Report:** [![](https://img.shields.io/badge/Report-arXiv:2305.06558-green)](https://arxiv.org/abs/2305.06558)
**Tutorial:** [tutorial-v1.5 (Text)](./tutorial/tutorial%20for%20WebUI-1.5-Version.md), [tutorial-v1.0 (Click & Brush)](./tutorial/tutorial%20for%20WebUI-1.0-Version.md)

<p align="center">
<img src="./assets/top.gif" width="880">
</p>

**Segment and Track Anything** is an open-source project that focuses on the segmentation and tracking of any objects in videos, utilizing both automatic and interactive methods. The primary algorithms utilized include the [**SAM** (Segment Anything Models)](https://github.com/facebookresearch/segment-anything) for automatic/interactive key-frame segmentation and the [**DeAOT** (Decoupling features in Associating Objects with Transformers)](https://github.com/yoxu515/aot-benchmark) (NeurIPS2022) for efficient multi-object tracking and propagation. The SAM-Track pipeline enables dynamic and automatic detection and segmentation of new objects by SAM, while DeAOT is responsible for tracking all identified objects.

## :loudspeaker:New Features
- [2023/5/12] We have authored a technical report for SAM-Track.
- [2023/5/7] We have added `demo_instseg.ipynb`, which uses Grounding-DINO to detect new objects in the key frames of a video. It can be applied in the fields of smart cities and autonomous driving.
- [2023/4/29] We have added advanced arguments for AOT-L: `long_term_memory_gap` and `max_len_long_term`.
    - `long_term_memory_gap` controls the frequency at which the AOT model adds new reference frames to its long-term memory. During mask propagation, AOT matches the current frame with the reference frames stored in the long-term memory.
    - Setting the gap to a suitable value helps to obtain better performance. To avoid memory explosion in long videos, we set a `max_len_long_term` value for the long-term memory storage, i.e. when the number of memory frames reaches the `max_len_long_term` value, the oldest memory frame will be discarded and a new frame will be added.
- [2023/4/26] **Interactive WebUI 1.5-Version**: We have added new features based on Interactive WebUI-1.0 Version.
    - We have added a new form of interactivity, text prompts, to SAMTrack.
    - From now on, multiple objects that need to be tracked can be interactively added.
    - Check out [tutorial](./tutorial/tutorial%20for%20WebUI-1.5-Version.md) for Interactive WebUI 1.5-Version. More demos will be released in the next few days.
- [2023/4/26] **Image-Sequence input**: The WebUI now has a new feature that allows for input of image sequences, which can be used to test video segmentation datasets. Get started with the [tutorial](./tutorial/tutorial%20for%20Image-Sequence%20input.md) for Image-Sequence input.
- [2023/4/25] **Online Demo:** You can easily use SAMTrack in [Colab](https://colab.research.google.com/drive/1R10N70AJaslzADFqb-a5OihYkllWEVxB?usp=sharing) for visual tracking tasks.
- [2023/4/23] **Interactive WebUI:** We have introduced a new WebUI that allows interactive user segmentation through strokes and clicks. Feel free to explore and have fun with the [tutorial](./tutorial/tutorial%20for%20WebUI-1.0-Version.md)!
- [2023/4/24] **Tutorial V1.0:** Check out our new video tutorials!
    - YouTube-Link: [Tutorial for Interactively modify single-object mask for first frame of video](https://www.youtube.com/watch?v=DF0iFSsX8KY), [Tutorial for Interactively add object by click](https://www.youtube.com/watch?v=UJvKPng9_DA), [Tutorial for Interactively add object by stroke](https://www.youtube.com/watch?v=m1oFavjIaCM).
    - Bilibili Video Link: [Tutorial for Interactively modify single-object mask for first frame of video](https://www.bilibili.com/video/BV1tM4115791/?spm_id_from=333.999.0.0), [Tutorial for Interactively add object by click](https://www.bilibili.com/video/BV1Qs4y1A7d1/), [Tutorial for Interactively add object by stroke](https://www.bilibili.com/video/BV1Lm4y117J4/?spm_id_from=333.999.0.0).
    - 1.0-Version is a developer version; please feel free to contact us if you encounter any bugs :bug:.
- [2023/4/17] **SAMTrack**: Automatically segment and track anything in video!

## :fire:Demos
<div align=center>

[![Segment-and-Track-Anything Versatile Demo](https://res.cloudinary.com/marcomontalbano/image/upload/v1681713095/video_to_markdown/images/youtube--UPhtpf1k6HA-c05b58ac6eb4c4700831b2b3070cd403.jpg)](https://youtu.be/UPhtpf1k6HA "Segment-and-Track-Anything Versatile Demo")
</div>

This video showcases the segmentation and tracking capabilities of SAM-Track in various scenarios, such as street views, AR, cells, animations, aerial shots, and more.

## :calendar:TODO
- [x] Colab notebook: Completed on April 25th, 2023.
- [x] 1.0-Version Interactive WebUI: Completed on April 23rd, 2023.
    - We will create a feature that enables users to interactively modify the mask for the initial video frame according to their needs. The interactive segmentation capabilities of Segment-and-Track-Anything are demonstrated in [Demo8](https://www.youtube.com/watch?v=Xyd54AngvV8&feature=youtu.be) and [Demo9](https://www.youtube.com/watch?v=eZrdna8JkoQ).
    - Bilibili Video Link: [Demo8](https://www.bilibili.com/video/BV1JL411v7uE/), [Demo9](https://www.bilibili.com/video/BV1Qs4y1w763/).
- [x] 1.5-Version Interactive WebUI: Completed on April 26th, 2023.
    - We will develop a function that allows interactive modification of multi-object masks for the first frame of a video. This function will be based on Version 1.0. YouTube: [Demo4](https://www.youtube.com/watch?v=UFtwFaOfx2I&feature=youtu.be), [Demo5](https://www.youtube.com/watch?v=cK5MPFdJdSY&feature=youtu.be); Bilibili: [Demo4](https://www.bilibili.com/video/BV17X4y127mJ/), [Demo5](https://www.bilibili.com/video/BV1Pz4y1a7mC/)
    - Furthermore, we plan to include text prompts as an additional form of interaction. YouTube: [Demo1](https://www.youtube.com/watch?v=5oieHqFIJPc&feature=youtu.be), [Demo2](https://www.youtube.com/watch?v=nXfq17X6ohk); Bilibili: [Demo1](https://www.bilibili.com/video/BV1hg4y157yd/?vd_source=fe3b5c0215d05cc44c8eb3d94abae3ca), [Demo2](https://www.bilibili.com/video/BV1RV4y1k7i5/)
- [ ] 2.x-Version Interactive WebUI
    - In version 2.x, the segmentation model will offer two options: SAM and SEEM.
    - We will develop a new function where the fixed-category object detection result can be displayed as a prompt.
    - We will enable SAM-Track to add and modify objects during tracking. YouTube: [Demo6](https://www.youtube.com/watch?v=l7hXM1a3nEA&feature=youtu.be), [Demo7](https://www.youtube.com/watch?v=hPjw28Ul4cw&feature=youtu.be); Bilibili: [Demo6](https://www.bilibili.com/video/BV1nk4y1j7Am), [Demo7](https://www.bilibili.com/video/BV1mk4y1E78s/?vd_source=fe3b5c0215d05cc44c8eb3d94abae3ca)

**Demo1** showcases SAM-Track's ability to take the class of objects as a prompt. The user gives the category text 'panda' to enable instance-level segmentation and tracking of all objects belonging to this category.

<div align=center>

[![demo1](https://res.cloudinary.com/marcomontalbano/image/upload/v1683347297/video_to_markdown/images/youtube--5oieHqFIJPc-c05b58ac6eb4c4700831b2b3070cd403.jpg)](https://www.youtube.com/watch?v=5oieHqFIJPc&feature=youtu.be "demo1")
</div>

**Demo2** showcases SAM-Track's ability to take a text description as a prompt. SAM-Track can segment and track the target objects given the input 'panda on the far left'.

<div align=center>

[![demo1](https://res.cloudinary.com/marcomontalbano/image/upload/v1683347643/video_to_markdown/images/youtube--nXfq17X6ohk-c05b58ac6eb4c4700831b2b3070cd403.jpg)](https://www.youtube.com/watch?v=nXfq17X6ohk "demo1")
</div>

**Demo3** showcases SAM-Track's

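The `long_term_memory_gap` and `max_len_long_term` arguments described under New Features can be pictured with a small simulation. The sketch below is an illustration only, not the AOT-L implementation: the function name `simulate_long_term_memory` is hypothetical, frames are represented by their indices rather than feature tensors, and it assumes strictly oldest-first eviction as the text describes (whether the real model treats the first reference frame specially is not stated here).

```python
from collections import deque


def simulate_long_term_memory(num_frames, long_term_memory_gap, max_len_long_term):
    """Return the frame indices held in long-term memory after each frame.

    Hypothetical, simplified model: a frame is stored every
    `long_term_memory_gap` frames, and once the store already holds
    `max_len_long_term` frames the oldest one is discarded.
    """
    # A deque with maxlen drops its oldest element when a new one is appended.
    memory = deque(maxlen=max_len_long_term)
    history = []
    for frame_idx in range(num_frames):
        if frame_idx % long_term_memory_gap == 0:
            memory.append(frame_idx)
        history.append(list(memory))
    return history


history = simulate_long_term_memory(
    num_frames=10, long_term_memory_gap=2, max_len_long_term=3
)
print(history[-1])  # frames kept in memory after the last frame
```

With these toy settings the memory never holds more than three frames, however long the video runs. A smaller gap stores reference frames more often, while a larger `max_len_long_term` keeps more of them at the cost of memory.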