[ACMMM2022-Demo] Restoring Analog Videos Using Swin-UNet_Python_Download.zip

15 files in total

py: 11

jpg: 1

license: 1

Uploaded 2023-04-28 13:53:50 · 874 KB · ZIP

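"ACMMM2022-Demo: Restoring Analog Videos Using Swin-UNet" is a Python project that demonstrates how to restore and enhance analog videos with a Swin-UNet model. It draws on computer vision, image processing, and deep learning.

The goal of the project is to bring degraded analog video back to a cleaner state. Analog video often suffers from blur and color distortion caused by aging of the medium, signal noise, or transmission loss. Swin-UNet is a U-Net-style encoder-decoder built from Swin Transformer blocks, which compute self-attention within shifted local windows rather than over the whole image. This design is well suited to image segmentation and restoration tasks.

Topics covered:

1. Computer vision: the project analyzes and interprets image content.
2. Deep learning: Swin-UNet is a deep learning model, applied here to video restoration.
3. Image processing: restoration overlaps with classical tasks such as denoising and super-resolution.
4. Transformer: Swin-UNet uses Transformer blocks to model dependencies between image regions.
5. CNN: the source tree also contains CNN-related modules, such as vgg_feature_extractor.py and recurrent_cnn_pl_module.py.
6. Python: the whole project is written in Python, a language widely used in data science and machine learning.
7. ACM MM: the ACM International Conference on Multimedia is a leading venue in the multimedia field; this project was presented there as a demo in 2022.

File list of the archive: "analog-video-restoration-main" is the root directory of the project source. Unzip the archive and run the scripts inside it to reproduce the restoration pipeline. Judging by the file names and the README, the key parts are:

1. Dataset: the archive contains no video data. The README describes the directory layout expected for training, and the dataset code is in src/video_recurrent_dataset.py and src/video_real_world_dataset.py.
2. Model definition: the Swin-UNet architecture is in src/video_swin_unet.py; the loss functions are kept separately in src/losses.py.
3. Training script: src/train.py loads the data, builds the model, sets the training parameters, and runs training.
4. Test script: src/real_world_test.py runs inference on frames extracted from a real video.
5. Visualization: no dedicated visualization tool is listed; readme.png shows a restoration example.
6. Configuration: no separate configuration file is listed; parameters are passed as command-line arguments.

The project shows developers and researchers how deep learning can be applied to a practical video restoration problem and how a Swin-UNet model is built and used. It is a useful learning resource for anyone working on image processing and computer vision.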

Archive contents (15 files):

analog-video-restoration-main
    readme.png 708KB
    src
        utils.py 12KB
        losses.py 3KB
        video_recurrent_dataset.py 5KB
        real_world_test.py 5KB
        video_data_pl_module.py 2KB
        video_swin_unet.py 26KB
        vgg_feature_extractor.py 4KB
        video_real_world_dataset.py 3KB
        utils_models.py 6KB
        recurrent_cnn_pl_module.py 4KB
        train.py 5KB
    LICENSE 1KB
    smartvideorestoration_logo.jpg 155KB
    README.md 6KB

# Restoration of Analog Videos Using Swin-UNet

This application is part of the **ReInHerit Toolkit**.

![ReInHerit Smart Video Restoration logo](smartvideorestoration_logo.jpg "ReInHerit Smart Video Restoration logo")

## Table of Contents

* [About the Project](#about-the-project)
* [Getting Started](#getting-started)
  * [Prerequisites](#prerequisites)
  * [Installation](#installation)
* [Usage](#usage)
  * [Training](#training)
  * [Test](#test)
* [Authors](#authors)
* [Citation](#citation)

## About The Project

![restoration example](readme.png)

This is the **official repository** of "[**Restoration of Analog Videos Using Swin-UNet**](https://dl.acm.org/doi/10.1145/3503161.3547730)" **[Demo ACM MM 2022]**.

In this work, we present an approach to restore analog videos of historical archives. These videos often contain severe visual degradation due to the deterioration of their tape supports that require costly and slow manual interventions to recover the original content. The proposed method uses a multi-frame approach and is able to deal also with severe tape mistracking, which results in completely scrambled frames. Tests on real-world videos from a major historical video archive show the effectiveness of our approach.

## Getting Started

To get a local copy up and running follow these simple steps.

### Prerequisites

We strongly recommend the use of the [**Anaconda**](https://www.anaconda.com/) package manager in order to avoid dependency/reproducibility problems.
A conda installation guide for Linux systems can be found [here](https://docs.conda.io/projects/conda/en/latest/user-guide/install/linux.html).

### Installation

1. Clone the repo

   ```sh
   git clone https://github.com/miccunifi/analog-video-restoration.git
   ```

2. Install Python dependencies

   ```sh
   conda create -n analog_video_restoration -y python=3.9
   conda activate analog_video_restoration
   pip install -r requirements.txt
   ```

## Usage

### Training

1. Make your training dataset have the following structure:

   ```
   <dataset-name>
   └─── train
        └─── input
             └─── 000
                  | 00000.jpg
                  | 00001.jpg
                  | ...
             └─── 001
                  | 00000.jpg
                  | 00001.jpg
                  | ...
             ...
        └─── gt
             └─── 000
                  | 00000.jpg
                  | 00001.jpg
                  | 00002.jpg
                  | ...
             └─── 001
                  | 00000.jpg
                  | 00001.jpg
                  | ...
             ...
   └─── val
        └─── input
             └─── 000
                  | 00000.jpg
                  | 00001.jpg
                  | ...
             └─── 001
                  | 00000.jpg
                  | 00001.jpg
                  | ...
             ...
        └─── gt
             └─── 000
                  | 00000.jpg
                  | 00001.jpg
                  | 00002.jpg
                  | ...
             └─── 001
                  | 00000.jpg
                  | 00001.jpg
                  | ...
             ...
   ```

2. Get your [Comet](https://www.comet.com/site/) api key for online logging of the losses and metrics

3. Run the training code with

   ```
   python src/train.py --experiment-name video_swin_unet --data-base-path <path-to-dataset> --devices 0 --api-key <your-Comet-api-key> --batch-size 2 --num-epochs 100 --num-workers 20 --pixel-loss-weight 200 --perceptual-loss-weight 1
   ```

### Test

1. If needed, download the pretrained model from [Google Drive](https://drive.google.com/drive/folders/1omIk6qHKqbvO7T09Ixiez7zq08S7OaxE?usp=share_link) and copy it inside the folder ```pretrained_models/video_swin_unet/```

2. Extract the frames of the video in .jpg images and save them in a folder

   ```
   mkdir <folder-name>
   ffmpeg -i <video-file-name> -qscale:v 2 <folder-name>/%00d.jpg
   ```

3. Run inference on the folder with

   ```
   python src/real_world_test.py --experiment-name video_swin_unet --data-base-path <path-to-folder> --results-path results --patch-size 512 --fps 60
   ```

## Authors

* [**Lorenzo Agnolucci**](https://scholar.google.com/citations?user=hsCt4ZAAAAAJ&hl=en)
* [**Leonardo Galteri**](https://scholar.google.com/citations?user=_n2R2bUAAAAJ&hl=en)
* [**Marco Bertini**](https://scholar.google.it/citations?user=SBm9ZpYAAAAJ&hl=en)
* [**Alberto Del Bimbo**](https://scholar.google.it/citations?user=bf2ZrFcAAAAJ&hl=en)

## Citation

If you find this work useful for your research, please consider citing:

<pre>
@inproceedings{10.1145/3503161.3547730,
  author = {Agnolucci, Lorenzo and Galteri, Leonardo and Bertini, Marco and Del Bimbo, Alberto},
  title = {Restoration of Analog Videos Using Swin-UNet},
  year = {2022},
  isbn = {9781450392037},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3503161.3547730},
  doi = {10.1145/3503161.3547730},
  abstract = {In this paper we present a system to restore analog videos of historical archives. These videos often contain severe visual degradation due to the deterioration of their tape supports that require costly and slow manual interventions to recover the original content. The proposed system uses a multi-frame approach and is able to deal also with severe tape mistracking, which results in completely scrambled frames. Tests on real-world videos from a major historical video archive show the effectiveness of our demo system.},
  booktitle = {Proceedings of the 30th ACM International Conference on Multimedia},
  pages = {6985–6987},
  numpages = {3},
  keywords = {old videos restoration, analog videos, unet, swin transformer},
  location = {Lisboa, Portugal},
  series = {MM '22}
}
</pre>
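
The dataset layout required in the Training section can be checked before launching `src/train.py`. The following is a minimal sketch and not part of the repository: the function name `check_dataset_layout` is hypothetical, and it assumes that every input frame must have a ground-truth frame with the same file name, which the layout implies but the README does not state.

```python
from pathlib import Path


def check_dataset_layout(root):
    """Return a list of problems found under `root` (empty if none).

    Expects root/{train,val}/{input,gt}/<clip>/<frame>.jpg,
    as described in the Training section.
    """
    root = Path(root)
    problems = []
    for split in ("train", "val"):
        input_dir = root / split / "input"
        gt_dir = root / split / "gt"

        # Both subfolders must exist before clips can be compared.
        missing = [d for d in (input_dir, gt_dir) if not d.is_dir()]
        if missing:
            problems.extend(f"missing directory: {d}" for d in missing)
            continue

        input_clips = {p.name for p in input_dir.iterdir() if p.is_dir()}
        gt_clips = {p.name for p in gt_dir.iterdir() if p.is_dir()}

        # Clips that exist on only one side cannot form training pairs.
        for name in sorted(input_clips ^ gt_clips):
            problems.append(f"{split}: clip {name} is not present in both input and gt")

        for name in sorted(input_clips & gt_clips):
            input_frames = {p.name for p in (input_dir / name).glob("*.jpg")}
            gt_frames = {p.name for p in (gt_dir / name).glob("*.jpg")}
            # Assumption: each input frame needs a same-named gt frame.
            unmatched = sorted(input_frames - gt_frames)
            if unmatched:
                problems.append(
                    f"{split}/{name}: {len(unmatched)} input frame(s) without gt"
                )
    return problems
```

Calling `check_dataset_layout("<path-to-dataset>")` returns an empty list when no problem is found. Extra ground-truth frames are deliberately not reported, since the README's own example lists one more frame under `gt` than under `input`.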
