Deep-Learning-Based Real-Time Sign Language Translator for the Deaf and Mute (基于深度学习的聋哑人实时手语翻译器.zip), CSDN Library resource

95 files in total

v2: 25

py: 13

png: 10

Deep learning

Machine learning

Artificial intelligence algorithms

183 views, uploaded 2024-03-07 23:04:29, 212.59MB ZIP

基于深度学习的聋哑人实时手语翻译器.zip (95 files)

基于深度学习的聋哑人实时手语翻译器

Demo Video.mp4 8.74MB

img

Logo.png 11KB

signara.jpg 870KB

Sign To Text

helper.py 7KB

Notebooks

download data.txt 82B

Sign Lang Translator Word Based.ipynb 38KB

Sign Lang Translator Character Based.ipynb 11KB

Logs

LSTM

train

events.out.tfevents.1634268063.PETER-DESKTOP.30040.2.v2 1.47MB

validation

events.out.tfevents.1634268069.PETER-DESKTOP.30040.3.v2 52KB

vgg16_transfer_learning

train

events.out.tfevents.1634839375.PETER-DESKTOP.20520.0.v2 11KB

events.out.tfevents.1634844986.PETER-DESKTOP.17376.0.v2 284KB

validation

events.out.tfevents.1634845014.PETER-DESKTOP.17376.1.v2 2KB

Conv1d

train

events.out.tfevents.1634318138.PETER-DESKTOP.36432.0.v2 1.5MB

validation

events.out.tfevents.1634318141.PETER-DESKTOP.36432.1.v2 79KB

vgg16_finetuning

train

events.out.tfevents.1634845125.PETER-DESKTOP.17376.2.v2 355KB

validation

events.out.tfevents.1634845136.PETER-DESKTOP.17376.3.v2 3KB

Conv1d_LSTM_custom

train

events.out.tfevents.1634781709.PETER-DESKTOP.24684.6.v2 248KB

validation

events.out.tfevents.1634781746.PETER-DESKTOP.24684.7.v2 3KB

transformer_encoder

train

events.out.tfevents.1634271989.PETER-DESKTOP.30040.4.v2 40B

events.out.tfevents.1634271997.PETER-DESKTOP.30040.5.v2 28.63MB

validation

events.out.tfevents.1634272011.PETER-DESKTOP.30040.6.v2 181KB

Conv1d_LSTM_tf

train

events.out.tfevents.1634789339.PETER-DESKTOP.24684.14.v2 769KB

validation

events.out.tfevents.1634789860.PETER-DESKTOP.24684.15.v2 1KB

LSTM2

train

events.out.tfevents.1634838285.PETER-DESKTOP.28980.3.v2 732KB

events.out.tfevents.1634838271.PETER-DESKTOP.28980.2.v2 394KB

validation

events.out.tfevents.1634838478.PETER-DESKTOP.28980.4.v2 9KB

LSTM3

train

events.out.tfevents.1634849165.PETER-DESKTOP.23756.2.v2 2.47MB

events.out.tfevents.1634846502.PETER-DESKTOP.23756.0.v2 529KB

validation

events.out.tfevents.1634849172.PETER-DESKTOP.23756.3.v2 61KB

events.out.tfevents.1634846696.PETER-DESKTOP.23756.1.v2 4KB

CTC

train

plugins

profile

2021_10_21_16_31_30

PETER-DESKTOP.input_pipeline.pb 2KB

PETER-DESKTOP.xplane.pb 100KB

PETER-DESKTOP.memory_profile.json.gz 19KB

PETER-DESKTOP.kernel_stats.pb 0B

PETER-DESKTOP.trace.json.gz 4KB

PETER-DESKTOP.overview_page.pb 4KB

PETER-DESKTOP.tensorflow_stats.pb 5KB

events.out.tfevents.1634833890.PETER-DESKTOP.profile-empty 40B

events.out.tfevents.1634833879.PETER-DESKTOP.28980.0.v2 911KB

validation

events.out.tfevents.1634833898.PETER-DESKTOP.28980.1.v2 55KB

Archs images

model Transformer.png 240KB

LSTM model.png 33KB

VGGmodel.png 135KB

model_conv1d.png 40KB

model CTC.png 29KB

model_convlstm.png 29KB

Models

LSTM3.h5 2.27MB

model_transformer_encoder.h5 16.3MB

model_conv1d.h5 1.33MB

VGG_transfer_learning.h5 80.73MB

main.py 6KB

models.py 1KB

utils

record vids.ipynb 2KB

Create Videos from Frames and save it in Dataset Folder.ipynb 3KB

Extract Videos and Keypoints.ipynb 14KB

Chars Dataset Gathering.ipynb 6KB

img

Mediapipe.png 31KB

char.png 13KB

const.py 2KB

README.md 3KB

fonts

Sahel.ttf 72KB

SIGNARA_Presentation.pptx 24.57MB

Final Demo.mp4 19.29MB

Text To Sign

input.txt 11B

ner.py 669B

Notebooks

ANERCorp.xlsx 2.11MB

ner.py 669B

ner_model.sav 5.84MB

namedEntity.ipynb 16KB

main.py 3KB

ModifiedVincent2.blend 18.87MB

logo.png 12KB

utils

main.ipynb 4KB

Calculate Transition - accross multiple Moves.ipynb 3KB

Moves

base_coord.bvh 562KB

live.bvh 255KB

egypt.bvh 365KB

hru_motion_coord.bvh 310KB

base_coord_main.bvh 83KB

hello_motion_coord.bvh 213KB

i.bvh 209KB

img

Animation.jpg 213KB

render.py 1KB

multithreading 3KB

maps.py 401B

anime.py 1KB

models

ner_model.sav 5.84MB

bvh.py 8KB

spell_checker.py 162B

How it Works.mp4 3.29MB

test.py 2KB

outputs

output.mp4 1.55MB

README.md 795B

ModifiedVincent2.blend1 18.87MB

README.md 2KB

# Deep Learning Model

The goal of SIGNARA is to develop software capable of translating Arabic sign language into text in real time. Due to resource and time constraints, the scope has been limited to 9 words and the whole Arabic alphabet. The words part is built on [Mediapipe](https://mediapipe.dev/), a framework for building multimodal, cross-platform, applied ML pipelines, while the alphabet part uses transfer learning.

<p align="center"> <img src="img\Mediapipe.png" alt="Mediapipe"/> </p>

The technical details of the two parts follow.

## Words

This part was challenging because the data is dynamic: each word is represented as a series of motions, which raised many questions during the research process.

### Dataset

Our online research found no public dataset of Arabic Sign Language words that contain continuous motion, so we collected our own by capturing videos of 30 and 60 frames of the word motion. We captured 120 videos from 2 different data collectors to add variety.

### Model

Deep learning methods such as recurrent neural networks (for example LSTMs) and variations that make use of one-dimensional convolutional neural networks (CNNs) have been shown to provide state-of-the-art results on challenging activity recognition tasks with little or no feature engineering, instead learning features from raw data.

We tried several deep learning architectures to gain insight into their performance by comparing the different approaches:

- CONV1D
- LSTM
- CONV1DLSTM
- LSTM with CTC loss
- Transformers

## Characters

<p align="center"> <img src="img\char.png" alt="Characters"/> </p>

We found a dataset of 54,049 images of [ArSL alphabets](https://data.mendeley.com/datasets/y7pckrw6z2/1) performed by more than 40 people for 32 standard Arabic signs and alphabets. The number of images per class differs from one class to another.
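
Because the class sizes are unequal, training can favor the larger classes. One common way to compensate is to weight each class inversely to its frequency. The sketch below is an illustration only and is not taken from the project's code: the function name `inverse_frequency_weights` and the toy labels are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {class: n_samples / (n_classes * class_count)} for the given labels."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical toy labels: class 0 appears three times as often as class 1.
labels = [0, 0, 0, 1]
weights = inverse_frequency_weights(labels)
print(weights)  # the rarer class 1 receives the larger weight
```

With integer class indices as keys, a dictionary like this can be passed to Keras through the `class_weight` argument of `Model.fit`. Whether the project did so is not stated here.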
The dataset images are 64 × 64 pixels, in grayscale.

Deep convolutional neural network models may take days or even weeks to train on very large datasets. A way to shortcut this process is to reuse the model weights from pre-trained models that were developed for standard computer vision benchmark datasets, such as the ImageNet image recognition tasks. Top-performing models can be downloaded and used directly, or integrated into a new model for your own computer vision problems. This approach is called transfer learning.

In our approach we used the VGG16 model, which achieves 92.7% top-5 test accuracy on the 1000-class ImageNet classification benchmark. The full ImageNet collection contains over 14 million images.
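
The transfer learning setup described above can be sketched in Keras as follows. This is a minimal sketch under stated assumptions, not the project's actual `models.py`: the function name `build_vgg16_classifier`, the 256-unit head, the optimizer, and the choice to repeat the gray channel three times are all illustrative. Input scaling (for example `tf.keras.applications.vgg16.preprocess_input`) is left out for brevity.

```python
import tensorflow as tf


def build_vgg16_classifier(num_classes=32, image_size=64, weights="imagenet"):
    """Frozen VGG16 convolutional base with a small trainable classification head."""
    # Grayscale input, matching the 64 x 64 single-channel dataset images.
    inputs = tf.keras.Input(shape=(image_size, image_size, 1))
    # The pre-trained VGG16 filters expect 3 channels, so repeat the gray channel.
    x = tf.keras.layers.Concatenate()([inputs, inputs, inputs])

    # include_top=False drops the original 1000-class ImageNet classifier.
    base = tf.keras.applications.VGG16(
        include_top=False,
        weights=weights,  # "imagenet" downloads pre-trained weights on first use
        input_shape=(image_size, image_size, 3),
    )
    base.trainable = False  # keep the pre-trained features fixed

    x = base(x, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dense(256, activation="relu")(x)  # assumed head size
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Only the two dense layers are trained at first. Fine-tuning would then unfreeze some of the VGG16 layers and continue training with a lower learning rate.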
