DeafTech Vision: A Visual Computer's Approach to Accessible Communication through Deep Learning-Driven ASL Analysis
Keywords:
Deaf Community, American Sign Language, Convolutional Neural Network, Recurrent Neural Network, Hearing-Impaired, Rectified Linear Unit, Cambridge Hand Gesture
Abstract
Sign language is commonly used by people with hearing and speech impairments, yet it is often difficult for people without such impairments to understand. Sign language is not, however, limited to communication within the deaf community: it has been officially recognized in numerous countries and is increasingly offered as a second-language option in educational institutions. It has also proven useful in professional sectors such as interpreting, education, and healthcare, where it facilitates communication between people with and without hearing impairments. Advanced technologies, including computer vision and machine learning algorithms, are used to interpret and translate sign language into spoken or written forms, with the aim of promoting inclusivity and providing equal opportunities for people with hearing impairments in education, employment, and social interaction. In this paper, we implement DeafTech Vision (DTV-CNN), an architecture based on the convolutional neural network, to recognize American Sign Language (ASL) gestures using deep learning techniques. Our main objective is to develop a robust ASL sign classification model that enhances human-computer interaction and assists individuals with hearing impairments. In extensive evaluation, our model consistently outperformed baseline methods in terms of precision, achieving an accuracy of 99.87% on the ASL alphabet test dataset and 99.94% on the ASL digit dataset, significantly exceeding previous research that reported an accuracy of 90.00%. We also illustrate the model's learning trends and convergence points using loss and error graphs. These results highlight the DTV-CNN's effectiveness and capability in distinguishing complex ASL gestures.
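The abstract does not specify the DTV-CNN layer configuration, so the following is only a minimal illustrative sketch of a CNN-based ASL alphabet classifier in Keras. The input size (64x64 grayscale), layer widths, dropout rate, and the number of classes (26 letters) are assumptions for illustration, not the authors' actual architecture; ReLU activations are used in keeping with the paper's keywords.

```python
# Minimal illustrative sketch (not the authors' exact DTV-CNN):
# a small convolutional classifier for ASL alphabet images.
# Input shape, layer widths, and class count are assumed values.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 26           # assumed: one class per ASL alphabet letter
INPUT_SHAPE = (64, 64, 1)  # assumed: 64x64 grayscale gesture images

def build_asl_cnn(num_classes=NUM_CLASSES, input_shape=INPUT_SHAPE):
    """Build a simple CNN with ReLU activations and a softmax output."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

if __name__ == "__main__":
    model = build_asl_cnn()
    model.summary()  # prints the layer-by-layer architecture
```

A model of this kind would be trained on labeled ASL gesture images (e.g., the alphabet and digit datasets mentioned in the abstract) and its loss and accuracy curves tracked across epochs to show convergence, as the paper reports.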
Published
2024-06-13
How to Cite
Mugdha, S. B. S., Das, H., Uddin, M., Arafat, M. E., & Islam, M. M. (2024). DeafTech Vision: A Visual Computer’s Approach to Accessible Communication through Deep Learning-Driven ASL Analysis. Statistics, Optimization & Information Computing, 12(6), 1795-1811. https://doi.org/10.19139/soic-2310-5070-2020
Issue
Vol. 12 No. 6 (2024)
Section
Research Articles
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).