RESNA Annual Conference - 2023

Indian Sign Language Detection Using Inceptionv3 Model

Sevanth Gajula*, Aramandla Sravanthi*, Rohith Sirpa*

*B V Raju Institute of Technology, Narsapur, Telangana, India

1. Introduction

Indian Sign Language (ISL) is a natural sign language used by deaf and hearing-impaired individuals in India. It is a visual language that uses a combination of hand gestures, facial expressions, and body language to convey meaning. ISL is a complete and complex language, with its own grammar, syntax, and vocabulary. ISL has evolved over time and is influenced by the linguistic and cultural diversity of India. Different regions and communities in India have their own distinct sign languages, which share some similarities but also have differences in terms of vocabulary and grammar.

ISL is not based on spoken language, but it can convey the same level of complexity and depth as spoken languages. It has its own unique features, including the use of space and directional signs to convey meaning, and the use of facial expressions and body language to convey emotion. ISL is recognized by the Indian government as the official sign language of the country, and efforts are being made to promote its use and standardize its vocabulary and grammar. Schools for the deaf in India teach ISL as the primary means of communication, and there are also organizations and institutions working to promote the use of ISL in various domains, such as media, education, and healthcare.

Overall, ISL plays a vital role in the communication and identity of the deaf and hearing-impaired community in India, and it is an important part of the country's linguistic and cultural diversity.

Although ISL is a rich and complex language and the primary means of communication for deaf and hearing-impaired individuals in India, most hearing people do not know it, which creates significant communication barriers. For deaf and hearing-impaired individuals, these barriers can result in social isolation, limited access to education and healthcare, and difficulty finding employment opportunities.

Recent advances in deep learning and computer vision have made it possible to develop tools and applications that can help to break down these communication barriers by predicting and interpreting hand gestures and signs in real time. Such tools can be used to provide real-time translations, improve access to education and healthcare, and facilitate communication in various other domains.

ISL prediction systems typically use deep learning algorithms trained on large datasets of annotated ISL gestures and signs. These algorithms analyse video input of the hand gestures and signs made by the user and translate those gestures into text or speech output.

While ISL prediction technology has the potential to revolutionize communication for deaf and hearing-impaired individuals, there are also some challenges associated with its development and implementation. One challenge is the lack of standardized vocabulary and grammar in ISL, which can affect the accuracy of prediction models.

Despite these challenges, the development of ISL prediction systems holds great promise for improving communication and breaking down social barriers for deaf and hearing-impaired individuals in India. By using technology to overcome the limitations of spoken language, ISL prediction has the potential to promote inclusion, diversity, and equal opportunities for all.

2. Literature Review

Indian Sign Language (ISL) has its own set of alphabets and numbers that are used to communicate with sign language. Here is a brief overview of the ISL alphabets and numbers:

  1. Alphabets: The ISL alphabet consists of 26 letters, each represented by a unique hand gesture or sign formed from specific hand shapes and movements. Spelling a word letter by letter with these signs is known as fingerspelling.
  2. Numbers: The ISL number system consists of a combination of hand gestures and facial expressions. The hand gestures are used to represent the digits 1-9, and facial expressions are used to represent larger numbers, such as 10, 100, or 1,000.

In ISL, the letters and numbers are not only used for spelling and counting, but they are also used to represent words and concepts that do not have a direct visual representation. For example, the letter "C" in ISL can be used to represent the concept of "cold", even though it is not a direct visual representation of that concept.

Learning the ISL alphabets and numbers is an important part of communication in sign language, as it enables individuals to spell out words, convey numbers, and communicate more effectively in various settings, such as education, healthcare, and social interactions.

The previous works summarized below have provided valuable insights and methods for developing accurate and efficient ISL detection systems using machine learning and computer vision. However, further research and development is still needed in this area, particularly to improve the accuracy and robustness of ISL detection models and to develop user-friendly interfaces and tools for capturing and interpreting ISL gestures in real-world settings.

Singh and Kaur (2020) [2] provided a review of Indian Sign Language recognition using machine learning techniques. They discussed various machine learning algorithms such as Decision Trees, Random Forest, K-NN, SVM, and Naive Bayes, and highlighted their advantages and limitations in sign language recognition.

Banerjee and Das (2020) [3] conducted a review of sign language recognition using Convolutional Neural Network (CNN). They discussed various CNN architectures and their applications in sign language recognition.

Singh and Gupta (2020) [4] conducted a review of sign language recognition using deep learning techniques. They discussed various deep learning architectures such as Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and their variants, and highlighted their advantages and limitations in sign language recognition.

Gupta and Chaudhary (2019) [1] conducted a survey of sign language recognition using machine learning techniques. They discussed various machine learning algorithms such as Decision Trees, SVM, K-NN, and their applications in sign language recognition.

Kumar et al. (2020) [5] proposed an Indian Sign Language recognition system using machine learning algorithms. They used various machine learning algorithms such as Decision Trees, Random Forest, K-NN, SVM, and Naive Bayes, and compared their performance in sign language recognition.

3. Methodology

3.1 Data Collection

Figure 1 Sample Data

The first step in this study is to collect a dataset of Indian Sign Language images representing all the English alphabets and numbers. It is important to ensure that the data collection process is standardized and well-documented, and that the dataset is representative of the diverse range of ISL gestures and signers. This is crucial for developing accurate and robust machine learning models for ISL detection, as well as for advancing research and innovation in the field of sign language technology.

3.2 Pre-processing

The collected dataset must undergo pre-processing to remove noise and unwanted information from the images. The pre-processing steps may include resizing, normalization, and augmentation. Resizing involves adjusting the image size to a specific dimension, while normalization involves adjusting the image pixel values to a standard range. Augmentation involves creating additional samples by applying random transformations to the original images, such as rotation and flipping.
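The three pre-processing steps described above can be sketched as follows. This is a minimal illustration using NumPy only, not the pipeline used in the study. The function names, the 299 x 299 target size (the default input size of InceptionV3), and the choice of flips and quarter-turn rotations are assumptions made for the example.

```python
import numpy as np

def resize_nearest(image, height, width):
    """Resize an H x W x C image with nearest-neighbour sampling."""
    rows = np.arange(height) * image.shape[0] // height
    cols = np.arange(width) * image.shape[1] // width
    return image[rows[:, None], cols]

def normalize(image):
    """Scale 8-bit pixel values from [0, 255] to [0.0, 1.0]."""
    return image.astype(np.float32) / 255.0

def augment(image, rng):
    """Return a randomly flipped and rotated copy of the image."""
    if rng.random() < 0.5:
        image = image[:, ::-1]  # horizontal flip
    quarter_turns = int(rng.integers(0, 4))  # rotate by 0, 90, 180 or 270 degrees
    return np.rot90(image, k=quarter_turns, axes=(0, 1))

rng = np.random.default_rng(0)
# Random pixels stand in for a captured photograph (assumption: 8-bit RGB input).
raw = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)
prepared = normalize(resize_nearest(raw, 299, 299))
augmented = augment(prepared, rng)
```

Flips and large rotations may change the meaning of some signs, so in practice the set of augmentations would need to be checked against the gestures in the dataset.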

3.3 InceptionV3 Model Architecture

The figure shows accuracy increasing with the number of training steps for the InceptionV3 model.
Figure 2 Training & Validation Accuracy

The Inception V3 model architecture consists of multiple layers of convolutional and pooling operations, followed by fully connected layers. The model is built from "inception modules", which apply convolutions of several kernel sizes in parallel and concatenate the results, so that features are captured at different scales and resolutions. The Inception V3 model has shown promising results in other image classification tasks, such as detecting defects on steel surfaces, and is therefore used in this study.
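The idea of capturing features at different scales can be illustrated with a simplified inception-style module. The sketch below follows the original Inception design (parallel 1x1, 3x3, and 5x5 convolutions plus a pooling branch, concatenated along the channel axis). It is not the exact block used in Inception V3, which factorizes the larger convolutions and adds 1x1 reductions. All function names and tensor sizes are assumptions chosen for the example.

```python
import numpy as np

def conv2d_same(x, kernels):
    """Naive 'same'-padded convolution with ReLU.

    x: (H, W, C_in); kernels: (k, k, C_in, C_out) with odd k.
    """
    k = kernels.shape[0]
    pad = k // 2
    padded = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    h, w, _ = x.shape
    out = np.zeros((h, w, kernels.shape[3]))
    for i in range(k):
        for j in range(k):
            # Shifted input window multiplied by the kernel slice at offset (i, j).
            out += padded[i:i + h, j:j + w, :] @ kernels[i, j]
    return np.maximum(out, 0.0)

def max_pool_same(x, k=3):
    """k x k max pooling with stride 1 that keeps the spatial size."""
    pad = k // 2
    padded = np.pad(x, ((pad, pad), (pad, pad), (0, 0)), constant_values=-np.inf)
    h, w, _ = x.shape
    windows = [padded[i:i + h, j:j + w, :] for i in range(k) for j in range(k)]
    return np.max(windows, axis=0)

def inception_module(x, w1, w3, w5, wp):
    """Run four parallel branches and concatenate them along the channel axis."""
    branches = [
        conv2d_same(x, w1),                 # 1x1 branch: fine detail
        conv2d_same(x, w3),                 # 3x3 branch: medium context
        conv2d_same(x, w5),                 # 5x5 branch: wide context
        conv2d_same(max_pool_same(x), wp),  # pooling branch followed by 1x1
    ]
    return np.concatenate(branches, axis=-1)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 4))          # toy feature map, 4 channels
w1 = rng.standard_normal((1, 1, 4, 2))
w3 = rng.standard_normal((3, 3, 4, 3))
w5 = rng.standard_normal((5, 5, 4, 1))
wp = rng.standard_normal((1, 1, 4, 2))
features = inception_module(x, w1, w3, w5, wp)  # 2 + 3 + 1 + 2 = 8 output channels
```

The spatial size is preserved in every branch, which is what allows the outputs to be concatenated channel-wise.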

3.4 Model Training

The figure shows loss decreasing with the number of training steps for the InceptionV3 model.
Figure 3 Training & Validation Loss

The pre-processed dataset will be used to train the Inception V3 model. During training, images are fed to the model, its predictions are compared with the correct labels, and backpropagation adjusts the weights so that the model learns to identify the signs. The model will be trained using an optimization algorithm such as stochastic gradient descent (SGD).
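The weight-update loop can be sketched on a much smaller model. The example below trains a single softmax classification layer with mini-batch SGD on synthetic feature vectors. It stands in for the final layer of a network only, and in a deep model such as Inception V3 the same gradient would be propagated back through every layer. The data, the hyperparameters, and the function names are assumptions made for illustration.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked + 1e-12).mean())

def train_sgd(features, labels, num_classes, lr=0.05, epochs=20, batch_size=16, seed=0):
    """Train a softmax layer with mini-batch SGD; return weights, bias, loss per epoch."""
    rng = np.random.default_rng(seed)
    weights = np.zeros((features.shape[1], num_classes))
    bias = np.zeros(num_classes)
    history = []
    for _ in range(epochs):
        order = rng.permutation(len(features))
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]
            x, y = features[idx], labels[idx]
            probs = softmax(x @ weights + bias)
            # Gradient of the mean cross-entropy w.r.t. the logits: probs - one_hot(y).
            grad_logits = probs.copy()
            grad_logits[np.arange(len(y)), y] -= 1.0
            grad_logits /= len(y)
            weights -= lr * (x.T @ grad_logits)
            bias -= lr * grad_logits.sum(axis=0)
        history.append(cross_entropy(softmax(features @ weights + bias), labels))
    return weights, bias, history

# Synthetic stand-in for image features: three well-separated classes in 8 dimensions.
rng = np.random.default_rng(1)
centers = 4.0 * np.eye(3, 8)
labels = rng.integers(0, 3, size=300)
features = centers[labels] + rng.standard_normal((300, 8))

weights, bias, history = train_sgd(features, labels, num_classes=3)
predictions = np.argmax(features @ weights + bias, axis=1)
accuracy = float((predictions == labels).mean())
```

Recording the loss after each epoch in `history` gives the kind of curve that a training-loss plot summarizes.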

3.5 Model Evaluation

The figure compares the accuracy of the different models:

Model            Accuracy
InceptionV3      96.7%
Decision Tree    86%
Random Forests   88%
PCA & SVM        87%
Xception         94%
Figure 4 Accuracy Comparison of Different Models

To assess the effectiveness of the proposed Inception V3 model for Indian Sign Language detection, the model will be evaluated using various metrics. The evaluation process will involve testing the model on a set of images and comparing its predictions with the correct labels. The results of this evaluation will be compared with those of existing methods to determine the performance of the proposed model.
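Comparing predictions with the correct labels is often organized as a confusion matrix. The sketch below counts (true, predicted) pairs using the Python standard library. The labels are made-up examples, not results from the study, and the function name is hypothetical.

```python
from collections import Counter

def confusion_counts(true_labels, predicted_labels):
    """Count (true, predicted) pairs; pairs that differ are misclassifications."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    return Counter(zip(true_labels, predicted_labels))

# Illustrative labels only (assumption): six test images from three classes.
true_labels = ["A", "A", "B", "3", "3", "B"]
predicted_labels = ["A", "B", "B", "3", "3", "B"]

counts = confusion_counts(true_labels, predicted_labels)
errors = {pair: n for pair, n in counts.items() if pair[0] != pair[1]}
```

Here `errors` holds only the confused pairs, which shows at a glance which signs the model mistakes for one another.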

The study methodology consists of several steps, including collecting a dataset of Indian Sign Language images, pre-processing the dataset to remove unwanted information and noise, training the Inception V3 model on the preprocessed dataset, and assessing the model's performance using various metrics. The overall goal of this methodology is to develop a reliable and accurate method for Indian Sign Language detection.

4. Results 


The figure shows an input image that the model identifies as the letter 'A' in Indian Sign Language.
Figure 5 Predicted Character 'A'

Accuracy, precision, recall, and F1 score will be used to assess Inception V3's ability to assign each image to its correct class. Accuracy is the fraction of all images that the classifier labels correctly, whereas precision is, for a given class, the fraction of images predicted as that class that actually belong to it. Recall is the fraction of images of a class that the classifier identifies, and the F1 score is the harmonic mean of precision and recall.
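These four metrics can be computed directly from the true and predicted labels. The following is a minimal sketch in plain Python. The label lists are invented for the example and are not the study's results, and the function names are hypothetical.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label equals the true label."""
    matches = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return matches / len(true_labels)

def per_class_metrics(true_labels, predicted_labels, target):
    """Return (precision, recall, f1) for one class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)  # true positives
    fp = sum(1 for t, p in pairs if t != target and p == target)  # false positives
    fn = sum(1 for t, p in pairs if t == target and p != target)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Illustrative labels only (assumption): five test images from two classes.
true_labels = ["A", "A", "3", "3", "A"]
predicted_labels = ["A", "3", "3", "3", "A"]

overall = accuracy(true_labels, predicted_labels)  # 4 of 5 correct
precision_a, recall_a, f1_a = per_class_metrics(true_labels, predicted_labels, "A")
precision_3, recall_3, f1_3 = per_class_metrics(true_labels, predicted_labels, "3")
```

In this toy case every image predicted as 'A' really is an 'A' (precision 1.0), but one 'A' is missed (recall 2/3), which illustrates why the two measures are reported separately.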

The figure shows an input image that the model identifies as the number '3' in Indian Sign Language.
Figure 6 Predicted Character '3'

The results obtained from the evaluation of the proposed Inception V3 model will be discussed in this section. The discussion will include an analysis of the model's performance in predicting characters from the gestures, as well as its ability to generalize to new datasets. The discussion will also compare the model's performance with existing methods and highlight the strengths and limitations of the proposed approach. 

Overall, the results and discussion section will provide insights into the effectiveness of the proposed Inception V3 model for predicting Indian sign language. It will also discuss the implications of the study for practical applications and future research directions.

5. References

  1. M. &. K. G. Singh, "Sign Language Recognition using Deep Learning Techniques: A Review," Proceedings of the 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT), pp. 1-6, 2020.
  2. R. &. M. D. Mishra, "Indian Sign Language Recognition System using Machine Learning Techniques," Proceedings of the 2019 3rd International Conference on Inventive Systems and Control (ICISC), pp. 1139-1143, 2019.
  3. M. &. S. D. Samal, "Sign Language Recognition using Convolutional Neural Network: A Review," Proceedings of the 2021 2nd International Conference on Innovative Computing and Communication (ICICC), pp. 97-102, 2021.
  4. S. &. K. G. Singh, "Sign Language Recognition using Machine Learning and Deep Learning Techniques: A Review," Proceedings of the 2021 International Conference on Innovative Computing and Communication (ICICC), pp. 111-116, 2021.
  5. M. &. K. G. Singh, "Indian Sign Language Recognition using Machine Learning Techniques: A Review," 2020.