1 Introduction

According to a 2018 World Health Organization (WHO) study, approximately 1.3 billion people live with some form of vision impairment. Visually impaired people face many problems in their day-to-day lives, and many cannot move from one place to another independently. Traditionally, they use guiding sticks to detect objects in front of them. When visually challenged people navigate an unfamiliar environment, they need a system that assists or guides them [10].

  • Furthermore, visually impaired people often need a volunteer to guide them through an unknown environment [8].

  • Our objective: to develop an application for blind people.

2 Ease of Use

2.1 Application Requirements

This app is designed to make day-to-day chores easier, especially for blind people. It targets visually challenged people and keeps interface complexity to a minimum. The application should be able to track objects. Since the user is blind, detected objects are announced by voice. Additional features include a location tracer, logo detection, and expiry-date analysis of products [5] (Fig. 1).

Fig. 1. Application flow diagram

2.2 Relevance of the Project

The application can detect objects in the user’s surroundings. It alerts the user to obstacles in the pathway and thereby helps the user navigate from one place to another without tripping. It also removes the need to carry a special device or a walking stick. The approach is practical because it is built on the Android operating system, and Android-based smartphones are common and widely available [7].

3 Related Work

3.1 Smartphone Application to Assist Visually Impaired People

This paper describes a smartphone application for visually impaired people that is based on technology similar to that proposed for our project. The application uses various sensory modules that detect obstacles and thus offers more precise guidance in a given direction. For object detection, we plan to use deep learning: the image captured by the camera will be processed in real time and the object identified.

One feature of this app is optical character recognition (OCR), which scans a document and converts it into text; the text is then converted to speech using Text To Speech (TTS). The authors discuss using technologies that are low cost and portable. As noted above, they use sensory MEMS modules to implement the Android application.

The authors’ tests reported the following system performance. The application interface was user-friendly. Communication was not disrupted even when ambient noise was high, because visually challenged users had good hearing abilities. Navigation in indoor premises required the user to travel slowly. In areas where the phone signal is weak, the speed must be sufficient to convey proper information. The authors also describe how this system differs from existing systems and argue that it is superior to them. They state that this aiding application, with the help of small external sensory modules, proves to be a viable solution [1].

3.2 A Review of Object Detection and Tracking Methods

This paper presents a typical object tracking framework, which generally consists of three modules: object detection, object modeling, and object tracking. The modules interact with each other during the tracking process. Object detection is discussed in the following paragraphs.

Object detection, a prerequisite for initializing a tracking process, refers to locating the object of interest in every frame of a video sequence. Two strategies are commonly used to initialize a tracking process: manually locating the object in the first frame and letting the system detect features, such as corners, to track the object in the next frame; or automatically detecting the object using predefined features, such as color.

There are many techniques to detect moving objects:

  1. Background Subtraction

Background subtraction is widely used in video sequences with a static background. The method divides the source image into foreground and background. The foreground contains moving objects, such as people and cars, while the background contains static objects, such as roads, buildings, trees, and stationary cars. The technique uses a reference background image that is captured when no objects of interest are present in the scene. The moving object is then found by subtracting this background from the current image frame. In the resulting difference image, values fall below a predefined threshold everywhere in the background area and exceed it only in the area occupied by the object.
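The thresholding step described above can be illustrated with a minimal sketch. It assumes grayscale frames stored as plain nested lists; the function name `background_subtraction` and the threshold of 30 are illustrative choices, not values taken from the reviewed paper.

```python
def background_subtraction(frame, background, threshold=30):
    """Return a binary foreground mask (1 = moving object, 0 = background).

    frame and background are equally sized 2-D lists of grayscale
    intensities (0-255). The threshold is an assumed, illustrative value.
    """
    mask = []
    for frame_row, bg_row in zip(frame, background):
        mask.append([
            1 if abs(pixel - bg_pixel) > threshold else 0
            for pixel, bg_pixel in zip(frame_row, bg_row)
        ])
    return mask


# Reference background captured with no objects of interest in the scene.
background = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
# Current frame: a bright object occupies the two centre pixels of row 1;
# the remaining pixels differ from the background only by small noise.
frame = [
    [12, 11, 10, 9],
    [10, 200, 210, 10],
    [11, 10, 12, 10],
]
mask = background_subtraction(frame, background)
```

Only the two object pixels exceed the threshold, so small sensor noise in the background area is ignored.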

  2. Temporal differencing

Temporal differencing is a method most suitable for situations where the camera is in motion. It detects objects by taking pixel-by-pixel differences of two or three consecutive frames. When the camera moves, the motion of the camera and that of the object are mixed. Therefore, some researchers proposed to estimate and compensate for camera motion first and then apply the background subtraction method. Temporal differencing fails to detect the overlapping areas of moving objects and, for a fast-moving object, wrongly detects the trailing region of the object, known as the ghost region.
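A small two-frame sketch makes both failure modes concrete. The function name `frame_difference`, the threshold, and the one-row toy frames are assumptions made for illustration only.

```python
def frame_difference(previous, current, threshold=30):
    """Binary motion mask from two consecutive grayscale frames (2-D lists)."""
    return [
        [1 if abs(cur - prev) > threshold else 0
         for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]


# A uniform object three pixels wide moves one pixel to the right.
previous = [[10, 200, 200, 200, 10, 10]]
current = [[10, 10, 200, 200, 200, 10]]
motion = frame_difference(previous, current)
```

In the resulting mask, index 1 is flagged although the object has already left it (the trailing ghost region), index 4 marks the leading edge, and indices 2 and 3, where the object overlaps itself in both frames, are missed.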

  3. Optical flow

Optical flow is another technique for finding moving objects in video frames. It yields a two-dimensional vector field, also called the motion field, that represents the velocity and direction of each point across consecutive images. The image is segmented into regions at discontinuities in the flow. Although computationally expensive, the method has the advantage that it can detect motion in video sequences with a dynamic background.
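The reviewed paper does not name a specific optical flow algorithm. As one possible illustration, the sketch below uses a Lucas-Kanade style least-squares estimate and, to stay short, computes a single motion vector for one window rather than a full vector field. The function `estimate_flow`, the synthetic `pattern`, and the chosen shift are all assumptions.

```python
import math


def estimate_flow(frame1, frame2):
    """Estimate one (u, v) displacement, in pixels per frame, for a window.

    Solves the brightness-constancy equation Ix*u + Iy*v + It = 0 in the
    least-squares sense over all interior pixels (Lucas-Kanade style).
    Frames are equally sized 2-D lists indexed as frame[y][x].
    """
    height, width = len(frame1), len(frame1[0])
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            ix = (frame1[y][x + 1] - frame1[y][x - 1]) / 2.0  # gradient in x
            iy = (frame1[y + 1][x] - frame1[y - 1][x]) / 2.0  # gradient in y
            it = frame2[y][x] - frame1[y][x]                  # change over time
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("window has too little texture to estimate flow")
    # Solve the 2x2 normal equations [sxx sxy; sxy syy] [u v]^T = -[sxt syt]^T.
    u = (sxy * syt - syy * sxt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


def pattern(x, y):
    """Smooth synthetic texture used only for this demonstration."""
    return math.sin(0.3 * x) + math.cos(0.2 * y)


SIZE = 32
TRUE_U, TRUE_V = 0.5, -0.25  # assumed sub-pixel shift between the frames
frame1 = [[pattern(x, y) for x in range(SIZE)] for y in range(SIZE)]
frame2 = [[pattern(x - TRUE_U, y - TRUE_V) for x in range(SIZE)]
          for y in range(SIZE)]
u, v = estimate_flow(frame1, frame2)
```

Because the estimate relies on a first-order approximation, it is only reliable for small displacements; a dense motion field would repeat the same solve in a small window around every pixel, which is where the computational cost comes from.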

  4. Object detection

Object detection can also be done by training a classifier that learns different object views and appearances through supervised learning. Once the classifier is trained, it decides whether a test region contains a target object or not [2].
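The train-then-decide pattern can be sketched with a deliberately simple nearest-centroid classifier. The reviewed paper does not prescribe a particular classifier, so this choice, the two-element feature vectors, and the function names `train_centroids` and `is_target` are assumptions for illustration.

```python
def train_centroids(samples):
    """Learn one mean feature vector per label.

    samples: list of (feature_vector, label) pairs, where label is True
    for the target object and False for everything else.
    """
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}


def is_target(centroids, features):
    """Decide whether a test region's features are closer to the target class."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return (squared_distance(features, centroids[True])
            < squared_distance(features, centroids[False]))


# Hypothetical two-dimensional features extracted from labelled regions.
training = [
    ([0.9, 0.8], True), ([0.8, 0.9], True),
    ([0.1, 0.2], False), ([0.2, 0.1], False),
]
centroids = train_centroids(training)
```

In practice the feature vectors would come from a descriptor computed on each candidate region, and a stronger supervised model would replace the centroid rule.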

3.3 Smart Guiding Glasses for Visually Challenged People in Indoor Environment

A multi-sensor fusion based obstacle-avoidance algorithm is proposed that uses depth and ultrasonic sensors to solve a major problem: detecting small obstacles and transparent obstacles such as French doors and other glass doors. Three kinds of auditory cues were developed for people who are completely blind, informing them of the direction in which they can move ahead. For weak-sighted people, a visual enhancement is adopted that leverages the augmented reality (AR) technique and integrates the traversable direction. Experimental results show that the smart guiding glasses can effectively improve the user’s traveling experience in complicated indoor environments, such as a college corridor, and in similar everyday places. Thus the system serves the purpose of helping visually impaired people with their day-to-day tasks.

The smart guiding device takes the shape of a pair of eyeglasses and gives blind people guidance efficiently and safely.

Although the ultrasonic method can measure the distance to objects, it cannot determine their direction accurately and suffers from interference problems when the system is tested in a local environment. Laser scanner based methods offer high precision and resolution and are therefore widely used in mobile robot navigation. However, laser scanners are expensive and heavy, and their high power consumption makes them unsuitable for mobile applications. For camera-based methods, there are many choices, such as stereo cameras, mono cameras, and RGB-D cameras [3].
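For context on how an ultrasonic sensor yields a distance but no direction, the usual time-of-flight relation can be sketched as follows. The function name and the speed of sound of 343 m/s (dry air at roughly 20 °C) are assumptions; the reviewed paper does not give these details.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 degrees Celsius


def echo_distance_m(round_trip_seconds):
    """Distance to an obstacle from the round-trip time of an ultrasonic pulse.

    The pulse travels to the obstacle and back, so the one-way distance is
    half of the total path length.
    """
    if round_trip_seconds < 0:
        raise ValueError("round-trip time cannot be negative")
    return SPEED_OF_SOUND_M_PER_S * round_trip_seconds / 2.0


distance = echo_distance_m(0.01)  # a 10 ms echo
```

The result is a single range value: every obstacle inside the sensor's beam at that range produces the same echo time, which is why direction cannot be recovered from one sensor alone.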

3.4 Navigation System for Visually Impaired People

This paper describes a navigation system that combines indoor and outdoor positioning with a common object detector to determine the position of the user. Most outdoor navigation systems use GPS for positioning. Unfortunately, GPS can be used only in outdoor environments because its radio signals cannot pass through solid walls. Indoor systems therefore depend on other methods for finding the position of the user. In addition, indoor environments may contain things that are too close to each other to be distinguished.

Current indoor navigation systems typically use radio signals for positioning, which may suffer from signal impairments such as radio frequency interference and multipath propagation. One location-finding system with talking assistance supports both indoor and outdoor navigation. The system consists of a walking stick with a GSM module for sending a message to an authorized person in an emergency, an RF transmitter and receiver, and sonar sensors. RFID is used for indoor localization and GPS for outdoor localization. Using GPS outdoors thus avoids the cost of installing many RFID tags for outdoor place identification. “Drishti” is a GPS-based technique that can switch the system between indoor and outdoor environments with a simple vocal command. The authors extend the indoor version of Drishti to the outdoor version for blind pedestrians by adding two ultrasonic transceivers, each smaller than a credit card and attached to the user’s shoulder, to provide a complete navigation system [4].

Table 1 summarizes the approaches proposed by the different authors.
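The split between RFID indoors and GPS outdoors can be sketched as a simple source-selection rule. The function `locate`, its arguments, and the tag table are hypothetical and are not taken from the reviewed system.

```python
def locate(gps_fix, rfid_tag_id, tag_positions):
    """Pick a position source: RFID tags indoors, GPS outdoors.

    gps_fix: (latitude, longitude) tuple, or None when there is no fix.
    rfid_tag_id: ID of the tag currently in range, or None.
    tag_positions: dict mapping known tag IDs to place names.
    """
    if rfid_tag_id is not None and rfid_tag_id in tag_positions:
        return ("indoor", tag_positions[rfid_tag_id])
    if gps_fix is not None:
        return ("outdoor", gps_fix)
    return ("unknown", None)


# Hypothetical tag table for one building.
tags = {"tag-17": "library entrance", "tag-42": "stairwell B"}
indoor = locate(None, "tag-17", tags)
outdoor = locate((18.52, 73.85), None, tags)
```

Preferring a known tag over a GPS fix reflects the observation above that GPS signals are unreliable inside buildings.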

Table 1. Analysis of various navigation applications for visually challenged people.

4 Conclusions and Future Works

We started with the motivation and the idea to solve the problems of visually impaired people. We surveyed many methods for implementing object detection and found the OpenCV library and the Google Cloud Vision API to be the best choice. Our project will be developed on Android, and since Android is developed by Google, we expect almost no compatibility issues.

Our expected result is an Android application with a user-friendly interface that a blind person can easily use for navigation and for other purposes. During navigation, objects detected by the application are announced to the user by an automated voice, so the user learns of the obstacle ahead and can navigate accordingly. The application will also be used to obtain the details of a product after its details are scanned by the phone camera. We will add a few audio clips of books so that blind people can listen to them in their free time. We will also use optical character recognition (OCR) to read and recognize printed text and then convert it into an automated voice [9].

Compared with traditional methods such as a modified guiding stick, our application provides more data and functionality to the visually impaired user. It is a user-friendly application with many functions. We will train our model on various everyday obstacles in the path of a blind person and provide a large dataset for this purpose so that object recognition is achieved efficiently. We aim for a recognition rate higher than that of other applications [6].

The proposed system will be compatible with different environments, so users need not worry about unfamiliar areas. It will warn users of obstacles ahead. Feature extraction is another important capability of this application and has many uses. We are working to improve the recognition efficiency of the application so that objects are detected accurately and the user is not injured because of a false detection [12].

The point of comparing the various papers is to understand the drawbacks of existing applications. The main points are as follows:

  • The existing smartphone application relies on manual assistance, while our application will use neural networks to make the entire system automatic.

  • The smart guiding glasses use complex IoT-based hardware; our application will adopt only their concept of depth acquisition.

  • The object detection and tracking paper helps us to understand:

    • Background subtraction for detection

    • Various visual tracking methods for classification

    • Feature-Based tracking.

Hence, we eliminate manual assistance and build the application entirely with deep learning. From the above survey of the various papers, we found the YOLO model to be the best of the object detection approaches considered, as it is fast and accurate and produces results in very little time. The paper proposing depth calculation is also very useful, as it gives the visually impaired person an idea of how far away the obstacle is [11].
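How detector output and depth estimates might be combined into spoken alerts can be sketched as follows. The detection record format, the confidence and distance cut-offs, and the function name `build_alerts` are hypothetical; the sketch only builds the alert strings and leaves the actual detector and text-to-speech engine out.

```python
def build_alerts(detections, min_confidence=0.5, max_distance_m=5.0):
    """Turn detections into short alert strings, nearest obstacle first.

    Each detection is a dict with the assumed keys "label", "confidence"
    (0 to 1) and "distance_m". Low-confidence or far-away detections are
    dropped so the user is not flooded with messages.
    """
    relevant = [
        d for d in detections
        if d["confidence"] >= min_confidence and d["distance_m"] <= max_distance_m
    ]
    relevant.sort(key=lambda d: d["distance_m"])
    return [f'{d["label"]} ahead, {d["distance_m"]:.1f} metres' for d in relevant]


detections = [
    {"label": "chair", "confidence": 0.9, "distance_m": 2.0},
    {"label": "door", "confidence": 0.3, "distance_m": 1.0},
    {"label": "person", "confidence": 0.8, "distance_m": 1.5},
    {"label": "car", "confidence": 0.95, "distance_m": 12.0},
]
alerts = build_alerts(detections)
```

Each string in `alerts` would then be passed to the phone's text-to-speech service, so the closest obstacle is announced first.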