Introduction

Vision allows people to perceive and understand the surrounding world, and blindness is a condition in which a person lacks this visual perception. According to the WHO, an estimated 36 million people worldwide are blind [1]. Researchers have been working on this problem and have developed several aids for blind and visually impaired people. The most popular among these remain the white cane and the guide dog; however, both are limited in coverage, efficiency, and speed. Recently, walking assistants based on sensor technology have become popular. These devices sense the surrounding environment using sensors such as cameras and ultrasonic devices and guide the user with greater speed, coverage, and reliability. In general, assistive technology has three core functions: (1) sensing, (2) processing, and (3) alerting. Sensing is the basic function any assistive tool performs to perceive the surrounding environment; popular sensing devices fall into two classes: (1) imaging devices (RGB cameras, depth cameras, etc.) and (2) non-imaging devices (ultrasonic sensors, lasers, etc.). The captured information about the surroundings is then processed and enhanced to guide the user. Finally, the processed information is used to alert or guide the user through synthetic speech or vibrations.

Over the past few decades, many researchers have developed efficient techniques and devices for aiding visually impaired and blind people. Shoval et al. [2] developed NavBelt, an obstacle-avoidance system restricted to indoor navigation. Yuan et al. [3] discussed a device that acts as a virtual white cane, measuring distance and resolving path uncertainties using a jump-Markov model. Bolgiano et al. [4] introduced a laser cane using high-efficiency G.E. lasers for efficient detection of obstacles; the device even offers a range adjustment from 3 to 12 feet. Sabarish et al. [5] developed a microcontroller-based navigation aid that provides feedback to the user through vibrations and speech. Pooja Sharma [6] presented a review of popular obstacle detection techniques and the limitations arising from the angles and distances of obstacles. All of the above systems, however, are not completely reliable and have their own limitations.

In general, walking aids can be classified into three groups [1]:

  • Sensor-based walking assistants: In these systems, sensing is done using ultrasonic sensors, which give good results on high-density surfaces. Cardillo et al. [7] proposed an electromagnetic sensing device based on microwave radar that notifies the user of obstacles in the way. Kwiatkowski et al. [8] investigated a radar-based navigation tool for use in unfamiliar surroundings. Patil et al. [9] proposed an ETA (Electronic Travel Aid) named NavGuide that helps the user avoid obstacles along the way. Bai et al. [10] proposed a low-cost wearable device that guides the user to the destination while avoiding obstacles using a sub-selection strategy. Kaushalya et al. [11] developed an assistant called AKSHI, which uses an RFID reader and tags, an ultrasonic sensor, a GSM module, and a Raspberry Pi 2. Bhatlawande et al. [12] proposed an electronic mobility cane. Zhou et al. [13] developed a smart system called Smart Eye, which consists of an embedded wearable sensor and smartphone modules.

  • Image processing-based walking assistants: These designs are wearable vision assistance systems for the visually impaired built around image sensors such as webcams and binocular sensors. Aladren et al. [14] introduced a novel navigation system using a depth-sensing camera that acquires visual and depth details of the vicinity. Yang et al. [15] proposed a real-time assistance system that utilizes parsing networks and 3D segmentation. Mekhalfi et al. [16] developed a prototype that offers indoor multi-object recognition using machine learning algorithms.

  • Smartphone-based walking assistance systems: These are similar to computer vision-based systems, except that the smartphone's camera or built-in sensors detect the objects. Cheraghi et al. [17] presented a model called Guide Beacon, intended primarily for indoor environments. Tepelea et al. [18] discussed a smartphone-based navigation system that utilizes the MEMS sensors integrated into a smartphone.

Proposed System

System Description

Considering all the types of walking assistance systems discussed above, the proposed system combines the traditional stick with computer vision-based technology. Since it combines two different techniques, it is named the hybrid system. It operates in three stages: object detection, object identification, and feedback. The hybrid system is built from components that are simple and easy to work with, and it offers easy adaptability and reliability for the user because it still embeds the traditional stick. The computer vision algorithm runs on a Raspberry Pi, and the whole system is described by the block diagram in Fig. 1.

Fig. 1 Block diagram of the hybrid system

  • Ultrasonic sensor: An ultrasonic sensor, also known as a distance sensor, measures the distance to a detected object. It senses object proximity or range using reflected ultrasound, chosen because it is inaudible to human ears and accurate over short distances. The transmitter emits high-frequency ultrasonic waves that are reflected by the object, and the object's distance is computed from the time taken to receive the echo: distance = (echo time × speed of sound)/2, so an echo received after about 5.8 ms corresponds to roughly 1 m. The sensor used in this paper is the HC-SR04.

  • Pi camera: The Pi camera is a cheap and efficient camera module supported by the Raspberry Pi, commonly used in image processing and machine learning. Here, a 5 MP module captures the images required for object identification, and the captured images are processed using digital image processing (a capture sketch is given after this component list).

  • Raspberry Pi Model 3B: This embedded board carries out the instructions through logical and mathematical operations. Here it performs object detection and identification and produces the audio output. Peripheral devices are connected to the board through the GPIO pins; all the required connections are explained in the coming sections.

  • Switch and stick: The switch is a push button attached to the traditional stick used by the user. Pressing it sends an ON signal that activates the camera. By default, the user relies on the stick to walk across uneven surfaces and around obstacles; when the user wants details about an obstacle or object, the switch is pressed.

  • Audio output: The audio output delivers the auditory feedback after object identification. It can be earphones or a speaker, according to user preference.

  • Power supply: A power supply is essential for the working of the Raspberry Pi. A 5 V, 2.4 A supply is used, which can take the form of a portable battery.
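
As a concrete illustration of the capture step mentioned above, the following is a minimal sketch using the legacy picamera library that ships with Raspberry Pi OS of the Pi 3B era; the resolution and the helper name capture_frame are illustrative choices of ours, not the authors' code.

    # Minimal capture sketch (legacy picamera library; resolution illustrative)
    from picamera import PiCamera
    from picamera.array import PiRGBArray

    camera = PiCamera(resolution=(640, 480))

    def capture_frame():
        """Return the current frame as a NumPy BGR array for OpenCV-style processing."""
        with PiRGBArray(camera) as output:
            camera.capture(output, format="bgr")
            return output.array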

Software Requirements

  • NOOBS (New Out Of Box Software) is required to set up the Raspberry Pi; it installs the operating system along with the Python packages required for Pi programming. It is provided officially by the Raspberry Pi organization and is written to the Pi module using an SD card.

  • A TensorFlow Lite model, SSDLite-MobileNet-v2 trained on the MSCOCO dataset, is used for object detection and identification. On the Raspberry Pi, to avoid errors caused by conflicting software dependencies, we set up a virtual environment installed via pip:

    sudo pip3 install virtualenv
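
For completeness, a typical sequence for creating and activating such an environment and installing the packages the detection and speech code needs might look like the following; the environment name tflite-env and the exact package list are illustrative assumptions, not the authors' installation script:

    virtualenv tflite-env
    source tflite-env/bin/activate
    pip3 install tflite-runtime opencv-python gTTS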

  • All the code for object detection and identification runs in this virtual environment. The entire procedure is inspired by [19]. The dependencies needed to run the program are installed through a bash script that pulls in all the library files required for object detection. When an object is identified and its label obtained, the label has to be delivered as audio output; for this we use gTTS (a combined detection and speech sketch is given after this list).

  • gTTS is the text-to-speech library most commonly used for converting text to speech. gTTS runs on Python 3 in combination with mpg321, a command-line audio player used to play the generated speech. It supports various languages and speeds, but the present version does not support changing the voice of the audio.
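
To make the detection and feedback steps concrete, here is a condensed sketch of how an SSDLite-MobileNet-v2 TFLite model and gTTS are typically wired together. The file names detect.tflite and labelmap.txt, the confidence cutoff, and the assumptions that the model is a quantized uint8 export with the standard SSD output order (boxes, classes, scores, count) are ours, not details taken from [19]:

    # Condensed object detection + speech sketch (file names and threshold assumed)
    import os
    import cv2
    import numpy as np
    from gtts import gTTS
    from tflite_runtime.interpreter import Interpreter

    interpreter = Interpreter(model_path="detect.tflite")  # SSDLite-MobileNet-v2
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()
    labels = [line.strip() for line in open("labelmap.txt")]  # MSCOCO labels

    def detect_objects(frame, min_score=0.5):
        """Return the labels of all objects scoring above min_score."""
        _, height, width, _ = inp["shape"]            # model's expected input size
        resized = cv2.resize(frame, (width, height))  # uint8, matching a quantized model
        interpreter.set_tensor(inp["index"], np.expand_dims(resized, axis=0))
        interpreter.invoke()
        classes = interpreter.get_tensor(out[1]["index"])[0]  # standard SSD output order
        scores = interpreter.get_tensor(out[2]["index"])[0]
        return [labels[int(c)] for c, s in zip(classes, scores) if s > min_score]

    def speak(text):
        """Convert a label to speech with gTTS and play it through mpg321."""
        gTTS(text=text, lang="en").save("label.mp3")
        os.system("mpg321 label.mp3")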

Hardware and Connections

All the peripheral components are connected to the Raspberry Pi through its GPIO pins; brief details of the connections are as follows:

  • The ultrasonic sensor has four pins. VCC is connected to +5 V (PIN 2); TRIG is connected to PIN 7 (GPIO 4); ECHO is connected to PIN 11 (GPIO 17) through a resistive divider, since the sensor's 5 V echo signal must be stepped down to the Pi's 3.3 V logic level; and GND is connected to PIN 6 (a read-out sketch is given after this list).

  • The switch is connected to PIN 1 (+3.3 V) and to GND (PIN 6).

  • The earphones and the Pi camera are connected to their respective ports on the Raspberry Pi board.

  • Any other devices, such as a display or Ethernet, can be connected to the ports provided on the board.
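
The following sketch reads these two inputs with the RPi.GPIO library. The TRIG and ECHO pins match the wiring above; the switch GPIO (18) is an assumption, since the text does not name a dedicated GPIO for the push button:

    # Reading the HC-SR04 and the push button (switch GPIO is assumed)
    import time
    import RPi.GPIO as GPIO

    TRIG, ECHO, SWITCH = 4, 17, 18  # BCM numbers (physical pins 7, 11; switch assumed)

    GPIO.setmode(GPIO.BCM)
    GPIO.setup(TRIG, GPIO.OUT)
    GPIO.setup(ECHO, GPIO.IN)
    GPIO.setup(SWITCH, GPIO.IN, pull_up_down=GPIO.PUD_DOWN)

    def measure_distance_cm():
        """Trigger one HC-SR04 measurement and return the distance in cm."""
        GPIO.output(TRIG, True)           # 10 microsecond trigger pulse
        time.sleep(0.00001)
        GPIO.output(TRIG, False)
        start = time.time()
        while GPIO.input(ECHO) == 0:      # wait for the echo pulse to start
            start = time.time()
        end = time.time()
        while GPIO.input(ECHO) == 1:      # echo pulse width = round-trip time
            end = time.time()
        return (end - start) * 34300 / 2  # speed of sound ~343 m/s, halved for round trip

    def switch_pressed():
        return GPIO.input(SWITCH) == GPIO.HIGH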

Working

The working algorithm is as follows:

Step 1: The equipment is set up properly, and the power supply is connected.

Step 2: When an obstacle is detected, its distance is obtained from the ultrasonic sensor.

Step 3: When the distance is less than the threshold, or the switch is pressed, the camera is activated.

Step 4: The camera feeds its input to the object detection program, where the object is identified.

Step 5: The detected object's label is given as output through the audio device (earphones).
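
Putting the pieces together, the control loop implied by Steps 1 to 5 might look like the following sketch; the threshold value is illustrative (the paper does not state one), and the helper functions are the ones sketched in the earlier sections:

    # High-level control loop for Steps 1-5 (threshold value is illustrative)
    THRESHOLD_CM = 100

    while True:
        if measure_distance_cm() < THRESHOLD_CM or switch_pressed():
            frame = capture_frame()          # Steps 3-4: activate camera, grab a frame
            for label in detect_objects(frame):
                speak(label)                 # Step 5: auditory feedback
        time.sleep(0.1)                      # brief pause between measurements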

The corresponding flowchart is given in Fig. 2.

Fig. 2 Flowchart

From Table 1, one can see that most existing systems were developed mainly for indoor navigation and are not suitable for object recognition; they are also not cost-effective and underperform in low-light conditions. The proposed system is low cost, works in both indoor and outdoor environments, and serves as both a navigation and an object recognition system for visually impaired people.

Table 1 A comparison of existing methods and the proposed system

Results and Discussion

Figure 3 shows the prototype of the proposed system. The system is built so that it can be worn near the user's neck, with the switch attached to the stick (as shown in Fig. 4). Earphones provide the auditory feedback to the user.

Fig. 3 Prototype of the system

Fig. 4 Human model with the hybrid system

Figure 5 presents a screenshot of the distance measured by the ultrasonic sensor. This distance drives the decision-making: if it is below the threshold, the object detection program runs and sends auditory feedback about the detected object. Figure 6 presents a screenshot of the auditory feedback generated by the object detection program; the same kind of feedback is produced when the user presses the switch on the stick.

Fig. 5 Output of the ultrasonic sensor

Fig. 6 Screenshot showing details of the auditory feedback

Figures 7 and 8 present the output of the object detection program. In Fig. 7, the program detects a person; similarly, Fig. 8 shows the details of the objects detected. Both figures show that the object detection program successfully detects objects and obstacles.

Fig. 7 Screenshot showing person detection

Fig. 8 Screenshot showing object detection

Conclusion

We combined techniques from previous designs to build a device that helps visually impaired people detect obstacles in indoor and outdoor surroundings, acting as both a navigation and an object recognition system. Object identification and obstacle detection are done using the Pi camera and the ultrasonic sensor, respectively, and the gTTS library and earphones provide the audio feedback. The results show that the proposed hybrid system can successfully detect objects and obstacles, aiding the safe movement of visually impaired people. At times, the accuracy of the output decreases due to noise and disturbances in the surroundings; studying the effect of noise and ensuring proper object detection in noisy environments is left as future work. For better performance, the model should also be trained on larger datasets containing everyday objects and obstacles.