1 Introduction

In our country there are poverty-stricken regions where basic necessities are neglected or not provided. A bot cannot resolve these issues, but it can help improve the situation. The first Chatbot, ‘ELIZA’, was created in 1966 [1]; it was capable of mimicking human conversation using pattern matching. The real strength of a Chatbot is that it provides information on demand, which can help people lead better, healthier lives and instill basic knowledge of proper health care. A significant share of the population needs information about safe sex and has no awareness of sexually transmitted infections, because discussing sex is still considered taboo within the family. According to a WHO report, nearly 11 million people inject drugs, of whom 1.3 million are living with HIV. A large part of the world's population also does not know the correct use of essential medicines and antibiotics. Also according to WHO [2], approximately 31 million people across the globe suffer from drug use disorders each year, which at a later stage leads to clinical misuse and can indirectly render the administered therapy more or less ineffective. A recent report indicates that artificial intelligence is playing a crucial role in battling chronic diseases that start small and grow to be life-threatening [3]. The Internet and access to large volumes of medical resources have proved revolutionary for medical science, and artificial intelligence is helping to curb and cure disease so that people can live better lives. Bots are intelligent agents built on machine learning and its subfield deep learning: we train and test models on our dataset to obtain results and compare the accuracy of the different models applied to it.
In this study, we have built a Health-Bot using a Keras classifier and a recurrent neural network (RNN) to help others in the health sector [4]. The primary domain of our bot is healthcare, where it aims to provide people with the health information they need.

The paper is organized as follows: Sect. 2 reviews related work in this field; Sect. 3 discusses the methodologies, designs, activation functions and neural networks applied; Sect. 4 walks through the features offered by our model; Sect. 5 describes the result-generation methodology; and Sect. 6 concludes the work and outlines future prospects of the project.

2 Related Work

Chatbots, or Health-Bots, interact with humans in a humane way. The presence of Chatbots has increased significantly, and they have great potential in education [5], entertainment [6] and public-sector applications such as this one. Chatbot development has grown manyfold. A Chatbot can respond to health-related doubts or queries that a person is afraid to discuss; owing to recent progress, there are now many ways to implement a Health-Bot with a visual interface using web frameworks such as Django. Bots designed using deep learning [7] have a vast amount of data to train on. In this work, we have used recurrent neural networks (RNN) for both encoding and decoding [8]. Eliza was the first Chatbot, created at the Massachusetts Institute of Technology. If a patient said ‘my head hurts,’ Eliza would respond with ‘Why do you say your head hurts?’ Eliza, with just 200 lines of code, works like a therapist [1]. It is a natural inclination to want human interaction in practically all of our day-to-day activities; technology, however, extends our capabilities. Today, thanks to AI and NLP, we can use Chatbot technology to provide almost human-like conversations. These conversations are powered by AI. A good healthcare infrastructure is fundamental to any country’s civic life, and the sick must be given a local means of accessing better health care without having to wait days or months for a short visit. These logistical issues can be eased with the help of the Internet and access to large volumes of medical resources, most of which are free. The intelligent personal assistants (IPA) on our phones, supported by machine learning and neural networks, can then provide definite responses for certain needs.
It is close to infeasible to resolve these difficulties with a bot; however, a bot can help by improving the situation. The real advantage of these Chatbots is their ability to give appropriate guidance and information for a healthy life, as there are many people who still lack basic knowledge of proper health care [9]. Machine learning is the parent domain from which deep learning is derived; combined with algorithms inspired by the structure and working of the human brain, it paves the way to artificial neural networks. A deep learning architecture comprises neural networks made up of neurons, activation functions and weights that learn on their own using learning algorithms [10]. Bots are simply intelligent agents residing on a server that communicate with humans or other bots to make human tasks much easier, without the need for a specific protocol or APIs or for any ‘master bots’ such as Google Assistant [11]. They communicate in plain English, and deep learning makes them more accurate in returning the appropriate response to a given query. In this study, we have built a contextual Chatbot using TensorFlow [10] and Python to contribute to the health sector. Our bot is capable of diagnosing the medical problem, recommending a suitable doctor, giving reminders about medication and booking an online appointment with the doctor. The most significant advantage of Chatbots in the healthcare sector is their ability to give advice and information for a healthy life to people who lack basic knowledge of health care (Fig. 1).

Fig. 1 Recurrent Neural Networks

3 Methodology and Design

3.1 Neural Networks

The main working concept of neural networks is to distribute activity across the links between units with the help of a learning algorithm, in a way similar to the working of the human brain. Neural networks are trained and tested by a simulator [1], which works with a definition of the neural network topology. Neural units are of three types: input, hidden or output units. The connections between them are unidirectional, although recurrent links are supported. Many algorithms are available for training on a dataset; here only standard back-propagation and momentum back-propagation are used. The input values are read directly from a pattern file, which contains many input values together with their associated output values. The output unit values are passed to a result file, or the output units can receive values from the pattern file. The simulator can also support a dynamic mode [13]. An N-gram model can be implemented using these features. Hidden and output units can receive values from the preceding units, and the output function specifies further changes to those values.

3.2 Recurrent Neural Network

A sequence of consecutive words, typically the most frequently occurring ones, is fed into an RNN. It analyzes the data by finding the words that occur most frequently and builds a model that predicts the next word in the sentence, i.e., it auto-fills the most probable continuation [14].

Recurrent neural networks are preferred because a feed-forward neural network considers only the current input and cannot memorize previous outputs. An RNN [15], in contrast, memorizes what happens in the hidden layers and feeds that state into the next step. It can therefore handle sequential data.
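
The memory effect described above can be illustrated with a minimal sketch of a single-unit recurrent step. This is not the network used in this work: the function name `rnn_forward`, the tanh nonlinearity and the weight values are assumptions chosen only for illustration.

```python
import math


def rnn_forward(inputs, w_xh, w_hh, b_h):
    """Run a one-unit recurrent layer over a sequence (illustrative only).

    Each hidden state depends on the current input and on the previous
    hidden state, which is how earlier inputs are "remembered".
    """
    h = 0.0  # initial hidden state
    states = []
    for x in inputs:
        h = math.tanh(w_xh * x + w_hh * h + b_h)
        states.append(h)
    return states


# Only the first input is non-zero, yet later states stay non-zero
# because the hidden state carries information forward.
states = rnn_forward([1.0, 0.0, 0.0], w_xh=0.5, w_hh=0.8, b_h=0.0)

# Setting the recurrent weight to zero removes the memory, which mimics
# a feed-forward unit that sees only the current input.
memoryless = rnn_forward([1.0, 0.0, 0.0], w_xh=0.5, w_hh=0.0, b_h=0.0)
```

With the recurrent weight set to zero, the second and third states collapse to zero, which is the feed-forward behaviour contrasted in the text.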

Types of RNN

  1. ONE TO ONE NEURAL NETWORK: The one-to-one type is the most basic (vanilla) form of artificial neural network, mapping a single input to a single output. It is used for regular machine learning problems.

  2. MANY TO ONE NEURAL NETWORK: A many-to-one network takes in a sequence of inputs and produces a single output, for example, in sentiment analysis, where a given sentence is classified as expressing positive or negative sentiment.

  3. MANY TO MANY NEURAL NETWORKS: A many-to-many network takes in a sequence of inputs and generates a sequence of outputs, for example, in machine translation.

3.3 Label Encoding

Sometimes a label is not a number but a string. We want to convert these strings to integers starting from zero. If the classification has three classes, the labels are zero, one and two. A label encoder encodes labels with values between zero and n_classes − 1.
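
As a rough sketch of the idea, the mapping can be built in a few lines of plain Python. The helper `fit_label_encoder` and the disease names are hypothetical and not taken from our dataset; assigning integers in sorted order is an assumption that mirrors how common label encoders (for example, scikit-learn's `LabelEncoder`) behave.

```python
def fit_label_encoder(labels):
    """Map each distinct string label to an integer in 0..n_classes-1.

    Classes are numbered in sorted order (an assumption of this sketch).
    """
    classes = sorted(set(labels))
    return {label: index for index, label in enumerate(classes)}


# Hypothetical disease labels used only for illustration.
labels = ["flu", "migraine", "allergy", "flu"]
mapping = fit_label_encoder(labels)
encoded = [mapping[label] for label in labels]
```

Here three distinct classes yield the integers zero, one and two, and repeated labels receive the same integer.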

3.4 ReLU Activation Function

ReLU stands for rectified linear unit. The function is defined as follows:

$$R\left( x \right) = \max \left( {0,x} \right)$$
(1)

The ReLU function helps mitigate the vanishing gradient problem and sets the negative part of its input to zero [12].

$$\Rightarrow y = \max \left( {0,o_{i} } \right)$$
(2)
$$\Rightarrow \arg \max f\left( x \right)$$
(3)
$$\Rightarrow i \in 1,2, \ldots ,N$$
(4)
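
Equation (1) translates directly into code. The following is a minimal sketch; the sample input values are arbitrary and chosen only to show that negative inputs are mapped to zero while positive inputs pass through unchanged.

```python
def relu(x):
    """Rectified linear unit: R(x) = max(0, x)."""
    return max(0.0, x)


# Arbitrary sample values for illustration.
outputs = [relu(v) for v in [-2.0, -0.5, 0.0, 1.5]]
```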

3.5 Softmax Activation Function

The Softmax activation function is related to the sigmoid function. A sigmoid clamps a value between zero and one, but this value does not by itself represent the probability of something happening. With a sigmoid, we take the sum of the weights times the previous outputs and plug it into the sigmoid function; this sum is the pre-activation value, i.e., the value before the activation function is applied. So, there is a need for an activation function that can deal with real probabilities. That is where the Softmax activation function comes in handy. It does not really have a proper graph [16].

The Softmax function equals the exponential of the element at position ‘k’ divided by the sum of the exponentials of all elements of the vector. It differs from other activation functions in that they take an input value and transform it regardless of the other elements, whereas Softmax considers information about the entire set of numbers. In this sense, Softmax is special because the output depends on the entire set of input elements. A key property of the Softmax transformation is that the output values lie in the range from 0 to 1 and sum to one [17]. The point of the Softmax transformation is to take arbitrarily large or small values coming out of previous layers and fit them into a valid probability distribution. This is so intuitive and useful that the Softmax activation function is often used as the activation of the final output layer in classification problems [12].

$$p_{k} = \frac{{\exp \left( {o_{k} } \right)}}{{\mathop \sum \nolimits_{{j = 0}}^{{n - 1}} \exp \left( {o_{j} } \right)}}$$
(5)
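
A minimal sketch of Eq. (5) in plain Python follows. Subtracting the maximum value before exponentiating is a standard numerical-stability step that is not part of the equation itself and does not change the result; the input scores are arbitrary illustration values.

```python
import math


def softmax(scores):
    """Convert a list of real-valued scores into a probability distribution.

    Implements p_k = exp(o_k) / sum_j exp(o_j). The shift by max(scores)
    is an added safeguard against overflow and cancels out in the ratio.
    """
    shift = max(scores)
    exps = [math.exp(v - shift) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Arbitrary scores, as might come out of a final layer.
probs = softmax([2.0, 1.0, 0.1])
```

Every output lies between 0 and 1, the outputs sum to one, and each output depends on all inputs, which are the properties discussed above.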

3.6 Keras

Keras is a widely used Python-based deep learning framework. It runs on top of TensorFlow and is simple to work with, as building models in Keras amounts to stacking layers and connecting the resulting graphs. It is an open-source project actively developed by contributors across the globe; its documentation is extensive, and new features are added frequently. It is designed to reduce cognitive load by keeping its APIs simple and consistent [12]. Keras provides clear feedback when an error occurs and minimizes the number of user actions required for the majority of common use cases. Keras also gives developers high flexibility by integrating with lower-level deep learning frameworks such as TensorFlow or Theano, so that anything that can be built in the underlying framework can be implemented. Keras supports multiple platforms and lets its users work with multiple back ends. Its API feels tailor-made for the underlying framework. The code can run on the CPU or the GPU. Producing models with Keras is effective because it supports TensorFlow Serving, GPU acceleration (for example, CUDA), development of Android and iOS apps using TensorFlow and Core ML, and deployment on the Raspberry Pi.

Working Principle

Features: Keras uses computational graphs to express complex expressions as combinations of simple operations. This is mainly useful for calculating derivatives during back-propagation, and it also makes distributed computation easier to implement [18]. All it takes is to specify the inputs and outputs and to make sure that the graph is connected throughout (Fig. 2).

Fig. 2 Data Pipelining Model

3.7 Sequential Model

The sequential model works like a linear stack of layers. It is mainly useful for building simple classification networks and encoder/decoder models. Here we treat every layer as an object that feeds into the next layer, and so on [19]. Features: model.fit() is used to train the network. Batch size is the number of training examples in one forward and backward pass, so the higher the batch size, the more memory is required.
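
The "linear stack of layers" idea and the notion of batch size can be sketched without any framework. The classes `TinyDense` and `TinySequential` and the helper `iter_batches` below are hypothetical stand-ins written for this illustration; they are not the Keras classes, and the weights are arbitrary.

```python
class TinyDense:
    """Hypothetical fully connected layer (not the Keras Dense class)."""

    def __init__(self, weights, biases):
        self.weights = weights  # one row of weights per output unit
        self.biases = biases

    def __call__(self, inputs):
        return [
            sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(self.weights, self.biases)
        ]


def relu_layer(values):
    """Apply ReLU element-wise, used here as a layer in the stack."""
    return [max(0.0, v) for v in values]


class TinySequential:
    """Hypothetical linear stack: each layer's output feeds the next layer."""

    def __init__(self, layers):
        self.layers = list(layers)

    def predict(self, inputs):
        for layer in self.layers:
            inputs = layer(inputs)
        return inputs


def iter_batches(samples, batch_size):
    """Yield consecutive slices of at most batch_size samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


model = TinySequential([
    TinyDense(weights=[[1.0, -1.0], [0.5, 0.5]], biases=[0.0, 0.0]),
    relu_layer,
])
prediction = model.predict([2.0, 3.0])
batches = list(iter_batches(list(range(10)), batch_size=4))
```

Ten samples with a batch size of four give three batches, the last one smaller; a larger batch size means more samples are held in memory per pass.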

Functional Model: It is a widely used model and holds good for about 95% of use cases. This model supports multi-input, multi-output and arbitrary graph topologies. It has branches, so a complex model can be split into two or more branches as required.

4 Features

  1. Build an interactive real-time chat system [20]

Chatbots give users a friendlier experience. It has been well documented that people prefer the sense of conversation and interaction that a Chatbot provides over pointing and clicking with a mouse. Whether or not it is something deeply rooted in our humanity that makes us anthropomorphize, the fact is that people are happier with a communication experience via a bot than with other interfaces. Otherwise, users have to search and hunt around a great deal, which is frustrating and tedious; with Chatbots, they get a natural-language interaction. This is much more comfortable for patients, as the interaction is refined and gives them a personalized, proactive experience. Through Chatbots, answers are obtained quickly and efficiently. Who has the time to be put on hold on the telephone or shuttled from one office to the next? None of us. We need the information as quickly as possible. A Chatbot saves time, freeing up patients for other activities [20]. There are two types of Chatbots: unintelligent ones that act using predefined conversation flows written by people, and intelligent AI Chatbots [21] that use machine learning.

  2. Made for all types of OS devices

The services [22] must be available at all times on any type of operating system. Hence, this app will be built on the Flask/Django web framework, so that it can be used on computers and mobile devices running platforms such as macOS, iOS, Android and Windows, allowing more and more people to connect to the application and benefit from it.

  3. Effective symptom-based disease prediction

Each disease eventually shows its own characteristic symptoms, and it may turn life-threatening if not treated or diagnosed early; the most common diseases can be readily identified by analyzing the symptoms. The symptoms can be anything, such as headache or itching. By reading and analyzing the symptoms, a possible health problem can be predicted [23]. If a person’s body is examined periodically, it is possible to predict [24] potential problems even before they start to cause any damage to the body.

  4. Easily integrable and updatable

The system should be integrated [24], which means it is composed of many individual modules, each performing a specific task, so that they can be upgraded individually. This helps increase the efficiency of the system and, in turn, improves its predictions.

5 Result-Generated Methodology

Initially, we created a custom dataset by consulting various healthcare practitioners, whom we asked about commonly occurring illnesses and diseases. We then feature engineered our data, since one of the challenges in machine learning is selecting the features that are most appropriate for the type of algorithm being modeled.

Figure 3 gives a graphical representation of the dataset, showing how it is mapped to identify a symptom with the help of its ID. It plots the symptom IDs (i.e., 1, 2 and 3) on the Y-axis against the Disease ID on the X-axis.

Fig. 3 Scatter plot representation of symptom ID w.r.t. Disease ID

Many times the existing features are not enough for the type of algorithm being modeled, which calls for engineering new features with which to train the machine learning model; this is called feature engineering (FE). We then label encoded our mapping of symptoms to diseases [Sect. 3.3]. Once the data were label encoded, we implemented the recurrent neural network model. A recurrent neural network [Sect. 3.2] is a type of neural network, and it requires data to learn; in general, the more data provided, the better the accuracy. For the RNN, we used the Keras framework [Sect. 3.6], which runs on top of TensorFlow. In this RNN model, we used the ReLU and Softmax activation functions. An activation function is attached to each neuron in the network and determines whether it should be activated (‘fired’) or not, based on whether the neuron’s input is relevant to the model’s prediction. We also used the sequential model for the RNN, as it is suited to classification problems; here every layer is treated as an object that feeds into the next layer, and so on. Once the symptoms were mapped to a disease, the corresponding precautions and descriptions were displayed. Then, using a JSON file, we implemented a chat function to engage with the user. With this, the Chatbot calculates the similarity between the trained text sequences and the user’s input. We integrated our chat application with Streamlit, an open-source framework for building applications around machine learning and deep learning models, to make it usable by real-world users.
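
The similarity step can be illustrated with a small sketch. The text does not specify which similarity measure is used, so cosine similarity over word counts is an assumption made here for illustration; the function names and the example patterns are likewise hypothetical and not drawn from our JSON file.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts using word-count vectors.

    This measure is an assumption of the sketch, not necessarily the one
    used by the Health-Bot.
    """
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    dot = sum(counts_a[word] * counts_b[word] for word in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text is not similar to anything
    return dot / (norm_a * norm_b)


def best_match(user_input, patterns):
    """Return the stored pattern most similar to the user's input."""
    return max(patterns, key=lambda p: cosine_similarity(user_input, p))


# Hypothetical patterns standing in for entries of the chat JSON file.
patterns = ["i have a headache", "my skin is itching", "book an appointment"]
match = best_match("i have a bad headache", patterns)
```

The pattern with the highest score would then be used to select the reply shown to the user.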

6 Conclusion

In this paper, we have successfully implemented an effective Health-Bot. We implemented a train/test split using sklearn’s model selection module, where 20% of the data was held out for testing and the remaining 80% was used for training our model. We then implemented cross-validation, a resampling procedure used to evaluate a machine learning model on a limited data sample, using the K-fold method, where k represents the number of folds. Further, we implemented the sequential model with the Keras framework. Given people’s present needs, the prospects for Chatbots, and Health-Bots specifically, are very high in the coming years, and at the rate this industry is growing, such bots are likely to become a part of more people’s lives. There is also scope for Chatbots to be offered in different regional languages for people who do not know English and are more comfortable in their regional tongue, so that they can be answered in their mother tongue. There is still much room to improve quality through generalization of the chat templates, by clustering similar topics and grouping similar replies, and through better coherence among consecutive chat replies, by understanding the styles of replies. Also, considering how far our datasets can be expanded to meet people’s needs, we plan to improve the quality of data extraction, the method of selecting varied diseases and the extraction of suitable precautions.
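
For readers unfamiliar with the evaluation procedure mentioned above, the 80/20 split and the K-fold scheme can be sketched in plain Python. This is an illustration of the index bookkeeping only, not the sklearn implementation; the function names, the fixed seed and the sample counts are assumptions.

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and hold out a fraction for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]  # (train, test)


def k_fold_indices(n_samples, k):
    """Split indices into k folds; each fold is the validation set once."""
    indices = list(range(n_samples))
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        folds.append((training, validation))
        start += size
    return folds


train_idx, test_idx = train_test_split_indices(100, test_fraction=0.2)
folds = k_fold_indices(10, k=5)
```

With 100 samples, 80 indices go to training and 20 to testing; with k = 5, every sample appears in exactly one validation fold.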