1 Introduction

Epilepsy is a disorder of the central nervous system that affects the human brain [1]. It arises when the brain stops functioning normally and enters an abnormal condition. Around 50 million individuals worldwide are estimated to live with epilepsy, which makes it a global disease. According to the WHO, up to 70% of people with epilepsy could become seizure-free if they are properly diagnosed and treated, and predicting the illness at an early stage can reduce the premature death rate among patients [2]. The abnormal condition gives rise to seizures, which appear as unusual behavior in human activity. Sometimes the affected person remains aware, but in most cases awareness is lost. The death rate during seizures is high, so predicting a seizure before it occurs is of considerable importance: it alerts the patient and avoids the risk of losing a life. A seizure is defined as a sudden alteration in a person's behavior caused by temporary changes in the electrical activity of the brain. In practice, the brain continuously generates tiny electrical impulses in an orderly pattern. When there is a sudden burst, this activity produces unexpected electrical energy, leading to seizure onset. Early prediction of seizures can therefore save patients' lives. The most common medical test used to diagnose epilepsy is the electroencephalogram (EEG) [3]. It is an effective diagnostic tool for studying the functional anatomy of the brain during epileptic seizures. EEG signals are recorded to monitor brain function and investigate internal brain activity.

There are two ways of monitoring EEG signals. The first places electrodes directly on the patient's scalp [4] and records the signals; this is known as scalp EEG. Scalp EEG is the traditional way of investigating brain disorders, and the test causes no discomfort or harm. It also carries the dynamical information that has been found useful in investigating the pre-ictal state of seizures [5]. The second requires electrodes to be implanted inside the brain during brain surgery; the resulting recordings are known as intracranial EEG (iEEG) signals. iEEG performs well in the ictal state, giving high specificity with good sensitivity. Compared with scalp EEG, iEEG gives more accurate results in terms of signal-to-noise ratio and frequency band. Depending on the frequency band, unwanted waveforms typically form, which may increase the false prediction rate during the inter-ictal state [6]. EEG is also used for other brain disorders, such as Alzheimer's disease [7] and tumors and encephalitis [8], to check and record the brain's electrical activity and the changes caused by a burst of abnormal electrical charges. Datasets of long-term intracranial EEG signals are likewise used in numerous ongoing studies [9]. For the last few decades, neuroscientists assumed that seizures develop only a few seconds or a few minutes before the epileptic attack. As research went deeper, however, it emerged that seizures can begin to develop minutes or even hours before the attack. It is therefore recommended to record and observe brain activity using EEG [10] [11].

1.1 Contribution of the paper

Several researchers have previously carried out exhaustive research on epileptic seizures. This review paper makes the following contributions.

  • It gives a clear account of the basics of epileptic attacks, their stages, and their types.

  • It compares the accuracy and sensitivity of the different methodologies and classifiers used in epileptic seizure prediction.

  • It surveys the available sources of input datasets used for epileptic seizure research.

The manuscript is organized as follows. Section I gives a detailed introduction to epileptic seizures, covering the contributions and seizure occurrence. Section II details the types of seizure and seizure characteristics, followed by a review of recent developments and technological enhancements in section III. The datasets and inputs are discussed in section IV, with the supporting methodologies and resources in section V. The supporting learning models and decision-support approaches are discussed in section VI, followed by a concluding summary of the literature findings in section VII.

2 Epileptic seizures and representation

2.1 Stages of epileptic seizures

The brain's function depends on electrical impulses, which enable communication with the spinal cord. However, a sudden abnormal discharge of electrical charges can cause seizures, leading to changes in sensation and behavior and making the person act abnormally. This sudden change in the nervous system disrupts normal brain function.

Epileptic seizures can be categorized into four stages: pre-ictal, ictal, post-ictal, and inter-ictal [13]. The pre-ictal stage occurs before the seizure attack, the ictal stage is when the actual seizure happens, and the post-ictal stage follows the seizure. The inter-ictal period is the time between seizures. Predicting seizures before they occur is crucial, making the pre-ictal stage particularly important for research [14]. However, false predictions can also occur. Figure 1 illustrates the different stages of seizures [12]. At the beginning of the pre-ictal state, patients may experience symptoms like mood changes, anxiety, and feeling dull or irritated without reason, which can indicate an impending seizure. Models have been designed to obtain raw EEG signals from various sources, apply noise removal techniques to enhance the required signal, and then extract and reduce features based on CSP and LDA for scalp EEG signals [15]. The signal is then classified to determine if it is affected by seizures or is in a normal condition. Several classifiers are used for this purpose, with a few discussed in the literature section of this paper. The main goal of classification is to accurately detect true positive epileptic seizures. False predictions must be minimized to ensure high sensitivity and accuracy.
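The four stages can be made concrete by labeling each time point relative to annotated seizure onsets and offsets. The following is a minimal Python sketch, assuming a 30 min pre-ictal horizon and a 10 min post-ictal window. Both durations and the function name `label_stage` are illustrative assumptions, not values taken from the reviewed papers.

```python
from typing import List, Tuple


def label_stage(
    t: float,
    seizures: List[Tuple[float, float]],
    preictal_s: float = 1800.0,   # assumed 30 min pre-ictal horizon
    postictal_s: float = 600.0,   # assumed 10 min post-ictal window
) -> str:
    """Return the seizure stage at time t (seconds).

    `seizures` holds (onset, offset) pairs in seconds. When windows
    overlap, ictal takes priority, then pre-ictal, then post-ictal.
    """
    for onset, offset in seizures:
        if onset <= t < offset:
            return "ictal"
    for onset, _ in seizures:
        if onset - preictal_s <= t < onset:
            return "pre-ictal"
    for _, offset in seizures:
        if offset <= t < offset + postictal_s:
            return "post-ictal"
    return "inter-ictal"


# Hypothetical recording with one seizure from 3600 s to 3660 s.
annotations = [(3600.0, 3660.0)]
for t in (0.0, 2000.0, 3600.0, 3660.0, 5000.0):
    print(t, label_stage(t, annotations))
```

In a prediction setting, only the pre-ictal and inter-ictal labels are usually passed to the classifier, so the choice of pre-ictal horizon directly shapes the training set.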

Fig. 1 Stages of epileptic seizures [12]

2.2 Types of seizures

The brain is subject to disorders that can cause long-term damage. Epilepsy is one of them: it persists over the long term, while each episode lasts only a short time. Seizures are bursts of abnormal electrical activity in the brain. They can be provoked or unprovoked. The type of epilepsy is determined by the type of seizure, and a person may suffer from a single seizure or multiple seizures. Depending on which part of the brain is affected and how much of it, seizures are classified [16] into two types: partial and generalized. Partial seizures are also known as focal seizures. In a partial seizure, only a part of the brain is affected by the attack, although it may spread from one area of the brain to another. Clapping, chewing, and lip biting are some of the common symptoms.

In generalized seizures, the entire brain is affected, leading to a high risk of loss of awareness [17]. Both the right and left sides of the brain are involved, with symptoms such as jerking, rigid muscles, or muscle twitching. It is observed that children experience generalized seizures more frequently than adults. In partial seizures, there is no loss of consciousness, whereas in generalized seizures, unconsciousness occurs, with or without an aura. In clonic seizures, the upper limbs, face, and neck are affected, causing contractions. In tonic seizures, rigidity occurs, often resulting in an immediate fall. Figure 2 provides a clear classification of seizures.

Fig. 2 Types of seizures

3 Latest literature and developments

Research on epileptic seizure prediction began decades ago. It is difficult to predict exactly when an attack will happen, but many researchers have worked toward earlier prediction of seizures, and the reported prediction times differ between studies. Many have used machine learning and deep learning methodologies to forecast seizures as early as possible. Alongside these methods, researchers have started to combine them with newer methodologies such as fuzzy neural networks. In [12], the authors proposed an EEG model with 92.23% sensitivity and an average prediction time of 23.6 min before seizure onset, with a maximum of 33.46 min.

The paper tackles two main challenges in epileptic seizure prediction: removing noise from the raw EEG signal and extracting features. The aim was to predict the pre-ictal state before the actual seizure occurs. To that end, the signal-to-noise ratio is improved in the preprocessing step, where empirical mode decomposition (EMD) is applied to a surrogate channel. Both time-domain and frequency-domain methods were used: spectral features are obtained in the frequency domain and statistical features in the time domain. A support vector machine (SVM) is used as the classifier because of its superior sensitivity. The data are sampled at 256 Hz and were taken from the CHB-MIT dataset. The main drawback was a higher false prediction rate.

In [15], the focus was on prediction time, with priority given to feature extraction and dimension reduction. Common spatial pattern (CSP) was used for both feature extraction and dimension reduction. The authors used three prediction intervals of 60, 90, and 120 min with interval lengths of 3, 5, and 10 min. The best prediction performance was obtained with a 3 min pre-ictal size. An LDA classifier was used to differentiate between the pre-ictal and inter-ictal states. The datasets used are from CHB-MIT. Although the average sensitivity achieved by the model is high, the system detects the pre-ictal state only just before the seizure occurs. This is a disadvantage that must be overcome by a model that predicts epileptic seizures well before they occur.

In [18], the authors proposed a system that predicts epileptic seizures from the probability distribution of selected positive zero-crossing intervals recorded from EEG signals. The data are first divided into non-overlapping epochs of 15 s each. For each epoch, a histogram is constructed whose bins contain the positive zero-crossing intervals. A Bayesian Gaussian mixture model (GMM) was used as the classifier to differentiate bins of the inter-ictal and pre-ictal states. The method was tested on 20 patients from the Vancouver General Hospital datasets. A sensitivity of 88.34% was obtained, with an average prediction time of 22.5 min and a false prediction rate of 0.155/hour. The datasets consisted of short, discontinuous recordings; future work should use long, continuous recordings so that seizures can be predicted before they occur. Therefore, the model was not suitable for patients with a low number of seizures.
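The epoch and histogram steps described for [18] can be sketched as follows. This is an illustrative reading of the description, not the authors' implementation: the definition of a positive zero crossing (a sample pair going from non-positive to positive), the bin edges, and all function names are assumptions.

```python
from typing import List


def split_epochs(x: List[float], fs: float, epoch_s: float = 15.0) -> List[List[float]]:
    """Split a signal into non-overlapping epochs of epoch_s seconds."""
    n = int(fs * epoch_s)
    return [x[i:i + n] for i in range(0, len(x) - n + 1, n)]


def positive_zero_crossings(x: List[float]) -> List[int]:
    """Indices where the signal goes from non-positive to positive."""
    return [i for i in range(1, len(x)) if x[i - 1] <= 0.0 < x[i]]


def crossing_interval_histogram(
    x: List[float], fs: float, bin_edges: List[float]
) -> List[float]:
    """Normalized histogram of the time (s) between successive positive zero crossings."""
    idx = positive_zero_crossings(x)
    intervals = [(b - a) / fs for a, b in zip(idx, idx[1:])]
    counts = [0] * (len(bin_edges) - 1)
    for d in intervals:
        for k in range(len(counts)):
            if bin_edges[k] <= d < bin_edges[k + 1]:
                counts[k] += 1
                break
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


# Synthetic square-like signal: one positive crossing every 4 samples.
fs = 4.0
signal = [-1.0, -1.0, 1.0, 1.0] * 30          # 30 s at 4 Hz
edges = [0.0, 0.5, 1.5, 3.0]                  # hypothetical bin edges in seconds
for epoch in split_epochs(signal, fs):
    print(crossing_interval_histogram(epoch, fs, edges))
```

Each epoch's histogram is the feature vector that a classifier such as the GMM mentioned above would receive.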

In [19], the objective of this paper was to accentuate the primary advances in employing machine learning methods for epileptic seizures prediction. The survey provides a comprehensive answer to why there is a need for machine learning techniques for epileptic seizures and even how to involve deep learning techniques. A brief introduction to neuroscience is given, and various tools used for studying the brain and how they could be used for prediction are discussed. It is important to detect epileptic seizures as early as possible to save the patient's life. The review paper provided insights by considering the aspects of feature selection, prediction techniques, and evaluation methodologies. The development of an optimal strategy is still an open research problem. The cost of data sharing and management is huge and leads to new challenges of data integrity, availability, and privacy.

In [20], the researchers gave brief details of the types of seizures (partial and generalized). Epilepsy is a brain disorder that can be predicted using EEG signals, but these signals are complex, nonlinear, non-stationary, and high in volume, which makes it challenging to obtain true positive predictions of epileptic seizures. The paper concentrates mainly on machine learning methodologies. The EEG dataset sources it describes are Children's Hospital Boston and Massachusetts Institute of Technology (CHB-MIT), the electrocorticography (ECoG) dataset of the Epilepsy Center, University of California, Freiburg, Bonn University, and the Bern-Barcelona EEG datasets. Different classifiers were compared and studied in detail, and the best was selected for the designed model. Among them, the non-black-box classifiers (decision forest and random forest) are reported as the most effective, because they expose logic rules that are easily understood by humans. With a suitable classifier, high accuracy can be obtained. The cost of the model depends on the classifier, so choosing a relevant classifier is necessary. The datasets used for prediction remain a challenge for researchers.

In [14], a deep learning methodology was used for the proposed system, with datasets from CHB-MIT. The model has an average sensitivity of 92.7% and an average specificity of 90.8%, with an average time of 21 min. The EEG signals are first preprocessed with filters to remove unwanted signals. Feature extraction is then done automatically using a convolutional neural network (CNN). A support vector machine (SVM) is used as the classifier to differentiate between patients with seizures and normal patients without seizures. A model's performance is acceptable when the false-positive alarm rate is low and the sensitivity is high. This paper reports high accuracy of true positive alarms, making it an effective approach to epileptic seizure prediction.

In [21], the researcher has included epileptic seizure diagnoses and the prediction. A unified framework is designed in this paper for the early detection of epileptic seizures. There are two main phases in this model. Firstly, the EEG database is preprocessed by calculating signal intensity for each point of EEG signals. An autoregressive moving average (ARMA) model is used with a common null hypothesis testing for decision-making in seizure detection results. In the second phase, pattern recognition techniques are used to classify the suspicious EEG segments. A novel classifier based on a pairwise one-class SVM is used [21]. This proposed model is more robust and classifies other brain disorders that are not included in the training samples. The average time taken is 31.7 min, and the average Accuracy is 94%, with a specificity of 97.9%. The datasets used in this paper are taken from Bern Barcelona [22] and the CHB-MIT EEG database.

4 Inputs and datasets

EEG recording has become a common and important step in acquiring brain signals to detect and predict brain disease. Various research centers and hospitals publish open-access datasets, and EEG signals with and without epileptic seizures are available for research purposes. EEG signals can be recorded in two ways: by placing multiple electrodes on the patient's scalp, or intracranially, by placing electrodes within the brain during surgery [23]. A few of the available datasets used in epileptic seizure research are discussed below.

4.1 American epilepsy society datasets

As the name states, this dataset comes from an American society and was prepared for recording EEG for epileptic seizure research, in collaboration with the University of Freiburg. The recordings are intracranial EEG: the electrodes are implanted in the brain tissue during brain surgery. The dataset contains both human and animal (dog) brain recordings. The data are recorded with 16 electrodes at different sampling frequencies, 5000 Hz for humans and 400 Hz for dogs, and are saved as .mat files [24].

4.2 CHB-MIT dataset

Children's Hospital Boston and the Massachusetts Institute of Technology provide this dataset online. It contains scalp EEG recorded with 23 electrodes placed on the scalp. The recordings are saved in EDF (European Data Format), which can be converted to .mat files in MATLAB. The dataset contains the following files: chbmit_ictal_raw_data.csv, chbmit_preictal_raw_data.csv, chbmit_ictal_23channels_data.csv, chbmit_preictal_23channel_data.csv, and chbmit_preprocessed_data.csv. These files cover different numbers of patients and different channels [25]. All the scalp EEG recordings are sampled at 256 Hz, and the 22 subjects include both male and female patients.

4.3 Temple university hospital datasets

These datasets are publicly available to researchers who wish to build machine learning and deep learning models. The TUH EEG Seizure Corpus [26] contains EEG signals with carefully, manually annotated seizure data. The data sets can be found using the link given below. TUH EEG datasets are available as EDF files that can be used with MATLAB and Python [26].

4.4 Kaggle

Kaggle is an online platform from which epileptic seizure datasets can be obtained for machine learning and deep learning research. It provides datasets for many research areas. Most are free of charge, though a few services are paid, so researchers working on current topics have good opportunities to use them. Table 1 shows the values obtained using different datasets.

Table 1 Input datasets used with different classifiers to calculate the performance parameters

5 Methods and resources

5.1 EEG signal analysis technique

Hans Berger introduced EEG in 1923 as a non-invasive functional imaging methodology for studying brain activity. EEG has a lower spatial resolution than MRI but provides higher temporal insight into neural networks. Usually, five frequency bands are used in processing EEG signals:

  • Delta (up to 4 Hz): usually found in infants and in sleeping adults.

  • Theta (4-8 Hz): seen in awake adults, with high values during abnormal activity.

  • Alpha (8-12 Hz): seen in the posterior region of the brain in a normal, relaxed condition.

  • Beta (12-26 Hz): occurs in the frontal region of the brain when a person is anxious.

  • Gamma (26-100 Hz): may appear when the person is over-stressed or happy.
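The band limits listed above can be turned into a simple band-power feature. The sketch below uses a direct discrete Fourier transform from the Python standard library so that it is self-contained. In practice an FFT routine from a numerical library would be used instead. The 0.5 Hz lower edge for delta and the function name `band_powers` are assumptions made for illustration.

```python
import cmath
import math
from typing import Dict, List

# Band edges in Hz, taken from the list above; the 0.5 Hz lower
# edge of delta is an assumption that excludes the DC component.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 12.0),
    "beta": (12.0, 26.0),
    "gamma": (26.0, 100.0),
}


def band_powers(x: List[float], fs: float) -> Dict[str, float]:
    """Sum periodogram power per EEG band using a direct O(n^2) DFT."""
    n = len(x)
    mean = sum(x) / n
    powers = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        coeff = sum(
            (x[t] - mean) * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        power = abs(coeff) ** 2 / n
        for name, (lo, hi) in BANDS.items():
            if lo <= freq < hi:
                powers[name] += power
    return powers


# One second of a synthetic 10 Hz sine sampled at 256 Hz.
fs = 256.0
x = [math.sin(2 * math.pi * 10 * t / fs) for t in range(256)]
p = band_powers(x, fs)
print(max(p, key=p.get))  # the dominant band for this test signal
```

Relative band power, each band divided by the total, is often preferred over absolute power because it is less sensitive to differences in electrode impedance between recordings.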

Researchers have categorized EEG analysis methods into four groups: time domain, frequency domain, time–frequency domain, and nonlinear methods. Newer techniques, including deep neural networks (DNNs), have also been used [27]. Time-domain method: EEG recordings are non-stationary, nonlinear functions of time. The output is predicted from the present input and previous outputs. The most commonly used unsupervised time-domain methods for EEG data operate on the dimensions of the feature vector. Principal component analysis (PCA) [28] transforms high-dimensional data into low-dimensional data, and independent component analysis (ICA) transforms high-dimensional data into statistically independent components [29].
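As an illustration of the dimension reduction that PCA performs, the sketch below finds the first principal component of a small feature matrix by power iteration on the covariance matrix and projects the data onto it. Power iteration is used here only to keep the example dependency-free; a full eigendecomposition or SVD from a numerical library is the usual choice. The function names and the toy data are hypothetical.

```python
import math
from typing import List


def _center(rows: List[List[float]]) -> List[List[float]]:
    """Subtract the column means from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def first_principal_component(rows: List[List[float]], iters: int = 100) -> List[float]:
    """Unit vector along the direction of largest variance (power iteration)."""
    centered = _center(rows)
    n, d = len(rows), len(rows[0])
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    v = [1.0 / math.sqrt(d)] * d  # start vector; assumed not orthogonal to the answer
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def project(rows: List[List[float]], v: List[float]) -> List[float]:
    """Reduce each row to a single score along v."""
    return [sum(a * b for a, b in zip(c, v)) for c in _center(rows)]


# Toy two-feature data lying on the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
pc = first_principal_component(data)
print(pc, project(data, pc))
```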

Frequency domain: epileptic seizures occur with sudden changes in frequency content, which can be measured with frequency-domain methods such as the Fourier transform (FT) [30]. The power spectrum can be estimated with parametric or non-parametric methods. Time–frequency domain: time-domain and frequency-domain methods [31] are limited in extracting features at a particular time instant. Time–frequency methods overcome this; the most commonly used is the wavelet transform [32]. Nonlinear method: coupling among harmonics may be present in the spectrum of the signal, and nonlinear analysis is used to detect it. Entropy and the largest Lyapunov exponent (LLE) are widely used as features for epilepsy classification [33].
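Entropy can be estimated in several ways. One of the simplest is the Shannon entropy of the amplitude histogram of a signal segment, sketched below. The choice of estimator, the default number of bins, and the function name are assumptions for illustration; the reviewed papers may use other entropy definitions, such as sample or permutation entropy.

```python
import math
from typing import List


def shannon_entropy(x: List[float], n_bins: int = 8) -> float:
    """Shannon entropy (bits) of the amplitude histogram of x."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return 0.0  # a constant segment carries no amplitude uncertainty
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in x:
        k = min(int((v - lo) / width), n_bins - 1)  # clamp the maximum into the last bin
        counts[k] += 1
    n = len(x)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


print(shannon_entropy([0.0, 1.0, 2.0, 3.0], n_bins=4))  # evenly spread amplitudes
print(shannon_entropy([1.0] * 8, n_bins=4))             # constant segment
```

A segment whose amplitudes spread evenly over the bins gives the maximum value, log2 of the number of bins, while a flat segment gives zero.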

5.2 Performance evaluations

Predicting epileptic seizures in time is an open challenge for researchers, and once a prediction is made, its true positive rate must be evaluated, so every model has to undergo a performance check. Of the several formulas used for this evaluation, the most useful are given below. Accuracy is considered the most important performance parameter. It is the ratio of correct predictions to all predictions (true and false). The false prediction rate of the designed model must always be low. A false prediction arises in two situations: the system predicts an attack when no seizure occurs, or a seizure occurs but the model fails to predict it.

Any false prediction affects the accuracy of the designed model. The formulas for accuracy [34], sensitivity, specificity, and false prediction rate are given below.

$$\mathrm{Accuracy}=\frac{True\;Positive+True\;Negative}{True\;Positive+True\;Negative+False\;Positive+False\;Negative}$$
(1)
$$\mathrm{Sensitivity}=\frac{True\;Positive}{True\;Positive+False\;Negative}$$
(2)
$$\mathrm{Specificity}=\frac{True\;Negative}{True\;Negative+False\;Positive}$$
(3)
$$\mathrm{False}\;\mathrm{prediction}\;\mathrm{rate}=\frac{False\;Positive}{True\;Negative+False\;Positive}$$
(4)

Here, true positive means that an ES attack occurred and the model predicted it. True negative means that there was no ES attack and the model predicted none. False positive means that there was no attack but the model predicted one. False negative means that there was an ES attack but the model did not predict it. Table 2 gives a clear overview of the model's prediction of epileptic seizures.
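Equations (1) to (4) translate directly into code. The following sketch computes all four measures from the confusion counts. The counts in the example are hypothetical and serve only to show the arithmetic, and the function assumes that each denominator is non-zero.

```python
from typing import Dict


def prediction_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Evaluate Eqs. (1)-(4) from true/false positive and negative counts.

    Assumes tp + fn > 0 and tn + fp > 0, so that no denominator is zero.
    """
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),      # Eq. (1)
        "sensitivity": tp / (tp + fn),                    # Eq. (2)
        "specificity": tn / (tn + fp),                    # Eq. (3)
        "false_prediction_rate": fp / (tn + fp),          # Eq. (4)
    }


# Hypothetical counts: 50 pre-ictal segments and 100 inter-ictal segments.
for name, value in prediction_metrics(tp=45, tn=90, fp=10, fn=5).items():
    print(f"{name}: {value:.3f}")
```

By Eqs. (3) and (4), specificity and the false prediction rate always sum to one, so reporting either one determines the other.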

Table 2 Conditions for prediction of seizures

A person with epileptic seizure disorder is continuously monitored using high-end technologies. When the brain signal deviates from its usual pattern, the system flags a possible seizure attack and forwards the data to the model designed to predict epileptic seizures. If the model confirms that an epileptic seizure is arising, immediate action is taken: the information is passed to the patient's guardian or doctor so that measures can be taken and the risk of an emergency avoided. If the disturbance in the signal is just unwanted noise or an imbalance, the event may be non-epileptic, and the patient need not be alarmed by such conditions. The model is designed so that prediction is accurate and the false prediction rate is negligible. A system's overall accuracy and sensitivity depend on the true prediction of epileptic seizures. The system must work on real-time data with a notion of time, as shown in Fig. 3.

Fig. 3 Working algorithm process for detection of seizure

6 Learning models

A model can be designed using deep learning or machine learning, and the major difference between the two approaches is illustrated below. Predicting epileptic seizures ahead of time, so that patients can be protected from risk, remains an ongoing research challenge. Although there is extensive literature on predicting and evaluating seizures, earlier prediction is still in demand. Traditionally, machine learning algorithms and models were used in many research papers to predict an attack. The models are shown in Figs. 4 and 5.

Fig. 4 A model for machine learning

Fig. 5 Model for deep learning

6.1 Signal processing

The model consists of different blocks, as shown in Fig. 6. The collected data first undergo preprocessing, because the quality of the input EEG signal is important in forecasting epilepsy. The EEG signal must be strong and continuous, with a long recording time. EEG signals are so sensitive that artifacts are added by a slight movement of the eyeballs or a blink, and even by the heartbeat. Other muscular activity also disturbs the signal. For this reason, a preprocessing stage eliminates unwanted noise and artifacts from the raw EEG signal. Different types of filters are used in this stage to obtain the desired part of the signal.
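Two elementary preprocessing steps are sketched below: removing slow baseline drift by subtracting a moving average, and discarding epochs whose peak amplitude suggests an artifact such as an eye blink. These are simple stand-ins for the band-pass and notch filters normally taken from a signal-processing library, and the window length and amplitude threshold are hypothetical values.

```python
from typing import List


def remove_baseline(x: List[float], window: int) -> List[float]:
    """Subtract a centered moving average (a crude high-pass filter).

    Near the edges the averaging window is truncated to the available samples.
    """
    half = window // 2
    out = []
    for i in range(len(x)):
        lo = max(0, i - half)
        hi = min(len(x), i + half + 1)
        baseline = sum(x[lo:hi]) / (hi - lo)
        out.append(x[i] - baseline)
    return out


def reject_artifacts(epochs: List[List[float]], max_abs: float) -> List[List[float]]:
    """Keep only the epochs whose peak absolute amplitude is within max_abs."""
    return [e for e in epochs if max(abs(v) for v in e) <= max_abs]


# A constant offset is removed entirely by the baseline step.
print(remove_baseline([5.0] * 10, window=5))

# The second epoch exceeds the assumed 100-unit threshold and is dropped.
print(reject_artifacts([[1.0, -2.0], [300.0, 0.0]], max_abs=100.0))
```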

Fig. 6 Process diagram of signal processing

6.2 Features extraction and selection

The desired EEG signal obtained after filtering is sent to the next stage, feature extraction and selection; the basic approaches are shown in Fig. 7. To make a prediction, the features must be predefined and analyzed so that a decision can be taken. Features are chosen based on the EEG signals and channels used, and can be either multivariate or univariate. Multivariate features combine two or more EEG channels in the computation, whereas univariate features are estimated separately for every channel. Both are further sub-classified into linear and nonlinear features. Feature extraction is done in the time domain, the frequency domain, and the time–frequency domain. The main purpose of feature selection is to reduce the dimensions and so simplify the problem. Principal component analysis (PCA) and independent component analysis (ICA) are used to select the desired features.
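The difference between univariate and multivariate features can be shown with a short sketch. The univariate function below computes three time-domain features for a single channel, and the multivariate function computes the Pearson correlation between two channels. The specific features chosen (mean, variance, line length, correlation) are common examples picked for illustration, not the feature set of any particular reviewed paper.

```python
import math
from typing import Dict, List


def univariate_features(channel: List[float]) -> Dict[str, float]:
    """Time-domain features computed from one channel on its own."""
    n = len(channel)
    mean = sum(channel) / n
    variance = sum((v - mean) ** 2 for v in channel) / n  # population variance
    line_length = sum(abs(b - a) for a, b in zip(channel, channel[1:]))
    return {"mean": mean, "variance": variance, "line_length": line_length}


def correlation(a: List[float], b: List[float]) -> float:
    """Multivariate feature: Pearson correlation between two channels."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if sd_a == 0.0 or sd_b == 0.0:
        return 0.0  # correlation is undefined for a constant channel
    return cov / (sd_a * sd_b)


# Two hypothetical channels of four samples each.
ch1 = [1.0, 2.0, 3.0, 4.0]
ch2 = [2.0, 4.0, 6.0, 8.0]
print(univariate_features(ch1))
print(correlation(ch1, ch2))
```

With C channels, univariate extraction yields C feature sets, while pairwise multivariate features grow as C(C-1)/2, which is one reason dimension reduction follows this stage.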

Fig. 7 Features extraction and features selection

6.3 Classification

Machine learning algorithms are used to categorize the pre-ictal, ictal, and post-ictal stages of EEG signals in epileptic seizures. Once the signals have passed through all the preceding blocks, they are classified to decide whether they are seizure or non-seizure signals; the decision-making is done at this stage. Support vector machine (SVM), decision tree, fuzzy logic, and k-means clustering are among the machine learning algorithms used as classifiers. A graphical representation of the accuracy, sensitivity, and specificity reported in the reviewed papers is shown in Fig. 8. In [42], the author used a decision tree classifier and empirical mode decomposition to obtain the desired outputs. In 2017, the researchers in [43] used the CHB-MIT datasets, with long hours of EEG signals in the time–frequency plane, and addressed oversampling and undersampling in classification.
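Of the algorithms named above, k-means clustering is the simplest to show in full. The sketch below groups feature vectors into two clusters, which could stand for seizure and non-seizure segments. It is unsupervised, so the clusters still have to be matched to labels afterwards. Initializing the centroids with the first k points is an assumption made to keep the example deterministic, and the feature vectors are hypothetical.

```python
from typing import List, Sequence


def kmeans(points: List[Sequence[float]], k: int = 2, iters: int = 50) -> List[int]:
    """Assign each point to one of k clusters and return the cluster indices.

    Centroids start at the first k points (a deterministic, assumed choice).
    """
    centroids = [list(p) for p in points[:k]]
    labels = [-1] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # converged: assignments no longer change
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical two-dimensional feature vectors forming two groups.
features = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
print(kmeans(features, k=2))
```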

Fig. 8 Graphical representation of outputs in different years

In 2022, Sameer et al. concentrated only on the delta band [44], in the frequency range from 0.5 to 4 Hz, to detect epileptic attacks with a random forest classifier on the Bonn University datasets. In [45], normal and pre-ictal states are classified using a 13-layer deep convolutional neural network (DCNN) with a tenfold cross-validation strategy; it is considered the first model to use a 13-layer DCNN. In [46], epilepsy prediction used a computer-aided diagnosis (CAD) tool in MATLAB to extract linear features and nonlinear features, such as mobility and complexity, from EEG signals, with a decision tree as the classifier. Patients may also suffer epileptic attacks while sleeping, which can carry a risk of death.

Research based on a deep learning approach addressed seizures during sleep, using wavelet energy in the time–frequency domain with bi-directional long short-term memory (Bi-LSTM) networks [47]; the resulting model achieved an accuracy of 99.47% and also avoids false predictions [47]. That research was carried out only for the delta band, and the outputs are represented graphically. The overall comparison is listed briefly in Table 3, along with the years and the evolving technology. Even so, predicting true positive seizure attacks before they begin and become a risk remains a challenging, ongoing research problem.

Table 3 Comparisons of methodologies

7 Conclusion

This research-cum-survey article aims to provide an understanding of epileptic seizures, from the basics to the essential stages required for research. It gives a clear idea of the different stages and types of epileptic seizures. The classification, categorization, and prediction of epileptic seizures are enhanced by machine learning and deep learning models. The dependence of the models on the relevant datasets is discussed and highlighted in this review article. The authors have also provided various analytic ratios and metrics to justify the reliability of decision-making. In the near future, the study can be extended to generative AI datasets produced through augmentation techniques for medical patterns.