Introduction

Nanofluids (NFs) are prepared by dispersing nanoparticles (NPs), nanotubes, or metallic and nonmetallic nanosheets in a base fluid (BF). Dispersing NPs in the BF can improve its thermophysical properties [1,2,3,4,5]. Among these properties, thermal conductivity (TC) and viscosity are of particular importance: adding NPs to the BF increases both. A high TC coefficient improves heat transfer, whereas, in most studies, an increased viscosity reduces it. Studies show that these properties are functions of the shape, size and amount of the NPs, the temperature and the type of BF. The importance of accurate NF properties is most evident in numerical simulations. In numerical simulations of NFs in enclosures [6,7,8,9,10,11,12,13,14,15], channels [16,17,18,19,20], heat exchangers [21,22,23,24,25,26,27] and many other devices, researchers rely on correlations for the NF properties. Many correlations between the viscosity, the TC and the influencing parameters exist, but they differ from one NF to another [28,29,30,31,32,33]. Therefore, various methods are used to predict NF properties in pursuit of a comprehensive relationship; the artificial neural network (ANN) is one of them [34,35,36,37,38]. ANNs, which predict NF properties by imitating the functioning of the human brain, have attracted particular interest from researchers over the past few decades [39,40,41,42]. ANNs are a data-modeling approach with the nonlinear capability to relate different factors and provide an accurate mathematical relationship. Today, ANNs are widely used in various fields, including the prediction of thermophysical properties, flow behavior and heat transfer [43,44,45,46,47].

Here, a brief review of the history of ANNs is presented. Artificial intelligence is a very young science. Many believe that artificial intelligence began in 1950, when Alan Turing wrote his article on how to build an intelligent machine (which later became known as the Turing test). In that paper, he proposed a method for recognizing intelligence, more like a game. Suppose you are on one side of a wall, curtain or any other obstacle, and someone on the other side of the wall is in contact with you by text. Naturally, a conversation can take place between you and the party on the other side. If, after the conversation ends, you were told that there was no person on the other side of the wall (you did not know the identity of your interlocutor) and that a machine had been answering you, then that machine is deemed intelligent. Otherwise (i.e., if you recognize that the conversation is artificial), the machine on the other side of the wall is not intelligent and has not passed the Turing test. It should be noted that Turing chose this type of communication (text rather than audio) for two very precise reasons: first, it removes the perceptual issue entirely from the problem, so that the intelligence test does not get entangled in questions of audio reception and processing; second, it emphasizes the orientation of artificial intelligence toward a kind of natural language processing. Although several attempts have already been made to implement the Turing test, such as the ELIZA program or AIML-based chat programs, no machine has yet succeeded in passing such a test [48,49,50,51,52,53,54,55,56]. As can be seen, this test also rests on two basic assumptions:

  1. The complete exemplar of intelligence is the human being.

  2. The most important feature of intelligence is the ability to process and understand natural language.

The notion that the ability to understand language is a sign of intelligence has a long background in philosophy. From the earliest days of epistemological inquiry, language has been at the forefront of cognitive activity. From the ancient Greeks, who used Logos to denote language and truth together, to modern philosophers who regard language as the home of Being or the root of philosophical problems, language has always stood as the most privileged ability of the most intelligent beings. The earliest scientific application of neural networks appeared in the late 1950s, when Frank Rosenblatt introduced the perceptron network in 1958. Rosenblatt and his colleagues created a network that could identify patterns. Around the same time, in the early 1960s, Bernard Widrow introduced the ADALINE adaptive linear neural network with a new learning law; it was structurally similar to the perceptron network [57,58,59,60].

Progress on neural networks continued through the late twentieth century. In 1972, Teuvo Kohonen and James Anderson independently introduced new neural networks that were able to act as "storage elements." Stephen Grossberg worked on self-organizing networks during that decade. The idea of using a random mechanism to explain the function of a wide class of recurrent networks that can store information was introduced by the American physicist John Hopfield in 1982. The second key idea behind the development of ANNs in the 1980s was the error backpropagation algorithm, introduced by David Rumelhart and James McClelland in 1986. With the emergence of these two ideas, neural networks evolved rapidly [61,62,63,64,65].

In this study, based on previous work, we introduce the different types of neural networks and how they are used to predict NF properties. Then, a general overview of ANN applications and of the prediction of NF properties by neural networks is given.

Artificial neural network (ANN)

ANNs are an information-processing paradigm built by imitating biological neural networks such as the human brain. Their key novelty is an information-processing system consisting of many neurons with strong local connections that work together to solve specific problems. ANNs learn by transferring the information, or the rules underlying the data, into the network structure through training on experimental results; fundamentally, this learning capability is the most significant property of an intelligent scheme. A system that can learn is more flexible and better focused, so it can respond better to new problems and conditions. Humans have always wanted to understand the biophysiology of the brain, because human intelligence and the brain's capacity for learning, generalization, creativity, flexibility and parallel processing are fascinating, and using these features in machines is very desirable. Algorithmic methods are not suitable for implementing such features in machines; consequently, methods based on biological models are needed. Similar to humans, these systems are taught using examples, just as a child learns to recognize animals by watching them. In other words, an ANN is a data-processing system inspired by the human brain: it distributes the information over many small processors connected as a parallel network to solve a problem. In ANNs, the data structure is designed by programming knowledge to act as a neuron (Fig. 1).

Fig. 1 A schematic of the neuron [55]

This data structure is called a node. The network is then trained by linking the nodes and applying a training algorithm to them (Fig. 2). In this neural network, each node has two states: on (1) and off (0). Moreover, each edge carries a weight: positive-weight edges from active nodes activate inactive nodes, while negative-weight edges inhibit (disable) the nodes they connect to [65,66,67,68,69,70].
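For illustration, a minimal sketch of such a binary node is given below; the weights, inputs and threshold are arbitrary illustrative values, not taken from the cited works:

```python
import numpy as np

def node_state(inputs, weights, threshold=0.0):
    """Binary node: switches on (1) when the weighted sum of the
    incoming signals exceeds the threshold, stays off (0) otherwise.
    Positive weights excite the node; negative weights inhibit it."""
    return 1 if np.dot(inputs, weights) > threshold else 0

# Two active (1) predecessors, one connected through an inhibitory edge
print(node_state(np.array([1, 1, 0]), np.array([0.8, -0.5, 0.3])))  # -> 1
```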

Fig. 2 Graph of neural math function [55]

Structure of neural networks

A neural network consists of layers and weights, and its behavior also depends on the connections between the data. Generally, there are three types of neuronal layers in neural networks [48, 69] (Fig. 3):

Fig. 3 Multi-layer neural network diagram [55]

  1. Input layer: receives the raw data fed into the network.

  2. Hidden layers: their function is determined by the inputs and by the weights on the connections between the input and hidden units.

  3. Output layer: its behavior depends on the activity of the hidden units and on the weights between the hidden and output units.

There are several types of connection or weight binding in neural networks:

Forward: most connections are forward, in which signals move in one direction only. There is no feedback (loop) from output to input, and the output of a layer does not affect that same layer.

Backward: data are fed back from nodes in higher layers to nodes in lower layers.

Lateral (side): the outputs of the nodes in a layer are used as inputs to nodes in the same layer.

Neural networks compared with conventional computers

A neural network generally differs from a computer as follows:

  1. Neural networks do not execute commands serially, nor do they contain a memory for storing data and instructions.

  2. They respond in parallel to a set of inputs.

  3. They deal with transformations and mappings rather than with algorithms and procedures.

  4. They do not contain complicated computing units; instead, they consist of a large number of simple units, each typically performing little more than a weighted summation.

ANNs take a different approach to problem solving. Conventional computers use an algorithmic method: a series of unambiguous instructions is followed to solve the problem, converted into a high-level language and then into machine language that the system can recognize. If the solution steps are not known and no specific algorithm exists, such a system cannot solve the problem. Neural networks, by contrast, are advantageous precisely for tasks for which we have no algorithmic recipe. Neural networks and computers are not in competition; they are complementary. Some problems should be solved algorithmically, while others can be solved only through an ANN, and a combination of both can be used for maximum efficiency; typically, a conventional computer is employed to simulate the ANN. Neural networks do not work miracles, but used sensibly they achieve remarkable things. With their considerable ability to extract conclusions from complex data, ANNs can be applied to extract patterns and identify trends that are very difficult for humans and computers to detect. The benefits of neural networks are as follows:

  1. Adaptive learning: the capability to learn how to perform tasks based on the given data or initial experience.

  2. Self-organization: an ANN organizes the data it receives during learning; the neurons adapt according to the learning rule and respond to input variations.

  3. Real-time operation: ANN computations can be carried out in parallel on special hardware designed and built to exploit this capability.

  4. Fault tolerance: partial failure of the network degrades its performance, but many capabilities are retained even after major damage.

  5. Classification: ANNs are able to categorize the inputs to obtain suitable outputs.

  6. Generalization: this enables the network to extract a general rule from a limited number of examples and to apply what it has learned to previously unseen cases. Without it, the system would have to memorize an endless set of facts and relationships.

  7. Stability and flexibility: a neural network is stable enough to retain its information, yet flexible and adaptive enough to accept new information without losing the old [68].

Neural network classification

Neural networks are divided into four categories based on the training method [70]:

  1. Fixed weight: no training takes place and the weight values are not updated. Applications include information optimization, volume reduction, separation and compression.

  2. Unsupervised learning: the weights are corrected using the inputs only. There is no target output to compare with the network output and determine the weight error; the weights are updated based solely on the input patterns. The purpose is to extract the characteristics of the input patterns based on a clustering strategy, or to classify and recognize similarities (forming groups of similar patterns); the output classes corresponding to the input patterns are not known in advance. The network changes its weights in response to each input so as to respond better to similar inputs in later interactions; in this way it learns how to respond. Basically, the goal is to select the neuron with the strongest stimulation through the dominant (winning) neuron technique, so finding the dominant neuron is one of the most important tasks in unsupervised networks.

  3. Supervised learning: for each set of inputs, the corresponding outputs are also presented to the network, and the weights are changed until the discrepancy between the network output for the training patterns and the desired results is reduced to an acceptable error.

  4. Reinforcement learning: the quality of the system's performance improves step by step over time. There are no training patterns; instead, a signal called the critic expresses how good or bad the system's behavior is (a state between supervised and unsupervised learning).

Neural network capabilities

Among the features of neural networks, the following ones are considered [48]:

  • Calculation of a definite function

  • An approximation of an unknown function

  • Pattern identification

  • Signal processing

  • Learning

Application of neural networks

Neural networks are growing quantitatively and qualitatively, in terms of both structural analysis and hardware implementation, and the number of neural computation techniques keeps increasing. ANNs have a wide range of applications in fields such as electronics, aerospace, oil and gas, defense, transportation and finance [48].

Types of neural networks

Neural network types are as follows (Fig. 4):

Fig. 4 Types of neural networks [48]

  • Perceptron neural network

  • Hopfield neural network

  • Hamming neural network

  • Backpropagation neural network

  • Time delay neural network

ANN mathematical model

When modeling neurons, the biological intricacies are neglected and only the simple concepts are retained; otherwise, the modeling would become very difficult. The model neuron must have inputs, which act as synapses. The inputs are multiplied by weights that specify the strength of each signal. Finally, a mathematical operator decides whether the neuron is activated or not. Figure 5 illustrates a simple model of a neuron. There are many variations on the model presented in Fig. 5. For example, the weights of a neural network, which scale the transmitted values, can be positive or negative. Moreover, the function used for thresholding can vary; the best-known choices are the sigmoid, arctan and arcsin functions. The number of input nodes can also vary, although as the number of nodes rises, determining the weights becomes more difficult [63, 71,72,73].

Fig. 5 Schematic of a neuron [59, 60, 63]
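A minimal sketch of this neuron model follows; the weights, bias and inputs are illustrative, and the sigmoid is used as one of the threshold functions named above:

```python
import numpy as np

def sigmoid(n):
    return 1.0 / (1.0 + np.exp(-n))

def neuron(p, w, b):
    """Inputs act as synapses: each is multiplied by its weight, the
    results are summed with the bias, and the threshold function
    decides the activation level of the neuron."""
    return sigmoid(np.dot(w, p) + b)

# Three inputs with mixed excitatory/inhibitory weights
print(neuron(p=np.array([0.5, -1.0, 2.0]), w=np.array([0.2, 0.4, 0.1]), b=0.1))
```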

A layer of neurons

A network layer with R inputs and S neurons is presented in Fig. 6. In this network, each element of the input vector p is linked to every neuron through the weight matrix W. Each neuron's net input is the sum of its weighted inputs and its bias, and each neuron output a is obtained after this net input passes through the transfer function f. Usually, the number of inputs to a layer differs from the number of neurons in it; there is no required relation between the two. The input vector p enters the network through the weight matrix given in Eq. (1) [59, 60, 63]:

$$W = \begin{bmatrix} w_{1,1} & \cdots & w_{1,R} \\ \vdots & \ddots & \vdots \\ w_{S,1} & \cdots & w_{S,R} \end{bmatrix}$$
(1)
Fig. 6 Input vector with the length of R [60]

Note that the first index of the matrix W denotes the target neuron and the second denotes the input; thus \(w_{1,2}\) is the weight connecting the second input to the first neuron. As shown in Fig. 6, p is the input vector of length R, W is the weight matrix of dimensions S × R, and a and b are vectors of length S [59, 60, 63].
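In code, the layer computes a = f(Wp + b); a minimal sketch with illustrative dimensions and random weights:

```python
import numpy as np

def layer(p, W, b, f=np.tanh):
    """One layer of S neurons: a = f(W p + b), where W is S x R and
    W[i, j] weights input j into neuron i, as in Eq. (1)."""
    return f(W @ p + b)

R, S = 3, 2                      # R inputs feeding S neurons
W = np.random.randn(S, R)        # weight matrix of Eq. (1)
b = np.zeros(S)                  # bias vector of length S
a = layer(np.ones(R), W, b)      # output vector of length S
print(a)
```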

Perceptron neural network

This network is built on a computational unit called the perceptron, which takes real-valued inputs and computes a linear combination of them. If the result is higher than a threshold value, the perceptron output equals 1; otherwise it equals −1. Perceptron networks, in particular the multi-layer perceptron, are the most widely used neural networks. By selecting an appropriate number of layers and neurons, which often need not be large, these networks can carry out a nonlinear mapping with the desired accuracy [74].

Multi-layer perceptron

Although the modeling of the neurons is one of the most important factors in the performance of a neural network, the network connections and their layout (topology) are also very important. The topology of the human brain is so intricate that it cannot serve as a model for the neural network, which is a simplified model, whereas the brain employs an enormous number of components. One of the simplest and most effective layouts proposed for practical modeling is the multi-layer perceptron (MLP), which includes all three types of layers. In this model, every neuron in a layer is linked to all neurons of the subsequent layer, giving a fully connected network. Figure 8 shows a schematic of a three-layer perceptron network; it is easily seen that the number of neurons in each layer is independent of those in the other layers. Note that in Figs. 7 and 8, each circle comprises the summation operator and the thresholding (the nonlinear sigmoid function); in fact, each circle is a model of the summation-and-threshold block presented in Fig. 9, drawn in this form for simplicity [75].

Fig. 7 Block diagram of a neuron network with a hidden layer [75]

Fig. 8 Block diagram of a neuron network with a hidden layer [75]

Fig. 9 Sample of a block diagram [75]

Perceptron law

Generally, for a training example \(X = (x_{1} ,\,x_{2} , \ldots ,x_{\text{n}} )\), the learning algorithm proceeds according to the perceptron law as follows (a minimal code sketch is given after the list):

  1. Determine the weight coefficients and threshold value randomly.

  2. Present an input to the model.

  3. Calculate the output value by comparing the weighted sum of the inputs with the threshold value.

  4. Change the weight coefficients according to the following relations so as to strengthen correct decisions and weaken incorrect ones (reduce the error):

    4.1 \(w_{\text{i}} \left( {t + 1} \right) = w_{\text{i}} \left( t \right)\) if the output is correct.

    4.2 \(w_{\text{i}} \left( {t + 1} \right) = w_{\text{i}} \left( t \right) + \eta x_{\text{i}} \left( t \right)\) if the actual output is zero and the desired output is 1.

    4.3 \(w_{\text{i}} \left( {t + 1} \right) = w_{\text{i}} \left( t \right) - \eta x_{\text{i}} \left( t \right)\) if the actual output is 1 and the desired output is zero. Here 0 < η ≤ 1 is a positive learning-rate factor that controls the speed of adaptation.

  5. Provide the next input to the model [75, 76].
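A minimal sketch of steps 1–5 is shown below; the threshold is absorbed into a bias weight via an appended constant input, and the AND training data, η and the random seed are illustrative choices, not taken from the cited works:

```python
import numpy as np

def train_perceptron(X, d, eta=0.1, epochs=50, seed=0):
    """Perceptron law: weights are left unchanged on a correct output
    (rule 4.1) and moved toward or away from the input on an
    incorrect one (rules 4.2 and 4.3)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])        # bias input column
    w = np.random.default_rng(seed).uniform(-0.5, 0.5, Xb.shape[1])  # step 1
    for _ in range(epochs):
        for x, target in zip(Xb, d):                 # step 2: present input
            y = 1 if np.dot(w, x) > 0 else 0         # step 3: threshold
            w += eta * (target - y) * x              # step 4: rules 4.1-4.3
    return w                                         # step 5 is the loop

# Learn the logical AND of two binary inputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
print(train_perceptron(X, d=np.array([0, 0, 0, 1])))
```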

Hopfield neural network

The Hopfield network has a special architecture that distinguishes it from other networks. Generally, these networks have a single layer of neurons, and the input neurons are the same as the output ones. Unlike other networks, the Hopfield network does not determine its weights through a learning algorithm but by an explicit formula. In the recall (identification) algorithm, the input is changed iteratively until it reaches a stable form. In these networks, at any instant only one neuron is updated while the others remain unchanged; in other words, because a neuron takes its inputs from the other neurons, that neuron changes while the others stay constant. These networks are usually used to remove noise from images or other patterns [77,78,79].

Hamming neural network

This network was first presented by Steinbuch in 1961 and has been revisited more recently by researchers such as Lippmann. It is mainly used for identifying binary patterns (vectors whose elements take only the values 1 or −1). It belongs to the neural network family, since it consists of a set of neurons as nodes and numerical weights between them; each node has an activation level that forms the neuron output. The Hamming network contains both feed-forward and feedback structures. Its main purpose is to determine which reference pattern is closest to the input pattern; that pattern then appears at the network output [79, 80].

The Hamming network consists of three layers:

Feed-forward layer: the first layer of the Hamming network, represented by a weight matrix, a bias vector and a linear transfer function, computes the inner products between the reference vectors and the input vector. The reference patterns are stored in the network through this weight matrix.

Feedback layer (WTA): the middle layer of the Hamming network has a recurrent structure, known as a competitive structure; hence this layer is also called the competitive layer. Once the first layer has computed the similarity scores for the stored reference vectors, the second layer is activated with the first layer's outputs as its initial values. Each neuron then subtracts a fraction of the other neurons' values from its own, and this operation is repeated until the outputs of all cells become zero except that of the winning cell (which indicates the reference pattern most similar to the input vector). At that point the middle layer, and hence the whole Hamming network, is in its steady state, and further iteration is useless. Operations of this kind are called winner-take-all (WTA): the neurons of the middle layer compete, one neuron wins and the rest lose.

Third layer: this layer of the Hamming network is a feedback layer with a weight matrix and a symmetric two-valued threshold transfer function. Its task is, once the second layer has converged, to place the stored reference vector on the network output. For example, if the second layer indicates that the "apple" reference pattern matches the input, the third layer takes this as its input and produces the corresponding vector p at the network output.
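A minimal sketch of the three layers, following the classical Hamming/MAXNET formulation (the prototype patterns and the competition factor ε are illustrative assumptions, not values from the cited works):

```python
import numpy as np

# Reference (prototype) patterns stored in the first-layer weights
prototypes = np.array([[ 1, -1,  1, -1],
                       [ 1,  1, -1, -1]])
R = prototypes.shape[1]

def hamming_net(p, eps=0.1, max_iter=100):
    # Feed-forward layer: inner product with each reference vector,
    # plus a bias of R, scores the similarity to the input pattern
    a = prototypes @ p + R
    # Competitive (WTA) middle layer: each neuron subtracts a fraction
    # of the others' outputs until only the winner remains nonzero
    for _ in range(max_iter):
        a = np.maximum(0, a - eps * (a.sum() - a))
        if np.count_nonzero(a) <= 1:
            break
    # Third layer: recall the stored reference vector of the winner
    return prototypes[np.argmax(a)]

print(hamming_net(np.array([1, -1, 1, 1])))   # closest to the first prototype
```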

Time delay neural network (TDNN)

This network is a kind of multi-layer neural network able to handle the dynamic nature of sample data and input signals. Such multi-layer networks have the following characteristics:

The network has several layers, each with enough connections between neurons, so that it can learn complex nonlinear decision surfaces.

The behavior of the network is insensitive to shifts of the samples in time.

The learning method does not depend on the exact timing of the input samples.

The TDNN was first used by Waibel in 1988 and has remained essentially unchanged since. It consists of three layers whose weights are coupled with delay cells; the TDNN activation is a sigmoid function of the weighted, delayed inputs [80]. The neuron structure in this network is illustrated in Fig. 10.

Fig. 10 Neuron structure in time delay network [80]
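A minimal sketch of such a time-delay neuron, whose output at time t depends on the current and D delayed input samples; the weights and the test signal are illustrative assumptions:

```python
import numpy as np

def tdnn_neuron(x, w, b=0.0):
    """Time-delay neuron: y(t) = sigmoid(sum_k w[k] * x(t - k) + b),
    so the same weights slide over a window of the input signal."""
    D = len(w) - 1                                # number of delay cells
    y = np.zeros_like(x)
    for t in range(D, len(x)):
        window = x[t - D:t + 1][::-1]             # x(t), x(t-1), ..., x(t-D)
        y[t] = 1.0 / (1.0 + np.exp(-(np.dot(w, window) + b)))
    return y

signal = np.sin(np.linspace(0, 6, 50))            # sample input signal
print(tdnn_neuron(signal, w=np.array([0.5, 0.3, 0.2])))
```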

When designing neural networks, especially the TDNN, the designer faces the issue of selecting a suitable network. In general, a suitable network is one with minimum complexity and a minimum number of parameters, together with the highest accuracy in identifying input patterns. If a problem can be solved by a particular network, it can in theory also be solved by larger networks; but because the learning algorithms do not yield a unique answer for the optimal weights of a larger network, contradictory weights may be obtained, so it is difficult to identify the smallest network that solves the problem. If the number of neurons in the network layers is too low, the network cannot learn, because the number of separating hyperplanes, and hence of the regions needed to divide the input space into classes, is insufficient. Conversely, too large a number of neurons in the hidden layer is also unsuitable, owing to the increased volume of computation and, consequently, the longer training time. In addition, since network learning is based on a limited set of training patterns, a very large network tends to memorize those patterns, which reduces its generalization and interpolation power on new patterns outside the training set. Thus, for any particular application there is a critical number of hidden neurons that should be found. This number is determined by simulating different networks and measuring their accuracy and interpolation on patterns that were not in the training set. The number of neurons in the output layer (i.e., the output coding) should also suit the specific problem; the best way to code the output classes is to use unit (one-hot) vectors.

Backpropagation network

Unlike the previous networks, this network consists of several layers: a hidden layer in addition to the input and output layers. It operates under supervision and is not self-organizing. One of the prominent features distinguishing it from the other networks is that the values of the input neurons are continuous, meaning that non-binary values can be fed to the network. It is one of the most widely used networks because it can solve nonlinear problems [70, 80].

Network learning using error backpropagation

In general, ANNs fall into two groups in terms of learning: fixed-weight networks and variable-weight (learning) networks. Learning networks are divided into supervised and unsupervised. In supervised networks, the training samples used in the learning phase come with their known ideal outputs; in other words, the input data samples are labeled. In unsupervised networks, each output is placed in a separate class based on some criterion (for example, distance) and a form of competition [70, 80].

Since the neural network is a simplified model of the nervous system, it is likewise capable of learning. In other words, the network can learn the process underlying the patterns from the information received from its supervisor and its inputs. In a way similar to humans, on whose models it is inspired, many samples must be presented to the network so that, by changing its weights, it can follow the desired output. Sample input data can be presented to the neural network in two ways:

  1. Batch mode: all samples are presented to the network, the network error is computed over all samples, and the weights are changed based on this error. In the next step, all data are presented to the network once again, and the process is repeated until the error reaches an acceptable value. This is clearly an involved and time-consuming process that requires a lot of memory, and the algorithm may also become trapped in local minima.

  2. Pattern mode: the samples are given to the network one at a time, the error for each sample is calculated immediately, and the network weights are changed based on that error; then the next sample is presented and the procedure is repeated. Because the weights are corrected after every sample, the algorithm converges well, and the random order of single-sample presentation reduces the risk of local minima. Many methods are available to train the network and adjust the weights until the error is acceptable. One of the best-known approaches is the error backpropagation algorithm, described below.

Error backpropagation algorithm

This algorithm, proposed in 1986 by Rumelhart and McClelland, is used in feed-forward neural networks. Feed-forward means that the artificial neurons are arranged in successive layers and pass their outputs (signals) forward. The term "backpropagation" means that the errors are fed back into the network to correct the weights, after which the forward path to the output is traversed again. Error backpropagation is a supervised method in the sense that the input examples are labeled and the expected output of each is known in advance; the network output is therefore compared with these ideal outputs and the network error is computed. The algorithm first assumes that the network weights are selected randomly. At each step, the network output is calculated and the weights are corrected based on its difference from the desired output, so as eventually to minimize the error [70, 80].
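A minimal pattern-mode sketch of the algorithm for one hidden layer is given below; the network size, η, the number of epochs and the XOR training data are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_backprop(X, T, hidden=4, eta=0.5, epochs=5000, seed=0):
    """Weights start random; each sample's output error is propagated
    back through the layers and the weights are corrected toward the
    desired output (pattern mode)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (hidden, X.shape[1]))  # input -> hidden
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, hidden)                # hidden -> output
    b2 = 0.0
    for _ in range(epochs):
        for x, t in zip(X, T):
            h = sigmoid(W1 @ x + b1)                 # forward pass
            y = sigmoid(W2 @ h + b2)
            d_out = (y - t) * y * (1 - y)            # output error term
            d_hid = d_out * W2 * h * (1 - h)         # error fed back
            W2 -= eta * d_out * h                    # weight corrections
            b2 -= eta * d_out
            W1 -= eta * np.outer(d_hid, x)
            b1 -= eta * d_hid
    return W1, b1, W2, b2

# XOR: a classic nonlinear problem a single-layer network cannot learn
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
weights = train_backprop(X, T=np.array([0, 1, 1, 0], dtype=float))
```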

Artificial neural network learning

Once the weights and biases of the network are initialized, it is ready to be trained. This requires a set of data as inputs and a set of data as the desired outputs. During training, the weights and biases are adjusted to minimize the network's performance function, which is usually the mean square error between the network output and the desired output [70, 81, 82]. Here, programming in the MATLAB environment was used for designing and training the ANN. When writing a neural network program in MATLAB, the following points should be considered:

  1. The maximum and minimum input and output values are defined.

  2. The input and output data are normalized (between −1 and 1).

  3. The number of neurons in the hidden layers is defined.

  4. The transfer function associated with each neuron is determined.

  5. The learning technique of the network is specified.

  6. The training technique of the network is identified.

  7. The error function is determined according to the standards.

The points mentioned above are necessary to design an appropriate neural network. After performing these steps and entering the data into MATLAB, network learning begins. In general, the learning steps can be summarized as follows:

  1. Initial values for the weights are determined.

  2. The training output is obtained for the input vector.

  3. The actual output is compared with the target output.

  4. The network weights are modified according to the error value.

  5. This process is repeated until the error is minimized.

The error function used is the MSE given in Eq. (2), where \(T_{\text{j}}\) is the target value of the jth output neuron of the neural network and \(O_{\text{j}}\) is the value predicted for the jth neuron. The total number of training records is denoted by N (termed epoch in the MATLAB software), and m denotes the number of output parameters of the network [80].

$${\text{MSE}} = \frac{1}{2mN}\mathop \sum \limits_{i = 1}^{N} \mathop \sum \limits_{j = 1}^{m} (T_{\text{j}} - O_{\text{j}} ) ^{2}$$
(2)

All operations in this technique are selected by trial and error so as to attain the minimum error [70]. The error is obtained from Eq. (3):

$${\text{Error}}\,\left( \% \right) = \frac{{{\text{Actual}}\,{\text{value}} - {\text{Value}}\,{\text{obtained}}\,{\text{from}}\,{\text{ANN}}}}{{{\text{Actual}}\,{\text{value}}}} \times 100$$
(3)
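For reference, Eqs. (2) and (3) translate directly into code; the array shapes are illustrative:

```python
import numpy as np

def mse(T, O):
    """Eq. (2): T and O are N x m arrays of target and ANN-predicted
    values (N training records, m output parameters)."""
    N, m = T.shape
    return np.sum((T - O) ** 2) / (2.0 * m * N)

def error_percent(actual, ann_value):
    """Eq. (3): relative deviation of the ANN output, in percent."""
    return (actual - ann_value) / actual * 100.0
```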

One complete pass, in which the weights and biases are carried forward to the outputs and the output errors are returned to correct the weights and biases toward the inputs, is called an epoch.

Nanofluids

With the advent of nanoscience, the preparation of nanofluids from various materials has become possible. One physical feature of NFs is their high surface-to-volume ratio, which gives them superior capabilities. The nanofluid began as a novel, nanotechnology-based class of heat transfer fluids and has developed enormously over the years. Researchers are attempting to uncover the rules governing the thermophysical properties of NFs; accordingly, they have proposed novel instruments and unconventional models for explaining NF behavior. The term "nanofluid" was used by Choi [5] for a novel type of heat transfer working fluid containing a small amount of nanoparticles dispersed permanently and homogeneously in the continuous phase. Early developments in NF technology demonstrated a high potential for heat transfer applications, which led industry and academia to work in this field. The average particle size of the nanoparticles is usually between 1 and 100 nm. The introduction of NFs into heat transfer has created a novel approach with its own challenges and benefits, and a full understanding of NF rheological behavior is essential for scientists. Therefore, this section reviews past research on the thermophysical properties of NFs.

Properties of the nanofluids

The TC of NFs has attracted most of the researchers' attention. TC is the most important property for static fluids, whereas the heat transfer coefficient is the most important one for flowing NFs. Other important properties that affect heat transfer include the density, specific heat and viscosity of the NF. Assuming uniform dispersion of the NPs in the BF, some thermal and physical properties of NFs are as follows:

Dynamic viscosity

Viscosity is the resistance of a fluid to deformation under longitudinal or shear stresses. Because NFs are suspensions, viscosity plays an important role in the design of nanofluidic systems, as its direct effect on the pressure drop in convective flows is evident. For practical applications, the increase in NF viscosity relative to the BF must therefore be fully evaluated. Since the viscosity of the working fluid plays a noteworthy role in the pumping power of a heat transfer device and is required for the numerical solution of the governing equations, it must be known. Many researchers have investigated NF viscosity [31, 83,84,85,86,87,88,89,90,91]. Table 1 summarizes some of the correlations proposed for NF viscosity.

Table 1 Correlations presented for the viscosity of nanofluids
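Table 1 itself is not reproduced here, but two classical relations of the kind such tables typically include, Einstein's dilute-suspension model and Brinkman's extension, can be sketched as follows (φ is the particle volume fraction; the numerical example is illustrative):

```python
def mu_einstein(mu_bf, phi):
    """Einstein model for dilute suspensions: mu_nf = mu_bf * (1 + 2.5 * phi)."""
    return mu_bf * (1.0 + 2.5 * phi)

def mu_brinkman(mu_bf, phi):
    """Brinkman model: mu_nf = mu_bf / (1 - phi)**2.5."""
    return mu_bf / (1.0 - phi) ** 2.5

# Water at about 25 C (mu ~ 8.9e-4 Pa s) with 2 vol% nanoparticles
print(mu_einstein(8.9e-4, 0.02), mu_brinkman(8.9e-4, 0.02))
```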

Thermal conductivity (TC)

The main reason for adding NPs to the BF is to increase its TC; the most important NF property to know is therefore arguably the TC coefficient. Many researchers have used experimental methods to measure the TC of NFs [102,103,104,105,106,107,108,109]. These studies were performed at various temperatures and volume fractions, and the TC was reported for many NPs and BFs. Some researchers have expressed correlations for the TC coefficient in terms of the effective parameters, such as the NP volume fraction, the temperature and the NP size. Others have reported TC correlations based on theoretical relations and the physical phenomena affecting TC, such as Brownian motion, thermophoresis and clustering. Table 2 presents a selection of correlations for the TC coefficient of NFs, including basic theoretical relations and some relations fitted to experimental data; the latter are applicable only to the specific NF for which they were derived.

Table 2 Some correlations for predicting the effective thermal conductivity of different nanofluids
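As with the viscosity, a classical example of the basic theoretical relations found in such tables is the Maxwell model for spherical particles; a minimal sketch (the property values in the example are illustrative):

```python
def k_maxwell(k_bf, k_p, phi):
    """Maxwell model for the effective TC of a dilute suspension of
    spherical particles in a base fluid (phi = volume fraction)."""
    num = k_p + 2.0 * k_bf + 2.0 * phi * (k_p - k_bf)
    den = k_p + 2.0 * k_bf - phi * (k_p - k_bf)
    return k_bf * num / den

# Water (k ~ 0.6 W/m K) with 2 vol% Al2O3 (k ~ 40 W/m K)
print(k_maxwell(0.6, 40.0, 0.02))
```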

Application of neural networks in predicting flow and heat transfer properties

Heat exchangers often operate under vibration, so their operating conditions differ considerably from traditional design conditions, and the relationship between the various parameters and the heat transfer of such exchangers cannot be determined directly [118,119,120]. Rahman and Zhang [121] therefore proposed the use of neural networks for vibrating heat exchangers (Fig. 11) to predict the convection heat transfer coefficient. They obtained good results using a neural network (Fig. 12) and experimental data from a heat exchanger. Their three-layer network had a 2-10-1 configuration: an input layer with two neurons representing the input parameters, namely the oscillation frequency and the mean pressure, and the heat transfer coefficient as the output. By comparing their results with previous results and experimental data, they showed that the neural network is able to predict the heat transfer coefficient of such heat exchangers.

Fig. 11 Finned-tube ambient heat exchanger [121]

Fig. 12 A multi-layer feed-forward ANN with a backpropagation algorithm [121]

Naphon et al. [122] used neural networks to evaluate the pressure drop and heat transfer of pulsating NF flow in a spirally coiled tube (Fig. 13) exposed to a magnetic field, measured experimentally (Fig. 14). They investigated four training algorithms, Levenberg–Marquardt backpropagation (LMB), resilient backpropagation (RB), Bayesian regularization backpropagation (BRB) and scaled conjugate gradient backpropagation (SCGB), to characterize the ANN error, and compared the network outputs with the experimental data. The Levenberg–Marquardt algorithm (Fig. 15) gave the minimum MSE (mean square error) and the maximum R (correlation coefficient) among the algorithms. For the majority of the data, the maximum ANN error was 2.5% for the friction coefficient and 5% for the Nusselt number (Nu). Finally, they used the neural network to predict the performance of a spiral heat exchanger in the presence of a magnetic field.

Fig. 13 The schematic of spirally coiled tube equipped with magnetic bar arrangements [122]

Fig. 14 Schematic diagram of experimental apparatus [122]

Fig. 15 Proposed optimal ANN model configuration [122]

Ebrahimi-Moghaddam et al. [123] used a neural network to minimize entropy generation. They determined the optimal conditions for an Al2O3/EG–W hybrid NF inside the absorber tube of a parabolic trough solar collector (PTSC), using a network whose three inputs were the NP volume fraction, the NP diameter and the fluid temperature. Their results showed that the entropy generation decreases with decreasing NP volume fraction and with increasing NP diameter and mean fluid temperature. In fact, they showed that adding NPs to the BF increases the frictional entropy while reducing the overall entropy generation and increasing the heat transfer.

Hemmat Esfe [124] predicted the pressure drop and heat transfer in a heat exchanger filled with NF using a neural network, modeling the results in terms of Nu and the pressure drop for an Ag/water NF. The results indicated the high ability of the neural network to predict both quantities: the regression coefficients for Nu and the relative pressure drop were 99.76% and 99.54%, respectively, showing the high accuracy of the method.

Moya-Rico et al. [125] studied the use of neural networks for predicting the heat transfer coefficient and pressure drop in a triple-pipe heat exchanger with corrugated and plain inner tubes, examining the length and pitch of the corrugations. They developed a neural network from a database of 181 experimental points and found the Bayesian regularization algorithm to be the best. The network had two hidden layers, with 15 and 21 nodes in the first and second layers. Their results agreed well with the experimental data, with an absolute average relative deviation below 1.91% for the heat transfer coefficient and 3.82% for the pressure drop.

Romero-Méndez et al. [126] used neural networks to predict the heat transfer coefficient in evaporative mini-tubes; specifically, the natural convection heat transfer coefficient of a refrigerant inside very small diameter tubes. Their experiments, based on the Rankine refrigeration cycle, provided heat transfer data for tubes under constant heat flux. They used 75 percent of the data to train the neural network model and 25 percent to test it against the predetermined targets, and evaluated several network architectures. They concluded that neural networks are powerful tools for accurately predicting the heat transfer in evaporative mini-tubes.

Ghritlahre and Prasad [127] studied two different solar air heating systems (Fig. 16) using neural networks, employing a feed-forward neural network (FFNN) model (Figs. 17, 18) to predict the heat transfer. From tests on roughened surfaces they obtained 50 samples as network inputs. They used a multi-layer feed-forward backpropagation (FFBP) network with five input parameters, including the relative roughness height, the relative roughness pitch, the roughness size and the Reynolds number, to predict Nu, and the Levenberg–Marquardt (LM) algorithm to find the optimal conditions. Their results showed that the neural network model is very effective in predicting the heat transfer coefficient.

Fig. 16 Detail schematic diagram of duct. a One-sided roughened absorber plate duct. b Three-sided roughened absorber plate duct [127]

Fig. 17 Basic structure of artificial neurons [127]

Fig. 18 Present study ANN model [127]

Many researchers have found neural networks very suitable for predicting the heat transfer coefficient, the pressure drop and the properties of various fluids [128,129,130,131,132,133,134]. Naphon et al. [135] studied a NF jet in a micro-channel heat sink numerically and experimentally, using the Levenberg–Marquardt backpropagation (LMB) algorithm to predict the heat transfer coefficient with neural networks. The results of their numerical simulations and of the neural network were very consistent with the experiments, and they concluded that neural networks and numerical simulation are very efficient for predicting the heat transfer coefficient of micro-channel heat sinks.

Neural network application to predict thermophysical nanofluid properties

The TC coefficient is a key characteristic of a NF: adding NPs to the BF increases the TC and the heat transfer. Besides experimental measurement of this property, theoretical methods are used to predict the TC of NFs. The results of a benchmark exercise carried out by 34 organizations worldwide [136] on determining and modeling the TC of NFs show that the modified Maxwell model of Nan yields results in acceptable agreement with the experimental data.

Papari et al. [43] used an ANN to determine the TC of NFs containing multi-walled carbon nanotubes dispersed in oil, distilled water, decane and ethylene glycol. The model proved more accurate than the theoretical models of Xue (2005, 2006) and of Yu and Choi, with a mean absolute error of 3.26 percent and a correlation of 0.991 between the predicted values and the experimental data.

Hojjat et al. [41] modeled the TC of three types of NFs, containing aluminum oxide, copper oxide and titanium oxide, using ANNs. Their model agreed well with the experimental data and was more accurate than the Hamilton–Crosser model, which cannot account for the effects of BF temperature and volume fraction.

Longo et al. [137] showed that, for titanium oxide/water and aluminum oxide/water NFs, training the network with four input parameters (the volume fraction, the NF temperature, the TC of the NPs and the NP size) gives better results than training with three parameters that disregard the particle size.

Hemmat Esfe et al. [138] studied the effects of temperature, volume fraction and NP size on the TC of a magnesium oxide (MgO)/ethylene glycol NF using neural networks, and showed that the NF temperature has less effect on the TC increase than the other parameters. Hemmat Esfe et al. [139] investigated the TC coefficient of a DWCNT–SiO2/EG NF and found that adding the NPs raises the TC coefficient by 38% relative to the pure fluid. Hemmat Esfe et al. [140] presented a high-precision neural network model for predicting the TC of a zirconium oxide/ethylene glycol NF, with the NF temperature and volume fraction as network inputs.

Shahsavar et al. [141] studied the TC and viscosity of a paraffin–Fe3O4 NF using a neural network, evaluating the effects of the Fe3O4 volume fraction and temperature on the TC and viscosity of the fluid. They used oleic acid as a surfactant to increase the dispersion and stability of the NPs. The NFs were found to be shear-thinning, and the TC and viscosity increased with the NP volume fraction. They then used an ANN to model the TC and viscosity, evaluating its accuracy against the experimental data with well-known statistical indices such as the root mean square error (RMSE), the mean absolute deviation (MAE) and the coefficient of determination (R2). The error indices were below 1%, indicating that the proposed model is suitable for predicting the properties of the paraffin–Fe3O4 NF.

Rostamian et al. [142] used a neural network to predict the TC of a CuO–SWCNTs hybrid NF, with experimental results covering the effects of temperature and NP volume fraction over 20–50 °C and volume fractions of 0.02–0.75%. The NPs were CuO and SWCNTs (50:50), and the BF was a water–ethylene glycol mixture (40:60). They presented a correlation to predict the TC of the CuO–SWCNTs–EG/water NF and showed that the ANN can predict the thermophysical properties well.

Hemmat Esfe and Arani [143] studied the dynamic viscosity of an MWCNT (40%)–SiO2 (60%)/5W50 NF experimentally over temperatures of 5–55 °C and volume fractions of 0–1%. Rheological tests showed that the NF exhibits non-Newtonian behavior. They used ANNs, specifically a multi-layer perceptron (MLP), and mathematical correlations to predict the NF behavior, and found that the neural network predicted the dynamic viscosity more accurately.

In another study, Hemmat Esfe et al. [144] investigated the dynamic viscosity of an MWCNTs–ZnO (10–90%)/5W50 NF, providing correlations via neural networks and mathematical models and determining the minimum viscosity under optimal conditions. The neural network had two hidden layers with 3 neurons each, and its error was below 7%.

Ghasemi et al. [145] investigated the TC of a COOH–MWCNTs/antifreeze NF, modeling the TC coefficient as a function of the NP volume fraction and the temperature with a neural network having 9 neurons in its hidden layer. They showed that the proposed correlation can be very valuable in engineering applications, although the neural network agrees even more closely with the experimental results.

Ahmadloo and Azizi [146] used 5 inputs in a neural network (Fig. 19) to predict the TC coefficient, drawing on 776 experimental data points for 15 NFs from 21 previous studies. The ANN inputs were the NP diameter, the volume fraction, the TC of the NPs, the temperature and several other NP and BF parameters. They reported network errors of about 1.26 and 1.44 percent relative to the experimental results.

Fig. 19 Architecture of the suggested ANN model for prediction of knf/kf [146]

Khosrojerdi et al. [147] predicted the TC of a graphene/water NF using a multi-layer perceptron (MLP) neural network. They used Eqs. (4)–(6) to evaluate the root mean square error (RMSE), the coefficient of determination (R2) and the mean absolute percentage error (MAPE), where Kp is the predicted TC coefficient and Ka is the actual TC. Their results showed the high accuracy of the ANN modeling compared with the experimental results and the theoretical models.

$${\text{RMSE}} = \sqrt {\frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left( {K_{\text{p}} - K_{\text{a}} } \right)^{2} }$$
(4)
$${\text{MAPE}} = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \frac{{\left( {K_{\text{p}} - K_{\text{a}} } \right)}}{{\left( {K_{\text{a}} } \right)}} \times 100$$
(5)
$$R^{2} = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {K_{\text{a}} - K_{\text{p}} } \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{n} \left( {K_{\text{a}} - \overline{{K_{\text{a}} }} } \right)^{2} }}$$
(6)
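These three indices translate directly into code; a minimal sketch following Eqs. (4)–(6) as printed (note that MAPE is conventionally computed over absolute deviations):

```python
import numpy as np

def rmse(k_a, k_p):
    """Eq. (4): root mean square error."""
    return np.sqrt(np.mean((k_p - k_a) ** 2))

def mape(k_a, k_p):
    """Eq. (5) as printed; the usual definition takes np.abs(k_p - k_a)."""
    return np.mean((k_p - k_a) / k_a) * 100.0

def r2(k_a, k_p):
    """Eq. (6): coefficient of determination."""
    return 1.0 - np.sum((k_a - k_p) ** 2) / np.sum((k_a - k_a.mean()) ** 2)
```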

Hemmat Esfe et al. [148] predicted the dynamic viscosity of a TiO2 NF from experimental results using a neural network, with the temperature and the NP mass fraction as inputs, and presented a correlation for the dynamic viscosity. They found the neural network accurate for predicting the dynamic viscosity of the TiO2 NF.

Derakhshanfard and Mehralizadeh [149] used a radial basis function (RBF) network and experimental results for NiO, WO3, TiO2, ZnO and Fe2O3 NPs in oil to study the effect of temperature on viscosity. The morphology and stability of the NPs were analyzed by DLS and TEM. For mean NP diameters in the range 10–40 nm, the network agreed well with the experiments. The viscosity decreased upon adding TiO2, ZnO and Fe2O3 NPs to the BF and increased for WO3 and NiO NPs.

Kannaiyan et al. [150] studied the TC and density of an alumina–silica/water hybrid NF and the application of neural networks to them. The hybrid NF was prepared by the two-step method and characterized by XRD, TEM, SEM–EDX and zeta potential measurements. They found neural networks very effective in predicting the TC coefficient of hybrid NFs as a function of temperature and NP volume fraction: the network with two hidden layers of 10 neurons gave the minimum error and the highest fitting coefficient for the TC, while a network with one hidden layer of 4 neurons was the optimal structure for the density.

Hemmat Esfe et al. [101] experimentally studied the rheological behavior of monograde heavy-duty engine oils containing CNTs and oxide nanoparticles, using the response surface method (RSM) to predict the viscosity behavior of the NFs.

Vafaei et al. [151] measured the TC coefficient of an MgO–MWCNTs/EG hybrid NF and compared it with neural network predictions, for NP volume fractions of 0.05–0.6% and temperatures between 25 and 50 °C. They designed four multi-layer network models with 6, 8, 10 and 12 hidden neurons and found the best to be the 12-neuron model, with an error of approximately 0.8%.

Sedaghat and Yousefi [152] investigated the properties and TC of graphene quantum dots (GQDs) in water, ethylene glycol and water–ethylene glycol (60:40) BFs. Adding the GQDs to the BFs increased the TC and viscosity, while the viscosity decreased with temperature. They found the neural network models highly consistent with the experimental results.

Many researchers have investigated the prediction of NF properties using neural networks. Among them, Hemmat Esfe et al. [138] studied the TC of an MgO/EG NF using neural networks and experimental results; in another study, Hemmat Esfe et al. [153] investigated an Al2O3–water NF, and the same group studied an Ag–MgO/water hybrid NF. This group has conducted many studies [32, 102, 103, 139, 154,155,156,157,158,159,160] on predicting NF properties with neural networks, showing that neural networks can be a very popular approach for predicting the properties of NFs and hybrid NFs.

Alrashed et al. [161] investigated water-based NFs with diamond–COOH and MWCNT–COOH NPs, without surfactant or additives, at temperatures of 20–50 °C and volume fractions of 0–0.2%. From their experimental data they derived new correlations for the NF properties, and the neural network predictions agreed well with the experiments. They examined the effects of NP type, NP volume fraction and temperature on the thermophysical properties. In another study, Safaei et al. [162] investigated the effect of temperature and NP volume fraction on the TC of a ZnO–TiO2/EG hybrid NF, analyzing the data with several neural network models before settling on one with the temperature and the NP volume fraction as inputs; its results agreed very well with the experiments.

Karimi and Yousefi [163] presented density correlations using neural networks trained with a backpropagation network (BPN) algorithm combined with a genetic algorithm (GA); the BPN–GA results agreed very well with the experiments. They also showed that the Pak and Cho model is very suitable for predicting the density of NFs.

Alnaqi et al. [164] used neural networks to predict the heat transfer coefficient and pressure drop of a water/multi-walled carbon nanotube NF. They designed an optimal network with 65 neurons whose inputs were the Reynolds number and the volume fraction; 78 data points were used for the modeling, 62 for training and 16 for testing. The network results agreed closely with the experiments and can serve as models for predicting the thermal properties of NFs. Al-rashed [165] predicted the heat transfer coefficient and pressure drop in a horizontal pipe with an MWCNTs/EG–water non-Newtonian NF as the working fluid, using empirical relationships for the TC and viscosity. At a volume fraction of 0.725% and a temperature of 49.672 °C, the highest heat transfer coefficient occurred simultaneously with the lowest pressure drop; relative to the BF, the TC was 2.73 times and the pressure drop 2.41 times higher. Alsarraf et al. [166] introduced a support vector regression method as an alternative to the ANN for predicting NF properties, evaluating it on experimental viscosity data for single-walled carbon nanotubes in ethylene glycol as a function of nanotube concentration and temperature. They trained both LS-SVM and ANN models and compared their accuracy, finding that the LS-SVM generalizes much better and has fewer parameters than the ANN. Bagherzadeh et al. [167] used a new EANN method to predict the thermal properties of a novel F-MWCNTs–Fe3O4/EG hybrid NF, together with a newly proposed sensitivity analysis method based on ANNs; the proposed method not only gives more understandable and accurate results but also requires less time and computational cost, and can be used for ANNs with different architectures, training algorithms and input-output data sets. In another work, Bagherzadeh et al. [168] predicted the properties of a CuO/liquid paraffin NF using a combination of neural networks and genetic algorithms. Bahrami et al. [169] used neural networks to predict the properties of a non-Newtonian hybrid NF, testing 24 different ANN configurations of architecture and training algorithm and evaluating the MSE between the targets and the ANN outputs to identify the best approach among them. Shahsavar et al. [170] used robust weighted least-squares support vector regression to predict the TC of a water/graphene oxide–silicon carbide NF. In another study, Hemmat Esfe et al. [171] examined the viscosity of an MWCNT (10%)–Al2O3 (90%)/5W50 nano-oil and compared the experiments with neural network results. Moreover, Karimipour et al. [172] predicted the TC of an MWCNT–CuO/water hybrid NF using neural networks, and Moradikazerouni et al. [173] predicted the properties of an SWCNTs/EG–water NF. In another work, Rouhani et al. [174] used neural networks to predict the properties of a ZnO–Ag (50%–50%)/water NF. All of these researchers concluded that using neural networks saves time while yielding very precise results.

Quality of ANN predictions and comparison with responses obtained from other techniques

Different methods have been used in the literature to predict different properties. For example, Desai et al. [175] compared artificial neural networks with the response surface method (RSM). They found that the average error was 5.6% for the ANN and 20% for the RSM. The correlation coefficient (CC) was 0.89 for the RSM and 0.99 for the neural network, indicating the superiority of neural networks in predicting the behavior of nonlinear systems. Their sensitivity analysis likewise showed a predictive error of 2% for the ANN and 8% for the RSM.

Karazi et al. [176] compared the ANN and design of experiments (DOE) methods. Comparing the results of the two methods with the experimental data, they found that the average error of the ANN method was two percent greater than that of DOE, but the prediction improvement factor of the ANN was almost twice that of DOE. In another work, Sadollah et al. [177] predicted and optimized the stability parameter of a titanium oxide NF using the RSM and ANN methods and found that the ANN predictions were in better agreement with the experimental results than those of the RSM.

Hemmat Esfe et al. [90] investigated and compared different data-driven methods for predicting the viscosity of a TiO2/SAE50 nano-lubricant with non-Newtonian behavior. They used three metrics, the coefficient of determination (R²), the average absolute relative deviation (AARD) and the root mean square error (RMSE), to compare genetic algorithm–radial basis function neural networks (GA–RBF), the least squares support vector machine (LS-SVM) and gene expression programming (GEP). They found that the GA–RBF model showed the best accuracy.
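For reference, these three metrics have standard definitions and can be computed directly. The sketch below assumes NumPy arrays y_true and y_pred as placeholders for the measured and predicted values, respectively.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination R^2 between targets and predictions."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def aard(y_true, y_pred):
    """Average absolute relative deviation, in percent."""
    return 100.0 * np.mean(np.abs((y_pred - y_true) / y_true))

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```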

As mentioned earlier, artificial neural network techniques have been applied successfully to prediction problems and have great ability to predict NF properties as well as other quantities. ANNs offer a rich variety of structures for different applications. Basic statistical preprocessing of the data also appears necessary to obtain better results. In neural networks, personal judgment and experience matter when choosing the model and its optimal parameters; in fact, there is no fixed recipe for choosing the best network structure, and the choice is made by trial and error (a minimal sketch of such a search is given below). Based on the studies of ANNs in predicting properties and other problems, the predictions of the ANN method are generally better than those of other methods. Moreover, given the similarity and proximity of the values predicted by different methods, using several methods can increase the reliability of the predicted values. One noteworthy point in the recent literature is the use of a large number of observations; therefore, large data sets should be used.
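As a concrete illustration of this trial-and-error structure selection, the sketch below cross-validates a few candidate hidden-layer sizes with scikit-learn's GridSearchCV; the data set and the candidate architectures are hypothetical, chosen only for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical data: inputs (temperature in deg C, volume fraction in %),
# target (e.g., TC in W/m K); synthetic values, for illustration only.
rng = np.random.default_rng(0)
X = rng.uniform([20.0, 0.0], [60.0, 2.0], size=(200, 2))
y = 0.6 + 0.005 * X[:, 0] + 0.05 * X[:, 1] + 0.01 * rng.standard_normal(200)

# Candidate hidden-layer architectures; cross-validation plays the role of
# the "trial and error" that the literature performs by hand.
param_grid = {
    "mlpregressor__hidden_layer_sizes": [(5,), (10,), (10, 5), (20, 10)],
}
pipe = make_pipeline(
    StandardScaler(),
    MLPRegressor(max_iter=5000, random_state=0))
search = GridSearchCV(pipe, param_grid, cv=5,
                      scoring="neg_mean_squared_error")
search.fit(X, y)
print("best architecture:", search.best_params_)
```

In practice, the candidate list would also cover training algorithms and activation functions, and the final comparison would be made on a held-out test set rather than on the cross-validation score alone.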

Conclusion and future work

In the present study, neural networks, their types and their applications in prediction were reviewed, with particular attention to the prediction of NF properties. Based on the results obtained by researchers for the TC, viscosity, etc., of NFs, many parameters, such as temperature, volume fraction of NPs, NP size, Brownian motion and NP shape, are effective. Since there is no general correlation that can describe the properties of different NFs, methods for predicting these properties are of interest, and the neural network method is one of the most practical. All previous studies proposed the use of neural network methods to predict NF properties, with very low error relative to the experimental results. Unfortunately, although many researchers have shown that neural networks are very effective for predicting NF properties, choosing the type of neural network and its input and output structure remains difficult. Thus, a comprehensive review of the performance of neural networks and their applications was required.

Neural networks are a model for predicting NF properties or heat transfer coefficients in thermal equipment. However, previous studies have rarely provided a single model for predicting the properties of all NFs, and developing such a general model is proposed for future research. In addition, the effects of temperature, volume fraction and NP diameter have been studied individually, but little research considers most of the parameters affecting the properties simultaneously. Therefore, it is suggested that future neural network studies provide models that include all of the effective parameters. It is also suggested to use neural networks to predict the efficiency and heat transfer rate of thermal equipment in order to avoid costly testing. Finally, further refinement of the developed neural network models is suggested to reduce their prediction error.