Keywords

1 Introduction

A cloud provides many resources, in the form of software or infrastructure, and all of this resource management is done by a Cloud Management Broker (CMB). The CMB works as an interface that allocates the resources available on different clouds in response to requests. The choice of cloud resources depends on cost, SLA, QoS, and other parameters. A multi-cloud uses services from more than one cloud; it can be all private, all public, or a combination of both types. Companies use multi-cloud to grow their business and to reduce the risk of data loss.

Machine learning plays a major role in IT applications such as video recommendation, filtering, and social networks. Deep learning, a form of representation learning, also performs well in these applications. Recent deep learning techniques play an important role in many other applications as well, such as visual data processing, natural language processing, and audio and speech processing. Deep learning performs well at feature extraction without human intervention. Deep reinforcement learning (DRL) is the combination of reinforcement learning and deep learning. Reinforcement learning is built on the interaction between an agent and its environment. The state, the action space, the feedback, and the environment are the important components of deep reinforcement learning. The following diagram shows the working of deep reinforcement learning. The main contributions of this paper are as follows.

  • We review the related work of deep reinforcement learning techniques.

  • We discuss the applications of different reinforcement learning techniques.

  • We compare DRL algorithms based on various parameters.

  • We identify various issues and future directions.
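As a rough illustration of the agent-environment interaction described in this introduction, the following Python sketch runs a policy against a toy environment. Everything in it is an assumption made for illustration: the `ToyCloudEnv` class, its accept/reject actions, the reward values, and the 0.3 completion probability are hypothetical and do not come from any of the reviewed papers. A hand-written policy stands in for the neural network that a DRL agent would learn.

```python
import random


class ToyCloudEnv:
    """Hypothetical environment: the state is the number of busy VMs."""

    def __init__(self, capacity=3, seed=0):
        self.capacity = capacity
        self.rng = random.Random(seed)
        self.busy = 0

    def reset(self):
        self.busy = 0
        return self.busy

    def step(self, action):
        """Apply an action: 1 accepts the incoming request, 0 rejects it."""
        reward = 0.0
        if action == 1:
            if self.busy < self.capacity:
                self.busy += 1
                reward = 1.0   # feedback: request served
            else:
                reward = -1.0  # feedback: accepted with no free VM
        # Each busy VM finishes its job with probability 0.3 (assumed value).
        finished = sum(1 for _ in range(self.busy) if self.rng.random() < 0.3)
        self.busy -= finished
        return self.busy, reward


def greedy_policy(state, capacity=3):
    """Stand-in for a learned policy: accept only while a VM is free."""
    return 1 if state < capacity else 0


def run_episode(env, policy, steps=50):
    """Agent-environment loop: observe state, act, receive feedback."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(steps):
        action = policy(state)
        state, reward = env.step(action)
        total_reward += reward
    return total_reward
```

Calling `run_episode(ToyCloudEnv(), greedy_policy)` returns the accumulated feedback for one episode. In a DRL setting, the policy would be adjusted from that feedback instead of being fixed in advance.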

2 Related Work

Handling the incoming workload in cloud and multi-cloud environments requires an efficient technique, and many authors have worked in this area. In [1] the author presents a scheduler named SCARL (Scheduler with Attentive Reinforcement Learning) to handle different resource requirements in a cluster. Experimental results show that this approach gives better results than the others it is compared with. In [2] the author introduces techniques related to cloud intelligence. Traditional approaches are not very efficient in terms of security, optimization, and some other parameters, but nowadays many intelligent techniques, such as neural networks and deep learning, have been developed to deal with cloud computing efficiently. Most of this study focuses on resource utilization, security, and privacy. Many issues related to cloud brokering, VM placement, workflow scheduling, etc. are also discussed. In [3] the author presents an intelligent approach to resource allocation for robotics requests. It is based on reinforcement learning, which helps the cloud decide which requests should be fulfilled and which resources need to be allocated. A semi-Markov decision process is proposed, which helps with the automatic management of resources and reduces human intervention. A comparison with GA shows that the proposed approach gives better results under limited resources and in terms of average return time. In [4] the author proposed a deep learning approach for traffic identification and classification on mobile devices. The paper first provides a taxonomy of network traffic analysis; second, it shows that deep learning is a good approach for traffic analysis; third, it proposes a model based on deep learning to obtain these benefits; and finally, it gives some future directions. In [5] the author proposed a Service Level Agreement (SLA) framework to meet the QoS requirements of all cloud users.
This framework includes a reinforcement learning approach for adapting to changes in the requirements of cloud requests. Previous approaches degrade in performance when a request changes its requirements, but the proposed approach works well in a dynamic environment. The experimental results show that it meets the QoS requirements better and avoids SLA violations. In [6] the author discussed a resource management technique based on reinforcement learning. The Cloud Management Broker (CMB) is responsible for fulfilling users' requests and giving benefits to cloud users, so the proposed technique, which is based on reinforcement learning, helps the CMB gain profit. The author proposed two algorithms, SARSA and Q-learning. The experimental results show that Q-learning is more efficient in terms of computation and SARSA is more efficient.
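To make the difference between the two algorithms named above concrete, here is a minimal sketch of the standard tabular update rules for Q-learning and SARSA. It is not the implementation from [6]. The state names, action names, learning rate `alpha`, and discount factor `gamma` are hypothetical values chosen for illustration.

```python
import copy


def q_learning_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action in the next state."""
    best_next = max(q[s_next].values())
    q[s][a] += alpha * (r + gamma * best_next - q[s][a])


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    q[s][a] += alpha * (r + gamma * q[s_next][a_next] - q[s][a])


# Hypothetical Q-table: states are load levels, actions are target clouds.
q_table = {
    "low_load": {"cloud_a": 0.0, "cloud_b": 0.0},
    "high_load": {"cloud_a": 1.0, "cloud_b": 2.0},
}

q_for_q_learning = copy.deepcopy(q_table)
q_for_sarsa = copy.deepcopy(q_table)

# Same transition for both: reward 1.0, moving from low_load to high_load.
q_learning_update(q_for_q_learning, "low_load", "cloud_a", 1.0, "high_load")
sarsa_update(q_for_sarsa, "low_load", "cloud_a", 1.0, "high_load", "cloud_a")
```

With these assumed numbers, Q-learning moves the entry for (`low_load`, `cloud_a`) to about 0.28 because it uses the best next value (2.0). SARSA moves it to about 0.19 because it uses the value of the action actually chosen next (1.0).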

In [7] the author designed a deep learning-based prediction algorithm for cloud workload (L-PAW). The author proposed two components: first, a Top Sparse Auto-encoder (TSA) is used to extract the essential features of the workload, and next it is combined with a GRU (gated recurrent unit) to accurately predict variable workloads. The experiment was done on Alibaba and Google data center traces. Results show that L-PAW gives excellent accuracy in workload prediction when compared with other approaches. In [8] the author developed an approach for resource provisioning in the cloud environment. Resource provisioning means allocating the required resources, but this depends on the current workload, so the future workload must be predicted in order to allocate the required resources. The author proposed an approach that combines autonomic computing and reinforcement learning. The results show that the proposed method improves resource utilization by 12% and reduces cost by 50% compared with other techniques. In [9] the author suggests an approach for resource scheduling based on traffic identification. The author proposed a virtual network framework for request scheduling that works in two steps: the first is traffic identification and the second is path selection. An mGBDT model is used for traffic identification based on the QoS and behaviour of the request, and after the traffic is identified, the DRL model selects an efficient path for the request. The experimental results show that the framework improves the quality of service and that the DRL approach with the POKTR algorithm gives better throughput and reduces network congestion. In [10] the author proposed an online reinforcement learning approach for task scheduling in a cloud environment. The author explained that a cloud broker, which works as an intermediary with the cloud environment, works in three steps: task transmission, task allocation, and task execution.
In the first step, all tasks are kept in a global queue; after that, with the help of reinforcement learning, each task is allocated to the queue of a particular VM. Finally, the task is processed on that VM. The author proposed an approach based on stochastic approximation for allocating tasks in a dynamic environment. The experimental results show that the proposed approach outperforms the others it is compared with.
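The three-step flow described above (global queue, allocation to a VM queue, execution) can be sketched as follows. This is not the stochastic approximation method of [10]: it swaps in a simple epsilon-greedy rule with an incrementally updated value estimate per VM. The function name `schedule`, the parameters `epsilon` and `alpha`, and the reward definition (negative backlog of the chosen VM) are all assumptions made for illustration.

```python
import random
from collections import deque


def schedule(task_lengths, vm_speeds, epsilon=0.1, alpha=0.5, seed=0):
    """Allocate tasks from a global queue to per-VM queues."""
    rng = random.Random(seed)
    n_vms = len(vm_speeds)

    # Step 1: task transmission, all tasks wait in one global queue.
    global_queue = deque(enumerate(task_lengths))

    vm_queues = [[] for _ in range(n_vms)]
    backlog = [0.0] * n_vms  # time each VM needs to finish its queue
    value = [0.0] * n_vms    # learned estimate of the reward per VM

    while global_queue:
        task_id, length = global_queue.popleft()

        # Step 2: task allocation with an epsilon-greedy choice (assumed rule).
        if rng.random() < epsilon:
            vm = rng.randrange(n_vms)                 # explore
        else:
            vm = max(range(n_vms), key=value.__getitem__)  # exploit

        # Step 3: task execution, modelled only as added backlog time.
        vm_queues[vm].append(task_id)
        backlog[vm] += length / vm_speeds[vm]

        # Feedback: a longer backlog on the chosen VM gives a lower reward.
        reward = -backlog[vm]
        value[vm] += alpha * (reward - value[vm])

    return vm_queues, backlog
```

For example, `schedule([4, 2, 6, 1, 3, 5], [1.0, 2.0])` returns one list of task indices per VM together with the time each VM needs to drain its queue. Because the reward penalises long backlogs, the value estimates push later tasks toward less loaded VMs.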

In [11] the author surveys deep learning algorithms, techniques, and applications. Deep learning has changed the way of working in the field of IT. The author also discusses some weaknesses of deep learning and gives some future directions to make it more efficient. In [12] the author develops an approach for fog computing. Fog works as an intermediary between cloud users and cloud providers, so resource management is an important concept at the fog level. Many algorithms have been proposed for resource management, but they address only a subset of parameters such as response time, energy consumption, network bandwidth, and latency, not all of them simultaneously. The author therefore proposed the ROUTER technique to handle these parameters simultaneously. Results show that this approach reduces response time by 10%, energy consumption by 12.35%, and network bandwidth by 12%. In [13] the author discussed Virtual Network Embedding (VNE). VNE is a challenging concept in cloud computing: it needs an intelligent solution for resource management, and AI-related concepts need to be improved in this field. The author therefore proposed a prediction model based on reinforcement learning, called Multi-stage Virtual Network Embedding (MUVINE), for data centers in the cloud. The author used a SARSA reinforcement learning agent, which helps to embed the VM on suitable SNs. The binary VN works as a classifier of the incoming request, and RBR is used to predict the features. Overall, the MUVINE approach for embedding VNs on stable SNs outperforms others in terms of the time domain and users' requests. In [14] the author discusses virtual network embedding, which is important for proper resource utilization. Previous techniques were static and not very efficient at finding an optimal solution. The author proposed a reinforcement learning approach to virtual network embedding that uses historical data and gives an optimal solution.
The comparison was made with two other algorithms, and the proposed technique was found to outperform them. In [15] the author discusses intelligent resource management with the help of deep reinforcement learning. The intelligent resource manager consists of three components: a controller, a monitor, and an allocator. The controller collects the request requirements and schedules the resources using an efficient algorithm from the resource scheduling algorithm pool, the monitor checks the availability of resources as per the user request, and the allocator allocates the requested resources. The DRL algorithm discussed in this paper is an online algorithm that always gives an optimal resource selection. The author proposed a DQN named SA Q network (SAQN), which combines the features of SA and Q-learning and helps to find a near-optimal solution. SAQN helps to reduce the number of iterations needed to find the optimal allocation of resources.

In [16] the author discusses an intelligent method for predicting the incoming workload in a cloud environment. Cloud workload prediction is a very important task for the cloud provider in order to satisfy customers and reduce energy consumption. A deep learning model based on canonical polyadic decomposition is proposed. The proposed approach compresses the parameters to make training more efficient. The results show that resources can be allocated on VMs in advance with the help of workload prediction. In [17] the author discusses fast response time in the Internet of Things (IoT). Traditional approaches and deep learning approaches to handling users' requests are time-consuming. To handle this issue, the author proposes a broad reinforcement learning approach, which makes service providers very fast and efficient. Results show that it takes fast actions more efficiently than other approaches. In [18] the author notes that the cloud provides many services such as storage and networking, but that the most important thing is proper utilization of cloud resources and workload management. The author proposed a Reinforcement Learning based Enhanced Resource Allocation and Workload Management (RL-ERAWM) approach for allocating resources and managing the workload. It uses Q-learning, which considers the arrival rate of requests and the VM workload. The proposed approach performs better in terms of VM utilization, response time, and makespan. In [19] the author discusses job scheduling in a cloud environment, which means allocating jobs to the proper resources to meet the QoS requirements of users. The author develops a framework for service providers, named an intelligent QoS-aware job scheduling framework, which helps to schedule jobs on a suitable VM. The proposed approach reduces average response time by up to 40.4%, achieves a high level of QoS, and can work under different workload conditions.
In [20] the author discusses task scheduling in cloud computing. Better task scheduling helps to get good results in all aspects of cloud computing, so the deep-Q-learning-based heterogeneous earliest-finish-time (DQ-HEFT) algorithm is proposed. It is closely related to deep reinforcement learning for task scheduling. The experimental results show that it achieves better makespan and speed with a high volume of data compared with the existing workflow scheduling algorithm (Table 1).
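Several of the works above report makespan, average response time, and VM utilization. As a hedged reference for how such metrics are commonly computed from a finished schedule, the sketch below uses the usual textbook definitions. The reviewed papers may define them differently. The function name, the tuple layout, and the assumption that the schedule starts at time 0 are all illustrative.

```python
def schedule_metrics(jobs, n_vms):
    """Compute (makespan, average response time, VM utilization).

    jobs: list of (vm, arrival, start, finish) tuples, times in one unit.
    Assumes the schedule starts at time 0.
    """
    # Makespan: time at which the last job finishes.
    makespan = max(finish for _, _, _, finish in jobs)

    # Response time of a job: from its arrival until it finishes.
    avg_response = sum(finish - arrival
                       for _, arrival, _, finish in jobs) / len(jobs)

    # Utilization: busy VM time divided by total available VM time.
    busy_time = sum(finish - start for _, _, start, finish in jobs)
    utilization = busy_time / (n_vms * makespan)

    return makespan, avg_response, utilization


# Hypothetical schedule: three jobs on two VMs.
example_jobs = [
    (0, 0, 0, 4),  # VM 0: arrives at 0, runs from 0 to 4
    (1, 0, 0, 2),  # VM 1: arrives at 0, runs from 0 to 2
    (1, 1, 2, 5),  # VM 1: arrives at 1, waits, runs from 2 to 5
]
makespan, avg_response, utilization = schedule_metrics(example_jobs, n_vms=2)
```

For this assumed schedule the makespan is 5, the average response time is 10/3 (about 3.33), and the utilization is 9/10 = 0.9.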

Table 1. Review of reinforcement learning techniques based on various parameters

These techniques give good results in resource allocation, which covers many phases such as workload prediction, handling heterogeneous user requests, etc.

The table above summarizes the work done by many authors in the field of reinforcement learning. Parameters such as SLA violation, RA, cost, RT, and RP are included to show which parameters each author focuses on.

3 Conclusion

Machine learning is becoming more and more popular in research areas such as data mining, image processing, text classification, and video recommendation. Alongside machine learning, deep reinforcement learning, known as a form of representation learning, is used in these applications. Many recent deep reinforcement learning techniques give better results in visual data processing, natural language processing, audio and speech processing, and many related applications. This paper therefore presents a review of various deep reinforcement learning techniques in the field of cloud and multi-cloud. Parameters such as SLA violation, RA, cost, RT, and RP are included to identify the particular focus of each author's research. The review shows that deep reinforcement learning in the field of cloud and multi-cloud is used for resource allocation, task classification, future resource prediction, workload prediction, etc. The future work proposed by the authors is also shown in the table above, which should inspire researchers to work in the field of deep reinforcement learning.