1 Introduction

Wireless communication has grown steadily in popularity over the last two decades. The last decade in particular saw unprecedented growth in the number of mobile users, in mobile broadband traffic and in user demand for faster data access [1]. Applications such as 3D holography, machine-to-machine communication, virtual reality, e-learning, e-health, video-based services and augmented reality demand more bandwidth. The existing 4G cellular technology already struggles to provide the bandwidth needed to meet the growing data demands of current mobile users, so supporting these more demanding applications is harder still. There is therefore a clear need for a next generation of wireless cellular communication that can handle a greater number of smart devices and a variety of applications beyond the reach of 4G. This need has driven research on Fifth-Generation (5G) wireless communication, which requires new technologies to revolutionise conventional cellular communication.

Wireless communication technology has evolved over several decades, from the voice-only 1G through today's 4G to the upcoming 5G [2]. Each generation has increased the data rate and changed the technology used to achieve it: 1G offered about 2.4 kbps, 2G (based on GSM) offered 64 kbps, 3G offered 2 Mbps, and 4G offers 100 Mbps to 1 Gbps [3]. The main aim of 5G is to provide significantly higher speeds, possibly in the range of 1 Gbps to 10 Gbps, while minimising power requirements so that a considerably greater number of wireless devices and user demands can be supported. The first-generation cellular networks supported only voice transmission, and the devices of that time had very poor battery life. The second generation was a digital cellular technology that supported both voice and text message services (data and voice services) and offered secure and reliable communication. The third-generation cellular networks were commercially introduced in 2001. They brought smartphone services such as email, web browsing, video downloading and picture sharing, and they facilitate a wide range of applications, greater capacity and higher data rates at very low cost. The fourth generation was developed to provide high speed and quality, improved security, multimedia and internet over IP, and very low-cost voice and data services. Mobile web access, IP telephony, high-definition TV, online gaming, 3D television, videoconferencing and cloud computing are all supported by fourth-generation wireless networks.
Although 4G offers far more applications and advantages than 3G, an upgrade to the next-generation wireless technology (5G) is still needed [4]. The applications supported by 4G are all bandwidth-hungry, while only limited bandwidth resources are available. On top of this constraint, the growing number of wireless users and their demand for high-speed services have created strong pressure for a next-generation upgrade. The current 4G cellular network has therefore been analysed, and a transition from an architectural to a conceptual approach has been made, leading to the next-generation 5G networks. 5G is being designed to combine many technologies already found in 4G with many new ones. 5G is thus a merger of technologies such as heterogeneous networks, massive MIMO, the millimetre-wave (mmWave) spectrum (30-300 GHz), cognitive radio networks, D2D communication and many more.

1.1 Motivation

As mentioned above, the changes in architecture, the growing number of wireless devices and the new technologies will place a very large traffic burden on the Base Station, which will have to manage and control this large platform. This will add complexity and delay to connection establishment, session setup and transmission. 5G is expected to replace today's 4G so that next-generation cellular networks can meet the demands of the many devices and applications that 4G cannot handle. It should therefore also find a way to relieve the Base Station of this traffic burden and thereby increase network capacity. One promising and efficient technology for this purpose is a Peer-to-Peer type of communication known as Device-to-Device (D2D) communication [5].

Device-to-Device communication was introduced in 4G LTE. In cellular networks, D2D communication is defined as the direct transmission of data between two devices without the data being relayed by a central Base Station or an Access Point. D2D communication, also referred to as Peer-to-Peer (P2P) communication, enables direct transmission between devices that are in close proximity to each other. D2D therefore allows data traffic to be offloaded from the Base Station, which improves network capacity and reduces transmission delays. D2D services include Peer-to-Peer communication, relaying and proximity services. Applications of D2D include online gaming, content sharing (multimedia such as images, audio and video clips), traffic offloading, disaster relief and much more.

Such an emerging and promising technology, with its wide range of applications, has become the focus of our work. This paper considers D2D communication in 5G and beyond-5G (B5G) networks. Several resource management strategies aimed at enhancing spectral efficiency and throughput and at improving system performance are analysed and summarised.

Although several review works have contributed substantially to research on resource management techniques in D2D wireless communication, a comprehensive survey of recent work on resource allocation techniques in 5G and B5G networks is missing from the literature. This paper first explains the fundamentals of D2D communication. This background helps readers to understand the challenges that must be met when incorporating D2D into wireless communication systems. The ultimate objective of research in this area has always been to improve system performance, which depends on the resource management techniques used. A detailed and up-to-date study of resource allocation and management is therefore necessary to identify research gaps and to frame strategies that improve system performance using recent technology. The majority of existing resource management works in Peer-to-Peer communication implement D2D communication using algorithms that become mathematically complex when applied repeatedly in a dynamic wireless communication system [6].

There is therefore a need for distributed resource allocation techniques that can allocate resources automatically by applying learning techniques, which have a promising future in the world of machines and devices. Although the application of learning techniques to wireless communication has gained considerable attention, it has not yet reached wide use in practical applications. To make readers aware of this emerging strategy, we explain why it is needed and the promising results it can deliver in achieving efficient and automatic allocation of resources. This paper is therefore a novel survey of recent resource allocation techniques in D2D communication that also explains possible solutions and motivates readers by pointing to a future area of focus for enhancing system performance.

1.2 Contribution

The main contributions of this paper are mentioned below:

  1. A detailed explanation of D2D communication is provided, including its classification, applications, advantages and challenges.

  2. A detailed description of resource allocation in D2D communication is provided.

  3. A tabulated description of various optimisation problems is given.

  4. A tabulated review of various resource allocation algorithms is given.

  5. Existing methods in D2D resource allocation are evaluated and compared.

  6. The outcomes of the survey and the open issues identified in the existing literature are discussed.

  7. Our future work is outlined briefly, with a short introduction to Machine Learning.

1.3 Scope of the paper

The application of machine learning algorithms to D2D communication networks offers many advantages that overcome the limitations of existing techniques. This paper strongly emphasises the need for machine learning techniques, which deserve wider attention for the future enhancement of system performance [7]. Machine learning algorithms can help to overcome the following limitations of existing strategies.

  • Existing works in the literature use algorithms that do not exploit the large available data sets, which contain valuable information and patterns.

  • Existing algorithms take longer to make decisions in a dynamic environment because of their mathematical complexity.

  • Future 5G and B5G networks will contain a larger number of devices, which makes traditional centralised resource allocation techniques infeasible.

The following are the main advantages of machine learning in wireless communication, which lead this survey to stress the need for machine learning in a D2D-enabled wireless communication system.

  • Machine learning learns from the large available data sets and then uses the patterns and information it finds to take decisions and actions automatically.

  • Resource management based on machine learning can adapt to the dynamic nature of wireless communication systems.

  • Machine learning hands the management and optimisation functions over to the devices, which allows them to allocate the required spectrum, transmission power and so on by themselves.

This paper thus stresses the need for proper resource management and outlines a future solution framed to overcome the limitations identified in the existing literature, which we hope will motivate further findings in this area.

This paper is organised as follows. Section 2 gives a detailed framework of D2D communication in the wireless communication system, including its classification, applications, advantages, and issues and challenges. Section 3 elaborates on the resource allocation methods applied in D2D communication. Section 4 summarises the review, highlights the open issues and proposes work for future research. Section 5 draws a conclusion.

2 Framework of D2D communication

With the introduction of D2D communication, the network is divided into two tiers, namely the macro tier and the device tier [8]. The macro tier is the traditional cellular communication, while the device tier is D2D communication. As mentioned above, D2D communication is defined as direct communication between mobile devices that are close to each other, in which data is transmitted with only partial involvement of the Base Station or without its involvement at all. Figure 1 gives the system model of D2D-integrated cellular communication. The figure depicts applications of D2D communication such as content sharing, online gaming and multimedia content distribution in the network.

Fig. 1 System model of D2D integrated Cellular communication

In this way, the tremendous data traffic [9] that next-generation wireless communication would otherwise place on the Base Station can be offloaded, increasing overall network performance and capacity. Because of the short communication range between the two devices that constitute a D2D pair, there is a considerable improvement in spectrum and energy efficiency, end-to-end delay and throughput [10].

The classification of D2D communication, its advantages, applications, issues and challenges are explained in the following subsections.

2.1 Classification of D2D communication

Device-to-Device communication can occur in the licensed spectrum, as cellular communication does, or in the unlicensed spectrum, as ad hoc networks do. D2D communication is thus classified into two major types based on the spectrum of operation [11], namely Inband D2D communication and Outband D2D communication, as shown in Fig. 2 [8].

Fig. 2 Classification of D2D communication

The systematic representation of the classification of D2D communication is shown in Fig. 4. The following items provide a brief overview of the D2D types.

  a. Inband D2D: As the name implies, Inband D2D refers to communication between devices in the same licensed spectrum as cellular communication, i.e., the same licensed spectrum is used by both the D2D and the cellular devices. Inband D2D is often preferred over Outband D2D because the network has greater control over the licensed spectrum, which makes interference management feasible, whereas the uncontrolled unlicensed spectrum degrades the Quality of Service (QoS) of the system. Inband D2D communication can be further classified as Underlay and Overlay D2D communication [12, 13]. Underlay D2D is based on spectrum sharing: the spectrum allocated to the cellular users is also allocated to the D2D users, as shown in Fig. 3a. The main issue with Underlay D2D is the interference caused by the D2D users to the cellular users, since the same spectrum is shared. Overlay D2D communication does not involve spectrum sharing. The licensed spectrum is divided into two non-overlapping portions, one allocated to cellular communication and the other to D2D communication, as in Fig. 3b. Because a dedicated spectrum is allocated to the D2D users, Overlay D2D removes the concern of interference between cellular and D2D users, but it can waste spectrum: a portion of the band that is idle cannot be used by or shared with the other group, so limited spectrum resources go unused.

  b. Outband D2D: Outband D2D communication takes place in the unlicensed band, as in ad hoc networks; Wi-Fi Direct, Zigbee and Bluetooth are a few examples. Two devices in close proximity transfer data in the unlicensed spectrum and therefore do not disrupt the cellular communication operating in the licensed band, as shown in Fig. 3c. These technologies provide local services as well as internet access quickly, conveniently and at very low cost in the unlicensed ISM (Industrial, Scientific and Medical) radio bands. Outband D2D thus eliminates interference between the cellular and the D2D users, although interference can still occur between devices operating in the same unlicensed spectrum (for example, the 2.4 GHz ISM band). This type of interference is a threat because interference management in the uncontrolled unlicensed spectrum is very challenging. Connection control and interference management therefore become issues in the unlicensed band. Outband D2D communication can be further classified into two types, controlled and autonomous. In the controlled type, the cellular network controls the interface and is responsible for managing the parameters that control D2D communication, which also helps to improve spectral efficiency, system performance and reliability. In the autonomous type, D2D communication is controlled by the users or devices involved in the communication, while the cellular communication is controlled by the network.
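The trade-off between the Underlay and Overlay variants of Inband D2D can be illustrated numerically. The following is a minimal sketch and is not taken from the surveyed works: it assumes the Shannon capacity formula as the rate model, and the function names (`shannon_rate`, `underlay_rates`, `overlay_rates`) and all parameters are hypothetical.

```python
import math


def shannon_rate(bandwidth_hz, signal_w, interference_w, noise_psd_w_per_hz):
    """Rate in bit/s from B * log2(1 + SINR); noise power scales with bandwidth."""
    if bandwidth_hz <= 0:
        return 0.0
    noise_w = noise_psd_w_per_hz * bandwidth_hz
    sinr = signal_w / (interference_w + noise_w)
    return bandwidth_hz * math.log2(1.0 + sinr)


def underlay_rates(bandwidth_hz, cell_signal_w, d2d_signal_w,
                   interf_at_cell_w, interf_at_d2d_w, noise_psd_w_per_hz):
    """Underlay: both links use the whole band and interfere with each other."""
    cell = shannon_rate(bandwidth_hz, cell_signal_w, interf_at_cell_w,
                        noise_psd_w_per_hz)
    d2d = shannon_rate(bandwidth_hz, d2d_signal_w, interf_at_d2d_w,
                       noise_psd_w_per_hz)
    return cell, d2d


def overlay_rates(bandwidth_hz, d2d_share, cell_signal_w, d2d_signal_w,
                  noise_psd_w_per_hz):
    """Overlay: the band is split into two non-overlapping parts, so there is
    no mutual interference, but each link only sees its own portion."""
    cell = shannon_rate(bandwidth_hz * (1.0 - d2d_share), cell_signal_w, 0.0,
                        noise_psd_w_per_hz)
    d2d = shannon_rate(bandwidth_hz * d2d_share, d2d_signal_w, 0.0,
                       noise_psd_w_per_hz)
    return cell, d2d
```

Under these assumptions, Underlay keeps the full bandwidth for both links at the price of a lower SINR, while Overlay keeps the SINR high but leaves each link with only part of the band. Which variant yields the higher rate depends on the assumed interference levels and on the split `d2d_share`.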

Fig. 3 Spectrum allocations between cellular and Inband and Outband D2D communication: (a) Inband Underlay D2D, (b) Inband Overlay D2D, (c) Outband D2D

Apart from the Inband and Outband types, D2D communication can also be classified by the origin of the connection request and by the transmission type. Based on the request origin, it is classified as network-originated or user-originated: the connection setup is initiated by the network in the former case and by the User Equipment in the latter. Based on the number of receivers involved, communication can be Unicast, Multicast or Broadcast (Fig. 4).

Fig. 4 The systematic representation of the classification of D2D communication
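The classification described in this subsection can also be written down as a small data structure. The sketch below is only one possible encoding; the class and field names (`D2DLink`, `Spectrum`, and so on) are hypothetical and do not come from the surveyed literature.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Spectrum(Enum):
    INBAND = "inband"    # licensed cellular spectrum
    OUTBAND = "outband"  # unlicensed spectrum


class InbandType(Enum):
    UNDERLAY = "underlay"  # spectrum shared with cellular users
    OVERLAY = "overlay"    # dedicated portion of the licensed band


class OutbandType(Enum):
    CONTROLLED = "controlled"  # managed by the cellular network
    AUTONOMOUS = "autonomous"  # managed by the devices themselves


class RequestOrigin(Enum):
    NETWORK = "network"
    USER = "user"


class Transmission(Enum):
    UNICAST = "unicast"
    MULTICAST = "multicast"
    BROADCAST = "broadcast"


@dataclass(frozen=True)
class D2DLink:
    spectrum: Spectrum
    inband_type: Optional[InbandType] = None
    outband_type: Optional[OutbandType] = None
    origin: RequestOrigin = RequestOrigin.USER
    transmission: Transmission = Transmission.UNICAST

    def __post_init__(self):
        # Exactly the sub-type that matches the spectrum must be given.
        if self.spectrum is Spectrum.INBAND:
            valid = self.inband_type is not None and self.outband_type is None
        else:
            valid = self.outband_type is not None and self.inband_type is None
        if not valid:
            raise ValueError("sub-type does not match the spectrum type")

    def shares_cellular_spectrum(self) -> bool:
        """True only for Inband Underlay, where interference with cellular users arises."""
        return (self.spectrum is Spectrum.INBAND
                and self.inband_type is InbandType.UNDERLAY)
```

For example, `D2DLink(Spectrum.INBAND, inband_type=InbandType.UNDERLAY)` describes a user-originated unicast link that reuses cellular spectrum.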

2.2 Applications of D2D communication

D2D use cases can be broadly classified into two categories, namely Peer-to-Peer (P2P) or direct transmission, and relaying. In the P2P type, the D2D users themselves are the transmitting source and the receiving destination, and data is transmitted between the terminals directly without being routed through any central access point. D2D communication aids wireless cellular communication in many ways. D2D application services include local data services, D2D-integrated Internet of Things (IoT) services and disaster management.

2.2.1 Local data services

Local services are proximity-based services. They cover three major applications: proximity-based transmission, where data is transmitted directly between two nearby terminals; traffic offloading; and social applications, where a device is discovered, a connection is established, and data transmission or online gaming takes place. Traffic offloading is the most important local service, as the number of devices and the network density are expected to increase in future 5G and B5G networks [14]. With the growth of multimedia services such as video streaming and HD video transmission, a tremendous amount of traffic is placed on the central access point and the spectrum resources [15]. Downlink traffic can be offloaded by placing media servers in a hotspot area to serve popular media content to users. The local server delivers the multimedia content to the requesting devices in that location, reducing traffic at the Base Station. The increasingly common exchange of large amounts of multimedia content, such as pictures, videos and voice notes, at a gathering can also be offloaded from the Base Station by enabling D2D communication.

2.2.2 D2D integrated IoT services

In the near future of wireless communication, communication links will be established between terminals that are most likely to be machines. Communication between machine terminals is an important characteristic of the IoT. An interconnected wireless network can be created by integrating D2D and IoT features [16]. D2D-enhanced IoT is applicable to Vehicle-to-Vehicle (V2V) communication and other smart devices. A network of vehicles is established so that a vehicle can communicate with nearby vehicles to avoid accidents during lane changes and to alert nearby drivers to an accident or any other obstruction on the road. Vehicles participating in V2V communication can also report the traffic status of a particular route, enabling real-time traffic updates for online map services. Together, IoT and D2D can also help to clear traffic as quickly as possible in emergencies. D2D communication is also advantageous in Wireless Sensor Networks. The IoT, a major concept in the future sensor world, establishes wider connectivity between the devices themselves without the Base Station acting as an intermediate node [17]. Cluster formation in wireless sensor networks, which is supported by the IoT, is also an extension of D2D [18].

2.2.3 Disaster management

D2D communication must be able to provide network access in the absence of cellular communication during natural disasters [19]. Communication can become impossible in disaster-hit regions, and this constraint is more severe for wireless communication. Unforeseen natural calamities such as tsunamis, earthquakes, cyclones and volcanic eruptions cause severe damage to communication systems. Although the disasters that occur in different geographical regions differ, the issues faced are almost the same. The major issues are damage to the communication network infrastructure, energy and power constraints, scarcity of resources, reduced network availability and limited services. Repairing and managing the damaged system takes a long time and cannot be done immediately, which delays rescue operations. The damage in calamity-hit areas can be partial or complete, but the failure of the telecommunication system delays emergency responses, resulting in the loss of many lives. One solution to this problem is to establish wireless communication between two terminals based on D2D communication, in other words to establish an ad hoc network in the network blind-spot areas. Single-hop and multi-hop communication can help users in the disaster-hit area to connect to devices in the coverage area, and through them to the wireless network, thereby easing the rescue operation.
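The multi-hop idea can be sketched as a path search over the devices that are within D2D range of each other. The code below is an illustration only: breadth-first search is our own choice of algorithm rather than one prescribed by the cited works, and the device names, positions and the `d2d_range_m` parameter are hypothetical.

```python
import math
from collections import deque


def build_links(positions, d2d_range_m):
    """Map each device to the devices it can reach directly over D2D."""
    links = {name: [] for name in positions}
    names = list(positions)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(positions[a], positions[b]) <= d2d_range_m:
                links[a].append(b)
                links[b].append(a)
    return links


def shortest_relay_path(links, source, in_coverage):
    """Breadth-first search from `source` to the nearest (in hops) device
    that still has cellular coverage. Returns the path or None."""
    if source in in_coverage:
        return [source]
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in links[node]:
            if nxt in parent:
                continue  # already visited
            parent[nxt] = node
            if nxt in in_coverage:
                # Walk back through the parents to rebuild the path.
                path = [nxt]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None
```

A device with no chain of neighbours leading to the coverage area gets `None`, which corresponds to a user who remains cut off even with multi-hop D2D.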

2.2.4 D2D in rural areas in the absence of base stations

D2D has an important application in connecting rural and urban areas. Rural areas that lack internet access and other facilities capable of supporting 5G and beyond 5G (B5G) can be connected via D2D communication. Because network density is lower, rural areas have the advantage of low latency. The larger number of machines and sensors installed to support the IoT concept also enhances the facilities in remote rural areas. Remote surgery is one of the most prominent examples: the surgery is performed by a highly responsive surgical robot that reacts to the instructions of a surgeon who is thousands of miles away from the patient in the remote area. Autonomous farming is another application, in which D2D supports communication between the sensors installed on the farm and the farm equipment. This type of communication between the sensor devices and the farm equipment does not need a Base Station as an intermediate node for efficient operation. D2D communication can therefore be deployed in rural areas in the absence of a Base Station.

2.3 Advantages of D2D communication

2.3.1 Spectral and power efficiency

Because data is transmitted between the devices directly without being relayed through a central entity, D2D communication exhibits three types of gain, namely hop gain, proximity gain and reuse gain. Direct transmission in P2P/D2D results in hop gain, since no central entity or Base Station is involved in the communication between two devices that are in close proximity to each other. The proximity of the devices also means that very little power or energy is required for transmission, leading to power or energy efficiency [20]. Because of the short transmission distance between the peers, the end-to-end delay is also minimised, which increases throughput and results in proximity gain.

The radio resources that are occupied by cellular communication are opportunistically accessed by devices that are involved in D2D communication. As the resources are being shared between the cellular and D2D communication, the limited and scarce spectrum resources are efficiently utilised [21]. This spectrum sharing property of D2D communication results in Reuse gain. Thus, spectral efficiency is a major advantage of D2D communication.
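The power saving behind the proximity gain can be made concrete with a simple worked example. The sketch below assumes a log-distance path-loss model, which the source does not specify; the path-loss exponent, the reference distance and the function names are hypothetical choices for illustration.

```python
import math


def required_tx_power_w(distance_m, target_rx_power_w,
                        path_loss_exponent=3.5, ref_distance_m=1.0):
    """Transmit power needed so that `target_rx_power_w` arrives at `distance_m`.

    Assumes received power falls off as (ref_distance / distance) ** exponent,
    with unit gain at the reference distance (a deliberate simplification).
    """
    distance_m = max(distance_m, ref_distance_m)
    return target_rx_power_w * (distance_m / ref_distance_m) ** path_loss_exponent


def power_saving_db(d2d_distance_m, bs_distance_m, path_loss_exponent=3.5):
    """Transmit-power saving, in dB, of sending directly to a nearby peer
    instead of sending uplink to the Base Station."""
    direct = required_tx_power_w(d2d_distance_m, 1.0, path_loss_exponent)
    via_bs = required_tx_power_w(bs_distance_m, 1.0, path_loss_exponent)
    return 10.0 * math.log10(via_bs / direct)


# Hop gain: a direct D2D link uses one hop, whereas the cellular route
# needs an uplink hop and a downlink hop.
D2D_HOPS = 1
CELLULAR_HOPS = 2
```

Under this model the saving depends only on the ratio of the two distances and on the assumed exponent, so a peer ten times closer than the Base Station needs 10 × 3.5 = 35 dB less transmit power when the exponent is 3.5.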

2.3.2 Reduced latency

Short-distance communication between devices for data sharing, online gaming, smart traffic monitoring and emergency services during natural disasters requires a fast, real-time network, which D2D communication can provide. The shorter communication distance reduces the time required to transmit data for P2P or one-to-many content distribution. D2D thus offers reduced latency and can be applied to real-time applications in which delay is not acceptable.

2.3.3 Fairness

Devices situated at the cell edge, far from the Base Station, suffer from very poor signal and channel quality, while devices closer to the Base Station enjoy the full benefits of the network, as illustrated in Fig. 5. D2D changes this by enabling cell-edge devices to relay their data through a nearby device that is in coverage of the Base Station and experiences good signal quality, making it a relay-based D2D [22]. D2D thus offers fairness in communication [23]: devices that are out of coverage and far from the Base Station experience the same quality of communication as devices that are close to it, in coverage and with good signal quality.

Fig. 5 Fairness in D2D communication
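One way to quantify this effect is Jain's fairness index, a widely used measure that the source does not name; it equals 1 when all users obtain the same rate. The sketch below is an illustration under simplifying assumptions: the relayed rate is taken as the rate of the weaker hop, ignoring the time split between hops, and all rates are hypothetical.

```python
def jain_index(rates):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), between 1/n and 1."""
    squares = sum(r * r for r in rates)
    if squares == 0:
        return 0.0  # no user receives any rate (or the list is empty)
    total = sum(rates)
    return total * total / (len(rates) * squares)


def relayed_rate(d2d_hop_rate, relay_to_bs_rate):
    """Idealised two-hop rate: limited by the weaker of the two hops."""
    return min(d2d_hop_rate, relay_to_bs_rate)


# Hypothetical rates in Mbit/s: two users near the Base Station and one at the cell edge.
direct_only = [20.0, 18.0, 1.0]
with_relay = [20.0, 18.0, relayed_rate(12.0, 18.0)]

fairness_before = jain_index(direct_only)
fairness_after = jain_index(with_relay)
```

With these assumed numbers, the index rises once the cell-edge user is served through a relay, which is the sense in which relay-based D2D improves fairness.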

2.3.4 Flexible infrastructure

Traditional wireless communication requires a complete infrastructure to be in place to enable communication. Any damage or fault in this infrastructure caused by a calamity can fully disable wireless communication. D2D makes communication possible by setting up an ad hoc-like network even when the infrastructure is disrupted. D2D therefore does not rely completely on a structured architecture, which cannot be guaranteed to function properly at all times. Rather, it creates a conceptual approach for wireless communication.

2.4 Issues and challenges in D2D

Integrating D2D communication into wireless cellular communication will be of great benefit because of its wide range of advantages. However, D2D communication also brings new challenges and design issues [24], which are discussed below.

2.4.1 Device discovery

The initial step in D2D communication is device discovery, also called peer discovery. A device that wishes to establish a D2D connection should be able to find other nearby devices in a short time [25]. Devices can be discovered only if they are nearby and choose to be discoverable. A device that intends to establish communication sends beacon signals to the nearby devices in the network, as shown in Fig. 6 [26]. A device that is ready to establish a D2D connection replies to the beacon signal with its location, channel state information and distance details. Device discovery can be classified into two types, Centralised Discovery and Distributed Discovery.

  • Centralised Discovery: The two devices in a D2D pair discover each other with the participation of a central Access Point or Base Station. A device that intends to transfer data to a nearby device notifies the Base Station. The Base Station then gathers information about channel state quality, location, availability, interference and power control by initiating an exchange of messages between the two devices. Centralised discovery can be further classified by the degree of involvement of the Base Station, which may be complete or partial. If the Base Station is completely involved, the devices are restrained from initiating discovery themselves, and discovery messages between the devices are exchanged only through the Base Station. The Base Station is thus responsible for discovering nearby devices, gathering information about channel quality and initiating D2D communication. With partial involvement, the device that intends to communicate initiates discovery without requesting permission from the central entity or Base Station. The Base Station is not involved in the initial discovery phase; it takes part later, in the exchange of messages that gathers information about the Signal to Interference plus Noise Ratio (SINR) and the channel quality, and in the D2D connection establishment process.

  • Distributed Discovery: Device discovery and connection establishment are carried out entirely by the devices themselves, without the involvement of the Base Station. The device that wishes to initiate communication starts to look for devices in its locality. Beacon signals are sent, and messages are exchanged between the devices regarding their location, channel state information and availability for D2D communication. Because there is no management by a central entity or Base Station, interference and synchronisation problems may arise.

Fig. 6 Device Discovery

To find other nearby devices, a pilot or beacon signal is transmitted. The pilot signal carries scheduling information, which can itself become an issue if the information is inappropriate. The beacon signals are transmitted frequently and repeatedly until an appropriate device is discovered, and this repetition can cause interference to other devices operating in the network. The beacon signals cannot be too infrequent either, since that would delay the discovery process: the status of neighbouring devices keeps changing because of device mobility. Another major issue in D2D communication is synchronisation. All the devices in the network are synchronised with the Base Station, which fixes the scheduling time. When two devices are involved in data transmission and one of them is not within the range of the Base Station, the network has to continuously look for other nearby devices.
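The beacon exchange used in distributed discovery can be sketched as follows. This is a simplified illustration rather than a protocol from the cited works: the `Device` and `BeaconReply` types, the single scalar `channel_gain_db` standing in for channel state information, and the fixed discovery range are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Device:
    name: str
    position: Tuple[float, float]
    discoverable: bool = True
    channel_gain_db: float = 0.0  # placeholder for channel state information


@dataclass
class BeaconReply:
    sender: str
    position: Tuple[float, float]
    channel_gain_db: float
    distance_m: float


def send_beacon(initiator: Device, devices: List[Device],
                range_m: float) -> List[BeaconReply]:
    """One beacon round: every discoverable device in range replies with its
    location, channel state information and distance."""
    replies = []
    for dev in devices:
        if dev.name == initiator.name or not dev.discoverable:
            continue
        distance = math.dist(initiator.position, dev.position)
        if distance <= range_m:
            replies.append(BeaconReply(dev.name, dev.position,
                                       dev.channel_gain_db, distance))
    return replies


def nearest_peer(replies: List[BeaconReply]) -> Optional[BeaconReply]:
    """Pick the closest replying device, or None if nobody answered."""
    return min(replies, key=lambda r: r.distance_m) if replies else None
```

When `nearest_peer` returns `None`, the initiator would have to repeat the beacon later, which is where the trade-off between discovery delay and beacon-induced interference described above arises.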

2.4.2 Mode selection

Traditional wireless communication operates in the cellular mode, where devices communicate with each other only through the Base Station. Since the introduction of D2D communication, however, the modes of communication have been extended to cellular, D2D or hybrid, the last being a combination of both [27], as shown in Fig. 7.

  • Cellular Mode: This is the traditional method of wireless communication. The devices communicate with each other through the Base Station, which acts as a relay between the two communicating devices. All of the resources are used only by the devices operating in the cellular mode. There is therefore no direct communication in this mode.

  • D2D Mode: The devices communicate with each other without the involvement of the Base Station. Based on the spectrum used, D2D mode can be further classified as Reuse mode and Dedicated mode [28].

    • Reuse D2D mode: Reuse mode, also called Underlay D2D mode, reuses the spectrum used by cellular communication. Devices operating in this mode transmit data directly, reusing the uplink and downlink resources of the cellular radio spectrum.

    • Dedicated D2D mode: Dedicated D2D mode, also called Overlay D2D mode, operates in a dedicated spectrum. Data is transmitted directly between the devices in a separate radio spectrum allocated to D2D communication.

Fig. 7 Systematic representation of mode selection in D2D

Supporting several modes of operation increases the complexity of network management, adds load to the network and complicates resource management. The choice of mode depends on the distance between the devices, the channel gain, the QoS and the transmission power. A mode is selected when its channel gain and QoS are greater than those of all other possible modes. The calculations required to select a mode add processing overhead and complexity to the network. How often the modes in a network are switched is also a challenge: with more devices and more channels in a network, frequent mode switching is unavoidable.
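The selection rule described above can be sketched in a few lines of Python. This is only an illustration: the per-mode estimates, the function name `select_mode` and the fallback to cellular mode when no mode is best on both metrics are assumptions of the sketch, not part of the surveyed schemes.

```python
from typing import Dict, Tuple

# Hypothetical per-mode estimates: (channel_gain, qos_score); higher is better.
ModeEstimates = Dict[str, Tuple[float, float]]


def select_mode(estimates: ModeEstimates, fallback: str = "cellular") -> str:
    """Return the mode whose channel gain and QoS both exceed every other mode's.

    If no single mode is best on both metrics, return `fallback`
    (an assumption of this sketch, not a rule taken from the text).
    """
    for mode, (gain, qos) in estimates.items():
        dominates = all(
            gain > other_gain and qos > other_qos
            for other, (other_gain, other_qos) in estimates.items()
            if other != mode
        )
        if dominates:
            return mode
    return fallback


chosen = select_mode({
    "cellular": (0.2, 1.0),
    "reuse": (0.5, 2.0),
    "dedicated": (0.4, 1.5),
})
```

Every call compares each mode against all the others, which hints at why repeated mode selection across many devices and channels adds to the computational load of the network.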

2.4.3 Security and privacy

The devices involved in D2D communication operate in different modes, as mentioned above. After the introduction of D2D communication, the cellular network operates as a mixture of cellular and ad-hoc communication. D2D therefore faces the security and privacy issues that are common to both cellular and ad-hoc communication [29]. The most common security issues are malware attacks, eavesdropping, relay attacks, message altering and node impersonation. Security and privacy are a major challenge because there is no central entity to manage them in the network [30, 31]. D2D thus requires more trusted and secure data transmission. Conventional cryptographic protection of the data against third parties is not practical in D2D communication because the structured infrastructure of a traditional cellular network is absent. Key generation in cryptographic schemes is also power consuming, and it is not advisable to spend a large amount of power solely on security and privacy. A trade-off therefore has to be maintained between security and energy consumption. In D2D communication, devices frequently join and leave the network. Devices that newly join a network must follow its privacy policy, along with the existing members, for the network to perform well in terms of security and privacy.

2.4.4 Interference management

In a cellular network that integrates D2D communication, interference management is a major challenge [32]. D2D communication can take place in a separate dedicated spectrum, but this wastes scarce resources and defeats the main purpose of the move from 4G to 5G. The purpose of D2D communication is to offload data traffic from the Base Station and, at the same time, to increase spectral efficiency through spectrum sharing. Spectrum sharing is therefore the preferred way to use spectrum resources efficiently in D2D communication. However, sharing spectrum between cellular and D2D communication inevitably leads to undesired interference. This interference arises between the two tiers of the new cellular architecture explained above, and it is classified as co-tier or cross-tier interference [33], as shown in Fig. 8.

  • Co-tier interference: Co-tier interference takes place between two devices operating in the same tier. In a communication system consisting of both cellular and D2D communication, it occurs between a transmitting D2D device and a receiving D2D device that share the same spectrum resources and are close to each other. In other words, co-tier interference occurs between neighbouring D2D devices that share the same spectral resources.

  • Cross-tier interference: Cross-tier interference takes place between devices in two different tiers. In other words, cellular devices and D2D devices interfere with each other when D2D communication is incorporated in cellular communication. The source and the destination of the interference differ between two scenarios, uplink and downlink. In the uplink scenario, the cellular device in the macro-tier transmits its data to the Base Station while the D2D device transmits its data to the D2D receiver in the device-tier. The uplink spectrum is shared by the D2D pairs, so interference occurs between the cellular transmitter and the D2D receiver and between the D2D transmitter and the Base Station. The interference at the Base Station is negligible because the power at the Base Station is higher than the power of the interfering signal. In the downlink scenario, the Base Station transmits data to the cellular user and the downlink spectrum is shared by the D2D users. Interference therefore occurs between the transmitting Base Station and the receiving D2D device, and between the transmitting D2D user and the receiving cellular user in the macro-tier.

Fig. 8 (a) Co-tier interference, (b) cross-tier interference in the uplink scenario, (c) cross-tier interference in the downlink scenario

These interferences degrade the system performance and QoS and waste energy and bandwidth. In particular, they reduce the spectral efficiency, which is one of the main benefits of D2D communication in 5G. The undesired interference should therefore be controlled or minimised [34]. The main cause of the interference is poor resource allocation, since the spectrum resources are shared. Proper resource management and allocation can therefore improve the performance of the communication system.

2.4.5 Resource management

Resource management includes interference management and power control. The resource allocation approaches can be centralised, distributed or semi-distributed [35].

In the centralised approach, the Base Station is responsible for allocating and managing the resources. This approach is complex because the Base Station, which already monitors system performance, calculates SINR and channel quality, establishes connections and sets up calls, must also allocate the resources and control the interference in the system. The complexity grows with the number of users, because the Base Station has to collect the necessary information from the devices and measure the channel quality and network performance without any assistance. The centralised approach is therefore suitable only for small cell networks and becomes too complex for larger networks.

The distributed approach, by contrast, does not involve the Base Station in resource management; the devices themselves allocate and manage the resources. This approach is suitable for larger networks with a greater number of devices. The devices frequently exchange messages to gather information about their neighbours. Devices that intend to communicate in D2D mode monitor the cellular resources and opportunistically access the licensed resources of cellular communication. By monitoring cellular communication, they gather information about the channel quality, the SINR and the availability of cellular resources.

Resource allocation is also a means of avoiding undesirable interference in a communication network. Interference is minimised by allocating resources such as spectrum and power efficiently, which is the most challenging issue in D2D communication [36].

The following section gives a complete description and comparative analysis of various resource allocation methods in D2D communication.

3 Resource allocation in D2D

This section explains the existing algorithmic design for resource management and makes a comparative review of the recent works that focus on resource allocation techniques.

As mentioned in the previous section, the issues and challenges of D2D communication include peer discovery, mode selection, security and privacy, interference management and resource management; addressing them with proper research focus will enhance the performance of the system. To maintain the QoS and improve the performance of the system, the spectrum is shared between the cellular and the D2D devices. The background of D2D resource management, the optimisation problems in D2D communication, and the various resource allocation algorithms and methods have been analysed and tabulated. The literature has been reviewed in detail to provide a comparative analysis of the resource allocation methods. This section summarises the recent works and highlights the novelty of the proposed future work by comparing it with the existing works.

3.1 Resource allocation schemes

With the increasing number of devices and bandwidth-hungry applications, the demand for more bandwidth is a challenge for all network operators. Content sharing among devices that are geographically close has also increased greatly. Content distribution requires a large number of resources, and when the content is transferred through the Base Station, a large amount of data traffic accumulates there and the delay increases. These demands cannot be met by current 4G cellular communication, where spectrum resources are scarce. The increased network density and number of devices also place a heavy traffic load on the Base Station, which has to be offloaded to reduce latency and satisfy mobile users. Fairness should be provided to the users in a network irrespective of their location and status, and the power consumption of the devices should be reasonable so that battery life is long. The need for a spectrally efficient and power-efficient [37] system with improved latency, fairness and throughput can therefore be met by D2D communication in 5G, which is advantageous in all of these respects.

The introduction of D2D communication has changed the architecture of traditional cellular communication into a two-tier architecture. The first tier is the macro-cell tier, which is the traditional communication between the Base Station and the cellular devices. The second tier is the device-tier, where communication takes place directly between two devices that are in close proximity to each other. The device-tier is thus D2D communication, while the macro-cell tier is conventional cellular communication.

Resource allocation has gained attention from researchers because poor resource allocation can lead to interference in wireless communication. Inband D2D communication [38], as mentioned in the previous section, has two types of spectrum allocation, underlay and overlay. The latter assigns separate dedicated portions of the licensed spectrum to cellular and D2D communication, thereby eliminating the cause of interference between them. In underlay D2D communication, where the spectrum resources are shared between the cellular and D2D devices, poor allocation and management of resources leads to interference.

Therefore, of all the issues mentioned in the previous section, interference management is the most important. Interference in a communication network can be cancelled by building a model of the interfering signal and subtracting the estimated interference from the received signal. The receiver, whose input contains the desired signal together with the undesired interference, subtracts the estimated interference from the total received signal and thereby extracts only the desired signal.

Proper resource allocation, however, plays a major role in avoiding and reducing interference. In addition to mitigating interference, resource allocation also helps to improve the data rate of wireless communication systems in which the channels are unreliable. Good allocation and sharing of resources leads to efficient use of scarce resources, which serves the purpose of D2D communication, and helps in avoiding the interference caused by resource sharing. Proper resource allocation is thus responsible for mitigating interference in a communication system and for improving the data rate, the throughput and the system sum-rate. These metrics are calculated using the following equations [39,40,41].
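The cancellation idea described above, estimating the interference and subtracting it from the received signal, can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the interferer's symbols are known at the receiver, that the interfering channel gain is estimated perfectly, and that noise is absent; the function name `cancel_interference` is hypothetical.

```python
from typing import List


def cancel_interference(received: List[float],
                        interferer_symbols: List[float],
                        estimated_gain: float) -> List[float]:
    """Subtract a modelled interference signal from the received samples."""
    if len(received) != len(interferer_symbols):
        raise ValueError("sample sequences must have equal length")
    return [r - estimated_gain * s
            for r, s in zip(received, interferer_symbols)]


# Toy example: desired signal plus a scaled interferer (noise omitted).
desired = [1.0, -1.0, 1.0, 1.0]
interferer = [1.0, 1.0, -1.0, 1.0]
true_gain = 0.5
received = [d + true_gain * i for d, i in zip(desired, interferer)]

recovered = cancel_interference(received, interferer, estimated_gain=0.5)
```

In practice the gain estimate is imperfect, so a residual interference term remains after subtraction, which is one reason why interference avoidance through resource allocation is still needed.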

  • Scenario 1: Consider the ith cellular device uploading data to the eNodeB by conventional cellular communication. The Signal to Interference plus Noise Ratio (SINR) between the cellular device and the eNodeB is calculated as in Eq. (1).

$$ {\gamma}_{CU_i eNB}=\frac{P_{CU_i}{g}_{CU_i eNB}}{\sigma^2+\sum \limits_{c=1,c\ne i}^C{P}_{CU_c}{g}_{CU_c eNB}+\sum \limits_{d=1}^D{P}_{D_d}{g}_{D_d eNB}} $$
(1)

where \( {\gamma}_{CU_i eNB} \) is the SINR at the eNodeB, \( {P}_{CU_i} \) is the transmission power of the ith device in cellular mode, \( {P}_{CU_c} \) and \( {P}_{D_d} \) are the transmission powers of the cth cellular device and of the dth device in D2D mode reusing the spectrum of the ith cellular device, \( {g}_{CU_i eNB} \) is the channel gain between the ith cellular device and the eNodeB, \( {g}_{CU_c eNB} \) is the channel gain between the cth cellular device and the eNodeB, \( {g}_{D_d eNB} \) is the channel gain between the dth device in D2D mode and the eNodeB, and \( {\sigma}^2 \) is the noise power. All the other cellular devices c = 1 to C (other than the transmitting ith cellular device) and the D2D devices d = 1 to D cause interference to the transmission of the ith cellular device.

The data rate for the conventional cellular communication is given in Eq. (2) where the SINR is calculated between the ith cellular device and eNodeB.

$$ \mathrm{Cellular}\ \mathrm{data}\ \mathrm{rate},\kern0.5em {c}_{CU_i}= BW\ast {\mathit{\log}}_2\left(1+{\gamma}_{CU_i eNB}\right) $$
(2)
  • Scenario 2: Consider a device Dj transmitting data to another device DR in D2D communication mode. The SINR between the jth device in D2D mode, Dj, and the Rth D2D device, DR, is calculated as in Eq. (3).

$$ {\gamma}_{D_j{D}_R}=\frac{P_{D_j}{g}_{D_j{D}_R}}{\sigma^2+\sum \limits_{c=1}^C{P}_{CU_c}{g}_{CU_c{D}_R}+\sum \limits_{d=1,d\ne j}^D{P}_{D_d}{g}_{D_d{D}_R}} $$
(3)

where the jth device Dj and the Rth device DR are operating in D2D mode, \( {\gamma}_{D_j{D}_R} \) is the SINR between the jth device and the Rth device in D2D mode, \( {P}_{D_j} \) is the transmission power of the jth device, \( {P}_{CU_c} \) and \( {P}_{D_d} \) are the transmission powers of the cth cellular device and of the dth device in D2D mode using the same spectrum, \( {g}_{D_j{D}_R} \) is the channel gain between the jth device and the Rth device in D2D mode, \( {g}_{CU_c{D}_R} \) is the channel gain between the cth cellular device and the Rth device in D2D mode, and \( {g}_{D_d{D}_R} \) is the channel gain between the dth device in D2D mode and the Rth device in D2D mode.

The data rate for D2D communication between the jth D2D device Dj and the Rth D2D device DR is calculated as in Eq. (4), where the Rth device can also be a relay in relay-assisted two-hop or multi-hop D2D communication.

$$ \mathrm{Data}\ \mathrm{rate}\ \mathrm{in}\ \mathrm{D}2\mathrm{D}\ \mathrm{communication},\kern0.5em {c}_{D_j}= BW\ast {\mathit{\log}}_2\left(1+{\gamma}_{D_j{D}_R}\right) $$
(4)
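Equations (1) to (4) share the same structure: desired received power divided by noise plus the summed interference, followed by the Shannon rate. A minimal Python sketch of both scenarios is given below. The function names and all numerical values are illustrative assumptions in linear scale (not dB) and are not taken from the cited works.

```python
import math
from typing import Iterable, Tuple


def sinr(tx_power: float, link_gain: float,
         interferers: Iterable[Tuple[float, float]],
         noise_power: float) -> float:
    """Desired received power over noise plus summed interference.

    `interferers` holds (transmit power, channel gain to the receiver) pairs,
    matching the two sums in the denominators of Eqs. (1) and (3).
    """
    interference = sum(p * g for p, g in interferers)
    return (tx_power * link_gain) / (noise_power + interference)


def shannon_rate(bandwidth_hz: float, sinr_value: float) -> float:
    """Data rate BW * log2(1 + SINR), as in Eqs. (2) and (4)."""
    return bandwidth_hz * math.log2(1.0 + sinr_value)


BW = 180e3  # hypothetical bandwidth in Hz

# Scenario 1: cellular uplink; one other cellular device and one D2D
# transmitter interfere at the eNodeB.
gamma_cu = sinr(1.0, 3.0, [(1.0, 0.25), (0.5, 0.5)], noise_power=0.5)
rate_cu = shannon_rate(BW, gamma_cu)

# Scenario 2: D2D link; one cellular device and one other D2D transmitter
# interfere at the D2D receiver.
gamma_d2d = sinr(0.5, 4.0, [(1.0, 0.5), (0.5, 0.5)], noise_power=0.25)
rate_d2d = shannon_rate(BW, gamma_d2d)
```

With these toy values the interference and noise sum to 1.0 in both scenarios, so the SINR equals the desired received power directly, which makes the effect of each term easy to check by hand.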

The throughput of cellular communication, of D2D communication, and of the whole system with D2D underlaying the cellular network is calculated using Shannon’s formula as in Eqs. (5), (6) and (7) [40].

$$ {T}_{cellular}={\sum}_{c=1}^C{\mathit{\log}}_2\left(1+{\gamma}_c\right) $$
(5)
$$ {T}_{D2D}={\sum}_{d=1}^D{\mathit{\log}}_2\left(1+{\gamma}_d\right) $$
(6)
$$ {T}_{total}={\sum}_{c=1}^C{\mathit{\log}}_2\left(1+{\gamma}_c\right)+{\sum}_{d=1}^D{\mathit{\log}}_2\left(1+{\gamma}_d\right) $$
(7)

where \( {\gamma}_c \) and \( {\gamma}_d \) are the Signal to Interference plus Noise Ratios (SINR) of the cellular and D2D communications, respectively.
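As a small worked example of Eqs. (5) to (7), the following Python sketch sums the per-link spectral efficiencies. The SINR values are hypothetical, and because the bandwidth term is omitted in these equations the result is in bit/s/Hz.

```python
import math
from typing import Sequence


def throughput(sinrs: Sequence[float]) -> float:
    """Sum of log2(1 + SINR) over a set of links, as in Eqs. (5) and (6)."""
    return sum(math.log2(1.0 + gamma) for gamma in sinrs)


cellular_sinrs = [3.0, 1.0]  # hypothetical linear-scale SINRs of cellular links
d2d_sinrs = [7.0]            # hypothetical linear-scale SINR of one D2D link

t_cellular = throughput(cellular_sinrs)  # log2(4) + log2(2)
t_d2d = throughput(d2d_sinrs)            # log2(8)
t_total = t_cellular + t_d2d             # Eq. (7)
```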

The sum rate capacity is given by the Shannon capacity formula as in Eq. (8) [41] using the Bandwidth and SINR.

$$ \mathrm{Sum}\ \mathrm{rate}= BW\ast {\mathit{\log}}_2\left(1+{SINR}_r\right) $$
(8)

where \( {SINR}_r \) is the SINR at the receiver r.

Resource allocation is done either by the Base Station or by the devices themselves. Based on the degree of involvement of the Base Station in resource management, spectrum resource allocation and management can be carried out in two different ways: centralised and distributed resource allocation.

Many research papers have focussed on resource allocation performed jointly with other D2D steps such as mode selection, device discovery, and energy/power allocation. The first step in D2D communication is mode selection, followed by the discovery of a nearby device in order to establish communication with it.

3.2 Types of resource management

The resource management types are classified based on the degree of involvement of the Base Station. Resource allocation can be classified as centralized, distributed, or semi-distributed, as shown in Fig. 9. The centralised resource allocation method involves signal exchanges between the Base Station and the D2D transmitter requesting resource allocation, as in Fig. 9a. The distributed resource allocation method is device-centric: the devices themselves sense the available spectrum, and the Base Station is involved only in freezing the spectrum requested by the D2D devices.

Fig. 9 (a) Centralised D2D resource allocation, (b) distributed resource allocation

As mentioned in the previous section, centralized resource allocation schemes are very effective in managing and controlling the interference in the network. D2D users provide their local channel quality measurements to the Base Station, so the Base Station has knowledge of the Channel State Information (CSI) for the links that terminate at it. The channel gain between a cellular user and the Base Station, and that between a D2D user and the Base Station, are well known at the Base Station because both cellular and D2D devices transmit their CSI to it. However, the channel quality between two D2D devices, and between a cellular user and a D2D device, is difficult for the Base Station to compute. Resource allocation has therefore also been studied for D2D communication underlaying a cellular system with imperfect CSI [39]. When the CSI is not fully known at the Base Station, the condition is called partial CSI. Resource allocation is centralized when it is performed by a central entity such as the Base Station, so partial CSI is a challenge in centralized D2D resource allocation.

In centralized resource allocation, the nodes communicate only with the Base Station and not with one another, so all the information has to pass through the Base Station and the signalling overhead is large. The larger capacity of the future 5G network and the higher complexity make centralized resource allocation less suitable. The complexity and the signalling overhead thus limit the feasibility of centralized methods. To overcome this drawback, a distributed resource allocation procedure is used, in which the resources are allocated to the D2D links in a distributed way [42]. Every D2D link involved in direct communication scans the network for a resource that is not used by any other device and uses that resource for its communication. The D2D devices share local information, such as channel quality, with their near neighbours. A distributed resource allocation method generally finds a near-optimal solution rather than the optimal one.

The main disadvantage of distributed resource allocation is the larger number of message exchanges between the devices. Most of the previous research works are centralized, while works on distributed resource allocation have only recently started to appear.

The following sub-sections describe the optimization problems in D2D communication and the techniques used to address them in more detail. Algorithms such as heuristic algorithms, Lagrangian duality, evolutionary algorithms, steepest descent, game theory, graph theory and fuzzy logic are tabulated and analyzed in Table 2 to give readers a better understanding of the techniques used for resource allocation in D2D wireless communication systems. The papers that have focussed on centralised, semi-centralized and distributed resource allocation techniques are compared, analyzed and tabulated in Table 3. The research gap and a summary of the analyzed techniques are also discussed in detail.

3.3 Optimization problems in D2D communication

D2D communication has to deal with many optimization problems. This section gives a brief description of the optimization issues faced in mode selection, device discovery, and power and spectrum allocation.

A brief analysis and description of the optimization problems are tabulated in Table 1.

Table 1 Optimization problems in D2D communication

3.4 Resource allocation algorithms in D2D

Various resource allocation techniques or algorithms are being used to allocate the resources in D2D communication. A brief description of the existing algorithms has been given in Table 2.

Table 2 Resource allocation algorithms in D2D communication

3.5 Analysis of various resource allocation methods

The various resource allocation algorithms and strategies are analysed in this section. The centralised as well as the distributed resource allocation methods are analysed and tabulated in Table 3.

Table 3 Comparison of various Resource allocation methodologies

Pavan Kumar Mishra et al. [39] have considered a content uploading case of a cell edge device where the uplink cellular resources are reused. The objectives are to minimise the packet loss, the upload time and the number of resources required for media uploading while increasing the throughput. The work focuses on relay selection followed by resource allocation. The relay selection phase chooses a device that is close to the uploading device and that also has good coverage from the Base Station; the device with the maximum SINR and channel quality is selected as the relay. The resource allocation scheme involves the following steps. Initially, the cell edge device sends a request to the eNodeB for the allocation of resource blocks. The eNodeB then sends a packet to the users in the network to measure the channel quality and the modulation and coding schemes (MCS), based on which the resources are allocated. The eNodeB measures the performance of the network and then calculates the reference resource blocks and the reference upload time required for data uploading, as well as the link capacity of the two hops. The first hop is the transmission between the cell edge device and the relay device, while the second hop is between the relay device and the Base Station. Time constraints and resource block constraints are formulated to reduce the number of resource blocks and the upload time required. Under the time constraint the number of resource blocks is reduced, and under the resource block constraint the upload time is reduced. The eNodeB measures the Signal to Interference plus Noise Ratio (SINR) and maps it to the Channel State Indicator (CSI), and it calculates the data rate as well as the number of Resource Blocks required for uploading the data. The simulation compares three methods: D2D based Uploading-Resource-Block Minimization (DBU-RBM), the Max-Min method and the proposed method. The number of Resource Blocks used is plotted against the available Resource Blocks. The results show that the proposed scheme uses fewer resources than the other two schemes, and the paper claims a 40% reduction in the required number of Resource Blocks compared to the DBU-RBM method. The plot of upload time against file size shows that the time required for uploading the data is also reduced when the proposed method is applied.
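The relay selection step can be sketched as follows. The review only states that the device with the maximum SINR and channel quality is chosen, so scoring each candidate by the weaker of its two hops is an assumption of this sketch (a common bottleneck argument for two-hop links), not the criterion of [39]. The candidate names and SINR values are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical candidates: name -> (first-hop SINR, second-hop SINR), where
# the first hop is cell edge device -> relay and the second is relay -> eNodeB.
Candidates = Dict[str, Tuple[float, float]]


def select_relay(candidates: Candidates) -> str:
    """Pick the candidate whose weaker hop has the highest SINR.

    Assumption of this sketch: a two-hop link is limited by its weaker hop.
    """
    if not candidates:
        raise ValueError("no relay candidates available")
    return max(candidates, key=lambda name: min(candidates[name]))


relay = select_relay({
    "R1": (12.0, 5.0),
    "R2": (8.0, 9.0),
    "R3": (15.0, 3.0),
})
```

Here R3 has the strongest first hop but the weakest link to the eNodeB, so the sketch prefers R2, whose two hops are balanced.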

Faisal Hussain et al. [41] have proposed resource allocation algorithms for three different scenarios: “One to One sharing”, “One to Many sharing” and “Many to Many sharing”. The main objective of the paper is to increase the overall system capacity. The authors point out that sharing resources can also decrease the sum rate of the network, so the proposed method shares a resource only if the sum rate after sharing is greater than the sum rate before sharing. A maximum weighted bipartite matching algorithm is proposed for One to One sharing, and separate resource allocation algorithms are proposed for One to Many sharing and Many to Many sharing. In the One to One sharing approach, the resource of one cellular user is shared by one D2D pair. The first step is candidate selection, in which a device that satisfies the selection condition defined in the paper is chosen. For One to Many sharing, two types of assignment are considered: general assignment and restricted assignment. In One to One sharing, the proposed method outperforms Random assignment, the Greedy algorithm, Local search-based Resource allocation (LORA), the Deferred acceptance-based algorithm for Resource allocation (DARA) and the Bipartite algorithm in terms of system sum-rate and interference minimisation. In One to Many sharing, the highest system sum-rate is achieved by the Restricted One-to-many sharing algorithm. In Many to Many sharing, when the normalised system sum-rate is calculated for an increasing number of D2D pairs, the Multiple Allocation D2D (MAD) algorithm outperforms the Graph-colouring based Resource Allocation (GOAL) algorithm.
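The One to One sharing rule, share only when the sum rate improves and choose the pairing with the largest total improvement, can be illustrated with a small Python sketch. It uses an exhaustive search over assignments in place of a proper maximum weighted bipartite matching algorithm, which is practical only for very small instances and is not the algorithm of [41]. The gain matrix and the function name are hypothetical.

```python
from itertools import permutations
from typing import Dict, List, Tuple


def best_one_to_one_sharing(gain: List[List[float]]) -> Tuple[Dict[int, int], float]:
    """Exhaustive maximum-weight one-to-one matching of D2D pairs to cellular users.

    gain[c][d] is the change in sum rate if D2D pair d reuses the resource of
    cellular user c (sum rate after sharing minus sum rate before sharing).
    Pairs are matched only when this change is positive.
    Assumes the number of D2D pairs does not exceed the number of cellular users.
    Returns ({d2d_pair: cellular_user}, total sum-rate gain).
    """
    n_cu = len(gain)
    n_d2d = len(gain[0]) if gain else 0
    if n_d2d > n_cu:
        raise ValueError("sketch assumes no more D2D pairs than cellular users")

    best_assignment: Dict[int, int] = {}
    best_total = 0.0
    for cu_choice in permutations(range(n_cu), n_d2d):
        # Keep only the pairings that actually increase the sum rate.
        assignment = {d: c for d, c in enumerate(cu_choice) if gain[c][d] > 0}
        total = sum(gain[c][d] for d, c in assignment.items())
        if total > best_total:
            best_total, best_assignment = total, assignment
    return best_assignment, best_total


# Three cellular users (rows) and two D2D pairs (columns); values are illustrative.
gain_matrix = [
    [2.0, -1.0],
    [1.5, 0.5],
    [-0.5, 3.0],
]
assignment, total_gain = best_one_to_one_sharing(gain_matrix)
```

For realistic network sizes, a polynomial-time method such as the Kuhn-Munkres (Hungarian) algorithm would replace the exhaustive loop.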

Jun Xu and Chengcheng Guo have proposed resource allocation methods for real-time D2D communication networks in [76]. The objective of the paper is to maximise the overall utility of the packets that are expected to meet their deadlines. The problem is first modelled as a Markov Decision Process (MDP), and based on this model an optimal offline algorithm is proposed for channel and slot allocation. Because the time complexity of the offline algorithm increases with time, an online algorithm for channel and slot assignment is further proposed to reduce the complexity. The online algorithm calculates the cost, and based on the calculated time and utility the system decides which packet is to be accepted. The total utility of the system is compared between the online and the offline algorithms for a varying number of D2D devices, and the offline algorithm is found to perform better than the online algorithm in terms of total system utility. For an increasing number of available channels, the acceptance ratio and the utility of the system are plotted for the optimal online algorithm and the Earliest Deadline First (EDF) algorithm. The acceptance ratio is smaller for the optimal offline algorithm than for the EDF algorithm, and the utility is lower for the online algorithm and higher for the EDF algorithm. In terms of competitive ratio, the proposed online algorithm is found to be closer to optimal. The results have been compared with the existing EDF algorithm, and the optimal online algorithm is reported to outperform this existing algorithm in terms of utility.
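For orientation, the EDF baseline mentioned above can be sketched in Python. This is a generic Earliest Deadline First scheduler under simplifying assumptions that are not taken from [76]: every packet needs exactly one channel for one slot, arrival and deadline slots are non-negative integers, and a packet may be sent in any slot from its arrival up to and including its deadline. It is not the online or offline algorithm proposed in the paper.

```python
from typing import List, Tuple


def edf_acceptance_ratio(packets: List[Tuple[int, int]], n_channels: int) -> float:
    """Fraction of packets delivered by Earliest Deadline First scheduling.

    packets: (arrival_slot, deadline_slot) pairs.
    In each slot, up to `n_channels` pending packets with the earliest
    deadlines are transmitted; packets whose deadline has passed are dropped.
    """
    if not packets:
        return 0.0
    by_arrival = sorted(packets)
    horizon = max(deadline for _, deadline in packets)
    pending: List[int] = []  # deadlines of packets waiting to be sent
    delivered = 0
    idx = 0
    for slot in range(horizon + 1):
        # Admit packets that have arrived by this slot.
        while idx < len(by_arrival) and by_arrival[idx][0] <= slot:
            pending.append(by_arrival[idx][1])
            idx += 1
        # Drop expired packets and order the rest by deadline.
        pending = sorted(d for d in pending if d >= slot)
        delivered += len(pending[:n_channels])
        pending = pending[n_channels:]
    return delivered / len(packets)


demo_packets = [(0, 0), (0, 0), (0, 1), (1, 1)]
ratio_one_channel = edf_acceptance_ratio(demo_packets, n_channels=1)
ratio_two_channels = edf_acceptance_ratio(demo_packets, n_channels=2)
```

The toy example shows the acceptance ratio rising with the number of channels, which is the kind of trend the comparison in [76] plots.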

Pavan Kumar Mishra et al. [77] have proposed and implemented a device-centric resource allocation method in which the devices themselves allocate the resources, which removes a large part of the load from the Base Station. The devices in the network maintain a Resource Occupancy Matrix (ROM), which lists the neighbouring devices, the corresponding resources that are available for allocation and the Channel Quality Index (CQI). The Resource Blocks are then assigned according to the resource allocation scenario, and the resources are allocated to the devices based on the priority at the Base Station side. The simulation results relate the number of devices requesting resources to the success probability of the devices, the load at the Base Station, the time required for resource allocation and the time required for resource allocation with priority. The simulation is carried out under different scenarios: a small number of User Equipments (UEs), a semi-dense network and a dense network. For a small number of devices, the traditional resource allocation scheme is applied, since the Base Station itself is capable of handling so few devices. The First Come First Serve (FCFS) and Best search methods are applied when the number of devices increases. The throughput is plotted against the number of devices for the traditional, FCFS and Best search methods, and the FCFS and Best search methods perform well in a dense network in terms of throughput in Mbps. The paper also claims that the proposed scheme reduces the load at the Base Station by up to 35%. As the method is device-centric, it further claims that the resource allocation delay is reduced by up to 30% and that the network throughput improves compared to previous Base Station centric schemes.
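A device-side allocation based on a Resource Occupancy Matrix might look like the following Python sketch. The ROM layout (one entry per Resource Block with an occupant and a CQI value) and the interpretation of the two policies, FCFS as "take the first free block" and Best search as "take the free block with the highest CQI", are assumptions made for illustration; the review does not specify how [77] defines them.

```python
from typing import Dict, List, Optional, Union

# Hypothetical ROM layout: one entry per Resource Block (RB).
RomEntry = Dict[str, Union[Optional[str], int]]


def make_rom() -> List[RomEntry]:
    """Build a toy ROM: RB 0 is occupied, RBs 1 and 2 are free."""
    return [
        {"occupied_by": "UE1", "cqi": 9},
        {"occupied_by": None, "cqi": 4},
        {"occupied_by": None, "cqi": 12},
    ]


def allocate_fcfs(rom: List[RomEntry], device: str) -> Optional[int]:
    """Assign the first free RB to `device`; return its index or None."""
    for rb, entry in enumerate(rom):
        if entry["occupied_by"] is None:
            entry["occupied_by"] = device
            return rb
    return None


def allocate_best_search(rom: List[RomEntry], device: str) -> Optional[int]:
    """Assign the free RB with the highest CQI to `device`; return its index or None."""
    free = [rb for rb, entry in enumerate(rom) if entry["occupied_by"] is None]
    if not free:
        return None
    best = max(free, key=lambda rb: rom[rb]["cqi"])
    rom[best]["occupied_by"] = device
    return best


fcfs_choice = allocate_fcfs(make_rom(), "D1")
best_choice = allocate_best_search(make_rom(), "D1")
```

On the same ROM the two policies pick different blocks: FCFS stops at the first free block, while Best search scans all free blocks before deciding, which costs more time per allocation.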

In [78], Oleksii Rudenko et al. have focussed on a secure resource allocation method in which the resources for the cellular and the D2D users are allocated in such a way that the system security is not compromised. To solve the security-focussed resource allocation problem, an extensive game-based algorithm is used to strengthen the security of both cellular and D2D communications. The proposed extensive game theory-based resource allocation algorithm is compared with four algorithms: Random Assignment (RA), Gale-Shapley (GS), Kuhn-Munkres (KM) and the Secrecy based access control (SB) algorithm. The total system Secrecy Capacity (SC) is plotted for different propagation loss factors α and for different distances between D2D pairs. The results show that the extensive game theory-based resource allocation gives a higher system SC than the compared methods. The ratio of successfully matched cellular users is also evaluated against the number of D2D users (DUs) and for different numbers of iterations. The proposed extensive game theory method completes the matching successfully in fewer iterations. The resource allocation method is distributed, so the involvement of the Base Station is low.

Mohamed Elsherief et al. [79] focussed on both resource and power allocation. Subcarriers are allocated to the D2D pairs so that every user gets its best subcarrier while rate fairness and the data rate are maintained. The paper uses the iterative Fairness Optimization Resource Allocation (FORA) algorithm, which operates on fairness rather than on channel quality. A water-filling based algorithm is used for power allocation among the spectrum resources allocated to the D2D links. FORA is compared with the Subcarrier Achievable Data Rate algorithm (SAD) and Best Subcarrier Channel State Information Resource Allocation (BSCR) in terms of Jain’s fairness index and spectral efficiency. Jain’s fairness index is evaluated for varying numbers of D2D pairs and varying coherence bandwidth. SAD and BSCR allocate channels based on channel quality, so links with good channel quality are allocated more subcarriers and links with poor channel quality receive fewer. In FORA, the allocation of subcarriers depends on maximising the fairness index, which makes it optimal in terms of fairness. Spectral efficiency is plotted against the ratio of the maximum transmit power per device to the noise power per subcarrier. SAD and BSCR exhibit higher spectral efficiency at the cost of fairness, whereas FORA outperforms both in terms of fairness at the cost of spectral efficiency.
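For readers unfamiliar with the metric, Jain’s fairness index for rates x_1, ..., x_n is (Σx)² / (n·Σx²); it equals 1 when all links receive the same rate and approaches 1/n when a single link takes everything. The short Python sketch below computes it for two made-up rate vectors (the numbers are illustrative and do not come from [79]).

```python
from typing import Sequence


def jain_index(rates: Sequence[float]) -> float:
    """Jain's fairness index: (sum x)^2 / (n * sum x^2)."""
    n = len(rates)
    total = sum(rates)
    squares = sum(r * r for r in rates)
    if n == 0 or squares == 0:
        raise ValueError("need at least one non-zero rate")
    return (total * total) / (n * squares)


equal_rates = [2.0, 2.0, 2.0, 2.0]      # every D2D pair gets the same rate
skewed_rates = [7.0, 0.5, 0.25, 0.25]   # one strong link dominates

print(jain_index(equal_rates))             # 1.0
print(round(jain_index(skewed_rates), 3))  # 0.324
```

Both vectors carry the same total rate, so the example isolates the trade-off described above: a quality-driven allocation can keep the sum rate while scoring far lower on fairness.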

In [80], Pavan Kumar Mishra et al. proposed a hybrid resource allocation algorithm to reduce interference, considering both two-hop D2D communication and cellular communication. Two algorithms are used for resource allocation: Particle Swarm Optimisation (PSO) and a graph-based resource allocation method. The graph-based algorithm first builds the interference matrix, and the PSO algorithm then mitigates the interference in the network. Interference minimisation is thus achieved by PSO, while system throughput maximisation is achieved by the graph-based method. The proposed scheme is found to perform better when the two communicating devices are peers in proximity to each other. In simulation, the graph plus PSO scheme is compared with Random Allocation (RA) and the graph-based resource allocation method. The throughput and interference of two-hop D2D users and cellular users are analysed for various two-hop D2D pairs. The proposed PSO plus graph-based method is reported to perform better than the other two algorithms, with the interference level kept low enough to maintain fair quality for both the cellular users and the D2D users.

The authors in [81] focussed on both mode selection and resource allocation, aiming to improve throughput and minimise interference. Because the Base Station does not have the channel quality information of the cellular-D2D link, the SINR cannot be guaranteed and the QoS cannot be maintained. This uncertainty in the channel state information can be handled by a probabilistic resource allocation method and by gathering feedback based on the user selection. The resource allocation algorithm is therefore integrated with a quasi-convex optimisation algorithm based on the channel probability characteristics.

Pimmy Gandotra et al. [82] proposed a sector-based resource allocation algorithm that achieves satisfactory QoS and Quality of Experience (QoE). The cellular and D2D reuse pairs are chosen based on the channel gain, and the resource blocks are assigned adaptively based on the demand of each user. Taking into account factors such as the channel gain, the sectored antenna at the Base Station, the use of a highly directional antenna, D2D pair formation and the distance constraint, the scheme achieves an efficient allocation with satisfactory QoS and QoE levels. The simulation results are compared with the Hidden Markov Model (HMM), with throughput calculated for different iterations. Because sectored antennas are used, the interference level is reported to be reduced, which increases the throughput of the proposed sector-based resource allocation. The sector-based scheme outperforms the HMM method.

In [66], the authors analysed a multi-cell environment in which the uplink resources of multiple cells are considered. Comparisons and analyses are made for complete and incomplete information available at the Base Station, which is treated as the player in the resource allocation game. The parameters related to the D2D transmission are treated as private information, while other information, such as the probability distribution obtained from past observations, is treated as public information. Each player is therefore unaware of the details of its peers. The static game model is repeated several times, followed by the formulation and implementation of resource allocation. The resource allocation algorithm is implemented under the incomplete information condition, and a non-cooperative game theory model is used for the analysis in the multi-cell environment.

Lucas-Estañ MC and Gozalvez J [83] concentrated on improving network capacity through a resource allocation scheme. The eNB assigns a pool of resources for D2D transmission based on the positions of the cellular and D2D devices and on the spectrum being used by the cellular device. This pool includes both unused spectrum resources and active resources in use by cellular users. The eNB also decides whether the spectrum used by the current cellular device is suitable for the D2D pair to start transmitting while ensuring QoS and minimum interference. The D2D devices then select unused spectrum resources that are not in use by the cellular devices. If a D2D device selects resources that are in use by cellular users, it calculates the interference levels to ensure QoS. The D2D devices select radio resources based on device location and interference levels, without taking into account the transmission power of each user.

Thus, the resource allocation methods used in recent work have been analysed, discussed and tabulated to give a clear view of the work done so far in the field of D2D communication. These algorithms aim to improve fairness, data rate and throughput, and to minimise interference and delay.

4 Discussion

D2D communication operates in two phases: device discovery followed by communication. In the discovery phase, a device searches for devices in its proximity and determines the identity of each peer. This information is announced to the Base Station, followed by several information exchanges. The communication phase includes mode selection, channel estimation, and power and spectrum allocation, and it is in this phase that data transmission takes place. Resource allocation is one of the most important steps in the communication phase.

From the papers reviewed above, most resource allocation work is fully or partially controlled by the Base Station, making the spectrum allocation Base Station centric or semi-Base Station centric. Traditional cellular communication and D2D communication follow the Base Station centric method of allocation; sometimes D2D communication follows a semi-Base Station centric method. In these methods, the Base Station allocates the resources based on the Channel State Information (CSI) and the feedback received from the devices. The drawbacks of Base Station assisted resource allocation include an overall increase in the load at the Base Station, an increase in time consumption and an overall decrease in network throughput.

Thus, to summarise, the device-centric method performs better than the centralised methods applied in conventional cellular communication and existing D2D communication. In a future world of 5G and B5G, where communication is more likely to take place between machines and devices, the allocation of spectrum resources can be made decentralised or distributed. The identified open issues and the proposed future work are discussed in detail in the following subsections.

4.1 Open issues

D2D resource allocation and management techniques have issues and challenges that need attention in the future. Network performance will improve considerably when these issues and challenges are addressed. From this survey, we have identified the issues and challenges listed below.

  • Most current works concentrate on Centralised resource allocation schemes, and only a few address Distributed resource allocation in D2D communication. New Distributed algorithms are needed for further research in D2D resource allocation.

  • Reusing resources to improve spectrum efficiency leads to interference between the cellular devices and the D2D devices, which degrades the performance of the cellular devices.

  • Systems share their resources in both Underlay and Overlay mode, but performance degrades through interference when Underlay resource sharing is used and through reduced spectrum efficiency when Overlay resource sharing is used.

  • The work of resource allocation is rarely considered and solved along with other problems such as power allocation and mode selection to guarantee the QoS of the system.

  • The system model becomes computationally time-consuming and more complex when a resource block is allowed to be shared among multiple users in a multi-cell environment.

  • Resource allocation based on the factors like location and distance between the devices still requires improvement.

  • D2D communication involving relays is worth future investigation because, as the number of devices increases, so does the need to share spectrum among more than two devices.

  • The dynamic behaviour of the devices in the network is also challenging in a real-time communication scenario.

Therefore, to overcome most of the research issues listed above, a distributed resource allocation method is our area of focus for further advancement in D2D communication. Resource allocation performed by the devices themselves is best suited to scenarios where both the amount of data to be transmitted and the number of devices are high. To achieve this, the SINR and channel gains that were previously calculated by the Base Station are calculated by the individual D2D pairs concerned. Each device maintains a table recording all peer devices in proximity and all available resources. The device then selects a suitable resource from this table such that the selected resource does not degrade the performance of the cellular devices. The load on the network is reduced, and communication becomes independent of the Base Station in terms of resource allocation. This method therefore offers a clear advantage and good system performance compared with the existing centralised methods in terms of system throughput and latency.
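A minimal Python sketch of this device-side selection is given below. It assumes each table row holds, per resource block, the D2D pair's received signal power, the interference it receives from the cellular user and the interference it would cause to the cellular user; the key names, the threshold `max_interference` and all numeric values are hypothetical and serve only to make the selection rule concrete.

```python
from typing import Dict, List, Optional


def sinr(signal_power: float, interference: float, noise: float) -> float:
    """Linear-scale SINR = S / (I + N)."""
    return signal_power / (interference + noise)


def select_resource(table: List[Dict[str, float]],
                    noise: float,
                    max_interference: float) -> Optional[int]:
    """Pick the RB with the best local SINR among those that keep the
    interference caused to the cellular user within the assumed limit."""
    best_rb, best_sinr = None, 0.0
    for row in table:
        if row["interference_to_cellular"] > max_interference:
            continue  # sharing this RB would degrade the cellular user
        value = sinr(row["d2d_signal"], row["interference_from_cellular"], noise)
        if value > best_sinr:
            best_rb, best_sinr = int(row["rb"]), value
    return best_rb


# Illustrative table kept by one D2D pair (all values are made up).
table = [
    {"rb": 0, "d2d_signal": 1.0, "interference_from_cellular": 0.1,
     "interference_to_cellular": 0.5},
    {"rb": 1, "d2d_signal": 0.8, "interference_from_cellular": 0.1,
     "interference_to_cellular": 0.05},
    {"rb": 2, "d2d_signal": 0.6, "interference_from_cellular": 0.0,
     "interference_to_cellular": 0.1},
]
print(select_resource(table, noise=0.1, max_interference=0.2))  # 2
```

RB 0 offers the strongest signal but is rejected because it would disturb the cellular user, which mirrors the requirement that the selected resource must not degrade cellular performance.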

4.2 Future research directions

In view of the above-mentioned issues with Centralised resource allocation and the advantages of Distributed resource allocation methods, we have identified Machine Learning (ML) and Deep Learning (DL) as future trends in D2D resource allocation and management. As discussed in the previous Section, practical real-time wireless communications are dynamic in nature, which makes the existing algorithms computationally complex. D2D communication in the complex future 5G and B5G systems can benefit from Artificial Intelligence, which allows devices to intelligently automate their functions, maintenance and management [84]. D2D communication is envisioned to remain in use in future B5G systems, including 6G wireless cellular systems, so that they can support a larger number of devices and a wide variety of applications. We therefore propose applying ML and DL to D2D communication as a solution to the complex radio resource allocation and management problem [85]. ML and DL models collect the required data from the environment, learn from it, and retrain and retune themselves as the environment changes. The proposed Machine Learning technique for D2D communication is briefly explained as follows:

4.2.1 Optimal solution: Machine learning

The solution that we emphasise for overcoming the limitations of the existing resource allocation methods is Machine Learning. Machine Learning is an application of Artificial Intelligence that allows devices to learn, implement and improve automatically without being explicitly programmed. Machine learning algorithms are broadly classified as Supervised and Unsupervised learning; Reinforcement learning is a further category that evolved later. In Machine learning, the machine learns how to execute a task from past experience, tracking a particular performance metric with the objective of improving system performance.

Supervised learning uses labelled examples, or training samples, while unsupervised learning uses samples that are neither labelled nor classified. Reinforcement learning interacts with its environment, produces actions, learns from its previous actions, and discovers errors and rewards. In other words, it is a trial-and-error method that produces an efficient outcome.

Decision-making under unknown network conditions and spectrum sharing for Device-to-Device networks use Reinforcement learning models and algorithms such as the Markov Decision Process (MDP), the Partially Observable Markov Decision Process (POMDP), Q learning and Multi-armed Bandit methods. MDP provides a mathematical model for the decision-making process. At time ‘t’ the process is in a particular state ‘s’, for which an optimal action ‘a’ is selected. At the next time step the process responds by moving into a new state, s’. This transition is described by the state transition probability.
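To make the state transition probability concrete, the Python sketch below encodes P(s’ | s, a) as a lookup table and samples the next state from it. The two-state channel model ("free"/"busy"), the two actions and every probability are hypothetical values chosen only for illustration.

```python
import random
from typing import Dict, Tuple

# Hypothetical two-state model: a resource block is "free" or "busy".
# P[(s, a)] maps each next state s' to the transition probability P(s' | s, a).
P: Dict[Tuple[str, str], Dict[str, float]] = {
    ("free", "transmit"): {"free": 0.9, "busy": 0.1},
    ("free", "wait"):     {"free": 0.7, "busy": 0.3},
    ("busy", "transmit"): {"free": 0.2, "busy": 0.8},
    ("busy", "wait"):     {"free": 0.5, "busy": 0.5},
}


def step(state: str, action: str, rng: random.Random) -> str:
    """Sample the next state s' from P(. | s, a)."""
    dist = P[(state, action)]
    return rng.choices(list(dist.keys()), weights=list(dist.values()))[0]


rng = random.Random(0)   # seeded so that repeated runs give the same trace
state = "free"
for t in range(5):
    # A fixed, hand-written policy: transmit when free, otherwise wait.
    action = "transmit" if state == "free" else "wait"
    next_state = step(state, action, rng)
    print(t, state, action, next_state)
    state = next_state
```

In an MDP the table `P` is known to the decision maker; Q learning, discussed next, targets the case where it is not.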

Q learning is used to determine the optimal action for any given MDP when the system model is unknown. The Q learning model consists of a set of Agents, ‘A’, and a set of States, ‘S’. By taking an action in a particular state, the agent receives a reward, and the main objective is to maximise the accumulated reward. This reward is captured by the “Q-function”, which is updated iteratively as the agent carries out a specific action and gains rewards.

The learning algorithm consists of the components [86] which are described as follows:

  1. AGENT: All the D2D Transmitters that are responsible for resource allocation are the Agents. Each Agent, ‘i’ is responsible for the optimum selection of Resource Blocks from a list of available resources under different network States.

  2. STATE: At a particular time-slot, ‘t’, the learning Agent relies on the environmental conditions to define its state. The State observed by the Agent ‘i’ is given as:

$$ {S}_i^t=\left(i,{n}_i\right) $$
(9)

Where, \( {n}_i \) is the Resource Block (RB) that is assigned to the Agent, ‘i’.

  3. ACTION: The D2D user who is the Agent has ‘N’ number of resource blocks available for communication. The action of an Agent is defined as the selection of a specific resource block.

$$ \mathrm{Action},\kern0.5em {a}_i={n}_i $$
(10)
  4. REWARD: The reward function is defined by the throughput achieved by the D2D user who is the Agent, ‘i’ and is written as \( {R}_i\left({S}_t,{a}_t\right) \). The following constraints are taken into account to determine the reward during the learning process.

    $$ \mathrm{C}1:{I}_{Dn}\le {I}_{TH}\kern0.5em \mathrm{C}2:{Z}_{Dn}\in \left\{0,1\right\}\kern0.75em \mathrm{C}3:{\zeta}_{cn}\ge {\zeta}_{min} $$

Where, \( {I}_{Dn} \) is the interference caused by D2D users sharing uplink spectrum resources, \( {I}_{TH} \) is the maximum interference tolerable by the cellular user, \( {Z}_{Dn} \) is the Binary Decision Variable, \( {\zeta}_{cn} \) is the SINR of a cellular user ‘cn’ operating on RB ‘n’, and \( {\zeta}_{min} \) is the predefined SINR threshold.

$$ {R}_i\left({S}_t,{a}_t\right)=\left\{\begin{array}{ll}{\log}_2\left(1+ SINR(i)\right)& \mathrm{if}\ \mathrm{C}1,\ \mathrm{C}2\ \mathrm{and}\ \mathrm{C}3\ \mathrm{are}\ \mathrm{met}\\ {}-1& \mathrm{otherwise}\end{array}\right. $$
(11)

Where, \( {\log}_2\left(1+ SINR(i)\right) \) is the Throughput and SINR(i) is the Signal to Interference plus Noise Ratio of user, ‘i’.

  5. The optimal spectrum allocation method is derived from the State-Action Q-values.

  6. The learning process is classified into two stages, namely Exploration and Exploitation. The Agent explores various actions in different environment states and updates the Q-table as given by

$$ {Q}_i\left({s}_i,{a}_i\right)=\left(1-\alpha \right){Q}_i\left({s}_i,{a}_i\right)+\alpha \left[{R}_i\left({s}_i,{a}_i\right)+\gamma {\max}_{l\in {A}_i}{Q}_i\left({s}_i^{\prime },l\right)\right] $$
(12)

Where, α is the learning rate of the Agent, ‘i’, γ is the discount factor, \( {A}_i \) is the set of actions available to the Agent, \( {s}_i \) is the current state and \( {s}_i^{\prime } \) is the next state of the Agent, ‘i’. The learning rate, α [84], is given as

$$ \mathrm{The}\ \mathrm{learning}\ \mathrm{rate},\kern0.5em \alpha =\frac{\rho }{visited\left(s,a\right)} $$
(13)

Where, ρ is a positive constant and visited(s, a) is the number of times the state-action pair (s, a) has been visited.

  7. The D2D user who is the learning agent learns the strategy to maximise the reward. The actions in the Exploration stage are selected based on the SINR and the data rates. The actions that meet the constraints are rewarded and those that do not satisfy the constraints are given negative rewards. Under dynamic network conditions, the Agent automatically learns and adapts to the new situation with the objectives of meeting the constraints C1, C2 and C3 and maximising the throughput.

  8. The selection of actions is done in the Exploitation phase and is given as

$$ {a}_i=\arg {\max}_{l\in {A}_i}{Q}_i\left({s}_i,l\right) $$
(14)

Every D2D user, acting as an Agent, thus learns the strategies by which its throughput can be increased.
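The components above can be put together in a short Python sketch for a single Agent. It follows Eqs. (9) to (14) as written, under several stated assumptions: exploration uses an epsilon-greedy rule (the text does not fix the exploration policy), C1 is read as the D2D interference staying within \( {I}_{TH} \) because \( {I}_{TH} \) is defined as the maximum tolerable interference, and every constant and per-RB value below is hypothetical.

```python
import math
import random
from collections import defaultdict

N_RB = 4         # number of resource blocks N (assumed)
GAMMA = 0.9      # discount factor gamma (assumed)
RHO = 1.0        # positive constant rho in Eq. (13) (assumed)
EPSILON = 0.2    # exploration probability (assumed; not given in the text)
I_TH = 0.5       # tolerable interference at the cellular user (assumed)
SINR_MIN = 2.0   # cellular SINR threshold zeta_min (assumed)

# Hypothetical per-RB values: D2D SINR, interference caused to the cellular
# user, and the cellular user's SINR when the block is shared.
D2D_SINR = [3.0, 15.0, 7.0, 31.0]
I_DN = [0.1, 0.2, 0.3, 0.9]
CELL_SINR = [5.0, 4.0, 1.0, 6.0]


def reward(rb):
    """Eq. (11): log2(1 + SINR) if the constraints hold, otherwise -1.

    C1 is read here as I_Dn <= I_TH (an assumption); C2 holds by
    construction because exactly one block is selected per step.
    """
    if I_DN[rb] <= I_TH and CELL_SINR[rb] >= SINR_MIN:
        return math.log2(1.0 + D2D_SINR[rb])
    return -1.0


def q_update(q_sa, r, best_next, alpha, gamma=GAMMA):
    """Eq. (12): blend the old estimate with the bootstrapped target."""
    return (1.0 - alpha) * q_sa + alpha * (r + gamma * best_next)


def greedy(q, state):
    """Eq. (14): action with the largest Q-value in the given state."""
    return max(range(N_RB), key=lambda a: q[(state, a)])


def train(agent, steps, rng):
    q = defaultdict(float)      # Q-table, zero-initialised
    visits = defaultdict(int)   # visited(s, a) counters
    state = (agent, 0)          # Eq. (9): S = (i, n_i)
    for _ in range(steps):
        if rng.random() < EPSILON:
            action = rng.randrange(N_RB)     # exploration
        else:
            action = greedy(q, state)        # exploitation
        next_state = (agent, action)         # Eq. (10): a_i = n_i
        visits[(state, action)] += 1
        alpha = RHO / visits[(state, action)]            # Eq. (13)
        best_next = max(q[(next_state, a)] for a in range(N_RB))
        q[(state, action)] = q_update(q[(state, action)], reward(action),
                                      best_next, alpha)
        state = next_state
    return q


q_table = train(agent=1, steps=2000, rng=random.Random(0))
print(greedy(q_table, (1, 1)))   # greedy RB choice when the Agent holds RB 1
```

With these illustrative numbers, RB 1 is the feasible block with the highest reward, while RB 2 and RB 3 violate C3 and C1 respectively and earn -1. Whether the greedy policy settles on RB 1 within a given number of steps depends on the exploration probability and on how quickly the learning rate of Eq. (13) decays.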

Thus, ML and DL are emerging techniques that promise automatic resource allocation and decision-making [87]. To implement Distributed resource allocation, ML and DL models can be used to predict and calculate wireless radio channel parameters such as path loss and carrier phase shifts through learning. The burden on the Base Station is thereby reduced, as the devices themselves predict and estimate the channel parameters without involving the Base Station, which reduces the complexity. In the current world of machines and Artificial Intelligence, distributed resource allocation has considerable scope for further development in the field of D2D communication.

5 Conclusion

In this paper, an extensive survey of resource allocation in D2D communication has been performed. The analysis of the various resource allocation technologies and methods shows that the Base Station centric method of centralised resource allocation has been the major area of focus in the past. Distributed resource allocation in D2D communication is an emerging area of research that is likely to find scope in the future 5G or Beyond 5G (B5G) environment, where the Base Station load will be very high owing to higher network capacity. The distributed approach, which has been discussed in only a few works, exhibits improved efficiency and reduced time consumption in the resource allocation process. The need to offload data traffic from the Base Station motivates researchers to focus more on D2D communication; this in turn raises the need to reduce interference and the wastage of spectrum resources, which motivates new solutions for distributed resource allocation in D2D communication systems. Therefore, to achieve distributed resource allocation, emerging technologies such as ML and DL will have greater scope in the future.