1 Introduction

In recent years, Internet of Things (IoT) services such as Siri have become progressively more widespread. With rapid developments in wireless networks and data processing, the earlier Internet environment has become enhanced with various “things.” Wireless communication and near-field communication networks, and cloud computing, have rendered personal mobile devices increasingly intelligent. In the IoT, existing mobile devices and embedded platforms communicate to provide useful services [1,2,3,4].

Most IoT devices are mobile devices with both inbuilt and external sensors monitoring ambient conditions, gravity, orientation, and acceleration; they also serve as GPS receivers to provide spatial and temporal measurements of the local environment. If IoT services are to be stable, two critical problems must be overcome, both of which depend on device performance. The first is that both power and data storage must be adequate for IoT computations; storage causes battery drain. To overcome this problem, IoT computation and storage are fully offloaded to remote computing resources such as a grid and/or the cloud. Second, device mobility may render network connectivity and device availability unstable. Such uncertainty is attributable to unpredictable node mobility, varying rates of battery drain, hardware failures, and lack of a priori knowledge on the performances of various mobile hardware and software platforms [5,6,7,8].

Most current IoT services provide their own platform. However, IoT services need to be combined with other services to form a single integrated service sharing sensed data and integrating management, a development that is compromised by the lack of power and random mobility of personal devices [9,10,11,12,13]. It is essential to develop integrated platforms for IoT services that exchange and manage data from heterogeneous sources. Several problems are encountered when integrating and managing the explosion in IoT services; these involve the types of sensor devices used, the form of data transmission, and the types of computation employed.

Although each IoT service has its own distinctive features, many share similar characteristics. For example, healthcare IoT services fulfill a variety of needs, but they all use similar sensors and modes of analysis. Despite these commonalities, the systems are independent, and it is difficult to integrate them. When IoT service platforms are not integrated, accessibility is reduced, data complexity increases, and computation modules are duplicated [14, 15]. Integrating IoT services into a single platform requires that heterogeneous devices and services collaborate and combine in a manner allowing uniform management. Reuse of computational modules in integrated IoT platforms reduces the complexity of grid/cloud manipulations and the network offloading overhead on the system side, and renders the development of new IoT services easier on the developer side. However, to ensure reusability, IoT services must be classified and clustered on the basis of similarities.

To solve these issues, we present a clustering system derived using the EM algorithm. Before clustering existing IoT services, we define IoT services by their operative characteristics: sensing, data management, processing, and execution.

  • Sensing step We classify devices by their characteristics;

  • Data management step We classify how data are preprocessed on the end device, the local manager, or the server;

  • Processing step We classify the computational models of data analysis, data manipulation, and decision-making;

  • Execution step We classify how services are executed.
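The four steps above can be read as the fields of a per-service record. The following is a minimal sketch under that reading; the class name `ServiceProfile`, the helper `shared_steps`, and all labels are hypothetical and do not come from the baseline dataset. The match count is only a crude illustration of comparing services step by step and is not the EM-based similarity used in this paper.

```python
from dataclasses import dataclass

STEPS = ("sensing", "data_management", "processing", "execution")

@dataclass(frozen=True)
class ServiceProfile:
    """One IoT service described by its four operative steps (hypothetical labels)."""
    name: str
    sensing: str          # class of sensing device
    data_management: str  # where data are preprocessed
    processing: str       # computational model
    execution: str        # how the service acts

def shared_steps(a, b):
    """Count the steps on which two services carry the same label."""
    return sum(getattr(a, step) == getattr(b, step) for step in STEPS)

thermostat = ServiceProfile("thermostat", "frequency", "server", "threshold", "command")
air_monitor = ServiceProfile("air_monitor", "frequency", "server", "regression", "report")
print(shared_steps(thermostat, air_monitor))  # 2
```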

We then use the EM algorithm to calculate similarities among IoT services. Our system is scalable and flexible, and can easily accept new IoT services. To validate the efficacy of the system, we implemented it using the baseline dataset developed in our recent work [16]. The principal contributions of this paper are:

  1. We analyze the existing approaches toward IoT service classification. We define the detailed operative steps of IoT services in terms of their characteristics;

  2. We propose a clustering system for IoT services, derived using the EM algorithm. This algorithm is the most effective technique available for appropriate probabilistic clustering. Additionally, the algorithm easily recognizes categorical and continuous attributes without requiring distance specifications;

  3. To validate the efficacy of the system, we implemented it using the baseline dataset developed in our recent work [16]. This dataset consists of 37 IoT services that perform their own computations. Although the dataset is small, we derived it by surveying over 100 commercial IoT services in current use.

The rest of the paper is organized as follows: In Sect. 2, we discuss several related works on IoT architecture, the uncertainty of the IoT environment, and previous IoT service classification schemes. In Sect. 3, we present the system architecture that we use to develop an integrated platform on the public cloud. In Sect. 4, we describe our IoT service classification and clustering system. In Sect. 5, we evaluate the proposed system by surveying over 100 commercial IoT services in current use. Finally, Sect. 6 contains our conclusions.

2 Related works

2.1 IoT architecture

Recently, the IoT service environment has focused on communication and interaction among different devices. The architecture of an IoT service basically consists of three layers: a sensing layer, a network layer, and an application layer.

  • Sensing layer Sensing devices such as RFID tags or smartphones;

  • Network layer The collected data are transmitted, communicated, and processed;

  • Application layer The IoT engages in processing and/or execution.

Our service-oriented platform provides various IoT services involving large numbers of service operations such as monitoring, discovery, and service classification [10, 12, 14, 15]. Figure 1 shows the three-layered architecture of the IoT environment.

Fig. 1 Basic IoT architecture

2.2 Uncertainty

Developments in wireless networks and devices are associated with changes in IoT services from fixed to mobile nodes. In addition, many high-level, real-time applications are being developed. Therefore, mobile edge computing (MEC)-based IoT services are required to reduce the response time of the central cloud server. Figure 2 shows an MEC environment consisting of various mobile networks and mobile devices [21,22,23].

Fig. 2 Overview of an MEC environment

Although MEC reduces response time, the changed IoT environment introduces several uncertainties that hinder the provision of stable IoT services [5,6,7,8].

  • The mobility of the mobile node cannot be predicted;

  • The battery of the mobile device can become suddenly exhausted or discharged;

  • Network failure may occur because of the low communication bandwidth of the wireless network;

  • Communication and computation must deal with the heterogeneous platforms of mobile nodes.

Thus, self-provisioning and self-recovery mechanisms are essential to ensure rapid responses and the high-level service quality of real-time services such as autonomous vehicles, emergency aid, and object recognition and tracking.

2.3 IoT service classification

Previous classifications of IoT services used several criteria such as the components employed, the services provided, device power, and sensor information. Sammarco et al. [17] classified IoT devices in three ways. Level 1 was based on storage and power availability, and included identification and sensing devices such as simple sensors, and passive and semi-passive RFID tags. Level 2 was based on connection methods, including ad hoc connections between sensors and wireless devices (examples: active tags and Zigbee full-function devices). Level 3 was based on the communication method used (connections between wireless devices and the Ethernet; IP/non-IP-based and Bluetooth-based devices).

In Thoma et al. [18], an IoT service allowed or blocked user access and managed sensing, action, and identification. The sensing layer was simple (for example, light, wind, or humidity). The action layer was either simple (such as on/off) or complex. The identification layer was a combination of vision, service description, and service ID.

Ning and Hu [19] classified IoT services into four types. The low-level type was a set of sensors (access devices or resources, end-mobile devices). A resource service was a set of devices managing sensors such as regional supervisors and IoT gateways. An entity service was a single service consisting of sensors and management devices. This was the core IoT service blending low-level services with resource services such as Amazon Echo. An integrated service consisted of several entity services organized in groups within an IoT service environment (smart home, smart building).

Zhu et al. [15] suggested that an integrated IoT platform should be used to share both collected and analyzed data. The platform was based on the common cloud and combined several single services.

Figure 3 shows sensor-based biomedical IoT services, indicating why cloud-based computations are needed.

Fig. 3 Example of cloud-based computation of biomedical IoT services

Kelly et al. [20] classified IoT services in terms of power processing and proposed the use of an IoT gateway to solve the battery drain problem of low-power non-IP sensor devices that continuously communicate with others.

3 System architecture

To develop a common platform for various IoT services, we wish to place an integrated platform on the public cloud. The system environment of this study is shown in Fig. 4.

Fig. 4 System environment

The system environment consists of four layers:

  • Sensor The device that collects data;

  • Service environment The environment of sensors and coordination devices;

  • Service cluster The cluster of services within the same service environment;

  • Platform The integrated system managing various service clusters.

In the sensor layer, sensors collect data on temperature, pressure, heat, light, and sound. Sensors may provide low-level services such as alerts. If only battery power is available, high-level services (complex analysis) may be difficult to provide, so sensors must offload data to the cloud. The service environment is the local area in which a complex service is provided. For example, Amazon Echo uses local sensors and sends the appropriate response/action back to the user. A service cluster is a group of services with similar functions. For example, a smart home service cluster could consist of Amazon Echo, Apple HomePod, and Google Home. Our integrated platform includes several clusters (smart home, smart fitness, and smart factory service clusters) and facilitates reuse of the configuration modules of each service.
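The reuse argument can be made concrete with a small sketch: within one service cluster, any module needed by more than one service is a candidate for sharing on the platform. The function `reusable_modules`, the service names, and the module names below are hypothetical placeholders chosen for illustration; they do not describe the internals of any commercial product.

```python
from collections import Counter

def reusable_modules(cluster):
    """Return the modules used by more than one service in a cluster, sorted by name."""
    counts = Counter(m for modules in cluster.values() for m in set(modules))
    return sorted(m for m, c in counts.items() if c > 1)

# A hypothetical service cluster: service name -> configuration modules it needs.
example_cluster = {
    "service_a": ["speech_to_text", "intent_parser", "device_control"],
    "service_b": ["speech_to_text", "intent_parser", "audio_tuning"],
    "service_c": ["speech_to_text", "device_control"],
}
print(reusable_modules(example_cluster))
# ['device_control', 'intent_parser', 'speech_to_text']
```

Under this assumption, only `audio_tuning` would remain private to a single service; the other three modules could be provided once by the platform.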

4 IoT service classification and clustering

In this section, we present the classification criteria and our clustering algorithm for integrated management and enhanced reusability of the configuration modules of IoT services. Based on the service layers mentioned in Sect. 3, we divide the operation of each IoT service into four steps: sensing, data management, processing, and execution (Table 1).

Table 1 Classification criteria for sensing devices

4.1 Sensing step

Various sensors are used to collect input data for IoT services. These sensors perform either simple or complex actions. Sensors may perform identifications, communicate with other sensors, and communicate with servers. Here, we focus on sensor power, transmission, and operation. Table 1 classifies sensing devices in this manner.

Table 2 Classification criteria for data management

The first criterion assesses sensor power. A self-powered sensor runs on its own battery and can operate as a stand-alone sensor. An AC/DC sensor runs on AC/DC power or on its own battery, which the user recharges (smart lights, smartphones). Self-recharging sensors include smart cleaners. Auto-recharge sensors have inbuilt charging systems that harvest, for example, solar or movement-derived power. The second criterion covers sensor transmission (IP-based, or non-IP-based such as GPS and Bluetooth). The last criterion focuses on operation. An event sensor remains dormant until it is woken by an event (sound or movement). A frequency sensor (such as a temperature sensor) is always on. An event-to-frequency sensor remains dormant until woken up and then operates for a certain period. A timer-controlled sensor operates during scheduled cycles.
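The three sensing criteria are categorical, so one way to hand them to a clustering routine is to encode each category as a 0/1 indicator. The sketch below assumes such a one-hot encoding; the enum and function names are hypothetical, and the paper does not state how its attributes are encoded.

```python
from enum import Enum

class Power(Enum):
    SELF_POWERED = 0
    AC_DC = 1
    SELF_RECHARGING = 2
    AUTO_RECHARGE = 3

class Transmission(Enum):
    IP = 0
    NON_IP = 1

class Operation(Enum):
    EVENT = 0
    FREQUENCY = 1
    EVENT_TO_FREQUENCY = 2
    TIMER = 3

def one_hot(member):
    """Return a 0/1 list with a single 1 at the member's position in its enum."""
    return [1 if m is member else 0 for m in type(member)]

def encode_sensor(power, transmission, operation):
    """Concatenate the indicators for the three sensing criteria."""
    return one_hot(power) + one_hot(transmission) + one_hot(operation)

# An AC/DC-powered, non-IP, event-driven sensor.
vec = encode_sensor(Power.AC_DC, Transmission.NON_IP, Operation.EVENT)
print(vec)  # [0, 1, 0, 0, 0, 1, 1, 0, 0, 0]
```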

4.2 Data management step

No standard integrated IoT service platform has yet been clearly defined. To provide integrated management, the format of transmitted data should be unified. However, most IoT services use individualized data formats and filing systems. Thus, various data formats should be converted to the same format, allowing sharing. In the data management step, we focus on data format in transmission, the maintainability of stored data, data transmission, and trust formation. Table 2 classifies data management.

First, data are preprocessed on one of several sides (the end device, the local manager, or the server). The data are then stored. Depending on the IoT service policy, the data can be stored temporarily or permanently (as in smart home IoT services). The data are then transmitted; data are received from sensors or IoT gateways. The last criterion involves trust: the integrity of incoming data is inspected by a sensor supervisor or by consensus among sensors.

4.3 Processing step

Most IoT services have their own models of data manipulation or decision-making. However, these models are combinations of basic operations and computational models for data analysis. Table 3 classifies the data processing modes.

Table 3 Classification criteria for data processing

4.4 Execution step

IoT services execute actions depending on their purpose. For example, a smart air-conditioning service may decide to initiate cooling by processing ambient data and sending a “cool” command to an air conditioner; temperature variations are saved and reported monthly. Most IoT services report their analyses, create alerts, and request action. Table 4 classifies the execution criteria.

Table 4 Classification criteria for execution
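The smart air-conditioning example combines two of the execution forms named above: requesting an action and reporting. A minimal sketch follows; the function `execute`, the 26 °C setpoint, and the action tuples are assumptions made for illustration, not part of any surveyed service.

```python
from statistics import mean

def execute(readings_c, setpoint_c=26.0):
    """Return the actions an execution step might emit for one batch of readings."""
    actions = []
    avg = mean(readings_c)
    if avg > setpoint_c:
        # Request an action from the actuator (here, the air conditioner).
        actions.append(("command", "cool"))
    # Always record the observation so that it can be reported later.
    actions.append(("report", round(avg, 1)))
    return actions

print(execute([27.0, 28.0, 29.0]))  # [('command', 'cool'), ('report', 28.0)]
print(execute([22.0, 23.0]))        # [('report', 22.5)]
```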

4.5 IoT service clustering

In this section, we present our EM (expectation–maximization)-based IoT service clustering algorithm. The EM algorithm is the most effective technique available for probabilistic clustering. EM does not require distance measures and readily admits categorical and continuous attributes [24,25,26,27,28]. As mentioned above, our method focuses on the details of each step. We added a step to the algorithm (removal of one-member “clusters”). When the standard deviation is zero, we compare the number of clusters in the current iteration with the number of clusters in the previous iteration. This does not pose a clustering problem in itself, but the dataset of commercial IoT services is currently small and focused on specific IoT service types. We mention this problem below. Table 5 shows the clustering algorithm.

Table 5 EM-based IoT service clustering algorithm
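As a rough illustration of the idea, the sketch below fits a Gaussian mixture by EM and then drops clusters with fewer than two members, refitting until the number of clusters no longer changes. It is one possible reading of the added step, not the algorithm of Table 5. The diagonal-covariance Gaussian components, the deterministic seeding, the variance floor, and the names `em_fit` and `cluster_without_singletons` are all assumptions, and the input is assumed to be numeric (for example, indicator-encoded attributes).

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def em_fit(data, means, n_iter=50, var_floor=1e-3):
    """Run EM for a diagonal Gaussian mixture; return hard labels and means."""
    n, k, dim = len(data), len(means), len(data[0])
    means = [list(m) for m in means]
    variances = [[1.0] * dim for _ in range(k)]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            logs = [math.log(weights[j]) + log_gauss(x, means[j], variances[j])
                    for j in range(k)]
            top = max(logs)
            probs = [math.exp(v - top) for v in logs]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = max(nj / n, 1e-9)  # floor keeps log() defined
            if nj < 1e-9:
                continue  # component is effectively empty; leave it unchanged
            for d in range(dim):
                mu = sum(r[j] * x[d] for r, x in zip(resp, data)) / nj
                var = sum(r[j] * (x[d] - mu) ** 2 for r, x in zip(resp, data)) / nj
                means[j][d] = mu
                variances[j][d] = max(var, var_floor)  # guards against zero SD
    labels = [max(range(k), key=r.__getitem__) for r in resp]
    return labels, means

def cluster_without_singletons(data, k):
    """Refit with fewer components until no cluster has fewer than two members."""
    step = max(1, len(data) // k)
    means = [data[i * step] for i in range(k)]  # deterministic seeding (assumption)
    while True:
        labels, means = em_fit(data, means)
        sizes = [labels.count(j) for j in range(len(means))]
        kept = [m for m, s in zip(means, sizes) if s >= 2]
        if len(kept) == len(means) or not kept:
            return labels, means  # cluster count unchanged since the last round
        means = kept  # fewer clusters than before, so refit

# Two tight groups plus one isolated point, starting from three components.
points = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2], [10.0]]
labels, means = cluster_without_singletons(points, 3)
print(len(means), labels)
```

The variance floor matters here because a one-member cluster has a standard deviation of zero, which would otherwise make the Gaussian density undefined.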

5 Experimental analysis

In this section, we use our EM-based IoT service clustering algorithm to evaluate over 100 commercial IoT services. The experimental environment featured a single cluster running on eight heterogeneous desktops. The experimental cluster consisted of two parts: one was Intel i7-based [8 cores (including 16 hyperthread cores), a 3.2 GHz processor, 32 GB of memory, and 256 GB of SSD], and the other was Intel i3-based [4 cores (including 4 hyperthread cores), a 4.0 GHz processor, 16 GB of memory, and 128 GB of SSD]. We used Ubuntu 14.04 as the operating system and C# 7.0 as the implementation language.

We clustered current commercial IoT services. The baseline dataset featured over 100 such services. However, many low-level services performed only sensing and alerting. We removed over 80 such services and selected 37 as the experimental dataset; all perform their own processing. Figure 5 shows our clustering platform.

Fig. 5 Clustering platform

We entered the dataset into our EM-based IoT service clustering system; 37 clusters were created by reference to purpose. However, 22 clusters contained a single service because similar services are currently absent. This is why we added the one-member cluster removal step to the EM algorithm. With this step, five clusters were formed. Table 6 shows the clustering results with the means and SD of each attribute of the 37 commercial IoT services.

Table 6 Five clusters with the means and SD of each attribute
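A per-cluster summary of this kind can be computed directly from the attribute vectors and the cluster labels. The sketch below assumes the population standard deviation, since the text does not state which form is used; the function name `cluster_summary` and the toy rows are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict
from statistics import mean, pstdev

def cluster_summary(rows, labels):
    """Map each cluster label to a list of (mean, SD) pairs, one per attribute."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return {
        label: [(mean(col), pstdev(col)) for col in zip(*members)]
        for label, members in groups.items()
    }

# Toy attribute vectors for four services assigned to two clusters.
rows = [[1.0, 0.0], [3.0, 0.0], [10.0, 1.0], [10.0, 1.0]]
assignment = [0, 0, 1, 1]
summary = cluster_summary(rows, assignment)
print(summary[0])  # [(2.0, 1.0), (0.0, 0.0)]
print(summary[1])  # [(10.0, 0.0), (1.0, 0.0)]
```

An SD of zero for an attribute, as in cluster 1 here, means that every member of the cluster shares the same value for that attribute.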

Tables 7, 8, 9, 10 and 11 show the clusters formed by our EM-based IoT service clustering algorithm.

Table 7 Characteristics of Cluster 0
Table 8 Characteristics of Cluster 1
Table 9 Characteristics of Cluster 2
Table 10 Characteristics of Cluster 3
Table 11 Characteristics of Cluster 4

6 Conclusions

Over the past 5 years, IT has trended toward the IoT service environment, which is of major academic and industrial interest. IoT services have given us smart homes, building management tools, surveillance services, and smart farms. IoT services such as Siri are popular. We integrated these services into a cloud-based IoT service platform optimizing communication and interaction among heterogeneous devices and services. Here, to improve accessibility, reduce data complexity, and reuse computation modules in a single IoT service platform, we present a means of IoT service classification and clustering. We classify the operation of IoT services into four steps. The first step is sensing using various sensor devices. The second step is data management focusing on the data format transmitted, the maintainability of stored data, further transmission, and security. The third step is processing, divided into computational models for data analysis and models for data manipulation or decision-making. The last step classifies IoT services by the form of their execution. We extended the classic EM algorithm to cluster the services by similarity.

To validate our classification and clustering system, we surveyed over 100 commercial IoT services, eliminated over 80 low-level services, and entered 37 commercial IoT services into an experimental dataset; all perform their own processing. Experimentally, the IoT services fell into five clusters that were similar in terms of purpose. In the future, we will implement our system in public clouds such as Amazon EC2 and Azure.