
1 Introduction

As software systems continue to grow in scale, concurrent systems have been playing an increasingly prominent role in software development. Three forms of concurrent systems have been widespread over the years, namely multithreaded, parallel and distributed. While multithreaded and parallel programs pertain to cooperating threads or processes on a single system, distributed programs span multiple systems, possibly spread over a large geographical area. Distributed computing is a complex field in which devices communicate and coordinate tasks so as to act as a single coherent system. It combines the power of multiple machines to improve the performance of a program with unmatched scalability. The main reason distributed programming has yet to come to the forefront is the increased complexity of its deployment and maintenance.

Concurrent systems are characterized by challenges that are entirely absent in sequential systems. Although each of these systems poses its own unique challenges, the fundamental theme underlying them revolves around safety, liveness and fairness. Addressing these challenges through techniques such as mutual exclusion and other forms of synchronization introduces second-level concerns such as message complexity, response time and other overheads. Despite these challenges and overheads, however, concurrent systems continue to play a dominant part in software development because of the performance and scalability benefits they provide.

In this paper, we analyze the performance of mutual exclusion under three different concurrent programming paradigms, namely multithreaded, parallel and distributed programming. Six metrics are predominantly used for measuring the performance of mutual exclusion algorithms: message complexity, synchronization delay, response time, system throughput, low- and high-load performance, and best- and worst-case performance. Since our observations deal with low-load performance, more emphasis is given to message complexity, synchronization delay and response time.

The goal of this work is to directly contrast the performance of distributed programs against their parallel and multithreaded equivalents using sequence diagrams obtained from the Java Interactive Visualization Environment (JIVE), and to compare them using the three metrics mentioned above. The analysis is carried out on the dining philosophers problem implemented in all three programming styles. The observations and results of this analysis are presented in the following sections, along with directions for future work.

The rest of the paper is structured as follows. Section 2 discusses closely related work. Section 3 describes the implementation of the multithreaded, parallel and distributed versions of the dining philosophers problem. Section 4 analyzes the different concurrent versions, providing key insights into their implementations. Section 5 presents the results, and Section 6 summarizes the work and provides directions for future work.

2 Related Work

In this section, closely related work and its contributions to concurrent program analysis are summarized. Concurrency in the three scenarios of interest, namely multithreaded, parallel and distributed systems, is discussed. This is followed by the significance of JIVE, the interactive visualization tool used for the analysis.

The dining philosophers problem [1] is used to illustrate the working of concurrent programs. Java Threads [2] are used for the multithreaded implementation. MapReduce and Fork/Join are popular Java-based frameworks for achieving parallelism; since the dining philosophers problem involves only five philosophers, the Fork/Join framework is the better choice for the parallel implementation [3, 4]. The RMI framework [5] is used to build remote communication between Java programs, which helps in building distributed applications. In all of these frameworks, however, access to shared objects must be controlled, which makes synchronization [6] a vital part of implementing a concurrent program.

Sequence diagrams are interaction diagrams that record the relationships between objects as they work collaboratively. Depicting the object-to-object interactions of a program through sequence diagrams makes it easy to understand the workflow and the complexity of the program [7,8,9]. Sharp and Rountev [10] explain sequence diagrams and their working principles, point out various limitations of UML sequence diagrams, and highlight a set of techniques to overcome these limitations. UML sequence diagrams can be extremely large and cluttered, making them hard to interpret properly. The previously unreadable UML sequence diagram is therefore expanded so that different aspects of the diagram can be explored interactively and the reader can focus on subsets of the expressed behavior. Thus, the sequence diagram can be elaborated for better understandability and proper interpretation by the programmer.

The sequence diagram of a Java program can be obtained through the Eclipse plugin Java Interactive Visualization Environment (JIVE) [11]. JIVE visually portrays both the call history and the runtime state of a program. The call history is shown as a sequence diagram in which each execution thread is depicted in a different color, simplifying the object interactions. JIVE also generates runtime models and verifies and validates them against design-time models [12]. Ajaz et al. [13] presented a performance study of a parallel program using JIVE on multicore systems. Kishor et al. [12] describe a methodology for interpreting the sequence diagram in the form of a finite state diagram: the key state variables are annotated by the user and later combined with the execution trace to obtain the state diagram. Since state diagrams are much smaller than sequence diagrams, they provide more insight into program behavior and help detect subtle errors.

3 Program Implementation

To shed light on concurrency and frame the discussion, the dining philosophers algorithm is implemented in this paper [14]. The dining philosophers problem is a classical synchronization problem. Five philosophers sit around a circular table, and each of them alternates between two states, thinking and eating. Five forks are placed on the table, and in order to eat, a philosopher must hold two forks. Consequently, no two adjacent philosophers can eat simultaneously.

3.1 Multithreaded Programming

Java provides the Thread class to implement multithreaded programming. The Thread class offers various methods to create and operate on a thread. One way to create a thread in Java is to extend the Thread class and override its run() method; the thread is then started by invoking its start() method.

figure a
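The listing is not reproduced above; the following is a minimal sketch of this approach, in which the class name Philosopher and the printed messages are illustrative rather than the paper's exact code:

public class Philosopher extends Thread {
    private final int id;

    public Philosopher(int id) {
        this.id = id;
    }

    @Override
    public void run() {
        // Think for a while, then try to eat; the fork handling is sketched
        // later with the synchronized take()/release() methods.
        System.out.println("Philosopher " + id + " is thinking");
    }

    public static void main(String[] args) {
        for (int i = 1; i <= 5; i++) {
            new Philosopher(i).start(); // start() schedules run() on a new thread
        }
    }
}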

Since no two adjacent philosophers in the dining philosophers problem can hold both forks simultaneously, access to the forks must be synchronized. When several threads try to access a common resource, it is necessary to control who has access, and this is exactly what synchronization provides. In the multithreaded program, two synchronized methods, take() and release(), are used; a synchronized method locks the object that guards the shared resource, in this scenario the forks. The message complexity, i.e., the number of messages required for one execution of the critical section, is 2(N-1), where N represents the number of philosophers.

figure b
figure c

The program begins by initializing the threads of the five philosophers. Each thread executes the run() method, in which the philosopher thinks for a random period and then moves to the eating state. Mutual exclusion is handled by the two synchronized methods while taking and releasing the forks.
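As a rough sketch of how the two synchronized methods might guard the forks, assuming a shared monitor object, a boolean availability array and a wait/notify protocol (all of these details are assumptions, not the paper's listing):

public class Table {
    private final boolean[] forkFree = {true, true, true, true, true};

    // Blocks until both forks of philosopher i are free, then takes them.
    public synchronized void take(int i) throws InterruptedException {
        int left = i, right = (i + 1) % forkFree.length;
        while (!forkFree[left] || !forkFree[right]) {
            wait(); // a neighbour currently holds a needed fork
        }
        forkFree[left] = false;
        forkFree[right] = false;
    }

    // Puts both forks back and wakes up the waiting philosophers.
    public synchronized void release(int i) {
        int left = i, right = (i + 1) % forkFree.length;
        forkFree[left] = true;
        forkFree[right] = true;
        notifyAll();
    }
}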

3.2 Parallel Programming

The Fork/Join framework in Java sets up and executes parallel programs by taking advantage of multiple processors: it identifies the available processor cores and allocates tasks accordingly. It follows a divide-and-conquer strategy, dividing a very large problem into smaller parts, which can in turn be divided recursively into even smaller ones until a part can be solved directly. In the parallel program, ForkJoinTask is used; through this mechanism, a small number of actual threads in a ForkJoinPool carries out a large number of tasks. The ForkJoinTask.invokeAll() method combines fork() and join() in a single call and starts the instances of all the philosophers.

figure d
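The listing is not reproduced above; a sketch of the idea, assuming each philosopher is modeled as a RecursiveAction (the names DiningForkJoin, PhilosopherTask and forks are illustrative), might look like this:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveAction;

public class DiningForkJoin {

    static class PhilosopherTask extends RecursiveAction {
        final int id;
        final Object[] forks;

        PhilosopherTask(int id, Object[] forks) {
            this.id = id;
            this.forks = forks;
        }

        @Override
        protected void compute() {
            // The nested synchronized blocks that acquire the two forks are
            // sketched below, after the next listing.
            System.out.println("Philosopher " + id + " is thinking");
        }
    }

    public static void main(String[] args) {
        Object[] forks = new Object[5];
        for (int i = 0; i < 5; i++) forks[i] = new Object();

        List<PhilosopherTask> tasks = new ArrayList<>();
        for (int i = 0; i < 5; i++) tasks.add(new PhilosopherTask(i, forks));

        // invokeAll() combines fork() and join(): all five tasks are started
        // and the call returns only after every task has completed.
        ForkJoinTask.invokeAll(tasks);
    }
}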

To attain mutual exclusion, synchronized blocks are used. A synchronized block locks the object representing a shared resource, and nesting two synchronized blocks allows a philosopher to get hold of both the forks it needs; the message complexity therefore becomes 4(N-1). The instances of all five philosophers are invoked by the ForkJoinTask.invokeAll() method, and each instance then runs its compute() method, where the synchronized blocks are defined. Thus, mutual exclusion and concurrency are achieved.

figure e
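Filling in the compute() stub of the PhilosopherTask sketched earlier, the nested synchronized blocks could look roughly as follows (again an illustrative sketch, not the paper's listing):

@Override
protected void compute() {
    Object left = forks[id];
    Object right = forks[(id + 1) % forks.length];

    synchronized (left) {            // lock the left fork
        synchronized (right) {       // then lock the right fork
            System.out.println("Philosopher " + id + " is eating");
        }                            // right fork released here
    }                                // left fork released here
}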

3.3 Distributed Programming

Remote communication is an inevitable part of distributed programming. In Java, Remote Method Invocation (RMI) provides this communication between systems: it allows an object running in one Java virtual machine to invoke methods on an object running in another Java virtual machine. An RMI application consists of a client program and a server program. The server program creates a remote object and makes a reference to that object available to the client (Code 6). The remote objects on the server can then be accessed by client requests, so the client can invoke the server's methods (Code 7).

figure f
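The remote interface itself is not shown above; based on the method names described below (getForks and returnForks), a minimal sketch might be the following, where the interface name ForkService is an assumption:

import java.rmi.Remote;
import java.rmi.RemoteException;

// Remote interface seen by the client; every method can fail remotely,
// so each one declares RemoteException.
public interface ForkService extends Remote {
    void getForks(int philosopherId) throws RemoteException;
    void returnForks(int philosopherId) throws RemoteException;
}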

Unlike in the multithreaded and parallel programs, the synchronized keyword cannot be used here, because it only controls access to a shared object within the same JVM. Hence, five semaphores are used to keep track of the availability of the five forks. A semaphore uses a counter to keep track of the status of a shared resource and controls access to it. The message complexity here is 2(N-1).

figure g
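Building on the interface sketched above, a server implementation could guard each fork with a binary semaphore and register itself in the RMI registry; the class name ForkServer and the bind name are assumptions, not the paper's code:

import java.rmi.Naming;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.server.UnicastRemoteObject;
import java.util.concurrent.Semaphore;

public class ForkServer extends UnicastRemoteObject implements ForkService {
    // One binary semaphore per fork tracks its availability.
    private final Semaphore[] forks = new Semaphore[5];

    public ForkServer() throws RemoteException {
        for (int i = 0; i < forks.length; i++) forks[i] = new Semaphore(1);
    }

    @Override
    public void getForks(int id) throws RemoteException {
        try {
            forks[id].acquire();                       // left fork
            forks[(id + 1) % forks.length].acquire();  // right fork
        } catch (InterruptedException e) {
            throw new RemoteException("interrupted while waiting for forks", e);
        }
    }

    @Override
    public void returnForks(int id) throws RemoteException {
        forks[id].release();
        forks[(id + 1) % forks.length].release();
    }

    public static void main(String[] args) throws Exception {
        LocateRegistry.createRegistry(1099);            // start an RMI registry
        Naming.rebind("ForkService", new ForkServer()); // register the remote object
        System.out.println("Fork server ready");
    }
}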

To execute the program, the server is run first and then the client. The server creates a remote object and registers it with the RMI registry using the rebind() method. To make use of the server's methods, the client needs a reference to the object the server created; with the help of the lookup() method, it fetches the object from the registry using its bind name, and remote communication is thus established. The server program includes getForks() and returnForks(), the fork-handling methods declared in the remote interface. In the client program, five philosophers are initialized using Java Threads and run simultaneously.

figure h
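A corresponding client sketch looks up the remote object by its bind name and starts five philosopher threads that call the remote methods; the host name, class name and single think/eat cycle are simplifying assumptions:

import java.rmi.Naming;

public class DiningClient {
    public static void main(String[] args) throws Exception {
        // Fetch the server's remote object reference from the registry.
        ForkService forks = (ForkService) Naming.lookup("rmi://localhost/ForkService");

        for (int i = 0; i < 5; i++) {
            final int id = i;
            new Thread(() -> {
                try {
                    Thread.sleep((long) (Math.random() * 1000)); // thinking
                    forks.getForks(id);                          // hungry, then eating
                    Thread.sleep((long) (Math.random() * 1000)); // eating
                    forks.returnForks(id);                       // back to thinking
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }, "Philosopher-" + (id + 1)).start();
        }
    }
}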

4 Analysing Program Behavior

4.1 Multithreaded Program Sequence Diagram

Five philosophers appear in Fig. 1, namely Philosopher 1 through Philosopher 5. As Fig. 1 shows, Philosopher 5 stays in the thinking state for a while and then starts eating. After a certain amount of time, Philosopher 5 releases the forks, which allows the adjacent Philosopher 1 to gain access to one of them. The time gap between one process leaving the critical section and the next process entering it is the synchronization delay; in this case, it is the interval between one philosopher exiting the eating state (releasing the left and right forks) and the next philosopher entering the eating state (gaining control over both forks). In Fig. 1, there is also a time gap between take1 and eating1: this gap between sending the request and gaining control over both forks is the response time. Comparing the two, the synchronization delay is relatively low while the response time is very high.

Fig. 1 Sequence diagram of multithreaded implementation

4.2 Parallel Program Sequence Diagram

Figure 2 shows the sequence diagram of the parallel implementation of the dining philosophers problem. Philosopher 1, initially in the thinking state, moves to the hungry state (requests the forks) after a while. The time gap between sending a request for the forks (hungry1) and getting access to them (getForks1) is the response time. Philosopher 1 then puts the forks back, i.e., returnForks(), making them available to the adjacent philosophers. The time gap between returnedForks1 and GotForks4 is the synchronization delay. Looking at the variation of the synchronization delay, it is low in most cases; there are a few exceptions where the delay is drastically high, but on average it is low. The response time, in contrast, is relatively high in most cases, and in this observation there was no low response time.

Fig. 2 Sequence diagram of parallel implementation

4.3 Distributed Program Sequence Diagram

For distributed programs, the communicating processes are spread across different hosts, so separate sequence diagrams are obtained for the server program and the client program. Since our main motive here is to examine the states of the philosophers, only the client program is taken into consideration (Fig. 3). Initially, all the philosophers are in the thinking state, and after some random amount of time they move to the hungry state. Mutual exclusion takes place at this point: a philosopher checks whether the forks are available, and if both forks are free, this is denoted by gotForks. From the diagram, it can be observed that the time gap between hungry and gotForks is high most of the time, so the response time of the distributed program is high. Comparing two adjacent philosophers, the time gap between returnedForks of one philosopher and gotForks of the other is very low, making the synchronization delay of the distributed program very low (Table 1).

Fig. 3 Sequence diagram of distributed implementation

Table 1 Comparison based on three metrics

5 Results

The following are observed from the sequence diagrams obtained from the concurrent programs:

  • In the multithreaded program, the synchronization delay is low and the response time is high.

  • In the parallel program, the synchronization delay is mostly low, but in some cases it is drastically high. Also, the response time is high.

  • In the distributed programs, the synchronization delay and response time are low.

  • When comparing the synchronization delay of the three executions, the multithreaded and parallel executions take relatively low times, while the distributed execution takes the lowest time.

  • When comparing the response time of the three executions, the multithreaded and parallel executions took a higher time, while the distributed execution took a relatively lower time.

6 Conclusion

When the concept of mutual exclusion is taken into consideration, distributed programming rarely springs to mind, owing to its more complex nature. We have therefore directly compared the efficiency of distributed programming with its more frequently used counterparts, parallel and multithreaded programming. The main observations are based on comparing the performance metrics of the dining philosophers problem, under a small load, run in three different ways: multithreaded, parallel and distributed.

Java Threads are used for the multithreaded program, and synchronization is achieved using synchronized methods; the message complexity of the multithreaded program is 2(N−1), where N is the number of threads/philosophers. To attain parallelism, the Fork/Join framework is used and synchronized blocks provide the synchronization; the message complexity of the parallel program is 4(N−1). The distributed program is built using the RMI framework, with semaphores used to achieve synchronization; its message complexity is 2(N−1) [14].

The performance analysis of the concurrent programs is done using the metrics message complexity, synchronization delay and response time. Message complexity is the number of messages required per execution of a critical section; in these terms, the parallel program was the most complex, while the distributed and multithreaded programs shared the same complexity. Synchronization delay is the time required for a process to enter the critical section after another process exits it; the delays of the parallel and multithreaded programs are almost similar, while the delay of the distributed program is lower than both. Response time is the interval a request waits for its critical section execution to finish after its request messages have been sent out; the parallel and multithreaded programs have almost the same response time, while the response time of the distributed program is lower than either.

In conclusion, for a mutual exclusion program under low load, if RT is the response time and SD is the synchronization delay, then

$$\begin{aligned} {\text {RT (Multithreading)}} \sim {\text {RT (Parallel)}} > {\text {RT (Distributed)}}\end{aligned}$$
(1)
$$\begin{aligned} {\text {SD (Multithreading)}} \sim {\text {SD (Parallel)}} > {\text {SD (Distributed)}}\end{aligned}$$
(2)

As part of future work, this study can be extended by experimenting with a high load and comparing the results.