Keywords

1 Introduction

The era of big data [1, 2] has generated large-scale linear programming (LP) problems [3]. Such problems arise in economics, industry, logistics, statistics, quantum physics, and other fields. To solve them, high-performance computing systems and parallel algorithms are required. Thus, the development of new parallel algorithms for solving LP problems and the revision of current algorithms have become imperative. As examples, we can cite the works [4,5,6,7,8,9]. The development of new parallel algorithms for solving large-scale linear programming problems involves testing them on benchmark and random problems. One of the most well-known benchmark repositories of linear programming problems is Netlib-Lp [10]. However, when debugging LP solvers, it is often necessary to generate random LP problems with certain characteristics, with the dimension of the space and the number of constraints being the main ones. The paper [11] suggested one of the first methods for generating random LP problems with known solutions. The method allows generating test problems of arbitrary size with a wide range of numerical characteristics. The main idea of the method is as follows. Take as a basis an LP problem with a known solution and then randomly modify it so that the solution does not change. The main drawback of the method is that fixing the optimal solution in advance significantly restricts the random nature of the resulting LP problem.

The article [12] describes the GENGUB generator, which constructs random LP problems with a known solution and given characteristics, such as the problem size, the density of the coefficient matrix, the degeneracy, the number of binding inequalities, and others. A distinctive feature of GENGUB is the ability to introduce generalized upper bound constraints, defined to be a (sub)set of constraints in which each variable appears at most once (i.e., has at most one nonzero coefficient). This method has the same drawback as the previous one: by preliminarily fixing the optimal solution, one significantly restricts the random nature of the resulting LP problem.

The article [13] suggests a method for generating random LP problems with a preselected solution type: bounded or unbounded, unique or multiple. Each of these structures is generated using random vectors with integer components whose range can be given. Next, an objective function that satisfies the required conditions, i.e., leads to a solution of the desired type, is obtained. The LP problem generator described in [13] is mainly used for educational purposes and is not suitable for testing new linear programming algorithms due to the limited variety of generated problems.

In the present paper, we suggest an alternative method for generating random LP problems. The method has the peculiarity of generating feasible problems of a given dimension with an unknown solution. The generated problem is fed to the input of the tested LP solver, and the latter outputs a solution that must be validated. The validator program (see, for example, [14]) validates the obtained solution. The method we suggest for generating random LP problems is named FRaGenLP (Feasible Random Generator of LP) and is implemented as a parallel program for cluster computing systems. The rest of the article is as follows. Section 2 provides a formal description of the method for generating random LP problems and gives a sequential version of the FRaGenLP algorithm. In Sect. 3, we discuss the parallel version of the FRaGenLP algorithm. In Sect. 4, we describe the implementation of FRaGenLP using a parallel BSF-skeleton and give the results of large-scale computational experiments on a cluster computing system. The results confirm the efficiency of our approach. Section 5 summarizes the obtained results and discusses plans to use the FRaGenLP generator in the development of an artificial neural network capable of solving large LP problems.

2 Method for Generating Random LP Problems

The method suggested in this paper generates random feasible bounded LP problems of arbitrary dimension n with an unknown solution. To guarantee the correctness of the LP problem, the constraint system includes the following support inequalities:

$$\begin{aligned} \left\{ \begin{array}{llllll} {{x_1}}&{}{}&{}{}&{}{}&{} \leqslant &{}\alpha \\ {}&{}{{x_2}}&{}{}&{}{}&{} \leqslant &{}\alpha \\ {}&{}{}&{} \ddots &{}{}&{} \cdots &{} \cdots \\ {}&{}{}&{}{}&{}{{x_n}}&{} \leqslant &{}\alpha \\ { - {x_1}}&{}{}&{}{}&{}{}&{} \leqslant &{}0 \\ {}&{}{ - {x_2}}&{}{}&{}{}&{} \leqslant &{}0 \\ {}&{}{}&{} \ddots &{}{}&{} \cdots &{} \cdots \\ {}&{}{}&{}{}&{}{ - {x_n}}&{} \leqslant &{}0 \\ {{x_1}}&{}{ + {x_2}}&{} \cdots &{}{+ {x_n}}&{} \leqslant &{}{(n - 1)\alpha + \alpha /2} \end{array} \right. \end{aligned}$$
(1)

Here, the positive constant \(\alpha \in {\mathbb {R}_{ > 0}}\) is a parameter of the FRaGenLP generator. The number of support inequalities is \(2n + 1\). The number of random inequalities is determined by a parameter \(d \in {\mathbb {Z}_{ \geqslant 0}}\). The total number m of inequalities is defined by the following equation:

$$\begin{aligned} m = 2n + 1 + d.\end{aligned}$$
(2)

The coefficients of the objective function are specified by the vector

$$\begin{aligned} c = \theta \left( {n,n - 1,n - 2, \ldots ,1} \right) ,\end{aligned}$$
(3)

where the positive constant \(\theta \in {\mathbb {R}_{ > 0}}\) is a parameter of the FRaGenLP generator that satisfies the following condition:

$$\begin{aligned} \theta \leqslant \frac{\alpha }{2}.\end{aligned}$$
(4)

From now on, we assume that the LP problem requires finding a feasible point at which the maximum of the objective function is attained. If the number d of random inequalities is zero, then FRaGenLP generates an LP problem that includes only the support inequalities given in (1). In this case, the LP problem has the following unique solution:

$$\begin{aligned} \bar{x} = \left( {\alpha , \ldots ,\alpha ,{\alpha /2}} \right) .\end{aligned}$$
(5)

If the number d of random inequalities is greater than zero, the FRaGenLP generator adds the corresponding number of inequalities to system (1). The coefficients \({a_i} = \left( {{a_{i1}}, \ldots ,{a_{in}}} \right) \) of the random inequality and the constant term \({b_i}\) on the right side are calculated through the function \(\mathrm {rand} (l,r)\), which generates a random real number in the interval \(\left[ {l,r} \right] \) (\(l,r \in \mathbb {R};l < r \)), and the function \(\mathrm {rsgn} ()\), which randomly selects a number from the set \(\left\{ {1, - 1} \right\} \):

(6)

Here, \({a_{\mathrm {max}}},{b_{\mathrm {max}}} \in {\mathbb {R}_{ > 0}}\) are parameters of the FRaGenLP generator. The inequality sign is always “\(\leqslant \)”. Let us introduce the following notations:

$$\begin{aligned} \mathrm {f} (x) = \langle {c,x} \rangle ;\end{aligned}$$
(7)
$$\begin{aligned} {h} = \left( {{\alpha / 2},{{ \ldots ,\alpha } / 2}} \right) ;\end{aligned}$$
(8)
$$\begin{aligned} \mathrm {dist}_h({a_i},{b_i}) = \frac{{\left| {\langle {{a_i},{h}} \rangle - {b_i}} \right| }}{{\left\| {{a_i}} \right\| }};\end{aligned}$$
(9)
$$\begin{aligned} \pi (h,{a_i},{b_i}) = h - \frac{{\langle {{a_i},h} \rangle - {b_i}}}{{|a_i|}^2}{a_i}.\end{aligned}$$
(10)

Equation (7) defines the objective function of the LP problem. Here and further on, \(\langle { \cdot , \cdot } \rangle \) stands for the dot product of vectors. Equation (8) defines the central point of the bounding hypercube specified by the first 2n inequalities of system (1). Furthermore, Eq. (9) defines a function \(\mathrm {dist}_h({a_i},{b_i})\) that gives the distance from the hyperplane \(\langle {{a_i},x} \rangle = {b_i}\) to the center h of the bounding hypercube. Here and below, \(\left\| {}\cdot {} \right\| \) denotes the Euclidean norm. Equation (10) defines a vector-valued function that expresses the orthogonal projection of the point h onto the hyperplane \(\langle {{a_i},x} \rangle = {b_i}\).

To obtain a random inequality \(\langle {{a_i},x} \rangle \leqslant {b_i}\), we calculate the coordinates of the coefficient vector \({a_i}\) and the constant term \({b_i}\) using a pseudorandom rational number generator. The generated random inequality is added to the constraint system if and only if the following conditions hold:

$$\begin{aligned} \langle {{a_i},{h}} \rangle \leqslant {b_i};\end{aligned}$$
(11)
$$\begin{aligned} \rho < \mathrm {dist}_h({a_i},{b_i}) \leqslant \theta ;\end{aligned}$$
(12)
$$\begin{aligned} \mathrm {f} \left( {\pi \left( {{h},{a_i},{b_i}} \right) } \right) > \mathrm {f} \left( {{h}} \right) ;\end{aligned}$$
(13)
$$\begin{aligned} \forall l \in \{ 1, \ldots ,i - 1\} :\lnot \mathrm {like} ({a_i},{b_i},{a_l},{b_l}).\end{aligned}$$
(14)

Condition (11) requires that the center of the bounding hypercube be a feasible point for the considered random inequality. If the condition does not hold, then the inequality \( - \langle {{a_i},x} \rangle \leqslant - {b_i}\) is added instead of \(\langle {{a_i},x} \rangle \leqslant {b_i}\). Condition (12) requires that the distance from the hyperplane \(\langle {{a_i},x} \rangle = {b_i}\) to the center h of the bounding hypercube be greater than \(\rho \) but not greater than \(\theta \). The constant \(\rho \in {\mathbb {R}_{ > 0}}\) is a parameter of the FRaGenLP generator and must satisfy the condition \(\rho < \theta \), where \(\theta \), in turn, satisfies condition (4). Condition (13) requires that the objective function value at the projection of the point h onto the hyperplane \(\langle {{a_i},x} \rangle = {b_i}\) be greater than the objective function value at the point h. This condition combined with (11) and (12) cuts off constraints that cannot affect the solution of the LP problem. Finally, condition (14) requires that the new inequality be dissimilar from all previously added ones, including the support ones. This condition uses the Boolean function “\(\mathrm {like}\)”, which determines the likeness of the inequalities \(\langle {{a_i},x} \rangle \leqslant {b_i}\) and \(\langle {{a_l},x} \rangle \leqslant {b_l}\) through the following equation:

$$\begin{aligned} \mathrm {like} ({a_i},{b_i},{a_l},{b_l}) = \left\| {\frac{{{a_i}}}{{\left\| {{a_i}} \right\| }} - \frac{{{a_l}}}{{\left\| {{a_l}} \right\| }}} \right\|< {L_{\mathrm {max}}} \wedge \left| {\frac{{{b_i}}}{{\left\| {{a_i}} \right\| }} - \frac{{{b_l}}}{{\left\| {{a_l}} \right\| }}} \right| < {S_{\mathrm {min}}}.\end{aligned}$$
(15)

The constants \({L_{\mathrm {max}}},{S_{\mathrm {min}}} \in {\mathbb {R}_{ > 0}}\) are parameters of the FRaGenLP generator. In this case, the parameter \({L_{\mathrm {max}}}\) must satisfy the condition

$$\begin{aligned} {L_{\mathrm {max}}} \leqslant 0.7\end{aligned}$$
(16)

(we will explain the meaning of this constraint below). According to (15), inequalities \(\langle {{a_i},x} \rangle \leqslant {b_i}\) and \(\langle {{a_l},x} \rangle \leqslant {b_l}\) are similar if the following two conditions hold:

$$\begin{aligned} \left\| {\frac{{{a_i}}}{{\left\| {{a_i}} \right\| }} - \frac{{{a_l}}}{{\left\| {{a_l}} \right\| }}} \right\| < {L_{\mathrm {max}}};\end{aligned}$$
(17)
$$\begin{aligned} \left| {\frac{{{b_i}}}{{\left\| {{a_i}} \right\| }} - \frac{{{b_l}}}{{\left\| {{a_l}} \right\| }}} \right| < {S_{\mathrm {min}}}.\end{aligned}$$
(18)

Condition (17) evaluates the measure of parallelism of the hyperplanes \(\langle {{a_i},x} \rangle = {b_i}\) and \(\langle {{a_l},x} \rangle = {b_l}\), which bound the feasible regions of the corresponding inequalities. Let us explain this. The unit vectors \(e_i = a_i/\left\| a_i \right\| \) and \(e_l = a_l/\left\| a_l \right\| \) are normal to the hyperplanes \(\langle {{a_i},x} \rangle = {b_i}\) and \(\langle {{a_l},x} \rangle = {b_l}\), respectively. Let us introduce the notation \(\delta = \left\| e_i - e_l\right\| \). If \(\delta = 0\), then the hyperplanes are parallel. If \(0 \leqslant \delta < {L_{\mathrm {max}}}\), then the hyperplanes are considered to be nearly parallel.

Condition (18) evaluates the closeness of the parallel hyperplanes \(\langle {{a_i},x} \rangle = {b_i}\) and \(\langle {{a_l},x} \rangle = {b_l}\). Indeed, the scalar values \(\beta _i = b_i/\left\| a_i \right\| \) and \(\beta _l = b_l/\left\| a_l \right\| \) are the normalized constant terms. Let us introduce the notation \(\sigma = | \beta _i - \beta _l |\). If \(\sigma = 0\), then the parallel hyperplanes coincide. If the hyperplanes are nearly parallel and \(0 \leqslant \sigma < S_{\mathrm {min}}\), then they are considered to be nearly concurrent.

Two linear inequalities in \(\mathbb {R}^n\) are considered similar if the corresponding hyperplanes are nearly parallel and nearly concurrent.

The constraint (16) for the parameter \({L_{\mathrm {max}}}\) is based on the following proposition.

Proposition 1

Let the two unit vectors \(e,e' \in {\mathbb {R}^n}\) and the angle \(\varphi < \pi \) between them be given. Then,

$$\begin{aligned} \left\| {e - e'} \right\| = \sqrt{2(1 - \cos \varphi )} .\end{aligned}$$
(19)

Proof

By the definition of the norm in Euclidean space, we have

$$\begin{aligned}&\left\| {e - e'} \right\| = \sqrt{\sum \limits _j {{{\left( {{e_j} - {{e'}_j}} \right) }^2}} } = \sqrt{\sum \limits _j {\left( {{e_j}^2 - 2{e_j}{{e'}_j} + {{e'}_j}^2} \right) } } \\&= \sqrt{\sum \limits _j {{e_j}^2} - 2\sum \limits _j {{e_j}{{e'}_j}} + \sum \limits _j {{{e'}_j}^2} } = \sqrt{1 - 2\langle {{e_j},{{e'}_j}} \rangle + 1} . \\ \end{aligned} $$

Thus,

$$\begin{aligned} \left\| {e - e'} \right\| = \sqrt{2\left( {1 - \langle {{e_j},{{e'}_j}} \rangle } \right) } .\end{aligned}$$
(20)

By the definition of the angle in Euclidean space, we have, for unit vectors,

$$\begin{aligned} \langle {{e_j},{{e'}_j}} \rangle = \cos \varphi . \end{aligned}$$

Substituting in (20) the expression obtained, we have

$$\begin{aligned} \left\| {e - e'} \right\| = \sqrt{2\left( {1 - \cos \varphi } \right) } . \end{aligned}$$

The proposition is proven.

It is reasonable to consider that two unit vectors \(e,e'\) are nearly parallel if the angle between them is less than \(\pi /4\). In this case, according to (19), we have

$$\begin{aligned} \left\| {e - e'} \right\| < \sqrt{2\left( {1 - \cos \frac{\pi }{4}} \right) } . \end{aligned}$$

Taking into account that \(\cos (\pi /4) \approx 0.707\), we obtain the required estimate:

$$\begin{aligned} \left\| {e - e'} \right\| < 0.7. \end{aligned}$$
Fig. 1.
figure 1

Random LP problem with \(n=2\), \(d=5\), \(\alpha =200\), \(\theta =100\), \(\rho =50\), \(S_{\mathrm {min}}=100\), \(L_{\mathrm {max}}=0.35\), \(a_{\mathrm {max}}=1000\), and \(b_{\mathrm {max}}=10\,000\)

An example of a two-dimensional LP problem generated by FRaGenLP is shown in Fig. 1. The purple color indicates the line defined by the coefficients of the objective function; the black lines correspond to the support inequalities, and the red lines correspond to the random inequalities. For the sake of clarity, we use green dashed lines to plot the large and the small circles defined by the equations \({\left( {{x_1} - 100} \right) ^2} + {\left( {{x_2} - 100} \right) ^2} = {100^2}\) and \({\left( {{x_1} - 100} \right) ^2} + {\left( {{x_2} - 100} \right) ^2} = {50^2}\). According to condition (12), any random line must intersect the large green circle but not the small green circle. The semitransparent red color indicates the feasible region of the generated LP problem.

figure a

Algorithm 1 represents a sequential implementation of the described method. Step 1 assigns zero value to the counter k of random inequalities. Step 2 creates an empty list A to store the coefficients of the inequalities. Step 3 creates an empty list B to store the constant terms. Step 4 adds the coefficients and constant terms of the support inequalities (1) to the lists A and B, respectively. Step 5 generates the coefficients of the objective function according to (3). If the parameter d, which specifies the number of random inequalities, is equal to zero, then Step 6 passes the control to Step 19. Steps 7 and 8 generate the coefficients and the constant term of the new random inequality. Step 9 checks condition (11). If the condition does not hold, then the signs of the coefficients and the constant term are reversed (Steps 10, 11). Step 12 checks condition (12). Step 13 checks condition (13). Step 14 checks condition (14). Step 15 appends the coefficients of the new random inequality to the list A (\(\mathbin {++}\) denotes the concatenation of lists). Step 16 appends the constant term of the new random inequality to the list B. Step 17 increments the counter of added random inequalities by one. If the number of added random inequalities has not reached the given quantity d, then Step 18 passes the control to Step 7 to generate the next inequality. Step 19 outputs the results. Step 20 stops computations.

3 Parallel Algorithm for Generating Random LP Problems

Implicit loops generated by passing the control from Steps 12–14 to Step 7 of Algorithm 1 can result in high overheads. For example, during the generation of the LP problem represented in Fig. 1, there were 112 581 returns from Step 12 to label 7, 32 771 from Step 13, and 726 from Step 14. Therefore, generating a large random LP problem on a commodity personal computer can take many hours. To overcome this obstacle, we developed a parallel version of the FRaGenLP generator for cluster computing systems. This version is presented as Algorithm 2. It is based on the BSF parallel computation model [15, 16], which assumes the master–slave paradigm [17]. According to the BSF model, the node serves as a control and communication center. All slave nodes execute the same code but on different data.

figure b

Let us discuss Algorithm 2 in more detail. First, we look at the steps performed by the master node. Step 1 assigns zero value to the counter of random inequalities. Step 2 creates an empty list \({A_S}\) to store the coefficients of the support inequalities. Step 3 creates an empty list \({B_S}\) to store the constant terms of the support inequalities. Step 4 adds the coefficients and constant terms of the support inequalities (1) to the lists \({A_S}\) and \({B_S}\), respectively. Step 5 generates the coefficients of the objective function according to (3). Step 6 outputs the coefficients and constant term of the support inequalities. If the parameter d, which specifies the number of random inequalities, is equal to zero, then Step 7 passes the control to Step 36, which terminates the computational process in the master node. Step 8 creates an empty list \({A_R}\) to store the coefficients of the random inequalities. Step 9 creates an empty list \({B_R}\) to store the constant terms of the random inequalities. In Step 18, the master node receives one random inequality from each slave node. Each of these inequalities satisfies conditions (11)–(13) and is not similar to any of the support inequalities. These conditions are ensured by the slave nodes. In the loop consisting of Steps 19–32, the master node checks all received random inequalities for similarity with the random inequalities previously included in the lists \({A_R}\) and \({B_R}\). The similar new inequalities are rejected, and the dissimilar ones are added to the lists \({A_R}\) and \({B_R}\). In this case, the inequality counter is increased by one each time some inequality is added to the lists. If the required number of random inequalities has already been reached, then Step 31 performs an early exit from the loop. Step 33 sends the current number of added random inequalities to the slave nodes. If this quantity is less than d, then Step 34 passes the control to Step 18, which requests a new portion of random inequalities from the slave nodes. Otherwise, Step 35 outputs the results, and Step 36 terminates the computational process in the master node.

Let us consider now the steps performed by the l-th slave node. If the parameter d, which specifies the number of random inequalities, is equal to zero, then Step 1 passes the control to Step 36, which terminates the computational process in the slave node. Otherwise, Steps 2 and 3 create the empty lists \({A_S}\) and \({B_S}\) to store the support inequalities. Step 4 adds the coefficients and constant terms of the support inequalities (1) to the lists \({A_S}\) and \({B_S}\), respectively. Steps 5–8 generate a new random inequality. Step 9 checks condition (11). If this condition does not hold, then the signs of the coefficients and the constant term are reversed (Steps 10 and 11). Steps 12–14 check conditions (12) and (13). Steps 15–17 check the similarity of the generated inequality to the support inequalities. If any one of these conditions does not hold, then the control is passed to Step 5 to generate a new random inequality. If all conditions hold, then Step 18 sends the constructed random inequality to the master node. In Step 33, the slave receives from the master the current number of obtained random inequalities. If this quantity is less than the required number, then Step 34 passes the control to Step 5 to generate a new random inequality. Otherwise, Step 36 terminates the computational process in the slave node.

4 Software Implementation and the Computational Experiments

We implemented the parallel Algorithm 2 in C++ through the parallel BSF-skeleton [18], which is based on the BSF parallel computation model [15] and encapsulates all aspects related to the parallelization of the program using the MPI library [19].

The BSF-skeleton requires the representation of the algorithm in the form of operations on lists using the higher-order functions Map and Reduce, defined by the Bird–Meertens formalism [20]. The required representation can be constructed as follows. Set the length of the Map and Reduce lists equal to the number of slave MPI processes. Define the Map list items as empty structures:

$$\begin{aligned} {\text {struct PT\_bsf\_mapElem\_T\{ \} }}. \end{aligned}$$

Each element of the Reduce list stores the coefficients and the constant term of one random inequality \(\langle {a,x} \rangle \leqslant b\):

$$\begin{aligned} {\text {struct PT\_bsf\_reduceElem\_T\{ float a[n]; float b\} }}. \end{aligned}$$

Each slave MPI process generates one random inequality using the PC_bsf_MapF function, which executes Steps 5–17 of Algorithm 2. The slave MPI process stores the inequality that satisfies all conditions to its local Reduce list consisting of a single item. The master MPI process receives the generated elements from the slave MPI processes and places them in its Reduce list (this code is implemented in the problem-independent part of the BSF-skeleton). After that, the master MPI process checks each obtained inequality for similarity with the previously added ones. If no matches are found, the master MPI process adds the inequality just checked to its local Reduce list. These actions, corresponding to Steps 19–32 of Algorithm 2, are implemented as the standard function PC_bsf_ProcessResults of the BSF-skeleton. The source code of the FRaGenLP parallel program is freely available on the Internet at https://github.com/leonid-sokolinsky/BSF-LPP-Generator.

Table 1. Specifications of the “Tornado SUSU” computing cluster

Using the program, we conducted large-scale computational experiments on the cluster computing system “Tornado SUSU” [21]. The specifications of the system are given in Table 1. The computations were performed for several dimensions, namely \(n = 3000\), \(n = 5500\), and \(n = 15\,000\). The total numbers of inequalities were, respectively, 6301, \(10\,001\), and \(31\,501\). The corresponding numbers of random inequalities were 300, 500, and 1500, respectively. Throughout the experiments, we used the following parameter values: \(\alpha = 200\) (the length of the bounding hypercube edge), \(\theta = 100\) (the radius of the large hypersphere), \(\rho = 50\) (the radius of the small hypersphere), \({L_{\mathrm {max}}} = 0.35\) (the upper bound of near parallelism for hyperplanes), \({S_{\mathrm {min}}} = 100\) (the minimum acceptable closeness for hyperplanes), \({a_{\mathrm {max}}} = 1000\) (the upper absolute bound for the coefficients), and \({b_{\mathrm {max}}} = 10\,000\) (the upper absolute bound for the constant terms).

The results of the experiments are shown in Fig. 2. Generating a random LP problem with \(31\,501\) constraints with a configuration consisting of a master node and a slave node took 12 min. Generating the same problem with a configuration consisting of a master node and 170 slave nodes took 22 s. The analysis of the results showed that the scalability bound (the maximum of the speedup curve) of the algorithm significantly depends on the dimension of the problem. For \(n = 3000\), the scalability bound was 50 processor nodes approximately. This bound increased up to 110 nodes for \(n = 5000\), and to 200 nodes for \(n = 15\,000\). A further increase in problem size causes the processor nodes to run out of memory. It should be noted that the scalability bound of the algorithm significantly depends on the number of random inequalities too. Increasing this number by a factor of 10 resulted in a twofold reduction of the scalability bound. This is because an increase in the number of slave nodes results in a significant increase in the portion of sequential computations performed by the master node in Steps 19–32, during which the slave nodes are idle.

Fig. 2.
figure 2

Speedup curves of the FRaGenLP parallel algorithm for various dimensions

5 Conclusions

In this paper, we described the parallel FRaGenLP algorithm for generating random feasible bounded LP problems on cluster computing systems. In addition to random inequalities, the generated constraint systems include a standard set of inequalities called support inequalities. They ensure the boundedness of the feasible region of the LP problem. In geometric terms, the feasible region of the support inequalities is a hypercube with edges adjacent to the coordinate axes, and the vertex that is farthest from the origin is cut off. The objective function is defined in such a manner that its coefficients decrease monotonically. The coefficients and constant terms of the random inequalities are obtained using a random number generator. If the feasible region of a randomly generated inequality does not include the center of the bounding hypercube, then the sign of the inequality is reversed. Furthermore, not every random inequality is included in the constraint system. The random inequalities that cannot affect the solution of the LP problem for a given objective function are rejected. The inequalities, for which the bounding hyperplane intersects a small hypersphere located at the center of the bounding hypercube are also rejected. This ensures the feasibility of the constraint system. Moreover, any random inequality that is “similar” to at least one of the inequalities already added to the system (including the support ones) is also rejected. To define the “similarity” of inequalities, two formal metrics are introduced for bounding hyperplanes: the measure of parallelism and the measure of closeness.

The parallel algorithm is based on the BSF parallel computation model, which relies on the master–slave paradigm. According to this paradigm, the master node serves as a control and communication center. All slave nodes execute the same code but on different data. The parallel implementation was performed in C++ through the parallel BSF-skeleton, which encapsulates all aspects related to the MPI-based parallelization of the program. The source code of the FRaGenLP generator is freely available on the Internet at https://github.com/leonid-sokolinsky/BSF-LPP-Generator.

Using this implementation, we conducted large-scale computational experiments on a cluster computing system. As the experiments showed, the parallel FRaGenLP algorithm demonstrates good scalability, up to 200 processor nodes for \(n=15\,000\). Generating a random LP problem with \(31\,501\) constraints takes 22 s with a configuration consisting of 171 processor nodes. Generating the same problem with a configuration consisting of a processor node takes 12 min. The program was used to generate a dataset of 70 000 samples for training an artificial neural network capable of quickly solving large LP problems.