Abstract
The article proposes an n-dimensional mathematical model of the visual representation of a linear programming problem. This model makes it possible to use artificial neural networks to solve multidimensional linear optimization problems whose feasible region is a bounded non-empty set. To visualize a linear programming problem, an objective hyperplane is introduced, whose orientation is determined by the gradient of the linear objective function: the gradient is the normal to the objective hyperplane. In the case of searching for the maximum, the objective hyperplane is positioned so that the value of the objective function at all its points exceeds the value of the objective function at all points of the feasible region, which is a bounded convex polytope. For an arbitrary point of the objective hyperplane, the objective projection onto the polytope is determined: the closer the objective projection point is to the objective hyperplane, the greater the value of the objective function at this point. Based on the objective hyperplane, a finite regular set of points, called the receptive field, is constructed. Using objective projections, an image of the polytope is constructed. This image comprises the distances from the receptive points to the corresponding points of the polytope surface. Based on the proposed model, a parallel algorithm for visualizing a linear programming problem is constructed, and an analytical estimation of its scalability is performed. Information about the software implementation and the results of large-scale computational experiments confirming the efficiency of the proposed approaches is presented.
1 Introduction
The rapid development of Big Data technologies [11, 12] has led to the emergence of mathematical optimization models in the form of large-scale linear programming (LP) problems [24]. Such problems arise in industry, economics, logistics, statistics, quantum physics, and other fields [3, 4, 8, 22, 25]. In many cases, conventional software is unable to handle such large-scale LP problems in an acceptable time [2]. At the same time, in the near future, exascale supercomputers potentially capable of solving such problems will appear [6]. Accordingly, the development of new, effective methods for solving large-scale LP problems on exascale supercomputing systems is an urgent issue.
To date, the class of algorithms proposed and developed by Dantzig on the basis of the simplex method [5] remains one of the most common ways to solve LP problems. The simplex method is effective for solving a large class of LP problems. However, it has some fundamental features that limit its applicability to large LP problems. First, in the worst case, the simplex method traverses all the vertices of the simplex, which results in exponential time complexity [35]. Second, in most cases, the simplex method successfully solves LP problems containing up to 50,000 variables. However, a loss of precision is observed when the simplex method is used for solving larger LP problems, and this loss cannot be compensated even by applying such computationally intensive procedures as “affine scaling” or “iterative refinement” [34]. Third, the simplex method does not scale well on multiprocessor systems with distributed memory: many attempts to parallelize it were made, but they all failed [19]. In [14], Karmarkar proposed the interior point method, which has polynomial time complexity in all cases. This method effectively solves problems with millions of variables and millions of constraints. Unlike the simplex method, the interior point method is self-correcting and is therefore robust to the loss of precision in computations. The drawbacks of the interior point method are as follows. First, it requires careful tuning of its parameters. Second, the method needs a known point belonging to the feasible region of the LP problem to start the calculations; finding such an interior point can be reduced to solving an additional LP problem. An alternative is iterative projection-type methods [23, 26, 31], which are also self-correcting. Third, like the simplex method, the interior point method does not scale well on multiprocessor systems with distributed memory.
Several attempts at effective parallelization were made for particular cases (see, for example, [10, 15]). However, efficient parallelization has not been achieved in the general case. Accordingly, the development of new scalable methods for solving LP problems remains an urgent research direction.
A possible efficient alternative to the conventional LP methods is optimization methods based on neural network models. Artificial neural networks [20, 21] are one of the most promising and rapidly developing areas of modern information technology. Neural networks are a universal tool capable of solving problems in almost all application areas. The most impressive successes have been achieved in image recognition and analysis using convolutional neural networks [18]. However, scientific periodicals contain almost no works devoted to the use of convolutional neural networks for solving linear optimization problems [17]. The reason is that convolutional neural networks are designed for image processing, whereas the scientific literature offers no works on the visual representation of multidimensional linear programming problems. Thus, the issue of developing new neural network models and methods focused on linear optimization remains open.
In this paper, we develop an n-dimensional mathematical model of the visual representation of the LP problem. This model allows one to employ the technique of artificial neural networks to solve multidimensional linear optimization problems whose feasible region is a bounded non-empty set. The visualization method based on the described model has high computational complexity. For this reason, we propose its implementation as a parallel algorithm designed for cluster computing systems. The rest of the paper is organized as follows. Section 2 is devoted to the design of the mathematical model of the visual representation of multidimensional LP problems. Section 3 describes the implementation of the proposed visualization method as a parallel algorithm and provides an analytical estimation of its scalability. Section 4 presents information about the software implementation of the described parallel algorithm and discusses the results of large-scale computational experiments on a cluster computing system. Section 5 summarizes the obtained results and provides directions for further research.
2 Mathematical Model of the LP Visual Representation
The linear optimization problem can be stated as follows:
\(\bar{x}={\text {arg}}\max \left\{ \left\langle c,x \right\rangle \mid Ax \leqslant b,\ x\in \mathbb {R}^n\right\} ,\qquad (1)\)
where \(c \in \mathbb {R}^n\), \(b \in \mathbb {R}^m\), \(A \in \mathbb {R}^{m\times n}\), and \(c\ne \mathbf {0}\). Here and below, \(\left\langle \cdot \;,\cdot \right\rangle \) stands for the dot product of vectors. We assume that the constraint \(x \geqslant \mathbf {0}\) is also included in the system \(Ax \leqslant b\) in the form of the inequalities \(-x_j \leqslant 0\ (j=1,\ldots ,n)\).
The vector c is the gradient of the linear objective function \(f(x)=\left\langle c,x \right\rangle \).
Let M denote the feasible region of problem (1): \(M=\left\{ x\in \mathbb {R}^n : Ax \leqslant b\right\} \).
We assume from now on that M is a non-empty bounded set. This means that M is a convex closed polytope in the space \(\mathbb {R}^n\), and the solution set of problem (1) is not empty.
Let \(\tilde{a}_i\in \mathbb {R}^n\) be the vector formed by the elements of the ith row of the matrix A. Then, the matrix inequality \(Ax \leqslant b\) can be represented as the system of inequalities \(\left\langle \tilde{a}_i,x \right\rangle \leqslant b_i\ (i=1,\ldots ,m)\).
We assume from now on that
for all \(i=1,\ldots ,m\). Let us denote by \(H_i\) the hyperplane defined by the equation \(\left\langle \tilde{a}_i,x \right\rangle = b_i\).
Thus,
Definition 1
The half-space \(H_{i}^+\) generated by the hyperplane \(H_i\) is the half-space defined by the inequality \(\left\langle \tilde{a}_i,x \right\rangle \leqslant b_i\).
From now on, we assume that problem (1) is non-degenerate, i.e.,
Definition 2
The half-space \(H_{i}^+\) generated by the hyperplane \(H_i\) is recessive with respect to the vector c if
In other words, the ray coming from the hyperplane \(H_{i}\) in the direction opposite to the vector c lies completely in \(H_{i}^+\), but not in \(H_{i}\).
Proposition 1
The necessary and sufficient condition for the recessivity of the half-space \(H_{i}^+\) with respect to the vector c is the condition \(\left\langle \tilde{a}_i,c \right\rangle >0\).
Proof
Let us prove the necessity first. Let condition (10) hold. Equation (7) implies
By virtue of (5),
Comparing (10) with (12) and (13), we obtain
In view of (7) and (8), this implies
Using simple algebraic transformations of inequality (14), we obtain (11). Thus, the necessity is proved.
Let us prove the sufficiency by contradiction. Assume that (11) holds, and there are \(x\in H_i\) and \(\lambda >0\) such that
In accordance with (7) and (8), this implies
that is equivalent to
Since \(\lambda >0\), it follows from (11) that
but this contradicts our assumption that \(x\in H_i\). \(\square \)
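Proposition 1 turns recessivity into a purely algebraic test that never looks at the bound \(b_i\): the half-space \(H_i^+\) is recessive with respect to c exactly when \(\left\langle \tilde{a}_i,c \right\rangle >0\). A minimal sketch in Python (the function name and the 2D data are illustrative, not part of the paper's implementation):

```python
import numpy as np

def is_recessive(a_i, c):
    """Proposition 1: the half-space {x : <a_i, x> <= b_i} is recessive
    with respect to c iff <a_i, c> > 0 (the bound b_i plays no role)."""
    return float(np.dot(a_i, c)) > 0.0

# 2D illustration with objective gradient c = (1, 1):
c = np.array([1.0, 1.0])
assert is_recessive(np.array([1.0, 0.0]), c)       # x1 <= b_i blocks motion along c
assert not is_recessive(np.array([-1.0, 0.0]), c)  # x1 >= 0 does not
```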
Definition 3
Fix a point \(z\in \mathbb {R}^n\) such that the half-space \(H_c^+=\left\{ x\in \mathbb {R}^n : \left\langle c,x \right\rangle \leqslant \left\langle c,z \right\rangle \right\} \) includes the polytope M: \(M\subset H_c^+\). In this case, we call the half-space \(H_c^+\) the objective half-space, and the hyperplane \(H_c\), defined by the equation \(\left\langle c,x \right\rangle = \left\langle c,z \right\rangle \), the objective hyperplane.
Denote by \(\pi _c(x)\) the orthogonal projection of the point x onto the objective hyperplane \(H_c\): \(\pi _c(x)=x+\frac{\left\langle c,z-x \right\rangle }{\left\langle c,c \right\rangle }\,c\). Here, \(\Vert \cdot \Vert \) stands for the Euclidean norm. Define the distance \(\rho _c(x)\) from \(x\in H_c^+\) to the objective hyperplane \(H_c\) as follows: \(\rho _c(x)=\left\| \pi _c(x)-x\right\| \). Comparing (15), (17) and (18), we find that, in this case, the distance \(\rho _c(x)\) can be calculated as follows: \(\rho _c(x)=\frac{\left\langle c,z-x \right\rangle }{\Vert c\Vert }\).
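The projection and distance can be checked numerically. The sketch below assumes the standard orthogonal-projection formula \(\pi _c(x)=x+\frac{\langle c,z-x\rangle }{\langle c,c\rangle }c\) and the dot-product reduction (19) \(\rho _c(x)=\frac{\langle c,z-x\rangle }{\Vert c\Vert }\); the names and the 2D data are illustrative:

```python
import numpy as np

def pi_c(x, z, c):
    """Orthogonal projection of x onto the objective hyperplane
    H_c = { y : <c, y> = <c, z> }."""
    c = np.asarray(c, dtype=float)
    return x + (np.dot(c, z - x) / np.dot(c, c)) * c

def rho_c(x, z, c):
    """Distance from x in H_c^+ to H_c; by (19) it reduces to a dot product."""
    c = np.asarray(c, dtype=float)
    return np.dot(c, z - x) / np.linalg.norm(c)

c = np.array([0.0, 2.0])          # objective gradient
z = np.array([0.0, 5.0])          # point fixing H_c: the line x2 = 5
x = np.array([3.0, 1.0])          # a point strictly inside H_c^+
assert np.allclose(pi_c(x, z, c), [3.0, 5.0])
assert np.isclose(rho_c(x, z, c), 4.0)
# The closed-form distance agrees with the norm-based definition (18):
assert np.isclose(rho_c(x, z, c), np.linalg.norm(x - pi_c(x, z, c)))
```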
The following Proposition 2 holds.
Proposition 2
For all \(x,y \in H_c^+\), \(\left\langle c,x \right\rangle > \left\langle c,y \right\rangle \Leftrightarrow \rho _c(x)<\rho _c(y)\).
Proof
Equation (19) implies that
\(\square \)
Proposition 2 says that problem (1) is equivalent to the following problem: \(\bar{x}={\text {arg}}\min \left\{ \rho _c(x) \mid x\in M\right\} \).
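The equivalence asserted by Proposition 2 is easy to observe numerically: over the vertices of a small polytope, the maximizer of \(\left\langle c,x \right\rangle \) and the minimizer of \(\rho _c\) coincide. A toy check on the unit square (the data are illustrative):

```python
import numpy as np

# Vertices of a small polytope M (the unit square in R^2) and an
# objective hyperplane H_c fixed by a point z with <c,x> <= <c,z> on M.
c = np.array([1.0, 2.0])
z = np.array([2.0, 2.0])                     # <c,z> = 6 exceeds <c,x> on M
vertices = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)

objective = vertices @ c                     # <c,x> at every vertex
rho = (np.dot(c, z) - objective) / np.linalg.norm(c)   # Eq. (19)

# Proposition 2: maximizing <c,x> over M = minimizing rho_c over M.
assert np.argmax(objective) == np.argmin(rho)
assert np.all(rho >= 0)                      # M lies in H_c^+
```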
Definition 4
Let the half-space \(H_i^+\) be recessive with respect to the vector c. The objective projection \(\gamma _i(x)\) of the point \(x\in \mathbb {R}^n\) onto the recessive half-space \(H_i^+\) is the point \(\gamma _i(x)=x-\sigma c\), where \(\sigma =\min \left\{ \sigma '\in \mathbb {R}_{\geqslant 0} : x-\sigma 'c\in H_i^+\right\} \).
Examples of objective projections in \(\mathbb {R}^2\) are shown in Fig. 1.
The following Proposition 3 provides an equation for calculating the objective projection onto a half-space that is recessive with respect to the vector c.
Proposition 3
Let the half-space \(H_i^+\) defined by the inequality \(\left\langle \tilde{a}_i,x \right\rangle \leqslant b_i\)
be recessive with respect to the vector c. Let
Then,
Proof
According to Definition 4, we have
where
Thus, we need to prove that
Consider the straight line L defined by the parametric equation \(y=x-\tau c\ (\tau \in \mathbb {R})\).
Let the point q be the intersection of the line L with the hyperplane \(H_i\):
Then, q must satisfy the equation
for some \(\tau '\in \mathbb {R}\). Substitute the right side of Eq. (27) into Eq. (6) instead of x:
It follows that
Substituting the right side of Eq. (28) into Eq. (27) instead of \(\tau '\), we obtain
which is equivalent to
Since, according to (26), \(q\in H_i\), Eq. (25) will hold if
holds. Assume the opposite, i.e., there exist \(\sigma '>0\) such that
and
Then, it follows from (22) and (32) that
This is equivalent to
Proposition 1 implies that \(\left\langle \tilde{a}_i,c \right\rangle >0\). Hence, Eq. (33) is equivalent to
Thus, we have a contradiction with (31). \(\square \)
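The proof above yields an explicit recipe for the objective projection: the ray \(x-\tau c\) meets \(H_i\) at \(\tau =\frac{\langle \tilde{a}_i,x\rangle -b_i}{\langle \tilde{a}_i,c\rangle }\), and \(\sigma =0\) when x already lies in \(H_i^+\). A hedged Python sketch reconstructed from the proof of Proposition 3 (not a verbatim transcription of Eq. (24); names and data are illustrative):

```python
import numpy as np

def gamma_i(x, a_i, b_i, c):
    """Objective projection of x onto the half-space {y : <a_i, y> <= b_i},
    assumed recessive w.r.t. c (i.e., <a_i, c> > 0): move from x in the
    direction -c until the half-space is reached (sigma = 0 when x is
    already inside)."""
    a_i = np.asarray(a_i, dtype=float)
    x = np.asarray(x, dtype=float)
    slack = np.dot(a_i, x) - b_i
    if slack <= 0.0:                   # x already lies in H_i^+
        return x
    tau = slack / np.dot(a_i, c)       # parameter at which x - tau*c hits H_i
    return x - tau * c

c = np.array([0.0, 1.0])                      # maximize x2
a_i, b_i = np.array([0.0, 1.0]), 3.0          # constraint x2 <= 3
assert np.allclose(gamma_i([1.0, 5.0], a_i, b_i, c), [1.0, 3.0])  # lands on H_i
assert np.allclose(gamma_i([1.0, 2.0], a_i, b_i, c), [1.0, 2.0])  # already inside
```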
Definition 5
Let \(g\in H_c\). The objective projection \(\gamma _M(g)\) of the point g onto the polytope M is a point defined by the following equation:
where
If
then we set \(\gamma _M(g)=\vec \infty \), where \(\vec \infty \) stands for a point that is infinitely far from the polytope M.
Examples of objective projections onto the polytope M in \(\mathbb {R}^2\) are shown in Fig. 2.
Definition 6
The receptive field \(\mathfrak {G}(z,\eta ,\delta )\subset H_c\) of the density \(\delta \in \mathbb {R}_{>0}\) with the center \(z\in H_c\) and the rank \(\eta \in \mathbb {N}\) is a finite ordered set of points satisfying the following conditions:
The points of the receptive field will be called receptive points.
Here, \({\text {Co}}(X)\) stands for the convex hull of a finite point set \(X=\big \lbrace x^{(1)},\ldots ,x^{(K)}\big \rbrace \subset \mathbb {R}^n\):
In Definition 6, condition (35) means that the center of the receptive field belongs to this field. Condition (36) implies that the distance from the central point z to each point g of the receptive field does not exceed \(\eta \delta \sqrt{n}\). According to (37), for any two different points \(g'\ne g''\) of the receptive field, the distance between them cannot be less than \(\delta \). Condition (38) says that for any point \(g'\) of the receptive field, there is a point \(g''\) in this field such that the distance between \(g'\) and \(g''\) is equal to \(\delta \). Condition (39) implies that for any point x belonging to the convex hull of the receptive field, there is a point g in this field such that the distance between x and g does not exceed \(\tfrac{1}{2} \delta \sqrt{n}\). An example of the receptive field in the space \(\mathbb {R}^3\) is presented in Fig. 3.
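For \(n=2\), the hyperplane \(H_c\) is a line and the receptive field degenerates into \(2\eta +1\) equally spaced points, which makes conditions (36) and (37) easy to verify directly. A small sketch (the unit vector e spanning \(S_c\) and all numeric values are illustrative assumptions):

```python
import numpy as np

# A receptive field for n = 2: (2*eta+1) points spaced delta apart
# around the center z, along a unit vector e orthogonal to c.
eta, delta, n = 3, 0.5, 2
z = np.array([1.0, 1.0])
c = np.array([1.0, 1.0])
e = np.array([1.0, -1.0]) / np.sqrt(2.0)     # orthonormal basis of S_c
field = [z + delta * (i - eta) * e for i in range(2 * eta + 1)]

# Condition (36): every point lies within eta*delta*sqrt(n) of the center.
assert all(np.linalg.norm(g - z) <= eta * delta * np.sqrt(n) + 1e-12
           for g in field)
# Condition (37): distinct points are at least delta apart.
dists = [np.linalg.norm(g1 - g2)
         for i, g1 in enumerate(field) for g2 in field[i + 1:]]
assert min(dists) >= delta - 1e-12
```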
Let us describe a constructive method for building a receptive field. Without loss of generality, we assume that \(c_n\ne 0\). Consider the following set of vectors:
It is easy to see that
This means that \(c_0,\ldots ,c_{n-1}\) is an orthogonal basis in \(\mathbb {R}^n\). In particular,
The following Proposition 4 shows that the linear subspace of the dimension \((n-1)\) generated by the orthogonal vectors \(c_1,\ldots ,c_{n-1}\) is a hyperplane parallel to the hyperplane \(H_c\).
Proposition 4
Define the following linear subspace \(S_c\) of the dimension \((n-1)\) in \(\mathbb {R}^n\):
Then,
Proof
Let \(s\in S_c\), i.e.,
Then,
In view of (40), this implies
Comparing this with (16), we obtain \(s+z\in H_c\). \(\square \)
Define the following set of vectors: \(e_i=c_i/\Vert c_i\Vert \ (i=1,\ldots ,n-1)\).
It is easy to see that the set \(\{e_1,\ldots ,e_{n-1}\}\) is an orthonormal basis of the subspace \(S_c\).
The procedure for constructing a receptive field is presented as Algorithm 1. This algorithm constructs a receptive field \(\mathfrak {G}(z,\eta ,\delta )\) consisting of \((2\eta +1)^{n-1}\) points. These points are arranged at the nodes of a regular lattice having the form of a hypersquare (a hypercube of the dimension \(n-1\)) with the edge length equal to \(2\eta \delta \). The edge length of the unit cell is \(\delta \). According to Step 13 of Algorithm 1 and Proposition 4, this hypersquare lies in the hyperplane \(H_c\) and is centered at the point z. The drawback of Algorithm 1 is that the number of nested for loops depends on the dimension of the space. This issue can be solved using the function \({\text {G}}\), which calculates a point of the receptive field from its ordinal number (numbering starts from zero; the order is determined by Algorithm 1). The implementation of the function \({\text {G}}\) is represented as Algorithm 2. The following Proposition 5 provides an estimation of the time complexity of Algorithm 2.
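The idea behind the function \({\text {G}}\) can be sketched as a mixed-radix decomposition: the ordinal number k is read digit by digit in base \(2\eta +1\), and each digit selects an offset along one basis vector of \(S_c\). In this sketch, the orthonormal basis is passed in ready-made, and the exact digit ordering of the paper's Algorithm 1 is an assumption:

```python
import numpy as np

def receptive_point(k, n, z, eta, delta, basis):
    """Sketch of the function G: map the ordinal number k (0-based) to
    the k-th node of the (2*eta+1)^(n-1) lattice centered at z.
    `basis` is an orthonormal basis e_1,...,e_{n-1} of the subspace S_c
    orthogonal to c; the paper builds it from c assuming c_n != 0, which
    is not reproduced here."""
    g = np.array(z, dtype=float)
    side = 2 * eta + 1
    for j in range(n - 1):          # digits of k in base (2*eta+1)
        digit = k % side
        k //= side
        g += delta * (digit - eta) * basis[j]
    return g

# R^3 example with c = (0, 0, 1): H_c is the plane x3 = 5.
z = np.array([0.0, 0.0, 5.0])
basis = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
assert np.allclose(receptive_point(4, 3, z, 1, 0.5, basis), z)  # k=4: center
assert np.allclose(receptive_point(0, 3, z, 1, 0.5, basis), [-0.5, -0.5, 5.0])
```

With \(\eta =1\) and \(n=3\), the lattice contains \(3^2=9\) points, matching the cardinality \((2\eta +1)^{n-1}\).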
Proposition 5
Algorithm 2 enables an implementation that has time complexity (see Note 1) \(c_G=4n^2+5n-9\), where n is the space dimension.
Proof
Consider Algorithm 3 representing a low-level implementation of Algorithm 2. The values calculated in Steps 1–2 of Algorithm 3 do not depend on the receptive point number k and therefore can be considered constants. In Steps 3–8, the repeat/until loop runs \((n-1)\) times and requires \(c_{3:8}=5(n-1)\) operations. In Steps 13–16, the nested repeat/until loop runs n times and requires \(c_{13:16}=4n\) operations. In Steps 10–18, the external repeat/until loop runs \((n-1)\) times and requires \(c_{10:18}=(4+c_{13:16})(n-1)=4(n^2-1)\) operations. In total, we obtain \(c_G=c_{3:8}+c_{10:18}=4n^2+5n-9\).
\(\square \)
Corollary 1
The time complexity of Algorithm 2 can be estimated as \(O(n^2)\).
Definition 7
Let \(z\in H_c\). Fix \(\eta \in \mathbb {N}\), \(\delta \in \mathbb {R}_{>0}\). The image \(\mathfrak {I}(z,\eta ,\delta )\) generated by the receptive field \(\mathfrak {G}(z,\eta ,\delta )\) is an ordered set of real numbers defined by the equation
The order of the real numbers in the image is determined by the order of the respective receptive points.
The following Algorithm 4 implements the function \(\mathfrak {I}(z,\eta ,\delta )\) building an image as a list of real numbers.
Here, \([\,]\) stands for the empty list, and \({+}\!\!{+}\) stands for the operation of list concatenation.
Let \(\left\langle \tilde{a}_i,c\right\rangle >0\). This means that the half-space \(H_i^+\) is recessive with respect to the vector c (see Proposition 1). Let there be a point \(u\in H_i\cap M\). Assume that we managed to create an artificial neural network DNN, which receives the image \(\mathfrak {I}(\pi _c(u),\eta ,\delta )\) as an input and outputs the point \(u'\) such that
Then, we can build the following Algorithm 5 solving linear programming problem (20) using the DNN.
Only an outline of this algorithm is presented here; it needs further formalization, detailing, and refinement.
3 Parallel Algorithm for Building an LP Problem Image
When solving LP problems of large dimension with a large number of constraints, Algorithm 4 of building an LP problem image can incur significant runtime overhead. This section presents a parallel version of Algorithm 4, which significantly reduces the runtime overhead of building the image of a large-scale LP problem. The parallel implementation of Algorithm 4 is based on the BSF parallel computation model [27, 28]. The BSF model is intended for cluster computing systems, uses the master/worker paradigm, and requires the representation of the algorithm in the form of operations on lists using the higher-order functions Map and Reduce defined in the Bird–Meertens formalism [1]. The BSF model also provides a cost metric for the analytical evaluation of the scalability of a parallel algorithm that meets the specified requirements. Examples of the BSF model application can be found in [7, 30–33].
Let us represent Algorithm 4 in the form of operations on lists using the higher-order functions Map and Reduce. As the second parameter of the higher-order function Map, we use the list of ordinal numbers of the inequalities of system (4): \(\mathcal {L}_{map}=\left[ 1,\ldots ,m\right] \).
Designate \(\mathbb {R}_\infty =\mathbb {R}\cup \{\infty \}\). We define a parameterized function \({\text {F}}_k:\{1,\ldots ,m\}\rightarrow \mathbb {R}_\infty \),
which is the first parameter of the higher-order function Map, as follows:
where \(g_k=G(k,n,z,\eta ,\delta )\) (see Algorithm 2), and \(\gamma _i(g_k)\) is calculated by Eq. (24). Informally, the function \({\text {F}}_k\) maps the ordinal number of the half-space \(H_i^+\) to the distance from the objective projection to the objective hyperplane if \(H_i^+\) is recessive with respect to c (see Proposition 1), and the objective projection belongs to M. Otherwise, \({\text {F}}_k\) returns the special value \(\infty \).
The higher-order function Map transforms the list \(\mathcal {L}_{map}\) into the list \(\mathcal {L}_{reduce}\) by applying the function \({\text {F}}_k\) to each element of the list \(\mathcal {L}_{map}\): \(\mathcal {L}_{reduce}=\left[ {\text {F}}_k(1),\ldots ,{\text {F}}_k(m)\right] \).
Define the associative binary operation \(\oplus \) on \(\mathbb {R}_\infty \) as follows: \(a\oplus b=\min (a,b)\). Informally, the operation \(\oplus \) calculates the minimum of two numbers.
The higher-order function Reduce folds the list \(\mathcal {L}_{reduce}\) to the single value \(\rho \in \mathbb {R}_\infty \) by sequentially applying the operation \(\oplus \) to the entire list: \(\rho ={\text {F}}_k(1)\oplus \cdots \oplus {\text {F}}_k(m)\).
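Putting the pieces together, the serial Map/Reduce computation for one receptive point can be sketched as follows. This is an illustrative reconstruction, not the paper's Eq. (48): the feasibility tolerance, the distance convention (measured from \(g_k\) along \(-c\), which coincides with \(\rho _c\) of the projection since \(g_k\in H_c\)), and all names are assumptions of the sketch:

```python
import numpy as np

INF = float("inf")

def F(g_k, A, b, c, eps=1e-9):
    """Map step: for each constraint i, the distance from g_k to its
    objective projection onto H_i^+, or inf when H_i^+ is not recessive
    with respect to c or the projection falls outside M."""
    result = []
    for a_i, b_i in zip(A, b):
        s = np.dot(a_i, c)
        if s <= 0.0:                        # not recessive (Proposition 1)
            result.append(INF)
            continue
        tau = max(np.dot(a_i, g_k) - b_i, 0.0) / s
        y = g_k - tau * c                   # objective projection of g_k
        result.append(tau * np.linalg.norm(c)
                      if np.all(A @ y <= b + eps) else INF)
    return result

def image_pixel(g_k, A, b, c):
    """Reduce step: fold the mapped list with min."""
    return min(F(g_k, A, b, c))

# Box 0 <= x1 <= 1, 0 <= x2 <= 3; maximize x2 (c points up).
A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.array([1.0, 0.0, 3.0, 0.0])
c = np.array([0.0, 1.0])
assert image_pixel(np.array([0.5, 5.0]), A, b, c) == 2.0   # hits facet x2 = 3
assert image_pixel(np.array([2.0, 5.0]), A, b, c) == INF   # ray misses M
```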
Algorithm 6 builds the image \(\mathfrak {I}\) of the LP problem using higher-order functions \(Map \) and \(Reduce \). The parallel version of Algorithm 6 is based on algorithmic template 2 in [28]. The result is presented as Algorithm 7.
Let us explain the steps of Algorithm 7. For simplicity, we assume that the number of constraints m is a multiple of the number of workers L. We also assume that the numbering of inequalities starts from zero. The parallel algorithm includes \(L+1\) processes: one master process and L worker processes. The master manages the computations. In Step 1, the master reads the space dimension n. In Step 2 of the master, the image variable \(\mathfrak {I}\) is initialized to the empty list. Step 3 of the master assigns zero to the iteration counter k. At Steps 4–14, the master organizes the repeat/until loop, in which the image \(\mathfrak {I}\) of the LP problem is built. In Step 5, the master sends the number k of the current receptive point to all workers. In Step 8, the master waits for the partial results from all workers. These partial results are folded to a single value, which is added to the image \(\mathfrak {I}\) (Steps 9–10 of the master). Step 11 of the master increases the iteration counter k by 1. Step 12 of the master assigns the logical value \(\left( k\geqslant (2\eta +1)^{n-1}\right) \) to the Boolean variable exit. In Step 13, the master sends the value of the Boolean variable exit to all workers. According to (44), \(exit=false\) means that not all the points of the receptive field have been processed. In this case, the control is passed to the next iteration of the external repeat/until loop (Step 14 of the master). After exiting the repeat/until loop, the master outputs the constructed image \(\mathfrak {I}\) (Step 15) and terminates its work (Step 16).
All workers execute the same program code, but with different data. In Step 3, the lth worker defines its own sublist. In Step 4, the worker enters the repeat/until loop. In Step 5, it receives the number k of the next receptive point. In Step 6, the worker processes its sublist \(\mathcal {L}_{map(l)}\) using the higher-order function Map, which applies the parameterized function \({\text {F}}_k\), defined by (48), to each element of the sublist. The result is the sublist \(\mathcal {L}_{reduce(l)}\), which includes the distances \({\text {F}}_k(i)\) from the objective hyperplane \(H_c\) to the objective projections of the receptive point \(g_k\) onto the hyperplanes \(H_i\) for all i from the sublist \(\mathcal {L}_{map(l)}\). In Step 7, the worker uses the higher-order function Reduce to fold the sublist \(\mathcal {L}_{reduce(l)}\) to the single value \(\rho _l\) using the associative binary operation \(\oplus \), which calculates the minimum distance. The computed partial result is sent to the master (Step 8 of the worker). In Step 13, the worker waits for the master to send the value of the Boolean variable exit. If the received value is false, the worker continues executing the repeat/until loop (Step 14 of the worker). Otherwise, the worker process terminates in Step 16.
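The correctness of splitting the Map list among L workers rests on the associativity and commutativity of the min operation: folding each sublist and then folding the L partial results yields the same value as folding the whole list. A tiny illustration (a round-robin split is used here for brevity; any partition gives the same result):

```python
values = [7.0, float("inf"), 3.5, 9.1, float("inf"), 4.2]
L = 3                                          # number of workers
sublists = [values[l::L] for l in range(L)]    # each worker's sublist
partials = [min(s) for s in sublists]          # Step 7 of every worker
assert min(partials) == min(values)            # Step 9 of the master
```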
Let us obtain an analytical estimation of the scalability bound of parallel Algorithm 7 using the cost metric of the BSF parallel computation model [28]. Here, the scalability bound means the number of workers at which the maximum speedup is achieved. The cost metric of the BSF model includes the following cost parameters for the repeat/until loop (Steps 4–14) of parallel Algorithm 7:
- m: length of the list \(\mathcal {L}_{map}\);
- D: latency (time taken by the master to send a one-byte message to a single worker);
- \(t_c\): time taken by the master to send the coordinates of the receptive point to a single worker and receive the computed value from it (including latency);
- \(t_{Map}\): time taken by a single worker to process the higher-order function Map for the entire list \(\mathcal {L}_{map}\);
- \(t_a\): time taken to compute the binary operation \(\oplus \).
According to Eq. (14) from [28], the scalability bound of Algorithm 7 can be estimated as follows:
Let us calculate estimations for the time parameters of Eq. (49). To do this, we introduce the following notation for a single iteration of the repeat/until loop (Steps 4–14) of Algorithm 7:
- \(c_c\): quantity of numbers sent from the master to a worker and back within one iteration;
- \(c_{Map}\): quantity of arithmetic and comparison operations computed in Step 5 of serial Algorithm 6;
- \(c_a\): quantity of arithmetic and comparison operations required to compute the binary operation \(\oplus \).
At the beginning of every iteration, the master sends each worker the receptive point number k. In response, the worker sends the distance from the receptive point \(g_k\) to its objective projection. Therefore, \(c_c=2\).
In the context of Algorithm 6, \(c_{Map}=c_G+m\,c_{F_k}\), where \(c_G\) is the number of operations taken to compute the coordinates of the point \(g_k\), and \(c_{F_k}\) is the number of operations required to calculate the value of \({\text {F}}_k(i)\), assuming that the coordinates of the point \(g_k\) have already been calculated. The estimation of \(c_G\) is provided by Proposition 5. Let us estimate \(c_{F_k}\). According to (24), calculating the objective projection \(\gamma _i(g)\) takes \((6n-2)\) arithmetic operations. It follows from (19) that the calculation of \(\rho _c(x)\) takes \((5n-1)\) arithmetic operations. Inequalities (4) imply that checking the condition \(x\in M\) takes \(m(2n-1)\) arithmetic operations and m comparison operations. Hence, \({\text {F}}_k(i)\) takes a total of \(c_{F_k}=2mn+11n-3\) operations.
Substituting the right-hand sides of Eqs. (45) and (52) in (51), we obtain
To perform the binary operation \(\oplus \), one comparison operation must be executed: \(c_a=1\).
Let \(\tau _{op}\) stand for the average execution time of arithmetic and comparison operations, and let \(\tau _{tr}\) stand for the average time of sending a single real number (excluding latency). Then, using Eqs. (50), (53), and (54) we obtain
Substituting the right-hand sides of Eqs. (55)–(57) in (49), we obtain the following estimations of the scalability bound of Algorithm 7:
where n is the space dimension, m is the number of constraints, D is the latency. For large values of m and n, this is equivalent to
If we assume that \(m=O(n)\), then it follows from (58) that \(L_{max}=O(n\sqrt{n})\), where n is the space dimension. Estimation (59) allows us to conclude that Algorithm 7 scales very well (see Note 2). In the following section, we verify analytical estimation (59) by conducting large-scale computational experiments on a real cluster computing system.
4 Computational Experiments
We implemented Algorithm 7 in the form of the ViLiPP (Visualization of Linear Programming Problem) parallel program in C++ using the BSF-skeleton [29]. The BSF-skeleton, based on the BSF parallel computation model, encapsulates all aspects related to the parallelization of the program using the MPI library [9] and the OpenMP programming interface [13]. The source code of the ViLiPP program is freely available on the Internet at https://github.com/nikolay-olkhovsky/LP-visualization-MPI. Using the ViLiPP parallel program, we conducted experiments to evaluate the scalability of Algorithm 7 on the “Tornado SUSU” cluster computing system [16], the characteristics of which are presented in Table 1.
To conduct computational experiments, we constructed three random LP problems using the FRaGenLP problem generator [32]. The parameters of these problems are given in Table 2. In all cases, all elements of the matrix A of problem (1) were non-zero. For all problems, the rank \(\eta \) of the receptive field was set equal to 2. In accordance with Eq. (44), the receptive field cardinality grows exponentially with the space dimension.
The results of the computational experiments are presented in Table 3 and in Fig. 4. In all runs, a separate processor node was allocated for each worker. One more separate processor node was allocated for the master. The computational experiments show that the ViLiPP program scalability bound increases with an increase in the problem dimension. For LP5, the maximum of the speedup curve is reached around 190 nodes. For LP6, the maximum is located around 260 nodes. For LP7, the scalability bound is approximately equal to 326 nodes. At the same time, there is an exponential increase in the runtime of building the LP problem image. Building the LP5 problem image takes 10 s on 11 processor nodes. Building the LP7 problem image takes 5 min on the same number of nodes. An additional computational experiment shows that building an image of the problem with \(n= 9\) takes 1.5 h on 11 processor nodes.
The conducted experiments show that, at the current level of development of high-performance computing, the proposed method is applicable to solving LP problems with up to 100 variables and up to 100,000 constraints.
5 Conclusion
The main contribution of this work is a mathematical model of the visual representation of a multidimensional linear programming problem of finding the maximum of a linear objective function in a feasible region. The central element of the model is the receptive field, which is a finite set of points located at the nodes of a square lattice constructed inside a hypercube. All points of the receptive field lie in the objective hyperplane orthogonal to the vector \(c=(c_1,\ldots ,c_n)\), which is composed of the coefficients of the linear objective function. The objective hyperplane is placed so that, for any point x of the feasible region and any point z of the objective hyperplane, the inequality \(\left\langle c,x\right\rangle < \left\langle c,z\right\rangle \) holds. We can say that the receptive field is a multidimensional abstraction of the digital camera image sensor. From each point of the receptive field, we construct a ray parallel to the vector c and directed toward the feasible region. The point at which the ray hits the feasible region is called the objective projection. The image of the linear programming problem is a matrix of dimension \((n-1)\), in which each element is the distance from the point of the receptive field to the corresponding objective projection point.
The algorithm for calculating the coordinates of a receptive field point from its ordinal number is described. It is shown that the time complexity of this algorithm can be estimated as \(O(n^2)\), where n is the space dimension. An outline of the algorithm for solving the linear programming problem by an artificial neural network using the constructed images is presented. A parallel algorithm for constructing the image of a linear programming problem on computing clusters is proposed. This algorithm is based on the BSF parallel computation model, which uses the master/worker paradigm and assumes a representation of the algorithm in the form of operations on lists using the higher-order functions Map and Reduce. It is shown that the scalability bound of the parallel algorithm admits the estimation \(O(n\sqrt{n})\). This means that the algorithm demonstrates good scalability.
The parallel algorithm for constructing the multidimensional image of a linear programming problem is implemented in C++ using the BSF-skeleton, which encapsulates all aspects related to parallelization by the MPI library and the OpenMP API. Using this software implementation, we conducted large-scale computational experiments on constructing images for random multidimensional linear programming problems with a large number of constraints on the “Tornado SUSU” computing cluster. The conducted experiments confirm the validity and efficiency of the proposed approaches. At the same time, it should be noted that the image construction time grows exponentially with the space dimension. Therefore, the proposed method is applicable to problems with at most 100 variables. However, the number of constraints can theoretically be unbounded.
Future research directions are as follows.
1. Develop a method for solving linear programming problems based on the analysis of their images and prove its convergence.
2. Develop and implement a method for training data set generation to create a neural network that solves linear programming problems by analyzing their images.
3. Develop and train an artificial neural network solving multidimensional linear programming problems.
4. Develop and implement a parallel program on a computing cluster that constructs multidimensional images of a linear programming problem and calculates its solution using an artificial neural network.
Notes
1. Here, time complexity refers to the number of arithmetic and comparison operations required to execute the algorithm.
2. Let \(L_{max}=O(n^\alpha )\). We say that the algorithm scales perfectly if \(\alpha > 1\), scales well if \(\alpha = 1\), demonstrates limited scalability if \(0<\alpha < 1\), and does not scale if \(\alpha = 0\).
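For illustration, this classification can be expressed as a small helper function (the string labels are shorthand introduced here, not the authors' terminology):

```python
def scalability_class(alpha: float) -> str:
    """Classify an algorithm by the exponent alpha in L_max = O(n^alpha)."""
    if alpha > 1:
        return "perfect"   # scales perfectly
    if alpha == 1:
        return "good"      # scales well
    if alpha > 0:
        return "limited"   # limited scalability
    return "none"          # does not scale
```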
References
Bird, R.S.: Lectures on constructive functional programming. In: Broy, M. (ed.) Constructive Methods in Computing Science. NATO ASI Series, vol. 55, pp. 151–216. Springer, Heidelberg (1988). https://doi.org/10.1007/978-3-642-74884-4_5
Bixby, R.: Solving real-world linear programs: a decade and more of progress. Oper. Res. 50(1), 3–15 (2002). https://doi.org/10.1287/opre.50.1.3.17780
Brogaard, J., Hendershott, T., Riordan, R.: High-frequency trading and price discovery. Rev. Financ. Stud. 27(8), 2267–2306 (2014). https://doi.org/10.1093/rfs/hhu032
Chung, W.: Applying large-scale linear programming in business analytics. In: 2015 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM), pp. 1860–1864. IEEE (2015). https://doi.org/10.1109/IEEM.2015.7385970
Dantzig, G.: Linear Programming and Extensions. Princeton University Press, Princeton (1998)
Dongarra, J., Gottlieb, S., Kramer, W.: Race to exascale. Comput. Sci. Eng. 21(1), 4–5 (2019). https://doi.org/10.1109/MCSE.2018.2882574
Ezhova, N.A., Sokolinsky, L.B.: Scalability evaluation of iterative algorithms used for supercomputer simulation of physical processes. In: Proceedings - 2018 Global Smart Industry Conference, GloSIC 2018, Art. No. 8570131, p. 10. IEEE (2018). https://doi.org/10.1109/GloSIC.2018.8570131
Gondzio, J., Gruca, J.A., Hall, J., Laskowski, W., Zukowski, M.: Solving large-scale optimization problems related to Bell’s Theorem. J. Comput. Appl. Math. 263, 392–404 (2014). https://doi.org/10.1016/j.cam.2013.12.003
Gropp, W.: MPI 3 and beyond: why MPI is successful and what challenges it faces. In: Träff, J.L., Benkner, S., Dongarra, J.J. (eds.) EuroMPI 2012. LNCS, vol. 7490, pp. 1–9. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33518-1_1
Hafsteinsson, H., Levkovitz, R., Mitra, G.: Solving large scale linear programming problems using an interior point method on a massively parallel SIMD computer. Parallel Algorithms Appl. 4(3–4), 301–316 (1994). https://doi.org/10.1080/10637199408915470
Hartung, T.: Making big sense from big data. Front. Big Data 1, 5 (2018). https://doi.org/10.3389/fdata.2018.00005
Jagadish, H.V., et al.: Big data and its technical challenges. Commun. ACM 57(7), 86–94 (2014). https://doi.org/10.1145/2611567
Kale, V.: Shared-memory parallel programming with OpenMP. In: Parallel Computing Architectures and APIs, chap. 14, pp. 213–222. Chapman and Hall/CRC, Boca Raton (2019). https://doi.org/10.1201/9781351029223-18/SHARED-MEMORY-PARALLEL-PROGRAMMING-OPENMP-VIVEK-KALE
Karmarkar, N.: A new polynomial-time algorithm for linear programming. Combinatorica 4(4), 373–395 (1984). https://doi.org/10.1007/BF02579150
Karypis, G., Gupta, A., Kumar, V.: A parallel formulation of interior point algorithms. In: Proceedings of the 1994 ACM/IEEE Conference on Supercomputing (Supercomputing 1994), Los Alamitos, CA, USA, pp. 204–213. IEEE Computer Society Press (1994). https://doi.org/10.1109/SUPERC.1994.344280
Kostenetskiy, P., Semenikhina, P.: SUSU supercomputer resources for industry and fundamental science. In: Proceedings - 2018 Global Smart Industry Conference, GloSIC 2018, Art. No. 8570068, p. 7. IEEE (2018). https://doi.org/10.1109/GloSIC.2018.8570068
Lachhwani, K.: Application of neural network models for mathematical programming problems: a state of art review. Arch. Comput. Methods Eng. 27(1), 171–182 (2019). https://doi.org/10.1007/s11831-018-09309-5
LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015). https://doi.org/10.1038/nature14539
Mamalis, B., Pantziou, G.: Advances in the parallelization of the simplex method. In: Zaroliagis, C., Pantziou, G., Kontogiannis, S. (eds.) Algorithms, Probability, Networks, and Games. LNCS, vol. 9295, pp. 281–307. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24024-4_17
Prieto, A., et al.: Neural networks: an overview of early research, current frameworks and new challenges. Neurocomputing 214, 242–268 (2016). https://doi.org/10.1016/j.neucom.2016.06.014
Schmidhuber, J.: Deep learning in neural networks: an overview. Neural Netw. 61, 85–117 (2015). https://doi.org/10.1016/j.neunet.2014.09.003
Sodhi, M.: LP modeling for asset-liability management: a survey of choices and simplifications. Oper. Res. 53(2), 181–196 (2005). https://doi.org/10.1287/opre.1040.0185
Sokolinskaya, I.: Parallel method of pseudoprojection for linear inequalities. In: Sokolinsky, L., Zymbler, M. (eds.) PCT 2018. CCIS, vol. 910, pp. 216–231. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-99673-8_16
Sokolinskaya, I., Sokolinsky, L.B.: On the solution of linear programming problems in the age of big data. In: Sokolinsky, L., Zymbler, M. (eds.) PCT 2017. CCIS, vol. 753, pp. 86–100. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-67035-5_7
Sokolinskaya, I., Sokolinsky, L.B.: Scalability evaluation of NSLP algorithm for solving non-stationary linear programming problems on cluster computing systems. In: Voevodin, V., Sobolev, S. (eds.) RuSCDays 2017. CCIS, vol. 793, pp. 40–53. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-71255-0_4
Sokolinskaya, I.M., Sokolinsky, L.B.: Scalability evaluation of cimmino algorithm for solving linear inequality systems on multiprocessors with distributed memory. Supercomput. Front. Innov. 5(2), 11–22 (2018). https://doi.org/10.14529/jsfi180202
Sokolinsky, L.B.: Analytical estimation of the scalability of iterative numerical algorithms on distributed memory multiprocessors. Lobachevskii J. Math. 39(4), 571–575 (2018). https://doi.org/10.1134/S1995080218040121
Sokolinsky, L.B.: BSF: a parallel computation model for scalability estimation of iterative numerical algorithms on cluster computing systems. J. Parallel Distrib. Comput. 149, 193–206 (2021). https://doi.org/10.1016/j.jpdc.2020.12.009
Sokolinsky, L.B.: BSF-skeleton: a template for parallelization of iterative numerical algorithms on cluster computing systems. MethodsX 8, Article Number 101437 (2021). https://doi.org/10.1016/j.mex.2021.101437
Sokolinsky, L.B., Sokolinskaya, I.M.: Scalable method for linear optimization of industrial processes. In: Proceedings - 2020 Global Smart Industry Conference, GloSIC 2020, pp. 20–26. Article Number 9267854. IEEE (2020). https://doi.org/10.1109/GloSIC50886.2020.9267854
Sokolinsky, L.B., Sokolinskaya, I.M.: Scalable parallel algorithm for solving non-stationary systems of linear inequalities. Lobachevskii J. Math. 41(8), 1571–1580 (2020). https://doi.org/10.1134/S1995080220080181
Sokolinsky, L.B., Sokolinskaya, I.M.: FRaGenLP: a generator of random linear programming problems for cluster computing systems. In: Sokolinsky, L., Zymbler, M. (eds.) PCT 2021. CCIS, vol. 1437, pp. 164–177. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-81691-9_12
Sokolinsky, L.B., Sokolinskaya, I.M.: VaLiPro: linear programming validator for cluster computing systems. Supercomput. Front. Innov. 8(3), 51–61 (2021). https://doi.org/10.14529/js210303
Tolla, P.: A survey of some linear programming methods. In: Paschos, V.T. (ed.) Concepts of Combinatorial Optimization, 2 edn, chap. 7, pp. 157–188. Wiley, Hoboken (2014). https://doi.org/10.1002/9781119005216.ch7
Zadeh, N.: A bad network problem for the simplex method and other minimum cost flow algorithms. Math. Program. 5(1), 255–266 (1973). https://doi.org/10.1007/BF01580132
Funding
The study was partially funded by the Russian Foundation for Basic Research (project No. 20-07-00092-a) and the Ministry of Science and Higher Education of the Russian Federation (government order FENU-2020-0022).
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Olkhovsky, N.A., Sokolinsky, L.B. (2022). Visualizing Multidimensional Linear Programming Problems. In: Sokolinsky, L., Zymbler, M. (eds) Parallel Computational Technologies. PCT 2022. Communications in Computer and Information Science, vol 1618. Springer, Cham. https://doi.org/10.1007/978-3-031-11623-0_13
Print ISBN: 978-3-031-11622-3
Online ISBN: 978-3-031-11623-0