1 Introduction

Group Technology (GT) is a manufacturing philosophy based on identifying and grouping machines and parts by similarity, which brings advantages to production systems throughout all stages of design and manufacturing [1].

Applying GT to a production process yields a binary machine-part incidence matrix. A cellular manufacturing layout arranges these elements into cells, seeking to minimize intercellular movements and maximize intracellular ones. Manufacturing Cell Formation Problems (MCFPs) are considered NP-hard because of their combinatorial nature, and the related literature presents a considerable number of solution methods. Some of the best known, often used for quality comparison, are rank order clustering (ROC) [2], modified rank order clustering (MODROC) [3], genetic algorithms (GA) [4], and simulated annealing (SA) [5], whose results are measured with the grouping efficacy index.

In clustering problems, basic heuristics modified with embedded local search techniques outperform the original heuristics in terms of grouping effectiveness. The proposed framework employs such a hybridization in a GA to search for solutions to MCFPs, with k-means as the method chosen to improve the individuals of the population. The objective function is the grouping efficacy, known in the literature as a good index of cell clustering performance and also used in the GT works against which the results are compared.

2 Machine-Part Cell Formation Problem

Group Technology gives similar treatment to similar elements: it divides the manufacturing system into small groups (cells) of machines that process groups of highly similar parts (families).

Thus, the parts of a family are always processed by the same machines, and tool changes require less setup time, which in turn reduces the processing time of the system.

Applying GT to a production system involves several steps. In one of them, Production Flow Analysis (PFA), a binary matrix is generated relating the existing parts to the machines that process them. This matrix is called the incidence matrix, and at this stage the most similar elements must be grouped so that part families and machine cells are built, which is known as the Manufacturing Cell Formation Problem (MCFP).

With the main objective of assigning machines to cells and parts to families, this arrangement needs a function to guide the quality of the groupings. Several are well known, such as machine utilization [6], grouping efficiency [7], and grouping efficacy [8]. According to [6], the two most used are grouping efficiency and grouping efficacy.

The grouping efficacy (µ) defined below is adopted for two reasons. First, it overcomes the weaker discriminating power of the grouping efficiency measure by assigning equal weight to the number of voids and the number of exceptional elements. Second, the works used for performance comparison report their results with this quality indicator. The measure is defined as follows:

$$ \mu = \frac{{e - e_{0} }}{{e + e_{v} }} $$
(1)

where \( e \) is the total number of operations (1’s) in the given matrix, \( e_{v} \) is the number of voids (0’s inside the diagonal blocks), and \( e_{0} \) is the number of exceptional elements (1’s outside the diagonal blocks).

Also, [9] and [10] justify adopting this index because, among other properties, it incorporates both within-cell machine utilization and inter-cell movement, generates block diagonal matrices, which are attractive in practice, and is independent of the number of cells.

Fig. 1 shows a binary incidence matrix. Reordered to form machine cells and part families, it has voids (in green) and exceptional elements (in red). A void means that, although the machine and the part have been assigned to the corresponding cell and family, the machine does not process the part. An exceptional element implies intercellular movement, since the part requires some processing in another cell, which increases the processing time, among other complications.

Fig. 1. Example of a reordered two-cell incidence matrix.

Considering the example in Fig. 1, Eq. 2 shows the calculation of the grouping efficacy (µ), since \( e \), \( e_{0} \) and \( e_{v} \) can be easily counted:

$$ \mu = \frac{18 - 3}{18 + 3} = 0.7143 = 71.43\% $$
(2)
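
As an illustration of Eq. 1, the counting of \( e \), \( e_{0} \) and \( e_{v} \) can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the function name `grouping_efficacy`, the label lists and the 4 × 5 matrix are assumptions made for the example, and the matrix is not the one in Fig. 1.

```python
def grouping_efficacy(matrix, machine_cell, part_family):
    """Return (e - e0) / (e + ev) for a binary machine-part matrix.

    matrix[i][j] is 1 when machine i processes part j.
    machine_cell[i] and part_family[j] are block labels; a cell and a
    family that share a label form one diagonal block.
    """
    e = e0 = ev = 0
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            inside = machine_cell[i] == part_family[j]
            if value == 1:
                e += 1
                if not inside:
                    e0 += 1  # exceptional element: a 1 outside the blocks
            elif inside:
                ev += 1  # void: a 0 inside a block
    return (e - e0) / (e + ev)


# Hypothetical 4 x 5 incidence matrix (illustrative data only).
matrix = [
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
machine_cell = [0, 0, 1, 1]
part_family = [0, 0, 1, 1, 1]
efficacy = grouping_efficacy(matrix, machine_cell, part_family)
```

For this hypothetical matrix there are 9 operations, 1 exceptional element and 2 voids, so the efficacy is 8/11.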

3 The Proposed Heuristic for Cell Formation Problems

The framework presented in this paper solves MCFPs by applying a hybrid genetic algorithm that uses a local search method to refine the solutions.

With an initial population partially formed by a greedy constructive method, convergence can be reached faster than with random construction alone. In addition, procedures such as neighborhood search are used to accelerate convergence: a GA solution serves as the starting point for another algorithm, here the k-means local search, which is applied to a chromosome in an attempt to improve its grouping efficacy.

3.1 The Genetic Algorithm with Local Search

GAs are inspired by the mechanisms of natural selection and genetics, combining the survival of the fittest individuals with recombination among them. This general theory of robust adaptive systems finds an excellent practical application in the optimization of mathematical functions.

GAs differ from other heuristics in several characteristics: they act on a set of points (population) rather than on isolated points; they operate in a space of coded solutions rather than directly in the search space; they need, as information, only the value of an objective function (fitness function); and they use probabilistic transitions rather than deterministic rules [11].

Briefly, a GA begins with an initial population, and the fitness of each chromosome is calculated. Genetic operators are applied to individuals selected according to their fitness (better fitness implies a greater chance of selection), and a new generation of individuals is created. This procedure is repeated until some stopping criterion is reached.

In the population, each individual is represented by a chromosome that denotes a feasible solution to the problem. In this framework, each locus corresponds to a machine or a part (depending on the portion of the chromosome) and each gene holds the cell or family assigned to it. The chromosome length therefore equals the number of machines plus the number of parts in the M × N matrix, as can be seen in Fig. 2.

Fig. 2. Example of a chromosome denoting the two-cell formation shown in Fig. 1.
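
The encoding described above can be sketched as a flat list whose first M genes label the machines and whose remaining N genes label the parts. The helper names `split_chromosome` and `blocks` and the example chromosome are hypothetical, chosen only to illustrate the layout; they do not reproduce the chromosome of Fig. 2.

```python
def split_chromosome(chromosome, n_machines):
    """Separate the machine segment from the part segment."""
    return chromosome[:n_machines], chromosome[n_machines:]


def blocks(chromosome, n_machines):
    """Map each label to (machine indices, part indices) sharing that label."""
    machine_cell, part_family = split_chromosome(chromosome, n_machines)
    result = {}
    for i, label in enumerate(machine_cell):
        result.setdefault(label, ([], []))[0].append(i)
    for j, label in enumerate(part_family):
        result.setdefault(label, ([], []))[1].append(j)
    return result


# Hypothetical chromosome for 4 machines and 5 parts in two cells.
chromosome = [1, 1, 2, 2, 1, 1, 2, 2, 2]
layout = blocks(chromosome, n_machines=4)
```

Here `layout` groups machines 0 and 1 with parts 0 and 1 under label 1, and machines 2 and 3 with parts 2 to 4 under label 2.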

As mentioned, three genetic operators are used: cloning, crossover and mutation. Cloning copies an individual unchanged into the next generation. Crossover, on the other hand, combines chromosome information from selected individuals, generating new ones. Mutation, applied with a low probability, randomly changes machine-cell assignments in an attempt to escape local maxima. Selection uses the roulette wheel method [6], in which the probability of selection is proportional to fitness.

After selection, as can be seen in Fig. 3, crossover acts only on the “machine” portion of the chromosomes: a cut position is drawn, the genes before it come from one parent and the genes after it from the other. A constructor algorithm, driven by the resulting cell segment, then builds the family segment.

Fig. 3. The genetic operators: crossover and mutation.
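
The selection, crossover and mutation steps can be sketched as follows for the machine segment only. This is a minimal sketch under stated assumptions: the function names, the use of cell labels 0 to `n_cells - 1`, and the per-gene mutation rule are illustrative choices, since the text does not fix these details.

```python
import random


def roulette_select(population, fitness, rng):
    """Pick one individual with probability proportional to its fitness."""
    pick = rng.uniform(0, sum(fitness))
    running = 0.0
    for individual, fit in zip(population, fitness):
        running += fit
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point round-off


def one_point_crossover(parent_a, parent_b, rng):
    """Take genes before the cut from parent_a and the rest from parent_b."""
    cut = rng.randint(1, len(parent_a) - 1)  # needs at least two genes
    return parent_a[:cut] + parent_b[cut:]


def mutate(machine_cell, n_cells, rate, rng):
    """Reassign each machine to a random cell with probability `rate`."""
    return [rng.randrange(n_cells) if rng.random() < rate else cell
            for cell in machine_cell]


rng = random.Random(42)
parents = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1]]
fitness = [0.71, 0.55, 0.40]  # hypothetical grouping efficacy values
mother = roulette_select(parents, fitness, rng)
father = roulette_select(parents, fitness, rng)
child = mutate(one_point_crossover(mother, father, rng), 2, 0.02, rng)
```

After these operators act on the machine segment, the part segment would be rebuilt by the constructor procedure rather than inherited.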

The individuals are refined before a new iteration starts. As mentioned, the procedure is k-means, a non-hierarchical method that partitions a set of objects into a number of groups known in advance.

The procedure starts with k centroids defined at random and, in successive iterations, each element is assigned to a group according to some criterion, such as least squares. Because the method is iterative, the centroids are recalculated until they no longer change.

Just as the distance between points must be calculated, since cohesive groups around a common point (centroid) are desired, the similarity between machines must also be calculated. This coefficient counts the coincidences between two machines: an XNOR (Eq. 3) is computed between their binary vectors \( R_{i} \) and \( R_{j} \), and each nonzero bit in \( S_{ij} \) indicates a part on which both machines agree (both process it or neither does). The higher the sum of the elements of \( S_{ij} \), the greater the similarity between the two machines.

$$ S_{ij} = \left( {R_{i} \;{\text{XNOR}}\;R_{j} } \right) $$
(3)
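
The similarity of Eq. 3 reduces to counting the positions where two binary rows agree. A minimal sketch follows; the names `xnor_similarity` and `similarity_matrix` and the example rows are assumptions made for illustration.

```python
def xnor_similarity(row_i, row_j):
    """Sum of the XNOR of two binary rows: the number of matching positions."""
    return sum(1 for a, b in zip(row_i, row_j) if a == b)


def similarity_matrix(matrix):
    """Pairwise XNOR similarity between all machine rows."""
    return [[xnor_similarity(ri, rj) for rj in matrix] for ri in matrix]


# Two hypothetical machines over five parts.
r1 = [1, 1, 0, 0, 0]
r2 = [1, 0, 1, 0, 0]
similarity = xnor_similarity(r1, r2)
```

Note that XNOR counts shared zeros as well as shared ones, so two machines that both skip a part are treated as similar on that part.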

With a fully generated chromosome, the k-means algorithm starts from known centroids: the cells already defined. The objective is to move machines between cells in order to improve these groupings. Because it is costly, this local search stops after a single iteration.

The centroid calculation promotes exchanges based on the distances between the elements. After the “machine” portion of the chromosome is rebuilt, the constructor procedure creates the “part” segment. The fitness is then recalculated and, if it is better than the original, the refined individual replaces the original in the current population.
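
A single k-means pass over the machine segment might look like the sketch below. It is only one plausible reading of the refinement step: the squared Euclidean distance to the centroid is an assumption (the text only mentions a criterion such as least squares), and the function names and data are hypothetical.

```python
def squared_distance(row, centroid):
    """Squared Euclidean distance between a machine row and a centroid."""
    return sum((x - c) ** 2 for x, c in zip(row, centroid))


def kmeans_one_pass(matrix, machine_cell, n_cells):
    """Recompute cell centroids once and move each machine to the nearest."""
    centroids = {}
    for cell in range(n_cells):
        members = [matrix[i] for i, c in enumerate(machine_cell) if c == cell]
        if members:  # empty cells have no centroid and attract no machine
            centroids[cell] = [sum(col) / len(members) for col in zip(*members)]
    return [
        min(centroids, key=lambda cell: squared_distance(row, centroids[cell]))
        for row in matrix
    ]


# Hypothetical matrix; machine 2 starts in the "wrong" cell.
matrix = [
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
refined = kmeans_one_pass(matrix, [0, 0, 0, 1], n_cells=2)
```

In this example the pass moves machine 2 to the second cell. In the framework, the refined machine segment would then have its part segment rebuilt and would replace the original individual only if its fitness improved.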

3.2 The Greedy Constructive Heuristic

Like the genetic operators, the greedy cell formation algorithm acts only on the machine segment of the chromosome. It begins by calculating the distance matrix between the machines, where each row of the incidence matrix is an n-dimensional point. Eq. 4 gives the distance between two such points x and y, where \( x_{i} \) and \( y_{i} \) are their components and i = {1,…n} is the position index.

$$ d = \sum\limits_{i = 1}^{n} {\left\| {x_{i} - y_{i} } \right\|} $$
(4)

For the initial cell formation, 5 pairs of machines with large distances from each other are grouped, and one of them, chosen at random, composes the first cell.

Each of the other cells initially receives a single machine: the one with the largest accumulated distance to all machines already allocated, over all cells. Once all these cell seeds are defined, the algorithm assigns each machine not yet allocated to the cell with the shortest accumulated distance to its members. In case of a tie, a random choice is made.
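
The seeding and allocation steps can be sketched as below. This is a sketch under assumptions: the first cell is passed in as an argument rather than drawn from the candidate pairs, ties between seeds are broken by index, and the names `manhattan` and `greedy_cells` are hypothetical. The distance follows Eq. 4 as written.

```python
import random


def manhattan(x, y):
    """Eq. 4: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def greedy_cells(matrix, first_cell, n_cells, rng):
    """Seed the remaining cells, then allocate every other machine greedily."""
    n = len(matrix)
    dist = [[manhattan(matrix[i], matrix[j]) for j in range(n)] for i in range(n)]
    cells = [list(first_cell)]
    allocated = set(first_cell)
    # Each new seed is the free machine farthest from everything allocated.
    while len(cells) < n_cells:
        free = [m for m in range(n) if m not in allocated]
        seed = max(free, key=lambda m: sum(dist[m][a] for a in allocated))
        cells.append([seed])
        allocated.add(seed)
    # Remaining machines join the cell with the smallest accumulated distance.
    for m in range(n):
        if m in allocated:
            continue
        costs = [sum(dist[m][a] for a in cell) for cell in cells]
        best = min(costs)
        chosen = rng.choice([k for k, c in enumerate(costs) if c == best])
        cells[chosen].append(m)
        allocated.add(m)
    labels = [0] * n
    for k, cell in enumerate(cells):
        for m in cell:
            labels[m] = k
    return labels


matrix = [
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
labels = greedy_cells(matrix, first_cell=[0], n_cells=2, rng=random.Random(1))
```

For this hypothetical matrix, machine 2 becomes the seed of the second cell, machine 1 joins machine 0, and machine 3 joins machine 2.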

3.3 The Part-Segment Constructor Algorithm

The part segment is built with the formation strategy of [9], which is based on the calculation of the grouping efficiency coefficient [3] and tries to maximize the grouping coefficient when associating a part with a family. This iterative procedure, as presented by the authors in Eq. 5, computes the effect of allocating each part to each family. The family that maximizes the function is chosen, and the procedure ends when there are no more allocations to be made.

$$ F^{*} = \arg\max \left\{ \frac{N_{1} - N_{1,F}^{out}}{N_{1} + N_{0,F}^{in}} \right\} $$
(5)

where \( N_{1} \) is the total number of operations (1’s) in the given matrix, \( N_{0,F}^{in} \) is the number of voids (0’s inside the diagonal blocks when part i is associated with family F), and \( N_{1,F}^{out} \) is the number of exceptional elements (1’s outside the diagonal blocks for the same association).
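
One way to read Eq. 5 is to score every family for each part independently, using only that part's column. The sketch below follows this reading, which is an assumption; the name `assign_parts`, the index-based tie-breaking and the example matrix are hypothetical.

```python
def assign_parts(matrix, machine_cell, n_cells):
    """Assign each part to the family that maximizes the ratio in Eq. 5."""
    n1 = sum(sum(row) for row in matrix)  # total number of 1's
    part_family = []
    for j in range(len(matrix[0])):
        column = [row[j] for row in matrix]

        def score(family):
            # 1's on machines outside the candidate cell (exceptional elements)
            ones_out = sum(1 for v, c in zip(column, machine_cell)
                           if v == 1 and c != family)
            # 0's on machines inside the candidate cell (voids)
            zeros_in = sum(1 for v, c in zip(column, machine_cell)
                           if v == 0 and c == family)
            return (n1 - ones_out) / (n1 + zeros_in)

        part_family.append(max(range(n_cells), key=score))
    return part_family


# Hypothetical 4 x 5 matrix with machines already split into two cells.
matrix = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
families = assign_parts(matrix, machine_cell=[0, 0, 1, 1], n_cells=2)
```

With these data, parts 0 and 1 go to family 0 and parts 2 to 4 go to family 1.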

4 Method and Computational Results

The method applied by the framework forms an initial population in which half of the individuals are built by a greedy method that uses similarities between parts and machines to compose the chromosomes. The rest of the population consists of randomly generated individuals, which counterbalance the homogeneity produced by the greedy method.

The test bed consists of applying the framework to 25 problems provided by [6] and listed in Table 1.

Table 1. Problems obtained from the literature for analysis.

The authors also analyze the results from GA [4], Zodiac [19], Grafics [20], MST [21], and GATSP [22]. Their own proposal, here called G&R, is also compared with the results of this framework, all listed in Table 2. Each new population is formed by 5% of the best individuals (clones), 50% of individuals selected for crossover, and the rest by newly built individuals, 80% of them random and 20% greedy. The population size is 150 individuals and the mutation rate is 2%, both regardless of the size of the problem. For each problem, 20 runs of the GA are performed, statistics are generated, and comparative data between the framework results and the literature are also reported.

Table 2. Comparison between cluster efficiency indexes obtained from the literature, by different methods and the presented framework.

The framework is implemented in Python on a notebook with an i7-7700HQ processor at 2.8 GHz and 16 GB of RAM. Although outside the scope of this work, it is worth noting that Python proved suitable for implementing the algorithms: it is free, its documentation is easily accessible, and it offers good libraries of ready-made functions, such as NumPy.

Also, working on some of the problems studied here, [23] obtained values similar to those found in this work. Although with a different strategy, they also considered the Euclidean distances between machines and, subsequently, between cells.

5 Conclusion

In this study, a hybrid genetic algorithm is proposed with the objective of maximizing grouping efficacy in manufacturing cell formation problems, which are NP-hard and combinatorial in nature. To obtain good solutions in reasonable computational time, local search techniques are combined with the genetic algorithm. The framework also proposes a combined greedy and random formation of the initial population, as well as constructive procedures for individuals guided by biased pre-optimization rules. The results are very satisfactory in terms of both grouping efficacy and computational time, which points to favorable prospects for using this structure to solve this type of problem.

The grouping efficacy achieved by this framework is compared with 6 other proposals. In short, it is observed that all the average values obtained exceeded the mean values of the other methods. Moreover, for 88% of the problems, the efficacy values were equal to or better than the best known in the literature (52% already in the first generation). For more than half of the problems, the results surpassed the previously best known values in the literature, as shown in Table 2.

In the formation of the initial population, the greedy constructive algorithm consumes more processing time than the random one and also makes the set of individuals more homogeneous. Greater uniformity leads to worse results, which reinforces the decision to use 80% random and 20% greedy formation. With this composition, the best results already appear in the first iterations (initial generation and first iteration) of each test, with no further improvement in the following iterations.

Applying the k-means local search to more than 75% of the individuals generated in the current population did not produce significant differences in the results. Since its computational cost grows directly with this rate, the rate was limited to 75%.

Python is somewhat slower than compiled languages, but it is free, easy to develop in, and has ample support material, and the time taken to obtain the results is satisfactory. Moreover, local search combined with a good construction procedure, although computationally expensive, accelerates convergence toward good solutions for this type of problem.