
1 Introduction

Process design is the bridge between product design and product manufacturing. At present it relies on the experience of process designers and on the guidance of the enterprise's existing process knowledge. Inexperienced process designers often spend a great deal of time searching for the knowledge they need, which leads to uneven efficiency and quality in process design. Because process knowledge recommendation technology can quickly supply appropriate process knowledge according to the process design requirements, it has been widely used in machining [1], assembly [2], sheet metal and other processes.

With its continuous development and wide application, knowledge recommendation oriented to the process knowledge graph has become a research focus, covering process design requirement analysis, rapid positioning between requirements and knowledge, and the design of knowledge matching algorithms. For the systematic analysis of process design requirements, Zhou et al. [3] obtained process requirements from five aspects, such as users and design information, and converted them into corresponding process characteristics. Guo et al. [4] extracted hidden user requirements from user requirement data and established an association mapping to product attributes. However, most current analyses of process design requirements are limited to part information or user requirements and lack a comprehensive analysis of design requirements. For the rapid positioning between requirements and knowledge, constructing a unified representation of knowledge attributes is an effective method [5, 6]. For the matching between process knowledge and design requirements, semantic matching methods such as vector space model-based matching [7], ontology-based semantic matching [8] and deep learning-based matching are commonly used. Wang et al. [9] applied the TF-IDF algorithm, based on the vector space model, to the matching of manufacturing knowledge and design requirements. Renu et al. [10] used a text matching algorithm to realize the retrieval and sharing of text-based assembly process schemes.

To support fast process knowledge recommendation with a high matching degree, this paper proposes a method that maps process design requirements to process knowledge attributes by constructing coding labels for the process knowledge graph. A requirements-knowledge semantic vector space model is established, and the matching degree between requirements and knowledge is calculated based on an improved cosine distance.

2 “Scene-Label-Classification” Process Knowledge Recommendation Framework

Oriented to the knowledge graph, this paper proposes a “scene-label-classification” process knowledge recommendation scheme, which is shown in Fig. 1. The different dimensions of process design requirements are described by parameterized characteristics such as material, size characteristic, part type and precision. At the same time, process knowledge attributes are identified uniformly using knowledge coding labels, and the parameterized requirement scenes are correlated with the process knowledge attributes. In this way, process design requirements are rapidly positioned to process knowledge attributes, and an initial process knowledge recommendation candidate set is generated. The process design requirement semantic vector is obtained by analyzing the parameterized requirement scene. Meanwhile, the process knowledge semantic vectors of the initial candidates are generated by the TransE algorithm and reduced to the same dimension as the requirement semantic vector. Based on the improved cosine distance, the matching degree between them is calculated and the candidates are filtered according to a matching degree threshold. The final process knowledge recommendation candidate set is thus obtained, enabling the dynamic classification of process knowledge under different requirement scenarios.

Fig. 1. “Scene-label-classification” process knowledge recommendation framework

3 Correlation Between Requirement Scenarios and Process Knowledge Attributes

The process design requirements of different dimensions are analyzed, and the requirement scenarios of process knowledge recommendation are described through materials, dimension characteristics, part types and other parameterized characteristics. Meanwhile, the process knowledge attribute characteristics are identified uniformly by knowledge coding, and the requirement scene characteristics are then correlated with the process knowledge attribute characteristics.

According to the requirement scene of process knowledge recommendation, this paper divides process design requirements into three aspects: part information, product information and other information. Part information contains geometric and non-geometric information, including surface shape, size, precision information, part materials, technical requirements, etc. In contrast to part information, product information contains the number of parts, the positional relationships between parts, the assembly relationships between parts, etc. Other information includes processing/assembly time limits, the types of knowledge required, etc. Through parameterized characteristic description, the requirement scenario set R(R1,R2,…,Rq) is established, where R1,R2,…,Rq represent the parameterized requirement characteristics. This set facilitates the construction of an association mapping between process design requirements and process knowledge attributes.

In this paper, process knowledge attributes are uniformly identified by constructing coding labels in the knowledge graph, which enables the knowledge attribute characteristics to be located quickly according to the design requirements. However, when there are too many parameterized requirement characteristics, it may be impossible to find process knowledge that meets all of them. Therefore, a quantity threshold Sn is set for the number of parameterized requirement characteristics that must be met. Process knowledge that satisfies this condition (the number of requirement characteristics met is ≥ Sn) is selected according to the requirement scenario, generating an initial process knowledge recommendation candidate set K(K1,K2,…,Km), where K1,K2,…,Km represent the process knowledge it contains.
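The candidate generation step can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the coding labels of each process knowledge are available as a name-to-value dictionary and that a characteristic counts as "met" when the label value equals the required value. The function name `initial_candidates`, the `mandatory` argument (mirroring the must-contain characteristics used in Sect. 6) and the sample data are hypothetical and not part of the proposed system.

```python
from typing import AbstractSet, Dict, List


def initial_candidates(
    requirements: Dict[str, str],
    knowledge_labels: Dict[str, Dict[str, str]],
    s_n: int,
    mandatory: AbstractSet[str] = frozenset(),
) -> List[str]:
    """Return IDs of process knowledge meeting at least s_n requirement characteristics.

    Assumption: a characteristic is met when the coding-label value of the
    knowledge equals the required value exactly.
    """
    selected = []
    for knowledge_id, labels in knowledge_labels.items():
        # Names of the requirement characteristics this knowledge satisfies.
        matched = {
            name for name, value in requirements.items()
            if labels.get(name) == value
        }
        # Keep the knowledge only if all mandatory characteristics are met
        # and the number of met characteristics reaches the threshold Sn.
        if mandatory <= matched and len(matched) >= s_n:
            selected.append(knowledge_id)
    return selected


# Hypothetical requirement scene and coding labels.
requirements = {"material": "45 steel", "part_type": "shaft-hole", "roughness": "Ra1.6"}
knowledge_labels = {
    "K1": {"material": "45 steel", "part_type": "shaft-hole", "roughness": "Ra1.6"},
    "K2": {"material": "45 steel", "part_type": "shaft-hole", "roughness": "Ra3.2"},
    "K3": {"material": "aluminum alloy", "part_type": "box", "roughness": "Ra1.6"},
}
candidates = initial_candidates(requirements, knowledge_labels, s_n=2, mandatory={"part_type"})
print(candidates)
```

Lowering Sn widens the candidate set, while raising it towards the number of requirement characteristics approaches the strict "meets everything" case that may return nothing.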

4 Construction of Requirements-Knowledge Semantic VSM

The vector space model (VSM) is an information retrieval model that has been widely and effectively used in recent years. The core idea of VSM is to represent text content with vectors and map it to an n-dimensional vector space, thus transforming the similarity problem between texts into a similarity problem between vectors in a multi-dimensional space. In this paper, semantic vector expression and dimensionality reduction are conducted for the requirement scenarios and for the initial process knowledge recommendation candidate set, respectively, yielding one requirement semantic vector and several process knowledge semantic vectors. The requirements-knowledge semantic VSM is then established in order to calculate the matching degree between process design requirements and process knowledge.

4.1 Construction of Requirement Semantic VSM

The idea behind the semantic VSM is that a text is regarded as a combination of several independent characteristic terms, and a high-dimensional space is constructed from these terms. Each characteristic term is one dimension of this space, and the text is regarded as a vector in it. In this paper, parameterized characteristics such as material, dimension feature and part type are used to describe the process design requirements, so each parameterized characteristic is a characteristic term of the requirement semantic VSM. Assuming that the process design requirements consist of independent parameterized characteristics, they can be expressed as:

$$ R = \left\{ r_{1}, d_{1};\; r_{2}, d_{2};\; \ldots;\; r_{n}, d_{n} \right\} $$
(1)

Therein ri (1 ≤ i ≤ n) represents the name of a parameterized characteristic, and di (1 ≤ i ≤ n) represents its parameter value or description content. For parameterized characteristics such as technical requirements, the content is described in natural language, for example “pay attention to temperature control and quenching transfer time of aluminum alloy materials in heat treatment process”. To match parameterized requirement characteristics described in natural language against process knowledge content, it is necessary to generate text vectors using NLP and calculate the semantic matching degree.

Given different weights for each parameterized characteristic as the vector component, the text vector Vr used to represent the process design requirements can be expressed as:

$$ V_{r} = \left\{ w_{r_{1}}, d_{1};\; w_{r_{2}}, d_{2};\; \ldots;\; w_{r_{n}}, d_{n} \right\} $$
(2)

Therein wri (1 ≤ i ≤ n) represents the weight of each parameterized characteristic, which is generally set by the process designer according to the importance of that characteristic. Therefore, the process design requirements can be represented by an n-dimensional characteristic vector.

4.2 Construction of Process Knowledge Semantic VSM

For each process knowledge graph in the process knowledge recommendation candidate set, a vector representation based on the TransE algorithm is adopted, so that each process knowledge can be represented by a p-dimensional vector Vtrans (p ≥ n). In order to calculate the matching degree between the requirement vector and the process knowledge vector, a dimensionality reduction method such as PCA is adopted to keep the dimension number and meanings of the two vectors consistent. For the ith process knowledge of the initial process knowledge recommendation candidate set, its semantic vector Vk-i (1 ≤ i ≤ m) can be expressed as:
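As an illustration of the dimensionality reduction step only, the following minimal sketch projects p-dimensional vectors onto their first n principal components using NumPy. The TransE embeddings are replaced by random placeholder data, and the sizes (12 vectors, p = 50, n = 8) are hypothetical; how the reduced components are aligned with the parameterized characteristics is not covered by this sketch.

```python
import numpy as np


def pca_reduce(embeddings: np.ndarray, n_components: int) -> np.ndarray:
    """Project row vectors onto their first n_components principal components."""
    if n_components > min(embeddings.shape):
        raise ValueError("n_components exceeds the available number of components")
    # Center each column so the components describe variance around the mean.
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Placeholder for m = 12 TransE vectors of dimension p = 50 (random data, not real embeddings).
rng = np.random.default_rng(0)
v_trans = rng.normal(size=(12, 50))
reduced = pca_reduce(v_trans, n_components=8)
print(reduced.shape)
```

Because the data are centered, at most m - 1 components carry variance, so n should stay below the number of candidate process knowledge when PCA is fitted on the candidate set alone.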

$$ V_{k - i} = \left\{ r_{1}, d^{\prime}_{1};\; r_{2}, d^{\prime}_{2};\; \ldots;\; r_{n}, d^{\prime}_{n} \right\} $$
(3)

Therein ri (1 ≤ i ≤ n) represents the n parameterized characteristics of the process design requirements, and d'i (1 ≤ i ≤ n) represents the value or description content of the corresponding parameter (empty if there is no content). Unlike the requirement semantic vector, the weight of each parameterized characteristic in the knowledge semantic vector is calculated statistically, most commonly with the TF-IDF method. The Term Frequency (TF) weight indicates the number of times a characteristic item appears in a document: the more often it appears, the more important it is taken to be. For process knowledge, however, a high frequency of a parameterized characteristic does not necessarily mean that it is more important. Therefore, to reduce the influence of the TF weight, when the occurrence frequency of a parameterized requirement characteristic is not 0, the TF weight is expressed as:

$$ {\text{TF}}_{(i - k)} = 1 + \log \left( 1 + \log j \right) $$
(4)

Therein j represents the number of occurrences of parameterized characteristic rk in process knowledge Ki. If this parameterized requirement characteristic is not included in Ki, then TF(i-k) = 0.

The IDF (Inverse Document Frequency) weight reflects how many process knowledge in the initial knowledge candidate set contain a given parameterized characteristic. If a parameterized characteristic appears in many process knowledge, its distinguishing ability is low; therefore, the IDF weight is expressed as:

$$ {\text{IDF}}_{(i - k)} = \log \left( \frac{N}{n} + \alpha \right) $$
(5)

Therein α is a constant, N represents the number of process knowledge in the initial knowledge candidate set, and n represents the number of process knowledge containing this parameterized characteristic. In the similarity calculation between texts, if a keyword appears in all texts, its IDF value is extremely low. However, in the matching of requirements and process knowledge, a parameterized characteristic that appears in all process knowledge is not necessarily unimportant. With α = 0, such a characteristic (n = N) would receive an IDF weight of log 1 = 0 and drop out of the vector entirely. According to Formula (5), the larger α is, the weaker the distinguishing effect of the IDF weight between parameterized characteristics becomes. Therefore, α is set to 1 in this paper. Finally, the weight of parameterized characteristic rk in process knowledge Ki is:

$$ w_{i - k} = {\text{TF}}_{{\left( {i - k} \right)}} *{\text{IDF}}_{{\left( {i - k} \right)}} $$
(6)
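Formulas (4) to (6) can be written down directly as a short Python sketch. The base of the logarithm is not specified in the text, so the natural logarithm is assumed here; the function names are illustrative only.

```python
import math


def tf_weight(j: int) -> float:
    """Damped term frequency of Formula (4); 0 if the characteristic is absent."""
    if j <= 0:
        return 0.0
    return 1.0 + math.log(1.0 + math.log(j))


def idf_weight(total: int, containing: int, alpha: float = 1.0) -> float:
    """Smoothed inverse document frequency of Formula (5): log(N / n + alpha)."""
    if containing <= 0:
        raise ValueError("the characteristic must occur in at least one process knowledge")
    return math.log(total / containing + alpha)


def characteristic_weight(j: int, total: int, containing: int, alpha: float = 1.0) -> float:
    """Weight of Formula (6): TF * IDF, or 0 if the characteristic does not occur."""
    if j <= 0:
        return 0.0
    return tf_weight(j) * idf_weight(total, containing, alpha)


# A characteristic occurring once in a knowledge and present in all N = 12 candidates
# still receives a positive weight because alpha = 1.
w_common = characteristic_weight(j=1, total=12, containing=12)
# The same characteristic present in only 3 of the 12 candidates is weighted higher.
w_rare = characteristic_weight(j=1, total=12, containing=3)
print(w_common, w_rare)
```

The double logarithm in `tf_weight` grows very slowly with j, which is the intended damping: repeated mentions of a characteristic raise its weight only marginally.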

The semantic vector Vk used to represent a process knowledge can be expressed as:

$$ V_{k} = \left\{ w_{k_{1}}, d^{\prime}_{1};\; w_{k_{2}}, d^{\prime}_{2};\; \ldots;\; w_{k_{n}}, d^{\prime}_{n} \right\} $$
(7)

Therein wki (1 ≤ i ≤ n) represents the weight of each parameterized characteristic obtained through Formula (6). For an initial process knowledge recommendation candidate set containing m process knowledge, an m × n knowledge semantic VSM can finally be constructed:

$$ V_{k - m} = \left\{ \begin{array}{c} w_{k_{11}}, d^{\prime}_{11};\; \cdots;\; w_{k_{1n}}, d^{\prime}_{1n} \\ \cdots \\ w_{k_{m1}}, d^{\prime}_{m1};\; \cdots;\; w_{k_{mn}}, d^{\prime}_{mn} \end{array} \right\} $$
(8)

5 Knowledge Content Matching Degree Calculation

The calculation process of the matching degree between the requirement semantic vector and the semantic vector of each process knowledge is shown in Fig. 2. Based on the improved cosine distance, the matching degree between each process knowledge in the initial candidate set and the process design requirements is calculated. A matching degree threshold Mt and a quantity threshold Nq are set in advance to filter out process knowledge with a low matching degree. The retained process knowledge is sorted by matching degree, and the final process knowledge recommendation candidate set K'(K1,K2,…,Kq) is obtained and pushed to the process designer.

Fig. 2. The calculation process of matching degree

5.1 Matched-Degree Calculation Based on Improved Cosine Distance

After obtaining Vr and Vk-i, the semantic matching degree between requirements and process knowledge content can be expressed according to the matching degree between the vectors. In this paper, cosine distance between vectors is utilized:

$$ M(r,k - i) = \cos \theta = \frac{{V_{r} *V_{k - i} }}{{\left| {V_{r} } \right|\left| {V_{k - i} } \right|}} $$
(9)

For requirements semantic vector Vr and process knowledge semantic vector Vk-i, besides containing weights wri(1 ≤ i ≤ n) and wki(1 ≤ i ≤ n) of parameterized characteristics, descriptions di(1 ≤ i ≤ n) and d'i(1 ≤ i ≤ n) of parameterized characteristics are also included. Compared with classical semantic calculation process based on cosine distance, this process also needs to calculate the matching degree between di(1 ≤ i ≤ n) and d'i(1 ≤ i ≤ n). In this regard, this paper summarizes several situations that may occur when di(1 ≤ i ≤ n) and d'i(1 ≤ i ≤ n) match:

a) Numerical matching. In this case, di and d'i (1 ≤ i ≤ n) can be compared directly. This applies to parameterized requirement characteristics described by numerical values, such as surface roughness and machining accuracy. The result has two cases: 1 for a match and 0 for a mismatch.

b) Semantic matching. For part material, part type and other parameterized requirement characteristics described by simple text, di and d'i (1 ≤ i ≤ n) can be compared directly, in the same way as numerical matching. For parameterized requirement characteristics such as technical requirements, which must be processed as natural language, the semantic matching degree between them should be calculated based on cosine distance. The matching result lies in [0,1], where 0 indicates a complete mismatch and 1 indicates a complete match.
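The two situations can be sketched in Python as follows. Two simplifications are assumed here and are not taken from the paper: direct comparison is implemented as strict equality (a real system might need a tolerance or a "meets or exceeds" rule for values such as roughness), and the NLP text vector is replaced by a plain bag-of-words count vector. The function names `exact_match` and `text_match` are hypothetical.

```python
import math
import re
from collections import Counter


def exact_match(d_req, d_know) -> float:
    """Case a) and simple text in case b): 1 for a match, 0 for a mismatch or empty content."""
    return 1.0 if d_know is not None and d_req == d_know else 0.0


def text_match(d_req: str, d_know: str) -> float:
    """Natural-language case in b): cosine similarity of word-count vectors, in [0, 1]."""
    # Assumption: a bag-of-words vector stands in for the NLP-generated text vector.
    a = Counter(re.findall(r"\w+", d_req.lower()))
    b = Counter(re.findall(r"\w+", d_know.lower()))
    dot = sum(count * b[term] for term, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


print(exact_match(1.6, 1.6))
print(exact_match("45 steel", "aluminum alloy"))
print(text_match(
    "pay attention to temperature control in heat treatment",
    "temperature control during heat treatment of aluminum alloy",
))
```

Both functions return values in [0,1], so their results can be used interchangeably as the description matching degree of a single characteristic.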

Therefore, based on classical cosine distance, this paper proposes a calculation method of matching degree between the requirements semantic vector Vr and the process knowledge semantic vector Vk-i(1 ≤ i ≤ m) based on improved cosine distance. The calculation formula is:

$$ M(r,k - i) = \frac{{V_{r} *V_{k - i} }}{{\left| {V_{r} } \right|\left| {V_{k - i} } \right|}} = \frac{{\sum\limits_{j = 1}^{n} {\left[ {w_{{r_{j} }} *w_{{(k - i)_{j} }} *M\left( d_{{r_{j} }} ,d_{{(k - i)_{j} }} \right)} \right]} }}{{\sqrt {\sum\limits_{j = 1}^{n} {\left( w_{{r_{j} }} \right)^{2} } } *\sqrt {\sum\limits_{j = 1}^{n} {\left( w_{{(k - i)_{j} }} \right)^{2} } } }} $$
(10)

Therein M(drj, d(k-i)j) (1 ≤ j ≤ n) represents the matching degree between the description contents of the jth parameterized characteristic in the two vectors. The range of M(r,k-i) is [0,1], and the larger M(r,k-i) is, the higher the matching degree between the requirements and the process knowledge.
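A minimal Python sketch of Formula (10) is given below. It assumes that the description matching degrees M(drj, d(k-i)j) have already been computed and are passed in as a list of values in [0,1]; the function name and the sample weights are hypothetical.

```python
import math
from typing import Sequence


def matching_degree(
    w_req: Sequence[float],
    w_know: Sequence[float],
    desc_match: Sequence[float],
) -> float:
    """Improved cosine matching degree of Formula (10).

    w_req:      weights of the requirement vector
    w_know:     weights of one process knowledge vector
    desc_match: per-characteristic description matching degrees in [0, 1]
    """
    if not (len(w_req) == len(w_know) == len(desc_match)):
        raise ValueError("all three sequences must have the same length n")
    # Numerator: weighted dot product, scaled per term by the description match.
    numerator = sum(wr * wk * m for wr, wk, m in zip(w_req, w_know, desc_match))
    # Denominator: product of the Euclidean norms of the two weight vectors.
    denominator = math.sqrt(sum(w * w for w in w_req)) * math.sqrt(sum(w * w for w in w_know))
    return numerator / denominator if denominator else 0.0


# Hypothetical three-characteristic example with identical weights.
full = matching_degree([0.2, 0.1, 0.1], [0.2, 0.1, 0.1], [1.0, 1.0, 1.0])
partial = matching_degree([0.2, 0.1, 0.1], [0.2, 0.1, 0.1], [1.0, 0.0, 1.0])
print(full, partial)
```

When every description matches, the expression reduces to the classical cosine of Formula (9); each mismatched description removes its term from the numerator, so with non-negative weights the result stays within [0,1].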

5.2 Candidate Process Knowledge Filtering Ranking

The matching degree M(r,k-i) between each process knowledge and the process design requirements is calculated and compared with the threshold Mt. If the matching degree is greater than Mt, the corresponding process knowledge is retained; otherwise, it is eliminated. The retained process knowledge is sorted by matching degree from large to small, and, according to the quantity threshold Nq, the final process knowledge recommendation candidate set is obtained and pushed to the designer.
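The filtering and ranking step can be sketched in a few lines of Python. The matching degrees in the example are made-up values for illustration, not results from this paper, and the function name is hypothetical.

```python
from typing import Dict, List, Tuple


def filter_and_rank(scores: Dict[str, float], m_t: float, n_q: int) -> List[Tuple[str, float]]:
    """Keep knowledge with matching degree > m_t, sort descending, return at most n_q entries."""
    retained = [(kid, score) for kid, score in scores.items() if score > m_t]
    retained.sort(key=lambda item: item[1], reverse=True)
    return retained[:n_q]


# Made-up matching degrees for four process knowledge.
scores = {"K1": 0.91, "K2": 0.65, "K3": 0.78, "K4": 0.84}
final_set = filter_and_rank(scores, m_t=0.7, n_q=2)
print(final_set)
```

Here K2 is removed by the matching degree threshold and K3 by the quantity threshold, which corresponds to the two filtering stages described above.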

6 Experimental Verification and Analysis

The validity of the proposed process knowledge recommendation scheme and knowledge content matching calculation method is verified by a shaft-hole part example of machining design. Firstly, according to the process specification of the part, the parameterized requirement characteristics information used to describe its process design requirements is summarized, as shown in Table 1.

Table 1. Parameterized requirement characteristics of a shaft-hole part

The corresponding process knowledge attributes are located in the knowledge base according to the knowledge coding. The quantity threshold Sn for the number of parameterized requirement characteristics to be met is set to 8, and four characteristics must be contained: recommended knowledge category (machining route), part type (axle hole), and shape feature (1 is outer circle, 2 is inner hole). An initial process knowledge recommendation candidate set containing 12 process knowledge is generated, and its description in “Machining Route” is shown in Table 2.

Table 2. Information of initial knowledge recommendation candidate set K(K1,K2,…,K12)

According to Table 1 and the generated initial process knowledge recommendation candidate set, the requirement semantic vector Vr = {wr1,d1;wr2,d2;…;wr12,d12} and the knowledge semantic vectors Vk-i = {wk1,d'1;wk2,d'2;…;wk12,d'12} are established. For the requirement semantic vector Vr, each weight coefficient represents how critical the corresponding characteristic is. Given that the knowledge attributes for part type, material and shape characteristics must meet the requirements, it is assumed that machining accuracy, surface roughness and technical requirements are the characteristics on which the process designer focuses when judging whether requirements and knowledge match. The weight coefficients of machining accuracy, surface roughness and technical requirements are therefore set to 0.2, and the weight coefficients of the other parameterized requirement characteristics are set to 0.1.

After the semantic vector weight coefficients of each process knowledge Vk-i in the initial knowledge candidate set are calculated, the matching degree threshold Mt = 0.7 and the quantity threshold Nq = 6 are set. The matching degree between the requirements and each process knowledge is then calculated with the method based on the improved cosine distance. The results are shown in Fig. 3.

Fig. 3. Matching degree results between requirements and each process knowledge

Table 3. Information of final knowledge recommendation candidate set K'(K1,K2,…,K6)

According to the matching degree threshold Mt = 0.7, the 5th and 9th process knowledge are filtered out. The 6th, 7th, 8th and 12th process knowledge are then filtered out according to the quantity threshold Nq = 6. The final process knowledge recommendation candidate set is obtained after reordering by matching degree from large to small, as shown in Table 3. After verification, this candidate set of process knowledge meets the machining requirements of the shaft-hole part.

7 Conclusion

In this paper, a “scene-label-classification” knowledge recommendation scheme for the process knowledge graph is established to enable dynamic classification of process knowledge for different requirement scenarios. The multi-dimensional requirements of process knowledge recommendation are fully considered, and a correlation mapping between requirements and process knowledge attributes is established through parameterized requirement characteristics and coding labels. The requirements-knowledge semantic vector space model is constructed by taking the parameterized requirement characteristics as the dimensions of the space. A matching degree calculation method based on an improved cosine distance is proposed, which considers both the parameterized requirement characteristics and their description content. A verification example with a shaft-hole part shows that, for specific requirement scenarios, the proposed method achieves well-targeted process knowledge recommendation with a flexible number of knowledge candidates.