1 Introduction

Certain engineering components operate under high stress, and during long-term service their materials typically degrade and accumulate damage (typically voids and micro-cracks). To judge the actual, local material state of such components, their material properties must be characterised. However, owing to severe limitations on specimen size (e.g. the limited space available for testing in nuclear reactors and boilers), only a small specimen can be obtained. In these cases, the specimen size does not meet the requirements of a valid tensile test.

A new material characterisation method, called the small punch test (SPT), has emerged. Because it requires only a small sample, the SPT is particularly valuable for assessing the material properties and predicting the remaining life of in-service components. However, it has two obvious disadvantages:

  1. Requirement of an accurate hemispherical punch, which is difficult to produce in most manufacturing units.

  2. In welded components, as the material properties vary along the direction perpendicular to the weld line, harvesting a circular shaped specimen accurately from that zone is extremely difficult. In addition, if the weldments are not captured in the middle of the circular specimen, the resulting stress field can become quite complex to analyse.

In response to the above difficulties, a novel approach, the plane strain SPT, has emerged. It is distinguished from the standard SPT in two ways:

  1. Instead of a disk-shaped specimen, a long thin rectangular test piece (with dimensions of 20 mm × 12 mm × 0.5 mm) is used.

  2. The punch head is a prism with a half-circular shape, which makes it considerably easier to manufacture. Furthermore, the upper and lower dies consist of rectangular blocks and are specially designed to hold the rectangular specimen.

In this study, a number of experimental plane strain SPTs have been performed using the newly designed apparatus. In addition, finite element (FE) simulations of the plane strain SPT have been carried out to train an artificial neural network to identify the plastic properties of SS304L. The steps taken in the proposed methodology can be seen in Fig. 1.

Fig. 1 Procedure for the identification of the plastic properties of metallic structures

2 Literature review

The SPT technique was first implemented by Baik et al. (1983) to analyse the mechanical properties of irradiated materials in the nuclear industry. The technique successfully established a correlation between the mechanical properties found by the small disc bend tests (i.e. SPT) and the ductile–brittle transition temperature obtained from the standard Charpy impact test.

Manahan (1983), as seen in Fig. 2, divided the load–displacement curve of the SPT into the following four stages:

  1. Elastic bending deformation

  2. Plastic bending deformation (transition from elastic to plastic)

  3. Membrane stretching (purely plastic)

  4. Plastic instability (damage)

In addition, Manahan et al. (1981) developed a new mechanical bending test using miniature-sized disks. Using the finite element method, it was suggested that the testing method was potentially capable of determining the biaxial stress/strain response, biaxial ductility, stress relaxation behaviour and biaxial creep response (Fig. 2).

Fig. 2 Typical load stroke curve

Lucas (1983) presented a detailed review of the miniature testing techniques and concluded that the SPT can be used to obtain both strength and ductility information from specimens as small as 8 mm diameter. Moreover, it was suggested that as the technique can sample a larger volume of the test specimen, it is less sensitive to the scale of the microstructure than the micro-hardness test. However, it was found that the stress and deformation paths in the process zone were highly complex and could not be easily analysed.

Okada et al. (1985) performed a series of tensile, disk bulge and micro-hardness tests with miniaturised specimens on a variety of metals and alloys. Following the tests, it was concluded that there was a strong correlation between the fracture load obtained from the bulge test and the tensile strength. It was further concluded that specimens as thin as 0.1 mm are generally capable of yielding the bulk properties.

Mao and Takahashi (1987) developed an SPT and successfully obtained fracture strain as well as strength information from miniaturised specimens as small as 3 mm in diameter and 0.25 mm in thickness, assuming an elastic perfectly plastic analysis. The test was based on driving a steel ball punch through a clamped specimen. In addition, a number of correlations were obtained between the load–displacement curve of the SPT and the mechanical properties (such as yield stress and ultimate tensile stress).

Fleury and Ha (1998) successfully implemented the SPT techniques to estimate the mechanical properties of low alloy steel for a steam power plant. Linear relationships were produced between mechanical properties (determined from the SPT) and the Charpy impact tests to estimate the fracture appearance transition temperature as well as approximating the fracture toughness.

Abendroth and Kuna (2003) introduced a new approach to identify plastic deformation and failure properties of ductile materials. The experimental method of the SPT was used to determine the material response under loading. The resulting load displacement curve was then transferred to artificial neural networks that were trained using the load displacement curves generated by finite element simulations. During the training process the neural network generated an approximation function for the inverse problem relating the material parameters to the shape of the load displacement curve of the small punch test by which the damage and mechanical parameters of various alloy steels were determined.

Husain et al. (2004) developed an inverse FE procedure for the SP test to determine the constitutive tensile behaviour of H11 steel. In that procedure, the initial slopes of the load–displacement curves obtained from the experiment and from the FE method were matched, and the result was used to predict the elastic modulus of the material.

Pathak et al. (2009) reported the influence of key test parameters on SP test results using flat samples. The aim of their research was to study the effects of yield stress and strain hardening on the peak load and the corresponding displacement obtained from the SP test using curved samples and a simulation technique. Based on these results, the sensitivity of the results to the material parameters was ascertained.

Zhou et al. (2012) introduced a small beam-shaped specimen to evaluate material properties. This was achieved by comparing the measured deformation with finite element analysis, combined with a genetic algorithm (GA). Zhou coupled a cost function based on the relative difference between the experimental and testing forces at the top centre of the beam to successfully characterise the material parameters of AA2024-T3.

Furthermore, Yang et al. (2015) created an inverse method to evaluate the yield strength of X80 through the SPT and the tensile test. They validated the result by recording the load–displacement curve of a two-dimensional finite element model (FEM). The FEM findings proved to be in good agreement with the corresponding experimental results, especially in the elasto-plastic deformation stages. In addition, the team used a golden section search optimisation algorithm to predict the yield strength of X80. Their prediction varied slightly from the experimental yield strength, and they concluded that the variation was due to the slight inefficiency of the optimisation technique used.

The majority of the above studies only covered a disk-shaped specimen, which naturally inherits the weaknesses discussed previously. In addition, the testing apparatus used in the literature differs entirely from that of this work, in which the hemispherical punch head is replaced by a half-circular alternative that is much easier to manufacture. Although Zhou and his colleagues did use a half-circular punch head to test a beam-shaped specimen, the beam test requires significantly more material than its SPT counterpart.

3 Test rig design

The test rig consists of four major components that hold the specimen: the punch, the top die, the bottom die and the alignment shim (Fig. 3).

Fig. 3 Test rig design assembly, (1) punch, (2) top die, (3) bottom die, (4) alignment shim

The punch must have a 2.5 mm diameter lead and must fit into the 4 mm aperture with minimum friction and slack. The challenge always lies in obtaining the specified hemispherical lead and the right fit. The top die serves two roles in the SP testing. First, it provides the clamping force, by means of the cap screws, which securely keeps the specimen in place. Second, it provides an aligning platform, by means of a rectangular slot, for the punch to apply the load on the specimen. The bottom die, which completes the clamping pair, contains a 20 mm long by 4 mm wide aperture, and the specimen is positioned over the middle of it. The receiving aperture also has a 0.2 mm, 45° chamfer on its edges. The alignment shim, as the name suggests, ensures that the test piece is aligned with the centre of the bottom die so that the bending occurs longitudinally along the centre of the specimen. The shim must be geometrically identical to the dies, otherwise it would not be as effective. The manufactured plane strain SPT apparatus can be seen in Fig. 4.

Fig. 4 Plane strain SPT apparatus

4 Experiments

The specimen was prepared as a rectangular strip 20 mm long, 12 mm wide and 0.5 mm thick. The material used in this work was stainless steel 304L (SS304L).

The dies and the test piece were cleaned and washed with acetone prior to the test to eliminate dirt and grease, which may cause the test piece to slip. The specimen holder consists of the upper and lower dies, the alignment shim, and 6 × M8 socket head clamping screws. The test piece was placed in the holder and clamped along its perimeter. The socket head screws were all torqued to 30 Nm so that the clamping force was uniform along the specimen. It has been suggested (Sun 2003) that differences in clamping force between the equipment of different participating laboratories have no significant effect on test results, although this merits further research.

The tests presented in this work were all carried out on the same day at room temperature using a computer-controlled universal tensile machine (Zwick/Roell 2061 testing machine) with a 100 kN load cell at a constant punch displacement rate of 0.5 mm/min (8.33 μm/s). To avoid impact, a small sinusoidal load (≈10 N) was initially applied for 10 s; as the test proceeded, the corresponding load and displacement (i.e. stroke) were digitally recorded.

5 Finite element analysis

An FE model with GTN constitutive equations was created in ABAQUS 6.10. The model used the same key dimensions as the experimental components. The FE models were primarily used to compute load–displacement curves (LDCs) for known elasto-plastic properties of the test specimen at room temperature. The LDCs were later used to train the artificial neural network (ANN).

5.1 Geometry

The FEA model was constructed using ABAQUS/Explicit 6.10 in two dimensions (plane strain). Since the geometry and the loading of the SP test were symmetric about the centre line coincident with the punch axis, a two-dimensional finite element model was sufficient. In addition, exploiting this symmetry reduced the complexity of the problem and minimised the computational time.

The FEA model used is shown in Fig. 5. There are four components in the model: the punch, the top die, the bottom die and the test piece. The punch and dies were taken as rigid bodies, and the test piece as deformable. This decision was taken because rigid bodies in Abaqus do not require meshing, which lowers the processing power required and saves a significant amount of time. Using a rigid punch and dies was only possible because the test piece was very thin (0.5 mm in thickness), which meant that any inaccuracies introduced by this simplification were negligible.

Fig. 5 FE model of the plane strain SPT

5.2 Element type

The test piece was meshed with the 4-node uniform strain quadrilateral (CPE4R) element. A fine mesh of these linear, reduced-integration elements is recommended for simulations involving very large deformation (such as the one in this model). The reduced-integration elements helped to decrease the analysis time as well as reducing the possibility of excessive flexibility of elements (i.e. hourglassing) (Sun 2003).

5.3 Material model

The material model used was based on the constitutive damage law developed by Gurson, Tvergaard and Needleman (GTN or sometimes referred to as the porous metal plasticity).

This model defines the inelastic flow of the porous metal on the basis of a potential function that characterises the porosity in terms of a single state variable, the relative density. In Abaqus/Explicit this is defined by a failure definition.
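As a minimal sketch of the potential function mentioned above, the snippet below evaluates the GTN yield potential in the form commonly documented for porous metal plasticity, written in terms of the void volume fraction f (the relative density is 1 − f). The function name and the default values of q1, q2 and q3 are illustrative assumptions taken from commonly quoted literature values; they are not the values of Table 1.

```python
import math


def gtn_potential(q_mises, pressure, sigma_y, f, q1=1.5, q2=1.0, q3=2.25):
    """Evaluate the GTN yield potential.

    q_mises  : von Mises equivalent stress
    pressure : hydrostatic pressure (positive in compression)
    sigma_y  : current yield stress of the fully dense matrix
    f        : void volume fraction (relative density = 1 - f)
    q1..q3   : Tvergaard fitting parameters (assumed defaults, not Table 1)

    The value is zero on the yield surface and negative inside it.
    """
    return (
        (q_mises / sigma_y) ** 2
        + 2.0 * q1 * f * math.cosh(-1.5 * q2 * pressure / sigma_y)
        - (1.0 + q3 * f ** 2)
    )


# With no voids (f = 0) the potential reduces to the von Mises criterion.
dense = gtn_potential(300.0, 0.0, 300.0, 0.0)
# With 1 % porosity the same stress state lies outside the yield surface.
porous = gtn_potential(300.0, 0.0, 300.0, 0.01)
```

The comparison of `dense` and `porous` illustrates how porosity shrinks the yield surface: a stress state that is just admissible for the dense matrix is no longer admissible once voids are present.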

The material model implemented for this FE analysis consisted of three parts: an elastic part, a plastic part and a porous metal plasticity part, in which the porous failure criteria and void nucleation were accounted for.
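The void nucleation part can be illustrated with the strain-controlled nucleation law usually paired with the GTN model, in which the nucleation intensity follows a normal distribution in equivalent plastic strain with mean εN, standard deviation sN and total nucleated void fraction fN. The sketch below is written under that assumption; the function name and the numerical values are illustrative and are not the values of Table 1.

```python
import math


def nucleation_intensity(eps_pl, f_n, eps_n, s_n):
    """Nucleation intensity A(eps_pl), so that df_nucleation = A * d(eps_pl).

    eps_pl : equivalent plastic strain
    f_n    : volume fraction of nucleating voids
    eps_n  : mean nucleation strain
    s_n    : standard deviation of the nucleation strain
    """
    return (
        f_n
        / (s_n * math.sqrt(2.0 * math.pi))
        * math.exp(-0.5 * ((eps_pl - eps_n) / s_n) ** 2)
    )


# Illustrative (assumed) parameter values.
F_N, EPS_N, S_N = 0.04, 0.3, 0.1

# Nucleation is most intense at the mean nucleation strain.
peak_intensity = nucleation_intensity(EPS_N, F_N, EPS_N, S_N)
```

Because the law is a scaled normal density, integrating it over all plastic strains returns fN, which is why fN is interpreted as the total void fraction available for nucleation.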

The list of parameters implemented for the finite element model can be seen in Table 1.

Table 1 GTN material parameters

The elastic part was specified by the linear isotropic elasticity, (i.e. the elastic modulus and Poisson ratio).

5.4 Interactions and contacts

Three contact pairs were created: (a) punch–specimen, (b) top die–specimen and (c) lower die–specimen. In all three cases the analytical rigid bodies (the punch and the dies) were assigned as the master surface and the specimen as the slave surface. The contact type was a node-based surface, created by specifying the nodes of the specimen (i.e. the slave surface). The finite sliding approach was used to account for the relative motion of the surfaces. The penalty contact algorithm was used, with friction specified through the friction coefficient (μ).

5.5 Boundary conditions and loading

The boundary conditions applied in the present FEA model are as follows:

The top and bottom dies were constrained in all degrees of freedom. In addition, the clamping force was ignored in this FE model to avoid unnecessary complications. Translations in the radial and horizontal directions were prevented at the left end of the specimen, and symmetry (X direction) was applied on the opposite side to account for the symmetry conditions. Furthermore, the punch was only allowed to move vertically; hence it was constrained in both the horizontal and radial directions.

Since the experiment lasted around 6 min, the load was applied at a displacement rate of 0.08334 mm/s to avoid a large computational burden. To further reduce the running time, a mass scaling factor of 10,000 was carefully applied.

It must be said that, owing to the relatively high loading speed in this FE analysis, the void growth, coalescence and failure propagation could have been slightly affected. However, as this source of error accounted for only very small differences, the effect was neglected.
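The loading figures quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below compares the FE displacement rate with the experimental rate of 0.5 mm/min and estimates the effect of mass scaling on the stable time increment of an explicit analysis. Two assumptions are made here and are not stated in the text: the stroke is taken as 3 mm (about 6 min at the experimental rate), and the mass scaling factor is taken to multiply the density uniformly, so that the stable increment grows with its square root.

```python
import math

EXP_RATE_MM_PER_MIN = 0.5   # experimental punch rate (Section 4)
FE_RATE_MM_PER_S = 0.08334  # displacement rate applied in the FE model
MASS_SCALING = 10_000       # mass scaling factor quoted above
STROKE_MM = 3.0             # assumed stroke, about 6 min at 0.5 mm/min

# Experimental rate in consistent units.
exp_rate_mm_per_s = EXP_RATE_MM_PER_MIN / 60.0

# How much faster the simulated punch moves than the real one.
rate_speedup = FE_RATE_MM_PER_S / exp_rate_mm_per_s

# Physical and simulated durations for the assumed stroke.
exp_duration_s = STROKE_MM / exp_rate_mm_per_s
fe_duration_s = STROKE_MM / FE_RATE_MM_PER_S

# The stable increment scales with element size / wave speed, and the wave
# speed scales with sqrt(stiffness / density), so scaling the mass by k
# enlarges the stable increment by sqrt(k).
dt_gain = math.sqrt(MASS_SCALING)

print(f"experimental rate : {exp_rate_mm_per_s * 1000:.2f} um/s")
print(f"rate speed-up     : {rate_speedup:.1f}x")
print(f"durations         : {exp_duration_s:.0f} s (test), {fe_duration_s:.0f} s (FE)")
print(f"time-step gain    : {dt_gain:.0f}x")
```

Under these assumptions the simulated punch moves roughly ten times faster than in the experiment, and mass scaling permits time increments about one hundred times larger, which is the motivation for checking that inertia effects remain small.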

5.6 Mesh convergence

In order to find the most suitable mesh with reasonable computation time a mesh convergence study was performed.

The convergence study was carried out by creating a mesh with the fewest reasonable number of elements and then analysing the plane strain SPT model. The mesh was then recreated with a denser element distribution, the model was re-analysed and the results were compared with those of the previous mesh. This process was repeated until the results converged satisfactorily. In addition, single-domain ALE adaptive meshing was applied to the test piece to ensure a high-quality mesh throughout the simulation.

The final mesh contained 120 × 40 reduced-integration elements, which was found to be sufficiently dense without being overly demanding of computing resources.
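The refinement procedure described above can be sketched as a simple loop. The helper below is hypothetical: `run_model` stands for whatever routine runs the FE analysis and returns a scalar result (for example the peak load), and the 1 % tolerance is an assumed stopping criterion, since the text does not state one. The stand-in model is an invented function used only to make the example runnable.

```python
def refine_until_converged(run_model, meshes, rel_tol=0.01):
    """Run successively denser meshes until the monitored result stabilises.

    run_model : callable taking a mesh description and returning a scalar
    meshes    : iterable of mesh descriptions, coarsest first
    rel_tol   : relative change below which two consecutive results are
                treated as converged (assumed value)
    """
    previous = None
    for mesh in meshes:
        result = run_model(mesh)
        if previous is not None and abs(result - previous) <= rel_tol * abs(previous):
            return mesh, result
        previous = result
    raise RuntimeError("no convergence within the supplied meshes")


def fake_peak_load(mesh):
    """Invented stand-in for an FE run: the error decays with element count."""
    n_length, n_thickness = mesh
    return 1500.0 * (1.0 + 10.0 / (n_length * n_thickness))


candidate_meshes = [(30, 10), (60, 20), (120, 40), (240, 80)]
converged_mesh, converged_load = refine_until_converged(fake_peak_load, candidate_meshes)
```

In practice the monitored quantity could be the whole LDC rather than a single scalar, in which case the comparison would use a norm of the difference between consecutive curves.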

6 Neural network modelling

6.1 Feature extraction

The main objective of the neural network modelling is to establish a distinctive correlation between the LDC (force vectors) of the simulated plane strain SPT and the corresponding material parameters. The LDC can be regarded as a function giving the punch force F in terms of the displacement d and the material parameters σ. This function is created by systematically varying the parameter sets and storing the results in a database.

To reduce the computational effort required, a decision had to be made about which part of the data set to use for the function approximation. As the aim was to identify the plastic properties, the purely plastic region of the load–displacement curve was made the focal point of the data extraction.
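One way to turn the plastic region of an LDC into a fixed-length force vector is to sample the curve at a few prescribed displacements. The sketch below assumes five evenly spaced sample points, matching the length of the experimental force vector used later for validation; the window bounds, the function name and the synthetic curve are illustrative assumptions rather than values from the study.

```python
import numpy as np


def extract_force_vector(disp, force, d_start, d_end, n_points=5):
    """Sample an LDC at n_points evenly spaced displacements in [d_start, d_end]."""
    disp = np.asarray(disp, dtype=float)
    force = np.asarray(force, dtype=float)
    if np.any(np.diff(disp) <= 0):
        raise ValueError("displacement must be strictly increasing")
    sample_d = np.linspace(d_start, d_end, n_points)
    # Linear interpolation between the recorded points of the curve.
    return np.interp(sample_d, disp, force)


# Synthetic stand-in for a simulated LDC (not data from the study).
disp = np.linspace(0.0, 3.0, 301)
force = 1500.0 * np.sqrt(disp / 3.0)

# Assumed plastic window of 1.0 mm to 2.5 mm.
features = extract_force_vector(disp, force, d_start=1.0, d_end=2.5)
```

Sampling every simulated and experimental curve at the same displacements keeps the inputs of the network comparable from one curve to the next.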

6.2 Data normalisation

Prior to constructing the database, all the original data (i.e. force vectors and material properties) were linearly transformed to the interval [0, 1] using the min–max technique.
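A minimal sketch of the min–max scaling and its inverse is given below. It assumes that samples are stored along rows and that each column is scaled independently; the function names and the sample values are illustrative. The inverse transform is the step needed later to convert network outputs back into physical units.

```python
import numpy as np


def minmax_fit(data):
    """Return the column-wise minimum and maximum of the data."""
    data = np.asarray(data, dtype=float)
    return data.min(axis=0), data.max(axis=0)


def minmax_apply(data, lo, hi):
    """Map data linearly so that lo -> 0 and hi -> 1 (requires hi > lo)."""
    return (np.asarray(data, dtype=float) - lo) / (hi - lo)


def minmax_invert(scaled, lo, hi):
    """Undo minmax_apply, returning values in the original units."""
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo


# Illustrative values only: three samples of three material parameters.
raw = np.array([[200.0, 800.0, 1200.0],
                [250.0, 900.0, 1300.0],
                [300.0, 1000.0, 1400.0]])
lo, hi = minmax_fit(raw)
scaled = minmax_apply(raw, lo, hi)
restored = minmax_invert(scaled, lo, hi)
```

The same `lo` and `hi` obtained from the training database should be reused for the experimental data, so that both are expressed on one scale.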

6.3 Database

Based on experience, the plastic properties were varied systematically to form the [3 × 125] target matrix. These variations are shown in Table 2.

Table 2 Variation of Material Properties

The rest of the database was constructed by running an FE simulation for each set of material parameters in the target matrix. The input matrix was then built from the corresponding LDCs by extracting the data from the purely plastic region of each simulated curve, as shown in Fig. 6.
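A [3 × 125] target matrix is consistent with a full factorial design of three parameters at five levels each (5³ = 125), although the text does not state this explicitly. The sketch below builds such a matrix under that assumption. The parameter names and level values are purely hypothetical placeholders and do not reproduce Table 2.

```python
import itertools

import numpy as np

# Hypothetical levels in MPa; placeholders only, not the values of Table 2.
levels = {
    "sigma_1": np.linspace(200.0, 300.0, 5),
    "sigma_2": np.linspace(800.0, 1000.0, 5),
    "sigma_3": np.linspace(1200.0, 1400.0, 5),
}

# Every combination of the three parameters, one combination per column.
targets = np.array(list(itertools.product(*levels.values()))).T

print(targets.shape)
```

Each column of `targets` would correspond to one FE simulation, and the force vector extracted from that simulation would form the matching column of the input matrix.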

Fig. 6 Extracted data region from 125 simulated load–deflection curves (LDC)

In addition, a [5 × 1] experimental force vector was used as the validation set for early stopping during the training of the neural networks.

6.4 Network architecture

The configuration of the array of neurons is essential to the function of the artificial neural network. Amongst the possible network architectures, the feed-forward network (FFN) was chosen because it is highly versatile in general function approximation (i.e. nonlinear regression). As can be seen in Fig. 7, the FFN architecture used consists of one hidden layer of sigmoid neurons followed by an output layer of linear neurons. Multiple layers of neurons with non-linear transfer functions allow the network to learn non-linear and linear relationships between input and output vectors.

Moreover, this particular FFN is capable of approximating any functions of interest with a finite number of discontinuities arbitrarily well, given sufficient neurons in the hidden layer (Hagan and Demuth 1996).

Fig. 7 Two-layer FF neural network

For the two layer network shown above, the output of the first layer becomes the input to the second layer and for M linear combinations of input variables Fi (i = 1, …, N) this process can be expressed in terms of the following parametric nonlinear functions:

$$ a_{j}^{1} = \sum_{i = 1}^{N} w_{ji}^{1} F_{i} + b_{j1}^{1} $$
(1)

where a, w and b are known as the activations, weights and biases respectively. Also j = 1, …, M, and the superscript 1 indicates that the parameters belong to the first layer.

The activation quantities are then transformed using a tanh-sigmoid activation function \( f^{1}(\cdot) \) to give:

$$ h_{j} = f^{1}\left( a_{j}^{1} \right) $$
(2)

where \( h_{j} \) are called the hidden units and

$$ f^{1}\left( a_{j}^{1} \right) = \frac{e^{a_{j}^{1}} - e^{-a_{j}^{1}}}{e^{a_{j}^{1}} + e^{-a_{j}^{1}}} $$
(3)

In the second layer, the resulting values in the above equations are linearly combined to give the corresponding output activations, \( a_{K}^{2} \):

$$ a_{K}^{2} = \sum_{j = 1}^{M} w_{Kj}^{2} h_{j} + b_{K1}^{2} $$
(4)

Note that K runs over the network outputs and the superscript 2 refers to the second layer of the FF network. Finally, the output activations shown in (4) are transformed using a linear activation function \( f^{2}(\cdot) \) to give the set of network outputs \( \sigma_{K} \) as a function of the input vector F and the adjustable parameters W:

$$ \sigma_{K}\left( \mathbf{F}, \mathbf{W} \right) = f^{2}\left( \sum_{j = 1}^{M} w_{Kj}^{2} \, f^{1}\left( \sum_{i = 1}^{N} w_{ji}^{1} F_{i} + b_{j1}^{1} \right) + b_{K1}^{2} \right) $$
(5)
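Equations (1) to (5) amount to a single forward pass through the network, which can be sketched in a few lines of NumPy. The layer sizes below (N = 5 inputs, M = 10 hidden units, K = 3 outputs) and the random weights are assumptions made for illustration; the text does not state the number of hidden neurons.

```python
import numpy as np


def ffn_forward(F, W1, b1, W2, b2):
    """Forward pass of the two-layer network.

    F  : input force vector, shape (N,)
    W1 : first-layer weights, shape (M, N);  b1 : biases, shape (M,)
    W2 : second-layer weights, shape (K, M); b2 : biases, shape (K,)
    """
    a1 = W1 @ F + b1   # Eq. (1): first-layer activations
    h = np.tanh(a1)    # Eqs. (2)-(3): tanh-sigmoid hidden units
    a2 = W2 @ h + b2   # Eq. (4): output activations
    return a2          # Eq. (5): linear output function


# Assumed layer sizes and random parameters, for illustration only.
rng = np.random.default_rng(0)
N, M, K = 5, 10, 3
W1, b1 = rng.normal(size=(M, N)), rng.normal(size=M)
W2, b2 = rng.normal(size=(K, M)), rng.normal(size=K)

outputs = ffn_forward(rng.uniform(size=N), W1, b1, W2, b2)
```

Because the output layer is linear, the network outputs are not confined to a fixed range, which suits regression onto normalised material parameters.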

6.5 Network training

The training involves an iterative procedure for minimising an error function, with adjustments to the weights being made in a sequence of steps. Given a training set comprising a set of input force vectors \( \mathbf{F}_{n} \) together with the corresponding set of target vectors \( t_{n} \) (containing the systematically changed material parameters), the sum-of-squares error function can be written as follows:

$$ E\left( \mathbf{W} \right) = \frac{1}{2} \sum_{n = 1}^{N} \left\| \sigma_{K}\left( \mathbf{F}_{n}, \mathbf{W} \right) - t_{n} \right\|^{2} $$
(6)

Therefore, to maximise the likelihood of the function \( \sigma_{K} \), the function E(W) (i.e. the sum of squared errors, SSE) must be minimised.

Amongst the available training techniques, backpropagation with the Levenberg–Marquardt (L–M) optimisation algorithm was chosen in this study. The L–M algorithm is designed specifically for minimising the SSE. In addition, it is one of the fastest methods for training moderate-sized FF neural networks, such as the one used in this work (Bishop 1995). For the L–M algorithm to function effectively, two training parameters must be defined: the error goal and the minimum gradient. After a series of training runs, values of 0.0005 for the error goal and 0.0005 for the minimum gradient were found to produce the best result.
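The idea behind the L–M update can be illustrated on a small least-squares problem. The sketch below is a generic, textbook-style implementation and not the routine used in the study: the Jacobian is approximated by finite differences, the damping parameter and its update factors are assumed values, and the fitted model is a single tanh unit chosen only to keep the example short. The error goal and minimum gradient are set to the 0.0005 quoted above.

```python
import numpy as np


def numerical_jacobian(residual_fn, w, eps=1e-6):
    """Forward-difference Jacobian of the residual vector with respect to w."""
    r0 = residual_fn(w)
    J = np.empty((r0.size, w.size))
    for k in range(w.size):
        step = np.zeros_like(w)
        step[k] = eps
        J[:, k] = (residual_fn(w + step) - r0) / eps
    return J


def levenberg_marquardt(residual_fn, w0, mu=1e-3, goal=5e-4, min_grad=5e-4, max_iter=200):
    """Minimise 0.5 * sum(residual**2) with damped Gauss-Newton steps."""
    w = np.asarray(w0, dtype=float).copy()
    r = residual_fn(w)
    sse = 0.5 * r @ r
    for _ in range(max_iter):
        J = numerical_jacobian(residual_fn, w)
        grad = J.T @ r
        if sse <= goal or np.linalg.norm(grad) <= min_grad:
            break
        while mu < 1e10:
            # Solve (J^T J + mu I) step = -J^T r for the damped update.
            step = np.linalg.solve(J.T @ J + mu * np.eye(w.size), -grad)
            r_new = residual_fn(w + step)
            sse_new = 0.5 * r_new @ r_new
            if sse_new < sse:
                # Accept the step and move towards Gauss-Newton behaviour.
                w, r, sse = w + step, r_new, sse_new
                mu *= 0.1
                break
            # Reject the step and move towards gradient descent.
            mu *= 10.0
        else:
            break  # damping exhausted without finding a better point
    return w, sse


# Toy problem (illustrative only): fit y = a * tanh(b * x).
x = np.linspace(-2.0, 2.0, 21)
y = 2.0 * np.tanh(0.5 * x)


def residuals(w):
    return w[0] * np.tanh(w[1] * x) - y


w_start = np.array([1.0, 1.0])
sse_initial = 0.5 * residuals(w_start) @ residuals(w_start)
w_fit, sse_final = levenberg_marquardt(residuals, w_start)
```

A small damping parameter makes the update behave like a Gauss–Newton step, while a large one turns it into a short gradient-descent step; adjusting it after each trial is what gives the method its robustness on moderate-sized networks.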

Fig. 8 Flow chart of the training process

In addition, to avoid over-training the network, an early stopping technique was used. This technique was applied by providing the neural network with a validation set F Exp consisting of a (5 × 1) force vector corresponding to the experimental plane strain SPT. The training process is illustrated in Fig. 8.
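Early stopping can be sketched as a loop that monitors the validation error after every epoch and keeps the weights with the lowest error seen so far. The helper below is hypothetical: `step_fn` and `val_error_fn` stand for the training step and the validation-error evaluation, and the patience of six epochs is an assumed setting, since the text does not specify one. The stand-in functions are invented so that the example runs on its own.

```python
def train_with_early_stopping(step_fn, val_error_fn, max_epochs=1000, patience=6):
    """Train until the validation error has not improved for `patience` epochs.

    step_fn      : callable performing one epoch and returning the current state
    val_error_fn : callable returning the validation error for a state
    """
    best_err = float("inf")
    best_epoch = 0
    best_state = None
    for epoch in range(1, max_epochs + 1):
        state = step_fn()
        err = val_error_fn(state)
        if err < best_err:
            best_err, best_epoch, best_state = err, epoch, state
        elif epoch - best_epoch >= patience:
            break
    return best_state, best_err, best_epoch


# Invented stand-ins: the "state" is just the epoch counter, and the
# validation error falls until epoch 10 and rises afterwards.
counter = {"epoch": 0}


def fake_step():
    counter["epoch"] += 1
    return counter["epoch"]


def fake_val_error(state):
    return (state - 10) ** 2 + 1.0


best_state, best_err, best_epoch = train_with_early_stopping(fake_step, fake_val_error)
```

Returning the best state rather than the last one is the important detail: the weights kept are those from the epoch at which the validation error was lowest, not those reached when training was halted.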

7 Results and discussion

7.1 Plane strain SPT

The load–displacement curves (LDCs) for SS304L can be seen in Fig. 9. The results are consistent, as the curves follow almost the same pattern. However, small differences between the curves can be observed in Fig. 9 as the load starts to peak in the final stage. Although the exact rationale behind this phenomenon is not known, the following factors can have a significant influence:

  1. Small variation in specimen size. All the specimens were prepared and cut with a manual guillotine shear cutter, and therefore their dimensions were not exactly the same.

  2. Although due care was taken to cut the specimens along the grain of the sheet, this proved to be very difficult without using a microscope. Therefore some of the specimens may have been cut across the grain, and this could have caused the small variation.

  3. More precise means of manufacturing, such as electric discharge machining (EDM), may have been effective in eliminating the variations. Having said that, these manufacturing methods could only be used at considerably higher cost.

Fig. 9 Experimental plane strain SPT load displacement curve

Furthermore, as seen in Fig. 9, the load–displacement curves show an approximately linear initial loading, which is considered the elastic bending regime. This stage is mainly controlled by the elastic material properties (i.e. Young's modulus and Poisson's ratio). The second stage reflects the transition between the elastic and plastic regimes of SS304L: it begins with the transition from elastic to plastic behaviour and later becomes purely plastic. Here the voids begin to nucleate (i.e. fN increases) as the plastic strain increases. The parameters that influence this region are q1, q2, q3, εN, sN and fN. In the third stage, the curvature of the graph changes from positive to negative where the deformation mode becomes purely plastic. This is the inflection point at which the deformation mode becomes mainly membrane stretching. As the deformation increases, the void volume fraction reaches a critical value ( fC ) at the end of this stage. Finally, as the load reaches its peak, the specimen undergoes a noticeable reduction in thickness and void coalescence begins. The void volume fraction has reached its critical value and keeps rising towards its final value ( fF ). The graph also shows that shortly after the maximum load has been reached (at just under 3 mm), the load starts to decrease. This demonstrates the coalescence of voids, in which the test piece loses its load-carrying capacity and failure ultimately occurs.
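The two landmarks discussed above, the inflection point where the curvature changes sign and the displacement at peak load, can be located numerically from a recorded curve. The sketch below uses finite-difference derivatives on a smooth synthetic curve; the function name and the curve are illustrative assumptions, and measured data would normally need smoothing before being differentiated twice.

```python
import numpy as np


def ldc_landmarks(disp, force):
    """Return (inflection displacement, peak-load displacement) of an LDC.

    The inflection is taken as the first point where the curvature changes
    from positive to non-positive; None is returned if no such point exists.
    """
    disp = np.asarray(disp, dtype=float)
    force = np.asarray(force, dtype=float)
    slope = np.gradient(force, disp)
    curvature = np.gradient(slope, disp)
    change = np.where((curvature[:-1] > 0) & (curvature[1:] <= 0))[0]
    inflection = disp[change[0] + 1] if change.size else None
    peak = disp[np.argmax(force)]
    return inflection, peak


# Synthetic curve (not measured data): F = 3 d^2 - d^3 has its inflection
# at d = 1 and its maximum at d = 2.
d = np.linspace(0.0, 2.5, 251)
F = 3.0 * d ** 2 - d ** 3
inflection_d, peak_d = ldc_landmarks(d, F)
```

Locating these points on each curve offers a simple way to delimit the purely plastic region between the inflection and the peak.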

7.2 Neural network simulations

Finally, the neural networks were created and run in MATLAB. The networks were simulated 15 times, and the material properties \( \sigma_{1} \), \( \sigma_{2} \) and \( \sigma_{3} \) were recorded each time. Afterwards, the recorded material properties were un-normalised and fed back to Abaqus, and the corresponding force vectors were obtained. The material properties corresponding to the force vector that produced the least mean squared error (when compared with its experimental counterpart) were chosen as the final result.

Hence, the following values for the plastic properties were identified to produce the best results: \( \sigma_{1} \) = 246.93 MPa, \( \sigma_{2} \) = 924.4 MPa and \( \sigma_{3} \) = 1281.97.
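The selection step described above, choosing from several network runs the candidate whose re-simulated force vector lies closest to the experiment, reduces to an argmin over mean squared errors. The sketch below shows that step only; the function name is hypothetical and the force values are invented for illustration, with three candidates instead of fifteen to keep it short.

```python
import numpy as np


def select_best_candidate(candidate_forces, experimental_force):
    """Return the index of the candidate with the smallest MSE, and all MSEs.

    candidate_forces   : array of shape (n_runs, n_points), one re-simulated
                         force vector per network run
    experimental_force : array of shape (n_points,)
    """
    candidate_forces = np.asarray(candidate_forces, dtype=float)
    experimental_force = np.asarray(experimental_force, dtype=float)
    mse = np.mean((candidate_forces - experimental_force) ** 2, axis=1)
    return int(np.argmin(mse)), mse


# Invented force vectors in N, for illustration only.
experiment = np.array([900.0, 1050.0, 1200.0, 1350.0, 1500.0])
candidates = np.array([
    [950.0, 1100.0, 1250.0, 1400.0, 1550.0],   # uniformly 50 N too high
    [905.0, 1045.0, 1205.0, 1345.0, 1505.0],   # within 5 N everywhere
    [800.0, 1000.0, 1200.0, 1400.0, 1600.0],   # too steep
])
best_index, errors = select_best_candidate(candidates, experiment)
```

The material parameters associated with `best_index` would then be reported as the identified properties.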

The comparison of the experimental LDC from the plane strain SPT and its simulated counterpart (using the identified parameters) is shown in Fig. 10. As can be seen, the results are in close agreement, except that the predicted LDC is slightly higher in the initial deformation stage, which corresponds to the elastic and transition regions.

Fig. 10 Simulated and experimental plane strain SPT

8 Conclusion

A new approach has been developed to identify the plastic properties of SS304L using the plane strain small punch test. The plane strain SPT demonstrated its functionality and consistency and proved to be a possible candidate for identifying material parameters. In addition, numerical simulations of the plane strain SPT were successfully carried out to construct a database, which was then used to train artificial neural networks with the Levenberg–Marquardt backpropagation algorithm. This led to the successful identification of the plastic properties of SS304L. In general, close agreement was observed between the experimental and simulated load–displacement curves, although a slight variance was observed in the initial deformation stages. Future work should first investigate the small variations in the initial deformation stages and validate the precision of the neural network through uncertainty quantification. Furthermore, the above methodology can be tested by identifying the damage parameters in the Gurson–Tvergaard–Needleman (GTN) material model, which can then be verified by FE simulation of notched specimens and experimental tensile tests.