1 Introduction

Martensitic transformations and microstructures. Martensitic transformations (MTs) are diffusionless solid–solid phase transformations observed in many metallic and nonmetallic crystalline solids, minerals, and various compounds, where a parent phase called austenite (high-temperature phase) transforms into the product phase called martensite (low-temperature phase) [1, 2]. The martensitic phase (here denoted by \(\mathsf{{M}}\)) has lower crystallographic symmetry than the austenite phase (here denoted by \(\mathsf{{A}}\)) and generally has multiple variants. Very complex microstructures such as austenite-twinned martensite, twins within twins, twins within twins within twins, wedges, and X-interfaces are observed within the materials undergoing MTs [1,2,3,4,5]. The evolution of such microstructures plays a central role in, for example, the strengthening of steel, the shape memory effect in various alloys, ferromagnetic effects, and caloric effects [1, 6].

In continuum theories for MTs, such phase-changing materials are modeled as nonlinear elastic materials having multiple wells in the free energy density function [1, 2, 4, 7, 8]. Fine twinned microstructures associated with austenite-martensite interfaces, which were observed under the microscopes [9,10,11,12], are usually obtained as the minimizers of such non-convex energies within the continuum theories [4]. The analytical crystallographic solutions for twins between a pair of variants and the austenite-twinned martensite interfaces are well-known within the small as well as finite deformation theories [1, 2, 4, 13]. These solutions have been further used for obtaining the solutions for more complex wedge microstructures [5, 14] and X-interfaces [15, 16]. Another important complex microstructure is twins within twins [1, 3] for which the general crystallographic equations were established in [1, 10, 11]. Though the governing equations are well-known for twins within twins, the analytical solutions for such microstructures are still missing to the best of our knowledge.

Phase-field approach to MTs. The phase-field approaches based on the Ginzburg–Landau equations [17] (similar to Allen–Cahn’s approach [18]), which provide an ideal framework for studying the MTs, have been widely used for studying nucleation, growth of the phases, and evolution of complex microstructures [19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51]. Notably, the phase-field approaches are also widely used for other types of structural changes in materials, including melting [52], the evolution of damage [53], the kinetics of grain boundaries [54], etc. In all the phase-field models, a set of sufficiently smooth scalar internal variables, called the order parameters, is used to describe the phases. Order parameters based on the volume fractions (e.g., [55,56,57,58,59,60,61,62,63]) or on the transformation strains (e.g., [19, 20, 23, 24, 26]) have been used. Within the multiphase phase-field approaches, the order parameters should be constrained to some specified surfaces in order to control the transformation paths. For that purpose, various constraint hypersurfaces such as hypersphere [64], planar surfaces [65], and straight lines [20, 66] have been used; see [20] for a review. The double-well (e.g., [19, 20, 26, 64, 66]) or double-obstacle (e.g., [22])-based thermal energies are usually used. The free energies are considered to be smooth functions of the order parameters, and the transformation strains are taken as linear [55, 57,58,59,60,61,62,63] or nonlinear [19,20,21, 23, 24, 67] functions of the order parameters, varying smoothly between all the phases. The energies and the transformation strains satisfy the requirements of thermodynamic equilibrium of the phases [45, 46] (also see Sect. 2.5). The first large strain-based phase-field theory and computational approaches were presented in [21, 37, 47]. They utilized the method of repetitive superposition of large strains, developed by Levin in [68,69,70] for viscoelastic materials and extended to materials with phase transformations (PTs). A gradient (of the order parameters)-based nonlocal energy is considered, which introduces finite interface widths between the phases, and in 3D domains, the interfaces are modeled as shell-like regions [71]; see, e.g., [72] and the references therein for other types of nonlocal theories. The time evolution of these order parameters describing the kinetics of the PTs is derived using the laws of thermodynamics, yielding a system of coupled Ginzburg–Landau equations. The interfacial stresses, which consist of the elastic and structural components and play an important role in the nucleation of the phases and in their kinetics and growth, have been considered (e.g., [35, 73, 74]). A detailed comparison between these various multiphase phase-field approaches to MTs is presented in [20].

The multiphase phase-field model for multivariant MTs developed by the authors in [20] yields non-contradictory results for a two-variant system. However, for a system with more than two variants, some contradictions have been observed in relation to the gradient energy and the system of kinetic equations. One of the aims of this work is to discuss those issues and present a non-contradictory multiphase phase-field model for MTs. The gradient energy proposed in [20] simplifies consistently for a two-variant system and matches the well-established result (see [20] for the discussion). However, a contradiction is observed when the system contains more than two variants, as discussed in Sect. 2.3. An alternative form of the gradient energy is used here, which is similar to the gradient energy used in [22, 23, 58, 65] and yields non-contradictory results for any number of variants. In the present model, however, we multiply this gradient term by the determinant of the total deformation gradient when determining the total system energy to ensure an appropriate form of the structural stress tensor, which was not considered in [22, 23, 58, 65]. Furthermore, we point out in Sect. 2.4.2 that the coupled kinetic equations for the order parameters are non-contradictory for a two-variant system but lead to contradictions for an N-variant system with \(N>2\) if the kinetic coefficients are assumed to be constant. We thus introduce a system of kinetic equations whose kinetic coefficients are piecewise functions of the order parameters and the driving forces, motivated by Ref. [60].

Contribution of the paper. The contributions of this paper are mainly threefold:

(i) We present a thermodynamically consistent nanoscale phase-field approach for multivariant MTs considering non-contradictory gradient energies and the local energies, including the barrier, chemical, and elastic energy, as well as energies penalizing the multiphase junctions, while the deviations of the transformation paths for \(\mathsf{{A}}\leftrightarrow \mathsf{{M}}\) and \({\mathsf{{M}}}_i\leftrightarrow {\mathsf{{M}}}_j\) PTs from the specified paths are appropriately controlled. The issues with the existing gradient energy models are discussed. Furthermore, a consistent kinetic model for coupled Ginzburg–Landau equations is derived, and the issues with the existing kinetic models are discussed. The present model can be used for MTs with any number of variants.

(ii) A general approximate crystallographic solution for the twins within twins microstructure is presented. The solution for the cubic to tetragonal MTs is also obtained.

(iii) The evolution and formation of 3D twins within twins microstructures in a single grain are studied using the present phase-field approach. The simulation results are in good agreement with the crystallographic solution and the experimental results.

Notations. The multiplication and the inner product between two arbitrary second order tensors \({\varvec{A}}\) and \({\varvec{D}} \) are denoted by \(({\varvec{A}} \cdot {\varvec{D}})_{ab}=A_{ac} D_{cb}\) and \({\varvec{A}}:{\varvec{D}}=A_{ab} D_{ba}\), respectively, where \(A_{ab}\) and \(D_{ab}\) are the components of the tensors in a right-handed orthonormal Cartesian basis \(\{\varvec{e}_1,\varvec{e}_2,\varvec{e}_3\}\). The repeated indices imply Einstein’s summation. The Euclidean norm of \(\varvec{A}\) is defined by \(|\varvec{A}|=\sqrt{{\varvec{A}}:{\varvec{A}}^T}\). The second-order identity tensor is denoted by \({{\varvec{I}}}\). \(\varvec{A}^T\), \(tr\,\varvec{A}\), \(det\,\varvec{A}\), \(sym(\varvec{A})\), and \(skew(\varvec{A})\) denote the transpose, trace, determinant, symmetric part, and skew part of \(\varvec{A}\), respectively. For an invertible tensor \(\varvec{A}\), its inverse is denoted by \(\varvec{A}^{-1}\). The tensor or dyadic product between two arbitrary vectors \(\varvec{a}\) and \({\varvec{b}}\) is denoted by \(\varvec{a}\otimes {\varvec{b}}\). The reference, stress-free intermediate, and deformed or current configurations are denoted by \(\Omega _0\), \(\Omega _t\), and \(\Omega \), respectively. The volumes in the reference and current configurations are denoted by \(V_0\) and V, and their external boundaries are denoted by \(S_0\) and S, respectively. The symbols \(\nabla _0(\cdot )\) and \(\nabla (\cdot )\) denote the gradient operators in \(\Omega _0\) and \(\Omega \), respectively. The Laplacian operators in \(\Omega _0\) and \(\Omega \) are designated by \(\nabla _0^2:= \nabla _0 \cdot \nabla _0 \) and \(\nabla ^2 := \nabla \cdot \nabla \), respectively. The symbol \(:=\) implies equality by definition.

2 Coupled mechanics and phase-field model

We describe our multiphase phase-field model in this section. In Sect. 2.1, the order parameters are introduced; in Sect. 2.2, the kinematic relations are listed; the general form of the Helmholtz free energy is presented in Sect. 2.3; the general form of the governing coupled mechanics and phase-field equations is derived in Sect. 2.4; using the condition for homogeneous nucleation of the phases, the expressions for the interpolation functions related to the order parameters are derived in Sect. 2.5; the explicit form of the energies is derived in Sect. 2.6, and the structural stresses and Ginzburg–Landau equations are derived in Sect. 2.7; we summarize the shortcomings of the previous models from the literature and discuss how the present model overcomes them in Sect. 2.8.

2.1 Order parameters

For the MTs in a system with austenite and N martensitic variants, we consider \(N+1\) order parameters \(\eta _0, \eta _1, \ldots ,\eta _{i-1}, \eta _i, \eta _{i+1}, \ldots ,\eta _N\), where \(\eta _0\) describes \({\mathsf{{A}}}\leftrightarrow {\mathsf{{M}}}\) transformations such that \(\eta _0=0\) in \(\mathsf{{A}}\) and \(\eta _0=1\) in \(\mathsf{{M}}\), and \(\eta _i\) (for \(i=1,\ldots ,N\)) describes the variant \({\mathsf{{M}}}_i\) such that \(\eta _i=1\) in \({\mathsf{{M}}}_i\) and \(\eta _i=0\) in \({\mathsf{{M}}}_j\) for all \(j\ne i\). Such descriptions for the order parameters were introduced by the authors in earlier work [20]. The order parameters \(\eta _1, \,\eta _2, \ldots , \eta _N\) are constrained to lie on a plane by satisfying (see [20] for details)

$$\begin{aligned} \sum _{i=1}^N \eta _i=1, \end{aligned}$$
(2.1)

which ensures that the variant–variant transformation paths lie in that plane. We introduce a set of all the order parameters as \({\tilde{\eta }}=\{\eta _0,\eta _1,\ldots ,\eta _i,\ldots ,\eta _N\}\) and a subset \({\tilde{\eta }}_M=\{\eta _1,\ldots ,\eta _i,\ldots ,\eta _N\}\) of \({\tilde{\eta }}\). We also designate \({\hat{\eta }}_0 = \{\eta _0=0,\eta _1,\ldots ,\eta _i,\ldots ,\eta _N\}\) for \(\mathsf{{A}}\) and \({\hat{\eta _i}} = \{\eta _0=1,\eta _1=0,\ldots ,\eta _i=1,\ldots ,\eta _N=0\}\) for the variant \({\mathsf{{M}}}_i\). The set of the gradients of all the order parameters is denoted by \({\tilde{\eta }}^\nabla = \{\nabla \eta _0,\nabla \eta _1,\ldots ,\nabla \eta _i,\ldots ,\nabla \eta _N\}\).

2.2 Kinematics

The position vector of a particle in the deformed configuration \(\Omega \) at time instance t is given by \(\varvec{r}(\varvec{r}_0,t)=\varvec{r}_0+\varvec{u}(\varvec{r}_0,t)\), where \(\varvec{r}_0\) is the position vector of that particle in \(\Omega _0\), and \(\varvec{u}\) is the displacement vector. For a general thermoelastic deformation coupled to MTs, the total deformation gradient tensor \({{\varvec{F}}}:=\nabla _0\varvec{r} \) is multiplicatively decomposed into [75]

$$\begin{aligned} {{\varvec{F}}}:=\nabla _0\varvec{r} = \varvec{F}_e\cdot \varvec{F}_\theta \cdot \varvec{F}_t, \end{aligned}$$
(2.2)

where \(\varvec{F}_e\), \(\varvec{F}_\theta \), and \(\varvec{F}_t\) are the elastic, thermal, and transformational parts of \(\varvec{F}\), respectively, as indicated by the subscripts e, \(\theta \), and t. We denote \(J=det\,\varvec{F} \), \(J_t=det\,\varvec{F}_t\), \(J_\theta =det\,\varvec{F}_\theta \), and \(J_e=det\,\varvec{F}_e \). Hence, by Eq. (2.2), \(J=J_eJ_\theta J_t \). The Lagrangian total and elastic strain tensors are defined as

$$\begin{aligned} {{\varvec{E}}} := 0.5(\varvec{C}-{{\varvec{I}}}), \quad \text {and}\quad {{\varvec{E}}}_e := 0.5(\varvec{C}_e-{{\varvec{I}}}), \end{aligned}$$
(2.3)

respectively, where \(\varvec{C}={{\varvec{F}}}^T\cdot {{\varvec{F}}}\) and \(\varvec{C}_e={{\varvec{F}}}_e^T\cdot {{\varvec{F}}}_e\) are the total and elastic right Cauchy–Green strain tensors, respectively. We define the spatial total and elastic strain tensors as

$$\begin{aligned} {{\varvec{b}}} = 0.5(\varvec{B}-{{\varvec{I}}}), \quad \text {and }\quad {{\varvec{b}}}_e = 0.5(\varvec{B}_e-{{\varvec{I}}}), \end{aligned}$$
(2.4)

respectively, where \( \varvec{B} = {{\varvec{F}}}\cdot {{\varvec{F}}}^T\), and \(\varvec{B}_e = {{\varvec{F}}}_e\cdot {{\varvec{F}}}_e^T\) are the total and elastic left Cauchy–Green strain tensors, respectively. In this paper, we assume that the body is at a uniform temperature, and thus, we have \(\varvec{F}_\theta ={\varvec{I}}\) and \(J_\theta =1\).
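The kinematic relations above can be checked numerically. The following is a minimal NumPy sketch, assuming \(\varvec{F}_\theta ={\varvec{I}}\) as in the text; the function name `strain_measures` and the sample values of \(\varvec{F}_e\) and \(\varvec{F}_t\) are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def strain_measures(F_e, F_t):
    """Evaluate Eqs. (2.2)-(2.4) for F_theta = I (uniform temperature)."""
    I = np.eye(3)
    F = F_e @ F_t                       # Eq. (2.2) with F_theta = I
    C, C_e = F.T @ F, F_e.T @ F_e       # right Cauchy-Green tensors
    B, B_e = F @ F.T, F_e @ F_e.T       # left Cauchy-Green tensors
    return {
        "F": F,
        "J": np.linalg.det(F),
        "J_e": np.linalg.det(F_e),
        "J_t": np.linalg.det(F_t),
        "E": 0.5 * (C - I), "E_e": 0.5 * (C_e - I),   # Eq. (2.3)
        "b": 0.5 * (B - I), "b_e": 0.5 * (B_e - I),   # Eq. (2.4)
    }

# Illustrative inputs (assumed values, not from the paper).
F_t_example = np.diag([0.95, 0.95, 1.10])
F_e_example = np.eye(3) + 0.01 * np.array([[1.0, 0.5, 0.0],
                                           [0.0, -1.0, 0.2],
                                           [0.3, 0.0, 0.5]])
m = strain_measures(F_e_example, F_t_example)
```

For these inputs, \(J=J_eJ_t\) holds to machine precision, and \(\varvec{E}\) and \(\varvec{b}\) share the same eigenvalues because \(\varvec{C}\) and \(\varvec{B}\) do.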

Kinematic model for \(\varvec{F}_t\). We consider \(\varvec{F}_t\) as a linear combination of the Bain strains multiplied by the interpolation functions related to the order parameters [20]:

$$\begin{aligned} {{\varvec{F}}}_t={\varvec{I}}+ \sum _{i=1}^N \varvec{\varepsilon }_{ti}\, \varphi (a_\varepsilon ,\eta _0)\,\phi _i(\eta _i), \end{aligned}$$
(2.5)

where \(\varvec{\varepsilon }_{ti}=\varvec{U}_{ti}-{\varvec{I}}\) and \(\varvec{U}_{ti}\) are the Bain strain and Bain stretch tensors, respectively, for \({\textsf{M}}_i\), \( \varphi (a_\varepsilon ,\eta _0)\) and \(\phi _i(\eta _i)\) are the interpolation functions, and \(a_\varepsilon \) is a constant parameter. The exact form of these interpolation functions, which are required to yield the conditions \({{\varvec{F}}}_t(\hat{\eta }_0)={\varvec{I}}\) in \(\mathsf{{A}}\) and \({{\varvec{F}}}_t(\hat{\eta }_i)=\varvec{U}_{ti}\) in \({\mathsf{{M}}}_i\) from Eq. (2.5), are derived in Sect. 2.5. The possible values for \(a_\varepsilon \) are also prescribed in Sect. 2.5.

2.3 Free energy of the system

We assume the Helmholtz free energy per unit mass of the body as [20, 73]:

$$\begin{aligned} \psi (\varvec{F},\varvec{F}_e,\theta ,{\tilde{\eta }},{\tilde{\eta }}^\nabla ) = \psi ^l(\varvec{F},\varvec{F}_e,\theta ,{\tilde{\eta }})+ J\psi ^\nabla (\eta _0,{\tilde{\eta }}^\nabla ), \end{aligned}$$
(2.6)

where \( \psi ^l\) is the local part of the free energy density and \(\psi ^\nabla \) is the gradient-based nonlocal energy accounting for the energies of all the interfaces. We have taken \( \psi ^l\) as

$$\begin{aligned} \psi ^l(\varvec{F}, \varvec{F}_e,\theta ,{\tilde{\eta }})=\frac{J_t}{\rho _0}\psi ^e(\varvec{F}_e,\theta ,{\tilde{\eta }})+J\breve{\psi }^{\theta }(\theta ,{\tilde{\eta }})+\tilde{\psi }^\theta (\theta ,{\tilde{\eta }})+\psi ^p({\tilde{\eta }}), \end{aligned}$$
(2.7)

where \(\psi ^e\) is the strain energy per unit volume of \(\Omega _t\), \(\breve{\psi }^\theta \) is the barrier energy related to \({\mathsf{{A}}}\leftrightarrow {\mathsf{{M}}}\) PT and all the variant\(\leftrightarrow \)variant transformations, \(\tilde{\psi }^\theta \) is the thermal/chemical energy for \({\mathsf{{A}}}\leftrightarrow {\mathsf{{M}}}\) transformations, \(\psi ^p\) penalizes various triple and higher junctions between all the phases and also accounts for the penalization in energy for the deviation of the transformation paths from the assigned ones, \(\theta >0\) is the absolute temperature, and \(\rho _0\) is the density of the solid in \(\Omega _0\). In Eqs. (2.6) and (2.7), the barrier energy and the gradient energy are multiplied by J following [73] in order to obtain the desired expression for the structural stresses [given by Eq. (2.53)]. We consider that the material properties at any material point, such as the elastic constants and transformation strains, are determined using

$$\begin{aligned} B({\tilde{\eta }}, \theta , {{\varvec{F}}}) = B_0(1-\varphi (a,\eta _0)) + \sum _{i=1}^N B_i\phi _i (\eta _i) \varphi (a,\eta _0), \end{aligned}$$
(2.8)

where \(B_i=B_i({\hat{\eta _i}},\theta , {{\varvec{F}}})\) and \(B_0=B_0({\hat{\eta }}_0,\theta , {{\varvec{F}}})\) are the material properties of \({\textsf{M}}_i\) and \({\textsf{A}}\), respectively.

2.4 Governing mechanics and phase-field equations

We now derive the governing equations. Applying the principle of balance of linear and angular momentum and the first and second laws of thermodynamics and using an approach similar to [20, 73], we derive the mechanical equilibrium equation, dissipation inequalities, and Ginzburg–Landau equations.

2.4.1 Mechanical equilibrium equations and stresses

Using the balance of linear momentum, the mechanical equilibrium equations are obtained as (see, e.g., Chapter 3 of [76])

$$\begin{aligned} \nabla _0\cdot {\varvec{P}} =\varvec{0} \quad \text {in }\Omega _0, \quad \text {or}\quad \nabla \cdot \varvec{\sigma }=\varvec{0} \quad \text {in }\Omega , \end{aligned}$$
(2.9)

where the body forces and inertia are neglected, \({\varvec{P}}\) is the total first Piola–Kirchhoff stress tensor, and \(\varvec{\sigma }\) is the total Cauchy stress tensor. Applying the balance of angular momentum along with Eqs. (2.9)\(_1\) and (2.9)\(_2\), we get \({\varvec{P}}\cdot \varvec{F}^T = \varvec{F}\cdot {\varvec{P}}^T\), and \(\varvec{\sigma }=\varvec{\sigma }^T\) (see, e.g., Chapter 3 of [76]). In Appendix A, using Eqs. (2.6) and (2.7) and the first and second laws of thermodynamics, following the Coleman–Noll procedure [77, 78], and neglecting the viscous stresses, we have shown that the total first Piola–Kirchhoff stresses are composed of the elastic and structural parts (also see [20]):

$$\begin{aligned}{} & {} {\varvec{P}} = {\varvec{P}}_e+{\varvec{P}}_{st}, \quad \text {where }\quad {{\varvec{P}}}_e = J_t\frac{\partial \psi ^e}{\partial \varvec{F}_e}\cdot \varvec{F}_t^{-T}, \quad \text {and} \end{aligned}$$
(2.10)
$$\begin{aligned}{} & {} {{\varvec{P}}}_{st}= J\rho _0(\breve{\psi }^{\theta }+\psi ^\nabla ){\varvec{F}}^{-T} -J\rho _0 \left( \nabla \eta _0 \otimes \frac{\partial \psi ^\nabla }{\partial \nabla \eta _0}+ \sum _{i=1}^N \nabla \eta _i \otimes \frac{\partial \psi ^\nabla }{\partial \nabla \eta _i}\right) \cdot \varvec{F}^{-T}, \end{aligned}$$
(2.11)

are the elastic and structural first Piola–Kirchhoff stresses, respectively. Using the relation (see, e.g., Chapter 3 of [76])

$$\begin{aligned} {\varvec{P}}=J \varvec{\sigma }\cdot \varvec{F}^{-T}, \end{aligned}$$
(2.12)

we derive the corresponding Cauchy stresses as

$$\begin{aligned}{} & {} \varvec{\sigma }= \varvec{\sigma }_e+\varvec{\sigma }_{st},\quad \varvec{\sigma }_e=J_e^{-1}\frac{\partial \psi ^e}{\partial \varvec{F}_e}\cdot \varvec{F}_e^{T}, \quad \text {and} \end{aligned}$$
(2.13)
$$\begin{aligned}{} & {} \varvec{\sigma }_{st} =\rho _0(\breve{\psi }^{\theta }+\psi ^\nabla ){\varvec{I}}-\rho _0\left( \nabla \eta _0\otimes \frac{\partial \psi ^\nabla }{\partial \nabla \eta _0}+ \sum _{i=1}^N \nabla \eta _i\otimes \frac{\partial \psi ^\nabla }{\partial \nabla \eta _i}\right) . \end{aligned}$$
(2.14)

Using the functional form of \(\psi ^e\) given by Eq. (2.7) in Eqs. (2.10)\(_2\) and (2.13)\(_2\), we rewrite the elastic stresses as

$$\begin{aligned} {\varvec{P}}_e=J_t \varvec{F}_e\cdot \hat{\varvec{S}}_e\cdot \varvec{F}_t^{-T}, \quad \text {and}\quad \varvec{\sigma }_e=J_e^{-1} \varvec{F}_e\cdot \hat{\varvec{S}}_e\cdot \varvec{F}_e^T, \end{aligned}$$
(2.15)

where \( \hat{\varvec{S}}_e=\displaystyle \frac{\partial \psi ^e(\varvec{E}_e)}{\partial \varvec{E}_e}\). For an isotropic elastic response, \({\varvec{P}}_e\) and \(\varvec{\sigma }_e\) can alternatively be expressed as

$$\begin{aligned} {\varvec{P}}_e = J_t(2{\varvec{b}}_e+{\varvec{I}})\cdot \frac{\partial \psi ^e({\varvec{b}}_e)}{\partial {\varvec{b}}_e }\cdot \varvec{F}^{-T}, \quad \text {and}\quad \varvec{\sigma }_e = J^{-1}_e (2{\varvec{b}}_e+{\varvec{I}})\cdot \frac{\partial \psi ^e({\varvec{b}}_e)}{\partial {\varvec{b}}_e }. \end{aligned}$$
(2.16)
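As a consistency check on Eqs. (2.12) and (2.15), the sketch below evaluates both elastic stress measures and verifies that they are related by \({\varvec{P}}_e=J \varvec{\sigma }_e\cdot \varvec{F}^{-T}\). The St. Venant–Kirchhoff form of \(\hat{\varvec{S}}_e\) and the constants `lam` and `mu` are assumptions made only for illustration; the paper's elastic energy is specified later, and the function name `elastic_stresses` is hypothetical.

```python
import numpy as np

def elastic_stresses(F_e, F_t, lam=100.0, mu=50.0):
    """Return (P_e, sigma_e) from Eq. (2.15) for an assumed isotropic S_e."""
    I = np.eye(3)
    E_e = 0.5 * (F_e.T @ F_e - I)
    # Assumed St. Venant-Kirchhoff response: S_e = lam tr(E_e) I + 2 mu E_e.
    S_e = lam * np.trace(E_e) * I + 2.0 * mu * E_e
    J_t = np.linalg.det(F_t)
    J_e = np.linalg.det(F_e)
    P_e = J_t * F_e @ S_e @ np.linalg.inv(F_t).T      # Eq. (2.15)_1
    sigma_e = (1.0 / J_e) * F_e @ S_e @ F_e.T         # Eq. (2.15)_2
    return P_e, sigma_e

# Illustrative inputs (assumed values, not from the paper).
F_t_ex = np.diag([1.10, 0.95, 0.95])
F_e_ex = np.eye(3) + 0.02 * np.array([[0.5, 1.0, 0.0],
                                      [0.0, 0.3, -0.4],
                                      [0.2, 0.0, -0.6]])
F_ex = F_e_ex @ F_t_ex
J_ex = np.linalg.det(F_ex)
P_e_ex, sigma_e_ex = elastic_stresses(F_e_ex, F_t_ex)
# Right-hand side of Eq. (2.12) applied to the elastic part.
P_from_sigma = J_ex * sigma_e_ex @ np.linalg.inv(F_ex).T
```

Since \(\hat{\varvec{S}}_e\) is symmetric, the resulting \(\varvec{\sigma }_e\) is symmetric and \({\varvec{P}}_e\cdot \varvec{F}^T\) equals its own transpose, in line with the angular momentum balance.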

2.4.2 Dissipation inequality and Ginzburg–Landau equations

Using the first and second laws of thermodynamics and neglecting any viscous stresses, we have derived the following dissipation inequality in Appendix A:

$$\begin{aligned} \rho _0 {{\mathcal {D}}}= {\dot{\eta }}_0 X_{0}+\sum _{i=1}^N{\dot{\eta }}_i X_i\ge 0, \end{aligned}$$
(2.17)

where \({{\mathcal {D}}}\) is the power dissipation per unit mass, \(X_{0}\) and \(X_i\) are the conjugate thermodynamic forces for the evolution of \(\eta _0\) and \(\eta _i\) for \(i=1,\ldots ,N\), and their general form is given by

$$\begin{aligned} X_l= & {} \left( {\varvec{P}}_e^T\cdot \varvec{F}_e-J_t\psi ^e \varvec{F}_t^{-1}\right) :\frac{\partial \varvec{F}_t}{\partial \eta _l}- J_t\left. \frac{\partial \psi ^e}{\partial \eta _l}\right| _{\varvec{F}_e}- \rho _0J\frac{\partial (\breve{\psi }^\theta +\psi ^\nabla ) }{\partial \eta _l} -\rho _0 \frac{\partial ({\tilde{\psi }}^\theta +\psi ^p)}{\partial \eta _l} \nonumber \\{} & {} +\nabla _0\cdot \left( \rho _0 J\varvec{F}^{-1} \cdot \frac{\partial \psi ^\nabla }{\partial \nabla \eta _l}\right) \quad \text {for } l=0,1,2,\ldots , N. \end{aligned}$$
(2.18)

Using the following identities (which can be proved using indicial notation)

$$\begin{aligned} \nabla _0\eta _k=\varvec{F}^T\cdot \nabla \eta _k, \quad \text {and}\quad \displaystyle \frac{\partial \psi ^\nabla }{\partial \nabla _0\eta _k}=\varvec{F}^{-1}\cdot \frac{\partial \psi ^\nabla }{\partial \nabla \eta _k}, \end{aligned}$$
(2.19)

we can rewrite \(X_l\) given by Eq. (2.18) into the following compact form in terms of the gradients and variables in \(\Omega _0\):

$$\begin{aligned} X_l = -\rho _0\left. \frac{\partial \psi }{\partial \eta _l}\right| _{\varvec{F}}+\nabla _0\cdot \left( \rho _0 J\frac{\partial \psi ^\nabla }{\partial \nabla _0\eta _l} \right) \quad \text {for } l=0,1,2,\ldots , N. \end{aligned}$$
(2.20)

Alternatively, using the identities \(\rho _0=J\rho \) and \(\displaystyle \nabla _0\cdot (J\varvec{F}^{-T})=\varvec{0}\) (see Chapter 2 of [76] for their proofs) along with Eqs. (2.19)\(_{1,2}\) in the last term of Eq. (2.18), we can express \(X_l\) in terms of the gradients and variables in \(\Omega \) as

$$\begin{aligned} X_l = -J\rho \left. \frac{\partial \psi }{\partial \eta _l}\right| _{\varvec{F}}+J\nabla \cdot \left( J \rho \frac{\partial \psi ^\nabla }{\partial \nabla \eta _l} \right) \quad \text {for } l=0,1,2,\ldots , N, \end{aligned}$$
(2.21)

where \(\rho \) is the density of the material in \(\Omega \).

Without loss of generality, we decouple the inequality (2.17) into

$$\begin{aligned} \rho _0{{\mathcal {D}}}_0= {\dot{\eta }}_0 X_0\ge 0 \quad \text {and}\quad \rho _0 {{\mathcal {D}}}_M=\sum _{i=1}^N{\dot{\eta }}_iX_i\ge 0, \end{aligned}$$
(2.22)

which satisfies the original inequality (2.17). We now derive the Ginzburg–Landau equations using inequalities (2.22)\(_1\) and (2.22)\(_2\). From the inequality (2.22)\(_1\), we derive the kinetic law for \(\eta _0\) as

$$\begin{aligned} {\dot{\eta }}_0 = L_{0M} X_0, \end{aligned}$$
(2.23)

where \( L_{0M}\ge 0\) is the kinetic coefficient for \(\mathsf{{A}}\leftrightarrow \mathsf{{M}}\) PTs. In order to derive the kinetic laws for the order parameters \(\eta _1,\ldots ,\eta _N\) using the inequality (2.22)\(_2\), we introduce the variables \({\dot{\eta }}_{ij}={\dot{\eta }}_i-{\dot{\eta }}_j\) and \(X_{ij}=X_i-X_j\), using which we can verify that \({\dot{\eta }}_{ij}=-{\dot{\eta }}_{ji}\), \(X_{ij}=-X_{ji}\), \({\dot{\eta }}_{ii}=0\), and \(X_{ii}=0\) (no sum on indices). Using these expressions together with \(\sum _{j=1}^N{\dot{\eta }}_j=0\), which follows from differentiating the constraint (2.1) with respect to time, we can write

$$\begin{aligned} {\dot{\eta }}_i = \sum _{j=1}^N\frac{{\dot{\eta }}_{ij}}{N} \quad \text {for all } i=1,\ldots ,N. \end{aligned}$$
(2.24)

Using Eq. (2.24), the dissipation rate due to the evolution of the martensitic variants given by Eq. (2.22)\(_2\) is rewritten as

$$\begin{aligned} \rho _0{{\mathcal {D}}}_M= & {} \sum _{i=1}^N{\dot{\eta }}_iX_i = \sum _{i=1}^N\sum _{j=1}^NX_i\frac{{\dot{\eta }}_{ij}}{N} \qquad \text {(using Eq.~(2.24))}\nonumber \\= & {} \sum _{i=1}^N\sum _{j=1}^N \frac{X_{ij}{\dot{\eta }}_{ij}}{N} +\sum _{i=1}^N\sum _{j=1}^N \frac{X_j{\dot{\eta }}_{ij}}{N} \qquad (\text {using}\,\, X_{ij}=X_i-X_j)\nonumber \\= & {} \sum _{i=1}^N\sum _{j=1}^N \frac{X_{ij}{\dot{\eta }}_{ij}}{N}- \sum _{i=1}^N\sum _{j=1}^N \frac{X_j{\dot{\eta }}_{ji}}{N} \qquad (\text {using} \,\,{\dot{\eta }}_{ij}=-{\dot{\eta }}_{ji})\nonumber \\= & {} \sum _{i=1}^N\sum _{j=1}^N \frac{X_{ij}{\dot{\eta }}_{ij}}{N}- \sum _{i=1}^N\sum _{j=1}^N \frac{X_i{\dot{\eta }}_{ij}}{N} \qquad \text {(swapping the indices in the second term)}\nonumber \\= & {} \sum _{i=1}^N\sum _{j=1}^N \frac{X_{ij}{\dot{\eta }}_{ij}}{N}- \sum _{i=1}^N X_i{\dot{\eta }}_i \qquad \text {(using Eq.~(2.24))}. \end{aligned}$$
(2.25)

Noticing that the last term on the right-hand side of Eq. (2.25) is equal to \(-\rho _0{{\mathcal {D}}}_M\) (compare with Eq. (2.22)\(_2\)), we obtain

$$\begin{aligned} \rho _0{{\mathcal {D}}}_M= & {} \frac{1}{2}\sum _{i=1}^N\sum _{j=1}^N X_{ij}\frac{{\dot{\eta }}_{ij}}{N} = \sum _{i=1}^{N-1}\sum _{j=i+1}^N X_{ij}\frac{{\dot{\eta }}_{ij}}{N} \ge 0. \end{aligned}$$
(2.26)
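The equivalence between the direct form (2.22)\(_2\) and the pairwise form (2.26) can be verified numerically. The sketch below, with the hypothetical helper `dissipation_forms` and randomly generated sample values, assumes only that the rates satisfy \(\sum _i{\dot{\eta }}_i=0\), as implied by the constraint (2.1).

```python
import numpy as np

def dissipation_forms(eta_dot, X):
    """Return rho_0*D_M from Eq. (2.22)_2 and from the pairwise sum in Eq. (2.26)."""
    N = len(X)
    direct = float(np.dot(eta_dot, X))
    pairwise = 0.0
    for i in range(N - 1):
        for j in range(i + 1, N):
            pairwise += (X[i] - X[j]) * (eta_dot[i] - eta_dot[j]) / N
    return direct, pairwise

# Illustrative random data (assumed, not from the paper).
rng = np.random.default_rng(0)
X_example = rng.normal(size=4)
eta_dot_example = rng.normal(size=4)
eta_dot_example -= eta_dot_example.mean()   # enforce sum of rates = 0
direct, pairwise = dissipation_forms(eta_dot_example, X_example)

# Rates violating the constraint: the pairwise form no longer matches.
direct_bad, pairwise_bad = dissipation_forms(np.ones(4), X_example)
```

The two forms agree only when the rates sum to zero; for the uniform rates in the last line the pairwise sum vanishes identically while the direct sum equals \(\sum _i X_i\), which shows where the constraint enters the derivation.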

We decouple all the terms from inequality (2.26)\(_2\) and consider \(X_{ij}{{\dot{\eta }}_{ij}}/{N}\ge 0\) (no sum on indices), which satisfies the original inequality (2.26)\(_2\). Based on these decoupled inequalities, we derive the kinetic equations for each pair of variants as

$$\begin{aligned} \frac{{\dot{\eta }}_{ij}}{N} = L_{ij}(X_i-X_j), \end{aligned}$$
(2.27)

where \(L_{ij} \ge 0\) is the kinetic coefficient for transformations between \({\mathsf{{M}}}_i\) and \({\mathsf{{M}}}_j\), and it is taken in [60] as

$$ \begin{aligned} L_{ij} \left\{ \begin{array}{@{}ll@{}} \ne 0 &{} \text {if }(X_i-X_j)\ge 0 \quad \text {and} \quad \{0< \eta _i<1 \,\, \& \,\, 0<\eta _j<1\}\\ \ne 0 &{} \text {if }(X_i-X_j)\le 0 \quad \text {and} \quad \{0< \eta _i< 1 \,\, \& \,\, 0< \eta _j <1\} \\ =0 &{} \text {if }(X_i-X_j)\ge 0 \quad \text {and} \quad \{\eta _i=1 \,\,\text {or}\,\, \eta _j=0\}\\ =0 &{} \text {if }(X_i-X_j)\le 0 \quad \text {and} \quad \{ \eta _i=0 \,\,\text {or}\,\, \eta _j=1\}. \end{array}\right. \end{aligned}$$
(2.28)

Substituting Eq. (2.27) in Eq. (2.24), the Ginzburg–Landau equations for all N order parameters \(\eta _1,\ldots ,\eta _N\) are obtained as

$$\begin{aligned} {\dot{\eta }}_i = \sum _{j=1, j\ne i}^NL_{ij}(X_i-X_j) \quad \text {for all } i=1,2,\ldots ,N. \end{aligned}$$
(2.29)

We note that the kinetic coefficients \(L_{ij}\) in Eq. (2.29), defined in Eq. (2.28), are piecewise constant, jumping between their finite values and zero depending on the driving forces and the order parameters. An issue arises if the coefficients \(L_{ij}\) are assumed to be constant, as in our earlier work [20]. To see this clearly, let us consider, without loss of generality, a three-variant martensitic system (where \(\eta _0=1\)) with \(\mathsf{{M}}_1\), \(\mathsf{{M}}_2\), and \(\mathsf{{M}}_3\). Using the constraint \(\eta _1+\eta _2+\eta _3=1\), the two independent Ginzburg–Landau equations from Eq. (2.29), when expressed for \({\dot{\eta }}_1\) and \({\dot{\eta }}_2\), are given by

$$\begin{aligned} {\dot{\eta }}_1 =L_{12}(X_1-X_2)+L_{13}(X_1-X_3) \quad \text {and} \quad {\dot{\eta }}_2 =L_{12}(X_2-X_1)+L_{23}(X_2-X_3). \end{aligned}$$
(2.30)

We now consider a martensitic region where \(\mathsf{{M}}_3\) is absent and only the variants \(\mathsf{{M}}_1\) and \(\mathsf{{M}}_2\) evolve within an arbitrary time interval. Since \(\eta _1+\eta _2=1\) therein, the order parameters \(\eta _1\) and \(\eta _2\) must be determined using the equation \({\dot{\eta }}_1=-{\dot{\eta }}_2 = L_{12}(X_1-X_2)\), and this is possible if and only if \(L_{13}=L_{23}=0\) within that time interval. However, if the coefficients \(L_{13}\) and \(L_{23}\) are taken as nonzero constants, the terms \(L_{13}(X_1-X_3)\) and \(L_{23}(X_2-X_3)\) would make spurious contributions, since the driving forces \(X_1-X_3\) and \(X_2-X_3\) might be nonzero there. The desired condition can be fulfilled by \(L_{ij}\) given by Eq. (2.28), but not by the constant \(L_{ij}\) considered in [20]. The essence of the third and fourth conditions in Eq. (2.28) is that if variant i is absent, it cannot be transformed into other variants [60]. We note that other forms of \(L_{ij}\) as functions of the order parameters have been used in [79, 80].
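The three-variant argument above can be illustrated with a short sketch. The functions `kinetic_coefficient`, `constant_coefficient`, and `variant_rates`, the magnitude `L`, the tolerance `tol`, and the sample values of \(\eta _i\) and \(X_i\) are all assumptions made for illustration; states not listed in Eq. (2.28) fall through to the finite value `L` here, which is a choice of this sketch and not a statement from the paper.

```python
def kinetic_coefficient(X_i, X_j, eta_i, eta_j, L=1.0, tol=1e-12):
    """Piecewise L_ij following the third and fourth rows of Eq. (2.28)."""
    drive = X_i - X_j
    if drive >= 0 and (eta_i >= 1 - tol or eta_j <= tol):
        return 0.0   # M_i is complete or M_j is absent
    if drive <= 0 and (eta_i <= tol or eta_j >= 1 - tol):
        return 0.0   # M_i is absent or M_j is complete
    return L         # assumed finite value otherwise

def constant_coefficient(X_i, X_j, eta_i, eta_j, L=1.0):
    """Constant L_ij, as in the earlier model discussed in the text."""
    return L

def variant_rates(eta, X, coeff):
    """Right-hand side of Eq. (2.29) for every variant."""
    N = len(eta)
    rates = [0.0] * N
    for i in range(N):
        for j in range(N):
            if j != i:
                rates[i] += coeff(X[i], X[j], eta[i], eta[j]) * (X[i] - X[j])
    return rates

# Assumed state: M_3 absent, driving forces favour M_1 and M_2 over M_3.
eta_example = [0.6, 0.4, 0.0]
X_example3 = [1.0, 0.2, -0.5]
rates_piecewise = variant_rates(eta_example, X_example3, kinetic_coefficient)
rates_constant = variant_rates(eta_example, X_example3, constant_coefficient)
```

With the piecewise coefficients, \({\dot{\eta }}_3=0\) and \({\dot{\eta }}_1=-{\dot{\eta }}_2=L_{12}(X_1-X_2)\) for this state. With constant coefficients, \({\dot{\eta }}_3\) is negative although \(\mathsf{{M}}_3\) is absent, which would drive \(\eta _3\) below zero.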

2.5 Thermodynamic equilibrium and interpolation functions

We now consider the thermodynamic equilibrium conditions for the homogeneous phases \(\mathsf{{A}}\) and \(\mathsf{{M}}\) [45,46,47] and derive their implications on the explicit expressions of the energies, \(\varvec{F}_t\), and material properties introduced in Eqs. (2.6), (2.5), and (2.8), respectively. Classical thermodynamics says that a material point is in thermodynamic equilibrium, provided it is in thermal, mechanical, and phase equilibrium (see Chapters 1 and 14 of [81]). Since we consider a uniform temperature of the body, the material points are at thermal equilibrium (Chapter 1 of [81]). The state of stress \({\varvec{P}}\) (or \(\varvec{\sigma }\)) of that point satisfies Eq. (2.9)\(_1\) (or Eq. (2.9)\(_2\)) and hence, the points are in mechanical equilibrium as well. For the points from the homogeneous phases to be in phase or chemical equilibrium, the following conditions must be satisfied [45,46,47]:

$$\begin{aligned} X_0= & {} 0 \qquad \text {in }{\mathsf{{A}}} \text { and } {\mathsf{{M}}}, \quad \text {and}\nonumber \\ X_i-X_j= & {} 0 \quad \text {in }{ \mathsf{{M}}} \text { for all }i\ne j, \end{aligned}$$
(2.31)

which are obtained by considering \({\dot{\eta }}_0=0\) and \({\dot{\eta }}_i=0\) in Eqs. (2.23), (2.27), and (2.29) for any given \(\theta \), \(\varvec{F}_e\), and \({\varvec{P}}\), neglecting the terms related to the spatial derivatives of the order parameters. Equation (2.31)\(_2\) is the condition for homogeneous transformation between the variants \({\mathsf{{M}}}_i\) and \({\mathsf{{M}}}_j\) while all other variants are absent; hence, \(\eta _i+\eta _j=1\) therein.

The conditions given by Eq. (2.31)\(_1\), when used in conjunction with Eq. (2.18), yield

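$$\begin{aligned} X_0= \left( {\varvec{P}}_e^T\cdot \varvec{F}_e-J_t\psi ^e \varvec{F}_t^{-1}\right) :\frac{\partial \varvec{F}_t}{\partial \eta _0}- J_t\left. \frac{\partial \psi ^e}{\partial \eta _0}\right| _{\varvec{F}_e}- \rho _0J\frac{\partial \breve{\psi }^\theta }{\partial \eta _0} -\rho _0 \frac{\partial ({\tilde{\psi }}^\theta +\psi ^p)}{\partial \eta _0}=0 \quad \text {in }{\mathsf{{A}}} \text { and } {\mathsf{{M}}}. \end{aligned}$$
(2.32)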

Equation (2.18) when used in Eq. (2.31)\(_2\) leads to

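$$\begin{aligned} X_i-X_j= & {} \left( {\varvec{P}}_e^T\cdot \varvec{F}_e-J_t\psi ^e \varvec{F}_t^{-1}\right) :\left( \frac{\partial \varvec{F}_t}{\partial \eta _i}-\frac{\partial \varvec{F}_t}{\partial \eta _j}\right) - J_t\left. \left( \frac{\partial \psi ^e}{\partial \eta _i}-\frac{\partial \psi ^e}{\partial \eta _j}\right) \right| _{\varvec{F}_e}- \rho _0J\left( \frac{\partial \breve{\psi }^\theta }{\partial \eta _i}-\frac{\partial \breve{\psi }^\theta }{\partial \eta _j}\right) \nonumber \\{} & {} -\rho _0 \left( \frac{\partial ({\tilde{\psi }}^\theta +\psi ^p)}{\partial \eta _i}-\frac{\partial ({\tilde{\psi }}^\theta +\psi ^p)}{\partial \eta _j}\right) =0 \quad \text {in }{ \mathsf{{M}}} \text { for all }i\ne j. \end{aligned}$$
(2.33)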

We now obtain the exact expression for the interpolation functions \(\varphi (a,\eta _0)\) and \(\phi _i(\eta _i)\) using the equilibrium conditions given by Eqs. (2.32) and (2.33) within the homogeneous phases. Recalling that \(B({\hat{\eta }}_0,\theta , {{\varvec{F}}})=B_0\) and \(\varvec{F}_t(\hat{\eta }_0)={\varvec{I}}\) in \(\mathsf{{A}}\), and \(B({\hat{\eta _i}},\theta , {{\varvec{F}}})=B_i\) and \(\varvec{F}_t(\hat{\eta }_i)=\varvec{U}_{ti}\) in \({\mathsf{{M}}}_i\) by definitions (see the discussion near Eq. (2.8)), we get the following conditions on \(\varphi (a,\eta _0)\) and \(\phi _i(\eta _i)\) within the phases:

$$\begin{aligned} \varphi (a,0) =0, \quad \varphi (a,1) =1; \quad \text {and}\quad \phi _i(0) =0, \quad \phi _i(1) =1. \end{aligned}$$
(2.34)

The additional conditions on \(\varphi (a,\eta _0)\) and \(\phi _i(\eta _i)\) within the phases are obtained using Eqs. (2.5) and (2.8) in Eqs. (2.32) and (2.33) as

$$\begin{aligned}{} & {} \frac{\partial \varphi (a,0)}{\partial \eta _0}= \frac{\partial \varphi (a,1)}{\partial \eta _0}=0, \end{aligned}$$
(2.35)
$$\begin{aligned}{} & {} B_i\frac{\partial \phi _i(\eta _i=0)}{\partial \eta _i}= B_j\frac{\partial \phi _j(\eta _j=1)}{\partial \eta _j} \quad \text {and} \quad B_i\frac{\partial \phi _i(\eta _i=1)}{\partial \eta _i}= B_j\frac{\partial \phi _j(\eta _j=0)}{\partial \eta _j}\quad \text {for all }i\ne j. \end{aligned}$$
(2.36)

Since the properties of the variants can be different (such as elastic constants, Bain strains, etc.), we conclude from Eqs. (2.36)\(_{1,2}\) that, for these equations to hold for all possible combinations of the properties \(B_i\) and \(B_j\), \(\phi _i\) must satisfy

$$\begin{aligned} \frac{\partial \phi _i(\eta _i=0)}{\partial \eta _i}= \frac{\partial \phi _i(\eta _i=1)}{\partial \eta _i}=0 \quad \text {for all }i=1,2,\ldots ,N. \end{aligned}$$
(2.37)

If the conditions given by Eq. (2.37) are met, Eqs. (2.36)\(_1\) and (2.36)\(_2\) are automatically satisfied, including the case when the properties of the variants are the same (\(B_i=B_j\)). Considering the general polynomials of degree four for both the interpolation functions \(\varphi (a,\eta _0)\) and \(\phi _i(\eta _i)\), and then applying the conditions given by Eqs. (2.34), (2.35), and (2.37), we obtain their expressions as

$$\begin{aligned} \varphi (a, \eta _0) =a\eta _0^2+(4-2a)\eta _0^3 +(a-3)\eta _0^4, \quad \text {and } \quad \phi _i(\eta _i)= \eta _i^2(3-2 \eta _i) \quad \text {for all }i=1,2,\ldots ,N, \qquad \end{aligned}$$
(2.38)

where \(0\le a\le 6\) [45].
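The conditions in Eqs. (2.34), (2.35), and (2.37) can be checked directly for the polynomials in Eq. (2.38). The following is a minimal sketch using exact rational arithmetic; the function names are chosen here for illustration and are not taken from the paper. It also checks, on a grid, that \(\varphi (a,\cdot )\) is non-decreasing on \([0,1]\), which follows from the factorization \(\partial \varphi /\partial \eta _0=\eta _0(1-\eta _0)[2a+(12-4a)\eta _0]\) and is consistent with the bound \(0\le a\le 6\).

```python
from fractions import Fraction


def varphi(a, eta0):
    """Quartic interpolation function of Eq. (2.38)_1."""
    return a * eta0**2 + (4 - 2 * a) * eta0**3 + (a - 3) * eta0**4


def dvarphi(a, eta0):
    """Derivative of varphi with respect to eta0."""
    return 2 * a * eta0 + 3 * (4 - 2 * a) * eta0**2 + 4 * (a - 3) * eta0**3


def phi(eta):
    """Cubic interpolation function of Eq. (2.38)_2."""
    return eta**2 * (3 - 2 * eta)


def dphi(eta):
    """Derivative of phi with respect to eta."""
    return 6 * eta * (1 - eta)


def check_conditions(a):
    """True if Eqs. (2.34), (2.35), and (2.37) hold exactly for this a."""
    zero, one = Fraction(0), Fraction(1)
    return (
        varphi(a, zero) == 0 and varphi(a, one) == 1
        and dvarphi(a, zero) == 0 and dvarphi(a, one) == 0
        and phi(zero) == 0 and phi(one) == 1
        and dphi(zero) == 0 and dphi(one) == 0
    )


def is_monotone(a, n=200):
    """Grid check that varphi(a, .) is non-decreasing on [0, 1]."""
    return all(dvarphi(a, Fraction(k, n)) >= 0 for k in range(n + 1))
```

For \(a=3\) the quartic term drops out and \(\varphi (3,\eta )=\eta ^2(3-2\eta )\), i.e., it coincides with \(\phi _i\).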

2.6 Explicit form of free energy

We now write the explicit expressions for all the energies introduced in Eqs. (2.6) and (2.7) below:

2.6.1 Strain energy

We consider \(\psi ^e\) as a quadratic function of the elastic strain tensor \(\varvec{E}_e\) [20]

$$\begin{aligned} \psi ^e =0.5 \varvec{E}_e:\hat{\varvec{{\mathcal {C}}}}_e(\eta _0,{\tilde{\eta }}_M):\varvec{E}_e, \end{aligned}$$
(2.39)

where the fourth-order elastic modulus tensor at any material point is taken following Eq. (2.8) as [20]

$$\begin{aligned} \hat{\varvec{{\mathcal {C}}}}_e(\eta _0,{\tilde{\eta }}_M) = (1-\varphi (a,\eta _0)) \hat{\varvec{{\mathcal {C}}}}_{(e)0} + \varphi (a,\eta _0)\sum _{i=1}^N \phi _i (\eta _i) \hat{\varvec{{\mathcal {C}}}}_{(e)i}, \end{aligned}$$
(2.40)

and \(\hat{\varvec{{\mathcal {C}}}}_{(e)0}\) and \(\hat{\varvec{{\mathcal {C}}}}_{(e)i}\) are the fourth-order elastic modulus tensors of \(\mathsf{{A}}\) and \({{\textsf{M}}}_i\), respectively.
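To illustrate how the interpolation in Eq. (2.40) behaves, the sketch below applies it to a single scalar stiffness instead of the full fourth-order tensor. This scalar reduction, the default \(a=3\), and all numerical values are assumptions made only for this illustration.

```python
def varphi(a, eta0):
    """Quartic interpolation function of Eq. (2.38)_1."""
    return a * eta0**2 + (4 - 2 * a) * eta0**3 + (a - 3) * eta0**4


def phi(eta):
    """Cubic interpolation function of Eq. (2.38)_2."""
    return eta**2 * (3 - 2 * eta)


def effective_modulus(eta0, etas, c_austenite, c_variants, a=3.0):
    """Scalar analogue of Eq. (2.40).

    eta0        : austenite-martensite order parameter (0 = A, 1 = M)
    etas        : variant order parameters eta_1..eta_N
    c_austenite : hypothetical scalar stiffness of A
    c_variants  : hypothetical scalar stiffnesses of M_1..M_N
    """
    w = varphi(a, eta0)
    martensite_part = sum(phi(e) * c for e, c in zip(etas, c_variants))
    return (1.0 - w) * c_austenite + w * martensite_part
```

Along the straight path \(\eta _i+\eta _j=1\) the weights satisfy \(\phi _i(\eta _i)+\phi _j(1-\eta _i)=1\), so two variants with equal stiffness give that same stiffness everywhere across their interface.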

2.6.2 Barrier energy

The total energy of the barriers between \(\mathsf{{A}}\) and \(\mathsf{{M}}\) and between all the variants is [20]

$$\begin{aligned} \breve{\psi }^{\theta } = A_{0M}\,\eta _0^2 (1-\eta _0)^2 + \varphi (a_b, \eta _0) {\bar{A}}\sum _{i=1}^{N-1}\sum _{j=i+1}^N \eta _i^2\, \eta _j^2 , \end{aligned}$$
(2.41)

where \(A_{0M}\) and \({\bar{A}}\) are the coefficients for the barrier energies between \({\mathsf{{A}}}\) and \({\mathsf{{M}}}\), and \({\mathsf{{M}}}_i\) and \({\mathsf{{M}}}_j\) (for all \(i\ne j\)), respectively, and the expression for \(\varphi (a_b,\eta _0)\) is given by Eq. (2.38)\(_1\). We note that the barrier energy between \(\mathsf{{A}}\) and \(\mathsf{{M}}\) (first term of Eq. (2.41)) and the barrier energies between the variants (terms within the summation) satisfy the requirements for the thermodynamic equilibrium conditions (see Eqs. (2.32)\(_{2,4}\)).

2.6.3 Thermal energy

The thermal energy of a particle undergoing \({\mathsf{{A}}}\leftrightarrow {\mathsf{{M}}}\) PTs is taken as [20, 45,46,47]

$$\begin{aligned} \tilde{\psi }^\theta = \psi _{0}^{\theta }(\theta ) + \varphi (a_\theta ,\eta _0) \, \Delta \psi ^{\theta } (\theta ), \quad \text {where } \Delta \psi ^\theta = -\Delta s_{0M} (\theta -\theta _e), \end{aligned}$$
(2.42)

\(\psi _{0}^{\theta }\) is the specific thermal energy of \(\mathsf{{A}}\), \(\Delta \psi ^{\theta }=\psi ^\theta _M-\psi ^\theta _0\) is the specific thermal energy difference between \(\mathsf{{A}}\) and \(\mathsf{{M}}\) phases, \(\Delta s_{0M}=s_M-s_0\), \(s_0\) and \(s_M\) are the specific entropies of \(\mathsf{{A}}\) and \(\mathsf{{M}}\), respectively, and \(\theta _e\) is the thermodynamic equilibrium temperature between \(\mathsf{{A}}\) and \(\mathsf{{M}}\) phases. The interpolation function \(\varphi (a_\theta ,\eta _0)\) is given by Eq. (2.38)\(_1\).
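A small numerical sketch of Eq. (2.42) is given below. It treats \(\psi _0^\theta \) as a given number at the current temperature, and the parameter values in the usage note are hypothetical; the sign \(\Delta s_{0M}<0\) is assumed, as expected for a low-temperature product phase.

```python
def varphi(a, eta0):
    """Quartic interpolation function of Eq. (2.38)_1."""
    return a * eta0**2 + (4 - 2 * a) * eta0**3 + (a - 3) * eta0**4


def thermal_energy(eta0, theta, theta_e, delta_s, psi0=0.0, a_theta=3.0):
    """Eq. (2.42): psi0 + varphi(a_theta, eta0) * delta_psi.

    delta_s is the entropy jump s_M - s_0, and
    delta_psi = -delta_s * (theta - theta_e).
    psi0 stands for the austenite thermal energy at temperature theta.
    """
    delta_psi = -delta_s * (theta - theta_e)
    return psi0 + varphi(a_theta, eta0) * delta_psi
```

With `delta_s = -2.0` and `theta_e = 300.0`, the call `thermal_energy(1.0, 290.0, 300.0, -2.0)` is negative, so below \(\theta _e\) the martensite has the lower thermal energy, while at \(\theta =\theta _e\) both phases have the same thermal energy.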

2.6.4 Penalization energy

We penalize the triple and higher junctions between all the phases and the deviations of the transformation paths between the variants using [20]

$$\begin{aligned} \psi ^p= & {} \varphi (a_K, \eta _0)\sum _{i=1}^{N-1}\sum _{j=i+1}^N K_{ij}( \eta _i+\eta _j-1)^2\eta _i^2\eta _j^2 +[1-\varphi (a_K, \eta _0)] \sum _{i=1}^{N-1}\sum _{j=i+1}^N K_{0ij} \eta _0^2\eta _i^2\eta _j^2\nonumber \\{} & {} +\varphi (a_K, \eta _0) \sum _{i=1}^{N-2}\sum _{j=i+1}^{N-1}\sum _{k=j+1}^N K_{ijk} \eta _i^2\eta _j^2\eta _k^2+ [1-\varphi (a_K, \eta _0)] \sum _{i=1}^{N-2}\sum _{j=i+1}^{N-1}\sum _{k=j+1}^N K_{0ijk} \eta _0^2\eta _i^2\eta _j^2\eta _k^2 \nonumber \\{} & {} +\varphi (a_K, \eta _0) \sum _{i=1}^{N-3}\sum _{j=i+1}^{N-2}\sum _{k=j+1}^{N-1}\sum _{l=k+1}^N K_{ijkl} \eta _i^2\eta _j^2\eta _k^2\eta _l^2, \quad \text {where} \end{aligned}$$
(2.43)

the penalty coefficients satisfy the following symmetries (no summation on the indices):

$$\begin{aligned}{} & {} K_{ij}=K_{ji}; \quad K_{0ij}=K_{0ji}; \quad K_{0jik}= K_{0ijk}=K_{0jki}=K_{0kji}=K_{0ikj}=K_{0kij}; \nonumber \\{} & {} K_{ijk}=K_{jik}=K_{jki}=K_{kji}=K_{ikj}=K_{kij}; \quad K_{ijkl}=K_{jikl}=K_{ljki}=K_{kjil}=K_{ikjl}=K_{ilkj}=K_{ijlk}; \nonumber \\ K_{ii}= & {} K_{0ii}=K_{iji}=K_{iik}= K_{0iji}=K_{iikl}= K_{ijjl}=K_{0iik}=K_{ijil}=K_{ijki}= K_{ijkk} =0. \end{aligned}$$
(2.44)

The interpolation function \(\varphi (a_K,\eta _0)\) is given by Eq. (2.38)\(_1\). We can verify that \(\psi ^p\) given by Eq. (2.43) and its first derivatives with respect to the order parameters satisfy the requirements of the thermodynamic equilibrium conditions of the phases discussed in Sect. 2.5. In Eq. (2.43), the parameter \(K_{ij}\ge 0\) controls the penalization of the deviation of the \({\mathsf{{M}}}_j\leftrightarrow {\mathsf{{M}}}_i\) transformation path from the straight line \(\eta _j+\eta _i=1\), with \(\eta _k=0\) for all \(k\ne j,i\). For very large values of \(K_{ij}\) (\(\rightarrow \infty \)), the transformation path for the \({\mathsf{{M}}}_j\leftrightarrow {\mathsf{{M}}}_i\) PT coincides with the straight line \(\eta _j+\eta _i=1\), and the spurious phases \({\mathsf{{M}}}_k\) for \(k=1,\ldots , N\) and \( k\ne i,j\) do not arise within the \({\mathsf{{M}}}_i\)-\({\mathsf{{M}}}_j\) boundary. Smaller values of \(K_{ij}\) allow some deviation of the transformation paths from those straight lines. As a result, spurious phases will exist across the variant–variant interfaces. The second term in Eq. (2.43) penalizes the triple junctions between \({\mathsf{{A}}}-{\mathsf{{M}}}_i-{\mathsf{{M}}}_j\) (for \(i,j=1,\ldots ,N; i\ne j\)), and by varying the constant parameter \(K_{0ij}\) we can control the size and excess energy of the triple junction regions; see, e.g., [52, 54] for details and numerical examples of the role of such a term in grain-boundary-related phenomena. Similarly, the third, fourth, and fifth terms of Eq. (2.43) penalize the junctions between \({\mathsf{{M}}}_i-{\mathsf{{M}}}_j-{\mathsf{{M}}}_k\), \({\mathsf{{A}}}-{\mathsf{{M}}}_i-{\mathsf{{M}}}_j-{\mathsf{{M}}}_k\), and \({\mathsf{{M}}}_i-{\mathsf{{M}}}_j-{\mathsf{{M}}}_k-{\mathsf{{M}}}_l\), respectively. The constant coefficients \(K_{0ij} \ge 0\), \(K_{ijk} \ge 0\), \(K_{0ijk}\ge 0\), and \(K_{ijkl}\ge 0\) are the control parameters that determine the size and excess energy of the respective junction regions. In the second and fourth terms of \(\psi ^p\), the multiplication factor \(1-\varphi (a_K,\eta _0)\) ensures that these terms are nonzero only around the junctions between \(\mathsf{{A}}\) and the variants and vanish within the martensite.
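The structure of Eq. (2.43) can be sketched in a few lines. The version below is a simplification that assumes one uniform coefficient per term (`K2`, `K02`, `K3`, `K03`, `K4` standing for \(K_{ij}\), \(K_{0ij}\), \(K_{ijk}\), \(K_{0ijk}\), \(K_{ijkl}\)), which is not required by the model; the names and the default \(a_K=3\) are illustrative only.

```python
from itertools import combinations


def varphi(a, eta0):
    """Quartic interpolation function of Eq. (2.38)_1."""
    return a * eta0**2 + (4 - 2 * a) * eta0**3 + (a - 3) * eta0**4


def penalty_energy(eta0, etas, K2, K02, K3, K03, K4, a_K=3.0):
    """Eq. (2.43) with a single uniform coefficient per term (assumption)."""
    w = varphi(a_K, eta0)
    sq = [e * e for e in etas]
    idx = range(len(etas))
    # First term: deviation from the straight path eta_i + eta_j = 1.
    pair_path = sum((etas[i] + etas[j] - 1.0) ** 2 * sq[i] * sq[j]
                    for i, j in combinations(idx, 2))
    # Products entering the junction terms.
    pairs = sum(sq[i] * sq[j] for i, j in combinations(idx, 2))
    triples = sum(sq[i] * sq[j] * sq[k] for i, j, k in combinations(idx, 3))
    quads = sum(sq[i] * sq[j] * sq[k] * sq[m]
                for i, j, k, m in combinations(idx, 4))
    return (w * K2 * pair_path
            + (1.0 - w) * K02 * eta0**2 * pairs
            + w * K3 * triples
            + (1.0 - w) * K03 * eta0**2 * triples
            + w * K4 * quads)
```

In this sketch the energy vanishes in pure austenite, in any pure variant, and along a straight variant–variant path inside martensite, and it becomes positive once three variants overlap.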

Remark 1. In the absence of all the penalty terms, i.e., when \(K_{ij}=K_{0ij}=K_{0ik}=K_{0jk}=K_{ijk}=K_{0ijk}=0\), we can show that for a martensitic region (\(\eta _0=1\)) with three variants, say, \(\mathsf{{M}}_1\), \(\mathsf{{M}}_2\) and \(\mathsf{{M}}_3\), the specific barrier energy (see Eq. (2.41)) at the center of the triple junction region, i.e., at the point with \(\eta _1=\eta _2=\eta _3=1/3\) is \(3\times {\bar{A}}/81={\bar{A}}/27\) which is less than the specific barrier energy \({\bar{A}}/16\) at the middle line of any variant-variant interface (a line with, say, \(\eta _1=\eta _2=1/2\) and \(\eta _3=0\)). When \(K_{123}\ne 0\), the total specific energy at a martensitic particle with \(\eta _1=\eta _2=\eta _3=1/3\) is

$$\begin{aligned} \left. E_{TJ}\right| _{\eta _1=\eta _2=\eta _3=1/3}=\frac{{\bar{A}}}{27}+\frac{K_{123}}{729}. \end{aligned}$$
(2.45)

It is to be noted that Tóth et al. [79] and Bollada et al. [80] considered a barrier energy similar to ours given by Eq. (2.41), but with a common multiplication factor that yields a higher energy at the junction region than at the respective interface regions. In this paper, we have instead followed a different and simpler approach: we have introduced the penalty terms in the free energy, and by varying the coefficients \(K_{0ij},\,K_{0ijk},\,K_{ijk},\) and \(K_{ijkl}\) we can control the energy and size of all the junction regions. For example, by tuning the parameter \(K_{123}\) in Eq. (2.45), we can make the barrier energy at the junction region higher than the barrier energy in the interfacial region. However, a quantitative comparison between our formulation and the approach in [79, 80] is not given here.
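The numbers quoted in Remark 1 can be reproduced with exact fractions. The sketch below evaluates the variant–variant part of Eq. (2.41) at \(\eta _0=1\) (where \(\varphi (a_b,1)=1\)) and the \(K_{123}\) term of Eq. (2.43), assuming as in the remark that all other penalty coefficients are zero. The helper that returns the threshold value of \(K_{123}\) is an illustration derived from Eq. (2.45), not a formula stated in the paper.

```python
from fractions import Fraction
from itertools import combinations


def variant_barrier(etas, A_bar):
    """Variant-variant part of Eq. (2.41) at eta0 = 1."""
    return A_bar * sum(x * x * y * y for x, y in combinations(etas, 2))


def triple_penalty(etas, K123):
    """K_123 term of Eq. (2.43) at eta0 = 1 for a three-variant system."""
    return K123 * etas[0] ** 2 * etas[1] ** 2 * etas[2] ** 2


def junction_energy(A_bar, K123):
    """Eq. (2.45): energy at eta_1 = eta_2 = eta_3 = 1/3."""
    etas = [Fraction(1, 3)] * 3
    return variant_barrier(etas, A_bar) + triple_penalty(etas, K123)


def k123_threshold(A_bar):
    """K_123 at which the junction energy equals the interface value A_bar/16."""
    return 729 * (Fraction(1, 16) - Fraction(1, 27)) * A_bar
```

Under these assumptions the threshold evaluates to \(K_{123}=297{\bar{A}}/16\approx 18.6\,{\bar{A}}\); larger values make the junction energy exceed \({\bar{A}}/16\).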

2.6.5 Gradient energy

We consider the gradient energy, considering all the interfacial energies, as

$$\begin{aligned} \psi ^\nabla = \frac{\beta _{0M}}{2\rho _0} {\left| {\nabla \eta _0} \right| ^2} +\frac{1}{2\rho _0}{\tilde{\varphi }}(\eta _0 ,a_\beta , a_c) \sum _{i=1}^{N-1}\sum _{j=i+1}^N {\beta _{ij}}\nabla \eta _i\cdot \nabla \eta _j, \end{aligned}$$
(2.46)

where \(\beta _{0M}\) and \(\beta _{ij}=\beta _{ji}\) are the gradient energy coefficients for \({\mathsf{{A}}}-{\mathsf{{M}}}\) and \({\mathsf{{M}}}_i-{\mathsf{{M}}}_j\) interfaces, respectively. The interpolation function \({\tilde{\varphi }}\) is taken as [20]

$$\begin{aligned} {\tilde{\varphi }}(a_\beta , a_c,\eta _0) = a_c+{a_\beta }{\eta _0^2} -2[{a_\beta }-2(1-a_c)]{\eta _0^3}+ [{a_\beta }- 3(1-a_c)]{\eta _0^4}, \end{aligned}$$
(2.47)

where the constant is taken as \(0<a_c\ll 1\), and the purpose of considering it in Eq. (2.47) is discussed in [20]. When \(a_c=0\), note that \( {\tilde{\varphi }}(a_\beta , a_c=0,\eta _0)=\varphi (a_\beta ,\eta _0)\). Here also, we take \(0\le a_\beta \le 6\). The gradient energy similar to Eq. (2.46) was earlier used in [58, 65], for example. Note that the coefficients \(\beta _{ij}\) in Eq. (2.46) for the variant pairs in twin relationships would be much smaller than that for the variant pairs not in twin relationships.
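The stated properties of \({\tilde{\varphi }}\) in Eq. (2.47) are easy to confirm with exact arithmetic. In this sketch (function names are illustrative), \({\tilde{\varphi }}\) reduces to \(\varphi \) for \(a_c=0\), equals \(a_c\) at \(\eta _0=0\), and equals 1 at \(\eta _0=1\).

```python
from fractions import Fraction


def varphi(a, eta0):
    """Quartic interpolation function of Eq. (2.38)_1."""
    return a * eta0**2 + (4 - 2 * a) * eta0**3 + (a - 3) * eta0**4


def varphi_tilde(a_beta, a_c, eta0):
    """Interpolation function of Eq. (2.47)."""
    return (a_c + a_beta * eta0**2
            - 2 * (a_beta - 2 * (1 - a_c)) * eta0**3
            + (a_beta - 3 * (1 - a_c)) * eta0**4)


def end_values(a_beta, a_c):
    """Values of varphi_tilde at eta0 = 0 and eta0 = 1."""
    return varphi_tilde(a_beta, a_c, Fraction(0)), varphi_tilde(a_beta, a_c, Fraction(1))
```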

Remark 2. Notably, the authors earlier introduced another form of the gradient energy in [20], given by

$$\begin{aligned} \psi ^\nabla = \frac{1}{2\rho _0} \left[ {\beta _{0M}}{\left| {\nabla \eta _0} \right| ^2} + \sum _{i=1}^{N}\sum _{j=1, \ne i}^N \frac{\beta _{ij}}{8} |\nabla \eta _i-\nabla \eta _j|^2{\tilde{\varphi }}(\eta _0 ,a_\beta , a_c)\right] . \end{aligned}$$
(2.48)

This energy given by Eq. (2.48) simplifies to

$$\begin{aligned} \psi ^\nabla =\frac{1}{2\rho _0}{\beta _{0M}}{\left| {\nabla \eta _0} \right| ^2} +\frac{{\tilde{\varphi }}(\eta _0 ,a_\beta , a_c)}{16\rho _0} \left( \beta _{12}{\left| \nabla \eta _1-\nabla \eta _2 \right| ^2}+\beta _{21}{\left| \nabla \eta _2-\nabla \eta _1 \right| ^2} \right) , \end{aligned}$$
(2.49)

for a system with two variants. Applying the constraint \(\eta _1+\eta _2=1\) and \(\beta _{12}=\beta _{21}\) due to the symmetry [73] in Eq. (2.49), we further simplify it to

$$\begin{aligned} \psi ^\nabla =\frac{1}{2\rho _0}{\beta _{0M}}{\left| {\nabla \eta _0} \right| ^2} +\frac{{\tilde{\varphi }}(\eta _0 ,a_\beta , a_c)}{2\rho _0} \beta _{12}{\left| \nabla \eta _1\right| ^2}, \end{aligned}$$
(2.50)

which is consistent with the results of earlier models; see, e.g., [73] and the references therein. For a system with three variants, Eq. (2.48) reduces to

$$\begin{aligned} \psi ^\nabla= & {} \frac{{\beta _{0M}}}{2\rho _0} {\left| {\nabla \eta _0} \right| ^2} +\frac{1}{8\rho _0}\left[ (\beta _{12}+\beta _{13})\left| {\nabla \eta _1} \right| ^2+(\beta _{12}+\beta _{23})\left| {\nabla \eta _2} \right| ^2 +(\beta _{13}+\beta _{23})\left| {\nabla \eta _3} \right| ^2 \right. \nonumber \\{} & {} \left. - 2\,(\beta _{12}\nabla \eta _1\cdot \nabla \eta _2+\beta _{23}\nabla \eta _2\cdot \nabla \eta _3+ \beta _{13}\nabla \eta _1\cdot \nabla \eta _3)\right] {\tilde{\varphi }}(\eta _0 ,a_\beta , a_c). \end{aligned}$$
(2.51)

Let us consider a region for such a three-variant system where only \(\mathsf{{M}}_1\) and \(\mathsf{{M}}_2\) coexist and \(\mathsf{{M}}_3\) is absent, i.e., \(\eta _3=0\) and \(\eta _1+\eta _2=1\). The gradient energy given by Eq. (2.51) in that region is rewritten by applying these conditions as

$$\begin{aligned} \psi ^\nabla= & {} \frac{{\beta _{0M}}}{2\rho _0} {| {\nabla \eta _0} |^2} +\frac{1}{8\rho _0} \left( \beta _{23}+\beta _{13}+4\beta _{12}\right) | {\nabla \eta _1} |^2{\tilde{\varphi }}\left( \eta _0 ,a_\beta , a_c\right) . \end{aligned}$$
(2.52)

The energy parameters \(\beta _{23}\) and \(\beta _{13}\) thus influence the interfacial energy between \(\mathsf{{M}}_1\) and \(\mathsf{{M}}_2\) variants, which is nonphysical; see also [80] for an analysis. In other words, the gradient energy given by Eq. (2.48) yields a nonphysical contribution to an interface between two variants from gradient coefficients that are not related to that interface. Hence, this form of gradient energy is not acceptable for systems with more than two variants. However, the numerical results of [20] and the subsequent papers [50, 82], where the same model was used to simulate martensitic microstructures with two variants, are correct, since Eq. (2.48) reduces to the consistent form given by Eq. (2.50) in that case. The energy given by Eq. (2.46) used in this paper is non-contradictory for any number of variants.
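The reduction from Eq. (2.48) to Eq. (2.52) can be verified numerically. The sketch below evaluates only the variant part of Eq. (2.48) for arbitrary gradient vectors and compares it with the variant part of Eq. (2.52) when \(\nabla \eta _2=-\nabla \eta _1\) and \(\nabla \eta _3={\varvec{0}}\). The values \({\tilde{\varphi }}=1\) and \(\rho _0=1\), and the coefficients used in the usage note, are hypothetical.

```python
def dot(u, v):
    """Euclidean inner product of two vectors given as sequences."""
    return sum(a * b for a, b in zip(u, v))


def old_variant_gradient_energy(grads, beta, phi_tilde=1.0, rho0=1.0):
    """Variant part of Eq. (2.48): sum over ordered pairs i != j."""
    n = len(grads)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                diff = [a - b for a, b in zip(grads[i], grads[j])]
                total += beta[i][j] / 8.0 * dot(diff, diff)
    return phi_tilde * total / (2.0 * rho0)


def reduced_energy_m1_m2(g1, beta, phi_tilde=1.0, rho0=1.0):
    """Variant part of Eq. (2.52) for an M_1-M_2 interface (indices are 0-based)."""
    coeff = beta[1][2] + beta[0][2] + 4.0 * beta[0][1]
    return phi_tilde / (8.0 * rho0) * coeff * dot(g1, g1)
```

For example, with `g1 = [0.3, -1.2, 0.5]` and a symmetric `beta` with \(\beta _{12}=2\), \(\beta _{13}=5\), \(\beta _{23}=7\), both functions return the same value, and changing \(\beta _{13}\) or \(\beta _{23}\) changes the \(\mathsf{{M}}_1\)-\(\mathsf{{M}}_2\) energy even though \(\mathsf{{M}}_3\) is absent.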

2.7 Explicit form of structural stresses and phase-field equations

Here, we write down the explicit form of the structural stresses and the Ginzburg–Landau equations using the explicit form of the energies derived in Sect. 2.6. The boundary conditions for the mechanics and phase-field equations used in the computations are also listed.

2.7.1 Explicit form of structural stresses

Using the gradient energy given by Eq. (2.46) in Eqs. (2.11) and (2.14), the structural stresses are obtained as

$$\begin{aligned} {{\varvec{P}}}_{st}= & {} J\rho _0(\breve{\psi }^{\theta }+\psi ^\nabla ){\varvec{F}}^{-T}-J\beta _{0M}\nabla \eta _0\otimes \nabla \eta _0 \cdot {\varvec{F}}^{-T}- \frac{J{\tilde{\varphi }}}{2}\left( \sum _{i=1}^{N}\sum _{j=1, j\ne i}^N\beta _{ij}\nabla \eta _i\otimes \nabla \eta _j\right) \cdot {\varvec{F}}^{-T}, \quad \text {and} \nonumber \\ \varvec{\sigma }_{st}= & {} \rho _0(\breve{\psi }^{\theta }+\psi ^\nabla ){\varvec{I}}-\beta _{0M}\nabla \eta _0\otimes \nabla \eta _0 -\frac{{\tilde{\varphi }}}{2}\sum _{i=1}^{N}\sum _{j=1, j\ne i}^N\beta _{ij} \nabla \eta _i\otimes \nabla \eta _j. \end{aligned}$$
(2.53)

The elastic stresses are derived in Eqs. (2.15)\(_{1,2}\).

2.7.2 Explicit form of Ginzburg–Landau equations

Using Eqs. (2.39), (2.41), (2.42), (2.43), and (2.46) in Eqs. (2.18) and (2.20), we get the conjugate forces \(X_0\) and \(X_i\) (for all \(i=1,\ldots ,N\)) when the field variables and space derivatives are expressed in \(\Omega _0\) as

$$\begin{aligned} X_0= & {} \left( {\varvec{P}}_e^T\cdot \varvec{F}-J_t\psi ^e{\varvec{I}}\right) :\varvec{F}_t^{-1}\cdot \frac{\partial \varvec{F}_t}{\partial \eta _0}- J_t \left. \frac{\partial \psi ^e}{\partial \eta _0}\right| _{\varvec{F}_e}- \rho _0\frac{\partial \varphi (a_\theta ,\eta _0)}{\partial \eta _0}\Delta \psi ^\theta \nonumber \\{} & {} \quad -J\rho _0{\bar{A}}\sum _{i=1}^{N-1} \sum _{j=i+1}^N\eta _i^2\eta _j^2\frac{\partial \varphi (a_b,\eta _0)}{\partial \eta _0}\nonumber \\{} & {} -J\rho _0 A_{0M}(\theta ) (2\eta _0-6\eta _0^2+4\eta _0^3) -\frac{J}{2}\frac{\partial {\tilde{\varphi }}(a_\beta ,a_c,\eta _0)}{\partial \eta _0}\sum _{i=1}^{N-1}\sum _{j=i+1}^N\beta _{ij}(\varvec{C}^{-1}\cdot \nabla _0\eta _i)\cdot \nabla _0\eta _j \nonumber \\{} & {} -\rho _0\left( \sum _{i=1}^{N-1}\sum _{j=i+1}^N K_{0ij}\eta _i^2\eta _j^2+ \sum _{i=1}^{N-2}\sum _{j=i+1}^{N-1} \sum _{k=j+1}^N K_{0ijk}\eta _i^2\eta _j^2\eta _k^2\right) \left[ 2(1-\varphi (a_K,\eta _0))\eta _0-\frac{\partial \varphi (a_K,\eta _0)}{\partial \eta _0}\eta _0^2\right] \nonumber \\{} & {} -\rho _0\frac{\partial \varphi (a_K, \eta _0)}{\partial \eta _0}\left[ \sum _{i=1}^{N-1}\sum _{j=i+1}^N K_{ij}( \eta _i+\eta _j-1)^2\eta _i^2\eta _j^2 + \sum _{i=1}^{N-2}\sum _{j=i+1}^{N-1}\sum _{k=j+1}^{N}K_{ijk}\eta _i^2\eta _j^2\eta _k^2 \right. \nonumber \\{} & {} \left. + \sum _{i=1}^{N-3}\sum _{j=i+1}^{N-2}\sum _{k=j+1}^{N-1}\sum _{l=k+1}^N K_{ijkl} \eta _i^2\eta _j^2\eta _k^2\eta _l^2\right] + \nabla _0\cdot \left( J\beta _{0M}\varvec{C}^{-1}\cdot \nabla _0\eta _0\right) ; \end{aligned}$$
(2.54)
$$\begin{aligned} X_i= & {} \left( {\varvec{P}}_e^T\cdot \varvec{F}-J_t\psi ^e{\varvec{I}}\right) :\varvec{F}_t^{-1}\cdot \frac{\partial \varvec{F}_t}{\partial \eta _i} -J_t \left. \frac{\partial \psi ^e}{\partial \eta _i}\right| _{\varvec{F}_e}\nonumber \\{} & {} \quad -2J\rho _0{\bar{A}}\sum _{j=1,\ne i}^N\eta _i\eta _j^2\varphi (a_b,\eta _0)-2\rho _0 \sum _{j=1,\ne i}^N K_{ij}(\eta _i+\eta _j-1) \nonumber \\{} & {} \times (2\eta _i+\eta _j-1)\eta _j^2\eta _i \varphi (a_K,\eta _0)-2\rho _0\left( \sum _{j=1,\ne i}^N K_{0ij}\eta _j^2 +\sum _{j=1,\ne i}^{N-1}\sum _{k=j+1}^N K_{0ijk}\eta _j^2\eta _k^2 \right) \eta _0^2\eta _i(1-\varphi (a_K,\eta _0)) \nonumber \\{} & {} -2\rho _0\varphi (a_K,\eta _0) \sum _{j=1,\ne i}^{N-1}\sum _{k=j+1}^N K_{ijk}\eta _i\eta _j^2\eta _k^2 - 2\rho _0\varphi (a_K,\eta _0)\sum _{j=1,\ne i}^{N-2}\sum _{k=j+1}^{N-1}\sum _{l=k+1}^N K_{ijkl}\eta _i\eta _j^2\eta _k^2\eta _l^2 \nonumber \\{} & {} +0.5 \nabla _0\cdot \left( {\tilde{\varphi }}(a_\beta ,a_c,\eta _0)J\sum _{j=1,\ne i}^N \beta _{ij}\varvec{C}^{-1}\cdot \nabla _0\eta _j\right) \quad \text {for all }i=1,2,3,\ldots ,N. \end{aligned}$$
(2.55)

The conjugate forces can alternatively be rewritten in terms of the Cauchy stress and the spatial derivatives in \(\Omega \) using Eqs. (2.18) and (2.21) as

$$\begin{aligned} \frac{X_0}{J}= & {} \left( \varvec{F}^{-1}\cdot \varvec{\sigma }_e\cdot \varvec{F}-\frac{\psi ^e}{J_e}{\varvec{I}}\right) :\varvec{F}_t^{-1}\cdot \frac{\partial \varvec{F}_t}{\partial \eta _0}- \frac{1}{J_e} \left. \frac{\partial \psi ^e}{\partial \eta _0}\right| _{\varvec{F}_e}- \rho \frac{\partial \varphi (a_\theta ,\eta _0)}{\partial \eta _0}\Delta \psi ^\theta \nonumber \\{} & {} \quad -J\rho {\bar{A}}\sum _{i=1}^{N-1} \sum _{j=i+1}^N\eta _i^2\eta _j^2\frac{\partial \varphi (a_b,\eta _0)}{\partial \eta _0} \nonumber \\{} & {} -J\rho A_{0M}(\theta ) (2\eta _0-6\eta _0^2+4\eta _0^3) -\frac{1}{2}\frac{\partial {\tilde{\varphi }}(a_\beta ,a_c,\eta _0)}{\partial \eta _0}\sum _{i=1}^{N-1}\sum _{j=i+1}^N\beta _{ij}\nabla \eta _i\cdot \nabla \eta _j \nonumber \\{} & {} -\rho \left( \sum _{i=1}^{N-1}\sum _{j=i+1}^N K_{0ij}\eta _i^2\eta _j^2+ \sum _{i=1}^{N-2}\sum _{j=i+1}^{N-1} \sum _{k=j+1}^N K_{0ijk}\eta _i^2\eta _j^2\eta _k^2\right) \left[ 2(1-\varphi (a_K,\eta _0))\eta _0-\frac{\partial \varphi (a_K,\eta _0)}{\partial \eta _0}\eta _0^2\right] \nonumber \\{} & {} -\rho \frac{\partial \varphi (a_K, \eta _0)}{\partial \eta _0}\left[ \sum _{i=1}^{N-1}\sum _{j=i+1}^N K_{ij}( \eta _i+\eta _j-1)^2\eta _i^2\eta _j^2+ \sum _{i=1}^{N-2}\sum _{j=i+1}^{N-1}\sum _{k=j+1}^{N}K_{ijk}\eta _i^2\eta _j^2\eta _k^2 \right. \nonumber \\{} & {} +\left. \sum _{i=1}^{N-3}\sum _{j=i+1}^{N-2}\sum _{k=j+1}^{N-1}\sum _{l=k+1}^N K_{ijkl} \eta _i^2\eta _j^2\eta _k^2\eta _l^2\right] + \nabla \cdot \left( \beta _{0M}\nabla \eta _0\right) ; \end{aligned}$$
(2.56)
$$\begin{aligned} \frac{X_i}{J}= & {} \left( \varvec{F}^{-1}\cdot \varvec{\sigma }_e\cdot \varvec{F}-\frac{\psi ^e}{J_e}{\varvec{I}}\right) :\varvec{F}_t^{-1}\cdot \frac{\partial \varvec{F}_t}{\partial \eta _i} -\frac{1}{J_e} \left. \frac{\partial \psi ^e}{\partial \eta _i}\right| _{\varvec{F}_e} -2J\rho {\bar{A}}\sum _{j=1,\ne i}^N\eta _i\eta _j^2\varphi (a_b,\eta _0)-2\rho \sum _{j=1,\ne i}^N K_{ij} \nonumber \\{} & {} \times (\eta _i+\eta _j-1)(2\eta _i+\eta _j-1)\eta _j^2\eta _i\varphi (a_K,\eta _0) -2\rho \left( \sum _{j=1,\ne i}^N K_{0ij}\eta _j^2 +\sum _{j=1,\ne i}^{N-1}\sum _{k=j+1}^N K_{0ijk}\eta _j^2\eta _k^2 \right) \eta _0^2\eta _i[1 \nonumber \\{} & {} -\varphi (a_K,\eta _0)]-2\rho \varphi (a_K,\eta _0) \sum _{j=1,\ne i}^{N-1}\sum _{k=j+1}^N K_{ijk}\eta _i\eta _j^2\eta _k^2 - 2\rho \varphi (a_K,\eta _0)\sum _{j=1,\ne i}^{N-2}\sum _{k=j+1}^{N-1}\sum _{l=k+1}^N K_{ijkl}\eta _i\eta _j^2\eta _k^2\eta _l^2 \nonumber \\{} & {} + 0.5 \nabla \cdot \left( {\tilde{\varphi }}(a_\beta ,a_c,\eta _0)\sum _{j=1,\ne i}^N \beta _{ij}\nabla \eta _j\right) \quad \text {for all }i=1,2,3,\ldots ,N. \end{aligned}$$
(2.57)
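Two of the closed-form derivatives that enter Eqs. (2.54)–(2.57) can be cross-checked against finite differences: the derivative of the double-well term \(\eta _0^2(1-\eta _0)^2\) from Eq. (2.41) and the derivative of the \(K_{ij}\) term of Eq. (2.43) with respect to \(\eta _i\). The sketch below is only a consistency check of these two scalar factors; the step size and the function names are choices made here.

```python
def double_well(eta0):
    """A-M barrier shape from Eq. (2.41), without the coefficient A_0M."""
    return eta0**2 * (1.0 - eta0) ** 2


def d_double_well(eta0):
    """Closed form appearing in Eq. (2.54)."""
    return 2.0 * eta0 - 6.0 * eta0**2 + 4.0 * eta0**3


def pair_penalty(eta_i, eta_j, K=1.0):
    """Single K_ij term of Eq. (2.43), without the factor varphi(a_K, eta0)."""
    return K * (eta_i + eta_j - 1.0) ** 2 * eta_i**2 * eta_j**2


def d_pair_penalty(eta_i, eta_j, K=1.0):
    """Closed form of d(pair_penalty)/d(eta_i) appearing in Eq. (2.55)."""
    return (2.0 * K * (eta_i + eta_j - 1.0) * (2.0 * eta_i + eta_j - 1.0)
            * eta_j**2 * eta_i)


def central_difference(f, x, h=1e-6):
    """Second-order finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)
```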

2.7.3 Boundary conditions

The boundary conditions for the phase-field equations and the mechanics problem used in this paper are listed here.

Phase-field problem. We apply periodic boundary conditions (BCs) to all the order parameters. Let us consider two boundaries \(S_{p1\eta _k}\subset S_0\) and \(S_{p2\eta _k}\subset S_0\), where \(S_{p1\eta _k}\cap S_{p2\eta _k}\) is empty. These two boundaries have opposite unit normals (outward) in \(\Omega _0\), i.e., \(({\varvec{n}_0})_{S_{p1\eta _k}}=-(\varvec{n}_0)_{S_{p2\eta _k}}\), and they are subjected to the periodic BCs related to the order parameters \(\eta _k\) (for \(k=0,1,2,\ldots ,N\)). Hence, the order parameters and their gradients on these boundaries satisfy

$$\begin{aligned} \eta _k |_{S_{p1\eta _k}} =\eta _k |_{S_{p2\eta _k}} \quad \text {and}\quad (\nabla _0\eta _k\cdot \varvec{n}_0 )_{S_{p1\eta _k}} = (\nabla _0\eta _k\cdot \varvec{n}_0 )_{S_{p2\eta _k}}\qquad \text {for all }k=0,1,2,\ldots ,N. \end{aligned}$$
(2.58)

Mechanics problem. While solving the equilibrium equation given by Eq. (2.9)\(_1\) or Eq. (2.9)\(_2\), we have used a combination of periodic and traction boundary conditions on the surfaces. On a traction boundary \(S_{0T}\subset S_0\), the traction is specified (\(\varvec{p}^{sp}\)):

$$\begin{aligned} {\varvec{P}} \cdot \varvec{n}_0 = \varvec{p}^{sp} \qquad \text {on }S_{0T}. \end{aligned}$$
(2.59)

If two boundaries \(S_{pu1}\subset S_0\) and \(S_{pu2}\subset S_0\) (where \(S_{pu1}\cap S_{pu2}\) is empty) are subjected to a periodic BC on the displacement \(\varvec{u}\), then the displacement on these boundaries is related by

$$\begin{aligned} {\varvec{u}}|_{S_{pu1}} ={\varvec{u}}|_{S_{pu2}}+({\varvec{F}}_h-{\varvec{I}})\cdot {\varvec{r}}_0, \end{aligned}$$
(2.60)

where \({\varvec{F}}_h\) is a specified homogeneous deformation gradient.

2.8 Remarks about the present model

Note that the model presented above is largely based on the authors’ earlier multiphase phase-field model developed in [20]. The local free energies of [20] (Eqs. (41), (43), (44), (48) therein) are non-contradictory and are hence directly adopted in this paper. However, the gradient energy of [20] (Eq. (51) therein) has the issues discussed in Remark 2 of Sect. 2.6. A non-contradictory gradient energy, given by Eq. (2.46), is hence used here, which overcomes those issues. Furthermore, the kinetic coefficients \(L_{ij}\) in the Ginzburg–Landau equation (Eq. (27) in [20]) were assumed to be constant in [20]. We have discussed the problems of assuming constant \(L_{ij}\) in Sect. 2.4.2 and considered a form for the coefficients given by Eq. (2.28), which overcomes those problems and satisfies the desired criteria. The present phase-field model differs from our earlier model developed in [20] in these two aspects.

3 Crystallographic solutions for twins within twins microstructure

Fig. 1 A schematic of twins within twins

In this section, we obtain an approximate solution for the twins within twins microstructures for cubic to tetragonal MTs using the crystallographic theory (see, e.g., [1]). A schematic of twins within twins is shown in Fig. 1, where the twins formed by a variant pair \({\mathsf{{M}}}_i\) and \({\mathsf{{M}}}_j\) and the twins formed by another pair \({\mathsf{{M}}}_k\) and \({\mathsf{{M}}}_l\) form an interface of finite thickness \(\delta _\kappa \) shown by the shaded region. The volume fractions of \({\mathsf{{M}}}_j\) and \({\mathsf{{M}}}_l\) in the respective twins are \(\kappa _1\) and \(\kappa _2\), consistent with Eq. (3.2). The variants \({\mathsf{{M}}}_i\) and \({\mathsf{{M}}}_j\) need not be in a twin relationship with the other variants \({\mathsf{{M}}}_k\) or \({\mathsf{{M}}}_l\) [1].

3.1 Crystallographic equations and general approximate solutions

Crystallographic equations for twin–twin: The equations for the twins between the pairs \({\mathsf{{M}}}_i-{\mathsf{{M}}}_j\), and \({\mathsf{{M}}}_k\)-\({\mathsf{{M}}}_l\) are (see, e.g., Chapter 7 of [1] and [10, 11])

$$\begin{aligned} \varvec{F}_j-\varvec{F}_i = {\varvec{a}}_1'\otimes \varvec{n}_1, \quad \text {and} \quad \varvec{F}_l-\varvec{F}_k = {\varvec{a}}_2'\otimes \varvec{n}_2, \end{aligned}$$
(3.1)

respectively, where \(\varvec{F}_i\), \(\varvec{F}_j\), \(\varvec{F}_k\), and \(\varvec{F}_l\) are piece-wise constant deformation gradient tensors within the martensitic regions \({\mathsf{{M}}}_i\), \({\mathsf{{M}}}_j\), \({\mathsf{{M}}}_k\), and \({\mathsf{{M}}}_l\), respectively; \(\varvec{n}_1\) and \(\varvec{n}_2\) are the unit normals to the respective twin boundaries such that \(\varvec{n}_1\) points into \({\mathsf{{M}}}_j\) and \(\varvec{n}_2\) points into \({\mathsf{{M}}}_l\) (see Fig. 1); the vectors \({\varvec{a}}_1'\) and \({\varvec{a}}_2'\) are related to the simple shear deformations. The governing equation for the twins within twins shown in Fig. 1 is (Chapter 7 of [1] and [10, 11])

$$\begin{aligned} ((1-\kappa _1)\varvec{F}_i +\kappa _1 \varvec{F}_j)-((1-\kappa _2) \varvec{F}_k +\kappa _2\varvec{F}_l)= {{\varvec{b}}}'\otimes {{\varvec{m}}}, \end{aligned}$$
(3.2)

where \(\kappa _1\) and \(\kappa _2\) are the volume fractions of \({\mathsf{{M}}}_j\) and \({\mathsf{{M}}}_l\) in the respective twins, \({{\varvec{m}}}\) is the unit normal to the twin–twin interface shown in the figure, and \({{\varvec{b}}}'\) is a vector related to the deformation.

Using the polar decompositions \(\varvec{F}_i=\varvec{R}_1\cdot \varvec{U}_{ti}\), \(\varvec{F}_j=\varvec{R}_2\cdot \varvec{U}_{tj}\), \(\varvec{F}_k=\varvec{R}_3\cdot \varvec{U}_{tk}\), and \(\varvec{F}_l=\varvec{R}_4\cdot \varvec{U}_{tl}\), where \(\varvec{R}_1\), \(\varvec{R}_2\), \(\varvec{R}_3\), and \(\varvec{R}_4\) are the constant rotation tensors, Eqs. (3.1)\(_1\) and (3.1)\(_2\) can be rewritten as

$$\begin{aligned} \varvec{Q}_1\cdot \varvec{U}_{tj}-\varvec{U}_{ti} = \varvec{a}_1\otimes \varvec{n}_1, \quad \text {and} \quad \qquad \varvec{Q}_2\cdot \varvec{U}_{tl}-\varvec{U}_{tk} = \varvec{a}_2\otimes \varvec{n}_2, \end{aligned}$$
(3.3)

where \(\varvec{Q}_1=\varvec{R}_1^T\cdot \varvec{R}_2, \) \(\varvec{a}_1= \varvec{R}_1^T\cdot \varvec{a}_1'\), \(\varvec{Q}_2=\varvec{R}_3^T\cdot \varvec{R}_4\), and \( \varvec{a}_2= \varvec{R}_3^T\cdot \varvec{a}_2'\). Similarly, using Eqs. (3.1)\(_{1,2}\) and (3.3)\(_{1,2}\) and the relations between these rotation tensors, Eq. (3.2) is rewritten as

$$\begin{aligned} \varvec{Q}_3\cdot (\varvec{U}_{ti} +\kappa _1 {\varvec{a}}_1\otimes \varvec{n}_1)-(\varvec{U}_{tk} +\kappa _2 {\varvec{a}}_2\otimes \varvec{n}_2)= {{\varvec{b}}}\otimes {\varvec{m}}, \end{aligned}$$
(3.4)

where \(\varvec{Q}_3 = \varvec{R}_3^T\cdot \varvec{R}_1\) and \({\varvec{b}} = \varvec{R}_3^T\cdot {{\varvec{b}}}'\). In order to solve Eq. (3.4), we post-multiply it with \((\varvec{U}_{tk} +\kappa _2 {\varvec{a}}_2\otimes \varvec{n}_2)^{-1}\) and rearrange the terms to rewrite the equation as [10]

$$\begin{aligned}{} & {} \varvec{Q}_3\cdot \tilde{\varvec{A}}={\varvec{I}}+ {{\varvec{b}}}\otimes \tilde{ {\varvec{m}}}, \quad \text {where} \end{aligned}$$
(3.5)
$$\begin{aligned}{} & {} \tilde{\varvec{A}}=(\varvec{U}_{ti} +\kappa _1 {\varvec{a}}_1\otimes \varvec{n}_1)\cdot (\varvec{U}_{tk} +\kappa _2 {\varvec{a}}_2\otimes \varvec{n}_2)^{-1}, \quad \text { and }\quad \tilde{{\varvec{m}}}=(\varvec{U}_{tk} +\kappa _2 {\varvec{a}}_2\otimes \varvec{n}_2)^{-T}\cdot {{\varvec{m}}}. \end{aligned}$$
(3.6)

The unknowns to be determined from Eqs. (3.3)–(3.6) are \(\kappa _1\), \(\kappa _2\), \(\delta _\kappa \), \(\varvec{a}_1\), \(\varvec{a}_2\), \(\varvec{n}_1\), \(\varvec{n}_2\), \({\varvec{b}}\), \({\varvec{m}}\), \(\varvec{Q}_1\), \(\varvec{Q}_2\), and \(\varvec{Q}_3\).

Twins within twins solution: The solutions of Eqs. (3.3)\(_1\) and (3.3)\(_2\) are well known (see, e.g., Chapter 5 of [1]); we list them here for completeness. We define a symmetric tensor \(\varvec{{\mathcal {G}}}_1=\varvec{U}_{ti}^{-1}\cdot \varvec{U}_{tj}^2\cdot \varvec{U}_{ti}^{-1}\) corresponding to Eq. (3.3)\(_1\). The eigenvalues of \(\varvec{{\mathcal {G}}}_1\) are denoted by \(\lambda _1\), \(\lambda _2\), and \(\lambda _3\), which are all positive, and the corresponding normalized eigenvectors are denoted by \(\varvec{i}_1\), \(\varvec{i}_2\), and \(\varvec{i}_3\), respectively. Equation (3.3)\(_1\) has a solution if and only if \(\lambda _1\le 1\), \(\lambda _2= 1\), and \(\lambda _3\ge 1\) (assuming \(\lambda _1\le \lambda _2\le \lambda _3\)). The expressions for \(\varvec{a}_1\) and \(\varvec{n}_1\) are given by

$$\begin{aligned} \varvec{a}_1= & {} \zeta _1 \left( \sqrt{\frac{\lambda _3(1-\lambda _1)}{\lambda _3-\lambda _1} } \,\, \varvec{i}_1+\xi \sqrt{\frac{\lambda _1(\lambda _3-1)}{\lambda _3-\lambda _1}}\,\, \varvec{i}_3\right) , \quad \text {and}\nonumber \\ \varvec{n}_1= & {} \frac{\sqrt{\lambda _3}-\sqrt{\lambda _1}}{\zeta _1\sqrt{\lambda _3-\lambda _1}} \left( -\sqrt{{1-\lambda _1} } \,\,\varvec{U}_{ti}\varvec{i}_1+\xi \sqrt{\lambda _3-1}\,\, \varvec{U}_{ti}\varvec{i}_3\right) ,\, \end{aligned}$$
(3.7)

respectively, where \(\xi =\pm 1\), and \(\zeta _1\) is such that \(|\varvec{n}_1|=1\). The solutions \(\varvec{a}_2\) and \(\varvec{n}_2\) for the twins between \({\mathsf{{M}}}_k\) and \({\mathsf{{M}}}_l\) are similarly obtained using the eigenpairs of \(\varvec{U}_{tk}^{-1}\cdot \varvec{U}_{tl}^2\cdot \varvec{U}_{tk}^{-1}\) in Eq. (3.7). The rotations \(\varvec{Q}_1\) and \(\varvec{Q}_2\) can then be obtained using Eq. (3.3)\(_{1,2}\).
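Equation (3.7) can be evaluated directly once the eigenpairs of \(\varvec{{\mathcal {G}}}_1\) are known. The following is a minimal sketch restricted to diagonal Bain stretches (an assumption of this illustration, valid for cubic to tetragonal variants in the cubic basis), so that \(\varvec{{\mathcal {G}}}_1\) is diagonal and its eigenvectors are the coordinate axes. The function names and the stretch values in the usage note are hypothetical.

```python
import math


def twin_solution_diagonal(u_i, u_j, xi=1):
    """Evaluate Eq. (3.7) for diagonal stretches given as 3-tuples.

    Returns (a_1, n_1). Because U_i and U_j are diagonal, the eigenvalues
    of G_1 = U_i^{-1} U_j^2 U_i^{-1} are (u_j[k] / u_i[k])**2 and the
    eigenvectors are the coordinate axes.
    """
    lam = [(u_j[k] / u_i[k]) ** 2 for k in range(3)]
    k1, k2, k3 = sorted(range(3), key=lambda k: lam[k])
    l1, l2, l3 = lam[k1], lam[k2], lam[k3]
    if not (l1 < 1.0 < l3 and abs(l2 - 1.0) < 1e-12):
        raise ValueError("need lambda_1 < 1, lambda_2 = 1, lambda_3 > 1")
    # Direction of n_1 before normalization:
    # -sqrt(1 - l1) U_i i_1 + xi sqrt(l3 - 1) U_i i_3
    n_dir = [0.0, 0.0, 0.0]
    n_dir[k1] = -math.sqrt(1.0 - l1) * u_i[k1]
    n_dir[k3] = xi * math.sqrt(l3 - 1.0) * u_i[k3]
    pref = (math.sqrt(l3) - math.sqrt(l1)) / math.sqrt(l3 - l1)
    zeta = pref * math.sqrt(sum(c * c for c in n_dir))  # enforces |n_1| = 1
    n = [pref * c / zeta for c in n_dir]
    a = [0.0, 0.0, 0.0]
    a[k1] = zeta * math.sqrt(l3 * (1.0 - l1) / (l3 - l1))
    a[k3] = zeta * xi * math.sqrt(l1 * (l3 - 1.0) / (l3 - l1))
    return a, n


def rotation_from_twin(u_i, u_j, a, n):
    """Q_1 = (U_i + a (x) n) . U_j^{-1}, obtained from Eq. (3.3)_1."""
    return [[((u_i[r] if r == c else 0.0) + a[r] * n[c]) / u_j[c]
             for c in range(3)] for r in range(3)]
```

For instance, with the hypothetical stretches `u_i = (0.95, 1.02, 1.02)` and `u_j = (1.02, 0.95, 1.02)`, the sketch returns a twin-plane normal along \((1,-1,0)/\sqrt{2}\) for \(\xi =1\), and the tensor built by `rotation_from_twin` is orthogonal, as Eq. (3.3)\(_1\) requires.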

Following [10], we now obtain an approximate solution of the twins within twins equation (3.5), which has a form similar to that of the austenite-twinned martensite interface equation, using the procedure of Ball and James [4] (also see Chapter 7 of [10]). Since the Bain stretches \(\varvec{U}_{ti}\), \(\varvec{U}_{tj}\), \(\varvec{U}_{tk}\), and \(\varvec{U}_{tl}\) are given for a material, we first obtain \(\varvec{a}_1\) and \(\varvec{n}_1\) using Eq. (3.7); \(\varvec{a}_2\) and \(\varvec{n}_2\) are obtained using similar relations. The procedure for obtaining the remaining unknowns \(\kappa _1\), \(\kappa _2\), \(\varvec{Q}_3\), \({\varvec{b}}\), and \(\tilde{{\varvec{m}}}\) is described next. To this end, let us first assume that the parameters \(\kappa _1\) and \(\kappa _2\) are given and solve for the other unknowns. We introduce

$$\begin{aligned} \varvec{{\mathcal {G}}}_2 = \tilde{\varvec{A}}^T\cdot \tilde{\varvec{A}}, \end{aligned}$$
(3.8)

which is symmetric and positive-definite, and its eigenvalues are positive numbers denoted by \(\Lambda _1\), \(\Lambda _2\), and \(\Lambda _3\). \(\varvec{j}_1\), \(\varvec{j}_2\), and \(\varvec{j}_3\) denote the corresponding normalized eigenvectors. Equation (3.5) has a solution if and only if \(\Lambda _1\le 1\), \(\Lambda _2= 1\), and \(\Lambda _3\ge 1\) assuming \(\Lambda _1\le \Lambda _2\le \Lambda _3\). The solutions for the vectors \({\varvec{b}}\) and \(\tilde{{\varvec{m}}}\) are obtained as (see, e.g., [4] and Chapter 6 of [1])

$$\begin{aligned} {\varvec{b}}= & {} \frac{\zeta _2}{\sqrt{\Lambda _3-\Lambda _1}}\left( \sqrt{\Lambda _3(1-\Lambda _1)} \,\,\varvec{j}_1+\xi \sqrt{\Lambda _1(\Lambda _3-1)}\,\, \varvec{j}_3\right) , \quad \text {and}\nonumber \\ \tilde{{\varvec{m}}}= & {} \frac{\sqrt{\Lambda _3}-\sqrt{\Lambda _1}}{\zeta _2\sqrt{\Lambda _3-\Lambda _1}} \left( -\sqrt{{1-\Lambda _1} } \,\,\varvec{j}_1+\xi \sqrt{\Lambda _3-1}\,\, \varvec{j}_3\right) . \end{aligned}$$
(3.9)

The unit normal to the twin–twin boundary \({\varvec{m}}\) is finally obtained using Eq. (3.9)\(_2\) in Eq. (3.6)\(_2\) as

$$\begin{aligned} {{\varvec{m}}} =\frac{\sqrt{\Lambda _3}-\sqrt{\Lambda _1}}{\zeta _2\sqrt{\Lambda _3-\Lambda _1}} (\varvec{U}_{tk} +\kappa _2 \varvec{n}_2\otimes {\varvec{a}}_2 )\cdot \left( -\sqrt{{1-\Lambda _1} } \,\,\varvec{j}_1+\xi \sqrt{\Lambda _3-1}\,\, \varvec{j}_3\right) , \end{aligned}$$
(3.10)

where \(\zeta _2\) in Eqs. (3.9) and (3.10) is such that the norm \(|{\varvec{m}}|=1\). The rotation \(\varvec{Q}_3\) is then determined using Eqs. (3.9)\(_{1,2}\) in Eq. (3.5). Note that the middle eigenvalue \(\Lambda _2\) is a function of the volume fractions \(\kappa _1\) and \(\kappa _2\). Setting \(\Lambda _2= 1\), which is required for the existence of the twins within twins solution [10], gives a relation between \(\kappa _1\) and \(\kappa _2\). However, the governing equations at hand provide only this single relation, so \(\kappa _1\) and \(\kappa _2\) cannot be determined uniquely. The thickness of the transition layer \(\delta _\kappa \) is likewise indeterminate.
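A minimal numerical sketch of Eq. (3.9) is given below. It takes \(\varvec{{\mathcal {G}}}_2\) as an input, because the definition of \(\tilde{\varvec{A}}\) in Eq. (3.6)\(_1\) is not repeated here. The function name `rank_one_solution` and the default \(\zeta _2=1\) are assumptions made for illustration; in the text \(\zeta _2\) is fixed by \(|{\varvec{m}}|=1\). The product \({\varvec{b}}\otimes \tilde{{\varvec{m}}}\) does not depend on \(\zeta _2\), and the usage lines check the identity \(({\varvec{I}}+{\varvec{b}}\otimes \tilde{{\varvec{m}}})^T\cdot ({\varvec{I}}+{\varvec{b}}\otimes \tilde{{\varvec{m}}})=\varvec{{\mathcal {G}}}_2\), which is what one expects if Eq. (3.5) has the Ball and James form.

```python
import numpy as np


def rank_one_solution(G2, xi=1, zeta2=1.0, tol=1e-8):
    """Evaluate Eq. (3.9): return (b, m_tilde) for a symmetric tensor G2."""
    Lam, vecs = np.linalg.eigh(G2)                # ascending eigenvalues
    L1, L2, L3 = Lam
    if not (L1 <= 1.0 <= L3 and L3 > L1 and abs(L2 - 1.0) < tol):
        raise ValueError("need L1 <= 1, L2 = 1, L3 >= 1 with L3 > L1")
    j1, j3 = vecs[:, 0], vecs[:, 2]
    root = np.sqrt(L3 - L1)
    b = zeta2 / root * (np.sqrt(L3 * (1.0 - L1)) * j1
                        + xi * np.sqrt(L1 * (L3 - 1.0)) * j3)
    m_tilde = ((np.sqrt(L3) - np.sqrt(L1)) / (zeta2 * root)
               * (-np.sqrt(1.0 - L1) * j1 + xi * np.sqrt(L3 - 1.0) * j3))
    return b, m_tilde


# Usage with an illustrative diagonal G2 whose middle eigenvalue equals one
G2 = np.diag([1.0, 1.26, 0.74])
b, m_tilde = rank_one_solution(G2, xi=1)
F = np.eye(3) + np.outer(b, m_tilde)
print("rank-one identity holds:", np.allclose(F.T @ F, G2))
```

The normal \({\varvec{m}}\) then follows from Eq. (3.10) by applying \(\varvec{U}_{tk}+\kappa _2\varvec{n}_2\otimes \varvec{a}_2\) to \(\tilde{{\varvec{m}}}\) and rescaling with \(\zeta _2\).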

3.2 Twins within twins solutions for cubic to tetragonal MTs

The solutions for twins within twins for cubic to tetragonal MTs are now obtained. The three Bain stretch tensors for such transformations are (Chapter 5 of [1])

$$\begin{aligned} \varvec{U}_{t1}= & {} \chi \,\varvec{c}_1\otimes \varvec{c}_1 +\alpha \, \varvec{c}_2\otimes \varvec{c}_2 + \alpha \,\varvec{c}_3\otimes \varvec{c}_3, \nonumber \\ \varvec{U}_{t2}= & {} \alpha \, \varvec{c}_1\otimes \varvec{c}_1 +\chi \, \varvec{c}_2\otimes \varvec{c}_2 + \alpha \,\varvec{c}_3\otimes \varvec{c}_3, \nonumber \\ \varvec{U}_{t3}= & {} \alpha \,\varvec{c}_1\otimes \varvec{c}_1 +\alpha \, \varvec{c}_2\otimes \varvec{c}_2 + \chi \, \,\varvec{c}_3\otimes \varvec{c}_3, \end{aligned}$$
(3.11)

where \(\alpha <1\) and \(\chi >1\) are the material constants and \(\{\varvec{c}_1, \,\varvec{c}_2,\,\varvec{c}_3\}\) is a right-handed standard Cartesian basis in the cubic unit cell of \(\mathsf{{A}}\) such that the basis vectors are parallel to three mutually orthogonal sides of that unit cell. The solutions for the twins between \({\mathsf{{M}}}_1-{\mathsf{{M}}}_2\) (\(\varvec{n}\) pointing into \({\mathsf{{M}}}_2\)), \({\mathsf{{M}}}_1-{\mathsf{{M}}}_3\) (\(\varvec{n}\) pointing into \({\mathsf{{M}}}_3\)), and \({\mathsf{{M}}}_2-{\mathsf{{M}}}_3\) (\(\varvec{n}\) pointing into \({\mathsf{{M}}}_3\)) are (see, e.g., Chapter 5 of [1])

$$\begin{aligned} \varvec{a}= & {} \sqrt{2}\upsilon (\chi \varvec{c}_1\pm \alpha \varvec{c}_2), \qquad \varvec{n} =(-\varvec{c}_1\pm \varvec{c}_2)/\sqrt{2}; \nonumber \\ \varvec{a}= & {} \sqrt{2}\upsilon (\chi \varvec{c}_1\pm \alpha \varvec{c}_3), \qquad \varvec{n} =(-\varvec{c}_1\pm \varvec{c}_3)/\sqrt{2}; \nonumber \\ \varvec{a}= & {} \sqrt{2}\upsilon (\chi \varvec{c}_2\pm \alpha \varvec{c}_3), \qquad \varvec{n} =(-\varvec{c}_2\pm \varvec{c}_3)/\sqrt{2}, \end{aligned}$$
(3.12)

respectively, where \(\upsilon ={(\chi ^2-\alpha ^2)}/({\chi ^2+\alpha ^2})\). Since all the variants for the tetragonal \(\mathsf{{M}}\) phase are in a twin relationship [1], the combinations of possible twins within twins solutions are \(\{ {\mathsf{{M}}}_1,{\mathsf{{M}}}_2\}-\{{\mathsf{{M}}}_1,{\mathsf{{M}}}_3\}\), \(\{ {\mathsf{{M}}}_2,{\mathsf{{M}}}_1\}-\{{\mathsf{{M}}}_2,{\mathsf{{M}}}_3\}\), and \(\{ {\mathsf{{M}}}_3,{\mathsf{{M}}}_1\}-\{{\mathsf{{M}}}_3,{\mathsf{{M}}}_2\}\), where the corresponding twin solutions \(\{\varvec{a}_1, \varvec{n}_1\}\) and \(\{\varvec{a}_2, \varvec{n}_2\}\) are to be taken from Eq. (3.12).

Using Eq. (3.6)\(_1\) and the solutions to the corresponding twin pairs from Eq. (3.12), we get \(\varvec{{\mathcal {G}}}_2\) as the following diagonal tensor for all the possible combinations of twins within twins listed above:

$$\begin{aligned}{} & {} \varvec{{\mathcal {G}}}_2 =\tilde{\varvec{A}}^T\cdot \tilde{\varvec{A}} =\Lambda _1 \varvec{j}_1\otimes \varvec{j}_1 + \Lambda _2 \varvec{j}_2\otimes \varvec{j}_2 + \Lambda _3 \varvec{j}_3\otimes \varvec{j}_3, \quad \text {where} \end{aligned}$$
(3.13)
$$\begin{aligned}{} & {} \Lambda _1 =\left[ 1-\kappa _2\upsilon \right] ^2<1, \nonumber \\{} & {} \Lambda _2 =[(1+\kappa _1)\alpha ^2+(1-\kappa _1)\chi ^2]^2[(1+\kappa _2)\chi ^2+(1-\kappa _2)\alpha ^2]^2/(\chi ^2+\alpha ^2)^4=1, \quad \text {and}\quad \nonumber \\{} & {} \Lambda _3 = \left[ 1+\kappa _1\upsilon \right] ^2>1, \end{aligned}$$
(3.14)

and we have used the facts \(\chi >1\), \(\alpha <1\), \(0<\upsilon <1\), and \(0<\kappa _1,\kappa _2<1\). The eigenvectors \(\varvec{j}_1\), \(\varvec{j}_2\), and \(\varvec{j}_3\) are functions of the vectors \(\varvec{c}_1\), \(\varvec{c}_2\), and \(\varvec{c}_3\) and depend on the combinations of the variants in the twins as obtained below. We have imposed the condition \(\Lambda _2=1\) in Eq. (3.14)\(_2\), which is required for \(\varvec{{\mathcal {G}}}_2\) given by Eq. (3.13) to represent a twins within twins microstructure as discussed above. From this condition, the following two alternative relations between the parameters \(\chi ,\,\alpha ,\,\kappa _1,\,\kappa _2\) are obtained:

$$\begin{aligned} \frac{1}{\kappa _1}- \frac{1}{\kappa _2}= & {} \upsilon , \quad \text {or} \nonumber \\ (\kappa _1-\kappa _2)\frac{1}{\upsilon }+\kappa _1\kappa _2= & {} \frac{2}{\upsilon ^2}. \end{aligned}$$
(3.15)

Since \(0<\upsilon <1\), Eq. (3.15)\(_1\) can be satisfied only if \(\kappa _1<\kappa _2\). For \(\kappa _1=\kappa _2\), Eq. (3.15)\(_1\) yields the trivial condition \(\chi =\alpha \), which does not yield twins. It is easy to verify that for all \(\chi >1\) and \(\alpha <1\) no \(0<\kappa _1,\kappa _2<1\) satisfy Eq. (3.15)\(_2\), and hence, we disregard this relation. Finally, considering that \(\kappa _1\) and \(\kappa _2\) are related by Eq. (3.15)\(_1\), and using the expressions for \(\Lambda _1\) and \(\Lambda _3\) given by Eqs. (3.14)\(_{1,3}\) in Eqs. (3.7)\(_{1,2}\), (3.9)\(_1\), and (3.10), we get the solutions for \({\varvec{b}}\) and \({\varvec{m}}\) for different twins within twins as listed below.
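For completeness, the origin of the two relations in Eq. (3.15) and the reason for discarding the second one can be sketched in a few lines, using only Eq. (3.14)\(_2\) and the definition of \(\upsilon \). Dividing each bracket in Eq. (3.14)\(_2\) by \(\chi ^2+\alpha ^2\) gives

$$\begin{aligned} \Lambda _2 =(1-\kappa _1\upsilon )^2(1+\kappa _2\upsilon )^2=1 \quad \Rightarrow \quad (1-\kappa _1\upsilon )(1+\kappa _2\upsilon )=\pm 1. \end{aligned}$$

The plus sign yields \(\kappa _2-\kappa _1=\kappa _1\kappa _2\upsilon \), which is Eq. (3.15)\(_1\) after division by \(\kappa _1\kappa _2\). The minus sign yields Eq. (3.15)\(_2\) after division by \(\upsilon ^2\). For \(0<\kappa _1,\kappa _2<1\) and \(0<\upsilon <1\), the left-hand side of Eq. (3.15)\(_2\) is bounded as

$$\begin{aligned} (\kappa _1-\kappa _2)\frac{1}{\upsilon }+\kappa _1\kappa _2< \frac{1}{\upsilon }+1, \qquad \text {whereas}\qquad \frac{2}{\upsilon ^2}-\frac{1}{\upsilon }-1=\frac{(2+\upsilon )(1-\upsilon )}{\upsilon ^2}>0, \end{aligned}$$

so the right-hand side \(2/\upsilon ^2\) can never be reached.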

Case I For \(\{{\mathsf{{M}}}_1,{\mathsf{{M}}}_2\}-\{{\mathsf{{M}}}_1,{\mathsf{{M}}}_3\}\) twin pairs

In this case, the indices are \(i=k=1\), \(j=2\), and \(l=3\), and \(\kappa _1\) and \(\kappa _2\) are the volume fractions of \(\mathsf{{M}}_2\) and \({\mathsf{{M}}}_3\) in the respective twins. The eigenvectors of the tensor \(\varvec{{\mathcal {G}}}_2\) are obtained as \(\varvec{j}_1=\varvec{c}_3\), \(\varvec{j}_2=\varvec{c}_1\), and \(\varvec{j}_3=\varvec{c}_2\). The vectors \({\varvec{b}}\) and \(\tilde{{\varvec{m}}}\) are obtained using Eqs. (3.14)\(_{1,3}\) in Eq. (3.9)\(_{1,2}\) as

$$\begin{aligned} {\varvec{b}}= & {} \frac{\zeta _2}{\sqrt{(\kappa _1+\kappa _2)[2+(\kappa _1-\kappa _2)\upsilon ]}}\left[ \xi (1-\kappa _2\upsilon ) \sqrt{\kappa _1(2+\kappa _1\upsilon )}\,\varvec{c}_2+(1+\kappa _1\upsilon ) \sqrt{\kappa _2(2-\kappa _2\upsilon )}\,\varvec{c}_3\right] , \quad \text {and}\nonumber \\ \tilde{{\varvec{m}}}= & {} \frac{\sqrt{\kappa _1+\kappa _2}\,\,\upsilon }{\zeta _2 \sqrt{2+(\kappa _1-\kappa _2)\upsilon }}\left[ \xi \sqrt{\kappa _1(2+\kappa _1\upsilon )}\,\,\varvec{c}_2- \sqrt{\kappa _2(2-\kappa _2\upsilon )}\,\,\varvec{c}_3\right] , \end{aligned}$$
(3.16)

respectively. Using Eqs. (3.16)\(_2\) and (3.7)\(_{1,2}\) in Eq. (3.10), we finally get \({\varvec{m}}\) as

$$\begin{aligned} {\varvec{m}} = \frac{\sqrt{\kappa _1+\kappa _2}\,\,\upsilon \alpha }{\zeta _2\sqrt{2+(\kappa _1-\kappa _2)\upsilon }}\left[ -\xi \upsilon \kappa _2^{1.5}\sqrt{2-\kappa _2\upsilon }\,\,\varvec{c}_1 +\xi \sqrt{\kappa _1(2+\kappa _1\upsilon )}\,\,\varvec{c}_2- \sqrt{\kappa _2(2-\kappa _2\upsilon )}(1+\kappa _2\upsilon )\,\,\varvec{c}_3\right] , \nonumber \\ \end{aligned}$$
(3.17)

where the condition \(|{\varvec{m}}|=1\) yields

$$\begin{aligned} \zeta _2= \frac{\sqrt{\kappa _1+\kappa _2}\,\,\upsilon \alpha }{\sqrt{2+(\kappa _1-\kappa _2)\upsilon }}\left[ \kappa _2(2-\kappa _2\upsilon )(1+2\kappa _2\upsilon +2\kappa _2^2\upsilon ^2)+ \kappa _1(2+\kappa _1\upsilon )\right] ^{1/2}. \end{aligned}$$
(3.18)
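The closed-form expressions of Case I can be evaluated with the short sketch below, which transcribes Eqs. (3.16), (3.17), and (3.18) term by term. The function name `case_one` is hypothetical, NumPy is assumed, and the caller is expected to supply a pair \(\kappa _1\), \(\kappa _2\) that satisfies Eq. (3.15)\(_1\). The last lines cross-check Eq. (3.17) against Eq. (3.10); that check assumes the \({\mathsf{{M}}}_1-{\mathsf{{M}}}_3\) twin of Eq. (3.12)\(_2\) is taken with the sign \(-\xi \), a pairing inferred here from the \(\varvec{c}_1\) component of Eq. (3.17) rather than stated in the text.

```python
import numpy as np


def case_one(kappa1, kappa2, alpha, chi, xi=1):
    """Return (zeta2, b, m_tilde, m) in the {c1, c2, c3} basis for Case I."""
    ups = (chi**2 - alpha**2) / (chi**2 + alpha**2)
    P = np.sqrt(kappa1 * (2.0 + kappa1 * ups))
    R = np.sqrt(kappa2 * (2.0 - kappa2 * ups))
    S = kappa1 + kappa2
    D = 2.0 + (kappa1 - kappa2) * ups
    # Eq. (3.18): zeta_2 from the condition |m| = 1
    zeta2 = (np.sqrt(S) * ups * alpha / np.sqrt(D)
             * np.sqrt(R**2 * (1.0 + 2.0 * kappa2 * ups
                               + 2.0 * (kappa2 * ups) ** 2) + P**2))
    # Eq. (3.16)
    b = zeta2 / np.sqrt(S * D) * np.array(
        [0.0, xi * (1.0 - kappa2 * ups) * P, (1.0 + kappa1 * ups) * R])
    pref = np.sqrt(S) * ups / (zeta2 * np.sqrt(D))
    m_tilde = pref * np.array([0.0, xi * P, -R])
    # Eq. (3.17)
    m = pref * alpha * np.array(
        [-xi * ups * kappa2**1.5 * np.sqrt(2.0 - kappa2 * ups),
         xi * P,
         -R * (1.0 + kappa2 * ups)])
    return zeta2, b, m_tilde, m


# Usage with the NiAl stretches of Sect. 4.1 and kappa1 = 0.46
alpha, chi, k1, xi = 0.922, 1.215, 0.46, 1
ups = (chi**2 - alpha**2) / (chi**2 + alpha**2)
k2 = k1 / (1.0 - k1 * ups)                        # Eq. (3.15)_1 solved for kappa2
zeta2, b, m_tilde, m = case_one(k1, k2, alpha, chi, xi)

# Cross-check against Eq. (3.10) with the M1-M3 twin of Eq. (3.12)_2 (sign -xi)
U1 = np.diag([chi, alpha, alpha])
a2 = np.sqrt(2.0) * ups * np.array([chi, 0.0, -xi * alpha])
n2 = np.array([-1.0, 0.0, -xi]) / np.sqrt(2.0)
print(np.allclose((U1 + k2 * np.outer(n2, a2)) @ m_tilde, m))
```

With these inputs the evaluation should reproduce, to the quoted precision, the values of \(\zeta _2\), \({\varvec{b}}\), \(\tilde{{\varvec{m}}}\), and \({\varvec{m}}\) reported for NiAl in Sect. 4.2.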

Case II For \(\{{\mathsf{{M}}}_2,{\mathsf{{M}}}_1\}-\{{\mathsf{{M}}}_2,{\mathsf{{M}}}_3\}\) twin pairs

In this case, the indices are \(i=k=2\), \(j=1\), and \(l=3\), and \(\kappa _1\) and \(\kappa _2\) are the volume fractions of \(\mathsf{{M}}_1\) and \({\mathsf{{M}}}_3\) in the respective twins. The eigenvectors for \(\varvec{{\mathcal {G}}}_2\) tensor are given by \(\varvec{j}_1=\varvec{c}_3\), \(\varvec{j}_2=\varvec{c}_2\), and \(\varvec{j}_3=\varvec{c}_1\). The vectors \({\varvec{b}}\), \(\tilde{{\varvec{m}}}\), and \({\varvec{m}}\) are obtained in a manner similar to case-I, as

$$\begin{aligned} {\varvec{b}}= & {} \frac{\zeta _2}{\sqrt{(\kappa _1+\kappa _2)[2+(\kappa _1-\kappa _2)\upsilon ]}}\left[ \xi (1-\kappa _2\upsilon ) \sqrt{\kappa _1(2+\kappa _1\upsilon )}\,\varvec{c}_1+(1+\kappa _1\upsilon ) \sqrt{\kappa _2(2-\kappa _2\upsilon )}\,\varvec{c}_3 \right] , \nonumber \\ \tilde{{\varvec{m}}}= & {} \frac{\sqrt{\kappa _1+\kappa _2}\,\,\upsilon }{\zeta _2 \sqrt{2+(\kappa _1-\kappa _2)\upsilon }}\left[ \xi \sqrt{\kappa _1(2+\kappa _1\upsilon )}\,\,\varvec{c}_1- \sqrt{\kappa _2(2-\kappa _2\upsilon )}\,\,\varvec{c}_3 \right] , \quad \text {and} \end{aligned}$$
(3.19)
$$\begin{aligned} {\varvec{m}} ={} & {} \quad \frac{\sqrt{\kappa _1+\kappa _2}\,\,\upsilon \alpha }{\zeta _2\sqrt{2+(\kappa _1-\kappa _2)\upsilon }}\left[ \xi \sqrt{\kappa _1(2+\kappa _1\upsilon )}\,\,\varvec{c}_1- \xi \upsilon \kappa _2^{1.5}\sqrt{2-\kappa _2\upsilon }\,\,\varvec{c}_2 - \sqrt{\kappa _2(2-\kappa _2\upsilon )}(1+\kappa _2\upsilon )\,\,\varvec{c}_3\right] , \nonumber \\ \end{aligned}$$
(3.20)

where \(\zeta _2\) is given by Eq. (3.18).

Case III For \(\{{ \mathsf{{M}}}_3,{\mathsf{{M}}}_1\}-\{{\mathsf{{M}}}_3,{\mathsf{{M}}}_2\}\) twin pairs

In this case, the indices are \(i=k=3\), \(j=1\), and \(l=2\), and \(\kappa _1\) and \(\kappa _2\) are the volume fractions of \(\mathsf{{M}}_1\) and \({\mathsf{{M}}}_2\) in the respective twins. The eigenvectors for \(\varvec{{\mathcal {G}}}_2\) tensor are given by \(\varvec{j}_1=\varvec{c}_2\), \(\varvec{j}_2=\varvec{c}_3\), and \(\varvec{j}_3=\varvec{c}_1\). The vectors \({\varvec{b}}\), \(\tilde{{\varvec{m}}}\), and \({\varvec{m}}\) are obtained, in a manner similar to case-I and case-II, as

$$\begin{aligned}{} & {} {\varvec{b}} = \frac{\zeta _2}{\sqrt{(\kappa _1+\kappa _2)[2+(\kappa _1-\kappa _2)\upsilon ]}} \left[ \xi (1-\kappa _2\upsilon ) \sqrt{\kappa _1(2+\kappa _1\upsilon )}\,\varvec{c}_1+(1+\kappa _1\upsilon ) \sqrt{\kappa _2(2-\kappa _2\upsilon )}\,\varvec{c}_2 \right] , \nonumber \\{} & {} \quad \tilde{{\varvec{m}}}= \frac{\sqrt{\kappa _1+\kappa _2}\,\,\upsilon }{\zeta _2 \sqrt{2+(\kappa _1-\kappa _2)\upsilon }}\left[ \xi \sqrt{\kappa _1(2+\kappa _1\upsilon )}\,\,\varvec{c}_1- \sqrt{\kappa _2(2-\kappa _2\upsilon )}\,\,\varvec{c}_2 \right] , \quad \text {and} \end{aligned}$$
(3.21)
$$\begin{aligned}{} & {} {\varvec{m}} = \frac{\sqrt{\kappa _1+\kappa _2}\,\,\upsilon \alpha }{\zeta _2\sqrt{2+(\kappa _1-\kappa _2)\upsilon }}\left[ \xi \sqrt{\kappa _1(2+\kappa _1\upsilon )}\,\,\varvec{c}_1- \sqrt{\kappa _2(2-\kappa _2\upsilon )}(1+\kappa _2\upsilon )\,\,\varvec{c}_2- \xi \upsilon \kappa _2^{1.5}\sqrt{2-\kappa _2\upsilon }\,\,\varvec{c}_3\right] , \nonumber \\ \end{aligned}$$
(3.22)

where \(\zeta _2\) is given by Eq. (3.18).

In summary, we have obtained the general analytical solution for \(\varvec{a}_1\), \(\varvec{a}_2\), \(\varvec{n}_1\), \(\varvec{n}_2\), \({\varvec{b}}\), and \({\varvec{m}}\) listed in Eqs. (3.7), (3.9), and (3.10). The rotation tensors \(\varvec{Q}_1\), \(\varvec{Q}_2\), and \(\varvec{Q}_3\) can finally be obtained using Eqs. (3.3)\(_{1,2}\) and (3.4). The twins within twins solutions for the cubic to tetragonal MTs are listed in Eqs. (3.12) and (3.16) to (3.22). Since the volume fractions \(\kappa _1\) and \(\kappa _2\), which satisfy the relation given by Eq. (3.15)\(_1\), cannot be determined uniquely, there are many solutions possible for each of the twins within twins listed in Cases (I), (II) and (III) (also see Chapter 7 of [1]). The width of the twins within twins interface (shaded region in Fig. 1) \(\delta _\kappa \) is indeterminate within the governing equations at hand.

4 Results and discussions

We present the simulation results for twins within twins microstructure obtained using our phase-field approach. The material properties for NiAl alloy, which exhibits cubic to tetragonal MTs, are considered and listed in Sect. 4.1. In Sect. 4.2, the phase-field results are compared with the crystallographic solution obtained in Sect. 3 (Fig. 2).

4.1 Material parameters

Fig. 2 Unit cells for cubic austenite and three tetragonal martensitic variants

The material parameters for NiAl alloy are listed here. We consider the interfacial widths and energies as \(\delta _{0M}=1\) nm, \(\gamma _{0M}=0.2\) N/m, \(\delta _{12}= \delta _{13}=\delta _{23}=0.75\) nm, and \(\gamma _{12}=\gamma _{13}=\gamma _{23}=0.1\) N/m. Using the following analytical relations between the interfacial thickness and energy and the phase-field parameters [73, 83]

$$\begin{aligned} \delta _{0M}=\sqrt{\frac{18\beta _{0M}}{\rho _0A_{0M}}}; \qquad \beta _{0M}=\gamma _{0M}\delta _{0M}; \qquad \delta _{ij}=\sqrt{\frac{-18\beta _{ij}}{\rho _0{\bar{A}}}}, \qquad \beta _{ij}=-\gamma _{ij}\delta _{ij}, \end{aligned}$$
(4.1)

which were obtained by solving a 1D Ginzburg–Landau equation neglecting mechanics, we obtain \(\rho _0A_{0M} =3600\) MPa, \(\rho _0{\bar{A}}=2400\) MPa, \(\beta _{0M}= 2\times 10^{-10}\) N, and \(\beta _{12}=\beta _{13}=\beta _{23}=-7.5\times 10^{-11}\) N. We take \(\theta _e=215\) K and \(\rho _0\Delta s = -1.47\) MPa K\(^{-1}\), using which we calculate the critical temperatures for \({\mathsf{{A}}}\rightarrow {\mathsf{{M}}}\) and \({\mathsf{{M}}}\rightarrow {\mathsf{{A}}}\) transformations as (see [51]) \(\theta ^c_{{\mathsf{{A}}}\rightarrow {\mathsf{{M}}}}= \theta _e+A_{0M}/(3\Delta s) =0\) K and \(\theta ^c_{{\mathsf{{M}}}\rightarrow {\mathsf{{A}}}}= \theta _e-A_{0M}/(3\Delta s) =430\) K, respectively. The Lamé constants, assuming isotropic elastic response of the phases, are taken to be identical for all the phases \(\mathsf{{A}}\), \({\mathsf{{M}}}_1\), \({\mathsf{{M}}}_2\), and \({\mathsf{{M}}}_3\): \({{\bar{\lambda }}}_0={{\bar{\lambda }}}_1={{\bar{\lambda }}}_2 ={{\bar{\lambda }}}_3=74.62 \) GPa, \({{\bar{\mu }}}_0={{\bar{\mu }}}_1={{\bar{\mu }}}_2={{\bar{\mu }}}_3=72\) GPa. The other constant parameters are taken as [20] \(a=a_\beta =a_\varepsilon =a_K=3\), \(a_c=10^{-3}\), \(\rho _0K_{12}=\rho _0 K_{23}=\rho _0K_{13}=50 \) GPa, \(\rho _0K_{012}=\rho _0 K_{023}=\rho _0K_{013}=5 \) GPa, \(\rho _0K_{0123}= \rho _0K_{123}=50 \) GPa, and \(L_{0M}=2600\) (Pa-s)\(^{-1}\). The kinetic coefficient \(L_{ij}\) given by Eq. (2.28) is taken as 12600 (Pa-s)\(^{-1}\) when it assumes a nonzero value for all \(i,j=1,2,3\) and \(i\ne j\). The transformation stretches are \(\alpha = 0.922\) and \(\chi =1.215\) [45].
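The conversion from interface widths and energies to the phase-field parameters in Eq. (4.1) can be sketched as follows. This is a small illustration in SI units; the function name `phase_field_parameters` and the flag `variant_pair` are hypothetical and only select between the austenite-martensite relations and the variant-variant relations of Eq. (4.1).

```python
def phase_field_parameters(delta, gamma, variant_pair=False):
    """Invert Eq. (4.1): return (beta, rho0_A) in N and Pa.

    delta is the interface width in m and gamma the interface energy in N/m.
    For variant-variant interfaces, beta_ij is negative and rho0_A stands for
    rho_0 * A_bar; otherwise it stands for rho_0 * A_0M.
    """
    sign = -1.0 if variant_pair else 1.0
    beta = sign * gamma * delta                   # beta = +/- gamma * delta
    rho0_A = sign * 18.0 * beta / delta**2        # from delta = sqrt(+/- 18 beta / rho0_A)
    return beta, rho0_A


# Usage with the NiAl interface data listed above
beta_0M, rho0_A0M = phase_field_parameters(1.0e-9, 0.2)
beta_ij, rho0_Abar = phase_field_parameters(0.75e-9, 0.1, variant_pair=True)
print(beta_0M, rho0_A0M / 1e6, beta_ij, rho0_Abar / 1e6)   # N, MPa, N, MPa
```

The printed barrier values are converted to MPa so that they can be compared directly with the numbers quoted in the text.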

4.2 Numerical result for twins within twins

Fig. 3 Evolution of twin within twin microstructure in a 22 nm\(\times 22\) nm\(\times 22\) nm cube shown by the color plot of \(\eta _{eq}=\eta _0(1-0.67\eta _1-0.33\eta _2)\): \(\eta _{eq}=0.33\) denotes \(\mathsf{{M}}_1\); \(\eta _{eq}=0.67\) denotes \(\mathsf{{M}}_2\); \(\eta _{eq}=1\) denotes \(\mathsf{{M}}_3\)

For the simulation, we consider a 22 nm\(\times 22\) nm\(\times 22\) nm cube as the reference body \(V_0\), as shown in Fig. 3a. The periodic BC given by Eq. (2.58) is used for all the order parameters on the respective opposite faces of the cube domain. We have used the periodic BC for the normal component of the displacement vector given by Eq. (2.60) on the opposite faces of the cube, where the homogeneous deformation gradient is taken as \(\varvec{F}_h ={\varvec{I}}+ 0.98\,\varvec{n}_0\otimes \varvec{n}_0\), and \(\varvec{n}_0\) is the unit normal to the opposite faces of the cube \(V_0\). On each face of the cube domain, we have used the traction-free BC for the tangential components of the first Piola–Kirchhoff traction vector (see Eq. (2.59)). The temperature of the sample is taken as \(\theta =0\) K. The material properties listed in Sect. 4.1 are used. The Bain tensors listed in Eq. (3.11) are used in Eq. (2.5), where the basis vectors \(\varvec{c}_1\), \(\varvec{c}_2\), and \(\varvec{c}_3\) of \(\mathsf{{A}}\) unit cell are parallel to basis vectors \(\varvec{e}_1\), \(\varvec{e}_2\), and \(\varvec{e}_3\), respectively, attached to the sample \(V_0\) (see Fig. 3a). The initial distribution of the order parameters is taken between \(0\le \eta _0\le 1\), \(0\le \eta _1\le 0.8\), and \(0\le \eta _2\le 0.8\), all distributed randomly, as shown in Fig. 3a. The remaining order parameter \(\eta _3\) is calculated using Eq. (2.1) for all times \(t\ge 0\). In particular, we have shown a color plot for an equivalent order parameter defined as \(\eta _{eq}=\eta _0(1-0.67\eta _1-0.33\eta _2)\), which takes the following values at different phases: \(\eta _{eq}=0\) in \(\mathsf{{A}}\), \(\eta _{eq}=0.33\) in \(\mathsf{{M}}_1\), \(\eta _{eq}=0.67\) in \(\mathsf{{M}}_2\), and \(\eta _{eq}=1\) in \(\mathsf{{M}}_3\).

We have developed a nonlinear finite element procedure, similar to [82], described in Appendix B. A finite element code has been developed using the open-source package deal.II [84]. The domain is discretized spatially with quadratic brick elements, and it is ensured that at least three grid points lie across all the interfaces. The mesh density in the 3D domain is shown in Fig. 4a, and the mesh density on one of the boundaries is shown in Fig. 4b. The time derivatives of the order parameters are discretized using the backward difference scheme of order two, as described in Appendix B. A constant time step size of \(\Delta t^n=2\times 10^{-16}\) s is used for the simulation.
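The equivalent order parameter \(\eta _{eq}\) used for the color plots can be checked with a few lines of Python. The sketch below is only an illustration of the definition given above; the function name `eta_eq` and the dictionary of pure-phase states are assumptions, with \(\eta _3\) taken as implied by the constraint on the variant order parameters.

```python
def eta_eq(eta0, eta1, eta2):
    """Equivalent order parameter eta_eq = eta0 * (1 - 0.67*eta1 - 0.33*eta2)."""
    return eta0 * (1.0 - 0.67 * eta1 - 0.33 * eta2)


# Pure-phase states written as (eta0, eta1, eta2); eta3 follows from the constraint
pure_phases = {
    "A":  (0.0, 0.0, 0.0),
    "M1": (1.0, 1.0, 0.0),
    "M2": (1.0, 0.0, 1.0),
    "M3": (1.0, 0.0, 0.0),
}
for name, state in pure_phases.items():
    print(name, round(eta_eq(*state), 2))
```

Because the four pure phases map to four distinct values, a single scalar field is sufficient to distinguish austenite and the three variants in one plot.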

Fig. 4 Mesh density a in 3D computational domain, and b on one of the external boundaries

The evolution of the microstructure is shown at different time instances in Fig. 3a–d. Figure 3b, c show the intermediate microstructures at different time instances approaching a twinned microstructure. We finally obtain a twins within twins microstructure between the twin pairs \(\mathsf{{M}}_1-\mathsf{{M}}_2\) and \(\mathsf{{M}}_1-\mathsf{{M}}_3\) as shown in Fig. 3d. The microstructure shown in Fig. 3d has not yet fully reached a stationary state. However, it can be compared with the analytical solution because the orientations of the twin and twin–twin boundaries show no further significant change with time, as observed in Fig. 3c, d. The microstructures on three faces of the domain with unit normals parallel to \(\varvec{e}_3\), \(\varvec{e}_2\) and \(\varvec{e}_1\) are shown in Fig. 5a–c, respectively.

Fig. 5 Twin microstructures on three mutually perpendicular faces of the sample shown in Fig. 3d. The color plots of \(\eta _{eq}\) are shown: \(\eta _{eq}=0.33\) denotes \(\mathsf{{M}}_1\); \(\eta _{eq}=0.67\) denotes \(\mathsf{{M}}_2\); \(\eta _{eq}=1\) denotes \(\mathsf{{M}}_3\)

The components of the Cauchy elastic stress tensor (in GPa) at \(t=2.25\) ps (corresponding to the microstructure shown in Fig. 3d) are plotted in Fig. 6. The internal stresses are concentrated mainly across the twin–twin boundaries and twin boundaries (see, e.g., [85] for experimental results). The stresses vary from compressive to tensile between two adjacent variant plates within and near the twin–twin interfaces. For a better understanding of the elastic stresses across the interfaces, the stress components along the lines joining the points O and A, and C and B (see Fig. 3d) are plotted in Fig. 7a, b, respectively. In these two figures, the elastic stresses within \(\mathsf{{M}}_1-\mathsf{{M}}_3\) and \(\mathsf{{M}}_1-\mathsf{{M}}_2\) twin boundaries are plotted. Figure 7b covers the stresses across a twin–twin boundary. All the normal stresses \(\sigma _{(e)11}\), \(\sigma _{(e)22}\), and \(\sigma _{(e)33}\) are significantly higher across the twin boundaries compared to the adjacent phases. The reason for such large elastic stresses within the twin boundaries is studied in detail by the authors in [86]. The shear stresses \(\sigma _{(e)12}\) and \(\sigma _{(e)13}\) on the corresponding external boundary (having unit normal \(\varvec{e}_1\) in \(\Omega _0\)) are much lower due to the traction-free BC applied in the tangential plane. The stresses across the twins within twins boundaries are usually much higher than those across the twin boundaries (also see Fig. 7). This can be explained using the fact that the twin–twin boundaries are compatible only in an average sense, whereas the twin boundaries are compatible in Hadamard’s sense according to the crystallographic theory (see Chapter 5 of [1]). Understanding the stress distributions across these interfaces is important from the materials design perspective [85].

Fig. 6 Components of the Cauchy elastic stress tensor (in GPa) in the sample shown in Fig. 3d

Fig. 7 Components of the Cauchy elastic stress tensor across the lines drawn between points (a) O and A, and (b) C and B shown in Fig. 3d. a and b the elastic stress distribution across the twin boundaries between variants \(\mathsf{{M}}_1-\mathsf{{M}}_3\) and \(\mathsf{{M}}_1-\mathsf{{M}}_2\), respectively

Fig. 8 Plot for \(\kappa _2\) versus \(\kappa _1\) given by Eq. (3.15)\(_1\) for NiAl alloy

Table 1 Crystallographic solutions for twins between the variants for NiAl alloy in \(\{\varvec{c}_1\,,\varvec{c}_2\,,\varvec{c}_3 \}\) basis

Fig. 9 A plane within a twins-twins interface (within the domain shown in Fig. 3d) is shown by the rectangle. The unit normal to that plane is \({\varvec{m}} =0.1020\,\varvec{c}_1-0.7204\,\varvec{c}_2+0.6859\,\varvec{c}_3\)

Comparison of crystallographic and numerical solutions. We now compare the numerically obtained microstructure with the crystallographic solution obtained in Sect. 3.2. The twins within twins microstructure obtained using the present phase-field approach is shown in Fig. 3, where the twins of the variant pair \({\mathsf{{M}}}_1-{\mathsf{{M}}}_2\) form interfaces with the twins of the variant pair \({\mathsf{{M}}}_1-{\mathsf{{M}}}_3\), i.e., the indices of Fig. 1 are \(i=k=1\), \(j=2\), and \(l=3\). The normals to the interfaces between \({\mathsf{{M}}}_1\) and \({\mathsf{{M}}}_2\) plates make approximately \(45^\circ \) with both the \(\varvec{e}_1\) and \(\varvec{e}_2\) axes, and the normals to the interfaces between \({\mathsf{{M}}}_1\) and \({\mathsf{{M}}}_3\) plates make approximately \(45^\circ \) with both the \(\varvec{e}_1\) and \(\varvec{e}_3\) axes. The inclinations of the twin boundaries agree with the crystallographic solution listed in Table 1. The volume fractions of \({\mathsf{{M}}}_1\) in the respective twin pairs are calculated as 0.54 and 0.44, respectively, from the simulation result. Hence, \(\kappa _1=0.46\) and \(\kappa _2=0.56\). The unit normal \({\varvec{m}}\) to one of the twins within twins interfaces (shown by a rectangular plane with red lines in Fig. 9), pointing from the \({\mathsf{{M}}}_1-{\mathsf{{M}}}_3\) side to the \({\mathsf{{M}}}_1\)-\({\mathsf{{M}}}_2\) side, is obtained as

$$\begin{aligned} {\varvec{m}} =0.1020\,\varvec{c}_1-0.7204\,\varvec{c}_2+0.6859\,\varvec{c}_3. \end{aligned}$$
(4.2)

We now calculate \({\varvec{m}}\) using the analytical solution given by Eq. (3.17). As mentioned earlier, the volume fractions \(\kappa _1\) and \(\kappa _2\), satisfying the relation given by Eq. (3.15)\(_1\) and plotted in Fig. 8, cannot be uniquely determined from the limited governing equations at hand. We assume \(\kappa _1=0.46\) for the analytical solution based on the numerical data and obtain \(\kappa _2=0.5250\) using the analytical expression given by Eq. (3.15)\(_1\). The analytical and numerical results for \(\kappa _2\) differ by 7.7%. Finally, using Eqs. (3.16), (3.17), and (3.18), we obtain \(\zeta _2=0.2634\) and the vectors \({\varvec{b}}\), \(\tilde{{\varvec{m}}}\), and \({\varvec{m}}\) as

$$\begin{aligned} {\varvec{b}}= & {} \pm 0.16\,\varvec{c}_2+0.2093\,\varvec{c}_3, \nonumber \\ \tilde{{\varvec{m}}}= & {} \pm 0.7119\,\varvec{c}_2-0.7115\,\varvec{c}_3, \quad \text {and}\nonumber \\ {\varvec{m}}= & {} \mp 0.0927\,\varvec{c}_1\pm 0.6564\,\varvec{c}_2-0.7487\,\varvec{c}_3. \end{aligned}$$
(4.3)
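The step from \(\kappa _1\) to \(\kappa _2\) used above, and the curve of Fig. 8, follow from solving Eq. (3.15)\(_1\) for \(\kappa _2\). A minimal sketch is given below, assuming plain Python; the function names are hypothetical. It also evaluates the eigenvalues of Eq. (3.14) to confirm that the resulting pair gives \(\Lambda _1<1\), \(\Lambda _2=1\), and \(\Lambda _3>1\). Note that \(\kappa _2<1\) requires \(\kappa _1<1/(1+\upsilon )\).

```python
def upsilon(alpha, chi):
    """upsilon = (chi^2 - alpha^2) / (chi^2 + alpha^2), defined after Eq. (3.12)."""
    return (chi**2 - alpha**2) / (chi**2 + alpha**2)


def kappa2_from_kappa1(kappa1, ups):
    """Solve Eq. (3.15)_1, 1/kappa1 - 1/kappa2 = upsilon, for kappa2."""
    if not 0.0 < kappa1 < 1.0 / (1.0 + ups):
        raise ValueError("kappa1 must lie in (0, 1/(1+upsilon)) so that kappa2 < 1")
    return kappa1 / (1.0 - kappa1 * ups)


def eigenvalues(kappa1, kappa2, alpha, chi):
    """Return (Lambda1, Lambda2, Lambda3) exactly as written in Eq. (3.14)."""
    ups = upsilon(alpha, chi)
    s = chi**2 + alpha**2
    L1 = (1.0 - kappa2 * ups) ** 2
    L2 = (((1 + kappa1) * alpha**2 + (1 - kappa1) * chi**2) ** 2
          * ((1 + kappa2) * chi**2 + (1 - kappa2) * alpha**2) ** 2 / s**4)
    L3 = (1.0 + kappa1 * ups) ** 2
    return L1, L2, L3


# Usage with the NiAl stretches and the numerically measured kappa1
alpha, chi = 0.922, 1.215
ups = upsilon(alpha, chi)
kappa2 = kappa2_from_kappa1(0.46, ups)
print(round(kappa2, 4), eigenvalues(0.46, kappa2, alpha, chi))
```

Sampling `kappa2_from_kappa1` over the admissible range of \(\kappa _1\) gives the kind of curve shown in Fig. 8.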

The maximum difference in the components of analytical and numerically obtained unit normals \({\varvec{m}}\) is approximately 9%, and the orientation of the analytical \({\varvec{m}}\) differs by \(5.3^\circ \) from the numerical one. One of the main sources of difference is that in the numerical solution, local stress fields and their relaxation by incomplete martensitic variants at the twin within twin interface are automatically taken into account. Twins within twins microstructures qualitatively similar to the one shown in Fig. 3d have been observed experimentally during cubic to tetragonal MTs, for example, in NiAl alloy [9, 87] and NiMn alloy [88].
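The comparison between the two normals can be reproduced from the rounded components in Eqs. (4.2) and (4.3). The sketch below assumes NumPy and uses a hypothetical helper `misorientation_deg`. Since the analytical normal with the upper signs of Eq. (4.3) points roughly opposite to the numerical one, the two vectors are compared as lines, i.e., the sign of the dot product is ignored; this convention is an assumption made here.

```python
import numpy as np

m_num = np.array([0.1020, -0.7204, 0.6859])      # Eq. (4.2), phase-field result
m_ana = np.array([-0.0927, 0.6564, -0.7487])     # Eq. (4.3), upper signs


def misorientation_deg(u, v):
    """Angle in degrees between the lines spanned by u and v (sign-insensitive)."""
    c = abs(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(c, 0.0, 1.0)))


angle = misorientation_deg(m_num, m_ana)
# Largest relative difference between the magnitudes of matching components
rel_diff = np.max(np.abs(np.abs(m_ana) - np.abs(m_num)) / np.abs(m_num))
print(f"misorientation = {angle:.2f} deg, max component difference = {100 * rel_diff:.1f} %")
```

Working from the four-digit components, the sketch should give a misorientation of roughly five degrees and a maximum component difference of about 9%, in line with the values quoted above; small deviations are expected from rounding.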

5 Concluding remarks

The main contributions of this paper are threefold and are summarized below:

(i) The phase-field model for multivariant MTs developed by the authors in [20] is revisited in this paper. That model is analyzed, and issues related to the nonlocal gradient energy and to the coupled kinetics of the variant–variant transformations in [20] are identified. Those issues are resolved here, and a non-contradictory and thermodynamically consistent multiphase phase-field model for studying multivariant MTs is presented. The present model considers \(N+1\) order parameters to describe the austenite and N martensitic variants similar to [20], where the sum of all the order parameters related to the variants is constrained to unity. The local part of the system free energy is taken from [20]; it is composed of the strain energy, the barrier energy between the phases and the variants, the thermal energy of \(\mathsf{{A}}\) and \(\mathsf{{M}}\) phases, and the penalization energies for deviation of the variant–variant transformation paths from the prescribed ones and for the triple and higher junctions. The barrier and gradient energies are multiplied with the determinant of the total deformation gradient, which yields the desired form of the structural stress tensor. The transformation stretch tensor is assumed to be a linear combination of the Bain strains multiplied with nonlinear interpolation functions. The interpolation functions are derived as fourth-degree polynomials in the order parameters using the thermodynamic equilibrium conditions of the homogeneous phases. The coupled mechanics and Ginzburg–Landau equations are derived using the balance laws of linear and angular momentum, and the first and second laws of thermodynamics. The kinetic coefficients related to the evolution of the order parameters for the variants at material points are taken to be piecewise constant functions depending on the values of the order parameters and thermodynamic forces. Neglecting the viscous stresses, the total stress at a point is found to be the sum of the elastic and structural parts.

(ii) A general approximate crystallographic solution for the twins within twins microstructure is obtained. The solutions for cubic to tetragonal transformations are presented considering all three variants present within the microstructure.

(iii) A non-monolithic finite element formulation for the coupled mechanics and phase-field equations is developed (see Appendix B). 3D twins within twins microstructure evolution in a single grain is studied, and the numerical results are compared with the crystallographic solution.

The present phase-field model can be used to study complex martensitic microstructures with any number of variants for cubic\(\leftrightarrow \)orthorhombic and cubic\(\leftrightarrow \)monoclinic transformations. The martensitic microstructures induced by defects, including nanovoid surfaces [89], dislocations, etc., can be studied by extending the present model. Note that for the MT from Si I to Si II, which involves large transformation strains, a new martensitic microstructure that does not obey the mathematical theory of martensite [1,2,3,4,5] was obtained with molecular dynamics simulations in [90]. It will be a challenge for the current (and any other) large-strain theory to simulate such a microstructure.