Introduction

Population analysts are increasingly going beyond fixed vital rate models, such as the life table and stable population, to dynamic (i.e., changing vital rate) models capable of reflecting the observed complexity of demographic change (cf. Schoen 2006). Dynamic models can show population trajectories over age and time, build on demographic regularities, and expand the value of available data. With regard to populations with multiple living states, however, progress in dynamic modeling has been slowed by the lack of a logical and realistic basis for depicting patterned demographic change.

Here, constant cross-product ratios (αs) in the elements of population projection (or transition) matrices are proposed as the basis for structured dynamic modeling of multistate populations. The constant-α approach relaxes the fixed stable assumption while maintaining substantial continuity over time and providing useful relationships between population stocks and flows. Those relationships enable the estimation of interstate transition probabilities using minimal data inputs. The approach can be of value to those modeling multistate populations and to those seeking to estimate the full array of population flows from population stocks or fragmentary data. The rationale for constant αs is set forth, implementation with different inputs is described, and an application to multistate models of poverty status is demonstrated.

Modeling With Constant Cross-Product Ratios

Defining Cross-Product Ratios in the Multistate Context

Multistate models, including models with multiple ages and states, are powerful analytical tools. They can reflect both period and cohort behavior and simultaneously encompass fertility, mortality, and migration. At the same time, multistate models are data intensive, which limits their applicability. Holding cross-product ratios constant greatly reduces multistate data requirements, facilitating both the determination of transition matrices and the calculation of population trajectories.

To develop the model, let us define N living state multistate transition matrix A as

$$ \mathbf{A}=\left[\begin{array}{cccc}{\uppi}_{11}& {\uppi}_{21}& \dots & {\uppi}_{N1}\\ {}{\uppi}_{12}& {\uppi}_{22}& \dots & {\uppi}_{N2}\\ {}\dots & \dots & \dots & \dots \\ {}{\uppi}_{1N}& {\uppi}_{2N}& \dots & {\uppi}_{NN}\end{array}\right], $$
(1)

where πij represents the probability that a person who begins the interval in state i ends the interval in state j. For now, assume that there is no mortality or other form of attrition. Then each column depicts the end of interval probability distribution of those in a given initial state. With Σj πij = 1 for all initial states i, transition matrix A is column stochastic (i.e., the columns sum to 1).

The N state transition matrix can have (N – 1)2 independent cross-product ratios. Let the element of any matrix A in row i and column j be A(i,j). With no zero elements in A, cross-product ratio αij can be written as

$$ {\upalpha}_{ij}=A\left(1,1\right)\times A\left(i,\kern-0.20em ,j\right)/\left[A\left(1,j\right)\times A\left(i,1\right)\right]=\left({\pi}_{11}{\pi}_{ji}\right)/\left({\pi}_{j1}{\pi}_{1i}\right). $$
(2)

Transition matrix A can be used to define matrix Y of cross-product ratios, where

$$ Y=\left[\begin{array}{cccc}1& 1& \dots & 1\\ {}1& {\upalpha}_{22}& \dots & {\upalpha}_{2N}\\ {}\dots & \dots & \dots & \dots \\ {}1& {\upalpha}_{N2}& \dots & {\upalpha}_{NN}\end{array}\right]. $$
(3)

The elements in the first row and first column of Y are set to 1, and the (i, j)th element (i, j > 1) reflects the cross-product ratio defined in Eq. (2). Matrix Y thus contains the (N – 1)2 cross-product ratios implied by Eq. (2) and the A matrix of Eq. (1).

If A is a 2 × 2 matrix, the sole α is the same as the array’s odds ratio. For example, if

$$ \mathbf{A}=\left[\begin{array}{cc}.8& .25\\ {}.2& .75\end{array}\right], $$

then

$$ \mathbf{Y}=\left[\begin{array}{cc}1& 1\\ {}1& 12\end{array}\right]. $$

In general, αs are of the form (πhi / πhj) (πkj / πki), hk and ij, with the probabilities from base matrix A. The αs are of three types. In Type I, there are two matrix diagonal elements in the numerator, and the denominator has two probabilities that reflect offsetting transitions between those two states. Hence there is no net movement in Type I. In Type II, either the numerator or the denominator has a single diagonal element, with the other probability in that product indicating an interstate transition. The product of probabilities in the other portion of a Type II ratio indicates the same net transition, but via an intermediate state. In Type III, possible when N > 3, there are no diagonal elements. The products in the numerator and denominator indicate transitions from two sending states to two receiving states, with the products reflecting the same net transitions. In all three types, the products in the numerator and denominator indicate different, but qualitatively equivalent, transitions. Thus constant cross-product ratios impose a structure that constrains all the transitions, or interactions, between the states.

An important property of cross-product ratios is that they are not affected by row or column multiplication of A. Let R be an N × N diagonal matrix of row factors whose ith diagonal element is ri, and let C be an N × N diagonal matrix of column factors whose jth diagonal element is cj. By definition in Eq. (2), the set of αs for the matrix Z = RYC is the same as the set of αs for matrices A and Y (cf. Agresti 2013:45–46). In short, the cross-product ratios of a matrix are invariant to pre- and/or post-multiplication by diagonal matrices.

Interpreting Cross-Product Ratios

Cross-product ratios have been used by statisticians since at least the 1930s. They are at the heart of an estimating procedure known by many names, including biproportional (or multiproportional) adjustment (BPA), iterative proportional fitting (IPF), and the RAS method. Extensive discussions can be found in Bishop et al. (1975) and Willekens (1982). BPA based on fixed αs has often been used to estimate the array of interstate movements from population distributions at the beginning and end of an interval, beginning with the work of Kruithof (1937) and continuing through Chilton and Poet (1973), Philipov (1978), Nair (1985), and others. Iteration was employed to find the elements of the Z matrix that, given a base Y, satisfied the projection relationship

$$ {\mathbf{x}}_{end}=\mathbf{Z}\ {\mathbf{x}}_{beg}=\mathbf{RYC}\ {\mathbf{x}}_{beg}, $$
(4)

where x is a vector showing populations by state, and beg and end refer, respectively, to the beginning and end of the age or time interval. Matrix Z = RYC provides the full multistate transition matrix.

The BPA procedure has many desirable statistical properties (cf. Batty and Mackie 1972; Bishop et al. 1975). The iterative solution is unique, preserves the array of cross-product ratios, and gives a maximum likelihood estimate. That solution maximizes entropy in that it finds the pattern of interstate flows that can be achieved in the greatest number of ways (Halli and Rao 1992).

We can enlarge that perspective by viewing transition probability matrix A as a contingency table and bringing to bear the substantial body of research on the analysis of such tables (e.g., Agresti 2013; Bishop et al. 1975; Goodman and Kruskal 1979). Willekens (1982) appears to have been the first to formally link the estimation of multistate arrays using constant cross-product ratios to the log linear modeling of contingency table data. In two-way (or two dimensional) arrays, such as those considered here, a log linear model can be written as

$$ \log\ {p}_{ij}={\uplambda}_{i.}+{\uplambda}_{.j}+{\uplambda}_{ij}, $$
(5)

where the model has been scaled so that the overall λ is zero, λi. indicates the main effect of row i, λ.j indicates the main effect of column j, and λij is the (i, j) interaction effect. Eq. (5) represents a saturated model—that is, one in which all possible effects are included. Hence, the model fully specifies the pij (Agresti 2013:340–341). Exponentiating Eq. (5), and adjusting the subscripts to conform to Eq. (1), the analogous multistate equation can be written as

$$ {\uppi}_{ji}={r}_i\ {c}_j\ {\upalpha}_{ij}, $$
(6)

where ri is the row effect, cj is the column effect, and αij is the interaction effect, as embodied in the cross-product ratio. Per Eq. (4), Eq. (6) specifies the (i, j)th element of matrix Z = RYC.

Demographically, an approach to dynamic modeling that essentially used the Z = RYC form to represent change over time was the hyperstable model of Schoen and Kim (1994), which was extended to multiage and multistate models in Schoen (2006). McFarland (1972) used BPA to reconcile the number of brides and grooms in marriage analyses involving both male and female rates, offering a solution to the so-called two-sex problem of demography. Despite its positive features, Schoen and Jonsson (2003) criticized the constant-α approach for its lack of a demographic interpretation and proposed a relative state attractiveness (RSA) procedure. RSA assigns each state a factor representing how “attractive” that state is and, starting from a base matrix, finds the state-specific factors that would take the initial population to the end of interval population. Although the two approaches differ conceptually and procedurally, repeated applications to data consistently found that BPA with constant cross-product ratios and RSA yielded estimates that were extremely close to one another.

On closer examination, the thrust of the two approaches is quite similar. Aside from their treatment of the diagonal elements of the transition matrices, both BPA with constant cross-product ratios and RSA create estimates that are based on pre- and post-multiplication of a base matrix by diagonal matrices. The insight in Willekens (1982) that constant-α model calculations yield results corresponding to those of fixed interaction log linear models is consistent with Schoen and Jonsson’s (2003) argument that the RSA factors reflect the appeal of each state. Those attractiveness factors correspond to the log linear main effects. The constant αs correspond to the log linear interactions between states.

Is the assumption of fixed cross-product ratios reasonable? Previous estimations using αs from a presumably suitable base matrix yielded plausible results (Chilton and Poet 1973; Nair 1985; Philipov 1978), although there were some anomalies when small populations were involved. Schoen and Jonsson (2003) systematically compared constant-α estimates to observed values in both model and actual populations. In a two-state model where the probabilities of transfer were deliberately manipulated, the constant-α estimates were generally quite close to known model values. Errors were larger when both transfer probabilities increased or decreased, a circumstance contrary to the relative attractiveness assumption. In a four-state (single, married, widowed, and divorced) marital status model, reasonably good results were found. There was some sensitivity to the choice of base α values, although reasonable age patterns were found in every case considered, and the major summary measures of marriage and divorce were well estimated. Those results are very encouraging, but the constant-α constraint is quite strong. In the N state case, the constant-α assumption imposes (N – 1)2 constraints on N(N – 1) unknowns. The analyst should take care in choosing base α values, place less confidence in estimates based on few transfers or small populations, and give less credence to cases in which the relative attractiveness assumption is violated (e.g., when rates of both marriage and divorce increase). On the other hand, when applied to dynamic modeling, the constant-α approach seems highly appropriate for a wide range of applications.

In short, the constant-α approach is methodologically sound and demographically reasonable. The row and column effects, or relative state attractiveness factors, can change over time while the interactions between states (i.e., the αs) remain the same. Previous research has indicated that constant-α estimates generally perform well. Constant cross-product ratios provide the basis for realistic yet analytically tractable dynamic models. They can reflect changing behavior, trace population trajectories, and estimate demographic flows from population stocks.

Finding Transition Matrices in Models With Constant Cross-Product Ratios

In an N living state matrix of the form of Eq. (1), with no attrition, there are N(N – 1) independent probabilities. The set of constant α in Y provides (N – 1)2 constraints, leaving only (N – 1) additional constraints, which identify the main effects, to be specified.

The objective here is to determine the N × N transition probability matrix for a given interval from a given set of (N – 1)2 cross-product ratios and (N – 1) additional pieces of information. The idea is not to estimate the probabilities, but to find the probabilities that would prevail in a constant-α model with those N(N – 1) values. I examine four possible sources of data on main effects: (1) data on adjacent populations, (2) known probabilities, (3) known transfer rates, and (4) data on numbers of interstate transfers.

Finding Transition Matrices From Adjacent Populations

The principal use of constant cross-product ratios has been to estimate a transition array from a set of αs and the beginning and ending population distributions by state. To illustrate, assume an N living state model consistent with Eq. (1), with no mortality. I focus on transition probabilities and find unknown values from the known relationships, making use of the projection relationship in Eq. (4) and the fact that matrix Z = RYC is column stochastic.

With N living states, the projection relationship provides N – 1 equations, and the column sums provide N more equations. Diagonal matrices R and C are identified only to a scalar factor, so r1 can be set to equal 1. Thus, there are 2N – 1 unknown elements in R and C, and the system of equations is just identified. Numerically, those equations can be simultaneously solved using available mathematical software, such as Mathematica or Maple. Iteration is not necessary. Quadratic or higher-order equations are involved, and negative solutions are to be expected, but there will be only one demographically valid (i.e., real and nonnegative) solution for the factors in R and C and hence for probability matrix Z.

The solution procedure is straightforward. Consider the two living state case. Denoting the ending age/time by 1 and the initial age/time by 0, Eq. (4) yields the matrix projection equation

$$ {\mathbf{x}}_1=\mathbf{RYC}\ {\mathbf{x}}_0, $$
(7)

which for N = 2 can be written as

$$ \left[\begin{array}{c}1-{x}_{21}\\ {}{x}_{21}\end{array}\right]=\left[\begin{array}{cc}1& 0\\ {}0& {r}_2\end{array}\right]\left[\begin{array}{cc}1& 1\\ {}1& \upalpha \end{array}\right]\left[\begin{array}{cc}{c}_1& 0\\ {}0& {c}_2\end{array}\right]\left[\begin{array}{c}1-{x}_{20}\\ {}{x}_{20}\end{array}\right]. $$
(8)

Column vector xt gives the known state distribution of the population at time t (total population scaled to 1), xit denotes the number of persons in state i at time t, and α is the sole cross-product ratio. Column stochasticity in Z = RYC yields the two equations

$$ {\displaystyle \begin{array}{l}1={c}_1+{r}_2{c}_1\\ {}1={c}_2+{r}_2{c}_2\upalpha, \end{array}} $$
(9)

which imply

$$ {c}_1=1/\left(1+{r}_2\right) $$
$$ {c}_2=1/\left(1+{r}_2\ \upalpha \right) $$

and

$$ {r}_2=\left(1\hbox{--} {c}_2\right)/\left({c}_2\ \upalpha \right)=\left(1\hbox{--} {c}_1\right)/{c}_1. $$
(10)

Thus, column effects can readily be written in terms of the αs and the row effects.

From the first row of Eq. (8), we have the scalar equation

$$ {x}_{11}=1\hbox{--} {x}_{21}={c}_1\ {x}_{10}+{c}_2\ \left(1\hbox{--} {x}_{20}\right). $$
(11)

Equations (9) and (11) provide three equations in unknowns r2, c1, and c2. The algebra yields a quadratic with two solutions, but only one yields all positive factors. Taking the negative root,

$$ {c}_2=\left[1\hbox{--} \left({x}_{10}\hbox{--} {x}_{11}\right)\ \left(1\hbox{--} \upalpha \right)\hbox{--} {H}^{\frac{1}{2}}\right]/\left[\ 2\left(1\hbox{--} \upalpha \right)\ \left(1\hbox{--} {x}_{10}\right)\right], $$
(12)

where

$$ H=1+\left({x_{10}}^2+{x_{11}}^2\right)\ {\left(1\hbox{--} \upalpha \right)}^2+2\ {x}_{10}\ {x}_{11}\ \left(1\hbox{--} {\upalpha}^2\ \right)\hbox{--} 2\left(1\hbox{--} \upalpha \right)\ \left({x}_{10}+{x}_{11}\right). $$

The values of r2 and c1 follow from Eqs. (9), and the transition probabilities follow from Eq. (6). Here, π11 = c1, and π22 = r2c2 α.

To show a numerical solution, let the initial fraction in state 1, x10, be 0.4 and the end of interval fraction in state 1, x11, be 0.45, with α = 8. Eqs. (9) and (11) yield two solutions, {r2 = .357685, c1 = .736548, c2 = .258968} and {r2 = –.427129, c1 = 1.745595, c2 = –.413730}. Taking the all positive solution set and using the relationship Z = RYC produces

$$ \mathbf{Z}=\left[\begin{array}{cc}.736548& .258968\\ {}.263452& .741032\ \end{array}\right], $$

a column-stochastic probability matrix that satisfies projection Eq. (8).

Because the preceding approach is quite general, it can be used when the population composition is assumed to be stationary. The initial and ending populations then have the same composition.

In the simultaneous equation approach for N states, 2N – 1 equations need to be solved. Even for N = 3, an explicit algebraic solution is usually not feasible. Maple program XPR3, available in online appendix 1, provides an annotated numerical solution for the N = 3 case.

Using Known Probabilities to Calculate Transition Matrices

Adjacent populations represent just one possible source of data for finding transition probability matrices when the set of cross-product ratios is known. Another possible source is knowledge of (N – 1) probabilities. There are numerous ways in which investigators might obtain those probabilities. Here, I assume that the (N – 1) probabilities of remaining in one’s initial state (i.e., π11 through πN – 1,N – 1) are known.

Calculating the remaining (N – 1)2 transition probabilities requires (N – 1)2 equations of the form of Eq. (2) that relate each of the known α values to their constituent probabilities. Again, the solutions can be found by solving that set of simultaneous equations. When N = 2, there is only one cross-product ratio, α, and the unknown value of π22 is given by

$$ {\uppi}_{22}=\upalpha\ \left(1\hbox{--} {\uppi}_{11}\right)/\left[\upalpha\ \left(1\hbox{--} {\uppi}_{11}\right)+{\uppi}_{11}\right]. $$
(13)

It is evident from Eq. (13) that a larger π11 implies a smaller π22. Hence, if state 1 becomes more attractive, movements from state 2 to state 1 increase, and π22 falls. To gauge the effect of a change in α, we can examine

$$ \partial\ {\uppi}_{22}/\mathrm{\partial \upalpha }={\uppi}_{11}\ \left(1\hbox{--} {\uppi}_{11}\right)/{\left[\ \upalpha\ \left(1\hbox{--} {\uppi}_{11}\right)+{\uppi}_{11}\right]}^2. $$
(14)

Equation (14) indicates that an increase in α increases π22.

When N = 3, there are four αs and four unknown probabilities—say, π13, π31, π23, and π32. From Eq. (2), the cross-product ratio equations can then be written

$$ {\displaystyle \begin{array}{c}{\upalpha}_{22}=\left({\uppi}_{11}{\uppi}_{22}\right)/\left[\left(1\hbox{--} {\uppi}_{11}\hbox{--} {\uppi}_{13}\right)\ \left(1\hbox{--} {\uppi}_{22}\hbox{--} {\uppi}_{23}\right)\right]\\ {}{\upalpha}_{23}=\left({\uppi}_{11}\ {\uppi}_{32}\right)/\left[\left(1\hbox{--} {\uppi}_{11}\hbox{--} {\uppi}_{13}\right)\ {\uppi}_{31}\right]\\ {}\begin{array}{c}{\upalpha}_{32}=\left({\uppi}_{11}\ {\uppi}_{23}\right)/\left[\left(1\hbox{--} {\uppi}_{22}\hbox{--} {\uppi}_{23}\right)\ {\uppi}_{13}\right]\\ {}{\upalpha}_{33}={\uppi}_{11}\ \left(1\hbox{--} {\uppi}_{31}\hbox{--} {\uppi}_{32}\right)/\left({\uppi}_{13}\ {\uppi}_{31}\right)\end{array}.\end{array}} $$
(15)

From Eqs. (15), the solutions for the unknown probabilities are

$$ {\displaystyle \begin{array}{c}{\uppi}_{13}= NUM/\left[{\upalpha}_{22}\left(1\hbox{--} {\uppi}_{22}\right)+{\upalpha}_{32}\ {\uppi}_{22}\right]\\ {}{\uppi}_{31}={\uppi}_{11}\ \left[{\upalpha}_{22}\left(1\hbox{--} {\uppi}_{22}\right)+{\upalpha}_{32}\ {\uppi}_{22}\right]/ DEN\\ {}\begin{array}{c}{\uppi}_{23}={\upalpha}_{32} NUM/\left[{\upalpha}_{22}\left\{{\upalpha}_{32}\left(1\hbox{--} {\uppi}_{11}\right)+{\uppi}_{11}\right\}\right]\\ {}{\uppi}_{32}={\upalpha}_{23}{\uppi}_{22}\ \left[{\upalpha}_{32}\left(1\hbox{--} {\uppi}_{11}\right)+{\uppi}_{11}\right]/ DEN\end{array},\end{array}} $$
(16)

where NUM = α22 (1 – π11) (1 – π22) – π11 π22, and DEN = α22 α33 (1 – π11) (1 – π22) + α23 α32 π22 (1 – π11) + α22 π11 (1 – π22) + π11 π22 (α23 + α32 – α33). Solutions for other combinations of known and unknown probabilities can be found in the same fashion.

Although the unique solution is readily found here, negative probabilities can arise with N > 2. The solution for the πs in Eq. (16) does not follow from a Z = RYC formulation, which assures positive values. Simply put, there are combinations of probabilities and αs that cannot exist in any actual population. For example, if α22 = 50, π11 = 0.95, and π22 = 0.90, then NUM = –0.605, and π13 and π23 are negative. When negative values arise, the choice of αs (or πs) should be revisited.

In the three-state model, a change in either of the two chosen probabilities changes all of the other probabilities. Some numerical explorations indicated that those changes are consistent with the attractiveness notion—for example, a decrease in π11 leads to decreases in the likelihood of moving to state 1 and increases in the probability of leaving state 1. In the four-state model, the same pattern holds for a change in π11, but there is some complexity in the changes in transition probabilities between the other states in the model.

Using Known Transfer Rates to Calculate Transition Probabilities

Knowledge of the cross-product ratios and (N – 1) transfer rates offers another route to finding the transition probabilities in constant-α models. Because the αs are expressed in terms of probabilities, the first step is to go from the rate matrix to the probability matrix. To do so, array the rates in the column-stochastic rate matrix

$$ \mathbf{M}=\left[\begin{array}{cccc}-\Sigma {m}_{1j}& {m}_{21}& \dots & {m}_{N1}\\ {}{m}_{12}& -\Sigma {m}_{2j}& \dots & {m}_{N2}\\ {}\dots & \dots & \dots & \dots \\ {}{m}_{1N}& {m}_{2N}& \dots & -\Sigma {m}_{Nj}\end{array}\right], $$
(17)

where mij is the occurrence/exposure rate of transfer from state i to state j. The sums of the mij over j span all states except i. Under the generally reasonable assumption of linear change (Schoen 1975; Schoen 2006: chap.1), the matrix of transition probabilities is given by

$$ \boldsymbol{\Pi} ={\left(\mathbf{I}-\left(n/2\right)\mathbf{M}\right)}^{-1}\left(\mathbf{I}+\left(n/2\right)\mathbf{M}\right), $$
(18)

where n is the length of the age/time interval, and I is the N × N identity matrix (which has ones on the main diagonal and zeros elsewhere).

Equation (18) yields the transition probabilities in terms of the (N – 1) known rates and the (N – 1)2 unknown rates. The solution for the unknown rates and the probabilities they imply follows from solving the equations that relate those unknown rates to the cross-product ratios.

When N = 2 and rate m12 is known, we have the transition probability matrix

$$ \Pi =\left[\begin{array}{cc}\left(2-n\ {m}_{12}+n\ {m}_{21}/ DEN\right)& 2n\ {m}_{21}/ DEN\\ {}2n\ {m}_{12}/ DEN& \left(2+n\ {m}_{12}-n\ {m}_{21}/ DEN\right)\end{array}\right], $$
(19)

where DEN = (2 + nm12 + nm21). The solution for unknown rate m21 is then

$$ {m}_{21}=\left\{n\ {m}_{12}\ \left(1\hbox{--} 2\upalpha \right)+2{\left[1+\upalpha\ {n}^2{\left({m}_{12}\right)}^2\left(\upalpha \hbox{--} 1\right)\right]}^{\frac{1}{2}}\right\}/n. $$
(20)

The cross-product ratio approach easily handles π-zeros—that is, structural zeros in transition probabilities that occur when persons in a given state cannot move to a specified state. For example, a π-zero probability describes the likelihood that a married person enters the never married state. Each such π-zero rate generates a zero α and has a corresponding zero transfer rate. However, there are also m-zeros—that is, structural zeros in rate matrices when there are indirect but no direct moves between two states, such as when a widowed person moves to the divorced state. Then the rate is zero but the corresponding transition probability is not zero, because a widowed person can marry and then divorce. Statisticians have explored models of that kind under the label “quasi-independence” (cf. Agresti 2013; Goodman 1994; McDonald 2005), and quasi-independence models have seen some use by demographers (e.g., Rogers et al. 2003; Sweeney 1999).

The transfer rate calculation procedure may need to be modified when there are m-zeros, and the known cross-product ratios need to be chosen from a population that has the same pattern of interstate transfers as the multistate model. In the probability matrix derived from a rate matrix with m-zeros, the cross-product ratios may not all be independent. For example, consider the N = 3 path model that has four rates—m12, m21, m23, and m32—but no rates directly connecting states 1 and 3. In that model, α23 = α32, so there are only three distinct cross-product ratios. The model’s rates and transition probabilities can still be found, however, using the three distinct α plus one known nonzero transition rate.

When N = 2, m-zeros do not arise. With N = 3, rates and transition probabilities in models with only one m-zero can be calculated without any modification of the basic approach. In three-state models, there are 15 possible cases with two rates equal to zero. Six cases yield two π-zeros, hence two αs must be equal to zero for Y to be consistent with the structure of the model. Rates and probabilities in these models can be determined from the two nonzero αs and two known nonzero rates. Another three cases are of the form mij = mji = 0. In these cases, α23 = α32, and the rates and probabilities can be found as indicated previously for the N = 3 path model. In the remaining six models, the four nonzero rates can be found from the four nonzero αs. Procedures for models with more states and/or more m-zeros similarly depend on the number of independent αs.

Using Known Numbers of Transfers to Calculate Transition Probabilities

Counts of the number of movements (or decrements) between specified states can also provide some or all of the (N – 1) additional items of information needed to find transition probabilities in constant-α models. Because the αs are defined in terms of πs and the decrements are numerators of rates, the approach here again expresses the πs in terms of rates. In some of these constant-α models, the flows alone can determine the stocks.

In the two-state model, with xit being the number of persons in state i at time t and dij being the number of transfers (decrements) from state i to state j in the given n year interval, the solution equations can be written

$$ {\displaystyle \begin{array}{c}\begin{array}{c}{x}_{11}={x}_{10}\hbox{--} {d}_{12}+{d}_{21}\\ {}{x}_{21}={x}_{20}\hbox{--} {d}_{21}+{d}_{12}\\ {}\upalpha =\left[1\hbox{--} \left({n}^2/4\right)\ {\left({m}_{12}\hbox{--} {m}_{21}\right)}^2\right]/\left[{n}^2\ {m}_{12}\ {m}_{21}\right]\end{array}\\ {}{m}_{12}={d}_{12}/\left[\left(n/2\right)\ \left({x}_{11}+{x}_{10}\right)\right]\\ {}{m}_{21}={d}_{21}/\left[\left(n/2\right)\ \left({x}_{21}+{x}_{20}\right)\right]\end{array}}, $$
(21)

where the first two equations in (21) are flow equations specifying the possible interstate movements, the last two equations in (21) specify the occurrence/exposure transfer rates under the linear assumption, and the third equation in (21) is the cross-product ratio in the N = 2 linear model. Thus, (21) shows five equations with eight unknowns (x10, x11, x20, x21, d12, d21, m12, and m21). Because three more known values are needed to solve the model, knowledge of the two decrements alone is not sufficient. However, the model can be solved numerically if the two decrements and one population value are known.

The same approach can be applied when N = 3. There are 13 equations: three flow equations paralleling the first two equations in (21), four α equations, and six occurrence/exposure equations paralleling the last two equations in (21). Those 13 equations have 18 unknown values (six values of x, six of d, and six of m), and hence the model can be determined if five values of d are known. In trial calculations, the set of equations yielded eight numerical solutions, but only one was demographically valid. In general, with all transfers possible, (2N – 1) known decrements suffice to fully identify an N state model. If the rate matrix has m-zeros, the procedure needs to be modified to use only independent αs.

The four approaches to finding transition probability values from different data sources are summarized in Table 1. The choice of approach is largely data-driven, but each approach builds on a different aspect of constant-α models.

Table 1 Four approaches to finding transition probability matrices in N state multistate models with known constant cross-product ratios

Extensions of the Basic Model

Recognizing Mortality

The previous sections describe how to find the matrix of multistate transition probabilities of a constant-α model from a variety of data sources under the assumption of no mortality. However, it is not difficult to incorporate mortality or other forms of attrition, and I next examine two different ways to do so.

Recognizing Mortality That Is Uniform Over All States

In many analyses, it is reasonable to assume that the same level of mortality affects all states in the model. If the mortality rate in the n year age/time interval is mδ, then the mortality-adjusted rate matrix simply has the term (–mδ) added to every diagonal element of the matrix in Eq. (17).

Recognizing Differential Mortality

Mortality often differs significantly by state, and it is particularly important to incorporate differential mortality when health or disability states are in the model. Additional information that quantifies the nature of the mortality differentials is needed, and this information can come in many forms. Following Preston and Taubman (1994), I focus on measures of survivorship rather than mortality, given that assessing survivorship is the objective and survivorship ratios are much more stable than ratios of mortality rates.

With the beginning and ending populations by state known, I make two plausible assumptions. The first assumption is that (N – 1) ratios indicating relative survivorship are known; that is, the kj, j > 1, where

$$ {k}_j={\uppi}_{i,\mathrm{not}\updelta}/{\uppi}_{1,\mathrm{not}\updelta}, $$
(22)

with πj,notδ denoting the probability that a person in state j at the beginning of the interval is alive (i.e., not in dead state δ) at the end of the interval. The value of π1,notδ can be set equal to 1.

Second, I assume that the risk of death depends only on the person’s state at the beginning of the interval. With the kj ratios known, the level of survivorship, s, can be found from the overall survivorship relationship

$$ {\Sigma}_i\ {x}_{i, end}=s\left({x}_{1, beg}+{k}_2\ {x}_{2, beg}+\dots +{k}_N\ {x}_{N, beg}\right), $$
(23)

where the sum ranges over all living states, and subscripts beg and end refer, respectively, to the beginning and end of the interval. It follows that

$$ {\uppi}_{i,\mathrm{not}\updelta}=s\ {k}_j. $$
(24)

To implement the differential survivorship reflected by the πj,notδ, it is only necessary to calculate probability matrix Z under the assumption that column j sums to πj,notδ. The same approach can also be used to incorporate fertility or migration into the model.

Models With Multiple Ages and States

The multiage and multistate projection matrix typically incorporates fertility, mortality, and interstate transfer. Although large and data intensive, it can be a powerful tool in demographic analyses. With the present age-state (as opposed to age-stage) models, I follow Feeney (1970) and consider a block Leslie projection matrix, with the (1) first row blocks representing the “fertility” that produces the number in each state in age group 1 at the end of the interval, and (2) block subdiagonal matrices that advance the persons in age group j to age group j + 1. The fertility values for each row of the first block can be obtained from standard age-state schedules, scaled to yield the appropriate end of interval population when applied to the relevant initial population.

Each block subdiagonal survivorship matrix Zj can be found from the set of cross-product ratios specific to that subdiagonal block matrix, combined with the beginning and ending populations (or other data inputs). The procedure is identical to the previously described calculation of a transition probability matrix from αs and (N – 1) additional known values.

After all subdiagonal block matrices are found, the multiage and multistate context allows further calculations of interest. Employing the calculated Zj matrices from the lowest age to the highest traces the life course of the synthetic cohort that begins with a known state composition and advances, age by age, according to transition matrices Zj. The multistate life table embodied in the Zj permits the calculation of a variety of measures that summarize the life experience of the synthetic cohort.

Analytical Applications of Constant Cross-Product Ratio Models

Potential Areas of Application of Constant-α Models

Constant-α models are broadly applicable: they can determine transition probabilities in any state space and can structure the trajectories of dynamic populations. For example, models of urbanization, labor force status, and marital status can be applied to subnational and specialized populations, such as counties or religious groups. Demographic aspects of political affiliation, voting behavior, and the presence of specific traits or health conditions can be analyzed from just two cross-sectional surveys. The future evolution of the second demographic transition can be examined via alternative scenarios, as can a dynamic stationary model, in which the number of births is constant but probabilities of interstate transfer vary over age and time.

Here, I provide applications to the understudied area of poverty status. Knowledge of population distributions by poverty status and age, and the assumption of fixed cross-product ratios, allow an in-depth analysis of poverty transitions over the life course.

Applications to Analyses of Poverty Status

Poverty, homelessness, and income inequality are areas of continuing interest to social scientists in the United States (e.g., Fusaro et al. 2018; Hokayem and Heggeness 2014a). An excellent review of the literature in the area can be found in Cellini et al. (2008). The official poverty thresholds, updated annually by the U.S. Census Bureau, are designed to reflect the minimum level of resources needed to meet the basic requirements of a family unit (Hokayem and Heggeness 2014a). The dollar amounts of the thresholds vary by the number and age of family members, but they are the same for all states (except Alaska and Hawaii). Persons whose family is at or below 100% of their poverty threshold can be deemed “in poverty.” There has also been considerable interest in persons who are “near poverty,” typically taken to be persons in families between 100% and 125% of their poverty thresholds. Persons above 125% of their poverty thresholds can be termed “above poverty.”

Movements in and out of poverty are not routinely captured by any data collection system. Much analysis has thus focused on examining cross-sectional levels of poverty prevalence, relying heavily on survey data such as the Panel Study of Income Dynamics and the Survey of Income and Program Participation, as well as the National Longitudinal Survey of Youth and the Annual Social and Economic Supplements (ASEC) to the Current Population Surveys (CPS) (Bane and Ellwood 1986; Iceland 1997; Rank and Hirschl 2015). Those cross-sectional and retrospective data impose serious methodological limitations on the measurement of poverty dynamics. In particular, left truncation, in which the length of initially existing spells of poverty cannot be determined, is a major problem (Edwards 2015; Iceland 1997; Stevens 2011).

A life table approach, which is rate- or probability-driven, can deal with that truncation bias. By using data on the risks of movement into and out of poverty in every age interval, the life table reveals their implications for the experience of a hypothetical cohort of persons. Among recent efforts at estimating poverty transitions are the work of Lee et al. (2017) with Korean data and of Fernandez-Ramos et al. (2016) with Mexican data. The most ambitious recent effort is that of Bernstein et al. (2018), which combined several data sets to estimate transition probabilities and produce a two-state poverty status life table for the United States.

Here, I use the constant-α approach, combined with adjacent population distributions by age and poverty status, to produce a poverty status life table that follows the life course of a birth cohort as it moves through the states of in poverty (IP), near poverty (NP), and above poverty (AP).

The Poverty Data and the Analytical Approach

The poverty data I use come from the 2010 and 2015 CPS ASEC. For 2009 and 2014, respectively, table POV34 from those surveys provides poverty status data by single year of age, for males and females, for below 100% of poverty, between 100% and 125% of poverty, and above 125% of poverty (U.S. Census Bureau 2010, 2015). Those data were combined into five-year age groups, consistent with the five-year time interval used, and provide the needed adjacent population figures.

Probabilities of transition into and out of poverty are from Hokayem and Heggeness (2014b: table 3), who estimated two-year transition probabilities for a three-state (IP, NP, and AP) model from matched cross-sections of March CPS data. Their transition matrix, raised to the power 2.5 to convert the two-year interval to a five-year interval, was used as the base matrix from which the cross-product ratios were calculated. The same base matrix was used at every age.

Mortality data come from the 2012 U.S. life tables for males and for females (Arias et al. 2016: tables 2 and 3). Persons in poverty undoubtedly have higher risks of death that merit incorporation in life table survivorship, a finding that is reinforced by the recent study of U.S. educational mortality disparities by Montez et al. (2019). To approximate those mortality differences, I draw on the analysis of Preston and Taubman (1994) that examined results from the National Longitudinal Mortality Study. Taking mortality in state AP as 1.00, I assume the mortality rate differentials shown in Table 2.

Table 2 Estimated mortality rate differentials by age, sex, and poverty status

Those rate differentials, translated into differential survival probabilities, are used to determine column totals in the transition probability matrices while preserving the 2012 U.S. life table overall age- and sex-specific death rate. Details on the construction of the poverty status life tables are provided in the online appendix.

Results From the Poverty Status Life Tables

The complete male and female poverty status life tables (PovLTs) for the United States for the period 2009–2014 are shown in Tables A1 and A2 in online appendix 2. The principal results are given in Table 3.

Table 3 Summary measures from male and female poverty status life tables for the United States, 2009–2014

Life expectancy at birth (e(0)) for the life table cohorts is quite close to that of the U.S. 2012 life tables. The male e(0) is 76.3 years in the PovLT, compared with 76.4 in the U.S. 2012 life table; the corresponding figures for females are 81.0 and 81.2. A small difference is to be expected because different life table calculation methods were used, and the PovLT has a somewhat different state composition by poverty status than the 2012 U.S. population.

Females spent more of their lifetime in states IP or NP than males. For males, about 10 years (13% of total lifetime) were spent in state IP, and 3.3 years (4.3%) were spent in NP. For females, those figures are 13 years (16%) for IP and 4.2 years (5.2%) for NP.

Both males and females showed a great deal of movement into and out of poverty, with females having more transitions than males. For every person born, there were about two moves from IP to AP and two moves from AP to IP. For other pairs of states as well, there was considerable similarity in the number of moves in each direction.

The average duration (or spell) in a state, after entry by birth or transfer, is shown in Item 5 of Table 3. The average spell in state AP was fairly long, about 18 years for males and 16 years for females. In contrast, stays in states IP and NP were relatively short. Each entry into poverty implied a spell of 4.2 years for males and 4.4 years for females. Stays in state NP were even shorter, at about 2.6 years for both genders.

The full tables show only modest variations over age. Let subscripts A, I, and N to the life table rate (m) and decrement (d) functions denote the states AP, IP, and NP, respectively. Thus, for example, mAI represents the transfer rate from state AP to state IP. For males, rates of entry into poverty—both mAI and mAN—tended to decline over age, although mAN increased at ages over 62.5. Rates of leaving poverty, mIA, were substantially larger. The male rates tended to increase with age up to age 22.5, remain fairly flat until age 72.5, and then decline somewhat. For females, rates into poverty were relatively high below age 17.5, peaked around the age of majority, and then remained fairly constant over the rest of the life course. The female rates of leaving poverty, mIA, increased a bit irregularly to a plateau around ages 37.5–67.5, and then declined.

The life table figures on average duration in a state are not directly comparable to those given in most prior research: the latter estimates were constrained by the interval of observation and/or emphasized the risk of transition by duration in the state. The synthetic cohort approach of the life table does not provide probabilities by duration. However, the average life table duration in state IP of 4.2 years for males and 4.4 years for females is quite close to the 4.2 years found by Bane and Ellwood (1986). Their research found that most spells of poverty were of short duration and that most persons in poverty were in the midst of a rather long spell. The life table figure is thus the average of a heterogeneous distribution of durations.

Life table figures on prevalence are generally quite comparable to results previously found from survey data. Hokayam and Heggeness (2014a: table A4) found that in 2012, 13.6% of men and 16.3% of women were in poverty, nearly the same as the 13.2% and 16.2% figures in the PovLTs. Hokayam and Heggeness (2014a: table A4) reported that 4.4% of men and 5.1% of women were NP—proportions almost identical to the 4.3% and 5.2% figures shown in Table 3. The life table rates of entry and exit are also fairly similar in level to other findings. For the pre-1996 welfare reform period, Cellini et al. (2008) saw about a 4% chance of entering poverty in a year and a one-third chance of leaving. For 2009–2014, the life table showed a rate of entering poverty (mAI) of around 3% and a rate of exiting (mIA) of about one-fourth. Given the limitations of data and methods, including the lack of age specificity in the cross-product ratios and the life table’s failure to recognize a person’s past history and duration in a state, the present results are most supportive of the constant-α/life table approach.

Summary and Conclusions

A new approach to dynamic multistate modeling based on constant cross-product ratios (αs) is advanced here. Cross-product ratios are statistically sound and demographically interpretable measures that can be seen as part of a log linear modeling of state-specific main effects and two-way interactions. Constant αs offer an efficient estimating technique that makes a broader range of subjects susceptible to demographic analysis. In N state models, constant αs and (N – 1) additional pieces of information allow the calculation of the full array of interstate transition probabilities.

To illustrate the constant-α approach, poverty status life tables were constructed. Probabilities of transfer between poverty statuses were found from observed cross-product ratios and adjacent population distributions by state. The results reproduce the poverty prevalence figures found in earlier research and reflect a cohort’s experience of poverty from birth to death.

More research is needed to better understand how cross-product ratios of demographic behaviors change over age and time, and how those changes influence transition probabilities. The dependence of the approach on a standard set of αs and optimal calculation procedures in the presence of m-zeros are areas that need further study. Nonetheless, by providing analysts with a new basis for constrained dynamic modeling and an additional method for exploiting existing data, the constant-α approach expands the range of dynamic analysis and enlarges the scope of empirical studies of multistate phenomena.