1 Introduction

Skilled human capital is an endogenous driver of economic growth: its intensive use raises productivity and fosters technological progress, which in turn stimulates a country’s Gross Domestic Product. However, a high level of education has a limited impact on economic growth if it does not meet the needs of the labour market.

Since the early 1970s, several studies have analysed the impact of the mismatch between the knowledge and expertise obtained through higher education and the skills required for the job in different developed countries (Berg and Gorelick 2003; Di Pietro and Urwin 2006; Freeman 1976; Rahona-López and Pérez-Esparrells 2013; Schomburg and Teichler 2011). In Italy, this educational mismatch has been widely analysed in the last 20 years, and the probability of working for graduates has been studied by applying a variety of linear and nonlinear models, as specified hereafter. In particular, Biggeri et al. (2001) proposed a multilevel discrete-time survival model to assess the determinants which affect the time of entry into the labour market, while Belloc et al. (2011) studied the non-linear relation between students’ income and drop-out probability by introducing a multinomial latent effects model with endogeneity that accounts for both heterogeneity and omitted covariates. Grilli and Rampichini (2007) introduced a multivariate multilevel model for polytomous responses (with a non-ignorable missing data mechanism) in order to identify the factors which influence the skills acquisition of graduates. Advances were also proposed in Grilli et al. (2016). Pozzoli (2009) analysed, through a non-parametric discrete-time single-risk model, the hazard of the first job for Italian graduates, taking into account the graduates’ characteristics and the effects relating to degree subject. Moreover, Grilli and Mealli (2008) focused on a methodology based on non-parametric bounds in the principal strata approach. In particular, this approach aimed at assessing the relative effectiveness of two specific degree programmes on the graduates’ job status.

More recently, Bini et al. (2011) proposed a multilevel logit model to measure the external effectiveness of university education, focusing on both the graduates’ characteristics and some contextual factors that affect the Italian regional labour markets differently. Ballarino and Bratti (2009) described, through a multivariate nonlinear model, how the effect of different fields of study on the university-to-work transition changed between 1995 and 2004 in Italy. In addition, they analysed the early (3 years after graduation) destination outcomes of graduates by means of a multinomial logit model and focused on the impact of the field of study on the probabilities of being jobless, attending postgraduate education or a training course, and having an unstable or a stable job. Sciulli and Signorelli (2011) applied a Cox proportional hazard model with competing risks to study the employment probabilities of graduates from a medium-sized Italian university, while Lombardo and Passarelli (2011) provided a specific analysis of the determinants of graduates’ job quality, such as the contract type, the educational match and the wage. On the other hand, Cammelli et al. (2011) proposed a comparative descriptive analysis of the mobility and employability of Bachelor graduates in Italy with respect to other European countries after the Bologna reform process. In particular, they discussed the positive and negative effects of the university reform by analysing the surveys undertaken by the AlmaLaurea-Interuniversity Consortium from 2008 to 2009. Finally, Grilli et al. (2016) used a quantile regression model of gained university credits to evaluate the role of pre-enrolment assessment tests, while Mollica and Petrella (2017) applied a binary quantile regression approach to analyse the Bachelor-Master transition phenomenon.

In order to contribute to this extensive literature, a multilevel approach is applied in the present paper. In particular, after a brief review of multilevel modelling (Sect. 2), the microdata regarding the job opportunities are described (Sect. 3). The probability of being employed for Italian graduates within 3 years from degree is estimated through a multilevel binary logistic regression model (Sect. 4.1). Afterwards, the interest in job dynamics leads to a multilevel multinomial logistic regression model (Sect. 4.2). Compared with other classical models often applied in the literature to this kind of data, the proposed multinomial model takes into account the current job status, characterized by at least three outcomes (unemployed/unsteady employed/steady employed). The choice of this multinomial model is motivated by the continuous changes in the labour market, which now includes a wide range of unsteady job opportunities. Indeed, the labour market is characterized by new forms of employment, which are more flexible for the employer and often based on a limited period of time (Martinelli et al. 1999).

An Italian National Institute of Statistics (ISTAT) data set regarding the employment opportunities of Italian graduates is considered.

2 Brief Theoretical Background on Multilevel Models

The multilevel approach is a statistical methodology for the analysis of hierarchical data structures with complex patterns of variability, which focuses on nested sources of variability (Goldstein 2010; Snijders and Bosker 2012).

As highlighted in Gill and Womack (2013), multilevel models extend the linear model and the generalized linear model by incorporating different levels directly into the model statement. Therefore, all of the common models for linear, dichotomous, count, restricted range, ordered categorical, and unordered categorical outcomes are supplemented by adding a structural component. This structure classifies cases into known groups, with their own set of explanatory variables at the group level (Hox 2002; Scott et al. 2013). Unlike traditional regression models, multilevel models identify explanatory variables for groups and estimate the variability at the different levels of the hierarchy, and hence the effects of groups on the response variable; moreover, unbiased estimates of the standard errors are obtained (Longford 1993).

In the next two sections, a brief review of the multilevel binary and multinomial logistic regression models used in this paper is given.

2.1 Binary Logistic Regression Model

Let \(Y_{ij}\) be a binary variable which takes values 0/1 (response categories), let \({x}_{ij}\) be a single explanatory variable for the ith unit at level one within the jth unit at level two, and let \(\varepsilon _{j}\) be a random effect at level two.

The two-level binary logistic regression model is the following:

$$\begin{aligned} \eta _{ij}= \alpha +\beta \,x_{ij}+\varepsilon _{j}, \end{aligned}$$
(1)

where

$$\begin{aligned} \eta _{ij}=\operatorname{logit}(\pi _{ij})\quad \text{ and } \quad \pi _{ij}=P(Y_{ij}=1|{ x}_{ij},\varepsilon _{j})=\frac{\exp \{\eta _{ij}\}}{1+\exp \{\eta _{ij}\}}, \end{aligned}$$

represents the response probability for \(Y_{ij}=1\).

Note that the binary response variable \(Y_{ij}\) follows a Bernoulli distribution taking values 0/1 (where the value 0 corresponds to the reference category), whilst the random effect \(\varepsilon _{j}\) is assumed to be normally distributed with expected value zero and variance \(\sigma ^2_{\varepsilon }\) (Agresti 2002; Tutz 2012).

The model (1), structured in two levels, can be easily extended to many levels.
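As an illustration of model (1), the following minimal Python sketch maps the linear predictor to the response probability and simulates responses from a two-level structure. The function names, the parameter values and the toy covariate layout are assumptions introduced here for illustration only; they do not come from the data analysed in the paper.

```python
import math
import random


def inv_logit(eta: float) -> float:
    """Inverse logit: maps the linear predictor eta_ij to the probability pi_ij."""
    return 1.0 / (1.0 + math.exp(-eta))


def simulate_two_level(alpha, beta, sigma_eps, x_by_group, seed=0):
    """Simulate Y_ij ~ Bernoulli(pi_ij) with logit(pi_ij) = alpha + beta * x_ij + eps_j.

    x_by_group is a list of lists: one inner list of covariate values per
    level-two unit (hypothetical layout chosen for this sketch).
    Returns tuples (j, x_ij, pi_ij, y_ij).
    """
    rng = random.Random(seed)
    rows = []
    for j, xs in enumerate(x_by_group):
        # One normally distributed random effect per level-two unit.
        eps_j = rng.gauss(0.0, sigma_eps)
        for x in xs:
            pi = inv_logit(alpha + beta * x + eps_j)
            y = 1 if rng.random() < pi else 0
            rows.append((j, x, pi, y))
    return rows


if __name__ == "__main__":
    for row in simulate_two_level(-0.5, 1.0, 0.7, [[0, 1, 1], [0, 0, 1]]):
        print(row)
```

All units in the same level-two group share the same draw of the random effect, which is what induces the within-group correlation that the multilevel model is designed to capture.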

2.2 Multinomial Logistic Regression Model

Multinomial logistic regression is a technique that basically fits multiple logistic regressions on a multi-category unordered response variable that has been dummy coded (Moutinho and Hutcheson 2011; Tutz 2012). In other terms, the multinomial logit model is a regression model that links a categorical response variable with unordered categories to explanatory variables.

Let \(Y_{ij}\) be a multinomial response variable which takes values \(s=1,2,\ldots , S\) (response categories) and let \({x}_{ij}\) be an explanatory variable for the ith unit at level one and the jth unit at level two.

The two-level multinomial logit model is given as follows (Snijders and Bosker 2012):

$$\begin{aligned} \eta ^{(s)}_{ij}=\mathbf{\alpha }^{(s)}+\mathbf{\beta }^{(s)}\,{ x}_{ij}+{\varepsilon}_j^{(s)}+{\delta }_{ij}^{(s)}, \end{aligned}$$
(2)

where

$$\begin{aligned} \eta ^{(s)}_{ij}=\log \left( \frac{\pi _{ij}^{(s)}}{\pi _{ij}^{(1)}}\right) \quad \text{ and } \quad {\pi _{ij}^{(s)}}=P(Y_{ij}=s|{x}_{ij},{{\varvec{\varepsilon }}}_{j},{\varvec{\delta }}_{ij})=\frac{\exp \{\eta ^{(s)}_{ij}\}}{1+\displaystyle\sum _{r=2}^{S}\,\exp \{\eta _{ij}^{(r)}\}}, \end{aligned}$$

corresponds to the response probabilities for each category s, whilst \({{\varvec{\varepsilon }}}_j\) and \({\varvec{\delta }}_{ij}\) are vectors of random errors representing unobserved heterogeneity related to the jth unit at level two and the ith unit at level one.

Note that model (2), structured in two levels, can be easily extended to many levels.

The response variable \(Y_{ij}\) has a multinomial distribution, taking values in the set of categories \(\{1,2,\ldots ,S \}\), where \(s=1\) is the reference category for which all the parameters and the random errors are set to zero and the conditional probability of \(Y_{ij}=1\) is \(1/(1+\displaystyle\sum _{r=2}^{S}\,\exp \{\eta _{ij}^{(r)}\})\) (Bini et al. 2011; Grilli and Rampichini 2007).

The model (2) consists of \(S-1\) contrasts or sub-equations, one for each category apart from the reference one (Rasbash et al. 2009).

In the model (2), each sub-equation has specific parameters \(\alpha ^{(s)}\) and \(\beta ^{(s)}\); moreover, \({{\varvec{\varepsilon }}}_j\) and \({\varvec{\delta }}_{ij}\) are vectors of random errors with the following assumptions:

  • The errors at different levels are independent;

  • \({{\varvec{\varepsilon }}}_j^{\prime }=(\varepsilon _j^{(2)},\ldots ,\varepsilon _j^{(S)})^{\prime }\sim \,N(\mathbf{0}, {\varvec{\Omega }}_{\varepsilon })\);

  • \({\varvec{\delta }}_{ij}^{\prime }=(\delta _{ij}^{(2)},\ldots ,\delta _{ij}^{(S)})^{\prime }\sim \,N(\mathbf{0}, {\varvec{\Omega }}_{\delta }).\)
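The response probabilities implied by model (2) can be sketched in a few lines of Python. The function below is only an illustration: its name and the example values of the linear predictors are assumptions, and the random effects are taken as already included in each linear predictor.

```python
import math


def multinomial_probs(etas):
    """Response probabilities for a multinomial logit with reference category 1.

    etas holds the linear predictors eta^(2), ..., eta^(S) of the S-1
    non-reference categories; the reference category has eta^(1) = 0.
    Returns [pi^(1), pi^(2), ..., pi^(S)].
    """
    denom = 1.0 + sum(math.exp(e) for e in etas)
    return [1.0 / denom] + [math.exp(e) / denom for e in etas]


if __name__ == "__main__":
    # Hypothetical linear predictors for S = 3 categories.
    probs = multinomial_probs([0.4, -0.2])
    print(probs, sum(probs))
```

Because the reference category contributes the constant 1 to the denominator, the S probabilities always sum to one, and each linear predictor is the log of the ratio between the probability of its category and that of the reference category.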

3 Data

The microdata used in this paper come from an Italian National Institute of Statistics (ISTAT) survey on the job opportunities of a cohort of Italian graduates within 3 years from degree. The sample survey, conducted in 2007, collected 47,300 responses (26,570 in long courses and 20,730 in 3-year courses) from a population of 260,070 graduates (167,886 in long courses and 92,184 in 3-year courses); after removing missing values, the data set used for the analysis includes 44,775 graduates (23,779 women and 20,996 men).

Note that these microdata were made available thanks to a cooperation with IPRES (Apulian Institute of Economic and Social Research).

In this context, a multilevel approach has been applied by considering, first of all, a multilevel binary logit model and then, a multilevel multinomial model also suitable to describe the new dynamics of employment for graduates.

In particular, three hierarchical levels have been introduced:

  • The first level, that is the graduates (44,775 graduates) 3 years after graduation;

  • The second level, which denotes the groups of study courses (16 groups of degree courses);

  • The third level, corresponding to the Italian regions where the universities are located (20 Italian regions).

The choice of three hierarchical levels is justified by the fact that university education is organized as a hierarchical structure, where the universities are the highest level, in which the faculties operate by activating the degree courses, while the students represent the lowest level of nesting (Bini et al. 2011).

A thorough descriptive analysis on ISTAT microdata has been performed by focusing on the social and educational background of graduates, as well as on their employment status after completing their studies. On the basis of this exploratory data analysis, the covariates shown in Table 1 have been selected and recoded for computational purposes.

Note that the ISTAT data set contains exhaustive information only on the socio-demographic profile of graduates, without providing any further detail about the degree courses and the university regions. Therefore, only covariates related to the first level have been incorporated in the two models proposed in the paper, whilst no internal contextual factors at the second and third levels have been considered. On the other hand, the regional unemployment rate (measured in the reference year) has been included in the binary and multinomial models. Indeed, this external contextual factor is an important indicator of the economic and job market development of a territory (Bini et al. 2011) and, as such, it could affect the dynamics of employment for graduates. Hence, this covariate is useful to recognize the different effects that both individual and regional characteristics might have on the three alternative job statuses of graduates (unemployed/unsteady employed/steady employed).

Table 1 Individual covariates selected for the study

Before introducing the sections on the multilevel modelling applied in the paper, it is worth focusing on some descriptive results regarding the effectiveness and coherence of the university degree with respect to the labour market. In particular, the following scenarios have been identified:

  • Downgrading status, the worst condition for the employed graduate, whose degree is neither required to access the current job nor useful for job duties; in the reference year, about 20% of employed graduates were in this status, with completely underestimated skills;

  • Formal use, concerning the employed graduate whose degree is required to access the current job but is not effectively necessary: in the reference year, 10% of employed graduates considered their degree an important landmark, but completely useless for carrying out their current work;

  • Substantial use, describing the condition of the employed graduate whose degree is not required to access the current job, but whose skills (acquired through graduation) are useful for the job: in the reference year, about 12% of employed graduates were in this status;

  • Optimal use, the ideal position for the employed graduate, whose degree is a requirement to access and perform the current job: in the reference year, only 58% of employed graduates were entirely satisfied with their work.

Regarding the entry to the labour market, a comparison between the current employment status (in the reference year) and the employment status 1 year after graduation has been analyzed. In particular, by using some tools of multivariate analysis, four possible conditions have been identified:

  • Difficulty of entry, which is the worst position, for graduates who worked neither 1 year after graduation nor in the reference year: 12.6% of graduates were in this status;

  • Delayed access, referring to graduates who did not work 1 year after graduation but were hired within the reference year: 26.1% of graduates had a delayed entry into the labour market;

  • Immediate entry and unsteady work, which involves graduates who obtained a job 1 year after graduation but did not work in the reference year: 58% of Italian graduates were in this status;

  • Immediate access and steady work, which is the best placement, for graduates who were hired 1 year after graduation and were still working in the reference year: only 3% of graduates were in this status.
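The four conditions listed above are defined by two yes/no facts: whether the graduate worked 1 year after graduation and whether the graduate works in the reference year. The following sketch encodes that reading as a simple rule. It is an assumption-based illustration of the definitions only, not the multivariate procedure used to identify the groups; the function and argument names are hypothetical.

```python
def entry_condition(worked_after_1_year: bool, works_in_reference_year: bool) -> str:
    """Classify a graduate into one of the four labour-market entry conditions."""
    if worked_after_1_year and works_in_reference_year:
        return "immediate access and steady work"
    if worked_after_1_year:
        # Had a job 1 year after graduation, but not in the reference year.
        return "immediate entry and unsteady work"
    if works_in_reference_year:
        # No job 1 year after graduation, but employed by the reference year.
        return "delayed access"
    return "difficulty of entry"


if __name__ == "__main__":
    for early in (False, True):
        for current in (False, True):
            print(early, current, "->", entry_condition(early, current))
```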

4 Multilevel Models for the Probability of Working

By taking into consideration the above-mentioned results, the probability of being employed for Italian graduates within 3 years from degree has been modelled:

  • First of all, by a multilevel model for the binary response variable “not working/working” (Sect. 4.1),

  • Then, by a multilevel multinomial logit model for the polytomous response variable with three categories “unemployed/unsteady employed/steady employed” (Sect. 4.2).

This last model represents a natural extension of the binary logit model, since its response variable takes into account that job status is influenced by the new dynamics of the labour market, based on more flexible forms of contract.

Moreover, the binary logit model and the multinomial model have been compared by using the Deviance Information Criterion (DIC) index (Aitkin 2010; Spiegelhalter et al. 2002) and MCMC (Markov chain Monte Carlo) diagnostics (Brooks and Roberts 1998; Browne 1998, 2014; Browne and Draper 2006; Cowles and Carlin 1996; de Leeuw and Meijer 2008; Goldstein 2010). Spiegelhalter et al. (2002) used the deviance with MCMC sampling to derive the DIC diagnostic. The DIC is a generalization of Akaike’s Information Criterion and can be used to compare models through a trade-off between the fit of the model to the data (measured by the deviance) and the complexity of the model (measured by an estimate of the effective number of parameters). A lower DIC suggests a better model (Browne 2014; Rasbash et al. 2009). The MCMC engine in MLwiN (Browne 2014) calculates the estimated probabilities as part of the DIC diagnostic command. In particular, the MCMC diagnostics implemented in the MLwiN software (Rasbash et al. 2009) assess the convergence of the MCMC algorithms and provide more detailed information about the parameters included in a model, such as the parameter trace and some other diagnostic accuracy measures.
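The trade-off behind the DIC can be made explicit with a short sketch. Following Spiegelhalter et al. (2002), the DIC is the posterior mean deviance plus the effective number of parameters, where the latter is the posterior mean deviance minus the deviance evaluated at the posterior mean of the parameters. The Python functions below are an illustrative sketch, not the MLwiN implementation; the function names and the toy deviance values are assumptions.

```python
import math


def bernoulli_deviance(y, p):
    """Deviance (-2 * log-likelihood) of binary responses y given probabilities p."""
    return -2.0 * sum(
        math.log(pi) if yi == 1 else math.log(1.0 - pi) for yi, pi in zip(y, p)
    )


def dic(deviance_draws, deviance_at_posterior_mean):
    """Return (DIC, pD) from the deviances of the MCMC draws.

    deviance_draws: deviance evaluated at each retained MCMC draw.
    deviance_at_posterior_mean: deviance evaluated at the posterior mean
    of the parameters.
    """
    d_bar = sum(deviance_draws) / len(deviance_draws)  # fit
    p_d = d_bar - deviance_at_posterior_mean           # complexity
    return d_bar + p_d, p_d


if __name__ == "__main__":
    # Hypothetical deviances from three MCMC draws.
    print(dic([10.0, 12.0, 14.0], 11.0))
```

When two models are fitted to the same responses, the one with the lower DIC is preferred, since it balances fit and complexity better.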

4.1 A Multilevel Logit Model for the Binary Response Variable

Let \(Y_{ijk}\sim Ber(\pi _{ijk})\) be the binary response variable which takes value 0 for “not working” and 1 for “working”, where \(\pi _{ijk}\) denotes the probability of being employed, with the index i (\(i=1,\dots ,n_{jk}\)) representing the graduates (level 1 units), the index j (\(j=1,\dots , N_{k}\)) corresponding to the groups of degree courses (level 2 units) and the index k (\(k=1,\dots ,K\)) indicating the university regions (level 3 units). Moreover, let \(\mathbf{X}_{ {\large \cdot }}=\{X_{1 {\varvec{\cdot }}},X_{2 {\varvec{\cdot }}}, \ldots, X_{14 {\varvec{\cdot }}}\}\) be a set of covariates which influence the response variable.

The binary logistic regression model has been defined by using the covariates listed in Table 2, as follows:

$$\begin{aligned} \eta _{ijk}=\beta _{0jk} +\beta _{1}\,x_{1ijk} +\beta _{2jk}\,x_{2ijk} +\beta _{3} \, x_{3ijk}+\ldots+\beta_{14} \, x_{14ijk}, \end{aligned}$$
(3)

where

$$\begin{aligned} \eta _{ijk} =\operatorname{logit}(\pi _{ijk} ) \quad \text{ and } \quad \pi _{ijk}=\frac{\exp \{\eta _{ijk}\}}{1+\exp \{\eta _{ijk}\}} \end{aligned}$$

with \(i=1,\ldots ,44,775,\; j=1,\ldots ,16,\; k=1,\ldots ,20\); \(\beta _{0jk} =\beta _{0} +\varepsilon _{0k} +\delta _{0jk}\); \(\beta _{2jk} =\beta _{2} +\delta _{2jk}\); \(\left[ \varepsilon _{0k} \right] \sim N\left( 0,\varOmega _{\varepsilon } \right) ,\quad \varOmega _{\varepsilon }=\left[ \sigma ^2 _{\varepsilon 0} \right]\); \(\left[ \begin{array}{l} \delta _{0jk} \\ \delta _{2jk} \end{array}\right] \sim N\left( {\varvec{0}},{\varvec{\Omega }} _{\delta } \right) ,\quad {\varvec{\Omega }} _{\delta }=\left[ \begin{array}{ll} \sigma ^2_{\delta 0} & \\ \sigma _{\delta 02} & \sigma ^2_{\delta 2} \end{array}\right]\).

The model (3) allows the intercept to vary across the group of degree courses (the 2nd level) and university region (the 3rd level); on the other hand, the slopes are assumed to be constant for each covariate, except for the regression coefficient of degree mark varying across the 2nd level. The multilevel logistic regression model has been fitted by using the MLwiN software (Rasbash et al. 2009).
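To make the structure of model (3) concrete, the following Python sketch shows how the region-level and course-level random effects enter the linear predictor, and how the two correlated level-2 effects can be drawn from their bivariate normal distribution. It is only a sketch under stated assumptions: the function names, the covariate layout (the covariate with the random slope is placed second in the list) and all numerical values are hypothetical, and the code does not reproduce the MLwiN estimation.

```python
import math
import random


def linear_predictor(beta0, betas, x, eps_0k, delta_0jk, delta_2jk):
    """Linear predictor eta_ijk of a three-level logit with a random slope.

    betas and x list the fixed slopes and covariate values in the same order;
    the second covariate (position 1) carries the level-2 random slope.
    """
    eta = beta0 + eps_0k + delta_0jk  # intercept varying at levels 3 and 2
    for position, (b, value) in enumerate(zip(betas, x)):
        slope = b + delta_2jk if position == 1 else b
        eta += slope * value
    return eta


def draw_level2_effects(rng, var0, cov02, var2):
    """Draw (delta_0jk, delta_2jk) from N(0, Omega_delta) via a 2x2 Cholesky factor."""
    l11 = math.sqrt(var0)
    l21 = cov02 / l11
    l22 = math.sqrt(var2 - l21 ** 2)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return l11 * z1, l21 * z1 + l22 * z2


if __name__ == "__main__":
    rng = random.Random(1)
    eps_0k = rng.gauss(0.0, math.sqrt(0.1))            # hypothetical level-3 variance
    d0, d2 = draw_level2_effects(rng, 0.5, 0.05, 0.2)  # hypothetical Omega_delta
    eta = linear_predictor(0.3, [0.2, -0.4, 0.1], [1, 1, 0], eps_0k, d0, d2)
    print(1.0 / (1.0 + math.exp(-eta)))                # probability of working
```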

4.1.1 Results of Multilevel Binary Logit Model

Table 2 shows the estimates of the significant covariates’ coefficients obtained by using the maximum-likelihood method. It is worth noting that the covariates which are not statistically significant (that is, the father’s profession, the social class, the parents’ educational level and the regional unemployment rate) have been removed from the model.

Table 2 Estimates of fixed and random parameters, together with the standard errors (SE), the p value and the odds-ratios for the model (3)

In order to evaluate the effect of the covariates on the probability of working, the odds ratios (OR) have been calculated. From the ORs, given in the last column of Table 2, it can be pointed out that:

  • Being a female graduate reduces the odds of obtaining a job by 11%;

  • A degree mark greater than or equal to 105/110 reduces the odds of obtaining a job by 19%;

  • Being a female graduate with a degree mark greater than or equal to 105/110 increases the odds of working by 20.4%;

  • Being habitually resident in Southern Italy, Central Italy and abroad decreases the odds of working by 55, 24 and 19%, respectively, compared with the regions of Northern Italy;

  • A degree in Engineering, Architecture, Economics-Statistics and Medicine, compared with a degree in the humanistic area, increases the odds of obtaining a job by 130, 41, 26.5 and 12.3%, respectively.
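An odds ratio acts multiplicatively on the odds of working, so the corresponding change in the probability depends on the baseline probability. The short Python sketch below illustrates this point. The odds ratio of 0.89 is the value implied by the 11% reduction reported for female graduates, whereas the baseline probability of 0.70 and the function names are purely hypothetical choices made for the example.

```python
import math


def odds_ratio(beta: float) -> float:
    """Odds ratio associated with a logit coefficient beta."""
    return math.exp(beta)


def percent_change_in_odds(or_value: float) -> float:
    """Percentage change in the odds implied by an odds ratio."""
    return (or_value - 1.0) * 100.0


def shifted_probability(p_baseline: float, or_value: float) -> float:
    """Probability obtained by multiplying the baseline odds by the odds ratio."""
    odds = p_baseline / (1.0 - p_baseline) * or_value
    return odds / (1.0 + odds)


if __name__ == "__main__":
    print(percent_change_in_odds(0.89))     # change in the odds, in percent
    print(shifted_probability(0.70, 0.89))  # probability for a hypothetical baseline of 0.70
```

With a hypothetical baseline probability of 0.70, an odds ratio of 0.89 lowers the probability to about 0.675, that is, by roughly 2.5 percentage points rather than by 11%.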

The sociological implications associated with the ORs’ results will be analysed thoroughly in the following section.

Focusing on the random part of the model shown in Table 2, the variability is higher among the groups of degree courses (0.48) than among the university regions (0.11); this is confirmed by the estimated intra-class correlation coefficient (ICC), equal to 6.6% for the 2nd level and 2.84% for the 3rd level (Snijders and Bosker 2012).
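A common way to compute the ICC in a multilevel logit model is the latent-variable formulation, in which the level-1 residual variance is fixed at the variance of the standard logistic distribution, \(\pi ^2/3\). The sketch below assumes this formulation and uses only the two random-intercept variances; the function name is hypothetical. Under these assumptions, the variances 0.48 and 0.11 give a region-level share of about 2.8%, in line with the reported 3rd-level value. The sketch ignores the random slope of the degree mark, so it is not expected to reproduce the reported 2nd-level figure.

```python
import math

# Level-1 variance of the latent response in a logit model (standard logistic).
LEVEL1_VARIANCE = math.pi ** 2 / 3


def icc(level_variance, all_level_variances):
    """Share of the total latent variance attributable to one higher level."""
    total = sum(all_level_variances) + LEVEL1_VARIANCE
    return level_variance / total


if __name__ == "__main__":
    variances = [0.48, 0.11]  # groups of degree courses, university regions
    print(icc(0.11, variances))  # region-level share of the latent variance
```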

In order to interpret the above mentioned results referred to the model (3), the predicted probabilities of working have been calculated for the sample of graduates with respect to the university region (Table 3) and group of degree courses (Table 4).

Table 3 shows the estimated probability of working for all graduates, with respect to the university region of graduates, classified by gender and degree mark.

Table 3 Estimated probabilities of working with respect to the university region, classified by sex and degree mark

From Table 3, it is highlighted that the probability of being employed in the reference year is higher for graduates in the university regions of Northern Italy than in Southern Italy and islands. This behaviour confirms the existence of strong economic constraints deriving from the territorial inequalities in the distribution of household income. Moreover, a gender gap emerges from the estimates: the probability of working is higher for male graduates with a score less than 105/110 compared to those with a score higher than 105/110, while the female graduates have a higher probability of working in case of degree mark greater than or equal to 105/110.

These estimates might be supported by the tendency of male graduates with a high degree mark, unlike the female ones, to postpone the entry time into the labour market, since they probably hope for a highly-skilled and profitable job. Moreover, Table 4 shows the estimated probabilities of working for graduates classified by degree mark, with reference to the groups of degree courses.

It is worth noting that the probability of working is higher for graduates belonging to the scientific areas than for graduates belonging to the humanistic areas. In particular, graduates in Engineering, Architecture and Economics-Statistics are more likely to be employed in the long term. The gender gap highlighted at the university region level (Table 3) also persists when graduates have studied the same subject and achieved an identical degree mark.

Table 4 Estimated probabilities of working for graduates classified by degree mark, with reference to the groups of degree courses

4.2 A Multilevel Multinomial Logit Model

The interest in the job dynamics of graduates within 3 years from graduation has led to a focus on their possible employment status, that is, unemployed/unsteady employed/steady employed. The probability of being unsteadily or steadily employed has been estimated by a three-level multinomial logistic model.

The multinomial structure of the model enables the discernment of the different effects that both individual and regional characteristics might cause on the three alternative job status of graduates.

Let \(Y_{ijk}\) be the multinomial response variable, which takes value 1 for “unemployed” status, whilst it assumes value 2 or 3 for the “unsteady employed” or “steady employed” status, respectively; moreover:

$$\begin{aligned} \pi _{ijk}^{(s)}=\frac{\exp \{\eta _{ijk}^{(s)}\}}{1+\displaystyle\sum _{r=2}^{3}\,\exp \{\eta _{ijk}^{(r)}\}} \end{aligned}$$

represents the probability of being unemployed (\(s=1\) representing the reference category), unsteady employed (\(s=2\)), steady employed (\(s = 3\)), for graduate i, \(i=1,\ldots ,44,775\) in the group of degree courses j, \(j = 1, \ldots , 16\) and the university region k, \(k = 1, \ldots , 20\).

The following multinomial logit sub-equations are defined for covariates in Table 5 related to the category unsteady employed and steady employed, respectively:

$$\begin{aligned} \log {\left( \frac{\pi _{ijk}^{(2)}}{\pi _{ijk}^{(1)}}\right) }= \beta _{0jk}+ \beta _{2k}\, x_{2ijk}+ \beta _{3}\, x_{3ijk}+ \cdots + \beta _{14}\, x_{14ijk}+ h_{k}, \end{aligned}$$
(4)

where \(\beta _{0jk}=\beta _{0}+v_{0jk}\), \(\beta _{2k}=\beta _{2}+f_{2k}\) and \(h_{k}=\beta _{28}\cdot x_{28k}\);

$$\begin{aligned} \log {\left( \frac{\pi _{ijk}^{(3)}}{\pi _{ijk}^{(1)}}\right) }= \beta _{1jk}+ \beta _{15k}\, x_{15ijk}+ \beta _{16}\, x_{16ijk}+ \cdots + \beta _{27}\, x_{27ijk}+ h_{k}, \end{aligned}$$
(5)

where \(\beta _{1jk}=\beta _{1}+v_{1jk}\), \(\beta _{15k}=\beta _{15}+f_{15k}\) and \(h_{k}=\beta _{28}\cdot x_{28k}\).

Table 5 Estimates of fixed and random parameters, with the standard errors, p value and odds-ratio for the multinomial model

The random effects v and f are, respectively, group of degree course- and university region-specific effects, assumed to be independent across levels and such that the following assumptions hold:

  • \(\left[ \begin{array}{l} {f_{2k} } \\ {f_{15k} } \end{array}\right] \sim N\left( \varvec{0},{\varvec{\Omega }} _{f} \right) ,\quad {\varvec{\Omega }} _{f}=\left[ \begin{array}{l} {\sigma ^2_{f2} } \\ \sigma _{f2\,15}\quad \sigma ^2_{f15} \end{array}\right]\),

  • \(\left[ \begin{array}{l} {v_{0jk} } \\ {v_{1jk} } \end{array}\right] \sim N\left( \varvec{0},{\varvec{\Omega }} _{v} \right) ,\quad {\varvec{\Omega }} _{v}=\left[ \begin{array}{l} {\sigma ^2_{v0} } \\ \sigma _{v01}\quad \sigma ^2_{v1} \end{array}\right]\).

Estimation is carried out using the Iterative Generalized Least Squares procedure with Maximum-likelihood method.

Unlike the binary logit model proposed in (3), in the multilevel multinomial model:

  • The intercept of the model has been assumed to vary across the groups of degree courses;

  • The slopes have been assumed to be constant for each covariate except for the regression coefficient of the gender which varies across the third level;

  • A regional-level indicator (contextual factor), namely the regional unemployment rate in Italy for the year 2007 (source: www.istat.it), has been incorporated because it is statistically significant.

Note that the interaction term between degree mark and gender has been removed because it is not statistically significant.

The multilevel multinomial logit model has been fitted by using the MLwiN software (Rasbash et al. 2009).

4.2.1 Results of Multilevel Multinomial Logit Model

Table 5 shows the estimates of fixed and random parameters of the multinomial model. It is worth pointing out that the covariates which are not statistically significant (that is the father’s profession, the social class and the parents’ educational level) have been removed from the multinomial model.

From the ORs of the multinomial model it is worth pointing out that:

  • Being a female graduate increases the odds of occasional work by 270% and, at the same time, reduces the odds of getting a stable job by 24%: this is probably because female graduates, owing to childbearing intentions, could be more likely to get an unsteady job rather than a stable one;

  • Habitual residence in Central Italy and abroad reduces the odds of occasional work (by 6 and 7%, respectively) and of getting a steady job (by 15 and 11%, respectively) compared with the regions of Northern Italy. On the other hand, the probability of getting a steady job is lower in the regions of Southern Italy than in Northern Italy, while the probability of occasional work is higher in Southern Italy than elsewhere. These results support the analyses of regional disparities regarding the quantitative and qualitative dimensions of job unsteadiness, as well as job characteristics, wages and employment contracts (Dyker 2010; Di Berardino et al. 2015; Sverke 2004). In particular, contracts that provide opportunities for career development and prospects of steady employment are more widespread in Northern and Central Italy than in Southern Italy; on the other hand, irregular work and specific contractual forms (more exposed to the risks of job insecurity) are concentrated in the South of Italy;

  • A degree in the Architecture, Medicine, Engineering and Economics-Statistics areas increases the probability of obtaining both a precarious and a stable job with respect to the humanistic area. This is confirmed by some social studies. In particular, it has been highlighted that, for the group of medical sciences, almost two-thirds of graduates work in health and social services, while more than half of graduates in Architecture are employed in Industry, Commerce and Transport and Service sectors. On the other hand, degrees in Engineering, Architecture and Economics-Statistics are considered more flexible and profitable than those obtained in other areas, because of the better job opportunities in several fields of activity (Szanton 2004);

  • A degree mark less than 105/110 increases the odds of obtaining both a precarious and a stable job by just 6%. Indeed, graduates with a high degree mark take longer to find a job, because they set higher reservation wages on the basis of their higher expectations about job quality and wages. This behaviour reduces the range of employment opportunities (Sciulli and Signorelli 2011);

  • The regional unemployment rate reduces the odds of working, both occasionally and continuously, by 2%. Hence, the probability of obtaining either an unsteady or a stable job is lower in those university regions where the unemployment rate is high.

Moreover, according to the ICC (Table 5), the random-effect variances explain:

  • 53.23 and 12.5% of the total variance in the sub-equation (4) for university region-level and the groups of degree course-level, respectively;

  • 49.55 and 6.4% of the total variance in the sub-equation (5) for university region-level and the groups of degree course-level, respectively.

These estimates suggest that, unlike the binary model (3), significant unexplained variability across the group of degree course-level exists with reference to the probability of unsteady and stable working rather than across the university region-level.

In order to interpret these results, the predicted probabilities of seasonal and stable work have been calculated with respect to the groups of degree courses and the university regions. In particular, from Table 6, it is worth pointing out that, with reference to the groups of degree courses:

  • The estimated probability of seasonal work is higher for graduates with a degree mark less than 105/110 and belonging to the Medicine, Architecture, Literary, Language, Teaching and Physical Education groups;

  • The estimated probability of working continuously is higher for graduates with a degree mark greater than or equal to 105/110, without significant differences in terms of groups of degree courses.

Table 6 Predicted probabilities of unsteady and steady job, with respect to the groups of degree course
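Predicted probabilities of this kind can be illustrated with a short sketch. It assumes a multinomial logit with three outcomes, where "not working" is taken as the reference category (an assumption, with its linear predictor fixed at zero) and the two linear predictors correspond to sub-equations (4) and (5), each combining fixed effects and the relevant random effects. The function name and the numeric inputs are hypothetical and do not reproduce the values in Table 6.

```python
import math


def multinomial_probabilities(eta_unsteady, eta_stable):
    """Convert two linear predictors into three category probabilities.

    Assumption: the reference category (not working) has its linear
    predictor fixed at 0.
    """
    # Subtract the largest predictor before exponentiating, for stability.
    shift = max(0.0, eta_unsteady, eta_stable)
    exp_ref = math.exp(0.0 - shift)
    exp_unsteady = math.exp(eta_unsteady - shift)
    exp_stable = math.exp(eta_stable - shift)
    denom = exp_ref + exp_unsteady + exp_stable
    return {
        "not_working": exp_ref / denom,
        "unsteady": exp_unsteady / denom,
        "stable": exp_stable / denom,
    }


# Hypothetical linear predictors for one graduate profile.
probs = multinomial_probabilities(eta_unsteady=-0.4, eta_stable=0.9)
for status, p in probs.items():
    print(f"{status}: {p:.3f}")
```

Evaluating the function over each group of degree courses, or each university region, with the corresponding random effect added to the linear predictors would give one row of predicted probabilities per group.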

Regarding the university regions (Table 7), the estimated probabilities of being employed are higher for graduates from universities in Northern and Central Italy than for those from Southern Italy. In particular:

  • The estimated probability of being seasonally employed is always higher for female graduates than for male graduates, independently of the degree mark;

  • On the other hand, the estimated probability of being continuously employed is higher for male graduates than for female graduates when the degree mark is less than 105/110.

Table 7 Predicted probabilities of unsteady and stable job, with respect to the university regions

Thus, a degree mark gap exists for stable work, while the gender gap concerns unsteady work.

5 Concluding Remarks

This study aimed at modeling the probability of working for graduates within 3 years from graduation, taking into account the effectiveness of a degree with respect to the labour market and some contextual factors (such as the unemployment rate). First of all, a multilevel binary logit model was proposed.

The binary model highlighted that the probability of being employed is higher for graduates in the university regions of Northern Italy than in Southern Italy and the Islands. As previously mentioned, this behaviour confirms the existence of strong economic constraints deriving from the territorial inequalities in the distribution of household income, which often determine the interregional mobility of Italian university students. Moreover, a gender gap emerges from the estimates: the probability of working is higher for male graduates with a score less than 105/110 than for those with a score of 105/110 or higher, while female graduates have a higher probability of working when the degree mark is greater than or equal to 105/110.

These estimates might be explained by the tendency of male graduates with a high degree mark, unlike female ones, to postpone their entry into the labour market, since they probably hope for a highly skilled and profitable job. Moreover, with reference to the groups of degree courses, it is worth noting that the probability of working is higher for graduates belonging to scientific areas than for graduates belonging to humanistic areas. In particular, graduates in Engineering, Architecture and Economics-Statistics are more likely to be employed in the long term.

Similar conclusions were reached, and better supported, by the estimates of the second model.

In particular, from the multinomial model it was underlined that:

  • The estimated probability of seasonal work is higher for graduates with a degree mark less than 105/110 and belonging to the Medicine, Architecture, Literary, Language, Teaching and Physical Education groups;

  • The estimated probability of working continuously is higher for graduates with a degree mark greater than or equal to 105/110, without significant differences in terms of groups of degree courses.

As regards the university regions, the estimated probabilities of being employed are greater for graduates from universities in Northern and Central Italy than for those from Southern Italy. In particular:

  • The estimated probability of being seasonally employed is always higher for female graduates than for male graduates, independently of the degree mark;

  • On the other hand, the estimated probability of being continuously employed is higher for male graduates than for female graduates when the degree mark is less than 105/110.

Thus, the degree mark gap exists only for the steady job, while the gender gap concerns unsteady work. Indeed, with reference to the probability of seasonal and stable jobs, the multinomial model reveals a significant unexplained variability across the groups of degree courses rather than across the university regions. Therefore, unlike the binary model, the variability in the probability of unsteady or stable working depends more on differences between the university region-level rather than the groups of degree course-level. The DIC and deviance indexes suggest that the multinomial multilevel model fits the data significantly better than the binary model.
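For reference, the DIC mentioned above can be sketched as follows, assuming the standard definition DIC = D̄ + pD with pD = D̄ − D(θ̄), where D̄ is the posterior mean deviance and D(θ̄) is the deviance evaluated at the posterior means of the parameters. The function name and the deviance values are hypothetical and are not taken from the fitted models.

```python
import statistics


def dic(deviance_samples, deviance_at_posterior_mean):
    """Return (DIC, pD) from sampled deviances.

    Assumption: standard definition, with pD the effective number of
    parameters, pD = mean deviance - deviance at the posterior mean.
    """
    d_bar = statistics.fmean(deviance_samples)
    p_d = d_bar - deviance_at_posterior_mean
    return d_bar + p_d, p_d


# Hypothetical deviance draws from an MCMC run, for illustration only.
dic_value, p_d = dic([100.0, 102.0, 104.0], deviance_at_posterior_mean=98.0)
print(f"DIC = {dic_value:.1f}, pD = {p_d:.1f}")
```

A lower DIC indicates a better trade-off between fit and complexity.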

Moreover, the multinomial model, focusing on the graduates’ job status 3 years after degree and accounting for both the regional unemployment rate and individual characteristics of graduates, has to be considered much more informative and appropriate than the binary model to assess their probability of being employed.