This study examines the 2008 U.S. News and World Report (USNWR) peer assessment ratings for graduate/professional schools of business, education, engineering, law, and medicine. Despite the abstract nature of prestige in higher education, there has been no shortage of ratings and rankings that attempt to measure it. Even with some skepticism about their accuracy, ratings and rankings of both undergraduate and graduate programs are, in fact, important to colleges and universities as indicators of comparative standing. Many colleges and universities publicize their current place in the USNWR undergraduate rankings. Likewise, many graduate and professional programs highlight their current ranking on their websites, and in publicity materials distributed within their disciplines. Some institutions even target the rankings in their strategic plans.

Research evidence suggests, however, that students base their college attendance decision less on an institution’s specific rank (like the ones in various guidebooks and USNWR) and more on its overall academic reputation or prestige (Pryor et al. 2006). Although an institution’s reputation may not be easily separated from its multi-measure rank, this study assumes that the USNWR “peer assessment score” reflects the general reputation or prestige of each school, and that this judgment may be empirically different from the school’s “rank,” which includes several quantitative variables. Nevertheless, among the several variables USNWR employs in determining each school’s overall rank, the peer assessment score is weighted the heaviest, thus making this particular measure of reputation an influential one.

A growing body of scholarship has analyzed the reputation ratings of either undergraduate or graduate programs in higher education. Most studies at the undergraduate level conclude that two “inputs”—institutional size and admissions selectivity—are the primary drivers of reputation (Astin 1970; Astin and Lee 1972; Astin and Solomon 1981; Grunig 1997; Porter and Toutkoushian 2002; Schmitz 1993; Solomon and Astin 1981; Volkwein 1989). In the aggregate, these studies indicate that institutions with large enrollments and high SAT/ACT averages for entering freshmen receive the greatest prestige.

At the graduate level, several studies have examined the National Research Council (NRC) ratings (Jones et al. 1982; Goldberger et al. 1995). Similar to studies of the undergraduate rankings, studies of the graduate rankings have also shown that the same two factors—institutional size and admissions selectivity—are strongly associated with the variance in reputation scores. For example, Grunig (1997) found that size and selectivity explain between 85 and 90% of the variance in average 1995 NRC ratings of the scholarly quality of graduate program faculty. These 1982 and 1995 NRC ratings of graduate program reputation are arguably the most respected ratings of graduate education that exist. However, many years have passed since the NRC published its latest version, and the academic community has long anticipated the publication of another update (which should be released in 2009). Since the early 1990s, USNWR has published its annual edition of America’s Best Graduate Schools, which to some degree has filled the void left by the absence of an updated NRC publication. More recent studies indicate that graduate and undergraduate reputation are highly correlated with similar sets of variables, as well as with each other (Volkwein and Grunig 2005; Volkwein and Sweitzer 2006).

Some studies of reputation and prestige at the graduate level have aggregated the ratings to reflect the entire institution’s graduate programs as a whole (Grunig 1997; Volkwein 1986, 1989). However, studies that examine the correlates of prestige for individual graduate and professional schools are practically nonexistent, despite the fact that each graduate discipline is rated separately, not only by the reputable 1982 and 1995 NRC graduate ratings, but also by the annual USNWR rankings of graduate schools. Indeed, studies that examine the variables that may influence these discipline-specific ratings are few and far between. This study examines the variables that appear to be most strongly associated with the prestige of graduate and professional schools in the fields of business, education, engineering, law, and medicine. What are the correlates of prestige among the nation’s leading graduate and professional schools? Are there parallels with research findings at the undergraduate level? Are there similarities and differences among the disciplines? In short, does prestige in graduate education boil down to the same factors across disciplines?

Measuring Quality in Higher Education

Prestige ratings and rankings in higher education have been nothing if not controversial since the introduction of the first ratings in 1870 (Webster 1986). Histories of the rankings are contained in Webster (1992a) and Stuart (1995). Although the national rankings of graduate and doctoral programs, like those in Cartter (1966), Roose and Anderson (1970), and the 1982 and 1995 NRC ratings, have had their critics, they have demonstrated high reliability and consistency over time, with coefficients ranging from .93 to .98 (Astin 1985; Goldberger et al. 1995; Webster 1992a). No rating or ranking historically has received as much notoriety, nor as much criticism, as the USNWR rankings. Most of the debate over the USNWR rankings has been at the undergraduate level, although the methodology that USNWR employs to rank undergraduate and graduate schools is very similar. Debate over rankings and ratings is driven by two additional factors—doubts about the validity of the ratings as measures of quality, and even more basically, conflicting views about what constitutes quality in higher education.

Types of Validity and Educational Ratings

Face Validity

McGuire (1995) suggests that there are three types of validity with regard to prestige ratings in higher education—face validity, construct validity, and predictive validity. A measure has face validity if it appears to measure what it proclaims to measure (Anastasi 1988; Krathwohl 1998). If a measure appears to be accurate, then the construct has face validity. Grunig (1997) points out that the USNWR reputation survey is remarkably similar to a survey measuring perceptions of quality in business marketing (Boulding et al. 1993). Indeed, prestige ratings in higher education may be an excellent example of the cliché “perception is reality.” McGuire (1995) suggests that face validity is likely the strength of the USNWR methodology. In fact, one of the recurring criticisms of the USNWR rankings may very well be one of the strengths of those rankings; that is, that the rankings do not change much and the same schools are listed at the top each year. People tend to have a preconceived notion of which schools are the best in a given field, and indeed, most of those schools are at the top of the USNWR lists each year. Thus, if nothing else, the rankings at least appear to be accurate, providing face validity to the list.

Construct Validity and Predictive Validity

In relation to educational ratings, the other two types of validity—construct validity and predictive validity—are very similar. Construct validity refers to how accurately a scoring system or survey measures a theoretical construct (Carmines and Zeller 1979; Krathwohl 1998). Many critics see a disconnect between prestige ratings and the construct of quality in higher education. Predictive validity refers to how well a system predicts the outcome(s) it purports to measure. McGuire (1995) questions the construct validity (and, in effect, the predictive validity) of the USNWR rankings because the undergraduate rankings, in particular, have little to do with student educational experiences and outcomes. Indeed, if the ratings had predictive validity, an institution that receives a higher peer rating than another should also exhibit more favorable educational outcomes. However, in the absence of good outcomes measures, it is difficult to determine the predictive validity of educational rankings (Bogue and Saunders 1992). See below for a more detailed discussion of educational outcomes and rankings.

Models of Quality in Higher Education

An additional issue surrounding rankings is the lack of agreement about what constitutes quality in higher education. The literature describes at least four competing models or philosophies about what constitutes excellence (Burke 2005; Burke and Minassians 2003; Burke and Serban 1998; Seymour 1992).

Resource/Reputation Model

First, the academic community traditionally embraces the resource/reputation model initially described by Astin (1985). This model emphasizes the importance of financial resources, faculty credentials, student test scores, external funding, and ratings by peers. Under this model, faculty, sometimes joined by accreditation bodies, argue for more resources to support educational effectiveness. Institutional and program reputation plays an important if not central role in this model because reputation ratings by experts are generally accepted as legitimate reflections of prevailing beliefs about institutions and their programs (Volkwein and Grunig 2005).

Client-Centered Model

Second, many parents, students, and student affairs professionals cling to a client-centered model. Derived from the literature on quality management, this service-oriented model links quality to student and alumni satisfaction, faculty availability and attentiveness, and robust student services. Seymour (1992) articulates the client-centered model, and suggests that the first priority of a college or university is to fulfill the needs of students, parents, employers, and other “customers.”

Strategic Investment Model

Third, the corporate and government community generally believes in the strategic investment model (Burke 2005; Ewell 1994). This model emphasizes the importance of return on investment, cost-benefit analysis, and results-oriented and productivity measures such as admissions yield, graduation rates, time-to-degree, and expenditures per student.

Talent Development Model

A fourth model of quality is the most oriented toward educational outcomes. Astin’s (1985) talent development model places the development of both students and faculty at its center. An institution is successful if it develops student and faculty knowledge, skills, attitudes, and interests, and under this model the institutions most successful in fostering such development are those of the highest quality.

Of the four models of quality described above, the resource/reputation model is the one that most closely aligns with the concept of rankings and the methodology used in most rankings—in particular the USNWR rankings. USNWR employs several variables to rank graduate programs, and many of these variables are similar to those listed above as being central to the resource/reputation model of quality. Indeed, financial resources, faculty credentials, student selectivity, and external funding are all components of the USNWR graduate rankings (under a variety of variable names). Moreover, the variable that is weighted the heaviest in the USNWR methodology is the peer assessment rating, which is a subjective measure of academic quality as determined by deans and program directors. These top administrators serve as the “experts” referenced above in the explanation of the resource/reputation model.

Since the resource/reputation model so closely resembles many rankings methodologies, the conceptual framework employed in this study applies several components of the resource/reputation model, allowing an assessment of how well this model aligns with academic quality as judged by deans and administrators. Figure 1 depicts the conceptual framework and is explained in detail later.

Fig. 1 Conceptual model of prestige (adapted from Volkwein and Sweitzer 2006)

Research on Prestige

The scholarship on prestige and reputation in graduate and doctoral education can be divided into two groups—studies that treat the whole institution as the unit of analysis versus those that examine individual departments or disciplines as the unit of analysis.

Prestige Across Institutions as a Whole

Studies of institutional prestige most frequently examine undergraduate education, but a few have examined the correlates of prestige and ratings of quality at the graduate level, usually using NRC data. The institutional variables found to be significantly correlated with reputation ratings of doctoral universities include the following: the age and size of the university; average salary for full professors; total annual research expenditures of the institution; the institution’s share of national R&D expenditures; the annual number of doctorates conferred; the total number of full-time faculty; the average SAT scores of entering undergraduates; the student/faculty ratio; the number of library volumes; the percentage of faculty with doctorates; total publications; and publications per faculty member (de Groot et al. 1991; Diamond and Graham 2000; Geiger and Feller 1995; Graham and Diamond 1997; Grunig 1997; Lombardi et al. 2006; Porter and Toutkoushian 2002; Toutkoushian et al. 2003; Volkwein 1986, 1989). This list of variables is similar to the resource/reputation model of quality as detailed above.

Prestige for Individual Disciplines or Departments

The studies of prestige ratings for individual departments and disciplines indicate that reputation ratings are strongly correlated with variables reflecting the size of the department and its graduate programs, its resources, and its research and scholarly activity. The departmental variables found to be significantly correlated with ratings of faculty and program quality include the following: number of faculty in the department; the number of doctoral students and number of doctorates granted; annual research spending; number of articles published and the average per faculty member; the percentage of faculty holding research grants from selected government agencies; and the percentage of postdoctoral fellows (Abbott 1972; Abbott and Barlow 1972; Baldi 1994; Conrad and Blackburn 1986; Drew 1975; Elton and Rodgers 1971; Geiger and Feller 1995; Guba and Clark 1978; Hagstrom 1971; Hartman 1969; Keith and Babchuk 1998; Knudsen and Vaughn 1969; Tan 1992). All of these discipline-level studies were conducted at least a decade ago, and most are much older. None of them examines the USNWR graduate rankings.

Indeed, few studies have examined the USNWR graduate school ratings, despite their yearly publication since 1990. In one of the few exceptions, sociologists Paxton and Bollen (2003) examined USNWR graduate reputation ratings (as well as the NRC ratings) in three social science disciplines—sociology, political science, and economics. The authors conclude that 20–30% of the variance in the ratings for these three disciplines is a result of systematic method error, but most of the variance in these ratings is due to perceived departmental quality, which is what the ratings claim to measure. Clarke (2001) also examined the USNWR graduate rankings in order to study the effects of the yearly changes in the magazine publisher’s ranking methodology. She concludes that comparing yearly shifts in an institution’s overall rank is not possible due to the annual changes in methodology.

Politics and Importance of the Ratings

While the precision of the ratings may be questionable, and debate continues over what they actually measure, many authors do suggest that they nonetheless are important. Inside the university, deans and faculty alike believe in the value of “knowing where we stand” and staying competitive in the eyes of respected colleagues. Indeed, Paxton and Bollen (2003) suggest that graduate program ratings can influence the status of a department within its own university as well as among others in its discipline across universities. Rankings influence university planning and resource allocation decisions by identifying those doctoral programs to be developed as centers of excellence, to be resurrected from obscurity, or even to be phased out (Shirley and Volkwein 1978). Comparing current with past ratings can alert universities to departments that are in serious decline, as well as to confirm areas of significant improvement.

Finally, reputation ratings serve as marketing and recruitment tools for departments and professional schools. For prospective graduate students, faculty, and deans alike, the ratings constitute valuable information that influences both application and acceptance decisions (Ehrenberg and Hurst 1996). For example, a Wall Street Journal article documents the importance of the USNWR law school rankings to prospective law students (Efrati 2008). The article also suggests that the USNWR rankings played a part in a law school dean resigning her post after the school (University of Houston Law Center) dropped several places within a few years.

As with the undergraduate rankings, the methodology that USNWR employs in its graduate rankings weights the subjective reputational measure the heaviest among the several variables that determine the overall rank of each individual graduate school or college. Ehrenberg (2003) suggests that some administrators whom USNWR surveys choose not to participate due to questions about the survey’s merit; likewise, some administrators participate specifically to rate some of their competitors lower than their reputations would warrant.

Despite weaknesses like reputation lag, rater bias, and halo effects (Arenson 1997; Astin 1985; Webster 1992b), the graduate and professional school ratings receive little criticism and are widely perceived to have legitimacy, especially as indicators of faculty scholarly quality rather than of “educational quality” (Volkwein and Grunig 2005). For better or worse, graduate school reputation and prestige continue to play a role in defining quality in graduate education, and the USNWR peer assessment ratings have persisted in quantifying such prestige.

Conceptual Framework

The conceptual framework that drives the variable selection and analysis for this study is adapted from Volkwein and Sweitzer (2006). Based on the organizational systems and higher education literature, this framework incorporates elements primarily from the resource/reputation model of quality in higher education. Summarized in Fig. 1, this conceptual model assumes that the size and resource base of each school enable it to invest in both faculty and student resources and selectivity. Volkwein and Sweitzer argue that universities interact with their environments in ways that enhance the acquisition of human and financial resources. Whether public or private in origin, these investments by environmental stakeholders produce a student/faculty/program mix that has a measurable impact on both instructional outcomes and scholarly productivity, which in turn influence institutional prestige. Institutional prestige in turn makes the institution appear more attractive to prospective students and faculty, as well as to those who provide financial resources. Thus, the reputational cycle reinforces faculty and student quality and funding. Finding strong support for the model, the Volkwein and Sweitzer study (focusing on undergraduate rankings) explained over 90% of the variance in USNWR peer assessment scores among 242 research universities, and over 88% of the variance in reputation among 205 liberal arts colleges.

Although the Volkwein and Sweitzer model was developed for the USNWR peer ratings at the undergraduate level, it applies in large part to individual graduate/professional schools as well. The schools of business, education, engineering, law, and medicine represented in this study compete in the marketplace for talented students and talented faculty by investing in faculty and student recruitment, as well as in the services and activities that support students and staff and make the school more attractive. Successful faculty recruitment and student recruitment interact with each other, exerting an influence on enrollment and financial resources, as well as on instructional and research productivity. The budgets of both public and private professional schools are substantially enrollment driven, thus providing incentives to grow larger. Finally, faculty talent and effort combine with student talent and effort to produce various faculty and alumni outcomes that build prestige.

The USNWR reputation or peer assessment ratings for these five graduate and professional schools have received little analysis. It is not known if the dynamics that lead to high and low prestige among schools of business, education, engineering, law, and medicine are similar or different. For example, do larger schools have an advantage in some disciplines but not others? How important is graduate student selectivity? Are there parallels with other research findings at the undergraduate level and are the results consistent with the Volkwein and Sweitzer model for undergraduate reputation? What are the variables that appear to have the greatest impact on the prestige of schools of business, education, engineering, law, and medicine? How does faculty productivity relate to prestige across the disciplines? For example, does research activity within schools of engineering relate to prestige to the same degree that research activity relates to prestige for schools of business? These are important questions for presidents, provosts, and deans, given the size and importance of these graduate and professional schools at most universities, and given (for better or worse) the growing importance of the rankings.

Research Methodology

The research design for this study is driven both by the conceptual framework and by the availability of data. The measure of prestige is the average “peer assessment” rating in USNWR—America’s Best Graduate Schools, 2008 edition (USNWR 2007). USNWR ranks graduate programs in the disciplines of business, education, engineering, law, and medicine every year and provides both survey peer assessment ratings and program statistical indicators about each school. Graduate programs in other fields like fine arts, health, library science, public affairs, sciences, social sciences, and humanities are covered every 2–5 years, and their ratings are based solely on survey opinion. Thus, the USNWR data collection limited this study’s focus to the five graduate/professional fields for which the most data are available (which are also the five fields USNWR ranks on an annual basis). The study examines only 1 year of reputation ratings. Due to the slowly evolving nature of higher education, as well as the annual consistency in rankings, it is doubtful that including additional years in the study would add much to the analysis.

Sample

The institutions in this study include most of those appearing in the lists of “The Top Schools” in any of the five graduate and professional school disciplines of business, education, engineering, law, and medicine in the USNWR 2008 edition of America’s Best Graduate Schools (USNWR 2007). We include in the analysis only those schools or colleges with complete data for every variable in that discipline. After excluding those with missing data, the numbers in each disciplinary category are: 49 schools of business, 50 schools of education, 50 schools of engineering, 92 schools of law, and 51 schools of medicine (in the medical “research” category).

Variables and Data Sources

Dependent Variable

The average peer assessment rating for each school among all respondents serves as the measure of prestige in this study, and accordingly, serves as the dependent variable in the model. USNWR (2007) indicates that it sends its reputation survey to deans, program directors, and/or faculty members in each discipline on each campus. For example, deans and program directors at business schools receive the survey and are asked to rate the other business schools across the country. Table 1 lists the specific types of individuals who were surveyed within each discipline, the number of programs surveyed, and survey response rates. Each survey respondent judges the overall “academic quality of programs” in their field on a scale from 1 (“marginal”) to 5 (“outstanding”).

Table 1 USNWR targeted survey population and response rates

Predictor Variables

Most of the predictor variables included in the current study are drawn from prior research on academic prestige, as well as from the resource/reputation model of quality outlined above. Table 2 provides the descriptive statistics for the available variables, separated by academic discipline. Table 3 categorizes the variables according to their section of the conceptual model in Fig. 1.

Table 2 Descriptive statistics for analytical variables
Table 3 Conceptual model and associated variables
Institutional Characteristics

Consistent with the conceptual framework in Fig. 1 and the resource/reputation model, institutional characteristics, especially size and wealth, enable institutions to invest in faculty and student resources that in turn generate faculty and student outcomes. Measures of size include the headcount (full-time and part-time) graduate student enrollment and the estimated number of faculty. In most cases the faculty count is derived from the reported faculty/student ratio, but for schools of business an internet search was conducted in order to obtain the estimated faculty size for each school. Number of degrees awarded is excluded because it is both a measure of size and a measure of program outcomes, and it is difficult to separate one from the other.
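To make the derivation explicit (the notation here is ours, not USNWR’s), the estimated faculty count for a given school simply inverts the reported ratio:

$\hat{F} \approx E / r$, where $E$ is the school’s graduate student enrollment and $r$ is its reported student–faculty ratio.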

While the best measures of wealth in higher education may be revenues, expenditures, and endowment, such measures are not available for specific colleges or schools within an institution. Only university-wide numbers are available or even applicable for such measures. Thus, the study uses tuition price as an indirect indicator of wealth on the assumption that the income produced from tuition is the most significant source of revenue for each graduate school. Moreover, many institutions charge differential tuition prices for graduate students by discipline, especially in fields like medicine and law. Taking these issues, as well as the availability of data, into consideration, the study assumes non-resident tuition to be the best available measure of school-specific wealth among these schools of business, education, engineering, law, and medicine.

Faculty Resources and Student Resources

Although the conceptual model assumes that faculty resources are products of institutional investments in faculty salaries and recruitment, valid faculty salary data at the program or school level is not available. However, USNWR considers the student–faculty ratio to be a relative measure of faculty resources, which is where it appears in this study’s conceptual framework. USNWR provides student–faculty ratios for four of the five disciplines. For the business schools, student–faculty ratios were calculated for each school based on an internet search of school-specific websites.

Student resources include admissions recruitment and selectivity variables. USNWR reports the average standardized test score and the acceptance rate for all five of the disciplines in the study, and average undergraduate GPA is reported for three fields—business, law, and medicine. As noted below, the average admissions test score (GMAT, GRE, LSAT, MCAT) proved to be a robust contributor to the variance in reputation for all five disciplines in this study.

Faculty Outcomes and Student Outcomes

Consistent with other studies, this study examines both research expenditures and faculty publications as indicators of faculty outcomes (de Groot et al. 1991; Goldberger et al. 1995; Graham and Diamond 1997; Grunig 1997; Toutkoushian et al. 2003; Volkwein and Grunig 2005; Volkwein and Sweitzer 2006; Zheng and Stewart 2002). USNWR provides a measure of research expenditures for some of the disciplines in the study. Specifically, total funded research expenditures (averaged over 2005 and 2006) and research expenditures per full-time faculty are two variables that are available from USNWR for three of the five disciplines (education, engineering, medicine).

Other studies have measured faculty activity with total publications and per capita publications. For each of the five fields, publication count information was collected from the Institute for Scientific Information (ISI) Web of Science Citation Indices, drawing upon the Science Citation Index and Social Sciences Citation Index. So as to ensure counting publications that are attributable only to faculty in the particular college or school under consideration (as opposed to the university as a whole), the specific name of the college/school was entered into the search criteria in ISI. In addition, each professional field in ISI contains sub-categories that were used to select journals. Separating journals via the “subject category” function in ISI further ensured counting publications specific to a particular discipline. Moreover, searching on subject category enabled a more accurate count for those colleges/schools with no specific name, or for those researchers who did not indicate their specific college or department affiliation upon publication submission. Table 4 lists the subject categories under which journals were selected for each academic discipline. The publication counts are for the period from January 2001 through December 2005 so as to capture a multi-year measure from the period before USNWR began this particular administration of the reputation surveys in 2006 (for the 2008 edition).

Table 4 Subject categories that were searched in the Web of Science Citation Indices (alphabetical, by discipline)

Total number of publications reflects faculty size as well as faculty productivity. Thus, most investigators now use per capita publications as better indicators of research and scholarly productivity (Graham and Diamond 1997; Toutkoushian et al. 2003). For the per capita productivity measure, total publications for each professional school (obtained from the ISI database) are divided by the number of full-time faculty.
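Written as a formula (our notation), the per capita measure is

$\text{publications per faculty} = P / F$, where $P$ is the school’s ISI publication count for January 2001 through December 2005 and $F$ is its number of full-time faculty.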

There is a dearth of variables representing student outcomes. USNWR reports a small handful of variables that may be considered student outcomes, but only for schools of business and law. Starting salary is measured for the business schools, the pass rate on the bar exam is reported for law schools, and the percentage of graduates employed at graduation is reported for both business and law schools. The small number of student outcome variables in the USNWR methodology reflects the fact that there continues to be a lack of outcome measures across higher education in general. A more detailed discussion of student outcome measures appears below in the “Discussion and Implications” section.

Analysis

Using the conceptual framework (Fig. 1) as a guide, the peer assessment score was employed as the dependent variable and a blocked (set-wise) regression model was estimated for each of the five separate graduate/professional school disciplines. To ensure comparability across the models, the same variables were, for the most part, entered in each of the regressions. The purpose of the study was to see which variables remained significant in the final model for each discipline and to compare the disciplines for similarities and differences in what relates to prestige. In the first block of the regressions, institutional characteristics were entered, such as the size and wealth of the school. In the second block, faculty and student indicators were entered. In the third block, variables reflecting faculty and student outcomes were entered.
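As a minimal sketch of the blocked approach (not the authors’ actual code; column names such as peer_assessment and enrollment are hypothetical placeholders), the cumulative entry of blocks and the resulting fits might be computed as follows:

```python
# Hypothetical sketch of a blocked (set-wise) OLS regression.
# Column names (peer_assessment, enrollment, etc.) are illustrative placeholders.
import pandas as pd
import statsmodels.api as sm

def blocked_regression(df, dv, blocks):
    """Enter predictor blocks cumulatively; return the fitted model after each block."""
    fits = []
    predictors = []
    for block in blocks:
        predictors = predictors + block          # cumulative entry of variables
        X = sm.add_constant(df[predictors])
        fits.append(sm.OLS(df[dv], X).fit())
    return fits

# Example use, mirroring the three blocks of the conceptual model in Fig. 1:
# blocks = [
#     ["enrollment", "tuition"],                     # Model 1: institutional characteristics
#     ["admission_test", "student_faculty_ratio"],   # Model 2: faculty/student resources
#     ["pubs_per_faculty", "starting_salary"],       # Model 3: faculty/student outcomes
# ]
# for i, fit in enumerate(blocked_regression(data, "peer_assessment", blocks), 1):
#     print(f"Model {i}: R-squared = {fit.rsquared:.3f}")
```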

Collinearity was avoided by picking the strongest indicator from each set of variables. For example, standardized admissions tests and admissions acceptance rates are highly correlated for each of the disciplines. Likewise, student enrollment, faculty size, and total publications are highly correlated with each other. The variable space was reduced by running correlations among all of the variables for each of the disciplines in order to select those variables in each group that display the strongest relationships with the peer assessment dependent variable. After preparing the available measures for multivariate analysis, the most robust indicators for the regression analyses were enrollment, tuition, student–faculty ratio, the standardized admission test specific to each field, publications per faculty, and (where available) post-graduation employment and salary information.
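A sketch of this screening step, again with hypothetical column and group names, might look like the following: within each group of collinear candidates, the indicator with the strongest absolute correlation with the peer assessment score is retained.

```python
# Hypothetical sketch of the collinearity-screening step described above.
# Groups and column names are illustrative, not the study's actual variable list.
import pandas as pd

def pick_strongest(df, dv, groups):
    """For each group of related indicators, keep the one most correlated with the DV."""
    chosen = {}
    for name, candidates in groups.items():
        corrs = df[candidates].corrwith(df[dv]).abs()
        chosen[name] = corrs.idxmax()
    return chosen

# Example (illustrative groups of highly correlated indicators):
# groups = {
#     "selectivity": ["admission_test", "acceptance_rate"],
#     "size": ["enrollment", "faculty_count", "total_publications"],
# }
# pick_strongest(data, "peer_assessment", groups)
```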

Results

Results for Individual Disciplines

Tables 5, 6, 7, 8, and 9 display the results of the regression analysis for each of the five graduate and professional school disciplines. The five tables display the standardized beta coefficients for each of the three blocks of variables, and with R-square values ranging from .474 to .878, the final (Model 3) results are robust in each case. Standardized (rather than unstandardized) coefficients are reported because the variables are measured in different units, allowing results to be compared across disciplines.
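For reference, the standardized coefficient reported in the tables can be expressed in standard notation (not drawn from the tables themselves) as

$\beta_j^{*} = b_j \, (s_{x_j} / s_y)$, where $b_j$ is the unstandardized coefficient for predictor $x_j$, $s_{x_j}$ is that predictor’s standard deviation, and $s_y$ is the standard deviation of the peer assessment score.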

Table 5 Blocked (set-wise) regression for peer assessment rating in schools of business (N = 49)
Table 6 Blocked (set-wise) regression for peer assessment rating in schools of education (N = 50)
Table 7 Blocked (set-wise) regression for peer assessment rating in schools of engineering (N = 50)
Table 8 Blocked (set-wise) regression for peer assessment rating in schools of law (N = 92)
Table 9 Blocked (set-wise) regression for peer assessment rating in schools of medicine (N = 51)

Table 5 shows the results for the schools of business. In Model 1 the size and wealth indicators explain 74% of the variance, the student and faculty resource indicators add another 7%, and the outcomes indicators add another 7%. The final model (Model 3) shows that full-time enrollments, average GMAT scores, and the starting salaries of graduates explain almost 88% of the variance in the USNWR reputation scores.

Table 6 shows the results for the schools of education. In Model 1 the size and wealth indicators explain 37% of the variance; the student and faculty resource indicators do not add significantly, but the faculty publications measure adds another 10%. Model 3 shows that the full-time enrollments and publications per faculty explain over 47% of the variance in the USNWR reputation scores for education schools.

Table 7 shows the results for the schools of engineering. In Model 1 the size and wealth indicators explain 45% of the variance, the admissions test score (GRE) adds another 17%, and the per capita publications add another 10%. Model 3 shows that the full-time enrollments and the publications per faculty explain over 72% of the variance in the USNWR reputation scores.

Table 8 shows the results for the schools of law. In Model 1 the size and wealth indicators account for about 40% of the variance, the student and faculty resource indicators add another 40%, and the outcomes indicators add another 5%. Model 3 shows that the full-time enrollments, the student/faculty ratios, the average LSAT scores, and the publications per faculty explain almost 85% of the variance in the USNWR reputation scores for law schools.

Table 9 shows the results for the schools of medicine. In this case the size and wealth indicators are not initially significant, but the student MCAT score combined with full-time enrollment explains 54% of the variance, and faculty resource and publications indicators add another 11%. Model 3 shows that the full-time enrollments, the student/faculty ratios, the average MCAT scores, and the publications per faculty explain over 65% of the variance in the USNWR reputation scores.

Comparison of Results Across Disciplines

The indicator of college/school size (full-time enrollment) is the only variable that remains significant in the final block (Model 3) for all five disciplines, but it has the greatest standardized beta coefficient for only two of the five (education and engineering). Thus, for schools of education (Table 6) and for schools of engineering (Table 7), the size of the full-time graduate enrollment explains over half the variance in reputation scores.

The average admissions test score remains significant in the final block (Model 3) for four of the five disciplines, and it has the greatest standardized beta coefficient for two of those disciplines—law schools (Table 8) and medical schools (Table 9). The results indicate that the tested “quality” of the incoming students is twice as important as any other measure among schools of law and medicine. The discipline for which the admissions test is not significant in explaining reputation scores is education.

The indicator of faculty productivity (per capita publications) also proves significant for four of the five disciplines. Interestingly, in each case (education, engineering, law, and medicine), the faculty productivity variable exhibits the second largest standardized beta coefficient in the final regression model. It is probably not surprising that faculty productivity plays a significant role in explaining the variance in the reputation of graduate schools. Indeed, graduate education and research/scholarship often are presumed to go hand-in-hand.

The student–faculty ratio is not significant for three of the five disciplines. For law schools (Table 8) it is consistently significant, but weak in its influence on the results. For schools of medicine (Table 9), the ratio is not significant when entered in Model 2, but it reappears as significant in the final model, albeit with the smallest standardized beta. Thus, in general, student–faculty ratio does not seem to play much of a role in explaining the reputation of graduate and professional schools.

Although tuition price is a rather crude measure of wealth or revenue, it proved to be an important foundational variable for all except medical schools. Perhaps it should be encouraging that the reputational “quality” of a medical school education has nothing to do with how much the student pays (or borrows) in tuition dollars.

What is also telling about the results is the amount of variance in reputation scores that is explained by each block of the regressions. Enrollment size and tuition are the foundational variables that were entered into the first block (Model 1) of each regression. For schools of business, the size and tuition variables alone explain almost 74% of the variance in reputation. The regression for which the first block of variables is least related to reputation scores is that of medical schools, with Model 1 explaining less than 2% of the variance in reputation (Table 9).

In looking at the final blocks (Model 3) for each of the five disciplines, most of the regressions explain a substantial amount of the variance in reputation scores. The regression models for business schools (Table 5) and law schools (Table 8) are the strongest, explaining almost 88% and 85% of the variance in reputation scores, respectively. The regression for engineering schools is also robust (Table 7), with the final model explaining over 72% of the variance in reputation scores. The regression for medical schools explains 65% of the variance in medical school reputation (Table 9). The regression for schools of education is least impressive, with the final model explaining 47% of the variance in the reputation ratings of education schools (Table 6). Indeed, education is the only discipline for which the regression model explains less than half of the variance in reputation ratings. Apparently, in the field of education, something other than size, wealth, selectivity, and faculty productivity influences the reputation of graduate programs.

Conclusions

There are two broad conclusions from the five regression analyses in this study. First, the correlates of reputation in these five graduate/professional school fields are highly similar to the pattern seen in prestige studies at the undergraduate level (Grunig 1997; Volkwein and Grunig 2005). More specifically, this study’s results are similar to the findings in the Volkwein and Sweitzer (2006) analysis, which serves to confirm the choice of the Volkwein and Sweitzer model as a framework for the current study. Indeed, similar to undergraduate reputation, measures of enrollment size, admissions test scores, and faculty publications all explain a significant amount of the variance in USNWR reputation scores of graduate and professional programs. Enrollment size is significantly and positively associated with reputation in all five disciplines; admissions test scores are positively related to prestige in four of the five disciplines (with education the exception); and faculty publication productivity is significantly associated with reputation scores in four of the five disciplines (with business schools the exception).

Second, these results suggest that the determinants of graduate program prestige are similar across the various disciplines. Size is significant in explaining prestige for all five disciplines. Likewise, admissions selectivity and faculty publications are significant for four of the five fields. However, the study’s results also show that what relates to reputation does vary, at least somewhat, across different graduate and professional fields, and the conceptual framework appears to apply more appropriately to the fields of business, engineering, and law than to the fields of education and medicine. Education differs because admissions test scores are not significant and because its R-square values are lower; medicine differs because the foundational measures of size and wealth are not significant.

Limitations

The study’s findings must be interpreted with the recognition that the study is based on a convenience sample of the top 50 or so graduate/professional schools in business, education, engineering, and medicine (top 100 in law), using primarily the data published in USNWR. Although the USNWR data collection effort includes the leading schools in each of these fields, the study lacks measures for some of the concepts outlined in the conceptual model, and other indicators are rather crude. For example, the indicators of school wealth, faculty resources, and student outcomes are not ideal.

The best wealth measures in higher education are endowment or expenditure data. However, such measures are not specific to a college or school; they are available only at the institutional level. In terms of faculty resources, other studies typically include average faculty salary in the analysis, but again, such a measure is not available at the specific college or school level. As far as student outcomes are concerned, as discussed above, the analysis is limited by the absence of broadly accepted and consistent outcomes measures, not only at a college or school level, but across higher education in general.

As is the case in any statistical analysis of reputation ratings, the endogenous relationship between school reputation and practically any of the explanatory constructs and predictor variables is an important point to consider. Using a systems perspective, the conceptual model in this study assumes that the size and wealth of these graduate and professional schools enable them to invest in faculty and student recruitment and resources. These investments can be expected to produce measurable faculty and student quality, productivity, and outcomes, which over time lead to prestige that is reflected in the USNWR reputational survey results. However, a systems perspective also suggests that the reputation of an institution strengthens or weakens resources, enrollment demand, admissions selectivity, and productive faculty, so the relationship between these factors and prestige is interactive and circular rather than linear. The reputations of strong institutions build on themselves, allowing the better schools to accumulate more and better resources over time, to hire increasingly productive faculty, and to attract students with high test scores, who upon graduation are viewed favorably by employers. This study is cross-sectional, however, and only shows which indicators are the most strongly associated with prestige ratings for a given year. Thus, the results are suggestive rather than definitive.

Discussion and Implications

Just as there are many problems with rankings at the undergraduate level, there are similar concerns with rankings at the graduate level. For example, graduate rankings that include standardized tests in their methodology encourage graduate programs to place a greater emphasis on such tests, perhaps turning their backs on the value of a more diverse student body.

Another issue regarding the USNWR rankings is the emphasis the magazine places on the reputational surveys of deans and directors. The weight given to the USNWR peer assessment survey certainly suggests the significance of the resource/reputation model of quality.

Perhaps one of the underlying sources of controversy over the USNWR rankings is the title of the annual magazine. At the graduate level, the title of the USNWR rankings edition is America’s Best Graduate Schools. If USNWR truly evaluated which of America’s graduate schools are the “best,” the most important measure in the ranking methodology would perhaps be a measure of student learning outcomes. However, measuring student outcomes continues to be a significant challenge in higher education. At the undergraduate level, decades of research have shown how difficult it is to measure the impact that a college education has on a student. Pascarella and Terenzini (1991, 2005) needed hundreds of pages to summarize and synthesize that research. Efforts to measure the impact of graduate education have not been any more successful.

Tests of student learning outcomes and assessment instruments do exist, but they are highly correlated with admissions statistics, which calls their worth and validity into question. At the undergraduate level, the Collegiate Learning Assessment, one of the best-known and well-regarded of these assessment instruments, is correlated with incoming student characteristics at 0.9 (Banta 2008). Thus, incoming student abilities (as measured by the SAT) alone account for approximately 81% of the variance in scores on a test of learning outcomes. Once incoming characteristics and other factors such as socioeconomic status and student motivation are taken into account, just how much of an impact a college education itself has on a student is difficult to measure.
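The 81% figure follows directly from squaring the reported correlation: $R^2 = r^2 = (0.9)^2 = 0.81 \approx 81\%$.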

It is surely the case that any measurement of learning outcomes in graduate education must also be confounded by incoming student characteristics. Indeed, at the graduate level, it would be naïve to assume that scores on admissions tests such as the GRE, GMAT, MCAT, and LSAT are not significantly correlated with any measurement of educational outcomes.

One proposed solution to the lack of broadly constructed outcomes measures is the use of discipline-specific measures. Indeed, field-specific measures may be especially appropriate for graduate education. Study at the graduate level is much more discipline-specific than study at the undergraduate level, where general education and liberal arts courses are a common, if not emphasized, component of the curriculum. Dwyer et al. (2006) propose a system of “workforce readiness” to assess the quality of education at the postsecondary level, and acknowledge that certain professional fields, including medicine and law, already have such measures.

Of course, one outcome measure that is certainly discipline-specific is the starting salary of graduates. Although many might argue that such a measure should not be associated with educational quality, USNWR includes the variable in its methodology for business schools. Certainly, business schools are closely tied to external market forces, but so are other fields, such as engineering. Beyond salary data, job placement information might be more meaningful. If ratings and rankings are going to continue to use such measures, it is hoped that more detailed placement information on graduates will be captured across the disciplines. One could argue that a job placement measure should outweigh all other variables in determining overall rank.

No other academic discipline has more rankings than business, and it is likely that business schools place the most importance on the rankings, perhaps due to the competitive nature of the business world. As an alternative to rankings, the Association to Advance Collegiate Schools of Business (2005) has been making the case that disciplinary accrediting bodies and their member institutions should emphasize the importance of the accreditation process. The AACSB proposes that schools highlight their accreditation results on their websites, rather than their rankings.

Perception Versus Reality

It is difficult, perhaps impossible, to know how much the perceptions of deans and admissions directors reflect real quality in graduate education, as opposed to the large amount of marketing and promotional material that schools produce and distribute to their peers, not coincidentally around the same time the USNWR surveys are mailed. It is likely that schools could better use their limited resources by focusing their efforts on their students and faculty, rather than on those who will be rating them in a magazine. Similarly, there are those who feel that emphasizing graduate school reputation may have negative consequences for undergraduates (AACSB 2005; Policano 2005).

Perhaps the bigger issue is the connection between reputation and the professional and technical skills actually acquired by students. As discussed above, how accurately the USNWR ratings, or any reputation ratings, capture the construct of educational quality in either graduate or undergraduate education remains a significant issue in higher education. Indeed, the resource/reputation model of quality may not be worth much to students who are increasingly putting themselves deeper into debt every year in order to pay for their education.

The case could be made, however, that the difficulty in measuring the true impact a given college or university has on a student is precisely the reason why reputation matters. Winston (1997) makes the case that in a non-profit industry there is asymmetric information between buyer and seller. Moreover, in higher education, the customer does not truly know what he or she is buying, or the result of the purchase, until many years down the road. In such a situation, Winston argues, the reputation of the product is all a customer has. Nevertheless, even if institutions follow the will of many in the higher education community by deemphasizing their ranking, the consumer’s fascination with finding out “Who’s Number One?” may be the deciding force in determining whether or not rankings at any level are here to stay.