1 Introduction

Recent regulatory changes, coupled with the increasing globalization of the US economy, require certified public accountant (CPA) firms to provide clients with greater breadth and depth of services. Consequently, CPA firms have expanded their operations through a series of mergers and acquisitions (M&A) and organic growth. While continuing growth has been a pivotal part of CPA firms’ long-term viability, little is known empirically about whether CPA firms have exhausted economies of scale through these expansions. Therefore, the purpose of this study is to empirically examine the returns to scale pattern of CPA firms in the post Sarbanes–Oxley Act (SOX hereafter) era. Toward this end, we use operations data on 70 non-Big 4 CPA firms to estimate their underlying production function. If these CPA firms’ production function exhibits increasing returns to scale (IRS), then their recent expansions were justified and they could benefit further by continuing to increase their operation scale size. However, if it exhibits decreasing returns to scale (DRS), then they may have exhausted economies of scale and would need to consider decreasing their operation scale size in order to improve their productive efficiency. CPA firms exhibiting constant returns to scale (CRS) are at their optimal productive scale size (Banker 1984), and no change in their operation scale size is required.

Since its enactment in July 2002, SOX has significantly changed the playing field for CPA firms through the requirements imposed by its Sections 201 and 404. For example, Section 201 prohibits CPA firms from providing non-audit services, such as consulting services, to their audit clients, which can reduce their revenues, whereas Section 404 requires CPA firms to perform additional audit work on clients’ internal control system evaluations, which can increase the fees charged for audit services (Chang et al. 2009a). Further, public companies hire CPA firms other than their auditors to provide consulting services and to assist in the establishment, documentation and assessment of internal control systems in order to comply with Section 404. Chang et al. (2009a, 2010) demonstrate that the increased audit workload has forced leading CPA firms, such as the Big 4, to drop their less profitable or high-risk audit clients. These clients have had to turn to other, likely smaller, CPA firms for their audit services. In addition, some public companies switched to smaller CPA firms for more personalized services and lower audit fees. The shift of clients from the Big 4 firms to smaller firms, together with the increased demand for additional audit services, led to another wave of expansions through M&A and internal growth among the non-Big 4 CPA firms in the post SOX era (AICPA 2011).

The requirement that CPA firms providing audit services to public companies register with the Public Company Accounting Oversight Board (PCAOB) and undergo its inspections has also affected the operation of CPA firms. SOX requires that CPA firms with 100 or fewer public clients be subject to triennial inspection, while those with more than 100 public clients be subject to annual inspection. With the 100 public client level as the cutoff point for the different inspection requirements, small CPA firms with fewer than 100 public clients can reduce their inspection costs per client by expanding (e.g. increasing their public clients up to 100). Since a large portion of the inspection cost is fixed as long as the number of clients stays below 100, the more clients a CPA firm has, the lower the inspection cost per client is. Likewise, for CPA firms with more than 100 public clients, expansion also reduces the inspection cost per client. Consistent with this notion, Hood (2005) observes a significant increase in M&A activities among the second-tier CPA firms. In particular, he finds a jump in M&A activities in 2004: about one-third of the second-tier CPA firms were involved in M&A activities during that year and the firms, on average, expanded their size by 13 %. In addition, with increasing globalization, providing audit and accounting services to large multinational clients requires technical expertise and global resources that are not readily accessible to non-Big 4 CPA firms (Chang et al. 2009b). To build their geographic footprint and seize global opportunities, many second-tier CPA firms augment their operations by pursuing M&A of other firms in order to attract new talent, expand practice areas, and better leverage their expertise and resources (Accounting Today Institute 2012).

While expansion enables CPA firms to pursue economies of scale, it can also lead to diseconomies of scale because a larger workforce is more difficult to manage. More specifically, three major factors cause diseconomies of scale when CPA firms over-expand their operations: (1) poor communication: as CPA firms grow their business, communicating between different divisions and departments and along the chain of command becomes more difficult; (2) lack of coordination and control: as CPA firms get too big, it becomes harder for managers to closely monitor their subordinates and ensure that they are all working together efficiently and effectively; (3) lack of motivation: it becomes more difficult for managers to build a good team environment and a sense of belonging as CPA firms expand their operations. This may result in lower employee loyalty and motivation, with damaging consequences for production efficiency.

Since it is not clear a priori whether firms involved in expansions have exhausted scale economies, this is essentially an empirical research question. We address this question by examining how recent expansions among the second-tier CPA firms in the post SOX era affect their returns to scale patterns. Specifically, we adopt data envelopment analysis (DEA) to evaluate whether CPA firms’ production function is characterized by CRS, DRS, or IRS. DEA is a useful analysis tool in this setting because it helps to determine whether a CPA firm has achieved its optimal productive scale size in using its inputs to generate outputs (Chandra et al. 1998; Emrouznejad et al. 2008). The optimal productive scale size is identified based on comparisons of the average productivities of CPA firms within the sample (Banker 1984). As such, it sets a realistic target of production rate improvement for the less efficient firms, based on the production frontier of the most efficient firms in the comparison group.

In a related study, Chang et al. (2009b) provide descriptive information on returns to scale patterns for 2003 and 2004, the first 2 years after the passage of SOX, for the Big 4 and non-Big 4 CPA firms using annual survey data from Accounting Today. They observe that CRS prevails for the Big 4 CPA firms. However, approximately half of their second-tier CPA firms exhibit IRS while the remaining CPA firms reveal either CRS or DRS. Thus, they suggest that about half of the second-tier CPA firms can improve their production efficiency by increasing their existing operation scale size. The current study extends Chang et al. (2009b) by examining the returns to scale pattern of non-Big 4 firms surveyed by Accounting Today for the period 2005–2010. The pattern of returns to scale can differ in our sample period from that documented in Chang et al. (2009b) for the following three main reasons. First, the deadline for compliance with Section 404 for accelerated filers (i.e. firms with a public float of at least $75 million) was November 15, 2004. As the workload and cost of compliance with Section 404 are likely to be highest in the initial phase and to decline gradually in subsequent years, the returns to scale pattern can vary with this change in workload. Therefore, it is interesting to examine how the returns to scale pattern of CPA firms evolves after the first 2 years following the passage of SOX. Second, expansions through M&A and organic growth helped the second-tier CPA firms significantly increase their operation scale size (AICPA 2011; Hood 2005). The impact of this change on operation scale size is more likely to show up in the years following the expansion than in the expansion year itself. Our 6-year sample period covering 2005–2010 allows us to capture the potential lagged effect of the expansion activities. Third, our sample period includes the periods both before and after the financial crisis that started in 2008.
This enables us to examine whether the financial crisis had a significant effect on CPA firms’ returns to scale patterns. While Chang et al. (2009b) observe mixed results based only on descriptive information, we provide empirical evidence from DEA-based statistical tests of returns to scale for the non-Big 4 CPA firms. Our results indicate that DRS prevails in these CPA firms in the post SOX era, suggesting that these firms had exhausted economies of scale with their recent expansions. As a result, they may consider halting their expansion plans and contracting their operation scale size in order to return to their optimal productive scale size.

The rest of the paper is organized as follows. In the “Data envelopment analysis and its statistical tests for returns to scale” section, we discuss the DEA methodology and its statistical tests for returns to scale. In the “Sample and descriptive statistics” section, we describe our sample selection process, including the characteristics of our sample CPA firms, and discuss our empirical results. In the “Conclusions” section, we present a summary of our findings and conclude the paper.

2 Data envelopment analysis and its statistical tests for returns to scale

Data envelopment analysis provides a nonparametric approach to estimating the production correspondence between inputs and outputs and to evaluating the production inefficiency of decision-making units (DMUs) from observed data. DEA has been extensively applied in various service industries, including auditing, health care, and education. For instance, Feroz et al. (2005) apply DEA in their analytical procedure to determine audit scope and to assess the preliminary risk level of a client. Ozcan et al. (2010) use DEA to evaluate the performance of Brazilian university hospitals. Banker et al. (2010) employ DEA to compute the relative productivity of retail outlets of a high-end retailer. Grigoroudis et al. (2013) utilize DEA to evaluate the relative efficiency of a set of bank branches. Duh et al. (2014) measure the operating efficiency of universities using DEA. To formalize the approach, we define Y and X as the observed output and input vectors for DMUs (e.g. CPA firms). Then we can specify the production possibility set S as:

$$\begin{aligned} S = \{(Y, X) \mid Y \ge 0 \text{ can be produced from } X \ge 0\}. \end{aligned}$$
(1)

The inefficiency is measured radially by the reciprocal of Shephard’s (1970) distance function. Therefore, the inefficiency of a DMU \((Y_0, X_0) \in S\) is given by:

$$\begin{aligned} \theta (Y_0, X_0) = \sup \{\theta \mid (\theta Y_0, X_0) \in S\} \end{aligned}$$
(2)

Banker (1993) makes three main assumptions: (i) the production possibility set S is convex [i.e. if \((Y_1, X_1) \in S\) and \((Y_2, X_2) \in S\), then \((\lambda_1 Y_1 + \lambda_2 Y_2, \lambda_1 X_1 + \lambda_2 X_2) \in S\) for all \(\lambda_1, \lambda_2 \ge 0\) such that \(\lambda_1 + \lambda_2 = 1\)], (ii) the production possibility set S exhibits monotonicity [i.e. if \((Y_1, X_1) \in S\), \(Y_2 \le Y_1\) and \(X_2 \ge X_1\), then \((Y_2, X_2) \in S\)], and (iii) the probability density function \(f(\theta)\) is such that \(f(\theta) = 0\) if \(\theta < 1\) and \(\int_1^{1+\delta} f(\theta)\,\mathrm{d}\theta > 0\) for all \(\delta > 0\). Following Banker (1993), we can obtain a consistent estimator \(\hat{\theta}\) of the inefficiency by solving the following DEA model of Banker et al. (1984) (BCC hereafter):

$$\begin{aligned} \hat{\theta}(Y_0, X_0) \equiv \max \theta \end{aligned}$$
(3.0)

subject to

$$\begin{aligned} \sum_{j=1}^{N} \lambda_j Y_j \ge \theta Y_0 \end{aligned}$$
(3.1)
$$\begin{aligned} \sum_{j=1}^{N} \lambda_j X_j - X_0 \le 0 \end{aligned}$$
(3.2)
$$\begin{aligned} \sum_{j=1}^{N} \lambda_j = 1 \end{aligned}$$
(3.3)
$$\begin{aligned} \theta \text{ and } \lambda_j \ge 0 \end{aligned}$$
(3.4)

Banker (1993) demonstrates that the DEA estimator \(\hat{\theta}\) from the BCC model in (3.0)–(3.4) is statistically consistent and that its asymptotic empirical distribution retrieves the true distribution of \(\theta\) under the three assumptions described above. Since these assumptions do not impose the constraint of CRS on the production possibility set, they are consistent with variable returns to scale (VRS). We refer to such an inefficiency measure as \(\hat{\theta}^{B}\). However, if we add the assumption that the production possibility set exhibits CRS [i.e. if \((Y, X) \in S\), then \((kY, kX) \in S\) for any \(k > 0\)], we can obtain the Charnes et al. (1978) (CCR hereafter) inefficiency estimate \(\hat{\theta}\) for a DMU by solving the same linear program as before, except that the objective function in (3.0) is maximized subject only to constraints (3.1), (3.2) and (3.4), while constraint (3.3) is eliminated.

By analogy, the CCR estimator is also statistically consistent under the four assumptions described above. We refer to the CCR inefficiency measure as \(\hat{\theta}^{C}\). The CCR model estimates the aggregate technical and scale inefficiency of a DMU, whereas the BCC model estimates its pure technical inefficiency at the given scale of operation. We can therefore calculate the scale inefficiency measure \(\hat{\theta}^{S}\) as the aggregate inefficiency \(\hat{\theta}^{C}\) obtained from the CCR model divided by the technical inefficiency \(\hat{\theta}^{B}\) obtained from the BCC model. That is, \(\hat{\theta}^{S} = \hat{\theta}^{C} / \hat{\theta}^{B}\).
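To make the three measures concrete, the following minimal Python sketch computes the BCC, CCR and scale inefficiency estimates for a toy data set with a single input and a single output. It is an illustration under simplifying assumptions, not the estimation used in this study, which involves three inputs and three outputs and requires a linear programming solver. With one input, an optimal solution of (3.0)–(3.4) places weight on at most two DMUs, so simple enumeration suffices. The four DMUs and the function names `theta_bcc` and `theta_ccr` are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: one input x and one output y for four DMUs.
X = [2.0, 4.0, 8.0, 6.0]
Y = [1.0, 4.0, 6.0, 3.0]


def theta_bcc(xs, ys, x0, y0):
    """Output-oriented BCC (VRS) inefficiency for one input and one output."""
    # A single DMU that uses no more input than x0 (weight 1 on that DMU).
    best = max(y for x, y in zip(xs, ys) if x <= x0)
    # A convex combination of two DMUs whose inputs bracket x0.
    for a, b in combinations(zip(xs, ys), 2):
        (xl, yl), (xh, yh) = sorted([a, b])
        if xl < x0 < xh:
            w = (xh - x0) / (xh - xl)  # weight on the smaller DMU
            best = max(best, w * yl + (1 - w) * yh)
    return best / y0


def theta_ccr(xs, ys, x0, y0):
    """Output-oriented CCR (CRS) inefficiency: best output/input ratio scaled to x0."""
    return max(y / x for x, y in zip(xs, ys)) * x0 / y0


theta_b = [theta_bcc(X, Y, x, y) for x, y in zip(X, Y)]
theta_c = [theta_ccr(X, Y, x, y) for x, y in zip(X, Y)]
theta_s = [c / b for c, b in zip(theta_c, theta_b)]  # scale inefficiency

for j, (b, c, s) in enumerate(zip(theta_b, theta_c, theta_s), start=1):
    print(f"DMU {j}: BCC={b:.3f}  CCR={c:.3f}  scale={s:.3f}")
```

In this toy example the second DMU has the highest output per unit of input, so it is the only one with a scale inefficiency of exactly 1.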

As described earlier, Banker (1993) proves that DEA estimators are statistically consistent. Therefore, under the null hypothesis of CRS, the asymptotic empirical distributions of the DEA estimators \(\hat{\theta}^{B}\) and \(\hat{\theta}^{C}\) are identical, each retrieving the true distribution of \(\theta\) (Banker 1993). This asymptotic correspondence between the empirical distributions of \(\hat{\theta}^{B}\) and \(\hat{\theta}^{C}\) under the null hypothesis of CRS allows us to construct the following two DEA-based statistical tests for our study:

(i) If the inefficiency \(\theta\) is assumed to follow an exponential distribution over \([1, \infty)\) with mean \(1 + \sigma\), then \(2\sum_{j=1}^{N} (\hat{\theta}_j - 1)/\sigma\) follows the chi-square distribution with 2N degrees of freedom. Thus, we can construct the following test statistic:

$$\begin{aligned} T_1 \equiv \sum_{j=1}^{N} (\hat{\theta}_j^{C} - 1) \Bigg/ \sum_{j=1}^{N} (\hat{\theta}_j^{B} - 1) \end{aligned}$$
(4)

which can be evaluated using the F-distribution with (2N, 2N) degrees of freedom.

(ii) If the inefficiency \(\theta\) is assumed to follow a half-normal distribution over \([1, \infty)\), i.e. \(\theta - 1\) is distributed as \(|N(0, \sigma^{2})|\), then \(\sum_{j=1}^{N} (\hat{\theta}_j - 1)^{2}/\sigma^{2}\) follows the chi-square distribution with N degrees of freedom. Thus, we can construct the following test statistic:

$$\begin{aligned} T_2 \equiv \sum_{j=1}^{N} (\hat{\theta}_j^{C} - 1)^{2} \Bigg/ \sum_{j=1}^{N} (\hat{\theta}_j^{B} - 1)^{2} \end{aligned}$$
(5)

which can be evaluated using the F-distribution with (N, N) degrees of freedom.
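The two statistics in (4) and (5) are straightforward to compute once the CCR and BCC estimates are available. The sketch below is a hedged illustration using only the Python standard library: the inefficiency values are hypothetical placeholders rather than estimates from our sample, and the upper-tail probability of the F-distribution is approximated by simulation, using the fact that a chi-square variable with k degrees of freedom is a gamma variable with shape k/2 and scale 2. In practice a statistical library's F-distribution routine would normally be used instead of the hypothetical helper `f_upper_tail`.

```python
import random

# Hypothetical CCR and BCC inefficiency estimates for N = 4 DMUs.
theta_c = [2.0, 1.0, 4.0 / 3.0, 2.0]
theta_b = [1.0, 1.0, 1.0, 5.0 / 3.0]
N = len(theta_c)


def t_exponential(tc, tb):
    """Statistic (4): ratio of summed deviations from 1."""
    return sum(t - 1 for t in tc) / sum(t - 1 for t in tb)


def t_half_normal(tc, tb):
    """Statistic (5): ratio of summed squared deviations from 1."""
    return sum((t - 1) ** 2 for t in tc) / sum((t - 1) ** 2 for t in tb)


def f_upper_tail(stat, dfn, dfd, draws=20000, seed=0):
    """Monte Carlo approximation of P(F(dfn, dfd) >= stat)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        # chi-square(k) / k, drawn as Gamma(shape=k/2, scale=2) / k
        num = rng.gammavariate(dfn / 2, 2) / dfn
        den = rng.gammavariate(dfd / 2, 2) / dfd
        hits += num / den >= stat
    return hits / draws


T1 = t_exponential(theta_c, theta_b)
T2 = t_half_normal(theta_c, theta_b)
p1 = f_upper_tail(T1, 2 * N, 2 * N)  # exponential case: (2N, 2N) d.f.
p2 = f_upper_tail(T2, N, N)          # half-normal case: (N, N) d.f.
print(f"T1={T1:.3f} (p~{p1:.3f}), T2={T2:.3f} (p~{p2:.3f})")
```

Both statistics are undefined when every BCC estimate equals 1, because the denominator is then zero; the sketch does not guard against that case.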

Since Banker (1993) shows that the DEA estimator is biased for finite samples, the above DEA-based test statistics need not follow the F distribution. Notwithstanding this limitation, similar DEA-based statistical tests of inefficiency differences between two groups have been found to outperform conventional parametric and non-parametric tests for finite samples in extensive Monte Carlo simulation studies (Banker and Chang 1995).

Note that rejection of the null hypothesis of CRS simply suggests that non-constant returns to scale prevail. To further investigate whether DRS holds for CPA firms, we test the null hypothesis of non-decreasing returns to scale (NDRS) against the alternative of DRS. To this end, we impose the non-increasing returns to scale (NIRS) constraint on the production possibility set and obtain the inefficiency estimator by solving the BCC model in (3) with “\(\le \)” instead of “=” in constraint (3.3). Referring to this inefficiency measure as \(\hat{\theta}^{D}\), we construct the test statistics for NDRS in the same way as those for CRS, except that \(\hat{\theta}^{D}\) is substituted for \(\hat{\theta}^{B}\) in the denominator. Since the CRS production frontier envelops the data less closely than the NIRS frontier, its inefficiency estimate \(\hat{\theta}^{C}\) is greater than or equal to \(\hat{\theta}^{D}\). The two frontiers coincide wherever returns to scale are non-decreasing, so for any DMU, equality suggests the prevalence of NDRS and inequality implies that DRS prevails. Alternatively, we can impose the NDRS constraint on the production possibility set and obtain the inefficiency estimator by solving the BCC model in (3) with “\(\ge \)” instead of “=” in constraint (3.3). We refer to this inefficiency measure as \(\hat{\theta}^{E}\) and construct the test statistics in the same way as those for CRS, except that \(\hat{\theta}^{E}\) is substituted for \(\hat{\theta}^{C}\) in the numerator under the null hypothesis of NDRS. Since the NDRS production frontier envelops the observed data less closely than the VRS production frontier, the resulting inefficiency measure \(\hat{\theta}^{E}\) is greater than or equal to \(\hat{\theta}^{B}\) calculated from the VRS production possibility set. Thus, for any DMU, equality implies the prevalence of NDRS and inequality suggests the more restrictive property of DRS.

Conversely, we can evaluate whether IRS holds for CPA firms by testing the null hypothesis of NIRS against the alternative of IRS. The test statistics are constructed in the same way as those for CRS, except that \(\hat{\theta}^{E}\) is substituted for \(\hat{\theta}^{B}\) in the denominator. Since the CRS production frontier envelops the data less closely than the NDRS frontier, its inefficiency estimate \(\hat{\theta}^{C}\) is greater than or equal to \(\hat{\theta}^{E}\). Thus, for any DMU, equality implies the prevalence of NIRS and inequality indicates that IRS prevails. Alternatively, we can construct the test statistics for NIRS in the same way as those for CRS, except that \(\hat{\theta}^{D}\) is substituted for \(\hat{\theta}^{C}\) in the numerator. Since the NIRS production frontier envelops the observed data less closely than the VRS production frontier, the resulting inefficiency measure \(\hat{\theta}^{D}\) is greater than or equal to \(\hat{\theta}^{B}\) calculated from the VRS production possibility set. Thus, for any DMU, equality implies the prevalence of NIRS and inequality suggests the more restrictive property of IRS.
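The DMU-level comparisons above can be illustrated with a small Python sketch. It is a simplified illustration for one input and one output on hypothetical data, not the three-input, three-output estimation of this study. The function `theta` and its `weights` argument are assumptions introduced here: `"eq"` keeps constraint (3.3) as an equality, `"le"` and `"ge"` replace “=” with “\(\le \)” and “\(\ge \)” (giving \(\hat{\theta}^{D}\) and \(\hat{\theta}^{E}\)), and `"free"` drops the constraint (giving \(\hat{\theta}^{C}\)).

```python
from itertools import combinations

# Hypothetical toy data: one input x and one output y for four DMUs.
X = [2.0, 4.0, 8.0, 6.0]
Y = [1.0, 4.0, 6.0, 3.0]


def theta(xs, ys, x0, y0, weights):
    """Output-oriented inefficiency with one input and one output.

    weights: "eq" (sum = 1), "le" (sum <= 1), "ge" (sum >= 1), "free".
    """
    pts = list(zip(xs, ys))
    cands = [y for x, y in pts if x <= x0]       # one DMU with weight 1
    for a, b in combinations(pts, 2):            # two DMUs, weights sum to 1
        (xl, yl), (xh, yh) = sorted([a, b])
        if xl < x0 < xh:
            w = (xh - x0) / (xh - xl)
            cands.append(w * yl + (1 - w) * yh)
    if weights in ("le", "free"):                # a larger DMU scaled down
        cands += [y * x0 / x for x, y in pts if x >= x0]
    if weights in ("ge", "free"):                # a smaller DMU scaled up
        cands += [y * x0 / x for x, y in pts if x <= x0]
    return max(cands) / y0


def classify(tc, td, te, tol=1e-9):
    """Compare the CCR estimate with the "<=" and ">=" estimates."""
    if tc - td > tol:   # CCR estimate exceeds the "<=" estimate
        return "DRS"
    if tc - te > tol:   # CCR estimate exceeds the ">=" estimate
        return "IRS"
    return "CRS"


labels = []
for x, y in zip(X, Y):
    tc = theta(X, Y, x, y, "free")
    td = theta(X, Y, x, y, "le")
    te = theta(X, Y, x, y, "ge")
    labels.append(classify(tc, td, te))
print(labels)
```

The smallest DMU is classified as operating under IRS, the DMU with the highest output per unit of input under CRS, and the two largest under DRS.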

3 Sample and descriptive statistics

Each year Accounting Today surveys the 100 accounting firms with the highest revenue in the US. Information on these firms’ revenues and number of employees is published in an issue of the following year. All data reported in the annual surveys are for domestic US operations and exclude foreign holdings. Our initial sample includes only firms that appear in the top 100 list in each year of the 6-year period 2005–2010 (Accounting Today issues 2006–2011). We then exclude non-CPA firms, such as Liberty Tax Services and Schenck Business Solutions, from our sample. We also exclude the Big 4 CPA firms from our sample because all but one, Deloitte Touche Tohmatsu, do not report any MAS revenues from 2006 onward. Ultimately, we use a total of 70 non-Big 4 top CPA firms in our analyses.

Following Banker et al. (2003) and Chang et al. (2009b), we focus on the relationship between service revenues generated and human resources employed by CPA firms. We partition the service revenues, by type of output, into Accounting & Auditing Services (A&A), Tax Services (TAX) and Management Advisory Services (MAS). A&A includes compilations, special reports, and reviews in addition to engagements involving the attest function. TAX consists of tax research, planning and preparation work. MAS comprises such services as consulting, systems development, integrating and reselling computer equipment and software, and any other management assistance (Banker et al. 2003). The three human resource input variables considered are the number of partners (PARTNERS), the number of professionals (PROFESSIONALS), and the number of non-professional employees (OTHERS). The designation PARTNERS includes all owners and shareholders of the accounting firms. PROFESSIONALS refers to staff trained to perform accounting and auditing services, and includes such personnel as staff accountants, senior accountants, and managers. Those who have passed the CPA examinations but have not obtained CPA licenses are also included in this category. OTHERS encompasses the clerical and support personnel involved in such activities as administration, printing of reports, record keeping and the like (Banker et al. 2003).

We report in Table 1 descriptive statistics on the outputs and inputs of our sample CPA firms. There is generally an increase in both total revenues and number of employees over the 6-year period. For example, mean total revenues increase by approximately 48 % from 98.8 million dollars in 2005 to 146.5 million dollars in 2010, while mean total employees (the sum of PARTNERS, PROFESSIONALS and OTHERS) increase by about 39 % from 642 in 2005 to 891 in 2010. The increase in PARTNERS is particularly pronounced: it rises by over 36 % from 65 in 2005 to 89 in 2010. The number of PROFESSIONALS increases by over 23 % from 431 in 2005 to 533 in 2010, and that of OTHERS increases by nearly 16 % from 145 in 2005 to 168 in 2010. The relatively larger increase in PARTNERS is consistent with the argument that CPA firms increase their size by adding partners, mainly through mergers and acquisitions, to accommodate the increasing demand for audit services in the post SOX era. The high standard deviations of the variables suggest that these non-Big 4 CPA firms vary significantly in their operation scale size and composition. Likewise, median values for all variables are much smaller than the means, indicating that large disparities exist between the smallest and largest CPA firms in the sample even without the presence of the Big 4 CPA firms.
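The growth rates quoted above follow from the 2005 and 2010 means by simple percentage-change arithmetic. The short Python sketch below reproduces that calculation from the figures stated in the text; the dictionary keys and the helper `pct_change` are illustrative names only.

```python
def pct_change(start, end):
    """Percentage change from start to end."""
    return 100.0 * (end - start) / start


# (2005 mean, 2010 mean) as quoted in the text.
reported = {
    "total revenues ($ million)": (98.8, 146.5),
    "total employees": (642, 891),
    "PARTNERS": (65, 89),
    "PROFESSIONALS": (431, 533),
    "OTHERS": (145, 168),
}

changes = {name: round(pct_change(a, b), 1) for name, (a, b) in reported.items()}
for name, change in changes.items():
    print(f"{name}: {change} %")
```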

Table 1 Descriptive statistics on service revenues and human resources \((N = 70)\)

Based on the data on three outputs (A&A revenues, TAX revenues and MAS revenues) and three inputs (PARTNERS, PROFESSIONALS and OTHERS) of CPA firms, we estimate their relative inefficiencies using DEA for each of the six post SOX years separately as well as for all 6 years pooled together. We test three hypotheses regarding returns to scale: (i) the null hypothesis of CRS against the alternative hypothesis of VRS, (ii) the null hypothesis of NDRS against the alternative hypothesis of DRS, and (iii) the null hypothesis of NIRS against the alternative hypothesis of IRS. We perform the tests under two different inefficiency distribution assumptions: the exponential distribution and the half-normal distribution.

Table 2 reports our statistical test results when the inefficiency \(\theta \) is assumed to be exponentially distributed. For our sample period of 2005–2010 and for \(\alpha = 10\,\%\), the first row of Table 2 indicates that the null hypothesis of CRS is rejected in favor of the alternative hypothesis of VRS in all 6 years examined. In addition, when we pool all the data points together over the 6 years, we find that the null hypothesis of CRS is rejected at the 1 % significance level, as indicated in the last column of Table 2. Taken together, our empirical results seem to indicate that the financial crisis that started in 2008 had little impact on CPA firms’ returns to scale patterns.

Observing that the production function of CPA firms does not exhibit CRS, we next evaluate whether VRS stems from DRS or IRS during the post SOX era. The statistical test results under the null hypothesis of NDRS are reported in the second and third rows of Table 2. As can be observed, the null hypothesis of NDRS is rejected at the 10 % significance level for all 6 years except 2006, which has p values that are close to the 10 % level, suggesting that CPA firms’ production function exhibits DRS. As expected from this finding, the null hypothesis of NIRS is not rejected at any conventional significance level, as indicated in the fourth and fifth rows of Table 2. Similarly, estimating inefficiency over the entire sample period leads to the same conclusion that CPA firms exhibit DRS and that the financial crisis did not have any impact on returns to scale patterns.
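The sequence of tests described above amounts to a simple decision rule, sketched below in Python. The function name `classify_returns_to_scale` is hypothetical, and the p values in the usage lines are made-up placeholders chosen only to exercise each branch; they are not the values reported in Table 2.

```python
def classify_returns_to_scale(p_crs, p_ndrs, p_nirs, alpha=0.10):
    """Combine the three null-hypothesis tests into one conclusion.

    p_crs:  p value for H0 of CRS  (alternative: VRS)
    p_ndrs: p value for H0 of NDRS (alternative: DRS)
    p_nirs: p value for H0 of NIRS (alternative: IRS)
    """
    if p_crs >= alpha:
        return "CRS not rejected"
    reject_ndrs = p_ndrs < alpha
    reject_nirs = p_nirs < alpha
    if reject_ndrs and not reject_nirs:
        return "DRS"
    if reject_nirs and not reject_ndrs:
        return "IRS"
    return "inconclusive"


# Placeholder p values, for illustration only.
print(classify_returns_to_scale(0.01, 0.03, 0.80))
print(classify_returns_to_scale(0.01, 0.70, 0.02))
print(classify_returns_to_scale(0.40, 0.50, 0.50))
```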

We present the statistical test results in Table 3 under the assumption that the inefficiency \(\theta \) is half-normally distributed. Consistent with, though statistically somewhat weaker than, the results reported in Table 2, we find that the null hypothesis of CRS is rejected in favor of the alternative hypothesis of VRS in each of the 6 years and for all 6 years together. Similar to the results reported in Table 2, the null hypothesis of NDRS is rejected for all 6 years except 2006. Further, when we pool all the data points together over the 6-year period to estimate the production frontier of CPA firms, the null hypothesis of NDRS is rejected at the 1 % significance level. The null hypothesis of NIRS cannot be rejected at any conventional significance level. Collectively, our findings suggest that, to a large extent, the non-Big 4 CPA firms over-expanded their human resources through M&A and organic growth in the post SOX era and may consider halting expansion and reducing their operation scale size in order to improve their production efficiency.

Table 3 Statistical test results of returns to scale when the inefficiency is assumed to be half-normally distributed (p values in parentheses)

4 Conclusions

The public accounting industry has experienced remarkable changes in the past two decades. The surging demand for management advisory services in the 1980s and 1990s led CPA firms to go through several waves of expansion through M&A and organic growth in pursuit of scale economies. With the new sets of regulations introduced by the Sarbanes–Oxley Act, CPA firms have been required to deal with intensified audit services but restricted non-audit services such as tax and management advisory services. These restrictions on the public accounting industry alter CPA firms’ output (revenue sources) and input (allocation of professionals) structures. In addition, continuing globalization of the US economy requires CPA firms to build their competencies by differentiating services as well as acquiring experienced talent. To sustain growth and enhance their competitive edge, many non-Big 4 CPA firms were actively involved in M&A and internal growth activities after SOX. This raises an intriguing question of whether the expansion has enabled these firms to exploit scale economies or led them to exceed their optimal productive scale size.

Using data from Accounting Today’s annual surveys of the top 100 accounting firms, we examine the returns to scale pattern of the non-Big 4 firms in the post SOX era. Specifically, we use a non-parametric DEA methodology to estimate production inefficiency and construct DEA-based statistical tests of returns to scale for these CPA firms. Our empirical test results indicate that the production function of the non-Big 4 CPA firms in our sample exhibits a DRS pattern during 2005–2010. This result suggests that our sample non-Big 4 CPA firms had overgrown and exhausted economies of scale through M&A and organic growth. Therefore, in order to improve their production efficiency, these CPA firms may consider halting their expansion plans and downsizing or divesting. However, a caveat is in order. Our sample CPA firms may be focusing on long-term growth or improved service at the expense of short-term production inefficiencies. That is, DRS may simply appear as a short-term phenomenon in the early years of expansion. Thus, our findings about the returns to scale pattern need to be interpreted with caution.

Given that the findings of this study indicate scale diseconomies, it would be interesting for future research to estimate the most productive scale size (MPSS) for each of these firms. Additionally, many CPA firms continue to engage in M&A activities after 2008 (AICPA 2011) in spite of the prevailing DRS pattern observed in this study. Thus, it would also be interesting for future research to look into this issue by investigating other factors, unrelated to returns to scale, affecting these firms’ M&A decisions.