Introduction

Faculty development programs offer professional development activities intended to support faculty in performing their academic roles (Centra 1978; Steinert 2014). Although such programs are now commonplace in the health professions, questions remain about their long-term impact and about the (in)adequacy of existing research approaches for addressing impact questions (McLeod and Steinert 2010; O’Sullivan and Irby 2011; Singh 2013). In this paper, we contend that theoretically grounded evaluation approaches are critical to furthering understanding of the workings and outcomes of faculty development in the health professions. Accordingly, we provide a rationale for, and a detailed explanation of, key theory-based evaluation strategies and their utility in evaluative research. Our research draws on these strategies to evaluate an intensive longitudinal faculty development program.

The value of theory-based evaluation

Whereas demonstrating faculty development program outcomes is a worthwhile and increasingly mandated objective, it should be corroborated by an understanding of the mechanisms by which programs realize outcomes (Haji et al. 2013; Parker et al. 2011). Though there have been a few recent efforts at exploring mechanisms of change in health professional education programming (Parker et al. 2011; Sorinola et al. 2015), research remains scant, and the scope of outcomes examined tends to be limited to individual learning. Too often, evaluators tend toward black box (or input–output design) evaluations that seek to lay claim to outcomes and/or impact without offering accompanying logical or theoretically grounded linkages as to how these outcomes may come about (Stame 2004). Without considering what unfolds between inputs and outputs, evaluation research can provide a narrow and distorted understanding of the workings of programs (Chen and Rossi 1983).

Faculty development programs are characterized by complexity as they unfold in clinical and educational systems where many factors, including institutional structures, cultural norms, and individual differences influence outcomes. In complex environments, outcomes are reliant upon multiple variables with interacting or moderating effects, and are influenced by complex chains of human reasoning and action, and interplays between individuals and contexts (Patton 2011). In order to understand how faculty development programs generate or contribute to various outcomes, one needs to adopt evaluative strategies that develop integrated understanding of (1) how participants interact with program activities and (2) how those interactions influence practice contexts. Consequently, we adopt theory-based evaluation methodology to elucidate how faculty development programming can work to generate or support a range of outcomes among individuals and sub-systems in the academic health sciences.

Conceptual frameworks

Theory-based evaluation describes a family of evaluation approaches generally focused on exploring the conceptual underpinnings of how social programs work to produce change (Blamey and Mackenzie 2007; Stame 2004). It includes both realist and theory-driven evaluation approaches, which we integrate in this study. Realist evaluation views programs as social systems in which there is a constant interplay between individual agency and social structures (Pawson and Tilley 1997). Change is viewed as a product of structure and agency, and the task is to determine what works, for whom, in what circumstances, and why (Pawson and Tilley 1997). Proponents contend that programs, in themselves, do not create change but, rather, introduce resources and opportunities for change; it is people who, when exposed to programs, activate certain mechanisms and create change, depending on their capacity and context (McEvoy and Richards 2003; Pawson and Tilley 1997). Thus, the central proposition is that given a generative mechanism and a conducive context, one can expect to observe specific outcomes:

$$\text{Context (C)} + \text{Mechanism (M)} = \text{Outcome (O)}$$

Diverse CMO permutations can exist in any one program (Astbury 2013) and mechanisms may be efficacious in some contexts, but not in others. Consequently, over time and across individuals, a program can produce mixed outcomes. Rather than merely identifying discrete outcomes, realist evaluators seek to understand how and why various outcome patterns are generated (Astbury 2013).

The theory-driven evaluation approach also demands examination of the transformational processes that occur between program inputs and observed outcomes (Chen and Rossi 1983, 1989, 1992). In this approach, evaluators are urged to draw upon extant social science theories from various disciplines (e.g., sociology, psychology, microeconomics) that can adequately anticipate and explain the effects that programs are expected to have. Program activities are presumed to have effects that unfold along a temporal gradient, from short-term (proximal) outcomes to longer-term (distal) outcomes (West and Aiken 1997). Ultimately, the evaluator’s role is to conceptualize a coherent account of how a given program spurs various mechanisms to generate various short- and long-term outcomes (Rogers et al. 2000), or to construct a plausible and defensible model of how a given program works (Chen and Rossi 1983). This account/model should be grounded in extant social science research on the issues the program seeks to address (Chen and Rossi 1992). Where empirically grounded social science theory supports conceptualized links, there is evidence for the efficacy of the program.

Despite the merits of realist evaluation, challenges in its application and noteworthy criticisms exist. First, implementation can be daunting; several studies that wield the realist banner fail to adhere to its principles (Astbury 2013; Monaghan 2012). Second, the concept of mechanisms is often misunderstood; investigators often conflate mechanisms with program activities (Astbury and Leeuw 2010). Third, identifying specific CMO configurations can be a methodological headache due to difficulties inherent in disentangling relationships, and dealing with conflicting theories (Astbury 2013; Davis 2005; Greenhalgh et al. 2009). Finally, original specifications for reporting findings include developing matrices that tabulate tight hypotheses in the form of C1 + M1 = O1, C2 + M2 = O2 etc. (Pawson and Tilley 1997; Pawson 2013). However, rigid adherence to these CMO formulas may undermine the very goals of realist evaluation by (1) inadvertently implying that CMO relationships are simple and linear and (2) limiting recognition of complex multi-mechanism interactions and concatenation (Astbury 2013). Rejection of the CMO matrices in favour of alternative means of illustrating CMO interactions has been proposed (Astbury 2013).

Heeding these warnings, we propose a blended approach that integrates realist and theory-driven evaluation. Our goal is to construct a cohesive account of—and model how—intensive longitudinal faculty development programming can work to produce change. To achieve this, we embrace the central realist evaluation premise and seek to specify CMO features and interrelationships. At the same time, we contend that the utilization of relevant social science theory—as proposed by theory-driven evaluation—can provide the needed focus and direction in clarifying CMO relationships. Consequently, we draw on relevant psychological and education theory to inform our analysis and interpretation of findings. As we conceptualize our model, we explore the temporal gradient posited by theory-driven evaluation—seeking to identify short- and long-term outcomes of faculty development. Simultaneously, we explore the possibility that change mechanisms may be triggered at different points along this gradient, with longer-term (secondary) change mechanisms being generated, in part, as a function of earlier CMO interactions and concatenation.

Finally, we strive to develop more fluid methods of articulating CMO relationships. We eschew realist evaluation’s prescriptions for developing tight CMO hypotheses (e.g., C1 + M1 = O1), and instead aim to develop a more flexible description and visual illustration of contextual features, mechanisms and outcomes. This approach should provide the structure needed for examining identified CMO relationships, while accommodating further theoretical exploration.
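The notion of concatenation described above can be illustrated schematically (the notation here is our own illustrative shorthand, not a formula prescribed by Pawson and Tilley): an outcome generated by an early context–mechanism interaction can itself become part of the conditions under which a later, secondary mechanism fires:

$$C_1 + M_1 = O_1, \qquad (C_2 + O_1) + M_2 = O_2$$

Read loosely: a primary mechanism ($M_1$), activated within the program context ($C_1$), produces a proximal outcome ($O_1$); that outcome, together with features of the broader institutional context ($C_2$), may then trigger a secondary mechanism ($M_2$) yielding a more distal outcome ($O_2$).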

Methodology

Theory-based evaluation methodology can generally be grouped into three phases: (1) stakeholder consultations, (2) data collection for targeted evaluation questions, and (3) data analysis and development of a coherent account of how the program works.

Target program and setting

We conducted the outlined phases within the Education Scholars Program (ESP) at the Centre for Faculty Development in Toronto. This 2-year, intensive, longitudinal program for health professional educators is intended to enhance faculty performance in teaching, curriculum design, leadership, and education scholarship, and to create a community of faculty developers. The ESP uses an iterative emerging curriculum design (Lieff 2009) to solicit feedback from participants throughout the program. Admission is offered biennially. Applicants are recruited through advertisements to health professional faculties affiliated with the University of Toronto. Selection is based on both sponsorship by host programs/departments and alignment of applicants’ expressed goals with program goals. Five cohorts have graduated since the program launch in 2004. Cohort sizes ranged from 12 to 20, with an average of 15.8 participants. Overall, graduates include 51 females and 28 males from a range of health professions: 62 physicians, 5 physiotherapists, 2 pharmacists, 2 occupational therapists, 2 speech-language pathologists, 2 scientists, and 4 other health professional educators.

Phase one: mapping provisional program logic

The first author facilitated a reflective review of the program’s curriculum and goals with key stakeholders including program leadership (BM, BAM, SL), coordinators and facilitators (BM, SL, LB). We then developed a logic model to diagrammatically represent provisional linkages between program activities, outputs, and anticipated outcomes (see “Appendix 1”). Some have recommended logic models in realist evaluation to help guide analysis and avoid confusion between program activities and program mechanisms (Astbury 2013).

Phase two: gathering data

We conducted retrospective secondary analysis of qualitative data collected over 10 years of the program. Data sources included: (a) focus groups with program participants, and (b) semi-structured follow-up interviews with program graduates. These data were initially collected to inform program improvement and planning. Focus groups and interviews were recorded and transcribed verbatim. Data were managed using QSR NVivo (Version 10). St. Michael’s Hospital Research Ethics Board approved this research (REB 07-182).

Focus groups with program participants

We conducted focus groups with each cohort at the end of each year of the 2-year program, using guiding questions developed collaboratively by research and program staff. Though the phrasing of questions evolved over the years, three consistent categories of questions were asked: (1) experiences of curriculum content and delivery, (2) experiences of personal and professional growth, and (3) perceived changes in their educational practice.

All program participants were invited to attend; focus groups were conducted during regular class time. Overall, 10 focus groups (58–80 min) were conducted between 2005 and 2014, with an average of 15 participants per focus group.

Semi-structured interviews with program graduates

Via e-mail, we invited program alumni (two to four years post-graduation in 2012) to participate in follow-up interviews. We developed a semi-structured interview guide (see “Appendix 2”) to explore how program participation influenced participants’ personal and professional outcomes, and the contextual factors that influenced these outcomes. Before scheduled interviews, we sent participants a copy of reflective papers they wrote while participating in the program; these papers were intended to trigger reflection upon their experiences through the program. LB conducted the interviews, which lasted between 60 and 90 min. Fifteen alumni participated. Data collection was considered complete once no new themes arose through the ongoing data analysis, consistent with principles of theoretical saturation (Dey 1999).

Phase three: data analysis

We employed a framework analysis approach (Ritchie and Spencer 1994; Ward et al. 2013), as it aligns well with our integrated theory-based evaluation approach; both derive from a critical realist paradigm (Snape and Spencer 2003). Framework analysis can include both deductive and inductive elements, and it provides a structured process for managing data while allowing for the flexibility associated with qualitative inquiry (Ward et al. 2013).

We developed a preliminary coding frame using key concepts of realist evaluation as well as codes derived from the logic model. Data that reflected change processes in affect, cognition, or behaviour at individual or group levels, in response to changes introduced by the program, were coded as mechanisms. Data that illustrated constraining or facilitating conditions that support mechanisms or outcomes were coded as contexts. Data that highlighted predicted or unexpected consequences of the program at individual, organizational, or system levels were coded as outcomes.

Coding proceeded iteratively but is summarized linearly for clarity. We analyzed focus group transcripts first, applying codes to relevant data extracts and adding emergent codes for unanticipated outcomes. We thus developed a refined coding framework that we then used to analyze the interview data, allowing for the emergence of additional codes. Our review of codes across data sets showed significant overlap across focus groups and interviews. Once both data sets were coded, we organized coded data extracts into higher-order thematic categories. We then considered the relationship between these thematic categories, provisional program logic (phase one), and resonant theoretical perspectives, using our stated psychological and psycho-educational lens. Thereafter, we compared coded contexts, mechanisms, and outcomes iteratively, mapping out potential CMO interrelationships (see Abhyankar et al. 2013) and exploring possible concatenation, whereby certain mechanisms and outcomes were seen to spur additional mechanisms and outcomes.
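The mapping step described above can be sketched computationally. The following is a minimal, purely illustrative sketch (the code labels and counts are hypothetical, not the study’s actual codebook or NVivo workflow): it tallies coded extracts by CMO type and counts within-transcript co-occurrences of codes as candidate CMO linkages to examine against theory, not as causal claims.

```python
from collections import Counter
from itertools import combinations

# Hypothetical coded extracts: (transcript_id, code_type, code_label).
# Labels are illustrative only, not the study's actual coding frame.
extracts = [
    ("fg_2009", "context", "protected_time"),
    ("fg_2009", "mechanism", "co_reflection"),
    ("fg_2009", "outcome", "self_efficacy"),
    ("int_13", "context", "leadership_support"),
    ("int_13", "mechanism", "empowerment"),
    ("int_13", "outcome", "scholarship"),
    ("fg_2014", "context", "protected_time"),
    ("fg_2014", "mechanism", "co_reflection"),
]

# Tally how often each context/mechanism/outcome code appears.
code_counts = Counter((ctype, label) for _, ctype, label in extracts)

# Group codes by transcript to look for within-transcript pairings.
by_transcript = {}
for tid, ctype, label in extracts:
    by_transcript.setdefault(tid, []).append((ctype, label))

# Count co-occurring code pairs as provisional CMO linkages
# to be interrogated against program logic and extant theory.
cooccurrence = Counter()
for codes in by_transcript.values():
    for a, b in combinations(sorted(codes), 2):
        cooccurrence[(a, b)] += 1

print(code_counts[("context", "protected_time")])  # prints 2
```

A pairing that recurs across transcripts (e.g., protected time co-occurring with co-reflection in this toy data) would flag a candidate context–mechanism relationship for closer qualitative examination; the quantification itself carries no explanatory weight.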

Results

Figure 1 models our findings by drawing on the concept of a mandala, a circular illustration of interrelated elements with special meaning within a given system. This illustrative model answers calls for a more fluid representation of CMO relationships than traditional realist evaluation matrices. The findings suggested two distinct, but critical, contexts that influence program participants’ behaviour and outcomes: (1) the immediate faculty development program context, represented by the innermost circle, or core, of the mandala (C1), and (2) the broader institutional context of the academic health sciences system, represented by the outermost circle (C2). Through our analysis, we classified mechanisms as either primary (M1) or secondary (M2) depending on whether participants’ accounts suggested they were directly (primary) or indirectly (secondary) influenced by the program context. Primary mechanisms (M1) are represented in the second innermost layer of the mandala; outcomes (O) are represented in the third, and secondary mechanisms (M2) in the fourth layer of the mandala model.

Fig. 1 A mandala of faculty development: Modelling contexts, mechanisms, and outcomes

The results leading to the formulation of this mandala model are discussed below along with supporting data extracts. Explication of pertinent theoretical concepts is provided in the “Discussion” section.

Immediate program context (C1)

Findings on the within-program context (C1) suggest that faculty development programming can create a unique structural reality by providing conditions that facilitate certain change mechanisms. Four features of this context were salient across participant data:

Protected time

Participants were required to secure four and a half hours of protected time weekly, prior to admission. For most participants, this provided an unprecedented opportunity for focused attention on their roles as educators.

The protected time gives me the opportunity to free up my mind… I wouldn’t have been able to think about any other issues around my career as an educator without that opportunity. (Focus Group 2009, group 2)

Conversely, participants who received protected time from departmental but not clinical duties experienced a sense of ongoing conflict between clinical and educational roles that constrained their engagement with program activities.

I’d be checking my phone during class to make sure everything is good with the patients. So I developed this mentality … when I came here it was like, it better be worth it. If I had coverage it would be different. (Focus Group, 2014)

Safe learning climate

Program participants perceived the program as a safe space that provided a platform for exploring ideas and sharing experiences, challenges, and concerns, without fear of suffering negative consequences.

They did a really good job of fostering a sense of safety… if I feel safe in an environment, I’m more likely to share. And I’m more likely to be open and I’m more likely to develop relationships (Focus Group, 2014)

Social interaction and networking opportunities

The program provided an opportunity for health professional educators across the academic health sciences system to interact with other educators with whom they would not otherwise have engaged in a meaningful way.

The most rewarding thing has been spending an afternoon for two full years sitting in a room with people who actually care about this stuff, and that’s a rare opportunity. (Focus group 2012, group 2)

Exposure to educational theory and experienced practitioners in education

The curriculum was designed to expose participants to multiple theoretical perspectives in education and to relevant skills. It was delivered by experienced facilitators and established academic leaders (and practitioners) in the academic health sciences.

Really the who’s who of education… [are]… not just sharing their knowledge and their craft but showing you how to deliver that content. (Interview, Scholar 8)

Primary mechanisms (M1)

The described program context (C1) functioned to create a safe, social learning space, where multiple inter-related mechanisms unfolded, and subsequently influenced affective, cognitive, and behavioural outcomes. Four mechanisms (M1) were identified consistently across data as being spurred by the program.

Collaborative-reflection (or co-reflection)

Program activities triggered ongoing reflective dialogue about real-life challenges affecting participants’ academic practices. As active practitioners, participants could draw from, share, and reflect upon each other’s experience-based knowledge. This process of collaborative reflection was highly valued by program participants; they perceived it as central to their development as it was a source of ongoing insight and support.

Self-reflection and self-regulation

The program also triggered introspective reflection, as program activities often necessitated examination of participants’ sense of self in relation to their educational practices, roles, career ambitions, and personal goals. In turn, participants could regulate or modify personal behaviours in order to act upon insights gained.

On a very personal basis, I’m shifting my focus on teaching to [something that’s] not a chore, but something that I can enjoy doing … I transform it to make it fun not only for me but for the learners as well. (Interview, Scholar 16)

Relationship building

Through the course of the program, ongoing social interactions facilitated the development of both personal and professional relationships among participants. The resulting social affiliations formed the basis for ongoing support and future collaborative initiatives.

These are new lifelong or career long relationships that we’re building. What’s great about them is the support and the kind of encouragement (Focus Group 2011, group 3)

Pedagogical knowledge acquisition through various forms of learning

Participants reported the acquisition of knowledge about educational theory and practice.

I’ve always felt that I’ve been faking it up to now. And now I can learn the terms and names and theories behind what I was doing. (Focus Group 2009, group 1)

Although this knowledge acquisition likely occurred through various forms of learning, many participants reported observational learning as central to their development.

Outcomes (O)

Data suggest that faculty development programming influenced a broad range of outcomes. They are organized thematically into three levels: individual (OI), educational practice (OE), and organizational (OO), with the express acknowledgement that these levels are not mutually exclusive. Outcomes are discussed below, and select data extracts are presented in Table 1.

Table 1 Data Extracts Highlighting Outcomes of Longitudinal Faculty Development Programming

Individual outcomes (OI)

Participants attributed several changes in attitudes, knowledge, and work roles to their participation in the program. Some faculty reported less anxiety in their roles. Many felt more knowledgeable and skilled. Relatedly, most reported greater confidence and self-efficacy as both educators and leaders. The data also highlighted a stronger sense of academic identity among faculty and an enhanced sense of meaning and purpose in their roles. Several graduates also received promotions to senior roles, which they attributed to their completion of the program and gains made therein.

Educational practice outcomes (OE)

Participants reported several positive changes to their educational practices as a function of program participation. These included modifications to their teaching, including giving feedback, relating with learners, and performing teacher evaluations. Many believed their educational practices were generally more theoretically informed. One of the ways this manifested was in the adoption of a shared language of practice with other educators, corresponding to educational concepts and theories acquired during the program. Most participants reported significantly greater engagement in educational scholarship; for instance, participants conducted research and disseminated their findings. Additionally, faculty reported having greater credibility as educators within their academic units. Indeed, many graduates reported functioning in consultative roles, serving as key contact persons for their peers who sought them out for advice and leadership on a range of educational initiatives within their work units.

Organizational outcomes (OO)

Many reported outcomes had implications that extended beyond the individual. Many faculty described having the agency to make changes to ongoing academic practices within their work units, upon completion of the program. This was manifested in two ways. One, participants recommended and enacted quality improvements to teaching and curriculum development practices, and supported relevant innovations. Two, participants took advantage of enhanced credibility and positional power to advocate for reform to reward systems within the academic health sciences that may disadvantage educators and undermine educational practice. Furthermore, program participants and graduates reported contributing to, and leading, the development of programs and revision of various curricula for students, residents, and/or faculty. Nearly all program graduates reported being part of a much larger network of educators upon graduation. They described having established greater social affiliations across professional and locational boundaries.

Secondary mechanisms (M2)

Our analysis revealed interactions among primary change mechanisms, shorter-term outcomes, and contexts that set the stage for secondary change mechanisms to unfold over time. For example, primary mechanisms (e.g., self-reflection and regulation) and specific outcomes (e.g., enhanced self-efficacy) could work in concert, spurring secondary mechanisms of change (M2) (e.g., psychological empowerment) and further outcomes. Our analysis suggests two such secondary mechanisms (M2) were influenced indirectly by faculty development programming.

Psychological and structural empowerment

We identified a pattern of empowerment underlying program graduates’ agency to enact changes in the workplace. We conceptualized this empowerment as partly intrinsic (psychological) and partly derived from emerging power within one’s organization (structural). For example, one participant credited both inspiration and motivation, as well as emerging professional networks with the ‘right people’ within the system as drivers of his/her educational scholarship work.

I got involved in doing some scholarly work in education which was very, very new to me and definitely it would not have happened if I hadn’t gone through the program, because it gave me relationships with the right people, but also some basic knowledge and inspiration and motivation to try something and take a risk. (Interview, Scholar 13)

Relationships with people in key positions, some of which were facilitated by program participation, were a critical, albeit informal, source of power that helped program graduates drive various agendas forward.

You go to the people for information, for access to tables, you go to them for favours … information, access, favours, ideas, inspiration, learning … It helps you get things done. (Interview, Scholar 8)

Having those ESP contacts has definitely changed the access to other hospitals (Interview, Scholar 13).

Conversely, program graduates who worked in contexts where they felt isolated or unable to engage peers felt disempowered.

There are not a lot of people that are very active on [my Committee]… it’s me that does all the thinking work and the planning work… I’m pretty much discouraged about that at the moment, about the whole [organization], and the [committee]. (Interview, Scholar 9)

Formal leadership positions were also a source of structural empowerment, and the absence of a formal leadership title, in some cases, limited participants’ capacity to drive an agenda forward, even in the presence of relationships with individuals in powerful positions.

I’m on [a] council … and it is herding cats. I’m the driving force and I’m not at the director level, so to be the driving force at my level is a little difficult, I have a strong [director] ally … but it’s challenging to just kind of, because I’m not at all the tables I need to be at….(Interview, Scholar 8)

System (academic health sciences) context (C2)

Faculty development program context, change mechanisms and outcomes were all influenced by the broader institutional context of the academic health sciences (C2). Our analysis showed that the realization of outcomes spurred by faculty development could be further facilitated or constrained by certain features of faculty members’ institutional environments. We identified four contextual factors (C2) as particularly critical influences on faculty outcomes.

Senior leadership support

The priorities and support of senior leadership (to whom faculty were accountable) had a significant impact on the capacity of several participants to realize the broader benefits of faculty development. Participants unable to engage the support of senior leadership were limited in their capacity to transfer knowledge gains:

Part of your job when you take the course is to bring back to your department what have you gained. So one of my chiefs was asking me, I started talking about it and I think I heard his eyes rolling when I looked away… it’s difficult to explain from somebody who’s not buying what you’re selling. (Focus Group 2011, group 3)

Other program graduates worked under senior leaders who relegated educational activities to a low priority status.

I had a [formal educational role at my hospital]. We recently had a change of chief and basically I got notification that that job was gone…. The education role basically, without any discussion, she just decided that that was not a role that she wanted continued at all… And also she has taken away funding from things that I had gradually gained funding for (Interview, Scholar 4)

Resource availability

Time and financial resources were key influences on program graduates’ intention and capacity to incorporate gains from faculty development within their practice. Conflicts among educational demands, clinical duties, work-life balance, and/or secure compensation were frequent. In addition, competition for funding for research and new programming was intense; securing funds was critical to success.

I became [a program director]—I needed two half days a week to do what I needed to do… Of course I ended up doing it on evenings and weekends… I needed serious real protected time and was able to get money to protect my time so that I didn’t have to see patients. (Interview, Scholar 13).

Community engagement

Graduates described needing the engagement of other stakeholders within their institutions in order to introduce successful change initiatives. These stakeholders could be students whose participation was needed, or peers whose support was essential for collective decision-making. Stakeholder engagement was not always forthcoming, and program graduates frequently encountered resistance to their efforts to enhance educational practices.

We have a [system of assessment] that is a complete disconnect with what they’re trying to prove and what they’re actually testing. So I want to get involved from within and to transform that vision or to modernize it… the reactions have been mixed. Again, there are people who are very old school, very set in their ways and say this is… nonsense and others are intrigued. (Interview, Scholar 16)

Community needs

Across the data set we identified the recurrent theme of responsiveness to emergent needs within various work units. Program graduates often needed to be presented with, or to seek out, opportunities to lead or support educational initiatives that addressed various needs within their academic community.

I’m building a curriculum in [my clinical practice area]… we got some industry funding to support that because there’s a need in the community. (Interview, Scholar 16)

These opportunities provided participants with suitable platforms for realizing the transfer benefits gleaned from faculty development programming.

Discussion

Overview of findings

This research involved integration of realist and theory-driven evaluation principles to explore how faculty development works in the academic health sciences system. Specifically, we evaluated an intensive longitudinal faculty development program by examining relationships among contexts (C), primary and secondary change mechanisms (M), and outcomes (O). Concurrently, we engaged in a theoretically grounded examination of potential CMO interactions and interrelationships.

CMO relationships in light of extant theory and literature

Certain features of the program context were critical to creating a space conducive to generating multiple change mechanisms: time, a safe space, exposure to pedagogical knowledge, and multiple means of social interaction. These findings in and of themselves are unsurprising. But a notable contribution of the mandala is its demonstration of the interactions among inextricable layers of program context, mechanisms, and broader contexts. Whereas exposure to curricular content can spark participants’ acquisition of knowledge, other contextual features are critical to supporting this process and triggering change mechanisms beyond individual learning. For instance, the protection of time to focus on education was deemed essential for professional development; when this contextual precondition was absent, meaningful participation in program activities was constrained and other mechanisms were inhibited. Existing research identifies dedicated time for learning as critical for collective learning, reflection, scholarship and leadership development (Rushmer et al. 2004; Zibrowski et al. 2008). The perceived safety of the faculty development learning space was also an enabler of change mechanisms. Indeed, safe educational environments have been characterized as spaces of mutual trust between teachers and learners, where learners feel comfortable enough to experiment, voice concerns, acknowledge uncertainty, and stretch limits (Hutchinson 2003; Young et al. 2015). Program conditions that allow faculty to openly exchange information and creative ideas, as well as discuss systemic challenges without fear of social or political repercussions, are more likely to enable the identified change mechanisms. Similarly, ongoing opportunities for social interaction within the program provide the necessary foundation for relational and collaborative change mechanisms. We explore the identified primary change mechanisms next.

First, reflective processes were identified as key mechanisms for the facilitation of outcomes from faculty development, both at the social (collaborative-reflection) and individual (self-reflection) levels. Here, the theory-based approach was particularly useful in drawing meaning from the data. Most papers in health professions education relating to reflection focus on individually-oriented reflection (Ng et al. 2015); a notable finding in this study was participants’ valuing of both collaborative and self-reflection. Reflection has been described as a cognitive process, comprising intellectual and affective activities in which examination of actions performed, or incidents encountered, can lead to new understanding and appreciation (Boud 2001). Collaborative-reflection (co-reflection) has been defined as a shared critical thinking process that involves cognitive and affective interactions between two or more people to facilitate mutual sharing of insights and experiences to reach new inter-subjective understandings and appreciations (Yukawa 2006). Professional development program activities that bring together groups of practitioners can establish a social foundation for collaborative reflection on real-life challenges (Watkins et al. 2011). Hetzner et al. (2012) found that shared reflection within the workplace was a useful tool for isolated and stressed workers seeking to adapt their knowledge and skills to changes in their work environment. Co-reflective practice has been identified as a central mechanism of change for leadership development programming (Lyso 2010; Watkins et al. 2011) because through co-reflection, a group can share experiences, weigh arguments and evidence, as well as examine alternative perspectives (Yukawa 2006). Our findings are consistent with this literature and suggest the value of extending the study of reflection beyond the ‘self.’

Another interesting finding related to reflection was the linked nature of self-reflection and self-regulation. Self-reflection involves introspective analysis of feedback information (Moon 2004), which can arise from a variety of formal and informal sources (Ashford 2003; Moon 2004; Nesbit 2012). In this study, the program provided a rich source of ongoing, multi-source feedback that could be analysed introspectively in relation to one’s own educational practices, career ambitions, and personal goals. Consequently, self-reflection led to insights on how to enhance educational practices, as well as insight about career management, and clarification of the purpose and meaning of participants’ roles within the broader academic health sciences system. In our findings, self-regulation was paired with self-reflection as its active companion. Extant literature states that it is the process of self-regulation that guides the insights of self-reflection towards goal-directed activities over time (Karoly 1993). Ultimately, ongoing self-regulation allows for the development of action strategies, adoption and maintenance of new behaviours, as well as ongoing adaptation to changing environmental circumstances (Nesbit 2012). Within the context of longitudinal programming, there can be a cyclical developmental process between self-reflection and self-regulation; faculty have ongoing opportunities to use the program as a source of feedback that informs self-reflection. They can then regulate behavior accordingly and gain additional feedback.

Finally, relationship building was also identified as a key primary change mechanism. The program provided a platform for development and nurturing of collegial relationships across professional and organizational boundaries. These relationships set the foundation for subsequent collaborative initiatives and professional support networks. In the extant literature, faculty development has been described as an entry point into a new social and intellectual community of practice that can help participants overcome feelings of isolation in their own departments and divisions (O’Sullivan and Irby 2011). The layered nature of the multiple contexts represented in our mandala model builds on the layered community of practice model outlined by O’Sullivan and Irby (2011) and advocated by Steinert (2010) and Leslie et al. (2013).

Our integrated theory-based evaluation approach allowed us to explore how primary change mechanisms and short-term outcomes can set the stage for secondary change mechanisms that unfold over a longer time period. We identified empowerment processes as key secondary mechanisms of change, triggered by a confluence of CMO interrelationships across program and institutional contexts. Empowerment has been conceptualized as both psychological (Spreitzer 1995a; Thomas and Velthouse 1990) and structural (Kanter 1993; Laschinger et al. 2001, 2004). Psychological empowerment is an intrinsic motivational vehicle through which individuals can affect their work roles or context (Spreitzer 1995a; Thomas and Velthouse 1990). It has been empirically and positively linked to specific individual-level outcomes identified in this research, including a sense of meaning, competence, and efficacy about one’s work (Spreitzer 1995a, 1995b, 2009). These individual-level outcomes (OI) can interact with the identified primary mechanisms (M1), such as self-reflection and self-regulation, to catalyze psychological empowerment. Ergo, psychological empowerment can function as a secondary mediating mechanism (M2) of the relationship between faculty development (C1) and broader organizational outcomes (OO) such as those identified in this research.

Independently of psychological empowerment, organizational structures can empower or constrain individuals in achieving various goals (Kanter 1993). Structural empowerment can be formal (discretionary decision making power) or informal (social networks) and is derived from access to opportunities, resources, information, and support (Kanter 1993; Laschinger et al. 2001, 2004). Our research shows that the faculty development context (C1) can be instrumental in facilitating access to critical information (e.g., pedagogical expertise), enabling development of relationships (M1) and the establishment of professional support networks (OO) or the securing of formal leadership roles (OI) with discretionary decision making power. In this study, professional networks and formal leadership positions empowered program graduates’ advancement of various educational agendas. Conversely, we found that program graduates who reported limited access to opportunities, resources, information and support had constrained capacity to enact change. Existing research has linked psychological and structural empowerment positively to innovative behaviour in health care environments (Knol and Van Linge 2009). Empowerment is also positively related to job performance (Butts et al. 2009). Thus both psychological and structural empowerment can act to support change agency, with faculty bolstered to influence the institutional context (C2).

Implications for faculty development

First, our findings show that, within conducive institutional (C2) and program contexts (C1), faculty development can trigger primary mechanisms (M1) that generate positive outcomes (O) for both individual faculty and their institutions. Over time, the confluence of these outcomes and associated CMO interrelationships can generate secondary mechanisms (M2) that spur further outcomes. The breadth of reported outcomes underscores the broad span of functions within the academic health sciences system that can be influenced by faculty development. Not only does faculty development have the potential to enhance pedagogical competence, it can also aid the development of faculty as agents of change and community leaders, forming networks of education leaders at the forefront of curricular innovation and academic reform. Faculty developers should thus espouse a sense of responsibility for maintaining faculty development programs as bastions of best-practices—ensuring that the lessons conveyed reflect high standards and positive ideals for the academic health science systems of the future.

Second, the results provide a useful model for examining practice issues in the delivery of faculty development. The model provides some specification of what needs to happen, and what broader contextual features need to be considered, in order for desired goals or impact to be achieved. It allows for the examination of links between what program creators assume they are accomplishing, and what changes actually unfold along the way. Faculty developers can question their effectiveness in creating contexts where critical mechanisms can unfold, for example. They can explore whether program components work as expected by conducting targeted examination of areas where the program may excel or fail to meet expectations. They can also strive to find ways to offset the lack of empowerment that may constrain faculty agency in their organizations. Furthermore, theory-based evaluation can be expanded to theory-informed program planning (Baldwin et al. 2004), where results of evaluation are fed back into program development to inform revision of program design and delivery strategies. Finally, we can strive to avoid the short-termism that characterizes many faculty development evaluation efforts and draw on empirically established links to inform future scholarship.

Strengths and limitations

Few faculty development programs can attest to employing a consistent approach to collecting evaluative data over an extended period. The 10-year span of data collection, and subsequent follow-up with long-term program graduates, are major strengths of this study. One limitation lies in the use of secondary data analysis. Integration of theory-based evaluation principles into the original design of data collection instruments may have yielded richer data. A second limitation is our reliance on self-reported data. However, this limitation is offset in several ways. Logic modelling with input from stakeholders, and curricular documents, enabled a clear understanding of program activities, ensuring that key attributes of the program were not ‘lost’ in the evaluation process. Additionally, the consistency in findings across the span of a decade supports trustworthiness of the data. Adherence to the central principles of realist and theory-driven evaluation enabled a theoretically informed and robust evaluation. Moreover, several reported outcomes (e.g., scholarly publications, leadership roles, new programming) are independently verifiable, and can be explored in future research. Finally, our circular illustrative model of CMO relationships may be an incomplete representation of what happens within longitudinal faculty development programming. Indeed, it has been said that “all models are wrong, but some are useful” (Box and Draper 1987). It is our goal to provide a useful platform for ongoing evaluative efforts in faculty development, thus fulfilling one of the principal aims of theory-based evaluation.