Dr Geoff Wong, Centre for Primary Care and Public Health, Blizard Institute, Barts and the London School of Medicine and Dentistry, London E1 2AB UK. Tel: 00 44 20 3112 0800; Fax: 00 44 20 3112 0808; E-mail: email@example.com
Medical Education 2012: 46: 89–96
Context Education is a complex intervention which produces different outcomes in different circumstances. Education researchers have long recognised the need to supplement experimental studies of efficacy with a broader range of study designs that will help to unpack the ‘how’ and ‘why’ questions and illuminate the many, varied and interdependent mechanisms by which interventions may work (or fail to work) in different contexts.
Methods One promising approach is realist evaluation, which seeks to establish what works, for whom, in what circumstances, in what respects, to what extent, and why. This paper introduces the realist approach and explains why it is particularly suited to education research. It gives a brief introduction to the philosophical assumptions underlying realist methods and outlines key principles of realist evaluation (designed for empirical studies) and realist review (the application of realist methods to secondary research).
Discussion The paper warns that realist approaches are not a panacea and lists the circumstances in which they are likely to be particularly useful.
This paper is a response to calls for changes in the way educational interventions are researched. Although the randomised controlled trial (RCT) undoubtedly has an important place, its position as the reference standard study design is under question because: (i) intervention-on/intervention-off comparisons answer only some of the important questions in education research; (ii) such experimental comparisons are sometimes impractical, unethical, inappropriate or unaffordable, and (iii) people behave differently when they are participating in trials. Education researchers have highlighted a need for more research addressing ‘how’ and ‘why’ questions, especially for studies that help to build theory about the link between educational interventions, learner outcomes and service impacts.1,2
In relation to e-learning, for example, Cook et al.3 have argued that researchers need to move on from working out whether e-learning ‘works’ and measuring its ‘effect size’ to exploring the types of learner, social settings and pedagogical circumstances in which e-learning is likely to be acceptable, effective and cost-effective and, equally importantly, those in which it is not.
In this article we outline one approach, realist evaluation (and the secondary research equivalent, realist review), which is suited to exploring ‘how’ and ‘why’ questions in education research, although we do not claim that it is uniquely so. We start by considering the nature of medical education interventions and then consider what realism is and how it might be applied to education research. We include a brief summary of a realist review of e-learning4 and show how our findings might complement and extend the Cochrane-style systematic review conducted by Cook’s team.3 Finally, we suggest areas in medical education research in which realist methods might be used to extend the current knowledge base.
The nature of interventions in medical education
In the language of the Medical Research Council,5 education is a complex intervention in that it is ‘built up from a number of components, which may act both independently and inter-dependently’. Outcomes of medical education interventions are highly context-dependent – the impact of the ‘same’ intervention will vary considerably depending on who delivers it, to which learners, in which circumstances and with which tools and techniques. The decisions and actions taken by the ‘human components’ – that is, driven by human agency6,7 – generate the outcomes. An educational intervention (singular) actually consists of multiple components or parts that interact with one another in complex ways, often in a non-linear fashion. Even something as seemingly simple as an undergraduate lecture can be subdivided into multiple factors that influence its impact. These include: the learning objectives (and how these are linked, if at all, to the assessment method); the knowledge, experience and personal qualities of the lecturer; the preparatory work (if any) given to students; the content of the slides shown; the level of interactivity invited; the handouts or study notes, and the ‘homework’ set (not to mention the acoustics of the lecture theatre, the university’s policy on mobile phones and the background and motivation of the students involved). As a result of all these interactions (and of more, which may go unrecognised and unmeasured), the amount of time and effort put in by a course tutor does not necessarily result in an equal amount of learning on the part of each of his or her learners.
Although complexity, interdependency and non-linearity are familiar enough from our own experience as teachers, they pose challenges for the education researcher. In some (rare) circumstances, we can undertake controlled experiments (e.g. RCTs) and produce more or less generalisable statements of the general format ‘if learners are exposed to educational intervention X, outcome Y will be observed’ (X→Y). Indeed, in some cases the link between X and Y is so strong and so predictable that a trial may be unnecessary (e.g. there is probably no need to carry out an RCT to test whether putting a topic on the examination syllabus will make students more committed to attending classes on that topic).
In most cases, however, the link between intervention and outcome is less predictable, such as in the evaluation of problem-based learning (PBL) in undergraduate education. As a review commented: ‘[PBL] has swept the world of medical education since its introduction 40 years ago, leaving a trail of unanswered or partially answered questions about its benefits. The literature is replete with systematic reviews and meta-analyses, all of which have identified some common themes; however, heterogeneity in the definition of a “problem-based learning curriculum” and its delivery, coupled with different outcome measurements, has produced divergent opinions.’8
For most medical educators, the issue in question has progressed from that of ‘Does PBL work?’ to ‘When and for whom should we offer PBL, and how might we maximise its benefits?’ Implicitly, these questions raise the issue of causation: how, exactly, are the benefits generated and why do they differ for different students? We suggest that a more nuanced chain of causation of the general format exists, whereby if learners are exposed to educational approach X, outcome Y is more likely provided a whole range of contingencies A, B, C, D, etc. are in place. A key goal of research is, then, to explore the relationships and interdependencies between A, B, C and D.
A number of approaches can be taken to unpacking the ‘black box’ (i.e. the ifs and buts in the chain of causation) of an educational intervention. Experimental trials may be accompanied by systematic quantitative and qualitative data collection to try to elucidate the ‘why’ and ‘how’ of the X→Y link. Although qualitative data can describe key processes, such methodologies struggle with generalisation. Conversely, quantitative data can describe outcome patterns and the relationship between causal variables, but cannot elucidate the underlying processes that generate these patterns. The formidable challenge for the researcher is to utilise a method that is able to combine the strengths of these two types of data to produce a coherent and plausible explanation of the contents of the black box.
In this paper, we focus on a particular approach to studying systematically how, when and why medical education interventions work. This is known as realist evaluation and was originally developed by sociologists Ray Pawson and Nick Tilley to explore the underlying causal processes by which programmes achieve their outcomes.6,7
What is realism?
In this context, ‘realism’ refers to a philosophy of science which sits, broadly speaking, between positivism (‘there is a real world which we can apprehend directly through observation’) and constructivism (‘given that all we can know has been interpreted through human senses and the human brain, we cannot know for sure what the nature of reality is’). Realism agrees that there is a real world and that our knowledge of it is processed through human senses, brains, language and culture. However, realism also argues that we can improve our understandings of reality because the ‘real world’ constrains the interpretations we can reasonably make of it.
Realism can be used to help us understand the social world. When used in this way, it acknowledges the existence of an external social reality and the influence of that reality on human behaviour. For example, consider the marks awarded to an essay by three tutors who hold different philosophical stances. For the positivist tutor, the mark represents an objective given property of the essay. The constructivist tutor sees the mark as representing a subjective property of the essay (the value or quality of the essay lies in the eye of the beholder). The realist tutor accepts that the mark awarded represents some objective property of the essay, but considers that how the mark is arrived at and what it means are influenced by an external social reality.
This reality forms the context in which the essay is marked and includes, for instance, prevailing beliefs (e.g. whether use of the Queen’s English is viewed as a relevant indicator of scholarship or an irrelevant reflection of social privilege), social and cultural norms (e.g. whether an essay about a clinical case should be presented using a particular set of subheadings starting with ‘history of the presenting complaint’ or presented more narratively in a way that is faithful to the patient’s own words), regulations (including whether marks are allocated by an intuitive overview or according to a structured marking scheme) and economic forces (e.g. the funding situation of higher education, which will influence how much time the tutor allots and whether the essay is second-marked by another tutor).
To understand the relationship between context and outcome, realism introduces the concept of ‘mechanism’. A mechanism may be usefully defined as: ‘…underlying entities, processes, or [social] structures which operate in particular contexts to generate outcomes of interest.’9 Certain contexts in the social world around us ‘trigger’ mechanisms to generate outcomes (sometimes abbreviated to C–M–O). Mechanisms in social science are comparable but not identical to mechanisms in natural science (e.g. the mechanism of gravity accounts for why an object dropped from a window falls to the ground). Like the mechanisms in natural sciences, they possess a number of features: they are not ‘visible’, but must be inferred from the observable data; they are context-sensitive, and they generate outcomes.
Social programmes change the resources or opportunities available to participants and, in that sense, change the context for those participants. The new context then triggers new mechanisms. Thus, programme mechanisms can be identified by asking what it is about a programme that generates change. An intervention itself does not directly change its participants; it is the participants’ reaction to the opportunities provided by the programme that triggers the change (see examples below). A realist approach therefore looks for interactions between the opportunities or resources provided by the intervention and the reasoning or responses of the participants. One route to identifying mechanisms is to conduct a thought experiment: when asked how the programme influenced him, a participant might say, ‘It makes me ponder X, see alternative Y, realise opportunity Z, etc.’
Imagine, for example, asking learners on a widening-access course for medicine to write free text responses to the question: ‘How have you changed as a result of coming on this course?’ Students might variously respond: ‘I started to think more deeply about scientific problems’; ‘I met real medical students and saw that you could still have fun whilst learning medicine’; ‘I started to believe in my own ability’, or ‘I made some friends and we are planning to keep in touch on Facebook as we prepare for the UKCAT test’.10 These responses give an inkling of the complex outcomes that might be generated by a widening-access course and also suggest potential mechanisms by which the opportunities provided by the course might improve students’ likelihood of applying to medical school and their competitiveness for places, thereby widening access. Expressed at a slightly higher level of abstraction, these mechanisms might be described as ‘promoting reflection and deep learning’, ‘increasing motivation through vicarious experience’, ‘building confidence’ and ‘providing mutual support’.
This example illustrates that the mechanisms by which educational interventions ‘work’ are often multiple, that some mechanisms are obvious and correspond to those intended by the course designers, and that some are less obvious and are unanticipated by the designers. It also illustrates that a mechanism is not inherent to the intervention, but is a function of the participants and the context. Technology-naïve clinicians on a continuing professional development course are less likely to organise themselves into Facebook groups than are 16-year-olds on a widening-access course, even if the underlying course design – small-group project work – is similar. Note also that the same educational opportunity (context) may provoke different reactions (and therefore different mechanisms) in different learners. In a widening-access course, for example, two different students may perform well (outcome) when they are set the task of presenting a case example to their peers, one because he perceives it as a challenging academic task that ‘makes me get my finger out and go to the library’ (mechanism) and the other because she perceives it as a performative task that ‘makes me want to overcome my shyness’ (mechanism).
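For readers who work with analysis scripts, the C–M–O configurations in the widening-access example above can be recorded as simple structured data and grouped by mechanism rather than by intervention. This is only a minimal sketch: the record fields, the example wording and the grouping step are our own illustrative assumptions, not a standard realist coding scheme.

```python
from collections import defaultdict
from typing import NamedTuple

# One context-mechanism-outcome (C-M-O) configuration, as a simple record.
# Field names are illustrative, not a prescribed realist data format.
class CMO(NamedTuple):
    context: str
    mechanism: str
    outcome: str

# Hypothetical configurations paraphrasing the widening-access example:
# the same task can fire different mechanisms in different learners, and
# the same course design can fire different mechanisms in a new context.
configurations = [
    CMO("widening-access course, peer presentation task",
        "building confidence", "performs well"),
    CMO("widening-access course, peer presentation task",
        "promoting reflection and deep learning", "performs well"),
    CMO("CPD course for technology-naive clinicians, small-group project work",
        "providing mutual support", "little peer networking"),
]

# Grouping by mechanism ('families of mechanisms' rather than 'families of
# interventions') mirrors the realist focus described in the text.
by_mechanism = defaultdict(list)
for c in configurations:
    by_mechanism[c.mechanism].append((c.context, c.outcome))

for mechanism, pairs in sorted(by_mechanism.items()):
    print(mechanism, "->", pairs)
```

Grouping this way makes the cross-context comparison explicit: each mechanism carries its triggering contexts and resulting outcomes with it, which is the unit of analysis a realist researcher reasons over.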
An important principle of realism is that, by contrast with, say, a drug–receptor interaction, the ‘causes’ of outcomes are not simple, linear and deterministic. Clear learning objectives, available reading materials, a culture of critical questioning and interactive discussions (all part of an educational intervention) will not cause students to pass their examinations in a stimulus–response manner, but they make this outcome more likely. The generative mechanisms in this example, which include perceptions of the course’s quality, ease of use, reflection and deep learning and so on, illustrate characteristics of mechanisms more generally: they cannot be seen or measured directly (because they happen in people’s heads); they are context-sensitive (reflection, for instance, may be preferentially triggered when students know that deep learning will be assessed); they are multiple (hence, when researched, they need to be unpicked, defined and prioritised), and they are best expressed at a somewhat abstracted level so that they are not tied unnecessarily to particular people, places or things.
Educators will be familiar with the observation that varying outcomes occur. Not all widening-access courses will produce the same outcomes: some fail to widen access at all, others have mixed outcomes and a few succeed spectacularly. The realist explanation for this variability revolves around mechanisms and their interactions with other mechanisms and context. Although the endless permutations and combinations of interactions among context and mechanisms, and among mechanisms themselves, might be expected to produce no observable patterns, the fact that they do points to a particularly important feature of mechanisms. Where patterns do occur, realism postulates that this occurrence may be in part explained by the notion that similar mechanisms are being triggered. Thus, it is likely that successful widening-access courses have something in common: broadly speaking, each course has been designed in such a way as to increase the chances that the ‘right mixture’ of contextual influences will trigger the most relevant mechanisms to generate the desired outcomes.
In short, realism holds that mechanisms matter a great deal because they generate outcomes, and that context matters a great deal because it changes (sometimes very dramatically) the processes by which an intervention produces an outcome. Both context and mechanism must therefore be systematically researched along with intervention and outcome. By implication, research or evaluation designs that strip away or ‘control for’ context with a view to exposing the ‘pure’ effect of the intervention limit our ability to understand how, when and for whom the intervention will be effective.
What is realist research?
Realist research explores the link between context, mechanism and outcome by asking the question: ‘What works, for whom, in what circumstances, in what respects and why?’ We can put this more accurately and specifically for this readership: ‘What kinds of educational interventions will tend to work, for what kinds of learners, in what kinds of contexts, to what degree, and what explains such patterns?’ Theoretical explanations of this kind are referred to as ‘middle-range theories’ (i.e. they ‘…involve abstraction… but [are] close enough to observed data to be incorporated in propositions that permit empirical testing’11). Realist middle-range theories (i.e. theories that are constructed as part of a realist synthesis or realist evaluation) will be built around one or more mechanisms, but will involve more than just the mechanisms (i.e. they will also involve C and O). Realist research does not prove or disprove particular middle-range theories. Rather, it produces explanations which: (i) plausibly account for observed patterns in the data; (ii) accommodate (as far as possible) the range of contingencies and exceptions found, and (iii) fit closely and build on current best understandings of the field. A good realist theory is open for further testing and iterative refinement against empirical data.
Realist evaluation is primary research that is firmly grounded in and applies the realist philosophy of science. How one undertakes a realist evaluation cannot be expressed simply in technical or sequential terms (first do X, like this, then move on and do Y, like this). Rather, a realist evaluation of a medical education intervention is an iterative explanation-building process and might draw from any of the following realist approaches, using them judiciously, flexibly and in combination:
1 designing the evaluation to take account of hypothesised contexts, mechanisms and outcomes;
2 collecting data from course designers and tutors on how the course is intended to produce learning outcomes (i.e. building ‘programme theories’), either quantitatively (e.g. using structured questionnaires) or qualitatively (e.g. using interviews, focus groups, bulletin board discussions, open-response questionnaire items, review of course documentation and materials);
3 collecting data from students through similar methods to identify additional mechanisms, unintended by the designers, which support or interfere with intended (or unintended) outcomes;
4 analysing data as the study unfolds and tailoring further data collection to help confirm, refute or refine emerging programme theories (comparisons between and within groups can act as powerful tools with which to raise questions about C–M–O relationships);
5 producing preliminary thematic summaries of findings, in which mechanisms are carefully defined and prioritised, and refining these further through discussion amongst team members and presentation to others (steering group, staff, students);
6 testing interpretations further by explicitly seeking disconfirming or contradictory data and alternative explanations (disagreements in interpretation may highlight different mechanisms firing in different contexts);
7 writing and refining an over-arching explanatory account, working mainly from interim analysis documents and using the narrative form as a synthesising device (i.e. ‘writing up’ begins early and is an integral part of the research process), and
8 if possible within an overall research programme, repeating the process in a different setting or with a different staff or student cohort. This supports comparisons across contexts and thus represents the essence of the realist endeavour.
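The iterative logic of steps 4–6 above (confirm, refute or refine emerging programme theories, and explicitly seek disconfirming data) can be sketched in code. Everything here is a hypothetical illustration under our own assumptions – the function name, the toy data and the simple support count are ours, and real realist analysis is interpretive rather than mechanical.

```python
# Illustrative sketch of the confirm/refute/refine cycle (steps 4-6 above).
# All names and data are hypothetical; this is not a prescribed procedure.

def refine_theory(theory, observations):
    """Tally how well a candidate programme theory accounts for the observed
    outcome pattern, and flag disconfirming cases for explicit follow-up."""
    supported = [o for o in observations
                 if o["outcome"] == theory["predicted_outcome"]]
    disconfirming = [o for o in observations
                     if o["outcome"] != theory["predicted_outcome"]]
    # Disconfirming cases are kept, not discarded: they may signal a
    # different mechanism firing in a different context (step 6).
    return dict(theory, support=len(supported), disconfirming=disconfirming)

candidate = {"mechanism": "building confidence",
             "context": "peer presentation task",
             "predicted_outcome": "performs well"}

observations = [
    {"context": "peer presentation task", "outcome": "performs well"},
    {"context": "peer presentation task", "outcome": "performs well"},
    {"context": "peer presentation task", "outcome": "disengages"},
]

refined = refine_theory(candidate, observations)
print(refined["support"], "supporting;",
      len(refined["disconfirming"]), "disconfirming")
```

In practice the ‘refinement’ is a matter of judgement and discussion within the team, but the shape of the loop – test a candidate theory against accumulating data, and treat exceptions as prompts for revision rather than noise – is the same.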
Realist review (also known as realist synthesis) is the secondary research equivalent to realist evaluation. A realist review is an interpretive theory-driven narrative summary which applies realist philosophy to the synthesis of findings from primary studies that have a bearing on a single research question. It uses interpretive cross-case comparison to understand and explain how and why observed outcomes have occurred in the studies included in a review. The working assumption behind realist review is that a particular intervention (or class of interventions) will trigger particular mechanisms somewhat differently in different contexts. In realism, it is mechanisms that trigger change rather than interventions themselves and thus realist reviews focus on ‘families of mechanisms’ rather than on ‘families of interventions’.12 An explanation and understanding of the interplay between context, mechanism and outcomes are then sought. The reviewer constructs one or more middle-range theories to account for the findings.
Typically, early exploratory reading identifies a number of potential or ‘candidate’ middle-range theories, each of which is then tested against the data (i.e. the studies included in the review) to see how well it is able to explain the pattern of findings. The realist reviewer moves iteratively between the analysis of particular examples, an emerging picture of the over-arching programme theory, and an exploratory search for further examples to test particular theories or sub-theories. For this reason, a realist review does not begin with a firm search strategy or protocol. The steps and techniques of realist review have been summarised elsewhere7,13,14 and our own team is currently leading an international collaboration to produce guidance and methodological standards for realist reviews and related approaches.15
The pursuit of rigour in realist research reflects principles usually seen in qualitative research, although it may draw on qualitative, quantitative or mixed methods. Much rests on achieving immersion (i.e. spending enough time in the study to really understand what is going on), collecting data meticulously and analysing them systematically, thinking reflexively about findings, developing theory iteratively as emerging data are analysed, seeking disconfirming cases and alternative explanations (see above), and defending one’s interpretations to researchers within and outside one’s own team.16
Where might realist research be used in medical education?
Although realist evaluation and realist review have been applied in other fields involving complex interventions in varying contexts,17–22 they have been little used in medical education. Figure 1 illustrates an example of how realist review was used to elucidate the issue of what works for whom in what circumstances in medical e-learning.4
However, not all research questions in medical education are suited to realist approaches. We suggest that this method is likely to have particular strengths in six circumstances:
1 when RCTs (or non-randomised comparative trials) of particular interventions have produced inconsistent estimates of efficacy and there is no consensus on when, how and with whom to use these interventions;
2 when an intervention is broadly accepted as appropriate and effective for a particular purpose, but educators feel that it could be optimised or targeted at particular subgroups;
3 when the existing research on a particular intervention consists mainly of disparate qualitative studies and ‘grey literature’ accounts (e.g. internal evaluations, PhD theses) that do not lend themselves to statistical synthesis but provide a rich source of qualitative data;
4 when new interventions are being trialled in order to identify how and for whom they are effective;
5 when changes are being introduced to the systems or structures that support educational interventions because these are likely to affect delivery (even if indirectly) which may, in turn, alter the pattern of the context, mechanism and outcome configurations generated, and
6 when routinely collected data on learner outcomes or programme impacts reveal unexplained changes in these patterns that require explanation.
Realist evaluation and realist review offer particular advantages for practice and policy. Education research has long sought to unpick, understand and explain why particular interventions help one group of students learn effectively, but are less effective with another group. However, until recently, appropriate methods of grappling with the complexity of educational interventions have not been available. Because realist approaches acknowledge and accommodate the messiness of real-world interventions, and because they ask different questions (not just ‘whether’ but ‘how’ and ‘for whom’), they can inform the tailoring of interventions and policy to particular purposes (such as for particular kinds of learning), particular target groups and particular sets of circumstances. This, in turn, has the potential to increase both effectiveness and efficiency, and perhaps to decrease unintended negative impacts of interventions. Realist approaches are not a panacea but represent one way to move the next generation of education research and evaluation to a position from which it can answer the next generation of questions.
Contributors: the initial draft of this paper was written by GWo and TG and revised for important intellectual content by GWe and RP. All authors contributed to the subsequent revision of successive drafts of this manuscript and their substantial expert contributions on conceptual issues have been invaluable in making the paper not only more precise, but also (we hope) more accessible. The final manuscript was approved by all authors.
Acknowledgements: we acknowledge the many academic colleagues, collaborators and students, who are too numerous to mention individually, who have over the years helped shape and refine our own understanding of the realist approach.