RCTs in education: yes, no or maybe?

Health professional practice is both a science and an art.  Students learn about the scientific method and how medical knowledge is advanced through experiment, research and evaluation.  Randomised controlled trials (RCTs) are a gold standard approach to gathering evidence as to whether treatments work and for whom.  Treatments here may mean drugs (or medicines), other forms of disease management such as diet and exercise, complementary therapies including acupuncture, and ‘talking’ approaches such as psychotherapy and cognitive behavioural therapy (CBT).   Looking back over the years, we can see how treatments have changed as knowledge advances.  

Health professional training also changes over time.  Some biomedically focussed educators advocate for RCTs of learning and teaching methods to explore whether one way is better than another. However, RCTs are fraught with difficulty in an educational setting, and their use is controversial. I am not a fan of RCTs to guide evidence-based education, though they do have proponents.

Consider a potential new drug for the treatment of high blood pressure (hypertension). It is developed in a laboratory based on knowledge of what regulates blood pressure and how it might be modified.  The medicine works in a different way to existing medication and has been tested in animal models as well as a small number of human volunteers: it seems to have good results in a phase I trial lasting a few months.  Subsequently, there is a phase II trial (lasting about 2 years) and in phase III the drug is ready for wider trials in more authentic settings – out in the real world, for up to 4 years. 

For an RCT we first need to establish the endpoint or outcome that we will be measuring.  For a hypertension treatment this would be lowering of blood pressure to within the range agreed by experts as acceptable. In addition, we would hope that this lowering would reduce the risk of the strokes and heart attacks (morbidity) that are associated with high blood pressure.  After many years we may be able to show a reduction in mortality – a treated person should live longer than if they had remained untreated or taken an older medicine.  And, ideally, these outcomes would occur without any serious adverse effects on other aspects of health. 

The RCT compares the new medicine with a placebo and/or one or more medicines that are already being prescribed and have outcome data.  For such a comparison to be valid, the two or more groups of volunteers in the trial must be comparable in age range and with respect to smoking, alcohol consumption etc.  Volunteers are assigned at random to a group and individuals do not know which medicine they receive – the trial is blinded.  Moreover, the researchers who are measuring outcomes do not know which medicine an individual is given – the trial is double-blinded.  This blinding is necessary to ensure that neither the research subjects nor the scientists are affected by knowing what treatment is being taken by whom.  Blood pressure is measured regularly, and the research subjects are asked to monitor for unexpected events, which may or may not be due to the medication, for example rashes, dizziness, headaches, sleep disturbances. If the drug shows therapeutic promise, it is manufactured, approved as a prescription medicine, and advertised to prescribers. (Though even then, once a medicine is taken by thousands of people, unexpected problems may be detected.)  

Now consider a new approach to learning and teaching.  How does it compare to a new medicine? There is unlikely to be the time, resources, or funding for several phases of a trial.

Let’s choose the early stage of health professional education: the sciences underpinning clinical practice.  The educational intervention should be based on knowledge of what is already known about learning – educational theory and evidence of effects in practice – but will be innovative in the context of educating students.  

What are the outcomes we should be measuring?  At a basic level we will be testing to see if the students taking the new course achieve the learning goals stated as being the aim of the course.  

In first year at university a learning goal (or outcome) might be: describe how blood pressure is regulated in the body and the effects of poor regulation including hypertension.  In my pre-clinical years in the 1970s this topic was taught in separate sciences: physiology, anatomy/histology and biochemistry, with lectures and laboratory work including students’ taking each other’s blood pressure.  Today it is more likely that the topic is learned in an integrated course called ‘the cardiovascular system’, with fewer lectures and early patient contact to show the clinical relevance of blood pressure, with students’ interacting with patients and taking their blood pressure.  Some institutions have also adopted a problem-based learning (PBL) approach with group-based activities.  How would we know which way is best at helping students meet the learning outcomes?  How could we investigate this?

Once we begin to consider an RCT in which, instead of drugs, we are comparing learning activities, we start to run into problems.   

Obviously, it would be impossible to do a double-blinded RCT, but we could perhaps run an RCT in which we randomised first-year students into two or three groups: one has the traditional course (so would be the control group), one has the new integrated course (intervention group I), and one has the PBL approach (intervention group II). Then we give them an assessment and see which group has learned most or has the higher marks.  Does the assessment need to reflect the type of learning activity? Should the assessment be the same for all groups and, if so, would it favour one group over another? 
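To make the idea of random allocation concrete, here is a minimal Python sketch. It assumes a hypothetical first-year cohort of 150 students identified only by number; the function name `allocate`, the arm labels and the seed are illustrative choices, not part of any real trial protocol.

```python
import random

def allocate(student_ids, arms, seed=0):
    """Shuffle the students, then deal them round-robin into the arms."""
    rng = random.Random(seed)      # fixed seed so the allocation can be audited
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    groups = {arm: [] for arm in arms}
    for i, sid in enumerate(shuffled):
        groups[arms[i % len(arms)]].append(sid)
    return groups

# Hypothetical cohort and the three arms discussed above
arms = ["traditional (control)", "integrated", "PBL"]
groups = allocate(range(1, 151), arms, seed=42)
for arm, members in groups.items():
    print(arm, len(members))
```

Dealing round-robin after a shuffle keeps the groups equal in size; it does nothing about blinding, which is exactly the part that cannot be reproduced in a classroom.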

Remember also that these students are highly intelligent, have done well at school to achieve a university place and are motivated to learn.  They will all learn something.  The numbers passing the assessment in each group may differ and there may be statistical significance but what practical significance does this have? What are the features of each approach that make the most difference? Would the outcomes be the same for a different topic? Context is important, as usual.
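The gap between statistical and practical significance can be illustrated with a small Python sketch. The marks below are invented for illustration (two groups of 300 students, mean marks of 72 and 70, standard deviation 8), and the function `compare_groups` is a hypothetical helper that uses a simple large-sample z-test and Cohen's d as one common measure of effect size.

```python
from math import sqrt
from statistics import NormalDist

def compare_groups(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (two-sided p-value, Cohen's d) from summary statistics."""
    diff = mean_a - mean_b
    # Large-sample z-test for a difference in means
    se = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Cohen's d: the difference expressed in pooled standard deviations
    pooled_sd = sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2)
                     / (n_a + n_b - 2))
    cohens_d = diff / pooled_sd
    return p_value, cohens_d

# Invented numbers: new course vs traditional course
p_value, cohens_d = compare_groups(72.0, 8.0, 300, 70.0, 8.0, 300)
print(f"p = {p_value:.4f}, Cohen's d = {cohens_d:.2f}")
```

With these made-up figures the difference is statistically significant at the conventional 0.05 level, yet it amounts to two marks, or a quarter of a standard deviation. Whether that justifies redesigning a course is a judgement the statistics cannot make for us.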

Moreover, an assessment is similar to looking at short-term reduction in blood pressure and not long-term improvement in health.  How can we demonstrate long-term change in how health professionals apply their education in the workplace and, ultimately, that patient care improves because of a university program?

Logistically this type of trial would be a nightmare as the institution would need extra resources, parallel timetables and probably additional teaching space. It would also be hard to keep the two/three groups separate if they are mainly located in the same buildings and socialise together.  Students are likely to discuss their experiences, a situation called contamination in trials, which implies that the control group become ‘contaminated’ by the intervention groups and any outcomes may not be solely due to the education each group is receiving.   

Another way would be to compare year groups of students at two or more different universities.  However, the cohorts might not be directly comparable in relation to class size, entrance requirements, the teaching faculty etc.  A third way is to compare the last cohort of students for one educational approach with the first cohort taking the new style a year later.  That might be feasible – and such studies have been done. However, students following the old curriculum may feel disgruntled to miss out on the brand-new shiny experience of integrated courses, while the new curriculum students could feel as if they are being experimented on and that new doesn’t necessarily mean better.  The Hawthorne effect can occur in any RCT: participants modify their behaviour because they are being studied. All these feelings can affect performance.  And just suppose the old curriculum turns out to be better: would the institution revert to it after all the time spent on renewal? 

When medical school curricula change, as they typically do about every ten years, no-one has carried out an RCT beforehand and made changes based on the findings.  Rather, the change is developed to meet a new set of learning outcomes in relation to changes in health professional knowledge and practice, based on new ideas in educational theory and delivery from a range of sources.  

How do we measure if one generation of health professionals is better than another? We must be clear about what a good health professional looks like AND be able to measure the ‘goodness’ AND compare how that goodness is demonstrated in practice.  Could we look at patient outcomes?  For example, do patients do better if they are treated by a doctor from a traditional medical programme compared to a newer approach?  But what does ‘do better’ mean? Fewer deaths? Improved quality of life? Greater patient satisfaction? All these outcomes are reasonable to look at, but can we really link specific elements of programs lasting from 3 to 7 years to patient outcomes more than five years later? 

What do you think of RCTs in education?

Further reading


The journal Educational Research has an open access issue devoted to RCTs in education research: 2018; 60(3). The editorial introducing the issue summarises the papers within it.  RCTs can be useful in certain contexts but there are many challenges. 


A 2011 critique of RCTs in medical education. 
