This is part of a series of blog posts detailing a discussion that I had with John Sweller in mid-2017. See all parts of this series on this page.
OL: Now, the next thing I want to ask you about is the work of Kapur, and Andrew alludes to this in his paper as well (as mentioned in the previous post). Some would argue that the results of Kapur suggest that in some cases it is actually better to enable people to engage in approaches that overburden working memory and lead to failure to achieve the intended outcome, if you then follow that up with a gap-filling kind of instruction. So where are you at in terms of that at the moment? (Hear Andrew Martin’s take on productive failure in this podcast)
JS: Okay, here’s my problem. There are ways of testing the productive failure hypothesis legitimately. Now I have to talk about experimental design here. We’re talking about randomized controlled trials. Most people in Education, when they run a randomized controlled trial, get the randomization done fine. The control part of it they tend not to get right. And the whole point of using a randomized controlled design is to determine causality, because you only alter one variable at a time. If you alter multiple variables simultaneously, you can’t determine causality and it’s a waste of time running that sort of experiment. So there are certain things you can’t do. For example, you can’t do something like: give one group of students, let’s say, a lecture, which involves guidance; and another group of students problem solving, which reduces guidance; and then test them and find out which one is better. And the reason you can’t do that is because you are altering multiple variables simultaneously and so can’t know why you got an effect. Let’s concentrate on the lecture to begin with. Assume it’s a really brilliant lecture, a magnificent lecture, very clear, students understand it. They are highly motivated. Is that going to be better than problem solving? Almost certainly the lecture is going to be better. If it’s a really poor lecture, students don’t understand what’s going on, it’s disorganized, they’re bored out of their minds, then solving a problem, no matter its deficiencies, is going to be better. You shouldn’t do those sorts of experiments.
There are experiments you can run, and they’ve been run in a different context without even attempting to test the productive failure hypothesis. The obvious one is to compare worked examples followed by problems to problems followed by worked examples. On the productive failure hypothesis, you’d be better off giving people problems first, followed by worked examples. Now you’ve got exactly the same conditions for both except for one variable: you either start with problems or you end with problems; you either start with worked examples or you end with worked examples. So, you compare worked examples followed by problems to problems followed by worked examples. If you do that, and you’re using novices who really need the worked examples, the results are uniform. Worked examples followed by problems is always better.
If you’re using somebody who no longer needs the worked example, you see the expertise reversal effect. As expertise goes up, the advantage of worked examples goes down, and as expertise continues to go up, eventually the relative effectiveness of worked examples and problems reverses and the problems are more helpful than worked examples. As you know, once you learn how to solve something, you need to automate it, you need to get practice, you need to be able to do it without thinking about it, and at that point you’re better off solving problems rather than studying worked examples.
If you get people who are sufficiently knowledgeable, yes, you’ll get a different result: they’re better off with problems first. But, come what may, you have to have experiments that only alter one variable at a time. And that variable has to be the one that’s of interest to you. Interestingly, in the old days, decades ago when computers first started appearing in education, people immediately asked the question: ‘Oh gee, is it better to give people a lecture or have them get computer-assisted instruction?’ That’s a nonsense question! You cannot test that out. A good lecture is much better than bad computer-assisted instruction, and vice versa. As it happens, you can test out the productive failure hypothesis, but when you test it out properly, only altering one variable in your experiments at a time, you get different results to some of the ones reported. That’s my problem with productive failure.
All posts in this series:
- Worked Examples – What’s the role of students recording their thinking?
- Can we teach problem solving?
- What’s the difference between the goal-free effect and minimally guided instruction?
- Biologically primary and biologically secondary knowledge
- Motivation, what’s CLT got to do with it?
- Productive Failure – Kapur (What does Sweller think about it?)
- How do we measure cognitive load?
- Can we teach collaboration?
- CLT – misconceptions and future directions