Personal Prologue: Starting to Do Classroom Research

Today, anyone seeking to develop an academic career in educational research has a formidable number of obstacles to overcome. As a minimum, candidates will generally be expected to have a good honours degree, solid teaching experience, preferably in the maintained sector, and at least a master’s degree with the promise of completing a doctorate in the not too distant future. A journal publication would be an additional advantage. I simply had three strokes of good fortune.

My First Piece of Luck: Attendance at a Boarding School

I began my academic career as a chemist taking a science degree at what was to become the University of Newcastle upon Tyne, although at the time it was an adjunct of the University of Durham. I stayed on after graduating and began a doctorate exploring the capacity of solutions to conduct electricity at high temperatures. Two years into this investigation, with no consistent results to show for it, the apparatus which had taken a year to build disintegrated. I hadn’t the heart to start all over again and with a pregnant wife to support asked the Professor what I should do. ‘Try teaching’, he replied. ‘You might be better at that’.

With no postgraduate certificate in education, the maintained sector offered limited possibilities. The Times Educational Supplement in that week offered two openings for chemists in the private, independent sector at Truro Cathedral School in Cornwall and St. Paul’s School in London. I applied for both, got asked for interviews and arranged to go to Truro via London.

At St. Paul’s School, the office of the headmaster (or High Master, as he was called) was an enormous room. It had been used by General Bernard Montgomery, a former pupil, as his headquarters when planning the 1944 D-Day invasion. After the usual enquiries about my background, the following exchange occurred:

  • High Master: You went to a boys’ boarding school then?

  • Me: Yes sir.

  • High Master: Was there much err... er... er.... ering [stuttering] between boys?

  • Me: Some

  • High Master: What would you do if you found two boys er... er... er... ering together?

To this day I cannot remember what I said in reply. All I know is from that point, the High Master stopped prefixing his remarks with ‘If you come to St. Paul’s’ and instead replaced the ‘if’ with ‘when’.

I cancelled my trip to Truro. I had entered the teaching profession.

A Second Piece of Luck: My Head of Department Had a First-Class Cambridge Degree

St. Paul’s was a good place to learn to teach. There were few discipline problems, and I gradually gained a reputation for getting the less able pupils through their A levels. At the time there was increasing national interest in the ‘swing away from science’ among undergraduates and a leading chemist at Leeds University, Professor, later Lord Dainton, was asked to chair a government committee to investigate the problem. Professor Dainton had gained first-class honours in the science tripos at Cambridge. The only other person from his college to obtain the same degree was Head of the Science Department at St. Paul’s. Professor Dainton became convinced, as his enquiry progressed, that the drop-off of students in the first year of the degree course at Leeds was due to inadequate teaching by lecturers who were only interested in their research. He decided to appoint someone with a reputation as an effective practitioner to teach the first year undergraduate course. As Professor Dainton had little knowledge of schools, he sought advice from the person who had gained a first-class honours degree with him at Cambridge, whom he knew from a previous college reunion taught at St. Paul’s. He wrote to my head of department and the head recommended me. I was the only person interviewed.

The Third Piece of Luck: Learning to Write Computer Programmes

At the beginning of the 1970s, I gave up my career as a chemist at Leeds University to join the School of Education at the University of Leicester. I applied for the post of senior researcher on a project directed by Professor Jack Kerr and Jim (Biology) Eggleston, later professor at Nottingham University. The subject was often appended to the latter’s name to distinguish him from Professor John (Woodwork) Eggleston, who also became Professor of Education at the University of Warwick, the two subjects indicating the areas which they had taught while schoolmasters. Professor Kerr had recently been awarded a grant by the then Schools Council to evaluate the new science curriculum which was sponsored by the Nuffield Foundation. Nuffield Science, as it became known, was the first attempt in the UK to introduce constructivist ideas in the teaching of science, following on from American initiatives such as the Harvard Physics Project and the CHEM Study Curriculum. Subsequently, when I became a Professor at Leicester, I was able to access my application form. Two things seemed to count particularly in my favour because they had been underlined several times and starred. First, it was thought that the 5 years spent working at a university chemistry department would impress heads of science in the schools, and, second, I was able to write computer programmes, a skill which was at the time not available in the Leicester School of Education. Professor Kerr was quick to see the potential value of this addition to the School’s expertise in that it created the capacity to improve the range of quantitative methods currently in use. Thus, a mediocre talent in the context of a University Chemistry Department, where everyone was computer literate, was viewed as a highly valued attribute in education.

On such chance occurrences therefore whole careers are sometimes fashioned.

The Beginnings of Systematic Observation in the UK

However, because I had been out of school teaching for 5 years, Professor Kerr suggested that I should familiarise myself with the Nuffield scheme by visiting a number of local schools which had adopted the new curriculum. On one of these visits, I met a physics teacher who claimed to ‘out Nuffield, Nuffield’. In a particular instance, the work scheme provided a very elegant way of establishing Ohm’s law. This states that the current generated is equal to the voltage divided by the resistance in the circuit. In the Nuffield scheme, pupils first experimented with water flowing down a series of pipes. The pressure of the water (varied by altering the height of the filter funnel into which the water was poured) was the equivalent of the voltage. Various constrictions in the pipe down which the water flowed were equivalent to the resistance and the rate of flow to the current. By varying the pressure and measuring the flow, the students were able to gain some insight into the relationship with the width of the tubes down which the water was flowing. In turn they could then make use of this water analogy to predict how the current might vary when a series of resistances were introduced into an electrical circuit, powered by batteries. The lesson I observed was the one in which the students were to test their predictions.
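The kind of prediction the pupils were asked to make can be sketched in a few lines of Python. This is only an illustration of Ohm's law for resistances connected in series; the function name, the battery voltage and the resistance values are hypothetical and are not taken from the Nuffield materials.

```python
def series_current(voltage, resistances):
    """Current (amps) through resistances connected in series, by Ohm's law.

    In series the resistances add, so I = V / (R1 + R2 + ...).
    """
    total_resistance = sum(resistances)
    if total_resistance <= 0:
        raise ValueError("total resistance must be positive")
    return voltage / total_resistance


# Hypothetical values: a 6 V battery with one, two and three 10-ohm resistances.
for count in (1, 2, 3):
    current = series_current(6.0, [10.0] * count)
    print(f"{count} resistance(s): {current:.2f} A")
```

Each added resistance lowers the predicted current, just as each added constriction slows the flow in the water analogy.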

A life-changing moment occurred at the start of the lesson. The teacher began by telling the pupils that in this lesson they would be testing their predictions, which they had written in their books for homework and which he had collected. He then went on to describe the apparatus and how they should set it up, but concluded with the words:

You will need your books in order to check your predictions. You will see that I have marked them right or wrong.

In their design of the study, Professor Kerr and Jim Eggleston had intended to distinguish between Nuffield and non-Nuffield teachers using a questionnaire which would ask them about their educational philosophy in the teaching of science, their use of Nuffield texts and whether they had carried out what were seen as certain key experiments, such as the one described. On the basis of the respondents’ answers, teachers would then be divided into two groups, those who were pro-Nuffield and those who were less inclined. Classes would then be given pretests and post-tests using specially constructed measures, some of which favoured traditional ways of studying science and some of which supported the more problem-based Nuffield approach. Under this design it was clear that the teacher I had observed would have found himself in the pro-Nuffield category. However, the fact that he had marked the pupils’ predictions meant that in practice his lesson was very similar to one in a more traditional style, where the teacher supplied the answers and the pupils, by carrying out the experiment, were tasked with checking that the teacher had presented the new knowledge correctly. At that time standard practice, used by nearly all chemistry teachers, myself included, was on occasions to substitute our own set of results where those of the pupils didn’t corroborate the textbook theory because of experimental error.

I returned to the School of Education convinced that we could not rely on the teachers’ self-report to decide whether a teacher was pro-Nuffield or not. It was clear that only by sitting in classrooms and watching the action would we be able to tell who was implementing the Nuffield scheme faithfully. The rest is history. Jim Eggleston and I together with the other researcher, Margaret Jones, sat down and devised the first UK-based systematic observation system which we called the Science Teacher Observation Schedule (STOS). Its purpose was to distinguish between the teachers’ use of different types of questions, statements and directions. Thus, there were questions of fact, of closed problem-solving, of open-ended problem-solving and of inference, hypothesis, experimental design and categories of statements and directions which paralleled these. Nuffield teachers were those that were high on the hypothesis, inference and experimental design categories and low on closed questions and factual statements. In the analysis only a small proportion of teachers (19%) were found to have adopted the Nuffield approach to any great extent (Eggleston et al. 1976).

Early Systematic Studies of Classrooms

Although relatively little used by UK researchers at the time of STOS, the recording of specific categories of behaviour to describe classroom practice had been a regular feature of research in the USA since the 1930s. Barr (1935) was the first to introduce time sampling where various designated behaviours were recorded at fixed intervals.

The early developments of this strand of educational enquiry are described by Medley and Mitzel (1963) in the first Handbook of Research on Teaching which was produced by the American Educational Research Association (AERA). Some of the early workers in the field had arrived in the USA during the 1930s as refugees from totalitarian regimes in European countries. They contrasted the rigid and authoritarian methods of teaching the young to be model citizens in countries such as Germany and Italy with what they saw as the ‘democratic’ methods underpinning American society where in schools pupils were allowed to express opinions and exercise a degree of choice in their work.

This perspective informed the classic study of kindergarten children’s patterns of aggressive behaviour by Lewin et al. (1939) which sought to determine whether certain teaching methods fostered undesirable behaviours such as scapegoating and bullying on the part of ‘dominant’ groups and apathetic submissiveness to authoritarian domination by the persecuted. Patterns of teaching were rotated between authoritarian, democratic and laissez-faire. In authoritarian teaching all determinations of policy and the steps in the activity were dictated by the playgroup leader who directed personal criticism or praise at individual pupils but remained neutral to the group as a whole.

In the democratic situation, all policies were a matter of group discussion, and joint decision-making was encouraged among the children. Children were free to work with whoever they chose, and the playgroup leader endeavoured to develop a friendly, warm atmosphere within the group. In the laissez-faire approach, children were given complete freedom to decide what they wished to do without any participation by the playgroup leader. The adult supplied materials but made it clear that she/he would only take part in the discussions when asked.

Two experiments were carried out. In one experiment the democratic approach was used for 6 days, there was then a transition day, and this was replaced by authoritarianism for a further 6 days followed by another transition day when the approach switched back to democracy. As a control a comparable group of children started with autocracy and then moved to democracy and then back to autocracy. In the second experiment with two other groups of children, the approach was changed from laissez-faire to autocracy and then to democracy and in the control from authoritarianism to laissez-faire and then to democracy. The outcome measure used to determine the success of different teaching methods was the number of aggressive actions recorded per meeting of the playgroup. The researchers found that there was an average of 38 such actions per meeting during laissez-faire, 30 during authoritarian lessons but only 20 when the democratic approach was used. More importantly they found that whenever they moved out of an authoritarian mode during the transition day, the number of aggressive behaviours increased markedly, and the tendency for certain children to be singled out by the rest of the group as scapegoats for failure also increased. The implication of this study was that the approaches adopted by teachers in countries such as the UK and the USA accounted in part for their relative political stability in contrast to Central European states, such as Italy, Spain, Germany and Russia, where the regimes were characterised by their extreme violent actions and the persecution of minority groups.

This study undoubtedly influenced Anderson’s (1939) definition of direct teaching, which he recorded whenever a teacher was ‘telling things to pupils’ in contrast to indirect teaching where pupils were ‘asked things’. This distinction was based upon Anderson’s view that direct teaching was usually accompanied by other dominative techniques such as the use of force, commands, threats, shame and blame which he claimed ‘obstructed [children’s] natural growth processes’, whereas indirect teaching, which acknowledged differences in individual behaviour and was an expression of the democratic process, was ‘consistent with growth and learning’. Based on these assertions, Anderson constructed an observation system in which eight categories recorded dominative contacts with the teacher, whereas ten were said to represent integrative forms. Dominative categories included direct refusals, warnings, threats and lecturing, whereas integrative behaviours involved inviting rather than telling children to take part in an activity, participating in that activity and expressions of sympathy, as, for example, giving permission to leave the room in order to get a drink. Three kindergarten teachers were observed. What was striking about the data was that across different sessions there seemed to be a constant ratio for each teacher between the number of dominative and integrative contacts. For two of the teachers, the ratio lay between 2.1 and 2.3, whereas for the third teacher, who was observed for three sessions, the ratio was 5.0, 4.9 and 4.8. This finding, although the result of only a small-scale study, gave a boost to the claim that there was a ‘scientific’ basis for the study of teaching, as it appeared to have established a basic rule concerning teaching behaviour, namely, that the ratio between dominative and integrative activity for any particular teacher was constant.

The next 30 years were to be dominated by studies of ‘direct’ versus ‘indirect’ teaching, principally through the use of Flanders’ (1970) Interaction Analysis Category (FIAC) system. This system had ten categories. The first three were recorded whenever teachers accepted and were sensitive to pupils’ feelings, praised or encouraged, or accepted and used pupils’ ideas. The next four categories noted when a teacher asked a question requiring answers which conformed to the teacher’s ideas (what we call a closed question), lectured, gave directions, or criticised and justified his/her authority. Categories 8 and 9 distinguished between pupils responding and initiating an exchange with the teacher, while the final category, recorded as silence, was mainly used when the observed behaviour could not be coded. The totals for columns 1–4 were summed and divided by the sum of columns 5, 6 and 7 to give the indirect/direct (ID) ratio.
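As a rough illustration of the arithmetic, the ratio can be computed from a sequence of recorded category codes. The sketch below assumes the codes are simply the integers 1 to 10 in the order described above; the function name and the sample sequence are hypothetical and are not taken from Flanders' own materials.

```python
from collections import Counter

INDIRECT_CATEGORIES = (1, 2, 3, 4)  # accepting feelings, praise, using ideas, questions
DIRECT_CATEGORIES = (5, 6, 7)       # lecturing, giving directions, criticising


def id_ratio(codes):
    """Indirect/direct ratio: tallies for categories 1-4 divided by those for 5-7."""
    tallies = Counter(codes)
    indirect = sum(tallies[c] for c in INDIRECT_CATEGORIES)
    direct = sum(tallies[c] for c in DIRECT_CATEGORIES)
    if direct == 0:
        raise ValueError("no direct-teaching tallies recorded; ratio is undefined")
    return indirect / direct


# Hypothetical record of one short observation session (one code per interval).
observed = [5, 5, 4, 8, 3, 5, 6, 8, 2, 4, 9, 5, 7, 10]
print(f"ID ratio: {id_ratio(observed):.2f}")
```

Categories 8, 9 and 10 (pupil talk and silence) are tallied but play no part in the ratio, which is why a session dominated by pupil talk can still yield a low ID ratio.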

Flanders went on to do a number of process-product studies in which pupil performance on tests of attainment and also aspects of their attitude and motivation were measured. The majority of these studies were naturalistic in that teachers were observed, their ID ratios recorded and then an arbitrary division was made into those that were indirect or direct. This was usually done by ranking the teachers in order of their ID ratios and then dividing them into three groups. Some studies, however, were true experiments in that teachers were randomly assigned to two groups, deliberately encouraged to be either direct or indirect, and the performance of pupils then subsequently assessed. The results of this research were reviewed in great detail by Dunkin and Biddle (1974). Of the naturalistic studies, some 14 showed that teacher indirectness was unrelated to pupil achievement, whereas ten others showed some positive differences. However, in the five experimental studies examined, none showed a positive relationship with indirectness, although in one study, it was related to improved pupil attitude. To use a sporting analogy, there were some wins, lots of draws but no losses. Soar and Soar’s (1972) explanation of these findings was that the relationship between direct teaching and pupil performance was curvilinear so that the maximum effects tended not to occur at the extremes but when a mixture of the alternative approaches was used. Nevertheless, when Simon and Boyer (1970) published their anthology of classroom observation instruments consisting of around 200 entries, nearly 90% of these were derivatives of FIAC.

Process-Product Research and the Use of Direct Instruction

The 1960s saw a shift away from the view of intelligence as a fixed trait, a view which implied that while teaching could improve the performance of all pupils, it could not close the gap between the able and less able. New theories developed by writers such as Carroll (1963) offered more optimistic notions of school learning in arguing that any pupil, in principle, could be taught anything if allowed sufficient time. Carroll’s notion of mastery was expressed in the proposition that the degree of learning was directly proportional to the ratio of the time actually spent by a pupil on a task to the time needed by the pupil to master the demands of that task.
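Carroll's proposition can be made concrete with a small calculation. The sketch below is an illustration only: the function name is hypothetical, the constant of proportionality is taken to be 1, and the ratio is capped at 1.0 on the assumption that time spent beyond what is needed adds nothing further to mastery. Neither assumption is spelled out in the account above.

```python
def degree_of_learning(time_spent, time_needed):
    """Degree of learning as the ratio of time spent to time needed.

    Assumptions (for illustration): proportionality constant of 1 and a cap
    at 1.0, which stands for full mastery of the task.
    """
    if time_needed <= 0:
        raise ValueError("time_needed must be positive")
    if time_spent < 0:
        raise ValueError("time_spent cannot be negative")
    return min(time_spent / time_needed, 1.0)


# Hypothetical pupils given the same 30 minutes on a task (times in minutes).
for needed in (20, 30, 60):
    print(f"needs {needed} min: degree of learning {degree_of_learning(30, needed):.2f}")
```

On this reading, differences between pupils show up as differences in the time needed rather than as fixed ceilings on attainment, which is the optimistic point of the model.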

Taking Carroll’s ideas forward, Harnischfeger and Wiley (1978) produced a model of learning which mediated the influence of time as a key variable. These researchers distinguished between the allocated and instructional teaching time. The difference, the evaporated time, was mostly taken up by changeovers between periods and interruptions during lessons (handing out books, collecting homework, dealing with pupils’ misbehaviour, etc.).

A key task within this approach, therefore, is to identify those factors which maximise pupils’ time on task and hence their achievement. ‘Time on task’, that is engaged time, thus became a critical output measure of effective teaching. Among the characteristics identified in those classrooms with the highest levels of on-task behaviour were the following:

  • Teachers were accurate in their diagnosis of pupil performance levels.

  • Teachers were able to set appropriate tasks such that they matched the children’s learning needs.

  • High levels of pupil-teacher interaction took place concerning the presentation of information on academic content, monitoring work and giving feedback about performance. Such interactions usually took place in a group or class setting and were not characterised by individual work.

  • Teachers spent more time discussing the structure of the lesson.

  • Teachers gave satisfactory responses to students’ requests.

  • Teachers’ value systems emphasised academic goals.

  • Teachers encouraged students to take responsibility in helping each other and sharing materials.

These characteristics were referred to as direct instruction by Rosenshine (1979). The term direct instruction was deliberately chosen to distinguish the process from Anderson and later Flanders’ use of direct teaching as discussed in the previous section. For Anderson, in particular, direct teaching involved a teacher telling pupils things in an authoritarian manner. Rosenshine and others saw no reason why teachers could not instruct pupils within a warm friendly unthreatening classroom climate. The basic steps in direct instruction are that pupils are first presented with the new information, then allowed practice, then assessed either through testing or questioning and then retaught those parts which they have failed to master.

The results of a large number of studies of this period are summarised by Brophy and Good (1986). Few of those advocating the use of direct instruction failed to acknowledge the limitations expressed by these authors, particularly that the outcome measures used in this process-product research consisted almost entirely of standardised tests of factual knowledge rather than tests of more complex learning activities. Thus, Rosenshine (1987) concludes that the findings on direct instruction

are most relevant when the objective is to teach procedures, explicit concepts or a body of knowledge. Specifically, these results are most applicable when teaching mathematics concepts and procedures, English grammar, sight vocabulary, historical knowledge, reading maps and charts and science knowledge and procedures. These findings are less relevant when teaching areas where the skills to be taught cannot be broken down into explicit steps. Such areas include mathematics problem solving, analysis of literature, writing papers or discussion of social issues. (Rosenshine 1987, p.258)

Systematic Studies of UK Classrooms in the 1970s and 1980s

Throughout the 1970s and 1980s, a series of studies based upon systematic classroom observation were carried out in British classrooms. The first of these was Neville Bennett’s (1976) study, Teaching Styles and Pupil Progress. In this study teaching styles were identified through the use of a questionnaire, and teachers were categorised as being either formal or informal. However, in attempts to explain differences between the relative successes of the different styles, pupils were observed and the amount of work-related and social interaction between pupils noted. In the formal (class taught in silence, regular testing, competition encouraged) classrooms, the work rate was around 70% at a maximum, whereas in the informal classroom (integrated subject teaching, free movement, choice of where to sit), it never exceeded 60%. A more detailed study was then carried out between 1975 and 1980 at Leicester and was called the Observational Research and Classroom Learning Evaluation (ORACLE) research. Thirty-seven years later, Inside the Primary Classroom, the first of five volumes resulting from the ORACLE research (Galton et al. 1980), is still the most frequently cited piece of research concerning primary classrooms in the UK. The findings have been replicated in a number of other studies, including Curriculum Provision in the Small Primary School (Prisms Project) by Galton and Patrick (1990). This was similar to ORACLE but carried out in small rural schools (with fewer than 100 pupils on roll) across 14 local authorities. Although the aim of the study was to ascertain how far the curriculum matched that taught in larger schools, data was also collected on patterns of teacher and pupil behaviour within the classroom using the same observation system as that used in the ORACLE research. Also during the 1980s, two studies were carried out in London, one in junior schools (Mortimore et al. 1988) and the other in the infant classes of primary schools (Tizard et al. 1988). The Mortimore study used the same observation instrument as in ORACLE. During the late 1980s and early 1990s, a further study was carried out in schools in Leeds by Robin Alexander et al. (1989). These various studies showed a gradual decrease in the proportion of time teachers were interacting with individual children in favour of whole class activity. Whereas in ORACLE 72% of time was spent with individual students, by the 1990s this figure had fallen to 50% (Pollard et al. 1995). What was equally significant, however, was the consistent pattern of teachers’ interactions associated with the different forms of classroom organisation and the behaviour of pupils in response to these different instructional strategies. ORACLE studied this relationship in great detail.

The ORACLE research identified six main types of instructional strategy. The first of these, named group instructors, as the name implies, spent around 20% of the time with groups of pupils. When with groups they concentrated on giving pupils instructions and routine information rather than engaging in discussion of ideas. But they gave pupils plenty of verbal feedback and asked a number of open questions. The second group, class enquirers, taught the whole class for 31% of the time. These teachers were highly organised, clear and lucid when explaining tasks and devoted much of the class time to asking questions and making statements, including statements of ideas.

A third group were called individual monitors who worked mainly one to one with children, using lower levels of group and class teaching than the other teaching approaches. They spent much time telling children what to do rather than discussing ideas and talked less than other teachers, often tending to sit at their desk silently marking children’s work and then having the pupil come out so that they could hand back the work and discuss errors.

The fourth group were ‘super teachers’ called infrequent changers who interacted with individual children for around 90% of the lesson. When they did change their approach from individual to whole class teaching, this was carefully planned with a clear purpose in mind. In this they differed from another group of habitual changers who made impromptu, seemingly unplanned switches in organisation often when pupils displayed signs of inattention or poor behaviour. Infrequent changers asked the most cognitively challenging questions and encouraged pupils to show a degree of independence in choice of tasks and ways of tackling them. The final group of rotating changers moved pupils around different curriculum tables or had children stay in their place but change to another subject so that in each case the teacher was coping with four or five different disciplines at one and the same time. Both habitual and rotating changers had the lowest levels of task-related interactions.

There were corresponding patterns of pupil behaviour, which fell into four categories. The first consisted of intermittent workers. These pupils tended to work when they were the focus of the teacher’s attention, but at other times when the teacher was involved elsewhere, the children engaged in conversation which rarely related to the work in hand. Intermittent workers seemed therefore to take advantage of the opportunities to talk when the teacher was engaged elsewhere.

The next group were known as solitary workers. Such pupils tended to receive very little individual attention from the teacher but were usually part of the teacher’s audience when they were addressing the whole class. They tended to listen and watch while other pupils were the focus of the teacher’s attention. Although solitary workers were not active participants in class discussions, they were a considerable asset in that they could be trusted to maintain high work rates irrespective of the presence or absence of the teacher in their vicinity.

The third major group were known as attention seekers. These included not only pupils who would seek out the teacher to enquire whether they had performed the task correctly, often asking very trivial questions such as ‘I’ve done number one now Miss, shall I go on to number 2?’, but also other pupils who did not so much seek the teacher’s attention as were the objects of it, usually as a result of a misdemeanour. The observers described such pupils as being very adept at finding things to do in other parts of the classroom that allowed them to avoid getting on with their own tasks. They would often, for example, appear to be intensely interested in other pupils’ work, and sometimes if required to queue at the teacher’s desk, they would step backwards when another child joined the queue, allowing that child to go in front and thereby delaying their own encounter with the teacher. The final group of pupils were known as quiet collaborators. These pupils differed from the solitary workers in that they were often working in groups but within such groups were reluctant to take part in conversations with other pupils. When they did collaborate, it generally involved the sharing of material.

There were direct relationships between the approach adopted by the teacher and the pupils’ behaviour. Nearly 50% of pupils taught by individual monitors were in the group who worked intermittently, while at the other extreme, the classes of class enquirers contained only 9% of this type of pupil. Class-directed activities, therefore, succeeded in cutting down the amount of distraction. However, although class enquirers had the highest numbers of solitary workers, a distinction needed to be made between those who were hard grinders, working conscientiously and consistently, and easy riders, who found ways of slowing down their work rate by doing legitimate routine tasks without attracting the teacher’s attention. These pupils would spend an excessive amount of time sharpening a pencil or washing out a paint pot and in extreme cases, when queuing to see the teacher, would let other children go in front of them to delay the point at which they reached the teacher’s desk.

Group instructors had the greatest proportion of quiet collaborators and the lowest number of attention seekers, whereas the reverse was true of infrequent changers. The two other teaching approaches, habitual and rotating changers, had little to recommend them in that they had the lowest levels of time on task with the highest proportion of intermittent workers and easy riders between them. These patterns were reflected in the correlations between teaching approach and academic performance where class enquirers and infrequent changers did best on standardised tests of attainment and also tests of study skills based on measures such as following instructions, formulating questions and demonstrating originality (Galton and Simon 1980).

Two decades later, the ORACLE study was replicated, mostly in the same schools as those participating in the original study, and this research produced almost identical findings (Galton et al. 1999). The observation system, with slight modifications, has been used to study the reduction in class size in Hong Kong primary schools (Galton and Pell 2012) with similar patterns of teacher and pupil behaviour despite the cultural differences. Parts of the system dealing with collaborative learning have been extended and used by Galton and Williamson (1992) and by Hargreaves and Galton (2002) to study group work in upper primary and lower secondary classrooms. Mention has already been made of the observational studies of Tizard et al. (1988) in the early years of primary education. This has influenced the later work of Peter Blatchford who was a member of the original Tizard team, in, for example, his research on class size (Blatchford 2003). Frank Hardman and colleagues (Smith and Hardman 2003) have made important technical advances by making the process of recording and tallying the observed behaviours instantaneous. Both Blatchford and Hardman have contributed to this volume and provide their own perspectives on the value of systematic observation for studying classroom interactions.

Objections to the Use of Systematic Observation to Study Classrooms

Systematic observation has not been without its critics, mainly in the 1970s and 1980s during the so-called paradigm wars (Gage 1989). Hamilton and Delamont (1974) criticised the early schedules such as FIAC on the grounds that their applicability was limited to situations where teachers stood at the front of the classroom and pupils sat in pairs at single rows of desks. Writers such as Mehan (1979) espoused an ‘interpretative’ viewpoint and objected to the use of ‘positivist’ approaches in general, on the grounds that the use of numbers alone failed to capture the complexities of classroom life or to interpret the different cultural nuances associated with individual and group behaviour. Scarth and Hammersley (1986), by contrast, directed their criticisms at ORACLE in particular. These two authors argued that the distinctions between statements and questions are not clear cut and that the use of subcategories such as open and closed questions and statements of fact and directions is even more problematic and leads to ‘high-inference’, unreliable judgements. A more general criticism of systematic observation was that it sampled a very small proportion of teacher and pupil behaviour, but nevertheless used such limited results to draw conclusions about whole populations: in the case of ORACLE, all primary teachers in England. ORACLE, for example, sampled a mere 58 classrooms in 19 different ‘all through’ primary, ‘junior’ primary or ‘primary-middle’ schools spread over three local authorities. Each class was visited for 3 days each term over 2 years, and six observation sessions, each lasting approximately 55 minutes, were undertaken. Nineteen of these 55 minutes were spent observing the teacher and 36 minutes observing a sample of eight pupils.

In answering these specific points, Croll and Galton (1986) point to the key role that replication plays in establishing the reliability and validity of the conclusions drawn from systematic observational data: if patterns emerging from one set of observations are to be regarded as unrepresentative of the population as a whole, then this must be equally true of a second set. The chance of two untypical samples producing the same patterns and relationships would therefore be extremely small. On the issue of questioning, the observations were low inference because observers judged the nature of the question not on what they gauged to be the teacher’s intention but on the pupil’s response and the teacher’s subsequent reaction. Thus, if the teacher asked the pupil, ‘What do you think?’, but the pupil replied, ‘The solution has turned blue’, the observer would only classify this as an open, challenging question if the teacher then said something such as ‘Why blue?’; otherwise, if the first answer was accepted, the closed question category would have been ticked. ORACLE therefore measured successful open questions, not intended ones, which explains why the category was highly correlated with attainment, unlike in some other studies (Dillon 1981). In retrospect, history has tended to support ORACLE’s conclusions. Two decades later Galton et al. (1999), in their replication of the original, found that although the number of questions asked by teachers had increased – the result of government pressure to engage in more ‘interactive teaching’ – the proportions of closed to open questions had remained constant at around 80% of the former and 20% of the latter. Moreover, when observations were made, not in situ, but from videotapes of lessons and with a different observation system (Hardman et al. 2003), similar patterns of teacher-pupil interactions emerged.

Life in Classrooms Today

The above findings, together with those of Hargreaves et al. (2003), which produced similar questioning patterns, point to an unresolved issue, namely, why is it so difficult to bring about changes in the way that teachers engage in the moment-to-moment exchanges with their pupils as identified in systematic observation studies? While there have been changes in the nature of classroom organisation (use of whole class teaching, more group work, etc.), the use of certain kinds of statements, questions and feedback has remained remarkably stable over four decades of classroom observational research. This is not just a British or American phenomenon. More interesting have been the efforts of advanced societies around the Pacific Rim and East Asia to cut down on the amount of direct instruction in favour of more active pupil participation. Thus, in Singapore teachers have been advised to teach less so pupils learn more (Gopinathan 2010), while in Hong Kong, the emphasis on Learning to Learn was introduced by the Curriculum Development Council at the beginning of the millennium (CDC 2001). Yet in both countries, despite large investments in professional training, research has shown that teacher instructional talk still dominates (Hogan and Gopinathan 2008; Hogan et al. 2013; Galton and Pell 2012). Cuban’s (1984) assertion that ‘teaching is a very conservative profession’ would appear amply justified not only in the USA but worldwide.

Although ORACLE has therefore clearly had an impact among researchers and those responsible for teachers’ professional development, it is more difficult to estimate the contribution that the various studies and other systematic observational approaches have made to the advancement of teaching. For teachers taking certificated professional development courses, it can provide a stimulus with which to explore their own classroom practice. But the evidence of the previous paragraph would suggest that once these practitioners return to their classrooms, the forces of reaction prove too formidable for them to maintain their resolution to change the way they teach. In the West, the strong elements of accountability, coupled with the sanctions that can result from failure, have presented a stark choice for many teachers: either they teach to the tests or they leave the profession. In the UK, for example, over 50,000 teachers resigned or took early retirement in 2014–2015. New recruits are failing to make up these losses. In English and mathematics, the numbers are 12% and 11%, respectively, below targets. In design the figure is a staggering 57% (see Footnote 2). There are many reasons for this situation, but some have to do with the fact that teachers no longer feel in control and cannot expand on their subject knowledge, teach creatively or find time to cater for pupils’ social and emotional wellbeing. Indeed one UK education minister has characterised these latter aspects of children’s development as ‘peripherals’ which get in the way of the ‘essentials’ of learning. All mention of wellbeing has consequently been removed from the inspection framework by which schools are now judged.

In the Asia Pacific, matters are less clear cut. Among the politicians and administrators, there is a recognition that a model of twenty-first century learning is required that can produce ‘flexible, self-regulating learners with the skills to meet the demands of an ever-changing labour market’ (Stobart 2014). In Hong Kong, for example, parents still set great store by traditional methods, because the end-of-primary-school examination still determines the band (and status) of secondary school attended, while the secondary leaving diploma is the passport to university. Nevertheless, the Education Department has made serious attempts to widen the curriculum and introduce new teaching methods, based on the observation categories highlighted in ORACLE. In some schools, for example, a 20%, 60%, 20% rule operates. For homework, pupils have to use their laptops and iPads prior to the lesson to inform themselves of the subject matter to be taught, so that teacher instructional time takes up a maximum of 20% of the teaching period. The bulk of the lesson (60%) is given over to extended classroom dialogue, group and pair work, with the remaining 20% taken up with subsequent reflection and evaluation.

Despite, therefore, nearly a century of research into teaching making use of both quantitative and qualitative observation studies (and increasingly combinations of both), we still have no universally agreed, sound practical model of how teachers learn to teach better, based on current theories of developing expertise such as those proposed by Berliner (2002). While there are promising approaches to school-based professional development, based on notions of ‘learning communities’ as discussed by Chris Watkins (2005) in this volume, we still need to align these improvements in teachers’ professional knowledge with what we have learned over the years concerning the principles that govern human behaviour within the classroom. Desforges (2003: 15–16) offers a similar perspective but observes that since much of this professional knowledge is ‘generated behind the closed doors of an individual teacher’s classroom it is rarely written down and consequently it is difficult to articulate’. He contends that:

Schools could be even more successful than they are now in promoting achievement if we could all learn to share and use the knowledge we have now about learning. I recognise that there is a vast body of knowledge about learning evident in the everyday practices of teachers. This knowledge is difficult to get at and so it is difficult to share. There is also a small but strong body of scientific knowledge about learning to be gleaned from research. This knowledge is easy to get at but difficult to apply. The trick we need to perform is to bring the practical knowledge and the theoretical knowledge together to promote advanced teaching practices.

This remains our greatest challenge.