1 Introduction

Student achievement in mathematics continues to be the center of attention in many countries, sparked by results from international comparisons such as the Trends in International Mathematics and Science Study (TIMSS) and the Program for International Student Assessment (PISA). Cross-national comparative studies in mathematics education provide a window to examine tacit assumptions and aspects of teaching (Bradburn and Gilford 1990; Hiebert et al. 2003), which remain hidden until they are exposed and scrutinized. In particular, there is rising interest in unraveling cultural and instructional factors that explain differences in achievement among countries (Cai 2005; Ma 1999; Stigler and Hiebert 1999). Cross-national studies have compared aspects of mathematics education (e.g., An, Kulm, and Wu 2004; Ma 1999) and have identified substantial differences in the mathematics education traditions among these countries. Many studies have unveiled unnoticed but culturally located teaching practices characteristic of the systems under scrutiny (Stigler and Hiebert 1999; Wilson, Andrew, and Sourikova 2001). Indeed, “comparing teaching and learning processes in different countries promises to help us to become more aware of our own implicit assumptions concerning the learning of mathematics” (Knipping 2003 p. 283).

One factor worth studying is teachers’ mathematical knowledge, which affects the quality of their instruction and student learning (Hill, Rowan, and Ball 2005; Hill, Blunk, Charalambous, Lewis, Phelps, Sleep, and Ball 2008). Comparative studies on teachers’ knowledge (An et al. 2004; Ma 1999) and instructional practices (Stigler and Hiebert 1999) have been useful in examining differences and commonalities in what teachers know and need to know to be effective in their practice. However, when comparing teachers’ knowledge across countries, the question remains whether what is being compared is commensurable or whether we are comparing the incomparable (Silver 2009). This possible discrepancy raises not only methodological challenges, such as the suitability of the instruments being used (Delaney, Ball, Hill, Schilling, and Zopf 2008), but also, on a more ideological level, the question of whether the knowledge required for teaching mathematics is universal. The premise of the theoretical construct of mathematical knowledge for teaching (MKT) is that what teachers need to know is determined by what teachers actually do in their teaching (e.g., Ball 1990). The task of teaching, however, may differ across countries.

In this study, we examine teachers’ mathematical knowledge of geometry in Indonesia using the MKT measures. Anderson-Levitt (2002) stresses the importance of understanding not only differences in the teaching practices among countries, but also how they “differ from the wider universe of alternatives” (p. 21). In using the MKT geometry measures to study teachers’ mathematical knowledge in Indonesia, we aim to uncover tacit assumptions about Indonesian teachers’ knowledge of geometry. The research question that drives this paper is, “What are the challenges in using the adapted MKT in studying Indonesian elementary teachers’ mathematical knowledge for teaching geometry?”

2 Background

There is a growing interest in, and need for, valid and reliable instruments to measure teachers’ knowledge of mathematics, owing to a climate of increased accountability. Few scholars will dispute that teachers’ knowledge of mathematics is one of the most important influences on teaching practices and eventually on what students learn (Ball 1991; Ball, Lubienski, and Mewborn 2001; Hill, Ball, and Schilling 2008). Thus, the availability of measures to reliably assess what teachers know holds promise for further understanding what mathematical knowledge teachers need in order to impact student learning. Moreover, with the increased attention to comparative studies in mathematics education in the past few decades, examining the quality of teachers’ mathematical knowledge in different countries may provide insights into improving students’ achievement (An et al. 2004; Cai 2005; Ma 1999).

The scope of cross-national studies on teachers’ mathematical knowledge has been limited to a few countries, and these few countries perform well when compared to the USA in international comparisons (e.g., An et al. 2004; Cai 2005; Ma 1999; Stigler and Hiebert 1999; Zhou et al. 2006). Widening the range of these studies to include more countries, including developing countries, may be useful to reach a greater understanding of the teaching and learning of mathematics. In addition, including representative samples of teachers rather than small selective samples is a desirable feature. Comparative studies on teachers’ mathematical knowledge can help shed light on observed differences in performance in studies such as TIMSS and PISA.

One example of such a study is the Teacher Education and Development Study in Mathematics (TEDS-M) funded by the International Association for the Evaluation of Educational Achievement (IEA), which compares teacher education programs focusing on the preparation of teachers to teach mathematics in primary and lower secondary schools. The TEDS-M framework makes a clear distinction between two types of knowledge within the mathematical knowledge for teaching: mathematical content knowledge and mathematics pedagogical content knowledge (Tatto, Schwille, Senk, Ingvarson, Peck, and Rowley 2008). Drawing on the curricular knowledge and pedagogical content knowledge categories from Shulman’s (1986) three categories of knowledge for teaching, the project’s framework assumes that mathematics pedagogical content knowledge consists of three components: mathematical curricular knowledge, knowledge of planning for mathematics teaching and learning (preactive), and enacted mathematics knowledge for teaching and learning (interactive).

The majority of the items are in multiple-choice format so that the instruments can be used at scale. Two other formats, complex multiple choice and open constructed response, are also used. To provide useful linkage to TIMSS data, the study adopts both the content and cognitive domains from the TIMSS framework, with the content covering topics in number, algebra, geometry, and data, and the cognitive domains comprising knowing, applying, and reasoning. For the content knowledge items, the study classifies the curricular level of each item into three categories: “novice” (mathematics content that is typically taught in the grades the future teacher will teach), “intermediate” (content that is typically taught one or two grades beyond the highest grade the future teacher will teach), and “advanced” (content that is typically taught three or more years beyond the highest grade the future teacher will teach). Therefore, the TEDS-M assesses the content teachers are required to teach, with the assumption that prospective teachers need to know at least 2 years beyond the level they are expected to teach. The pedagogical content knowledge items present typical teaching scenarios, and participants are asked to suggest a pedagogical solution in relation to the three components of mathematics pedagogical content knowledge in the framework.

Another set of measures that has been widely studied and shown to be successful in the USA is the MKT measures (Learning Mathematics for Teaching 2008). Scholars at the Learning Mathematics for Teaching (LMT) project at the University of Michigan base these measures on a construct inspired by Shulman’s (1986) theory that teachers need to know both subject matter knowledge and pedagogical content knowledge. They further refine these two categories of knowledge into six domains of knowledge (Fig. 1) and refer to this construct as MKT (Hill et al. 2008a, b). The three domains under subject matter knowledge consist of common content knowledge (CCK), specialized content knowledge (SCK), and knowledge at the mathematical horizon or horizon content knowledge (HCK). Pedagogical content knowledge (PCK) is subdivided into knowledge of content and students (KCS), knowledge of content and teaching (KCT), and knowledge of curriculum.

Fig. 1 Domains of MKT (Ball, Thames, and Phelps, 2008, p. 403)

Commonalities exist between the LMT and TEDS-M projects, especially pertaining to PCK. Both groups’ characterization of the PCK as knowledge of explanations and of students’ thinking aligns with Shulman’s (1986) theoretical framework of teachers’ knowledge.

However, the two groups differ when it comes to content knowledge because of differences in the targeted population. The LMT project was interested in practicing elementary teachers, whereas the TEDS-M project targets prospective mathematics teachers. In particular, the LMT project’s distinction between CCK and SCK for practicing teachers is noteworthy. These domains were initially identified through study of the practice of teaching; measures were then developed, and the domains were subsequently tested and supported by psychometric analyses (Ball et al. 2008). However, despite the confirmation of these domains, the measures were developed based on teaching practices in the USA and may not translate to other countries. For instance, Ball et al. (2008) identify one of the aspects of mathematical tasks of teaching in the USA as “appraising and adapting the mathematical content of textbooks” (p. 400). However, many countries have a national curriculum, and there may be little variation among textbooks. Where teachers are expected to use the textbook as a prescriptive manual, there is no room for appraising or adapting mathematical content from the textbook. Therefore, the translation and adaptation of measures such as the MKT is not straightforward and requires careful scrutiny to be successfully implemented in another setting.

Emenogu and Childs (2005) note the well-documented fact that even when rigorous processes of translation, verification, and field-testing are followed, translation may introduce measurement nonequivalence. Differences may arise not only from differences in language, but also from differences between language groups. For instance, curriculum differences such as the sequence of mathematics courses, the time spent on topics, and the availability of textbooks and other materials may cause differences in the relative item difficulty of measures (Emenogu and Childs 2005). Understanding the context of the country where the measures are to be used is therefore necessary.

When using a wider lens to examine the mathematical work of teaching, one expects to find similarities across countries. Anderson-Levitt (2002) points out that differences and similarities exist in the organization of the work of teaching, teachers’ beliefs and values, and teaching practices. Of interest in the context of teacher knowledge is the existence of a common curriculum at the elementary level and the “egg-carton” school structure in which one teacher works with a group of students isolated in a class more or less at the same grade level. However, when zooming in to examine the work of teaching mathematics in detail, differences do occur. Stigler and Hiebert (1999), for instance, claim that different beliefs about the nature of mathematics, the nature of learning, the role of the teacher, the structure of a lesson, and how to respond to individual differences among students lead to different teaching practices in the USA and Japan. Other research studies also support the existence of national scripts (Andrews 2007). These studies, however, focus on (a) mathematical foci or observable learning objectives of a teacher’s actions; (b) mathematical contexts or the conceptions of mathematics underpinning the tasks teachers present their students; and (c) generic didactic strategies that teachers use to facilitate learners’ ability to understand and use mathematics (Andrews 2007, 2009). Although they highlight differences in key tasks of teaching mathematics, such as the use of mathematical terms and conventions, the presence and prevalence of particular content strands at particular grades, and the typical responses of students, the question remains whether differences in teaching practices warrant different knowledge required for the task of teaching across cultures.

3 Indonesian context

Like many developing countries, Indonesia is undertaking an effort to improve its education. Students’ performance in national examinations shows mathematics to be consistently the lowest-scoring of all school subjects (Depdikbud 1997). In international comparative studies like TIMSS and PISA, Indonesian students performed less well than students in most other participating countries (Gonzales et al. 2008; OECD 2010).

Prospective elementary teachers had the following options regarding their educational training: 1-year, 2-year, or 3-year programs known as the Diploma 1 (D1), Diploma 2 (D2), and Diploma 3 (D3) programs, and the 4-year Stratum 1 (S1) program. Generally, diploma programs emphasize professional training, whereas the S1 program, equivalent to the bachelor’s degree, emphasizes academic training. However, in the education areas the difference between the two is unclear, and the D1 and D2 programs preparing junior secondary teachers have been phased out and replaced by the D3 programs (Joni 2000). To improve teachers’ quality, the Teachers and Lecturers Law No. 14 was passed in 2005, requiring teachers at all levels to complete a 4-year college degree; in 2007, the National Education Ministry enacted teacher certification programs, requiring every teacher to fulfill the academic qualification standard and demonstrate teacher competency (World Bank 2010).

Unlike in high-performing Asian countries, research on teacher knowledge in Indonesia is scant. In general, Indonesian teachers’ knowledge of the content is reported to be weak (Joni 2000; World Bank 2010). However, there is a lack of records on the specific topics or areas where these teachers lack knowledge. Some studies have shown that ordering negative decimals and placing them on the number line are challenging for Indonesian preservice elementary teachers (Widjaja, Stacey, and Steinle 2011), and that these teachers have weak knowledge in linking fractions and decimals (Widjaja and Stacey 2009).

4 Method

4.1 Participants

In this study, 229 elementary and middle school teachers who participated in science and mathematics professional development programs in Indonesia were recruited. Only those who were in the mathematics professional development program were invited to participate. These programs were requested by the school districts and were conducted by an independent professional development provider. Although all teachers were mandated to participate in these programs, participation in this study was voluntary. Teachers who participated in the study were given a small token of appreciation. Of the 229 participants, 19 returned incomplete forms and were dropped from the study. Ten of the respondents with incomplete forms were district supervisors. Among the remaining 210 teachers, there were 180 elementary teachers and 30 middle school teachers. Because the MKT geometry measures assess geometry content knowledge for grades 3 through 8, the elementary teachers and middle school teachers were combined. There is a possibility that the initial preparation of the two groups of teachers might have an effect on how they respond to the items. However, because the treatment of two-dimensional shapes did not differ between the two grade bands (Departemen Pendidikan Nasional 2003), the two groups were assumed not to differ significantly.

4.2 Instruments

The EL GEOCK 2004 MKT geometry measures were selected as part of a study to evaluate the effectiveness of a professional development program conducted by the author, working through the existing professional development provider mentioned above. Because the MKT instrument is one of the most widely used instruments to assess in-service teachers’ mathematical knowledge in the USA, the measures were considered appropriate for this study. The measures assessed content knowledge in geometry for grades 3–8 and consisted of 35 items. These items were distributed between two forms: Form A (8 item stems, 19 items) and Form B (8 item stems, 23 items), with seven common items for linking purposes. The items were in multiple-choice format, with three to seven answer choices each. The geometry scales only had items in the content knowledge domain. For the purpose of this paper, only data from Form A, given as a pretest, are reported. The measures were administered to the participants of the mathematics professional development at the beginning of the program. Participants were given 1 h to complete the form individually, but on average they required only about 30 min.

The MKT geometry measures were translated by the author and then adapted for cultural fitness as part of a study examining Indonesian teachers’ MKT (Ng 2011). Delaney et al.’s (2008) framework was used for item adaptation, which included changes to general context, changes to school context, and changes to mathematical substance. The author, who translated the items, is fluent in both Indonesian and English and has experience teaching secondary school in Indonesia and teaching secondary and preservice elementary teachers in the USA. To ensure the soundness of the translation and adaptation, the translated items were examined by a professor in mathematics education, a professor of Teachers of English to Speakers of Other Languages (TESOL), and a staff person from a professional development provider in Indonesia. They verified that the translations were accurate and that any changes regarding wording and cultural fitness were in line with common presentations in Indonesia. The translation and adaptation of the measures were documented, especially issues that arose during the verification of the translated version by the experts mentioned above.

4.3 Data analysis

To assess how the translated and adapted versions performed, three psychometric analyses were conducted: comparing the point biserial correlation estimates between the US and Indonesian measures; comparing the relative item difficulties from a one-parameter item response theory (IRT) model between the two countries; and calculating the reliability of the adapted measures. The point biserial correlation estimates were calculated from the Indonesian teachers’ responses and compared to the US point biserial correlation estimates. A point biserial estimate indicates how strongly responses to an item are correlated with performance on the measure as a whole. A higher point biserial correlation indicates a stronger relationship between the item and the construct being measured (Delaney et al. 2008). In this study, higher point biserial correlations would suggest that the items were capable of discriminating between teachers who were closer together on their MKT geometry. The relative item difficulties from a one-parameter IRT (Rasch) model were calculated and compared to the US form. Item difficulty is reported in standard deviation units, where 0 represents the average teacher ability. Items with difficulties lower than 0 are easier and items with difficulties higher than 0 are harder. Finally, the reliability of the measures, i.e., how consistent the respondents’ scores are over multiple items or over repeated testing, was calculated for each form. The reliability, estimated using Cronbach’s alpha, is reported on a 0–1 scale, with higher values indicating highly correlated item responses and consistency in measuring the same construct.
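The two classical statistics described above can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the study: the function names are hypothetical, the responses are assumed to be scored 0/1 in a teachers-by-items matrix, and the point biserial is computed against the score on the remaining items (a "corrected" item-total correlation), which is an assumption because the text does not say whether the item itself was included in the total.

```python
import numpy as np

def point_biserial(responses):
    """Point biserial correlation of each item with the rest-of-test score.

    responses: 2-D array-like, rows = teachers, columns = items, entries 0/1.
    Assumption: each item is correlated with the total of the OTHER items.
    """
    responses = np.asarray(responses, dtype=float)
    total = responses.sum(axis=1)
    estimates = []
    for j in range(responses.shape[1]):
        item = responses[:, j]
        rest = total - item  # score on the remaining items
        # Pearson correlation between a 0/1 item and a continuous score
        estimates.append(np.corrcoef(item, rest)[0, 1])
    return np.array(estimates)

def cronbach_alpha(responses):
    """Cronbach's alpha for a teachers-by-items matrix of scored responses."""
    responses = np.asarray(responses, dtype=float)
    k = responses.shape[1]                               # number of items
    item_variances = responses.var(axis=0, ddof=1).sum() # sum of item variances
    total_variance = responses.sum(axis=1).var(ddof=1)   # variance of total scores
    return k / (k - 1) * (1.0 - item_variances / total_variance)
```

An item answered identically by every teacher has zero variance, so its point biserial is undefined (NumPy returns `nan`); such items would need to be handled separately.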

In addition to the three psychometric analyses mentioned above, the test information curve was generated for each form, and its maximum located, to examine how useful the measures were. The test information curve provides information on whether the measures were more difficult or less difficult for the average teacher, i.e., whether the measures are able to discriminate among teachers of different ability levels (Hill et al. 2008a, b).

5 Results

The focus of this paper was to examine how useful the adapted MKT geometry measures were for studying teachers’ knowledge in Indonesia and the challenges in using the adapted MKT to study Indonesian elementary teachers’ MKT geometry. Selected challenges in the translation and adaptation of items are discussed first to highlight some of the complexities in using instruments from another setting. Findings from the psychometric analyses are then presented.

5.1 Item translation and adaptation

When translating the MKT geometry measures using Delaney et al.’s (2008) categories of changes for adapting instruments, several issues came up. This section documents some of the challenges in translating and adapting the MKT geometry measures specifically related to the mathematical substance as it pertains to teachers’ knowledge of mathematics. More detailed documentation of the translation and adaptation process in the study is available in another paper (Ng, Mosvold, and Fauskanger 2012).

One of the challenges of item translation and adaptation related to the mathematical substance pertains to the influence of mathematical language and definitions. The use of technical mathematical terms is not common, particularly in the elementary school context. For instance, the term polygon is frequently used in school texts in the USA, but rarely in Indonesian school texts. Instead, a more descriptive term, multi-sided 2-D shape, is typically used. The same is true of the term hexagon, for which the more descriptive six-sided 2-D shape is preferred. A second issue relating to mathematical language is the lack of an equivalent mathematical term in the Indonesian context. For example, the word tessellate is not used in Indonesia. In the USA, tiling is a plane-filling arrangement of plane figures, whereas tessellation is a special form of tiling using regular polygons. In Indonesia, however, no distinction is made between the two. Elementary teachers in Indonesia are also not familiar with the term diagonal, because it is not addressed in the elementary curriculum.

Finally, a third difference in mathematical vocabulary relates to terms whose meaning is ambiguous and must be inferred from context. For example, a single Indonesian word covers what English distinguishes as the faces of three-dimensional solids and the edges (sides) of two-dimensional figures. The word sisi (literally, side) describes both the face of a three-dimensional shape and the edge of a two-dimensional shape, and its meaning is understood from context (side of a square versus side of a cube). This usage is not merely colloquial, in the sense of being used informally in school contexts with young children; mathematicians use the same word, and which component it refers to is understood from context. For the edge of a polyhedron, the segment where two faces meet, another term, rusuk, is used in Indonesian.

5.2 Psychometric analyses

5.2.1 Point biserial correlations

Point biserial estimates provide information on the degree of relation among the items. A higher point biserial correlation is indicative of a stronger relationship among the items and the construct being measured. The point biserial estimates for each item in the MKT geometry measures, compared with those from a sample of US teachers, are presented in Table 1.

Table 1 Point biserial correlation estimates for MKT geometry measures between the US and Indonesian versions

One item (Item 3c, Table 1) in the Indonesian version had a negative point biserial correlation (r = −0.045), indicating that teachers who scored well on other items on this test were more likely to get this particular item incorrect. This item asked the teachers if it was possible for a parallelogram to have congruent diagonals. Teachers whose concept image of a parallelogram included the rectangle as a special case of a parallelogram with congruent angles would have answered this question correctly. However, the negative point biserial suggested that teachers who were correct on other items tended to answer this item incorrectly, which indicates that their concept image of a parallelogram excluded the rectangle as a special type of parallelogram. Furthermore, three other items with low point biserial estimates (Items 1b, 3a, and 4) test teachers’ knowledge of the inclusion relations among quadrilaterals. These results further suggested that the Indonesian teachers in this study were not familiar with the concept of hierarchical relationships among the quadrilaterals. Another item with a low point biserial estimate was Item 1e, which asked teachers to evaluate whether all the angles of a hexagon were always equal and thus required thinking about hexagons generically rather than only about the typical regular hexagon.

The point biserial correlations of each item were compared between the US and the Indonesian measures. Figure 2 plots the point biserial correlations of each item for the US and Indonesian samples after eliminating the item with a negative point biserial correlation (Item 3c). The correlation between the two sets of point biserial estimates was moderate (r = 0.369). This result suggested that items that were highly correlated with one another in the US measures were not necessarily highly correlated in the adapted Indonesian measures. Some of the items in the adapted version had low point biserial estimates (and one item even had a negative point biserial estimate).

Fig. 2 A regression line fitted to a scatter plot of the US and Indonesian point biserial correlations

5.2.2 Relative item difficulties

The item difficulties were examined to determine whether the same items were easy and difficult in both countries. Table 2 presents the item difficulty estimates for the two countries. The majority of the items were more difficult for Indonesian teachers except for items related to the use of the geoboard (Items 2a–2e), where the relative difficulties were close for the two settings. Items that required teachers to think relationally about quadrilaterals (Items 1b, 3a, and 4) were significantly more difficult for teachers in the Indonesian sample (close to 2 standard deviation units).

Table 2 Comparison of the difficulty estimates for MKT geometry measures between the US and Indonesian versions

Figure 3 shows a regression line fitted to a scatter plot of relative item difficulties of the MKT geometry measures. The correlation between the relative item difficulties for Indonesian and US teachers is relatively strong (r = 0.817). There were two outliers. Item 3a was relatively more difficult for Indonesian teachers, as explained above, whereas Item 3d, which asked whether a right triangle can have all sides equal, was relatively more difficult for US teachers.

Fig. 3 A regression line fitted to a scatter plot of the relative difficulties of items in the Indonesian and US versions of the MKT geometry measures
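A comparison of this kind can be sketched as follows, assuming the per-item estimates for the two countries are available as two equal-length arrays. The function names are hypothetical, and ranking items by the size of their residual from the fitted line is only one plausible way to spot outliers; the text does not state how the outliers were identified.

```python
import numpy as np

def compare_estimates(us, indonesia):
    """Pearson correlation and least-squares line relating two sets of
    per-item estimates (for example, item difficulties in two countries)."""
    us = np.asarray(us, dtype=float)
    indonesia = np.asarray(indonesia, dtype=float)
    r = np.corrcoef(us, indonesia)[0, 1]
    slope, intercept = np.polyfit(us, indonesia, 1)  # degree-1 fit
    residuals = indonesia - (slope * us + intercept)
    return r, slope, intercept, residuals

def largest_residual_items(labels, residuals, n=2):
    """Labels of the n items lying farthest from the fitted line
    (an assumed, informal outlier screen)."""
    order = np.argsort(-np.abs(np.asarray(residuals, dtype=float)))
    return [labels[i] for i in order[:n]]
```

A positive residual marks an item that is harder for the Indonesian sample than the overall trend predicts, and a negative residual marks an item that is relatively harder for the US sample.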

5.2.3 Reliability of the MKT geometry measures

The reliabilities of the measures were calculated using Cronbach’s alpha. Since the measures contained the same items as the US form, their reliabilities can be compared. The adapted measures had a relatively low reliability of 0.645 in Indonesia, compared with a Cronbach’s alpha of 0.845 in the USA (Hill 2004). In general, reliabilities of 0.70 or above are considered adequate for instruments intended to answer research and evaluation questions using relatively large samples. This low reliability means that the item responses were not highly correlated and did not consistently measure the teachers’ MKT geometry. Hill and colleagues (2008a, b) posit that low reliability of measures may result from several sources: (1) the existence of multiple constructs; (2) the presence of items not sensitive enough to discriminate among individuals; and (3) a mismatch of items to the ability level of the target population. In the case of the adapted MKT geometry measures, several items had low point biserial estimates and were relatively more difficult for the Indonesian sample. These questions assessed teachers’ understanding of the inclusive relations among quadrilaterals, which in the Indonesian context was neither related to their experience nor required in their task of teaching geometry in the elementary grades. The national curriculum guideline (Departemen Pendidikan Nasional 2003) states under the Geometry and Measurement standards that “students should be able to (a) Identify and measure two- and three-dimensional shapes, and use them to solve problems; (b) Identify and measure elements of two-dimensional shapes, and use them to solve problems; (c) Measure perimeter and area of two-dimensional shapes and use them to solve problems; (d) Identify characteristics and measure elements of three-dimensional shapes, identify symmetry of two-dimensional shapes, and use them to solve problems; and (e) Use coordinate system in a plane” (p. 18, author translation).
As a comparison, the Principles and Standards for School Mathematics document states the expectation that grades 3–5 students should “identify, compare, and analyze attributes of two- and three-dimensional shapes and develop vocabulary to describe the attributes; and classify two- and three-dimensional shapes according to their properties and develop definitions of classes of shapes such as triangles and pyramids” (NCTM 2000 p. 164). The treatment of two- and three-dimensional shapes in the Indonesian curriculum is thus different; shapes are introduced as distinct objects and no efforts to relate them can be found in the standards and textbooks.

The existence of this possible sub-construct of knowledge of inclusive properties of shape that is distinct from the general content knowledge of geometry might have lowered the reliability of the measures, particularly for this context.

5.2.4 Test information curve

The test information curve identifies the level of knowledge at which the scale best measures individuals. Figure 4 shows the test information curve for the adapted measures. The x axis is the teachers’ scale score; 0 typically corresponds to the average teacher in the population under study, with negative scores indicating less knowledgeable teachers and positive scores indicating more knowledgeable teachers. In this case, Fig. 4 shows that most information (shown as a solid line and measured on the left y axis) is provided for teachers between zero and 1 standard deviation above average; a corresponding standard error estimate (shown as a dotted line and measured on the right y axis) shows that errors in score estimation increase significantly for individuals below average. In the pilot study for US teachers who participated in California’s Mathematical Professional Development Institute, the geometry form A provided the maximum information for less knowledgeable teachers whose abilities were one-half standard deviation below the mean (Hill 2007). For the adapted measures in Indonesia, the test information curve reaches its maximum at z = 0.5, as shown in Fig. 4. In other words, the geometry form A provides maximum information for Indonesian teachers whose abilities are 0.5 standard deviations above the mean, and the survey provides less information for teachers whose abilities are more than 2.5 standard deviations above the mean or more than 1.5 standard deviations below the mean. This means that the adapted measures discriminate among more knowledgeable Indonesian teachers better than among average or less knowledgeable Indonesian teachers.

Fig. 4 Test information curve for MKT geometry measures

Since the adapted measures were the same as the US measures, the average item difficulty of each form could be compared. The adapted MKT measures had an overall average item difficulty of 0.316 standard deviation units, which was significantly higher than that of the US form, which had an overall average item difficulty of −0.548 standard deviation units, implying that the adapted MKT was more difficult for the Indonesian teachers.
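To give a rough sense of what this gap in average difficulty means, the worked equation below applies the Rasch model to an item at each form's average difficulty. It is an illustration under stated assumptions, not a result from the study: discrimination is fixed at 1, the reported standard deviation units are treated as logits, and θ = 0 denotes the average teacher in each country's own sample.

```latex
\[
P(X = 1 \mid \theta, b) = \frac{1}{1 + e^{-(\theta - b)}}
\]
\[
P(X = 1 \mid \theta = 0,\ b = 0.316) = \frac{1}{1 + e^{0.316}} \approx 0.42,
\qquad
P(X = 1 \mid \theta = 0,\ b = -0.548) = \frac{1}{1 + e^{-0.548}} \approx 0.63
\]
```

Under these assumptions, an average teacher in the Indonesian sample would be expected to answer an item of average difficulty correctly about 42% of the time, against about 63% for an average teacher in the US sample.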

6 Discussion

Comparative studies provide “a powerful way to unveil unnoticed but ubiquitous practices” (Stigler et al. 2000 p. 88). This study, which used practice-based measures of teacher knowledge developed in the USA to examine Indonesian elementary teachers’ knowledge of geometry, has revealed many nuances in the mathematical task of teaching in Indonesia. The process of translating and adapting the MKT geometry measures for use in the Indonesian context unveiled differences in the emphasis of teaching and learning geometry despite commonalities in the topics treated in the curricula of both countries. Although elementary teachers in both countries are required to teach the geometry of two-dimensional shapes, teachers in Indonesia introduce these shapes without any effort to relate them based on their properties, following the Indonesian national curriculum guideline, in which one of the objectives of geometry for students is to “identify two- and three-dimensional shapes based on their properties, characteristics, or similarities” (Departemen Pendidikan Nasional 2003, authors’ translation). Teachers in the USA, on the other hand, need to emphasize relationships among shapes with their students, for example that a square is a special type of rectangle.

In this section, we discuss the linguistic challenge when adapting the MKT instrument and the importance of attending to multiple definitions of mathematical objects. Following the discussion on these challenges, we turn to a broader discussion of what knowledge is required of teachers and what teachers should know ideally and offer suggestions for further discussions and research.

6.1 Inclusive versus exclusive definitions

Language plays a significant role in promoting mathematical understanding (e.g., Miura, Kim, Chang, and Okamoto 1988). For instance, the linguistic structure of the language of instruction has been found to influence students’ understanding of place value (Miura, Okamoto, Kim, Steere, and Fayol 1993) and fractions (Miura, Okamoto, Vlahovic-Stetic, Kim, and Han 1999). In this study, results from psychometric analyses showed that items testing teachers’ content knowledge of the hierarchical relationships of quadrilaterals had low point biserial estimates and were about 2 standard deviations more difficult for Indonesian teachers than for US teachers. This discrepancy may be attributed to linguistic influence. The Indonesian term for rectangle is literally “long square,” which resembles Euclid’s term oblong. The term therefore precludes a rectangle from having all sides congruent, and consequently a square is not regarded as a special type of rectangle. As a result, Indonesian teachers’ concept images of squares and rectangles are incongruent with the inclusive definitions presented in the MKT geometry measures. When teachers are presented with a statement such as “A square is a rectangle” and asked to evaluate whether it is always true, sometimes true, or never true, the item does not test their understanding of the relational properties between the two shapes; it actually tests their concept images, which are strongly influenced by the definitions they adopt.

Contrary to common public views about the language of mathematics being universal and thus mathematical definitions being standardized, multiple definitions do exist and the same mathematical objects are named in many different ways, indicating different mathematical meanings (Usiskin and Griffin 2008). Another possible reason why the Indonesian elementary teachers found items relating shapes to be difficult was that they were familiar with the exclusive definitions of quadrilaterals. Usiskin and Griffin (2008) point out that the classification of quadrilaterals as found in Book I of Euclid’s Elements adopts exclusive definitions of the five quadrilaterals (squares, oblongs, rhombuses, rhomboids, and trapezia), where none of the five types of quadrilaterals is a special kind of any other type. In other words, Euclid’s scheme of classifying the quadrilaterals is of a partitional nature rather than a hierarchical one (De Villier 1994). The Indonesian textbooks adopt these exclusive definitions, and there is no indication in the curriculum guidelines to relate these shapes. Students are expected to “identify … two-dimensional shapes” and “identify … elements of two-dimensional shapes” (Departemen Pendidikan Nasional 2003, p. 18). There is no expectation to “… compare, and analyze attributes of two- and three-dimensional shapes and develop vocabulary to describe the attributes; and classify two- and three-dimensional shapes according to their properties and develop definitions of classes of shapes such as triangles and pyramids” (NCTM 2000 p. 164) as outlined in the US document.

De Villier (1994) posits that many teachers and textbook authors hold a perspective that only the conventional hierarchical classification is mathematically acceptable, whereas a partition classification is mathematically illogical and therefore unacceptable. However, the opposite case is true for some countries, in this case, Indonesia, where the acceptable classification is partitional in nature and the hierarchical classification is not mentioned in the primary or secondary curriculum (Departemen Pendidikan Nasional 2003). Both classifications are mathematically valid; however, the wider adoption of the hierarchical classification over the partition classification is because of its greater functionality (De Villier 1994). Most textbooks and teachers, however, completely ignore discussing this fundamental aspect, simply imposing a certain classification and definitions on students for which they have little or no functional understanding. On the other hand, teachers, especially at the primary level, may not even be cognizant of the existence of alternative classifications and definitions as shown in this study. Leikin and Winicki-Landman (2000) found that even experienced mathematics teachers struggle with this issue of equivalent and non-equivalent definitions.

The following sample item, taken from the pool of released items (LMT 2008), has been selected to exemplify the challenge in translating and adapting the items, specifically the problem of multiple definitions. This item is not directly related to the topic of quadrilaterals because no such item is among those released. However, some changes were made to this released item to showcase the problem at hand (Fig. 5).

Fig. 5 Sample geometry item from released items (Learning Mathematics for Teaching 2008)

In the adapted version of the sample item from the released items pool, the response a teacher gives depends on which definition of symmetry is used. One view of rotational symmetry counts a 360-degree rotation as a rotational symmetry. Another view considers the 360-degree rotation to be trivial, since any shape rotated through 360° is mapped to its original position. Relating this back to the items in the geometry measures, asking the Indonesian teachers to judge whether the statement “a rectangle is a square” is always true, sometimes true, or never true may evaluate their concept images of squares and rectangles, which depend on the definitions they adhere to. The Indonesian teachers were more likely to get these items incorrect, as the results showed, not necessarily because they are not knowledgeable teachers, but because of the strong influence of their concept images.
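
The dependence of the answer on the definition can be made concrete with a small Python sketch. It does not reproduce the item in Fig. 5. The function names are hypothetical, and the sketch assumes a convex polygon given by its vertices, so that a rotation about the centroid is a symmetry exactly when it maps the set of vertices onto itself. The flag count_full_turn switches between the two views of rotational symmetry described above.

```python
import cmath
import math

def rotation_angles(vertices, tol=1e-9):
    """Angles in (0, 360] whose rotation about the centroid maps the vertex set onto itself.

    Assumes a convex polygon with n vertices; only multiples of 360/n can qualify.
    """
    n = len(vertices)
    points = [complex(x, y) for x, y in vertices]
    centroid = sum(points) / n
    centered = [p - centroid for p in points]
    angles = []
    for k in range(1, n + 1):
        rot = cmath.exp(2j * math.pi * k / n)  # rotation by k * 360/n degrees
        if all(any(abs(p * rot - q) < tol for q in centered) for p in centered):
            angles.append(360.0 * k / n)
    return angles

def has_rotational_symmetry(vertices, count_full_turn):
    """Answer under either view: does the 360-degree rotation count or not?"""
    angles = rotation_angles(vertices)
    if not count_full_turn:
        angles = [a for a in angles if a < 360.0]
    return len(angles) > 0

# Illustrative shapes only (hypothetical, not taken from the item).
SQUARE = [(0, 0), (1, 0), (1, 1), (0, 1)]
RECTANGLE = [(0, 0), (2, 0), (2, 1), (0, 1)]
SCALENE_TRIANGLE = [(0, 0), (4, 0), (0, 3)]

if __name__ == "__main__":
    for name, shape in [("square", SQUARE), ("rectangle", RECTANGLE),
                        ("scalene triangle", SCALENE_TRIANGLE)]:
        print(name, rotation_angles(shape),
              has_rotational_symmetry(shape, count_full_turn=True),
              has_rotational_symmetry(shape, count_full_turn=False))
```

For the scalene triangle the two views give opposite answers, whereas for the square and the rectangle they agree; an item of this kind therefore separates respondents by definition only for shapes whose sole rotational symmetry is the full turn.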

6.2 What teachers are required to know and what teachers should know

The work of teaching in a broader sense may be similar among different countries (LeTendre et al. 2001), but on closer examination differences occur. For instance, US teachers may need to select and adapt the mathematical content of textbooks because of the wide selection of curriculum materials available in the market. US teachers have much more freedom in using textbooks for classroom instruction and in how closely they follow the textbooks (Li, Chen, and An 2009). However, many countries have a national curriculum, and there may be little variation among textbooks. In many countries, although teachers have choices on instructional strategies and modifications, the textbook provides “a blueprint for content coverage and instructional sequence” (Li, Chen, and An 2009 p. 813). Individuals may have choices in the pedagogical stances employed or in possible adaptations to specific students’ needs, but not in the content, nor in questioning the authority of the textbook. Indonesian teachers perceive that definitions are typically given in the textbook, so they do not scrutinize definitions or question their validity. The work of teaching itself is not different because of the existence of a national curriculum. However, the demand placed on teachers to adapt instructional materials is considerably lessened by national standards and prescriptive curriculum guidelines.

The results in this study indicate that although the adaptation of the MKT geometry measures in the Indonesian context presents challenges, the measures were useful for discriminating somewhat knowledgeable teachers from less knowledgeable ones (z between −1.5 and 2.5 standard deviations), as shown by results from the psychometric analyses. However, a more pressing question is whether differences in teaching practices call for different knowledge for teaching. Do teachers in Indonesia need to know about inclusive definitions of quadrilaterals even when they are not required to teach them to their students, because the curriculum guidelines do not specify them? Answering this question is important, especially for international studies whose purpose is actually to compare teachers’ knowledge.

6.3 Globalization of mathematics education and teacher knowledge

It should be pointed out that this study did not aim to compare teachers’ knowledge in Indonesia and in the USA. Teachers’ knowledge relies on many factors, such as their preparation program, the institutional practice of grade placement, and the curriculum guideline on the topics taught. Our intention was to examine the challenges in adapting the US-based measures for use in Indonesia. At face value, the MKT geometry measures may not be a good set of assessments to evaluate Indonesian elementary teachers’ MKT in geometry, for several reasons. First, many teachers in Indonesia are assigned to teach a specific grade level throughout their entire career (Ng 2011). Typically, teachers in Indonesia are trained for either the primary or the secondary level. At the primary level, many teachers have graduated from a diploma program. Teachers with a stronger mathematics background are assigned to upper primary grades (year 5 or 6). This narrow assignment poses challenges, because teachers will be resistant if they are to be tested on content knowledge they are not required to teach. However, because of the interconnected nature of mathematics, it could be argued that teachers need to know the content beyond a specific grade level and be able to anticipate future problems students may face when, for example, they are taught that squares and rectangles are distinct shapes rather than that one is a subset of the other. This kind of knowledge, which Ball et al. (2008) call Horizon Knowledge of Mathematics, is one of the important aspects of teachers’ MKT.

Second, differences in the use of definitions, particularly in the context of geometry in this study, present challenges because items that focus on teachers’ understanding of the inclusive definitions do not discriminate between knowledgeable teachers and less knowledgeable ones (these items have differences in item difficulties of about 2 standard deviation units). Instead, they discriminate between teachers who adopted inclusive definitions and those who adopted exclusive definitions. There has not been a consensus among mathematics educators on which definition should be encouraged in the elementary curriculum, although, judging by research studies on this topic, the trend is moving toward the adoption of inclusive definitions by many countries (e.g., Fujita and Jones 2006; NCTM 2000; Okazaki 1999). De Villier (1994) suggests the importance of discussing and comparing the relative advantages and disadvantages of the two ways of defining quadrilaterals, so that students may be led to realize that there are certain advantages in accepting a hierarchical classification. He proposes that hierarchical classification results in more economical definitions: for example, a hierarchical definition of a parallelogram as a quadrilateral with both pairs of opposite sides parallel versus a partitional definition as a quadrilateral with both pairs of opposite sides parallel, but not all angles or sides equal. He argues that under a hierarchical definition of a concept, all theorems proved for that concept automatically apply to its special cases. For instance, a proof that the diagonals of a parallelogram bisect each other immediately applies to rectangles, rhombuses, and squares, whereas under a partitional view it does not. For these reasons, De Villier (1994) acknowledges the more intuitive nature of the partitional view for students, but endorses the more functional hierarchical definitions in school geometry, requiring mathematics teachers to develop awareness of the existence of equivalent, non-equivalent, and even competing definitions (Winicki-Landman and Leikin 2000) and specifically of the advantages of one definition over the others.
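
The contrast between the two kinds of definition, and the way theorems carry over under the hierarchical one, can be sketched in Python. The class Quad, its three boolean properties, and the function names are hypothetical simplifications introduced only for illustration; the parallelogram definitions follow the two wordings quoted above, and the rectangle definitions are an analogous assumption.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quad:
    """A quadrilateral reduced to three properties (a deliberate simplification)."""
    both_pairs_parallel: bool
    all_angles_equal: bool
    all_sides_equal: bool

def is_parallelogram(q, inclusive):
    """Hierarchical (inclusive) versus partitional (exclusive) definition."""
    if inclusive:
        return q.both_pairs_parallel
    # Partitional: parallel opposite sides, but not all angles or sides equal.
    return q.both_pairs_parallel and not (q.all_angles_equal or q.all_sides_equal)

def is_rectangle(q, inclusive):
    """Inclusive: all angles equal. Exclusive: additionally, not all sides equal."""
    base = q.both_pairs_parallel and q.all_angles_equal
    return base if inclusive else base and not q.all_sides_equal

def diagonals_bisect_theorem_applies(q, inclusive):
    """A theorem proved for parallelograms applies to whatever counts as one."""
    return is_parallelogram(q, inclusive)

# Illustrative shapes described only by their properties.
SQUARE = Quad(both_pairs_parallel=True, all_angles_equal=True, all_sides_equal=True)
OBLONG = Quad(both_pairs_parallel=True, all_angles_equal=True, all_sides_equal=False)
RHOMBOID = Quad(both_pairs_parallel=True, all_angles_equal=False, all_sides_equal=False)

if __name__ == "__main__":
    for inclusive in (True, False):
        print("inclusive" if inclusive else "exclusive",
              "square is a rectangle:", is_rectangle(SQUARE, inclusive),
              "| theorem covers square:",
              diagonals_bisect_theorem_applies(SQUARE, inclusive))
```

Under the inclusive definitions the single proof for parallelograms covers the square and the oblong at no extra cost. Under the exclusive definitions the statement is still true of those shapes, but the sketch shows that it no longer follows from the parallelogram theorem and would have to be established separately for each class, and that the exclusive definitions need an extra negated condition.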

The MKT construct is based on studying the mathematical work of teaching in the USA (Ball and Bass 2003). Whether or not this construct is usable in other cultural settings depends on our conception of national culture. LeTendre et al. (2001) contend that “[t]he concept of a national script for teaching is an amalgam of cultural ideals about what should happen in an ideal lesson in [a] nation, not what actually occurs” (p. 12). Extending this analogy to the mathematical knowledge required to perform the task of teaching, the MKT is a construct about what teachers should know to perform ideally. The instrument developed to measure MKT is far from perfect, and this line of research has the potential to inform its continual development and adaptation for use with a wider audience. The MKT measures are worth looking at in other countries, because they help to surface differences in the practice of teaching that may otherwise remain hidden and so contribute to understanding what teachers need to know in order to do the work of teaching mathematics in any setting. This claim may be far-fetched at this time, but consider the trend in the globalization of mathematics (Clarke 2003). Atweh and Clarkson (2002) note that the convergence of school mathematics and mathematics education curricula around the world is an example of globalization of mathematics education, where constraints of geography on social and cultural arrangements recede. They also argue that different curriculum documents and textbooks in mathematics education around the world are strikingly similar rather than different, especially in the areas of content and sequencing of topics. Moreover, these similarities have proven to be rather stable across the years; changes in curriculum in one country or region are often reflected in other countries within a few years. LeTendre et al. (2001) warn that “[t]he forces behind global institutional isomorphism are considerable and failing to recognize their power in all organizations within the same institutional sector everywhere in the world is a route to an unclear picture of how culture actually works in a dynamic fashion” (p. 13). Therefore, we believe that our work in this study, examining challenges in adapting the MKT and its potential for use in Indonesia, will be fruitful in anticipating what the future might hold for studying teacher knowledge in Indonesia and other countries.