Peer tutoring in algebra: A study in middle school

Abstract This study reports the academic benefits of peer tutoring in algebra for middle school students. A total of 380 students enrolled in 7th and 8th grade participated in the study. Two peer tutoring sessions took place each week for 10 weeks, and interactions between peers lasted 20 to 25 minutes per session. The tutoring was fixed and same-age. A pretest-posttest control group design was used. Statistically significant improvements in academic achievement were found after the implementation of the peer tutoring program, both for 7th and 8th grade separately and for the two grades combined. Over 87% of the students in the experimental group improved their marks. The overall effect size was medium (Hedges' g = 0.48). The main conclusion of this study is that fixed, same-age peer tutoring in algebra may be very beneficial for middle school students.


Introduction
The benefits of peer tutoring in Mathematics have been widely reported during the last three decades. The pioneering studies by Fantuzzo, Riggio, Connelly and Dimeff (1989), Topping (1989), Greenwood, Delquadri and Hall (1989) and Fuchs and Fuchs (1990) led the way to hundreds of experiences in the field. Recent literature reviews (Alegre, Moliner, Maroto, & Lorenzo-Valentin, 2019; Leung, 2019b) and meta-analyses (Alegre-Ansuategui, Moliner, Lorenzo, & Maroto, 2018; Leung, 2019a) have repeatedly documented the academic, social and psychological benefits of this methodology. Peer tutoring experiences have been carried out across all educational levels, from early childhood education to college. Nevertheless, there is a lack of peer tutoring studies in Mathematics that address algebra contents. Although Mathematics education comprises six major branches (algebra, geometry, arithmetic, analysis, probability and statistics), peer tutoring experiences refer mostly to geometry and arithmetic contents (Leung, 2015). The very few studies addressing algebra contents are either inconclusive or reported no significant academic benefits for the students (Agne & Muller, 2019; Allsopp, 1997; Walker, Rummel, & Koedinger, 2014). Given the demonstrated potential of this methodology with other mathematical contents, it is necessary to determine whether it can also be effective for average students working with algebra contents. This study examines the academic achievement of middle school students (7th and 8th graders) working with algebra contents through peer tutoring. To this end, the following three hypotheses are defined:
Hypothesis 1: Students' marks in algebra will improve significantly after the peer tutoring implementation.
Hypothesis 2: The majority of students will improve their marks in algebra because of peer tutoring.
Hypothesis 3: Significant differences will be found between 7th and 8th grade students before or after the implementation of the peer tutoring program.

Theoretical framework
Duran, Flores, Oller and Ramírez (2018) state that students can often be better mediators than teachers or adults in academic environments. They support this statement by noting that students have only just learned the contents, so they know better the areas in which their peers might need more help. According to these authors, the use of more direct speech and the sharing of cultural and linguistic references must also be taken into account. Several definitions of peer tutoring can be found in the literature. According to Topping (2005), peer tutoring can be defined as students from similar social groups helping each other to learn and learning themselves by teaching. In this process, classmates with more knowledge or ability play the role of tutors. Hence, the tutors support and help other students (tutees) with less knowledge or ability through cooperation in pairs. In this sense, an asymmetric relationship is externally planned by a professional, with students sharing the goal of acquiring a curricular content (De Backer, Van Keer, Moerkerke, & Valcke, 2016). It is a methodology that fosters inclusion in the classroom as it promotes collaborative learning (Miravet & García, 2013; Shirani Bidabadi, Nasr Esfahani, Mirshah Jafari, & Abedi, 2019).
There are different types of peer tutoring depending on two factors: the ages of the participants and the roles they play during the experience. Taking into consideration the ages of the participants, two main types of peer tutoring can be developed: same-age and cross-age tutoring (Greene, McTiernan, & Holloway, 2018). If the roles of the participants are considered, two different types of tutoring can be found: fixed peer tutoring and reciprocal peer tutoring.
In same-age tutoring, students must be enrolled in the same grade. In cross-age tutoring, students are enrolled in different grades and can even belong to different educational levels (primary and secondary education or college). The superiority of one type over the other has yet to be proved. Authors such as Hänze, Müller and Berger (2018) state that students make greater progress with tutors older than themselves. According to Topping and Bryce (2004), the age difference guarantees the validity of the process, and the difference in age should range between 2 and 4 years. Vogelwiesche, Grob and Winkler (2006) claimed that students preferred cross-age over same-age tutoring when comparing both types. Nevertheless, several studies (Topping, Miller, Thurston, McGavock, & Conlin, 2011; Topping et al., 2011a) state that similar results were found for cross-age and same-age tutoring in Mathematics and other subjects. Moreover, none of the meta-analyses or literature reviews mentioned in the introduction section have reported significant differences between the two types. In addition, as Ramani (2016) states, same-age tutoring usually takes place in the same class group, so it has the added advantage that it does not require any special organizational actions.
Fixed tutoring is usually the most widely used strategy when implementing peer tutoring. In this type of tutoring, the participants maintain their respective roles (tutor and tutee) during the whole implementation. In contrast, during reciprocal peer tutoring the roles are exchanged (Duran & Monereo, 2005). Higher academic benefits have not been conclusively proved for one type over the other. As with same-age and cross-age tutoring, none of the literature reviews or meta-analyses mentioned above document the superiority in academic terms of one type over the other. Nevertheless, from a psychological point of view, reciprocal peer tutoring has been recommended over fixed peer tutoring. This is because tutees' self-concept may be affected by the continuous help received from their tutors, making them feel useless or inferior to their peers (De Backer, Van Keer, & Valcke, 2015a; Miravet, Ciges, & García, 2014). However, according to different authors (Bailey et al., 2018; De Backer et al., 2016), implementing reciprocal peer tutoring requires a moderate level of expertise and a deep knowledge of the students' academic and social skills.

Research design
The influence of the research design on peer tutoring interventions has recently been studied. Authors such as Zeneli, Thurston and Roseth (2016) state that the absence of a control group may lead to overestimating the academic outcomes of peer tutoring experiences, and they highly recommend using a control group. Accordingly, an experimental pretest-posttest control group design was used (Valente & MacKinnon, 2017).

Sample access
Participants in the study were accessed through convenience sampling (a non-probabilistic sampling technique), as they were selected only because they were conveniently available to the researchers (Altmann, 1974). Legal consent was obtained from the parents or legal guardians of the students and from the Educational Council of the region. One of the researchers of this manuscript served as a full-time teacher in the middle school where this research took place. This teacher had previous experience in the peer tutoring field, which facilitated the implementation of the program.

Participants
Students from a public middle school in Spain enrolled in 7th and 8th grade participated in the study. A total of 380 students participated during the whole peer tutoring program, as four students left the school after it began (turnover ratio of about 1%). Their ages ranged from 12 to 15 years old. About 52% were female and 48% were male. Approximately 56% were Hispanic, 22% were Caucasian, 19% were African, 2% were Asian and the remaining 1% were of other ethnicity. Almost all of them came from an urban zone and their families' socioeconomic background was average. Regarding grade distribution, 245 students were enrolled in 7th grade and 135 were enrolled in 8th grade. In 7th grade, 130 students were assigned to the experimental group and 115 students were assigned to the control group. In 8th grade, 70 students were assigned to the experimental group and the remaining 65 students were assigned to the control group. Due to organizational matters, it was impossible to have the same number of students in the experimental and the control group. In any case, the suggestions given by Kenny (1975) were followed to ensure that experimental and control groups were matched as closely as possible. First, students were assigned on a probabilistic basis to either the experimental or control group. Later, students' average marks in Mathematics from the previous school year were calculated and compared between the experimental and control group for both grades in order to find out whether there were important differences between groups. Once all groups seemed properly balanced regarding students' previous mathematics achievement, a Kolmogorov-Smirnov test for normality (Lilliefors, 1967) was run to check the normality of each group distribution.

Representativity of the Sample
The sample was intended to be at least representative of the population of middle school students in Spain. According to the National Institute of Statistics, in 2018 there were almost 1,300,000 middle school students in Spain. Taking into account the considerations provided by Krejcie and Morgan (1970) about sample size for research activities, a sample of at least 374 students was necessary for it to be representative of that population (5% margin of error and 95% confidence level). Hence, the sample was large enough for the purpose of the investigation.

Academic contents
The contents correspond to the second term of the 7th and 8th grade Mathematics courses. In this term, on the one hand, 7th graders learn algebra for the first time in the Spanish educational system. The contents covered included: expressing simple facts using algebraic expressions; solving first degree equations without fractions; addition, subtraction, multiplication and division of monomials; solving problems using first degree equations. On the other hand, 8th graders review all the contents of the previous academic course and expand their algebra knowledge with the following contents: solving second degree equations; solving first degree equations with fractions; addition, subtraction and multiplication of polynomials; solving problems using second degree equations; systems of two linear equations; solving problems with systems of two linear equations.

Peer tutoring program implementation
During the first term, the teacher used traditional teaching methods, that is, one-way instructional teaching, whereas during the second term the peer tutoring program was implemented alongside the teacher's lessons. Teachers' performance during the peer tutoring sessions was key, as they were assigned several tasks. First, teachers had to monitor students' interactions during the tutoring time to ensure that students were addressing mathematics contents instead of other academic or nonacademic issues. Teachers also had to watch for behavioral problems between any pair of students, such as a lack of interaction or reluctance to help a peer or receive help from him/her. If a pair of students was absolutely incompatible for tutoring practice due to behavioral reasons, its members were reassigned to different peers. Moreover, before interactions between students started, teachers had to check that tutors had the right result for each problem or exercise and correct it if it was wrong.
The chosen types of tutoring were fixed and same-age. Due to organizational matters, cross-age tutoring was ruled out. Besides, because the researchers did not have a deep knowledge of the participants' abilities and skills in Mathematics, fixed peer tutoring was chosen as a conservative approach (Topping, Watson, Jarvis, & Hill, 1996).

Classroom dynamics
The typical mode of instruction during a peer tutoring session was as follows. First, tutors and tutees were given the worksheet and all of them worked individually. Approximately 6 minutes were given to complete the first exercise. After that, students had 8 minutes to work in pairs (sharing the results, asking questions…). Later, 8 minutes were given to complete individually the second exercise for Primary Education and the problem for Secondary Education. Then, students had another 8 minutes to work in pairs again.

Instruments Used and Data Collection
The students' academic performance was measured by comparing their marks in the exam of unit 6 (Algebra 1) with their marks in the exam of unit 7 (Algebra 2). For both 7th and 8th grade, Algebra 1 is intended to serve as an introductory unit to the algebra contents that students will later use in Algebra 2. In the Algebra 1 exam, students must express facts and perform operations with algebraic expressions. In the Algebra 2 exam, students must solve equations and algebra problems. A strong theoretical and practical knowledge of Algebra 1 is needed in order to succeed in Algebra 2. All exams have 10 exercises or problems and are graded from 0 to 10. Students are given one point for each correct problem or exercise. Students are given 0.3 points for a problem in which a correct procedure is developed but there is a minor calculation mistake. The marks for the Algebra 1 exam were used as pretest scores and the marks for the Algebra 2 exam were used as posttest scores. All participants in the study completed both the pretest and the posttest. Students were not informed at any time that their results would be examined for research purposes, so that the Hawthorne effect or any other similar effect would not distort the results (Adair, 1984).
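The grading rule described above can be illustrated with a minimal Python sketch. The outcome labels and the function name are hypothetical, and the sketch assumes that any answer not covered by the two cases in the text scores zero, which the text does not state explicitly.

```python
# Hypothetical outcome labels; the text only describes the first two cases.
# Assumption: every other answer is worth 0 points.
POINTS = {"correct": 1.0, "minor_slip": 0.3, "wrong": 0.0}


def exam_mark(outcomes):
    """Return the 0-10 mark for the ten per-item outcomes of one exam."""
    if len(outcomes) != 10:
        raise ValueError("each exam has exactly 10 exercises or problems")
    # Round to one decimal to hide floating-point noise from adding 0.3s.
    return round(sum(POINTS[o] for o in outcomes), 1)


# Seven correct items, two with a minor calculation mistake, one wrong.
mark = exam_mark(["correct"] * 7 + ["minor_slip"] * 2 + ["wrong"])
print(mark)  # 7.6
```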

Organization and scheduling
The program was scheduled to run over 20 sessions for each course. Two peer tutoring sessions took place each week for 10 weeks, and the interactions between peers lasted 20 to 25 minutes. These sessions were held between the Algebra 1 exam and the Algebra 2 exam, that is, during the development of unit 7 (Algebra 2). All sessions took place during school hours. This scheduling was planned following the implications for practice given by Leung's meta-analysis of peer tutoring in mathematics and reading (Leung, 2015) in order to maximize students' academic performance. According to this author, structured tutoring, unstandardized testing to control for author bias, student training without highly frequent weekly training sessions, no formation of competing teams for group reward and no regular formation of new teams are recommended to obtain the largest effect sizes in this type of experience.
Selection and distribution of peers, students' interactions, training, materials and resources were designed following the criteria and indications developed by Alegre Ansuategui and Moliner Miravet (2017). According to these authors, the teacher must supervise the interactions between students. If one of the tutors is not able to help his/her peer, the teacher must help both the tutor and the tutee to complete the exercise or the problem. Students are ordered in a list according to their pretest marks. Then, the list is split into two halves. Students in the first half act as tutors and those in the second half act as tutees. The most competent tutor is paired with the most competent tutee, and so on until the end of the list (Cockerill, Craig, & Thurston, 2018). The materials used in each peer tutoring session (worksheets, textbooks…) were the same ones the students worked with during the rest of the academic year. A worksheet comparable to those used throughout the school year was handed out to the students for each peer tutoring session. That worksheet contained two exercises, or an exercise and a problem, depending on the session. The complexity of the exercises and problems varied according to the worksheet. Extra activities were prepared in case some pairs finished the worksheet much earlier than the rest of the class. All students were trained before the implementation of the program. The classroom dynamic was based on two main ingredients: respect and patience. Apart from understanding that the relationship with their peers was based on these two factors, students were told how to interact during the tutoring time. The teacher checked that the tutor had the right result. After that, the tutor had to ask the tutee about his/her result once he/she had finished. If it was right, the tutee had to explain to the tutor the procedure he/she had followed to solve it. If it was wrong, the tutor had to help the tutee with step-by-step explanations so that he/she could finish the exercise or the problem. The tutee was allowed to ask his/her tutor for help when needed, but always on the basis of individual work and perseverance, trying to do as much as he/she could to reach the right result.
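The pairing procedure described above (rank by pretest mark, split the list in half, pair the halves in order) can be sketched in a few lines of Python. The function name and the sample marks are hypothetical. The sketch also makes two assumptions that the text does not address: ties keep their input order, and with an odd number of students the lowest-ranked student is left unpaired.

```python
def pair_students(pretest_marks):
    """Pair tutors and tutees from a dict mapping student name to pretest mark.

    Students are ranked from highest to lowest mark, the top half act as
    tutors, the bottom half as tutees, and the halves are paired in order.
    """
    ranked = sorted(pretest_marks, key=pretest_marks.get, reverse=True)
    half = len(ranked) // 2
    tutors = ranked[:half]
    # Assumption: with an odd class size the last student stays unpaired.
    tutees = ranked[half:2 * half]
    return list(zip(tutors, tutees))


# Hypothetical class of six students with made-up pretest marks.
marks = {"A": 9.0, "B": 7.5, "C": 6.0, "D": 4.0, "E": 3.0, "F": 1.5}
print(pair_students(marks))  # [('A', 'D'), ('B', 'E'), ('C', 'F')]
```

Under this scheme every pair is separated by roughly the same number of rank positions, so the gap in prior achievement between tutor and tutee stays similar across the class.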

Data analysis
SPSS software (version 25) was used to analyze the data collected from the exams. Means, standard deviations and Student's t-test (95% confidence level) were used to determine whether there were any significant differences between the experimental and control groups, by grade and globally, before and after the implementation of the peer tutoring program (Efron, 1969). Hedges' g was calculated and used as the measure of effect size for 7th grade, for 8th grade and for both grades combined, using the formula proposed by Rosenthal and Rubin (1982) for a pretest-posttest control group design. Besides, a quantitative descriptive analysis was carried out in order to determine how many students had improved or gotten worse with the peer tutoring implementation.
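As an illustration of how an effect size for a pretest-posttest control group design can be computed, the following Python sketch uses one common formulation: the difference between the mean gains of the two groups, divided by the pooled pretest standard deviation and multiplied by a small-sample correction factor. This is an assumption made for illustration and is not confirmed to be the exact formula the authors applied. The function name is hypothetical, and the means and standard deviations in the example are invented; only the group sizes (200 experimental and 180 control students) come from the Participants section.

```python
import math


def hedges_g_ppc(pre_t, post_t, sd_pre_t, n_t, pre_c, post_c, sd_pre_c, n_c):
    """Bias-corrected effect size for a pretest-posttest control group design.

    Assumed formulation: (gain of treatment group - gain of control group)
    divided by the pooled pretest SD, times a small-sample correction.
    """
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_pre_t ** 2 + (n_c - 1) * sd_pre_c ** 2) / df
    )
    correction = 1 - 3 / (4 * df - 1)
    gain_difference = (post_t - pre_t) - (post_c - pre_c)
    return correction * gain_difference / pooled_sd


# Invented summary statistics (NOT the study's data); group sizes from the text.
g = hedges_g_ppc(pre_t=5.0, post_t=6.0, sd_pre_t=2.0, n_t=200,
                 pre_c=5.0, post_c=5.0, sd_pre_c=2.0, n_c=180)
print(round(g, 3))
```

With samples of this size the correction factor is very close to 1, so the result is almost identical to the uncorrected standardized difference in gains.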

Results
Quantitative descriptive results for the experimental and control groups are shown in Tables 1 and 2. Table 1 shows means, standard deviations (SD) and number of students (n) by grade (7th or 8th), group (experimental or control) and phase of the study (pretest or posttest). Table 2 shows the number of students that increased or decreased their scores from the pretest to the posttest. Table 3 shows all the inferential statistics that were carried out in order to find any differences between groups. Mean differences between groups (X̄_A − X̄_B) and Student's t-test with its significance level (t (sig.)) are reported. Asterisks indicate tests that showed statistically significant differences (p < .05). No statistically significant differences were found for the pretest between groups (tests 1 and 2). Statistically significant differences were found between the pretest and the posttest for the experimental groups in both 7th and 8th grade, separately and combined (tests 3 to 5). No statistically significant differences were found between the pretest and the posttest for the control groups in 7th grade, in 8th grade or combined (tests 6 to 8). The increment, that is, the mean difference between the pretest and the posttest for each group, was also analyzed. Statistically significant differences were found in all cases between the experimental and control groups (tests 9 to 11). No statistically significant differences were found for the increment between the 7th and 8th grade experimental groups (test 12). A Hedges' g effect size of 0.31 was reported for the 7th grade intervention and 0.32 for the 8th grade intervention. The global Hedges' g effect size for the experience was 0.48.
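The kind of descriptive count reported in Table 2 can be sketched as follows in Python. The function name and the paired scores are hypothetical and do not reproduce the study's data. The sketch also reports students whose mark did not change as a separate category, which is an assumption, since the text only mentions increases and decreases.

```python
def change_summary(pre, post):
    """Count students whose mark rose, fell or stayed the same.

    `pre` and `post` are equal-length lists of marks, one pair per student.
    """
    if len(pre) != len(post) or not pre:
        raise ValueError("pre and post must be non-empty and of equal length")
    improved = sum(after > before for before, after in zip(pre, post))
    decreased = sum(after < before for before, after in zip(pre, post))
    return {
        "improved": improved,
        "decreased": decreased,
        "unchanged": len(pre) - improved - decreased,
        "pct_improved": 100 * improved / len(pre),
    }


# Invented marks for eight students (NOT the study's data).
pretest = [4.0, 5.3, 6.0, 7.0, 3.3, 8.0, 5.0, 6.6]
posttest = [5.0, 6.0, 5.3, 8.0, 4.3, 8.0, 6.3, 7.0]
print(change_summary(pretest, posttest))
```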

Discussion
Statistically significant improvements were found for the experimental group. Hence, hypothesis 1 was not rejected. These improvements are consistent with previous peer tutoring experiences in Mathematics in middle school. Kroeger and Kouche (2006), Thomas, Bonner, Everson and Somers (2015), Capp, Benbenishty, Astor and Pineda (2018) and Song, Loewenstein and Shi (2018) reported academic benefits with 7th and 8th graders in Mathematics during their peer tutoring experiences, although algebra contents were not addressed in their studies. As stated in the introduction section, previous peer tutoring studies in algebra had not reported academic gains (Allsopp, 1997; Walker, Rummel, & Koedinger, 2014). Nevertheless, although not specifically for algebra, existing meta-analyses and literature reviews had usually highlighted the academic benefits of peer tutoring in Mathematics in middle school (12 to 15 years old) or secondary education (12 to 18 years old) (Gersten et al., 2009; Kroesbergen & Van Luit, 2003; Stevens, Rodgers, & Powell, 2018). Although the previous experiences in algebra were not satisfactory, some improvement in academic achievement could be expected from this intervention given the educational context of the analyzed subject and grades. In this sense, the statistically significant differences found in this study are consistent with the qualitative and quantitative information given by some reviews of peer tutoring in Mathematics (Alegre et al., 2019; Mulcahy, Krezmien, & Travers, 2016), which state that a majority of studies reported statistically significant improvements. Besides, the global effect size found for the experience is very similar to the ones reported in the meta-analyses mentioned above. It can be interpreted as a close to medium effect size (Fritz, Morris, & Richler, 2012) and implies that a considerable academic improvement has taken place from a quantitative perspective.
The majority of students in the experimental group improved their marks from the pretest to the posttest. Hence, hypothesis 2 was not rejected. The percentage of students that improved thanks to peer tutoring (87%) is quite similar to that of many recent studies in the field of peer tutoring in Mathematics not referring to algebra (Hrastinski, Stenbom, Benjaminsson, & Jansson, 2019; Yaman, 2019; Zunaiedy, Syahputra, & Panjaitan, 2019). This can be considered strong evidence of the benefits of peer tutoring in academic terms (Tsuei, 2017). Nevertheless, it must be taken into account that, although this methodology usually increases the academic achievement of a majority of students, there is always a percentage of students (usually between 10 and 20 percent) whose achievement decreases with its implementation. In this study, 13% of the students decreased their academic achievement. Authors such as Sanchez et al. (2015) state that this may be because some students do not like to help their peers or to be helped by them, so cooperative methodologies such as peer tutoring may result in a decrease in academic engagement. Besides, Moeyaert, Klingbeil, Rodabaugh and Turan (2019) noted, in a peer tutoring meta-analysis conducted for different subjects such as English or Mathematics, that as tutees repeatedly receive help from their peers (tutors), their academic self-concept may decrease, causing some tutees to feel insecure during the exams when they are left alone. Moreover, Malone, Fodor and Hollingshead (2019) and De Smedt, Graham and Van Keer (2019) state that during peer tutoring experiences in secondary education, occasional clashes between tutors and tutees due to role differences might affect the academic outcome.
No statistically significant differences were found between 7th and 8th grade in the experimental group. Hence, hypothesis 3 was rejected. This finding is consistent with previous research in the peer tutoring field for different subjects by Sailors and Shanklin (2010) and Colver and Fry (2016). These authors did not find differences by grade in their experiences. In this regard, the meta-analysis by Leung (2015) of peer tutoring in several subjects and the review by Robinson, Schofield and Steers-Wentzell (2005) of peer tutoring specifically in Mathematics state that outcome differences may be found between educational levels, but rarely within the same educational level. They state that peer tutoring studies in primary education usually show higher effect sizes than secondary education or college studies. Nevertheless, studies within each educational level tend to be homogeneous, so it is difficult to find differences by grade within the same educational level.

Limitations of the Study
Although the results of this study may be promising, there are several limitations that must be taken into account when considering them. First of all, although a sample of 380 students may be representative of the population of middle school students in Spain, it is not enough evidence to support the claim that peer tutoring will be effective when teaching algebra in middle school. Future studies in the field will need to determine the suitability of this methodology in different countries and cultures in order to address its universal validity in this context. Besides, selecting the participants by means of convenience sampling may also have compromised the validity of the study. The fact that the teacher who implemented the program had previous experience in the field must also be taken into account, as researchers new to the field may face several unexpected difficulties during future implementations (students' reluctance, organizational problems…).

Conclusion
The main conclusion that can be drawn from this study is that same-age and fixed peer tutoring may be beneficial for middle school students in algebra from an academic perspective. Although previous peer tutoring studies in algebra did not report strong evidence of its academic potential, the results of this study are consistent with those in the Mathematics peer tutoring field for middle school and secondary education. Moderate effect sizes for academic achievement as well as statistically significant improvements should be found when implementing peer tutoring in algebra in middle school. The percentage of students that improved their marks (87%), as well as the statistical significance of the different tests that were carried out during this research, constitutes strong evidence of the potential of this methodology. In this sense, it is expected that a great majority of students will improve their academic achievement in algebra because of peer tutoring. Nevertheless, the fact that a considerable number of students (13%) decreased their academic achievement after the implementation of the peer tutoring program must be considered. It must be taken into account that peer tutoring does not work for all students and that the partial removal of teacher feedback during the classes may have negative academic consequences for some students. Clashes between members of a pair, or reluctance to help or receive help from a peer, may also appear and negatively affect the academic outcome. It can also be concluded that the academic effect of peer tutoring in algebra for 7th and 8th graders should not differ significantly. In this sense, moderate effect sizes and improvement rates may be found in middle school for all grades during this type of experience.