My Commentary to the Miracle Classroom Solution: TFA
Limitations to George Noell’s
TFA Teachers’ Contribution to Achievement
Mercedes K. Schneider, Ph.D.
Applied Statistics and Research Methods
December 7, 2012 (updated December 16, 2012)
In 2009, George Noell and Kristin Gansle completed a study “examining the degree to which students who were taught by Teach for America (TFA) corps members exceeded, met, or failed to meet the educational attainment that would be projected for them based on prior achievement and demographic factors” (pg. 3, http://www.nctq.org/docs/TFA_Louisiana_study.PDF). In this paper, I comment on several limitations of Noell’s study and offer salient conclusions at the end.
In his work, Noell uses archived data for grades 4 through 9 to compare TFA teachers to three other groupings of “non-TFA” teachers: 1) “non-TFA” teachers in general (no additional specification); 2) “non-TFA” new teachers (first or second year in the classroom); and 3) “non-TFA” experienced teachers (third to tenth year in the classroom). I use the term “non-TFA” teachers since Noell provides no additional evidence regarding the remaining teachers, only that these teachers were part of “intact longitudinal databases linking students, teachers, and courses for the 2004-2005, 2005-2006, and 2006-2007 school years” (pg. 5). Likely these teachers were mostly those trained in traditional teacher preparation programs; however, there may have been other teachers who received alternative or other provisional certifications.
“Controlling for Experience”
Noell also includes an analysis where he “controls for years of experience” to see if there is a significant difference between TFAers and other, “non-TFA” classroom teachers when experience is “equalized.” I find this last analysis strange since, in essence, the goal is to test whether there is a significant difference between TFAers and “non-TFA” teachers if one is able to “erase” experience and test what is “left.” This ends up being a test of “something else but not experience.” One cannot reduce this test to more than that. I consider it a nonsense test.
Details Regarding TFA Data Set
Aside from the “non-TFA” issue noted above, Noell mentions some limitations of the available TFA data. The TFA sample begins with 350 teachers, but the end sample is 127. Noell notes that the TFA sample was reduced from 350 to 271 because 79 TFAers could not be verified as “teachers of record.” This is a decrease of nearly 23% of the sample. He does not attempt to answer why these TFAers were not considered “teachers of record.” Was this due to some assessment that these TFAers did not exhibit proper classroom control, for example? One does not know. Moreover, were other teachers in the room with the TFAers even if the TFAers were “teachers of record”? Such a situation would confound results: multiple teachers in a classroom preclude attributing the result to a single teacher. Noell does not investigate these issues. In the end, the TFA sample is 127 for this study since only teachers of testable grades (4 through 9) and subjects (ELA, reading, math, social studies, and science) could participate in the study. (This paragraph edited 12/16/12.)
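Re-doing the arithmetic on the counts Noell reports makes the scale of the attrition plain. A minimal sketch, using only the three sample sizes quoted above:

```python
# Sample attrition in the TFA data set, using the counts reported in the study.
initial = 350    # TFA teachers at the start
verified = 271   # verified as "teachers of record"
final = 127      # taught testable grades (4-9) and subjects

drop_verification = (initial - verified) / initial
drop_total = (initial - final) / initial

print(f"Lost at verification step: {drop_verification:.1%}")  # roughly 23%
print(f"Lost overall:              {drop_total:.1%}")         # roughly 64%
```

Nearly two-thirds of the original TFA sample is absent from the final analysis, which is why the unanswered questions about who was excluded, and why, matter.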
“Nonsignificant” Means “Not Significant”
Another problematic issue is Noell’s ignoring the implication of nonsignificant results. A nonsignificant statistical result means that the observed differences between the TFA group and the other teacher groups cannot be distinguished from chance. In other words, there is no detectable difference between groups beyond random fluctuation. If one group has a higher score than another, in the face of nonsignificance, the difference is statistically trivial. However, Noell insists upon comparing nonsignificant differences and reporting, for example, that “the coefficients for TFA corps members were positive when compared to experienced teachers” (pg. 15). This is slanted reporting. Noell implies that the nonsignificance is simply due to small sample sizes. Yes, sample size affects significance, yet Noell cannot know that significance would have emerged “if only there had been more TFAers to measure.”
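The relationship between sample size and significance is easy to demonstrate, and it cuts both ways. A minimal sketch with hypothetical numbers (not the study’s data): the same fixed, small group difference of 0.1 standard deviations is nonsignificant with small samples yet “significant” with very large ones, which is precisely why significance cannot simply be presumed for a larger sample that was never measured.

```python
from scipy.stats import ttest_ind_from_stats

# Hypothetical summary statistics: a fixed difference of 0.1 SD between groups.
effect, sd = 0.1, 1.0

# Small samples (30 per group): the difference is nonsignificant.
_, p_small = ttest_ind_from_stats(effect, sd, 30, 0.0, sd, 30)

# Large samples (2,000 per group): the identical difference tests "significant."
_, p_large = ttest_ind_from_stats(effect, sd, 2000, 0.0, sd, 2000)

print(f"p with n=30 per group:   {p_small:.3f}")
print(f"p with n=2000 per group: {p_large:.4f}")
```

The effect itself never changed; only the sample size did. Claiming that a nonsignificant positive coefficient “would have been” significant with more TFAers is speculation, not a result.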
Noell’s results show NO significant results for TFAers outperforming experienced, certified teachers in any of the five areas: ELA, reading, math, social studies, or science, for grades 4 through 9. Keep in mind that HLM (the analysis used in this study) cannot separate the teacher from the class. Thus, what is really being tested is the class, not the teacher. HLM can compare classes, but it cannot tease out the influences inside the class.
Noell’s results show NO significant results for TFAers outperforming teachers in general.
Noell’s results show significant differences ONLY between TFAers and new teachers (first or second year) in ELA, reading, math, and science. Keep in mind the lack of clarification in the TFA sample regarding the presence of another teacher in the room.
Isolating Subject Area Contribution
The “positive” results Noell cites on page 8 presume that student learning can be isolated per subject area. In truth, teachers are encouraged to assume a cross-curricular approach to teaching. This week in my English II class, I taught some psychology, biology, chemistry, history, and German. Some weeks I include math in the English lesson. I taught some Greek and Hebrew a few weeks ago. As my life experience increases, my ability to contribute to cross-curricular education also increases. There is no way to assess such complexity using the standardized tests in this study. In reality, any information I teach that is not English cannot be credited to my class; it is instead confounded with the test results of teachers of those other subjects.
Results Cannot Generalize to
All Grades K – 12
This study utilized data on grades 4 through 9. Therefore, the results can only extend to grades 4 through 9. One cannot generalize the results of this study to grades K through 3 or 10 through 12.
Finally, as Noell correctly notes, TFAers do not remain in teaching beyond two years. Therefore, it is not possible to compare “experienced” TFAers with experienced certified teachers. This attrition can be costly for school districts and the teacher turnover unsettling to the school atmosphere. No long-term study has been conducted to see the impact of TFA turnover on both school climate and the sustainability of educational gains. Ironically, the TFA program depends upon experienced teachers to train the TFA teachers. Due to TFA attrition in the classroom, and in the absence of experienced teachers, who will train TFAers? For a true and clear comparison of TFAers to traditionally-trained teachers, the TFA sample cannot be trained using teachers from traditional teacher-prep programs. (This paragraph edited 12/16/12.)
Noell’s study does not support the argument that experienced, credentialed teachers can be successfully replaced by undertrained TFAers. All results in this regard cannot be distinguished from chance. None was statistically significant.
The analysis used in this study and in value-added modeling, Hierarchical Linear Modeling (HLM), cannot isolate the contribution of the teacher from his/her class. Teacher contribution is confounded with a myriad of classroom influences. Given the limited information concerning the attrition in the TFA dataset, one such classroom influence for some TFAers could be the presence of a second, experienced teacher in the room. A second teacher in the class would alter the classroom dynamic and confound results.
The idea that TFAers outperform experienced, credentialed teachers is confounded with the fact that TFA depends upon experienced, credentialed teachers to train their recruits to enter the classroom. By design, TFAers are not likely to train each other since most TFAers do not persist in the classroom beyond two years. No study has been done to examine the impact on educational quality of TFAers training each other in the absence of any connection or influence of traditionally trained teachers; that is, of second-year (or other former) TFAers training first-year TFAers. (This paragraph updated 12/16/12.)
I have heard this study is cited as indisputable evidence of the effectiveness of TFA over traditional teacher training. Such citing is poor practice in the absence of replication of the results and expansion of study criteria to include untested grades (K – 3 and 10 – 12). Replication and expansion are critical given the importance placed on studies such as this in making high-stakes, community-altering policy decisions of late.
Ignoring the points in this review and instead promoting the shaky, biased interpretation of the study results as written reveals the absence of true concern for what is profitable for stakeholders in public education. May those with the influence to do so judiciously apply the results of this commentary.