Intention-to-Treat Analysis: A Primer for Those Not In “The Stats Club”
In their recent study, Student Attainment and the Milwaukee Parental Choice Program: A Final Follow-Up Analysis, Patrick Wolf and his fellow researchers use a statistical approach known as intention-to-treat (ITT) analysis. In this post, I will explain the analysis, discuss its original intended usage, and discuss its application to Wolf et al.’s work, including limitations.
This post is meant to serve as a not-too-scary primer on ITT.
The Intention-to-Treat (ITT) analysis originates with medical research. In ITT, the researchers “intend” to treat patients in the study according to any number of treatment regimens with the goal of addressing a single medical issue (e.g., cancer). Studies might also include a control group (a group receiving no treatment).
Even though the researchers “intend” to treat patients in the study, patients do not always follow the treatment protocol, or they may have a reaction that requires discontinuing treatment. Researchers could exclude such “noncompleters” from the study; however, doing so would disturb the randomized design of the study (random assignment to groups helps ensure that the groups are balanced on demographic characteristics, such as age, gender, ethnicity, income, or geographic region).
Furthermore, in the clinical studies, those who complete the entire regimen of treatment (i.e., those who follow through “per protocol”) tend to fare better with the medical outcome than those who do not follow through, even when the treatment is a placebo. Keep in mind that we are discussing medical research related to serious life issues (in this case, coronary issues). Thus, including noncompleters in the overall analysis prevents the treatment regimen from appearing “overly promising.” However, if the number of patients not adhering to the treatment is large, the results can be biased against the treatment as it was actually completed. Hence the suggestion to use ITT rather than per-protocol analysis to assess a treatment’s utility; however, there is not complete agreement on this point.
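To make the contrast concrete, here is a minimal numerical sketch in Python. Every figure in it is invented for illustration (no actual trial data): the point is only the mechanics. Noncompleters dilute the ITT estimate of the treatment effect, while dropping them inflates the per-protocol estimate, because in this toy setup the completers were healthier to begin with.

```python
# Toy numbers (invented, not from any study):
p_complete = 0.60        # 60% of the treatment arm completes the regimen
p_base = 0.40            # recovery probability for less-healthy patients
p_healthy_bonus = 0.15   # completers are healthier at baseline
p_treat_effect = 0.10    # true effect of actually completing the treatment

# Expected recovery rates for each kind of patient.
treat_completers = p_base + p_healthy_bonus + p_treat_effect   # 0.65
treat_noncompleters = p_base                                   # 0.40
# Control arm has the same mix of healthier/less-healthy patients.
control = p_complete * (p_base + p_healthy_bonus) + (1 - p_complete) * p_base

# ITT: everyone ASSIGNED to treatment counts, completer or not.
itt_rate = (p_complete * treat_completers
            + (1 - p_complete) * treat_noncompleters)

itt_effect = itt_rate - control                   # diluted by noncompleters
per_protocol_effect = treat_completers - control  # inflated by healthier completers

print(f"True treatment effect: {p_treat_effect:+.2f}")   # +0.10
print(f"ITT estimate:          {itt_effect:+.2f}")       # +0.06
print(f"Per-protocol estimate: {per_protocol_effect:+.2f}")  # +0.16
```

The sketch shows both distortions at once: ITT understates the true effect when many patients do not complete (the “biased against” concern above), while per-protocol overstates it when completers differ systematically from noncompleters.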
Now, let’s turn our attention to the use of ITT in Wolf et al.’s study on vouchers. First, the voucher study is not a randomized design. Thus, the original purpose of ITT, to preserve the randomized design of a study, is not applicable. Instead, their study uses what is known as matching. That is, for each of the 801 students who accepted vouchers in ninth grade in 2006, Wolf and his colleagues attempted to find a match among the public school attendees; this matching was done on 1) neighborhood, 2) test scores, and 3) for some students, propensity scores (the estimated likelihood of being a voucher student, even though the student was not one). Matching is an alternative to randomization; however, matching does not make for as robust a study as randomization does. Even if Wolf et al. had matched voucher students with public school students on a list of 20 characteristics, or 50, or 100, there is still the possibility that the researchers failed to match on a key characteristic, one affecting the outcome of the study.
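For readers unfamiliar with matching, here is a toy sketch of the general idea. The students, neighborhoods, and test scores below are invented for illustration, not the study’s data, and the rule shown (nearest test score within the same neighborhood, without reusing a match) is only one simple variant of the technique.

```python
# Invented records: two voucher students, three public-school candidates.
voucher = [
    {"id": "V1", "hood": "North", "score": 71},
    {"id": "V2", "hood": "South", "score": 58},
]
public = [
    {"id": "P1", "hood": "North", "score": 69},
    {"id": "P2", "hood": "North", "score": 80},
    {"id": "P3", "hood": "South", "score": 60},
]

matches = {}
used = set()
for v in voucher:
    # Candidates: same neighborhood, not already matched to someone else.
    pool = [p for p in public
            if p["hood"] == v["hood"] and p["id"] not in used]
    # Pick the candidate with the closest test score.
    best = min(pool, key=lambda p: abs(p["score"] - v["score"]))
    used.add(best["id"])
    matches[v["id"]] = best["id"]

print(matches)  # {'V1': 'P1', 'V2': 'P3'}
```

Even in this tiny sketch, the worry raised above is visible: the pairs agree on neighborhood and score, but nothing guarantees they agree on an unmeasured characteristic (say, parental involvement) that also drives the outcome.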
Second, Wolf et al.’s study is not “treating” anything. In medical research, the study has a clearly identifiable outcome to be overcome via the treatment methods, be it obesity, or migraines, or lymphoma. What, then, is the “outcome” being “treated” in this educational study? One focus of the study is on-time (four-year) graduation rates. Another is postsecondary enrollment. Now, Wolf and his colleagues might assert that voucher use is the “treatment” for these two “outcomes.” However, the treatment must be clearly and directly connected to the outcome in order for ITT to work as intended, just as it works in medical research. It is quite a stretch to assume that a voucher that a student accepted (then rejected, mind you) at some point during four years of high school had a clear and direct effect upon either graduation or subsequent college enrollment. In his NEPC review of Wolf et al.’s work, Casey Cobb alludes to this lack of an established direct connection when he writes, “…the research design is not robust enough to inform the reader about the causal effects of a voucher program.”
A third issue involves what ITT use allows Wolf et al. to conceal about their study. It is all too convenient to note that neither a 75% nor a 56% attrition rate matters if one uses ITT. Even the lower of the two numbers, 56%, still means that over half of the students accepting vouchers as freshmen in 2006 relinquished those vouchers and returned to public school. Since the voucher program is being promoted as an “opportunity” for concerned parents to have their children exit public school, an ethical researcher should investigate the high rate of return of the voucher students to the public schools. To continue to peddle vouchers as an exit from public school is misleading if one knows that more than half will return.
The Sage Handbook of Social Science Methodology uses a medical example to illustrate the need to pay attention to what might be termed the “common sense” issues requiring analysis. In the example, two treatments are being compared: castor oil and castor oil plus surgery. The doctors of the castor oil group decide that castor oil is useless and (ethically so) move their patients to the surgery group. The writer comments:
In addition to analyzing the data as intent to treat, there is another analysis that we should be doing here. We should simply count the number of patients who ended up receiving each kind of treatment. When we discover that almost all patients were switched away from castor oil, this tells us a lot about what their physicians thought of the castor oil treatment. It may also be profitable to run an analysis on groups “as treated” and to present that result as well as the intent-to-treat result. [Emphasis added.]
To extend the above advice to the voucher study: “When we discover that over half of the voucher students returned to public school, this tells us a lot about what the students thought of the voucher program. It may also be profitable to run an analysis on voucher students as “noncompleters” and “completers” and to present that result as well as the intent-to-treat result.”
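The counting step in that advice is trivial to carry out. Here is a small Python sketch using only figures already cited in this post (801 students, the lower 56% attrition figure); the outcome rates themselves are deliberately left blank, since an as-treated analysis would have to measure them, and I am not inventing results.

```python
# Counting who ended up "as treated," per the Sage advice quoted above.
# The 801 and 56% figures come from this post; the rest is arithmetic.
started_with_voucher = 801          # ninth graders accepting vouchers in 2006
returned_to_public = round(0.56 * started_with_voucher)  # lower attrition figure
completers = started_with_voucher - returned_to_public

print(f"Voucher completers:          {completers}")        # 352
print(f"Returned to public schools:  {returned_to_public}")  # 449

# An as-treated analysis would then report outcomes (e.g., on-time
# graduation) separately for each group; those rates would have to be
# measured from the data, so they are left empty here.
as_treated = {
    "completers":    {"n": completers,         "grad_rate": None},
    "noncompleters": {"n": returned_to_public, "grad_rate": None},
}
```

Even this bare tally makes the headline point: the “returned to public schools” row is the larger of the two, which is exactly the fact an ITT-only presentation lets slide past the reader.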
It is possible that the voucher “completers” comprise more highly motivated students than do the voucher “noncompleters.” Wolf and his colleagues allude to as much:
As we are reporting elsewhere in an academic journal (Cowen, et al. forthcoming), there is evidence that the students who leave MPCP [Milwaukee Parental Choice Program] for public schools are among the lowest performing private school students. [Emphasis added.]
If those who stay in the voucher program are indeed the more motivated, higher-performing students, parents need to know this. Parents certainly do not need to be given some false hope that their C/D performing child is likely to become an A/B student simply by being given a voucher. In order to make educated choices for their children, parents should be apprised of any finding that vouchers are not necessarily for all students, and pro-voucher officials and organizations should cease to promote them as such.
As for the use of ITT in voucher research: I consider it a poor choice because the ITT result shelters the study outcomes from practical application. Who, exactly, is exiting the voucher program, when, and why? Who is most likely to follow through using vouchers for four years of high school? I don’t think any parent accepts a voucher for his/her child in the hopes that the child will return to the public school whence the child came. Nevertheless, this is exactly the outcome apparently hidden behind the boldly proclaimed use of ITT.