Statistical Evidence. Chapter 2. Who Was Left Out?
[This is the first draft of Chapter 2 of "Statistical Evidence."]
2.0 Introduction
Research studies often have a narrow focus, but sometimes it can be too narrow. When too many patients are left out, those who remain may not be representative of the types of patients you will encounter.
Case study: Nicotine patches
In a study of teenage smokers (Smith 1996), researchers recruited 22 volunteers from five public high schools in the Rochester, MN area for participation in a smoking cessation program, involving behavioral counseling, group therapy, and nicotine patches. Researchers measured the number of cigarettes smoked, side effects, and blood levels of nicotine.
The purpose of the research was to evaluate "the safety, tolerance, and efficacy of 22 mg/d nicotine patch therapy in smokers younger than 18 years who were trying to stop smoking." The authors also listed a secondary goal, "to compare blood cotinine levels, nicotine withdrawal scores, and adverse experiences with those of adults obtained in previous patch studies." Cotinine is a metabolite of nicotine and provides a useful objective measure of cigarette smoking. It also allowed the authors to examine whether nicotine toxicity was an issue.
This study did not include major segments of the teenage smoking population. The study included only white subjects because there were too few minority students in the Rochester area. Subjects had to get parental permission, excluding smokers who wished to keep their habit secret from their parents. Subjects were also volunteers, and thus could be considered more motivated to quit than the typical teenage smoker.
The study also had a serious dropout rate. Of the presumably thousands of teenage smokers in the Rochester, Minnesota area, only 71 volunteers responded to the initial call for subjects. Of the 71 volunteers, 55% met inclusion criteria. Of the remaining 39, 44% declined to attend the initial meeting. Of the remaining 22, 14% were non-compliant. Of the remaining 18, 39% failed to respond to the one year survey. Only 11 completed the entire study (50% of those who started the study; 28% of those meeting inclusion criteria; 15% of the initial volunteers).
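The closing percentages are easy to recompute from the counts. The sketch below does so; the counts come from the paragraph above, while the stage labels and the function name are only illustrative.

```python
# Stage counts as reported in the paragraph above (Smith 1996).
stages = [
    ("initial volunteers", 71),
    ("those meeting inclusion criteria", 39),
    ("those who started the study", 22),
    ("those who completed the entire study", 11),
]

def completion_rates(stages):
    """Return the final count as a rounded percentage of each earlier stage."""
    final_count = stages[-1][1]
    return {label: round(100 * final_count / count) for label, count in stages[:-1]}

rates = completion_rates(stages)
for label, pct in rates.items():
    print(f"{pct}% of {label} completed the study")
```

This gives 15%, 28%, and 50%, matching the figures quoted in the text. The lesson is that the answer to "what fraction finished?" depends heavily on which denominator you choose.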
This study had a serious problem with who was left out. The large number of subjects who did not get into the study or who did not complete the study makes it hard to generalize the findings of this research.
Who was left out? What to look for.
When you are trying to figure out who was left out and what impact this has, ask the following questions:
Who was excluded at the start of the study? In a desire to create a nice clean homogenous research study, researchers may apply rigid and unrealistic entry criteria. The patients excluded can often have a different prognosis than those who make it into the study. This exclusion can make it difficult to extrapolate to the types of patients that you normally see.
Who refused to join the study? Almost all research involves the informed consent of volunteers. Many potential patients can and do refuse to participate in research studies. This can dramatically affect the results of the research.
Who dropped out during the study? Not everyone who starts out in a research study will finish it. Volunteers always have the option of withdrawing their consent to participate at any time, and some patients will miss their follow-up appointments because they moved or they just plain forgot. If these dropouts have a different prognosis, then ignoring them can distort the results.
Who stopped or switched therapies? If there are compliance issues, handle the non-compliant patients carefully. Patients who have problems with compliance will also often have trouble with other self-care habits and thus be at greater risk for adverse outcomes. Excluding non-compliant patients can lead to some serious biases.
2.1 Who was excluded at the start of the study?
Researchers, trying to minimize variation, will use exclusion criteria to create more homogenous groups. Ask yourself the question, "How similar are my patients?" If it is difficult to extrapolate results from a very tightly controlled and homogenous clinical trial to the variation of patients seen in your practice, then the research has limited value to you.
There is a tension between minimizing variation and maximizing generalizability (Godwin 2003; Siderowf 2004). The trials with minimal exclusion criteria are called pragmatic trials and are intended to measure effectiveness, the ability of a drug or therapy to work under very general conditions. Trials with stricter exclusion criteria develop a more narrowly drawn population that can measure efficacy, the ability of a drug or therapy to work under ideal conditions. It is important to know both efficacy and effectiveness, and it is important to establish efficacy under ideal conditions before trying to demonstrate effectiveness under more general conditions.
There are three very common and very serious exclusions in medical research that deserve special attention: exclusion of elderly patients, exclusion of women, and exclusion of children.
Exclusion of elderly patients
If you are elderly, pat yourself on the back. Your demographic group drives the healthcare economy. You are, by far, the largest consumers of new medications and new therapies. Yet, far too often, these new medications and new therapies are tested on patients much younger (Bayer 2000).
There's a simple reason for this exclusion. When researchers design their experiments, they want a nice clean sample.
Researchers want patients who are ill with one and only one disease. But with older people, several things will break down at the same time (Schellevis 1993).
Researchers don't want patients who are taking a lot of other medications. But older people take so many different drugs that they often qualify for bulk discounts at the local drug store.
Finally, researchers want patients who are likely to stay alive for the duration of the research study. But older people are likely to die from conditions unrelated to the disease being studied.
Although the reasons for excluding elderly patients are understandable, they are still not justifiable. Research done on younger patients cannot be easily generalized to older patients.
Exclusion of women
Several decades ago, there was a large study of aspirin as a primary prevention against heart attacks (Physicians Health Study Research Group 1989). This study recruited over 20 thousand physicians and asked them to take either a small dose of aspirin every day or take a placebo. They had to follow these physicians for five to ten years because they wouldn't cooperate and have heart attacks faster. At the completion of the study, the researchers announced that aspirin was highly successful at preventing heart attacks.
There was one major problem with the research sample, though. Every single one of the physicians studied was male. Not a single female was included in the sample. It's not as if this were a problem only for men. Heart disease kills more women than any other condition.
There are some legitimate concerns when testing drugs that might harm a developing fetus, but you can handle this with careful restrictions to women who are not sexually active and/or who are using an effective form of birth control. In addition, some conditions, such as prostate cancer, cannot be studied in women.
There is some dispute over whether gender bias exists, with one study arguing that it still occurs (Ramasubbu 2001) and another arguing it does not (Meinert 2001). When the exclusion of women occurs, it raises troubling questions and hinders your ability to generalize the results of the research.
Exclusion of children
At the opposite extreme from the elderly are children. This group, sadly, is also left out too often from the benefits of research.*
Children are not little adults. The liver in a child will process drugs quite differently from the liver of an adult. The nutritional demands of a growing child are quite different than those of a fully grown adult. And if you thought that your children became unpredictable as they went through puberty, try looking at them from a medical perspective!
No one wants to see our children used as guinea pigs, and there are special ethical reviews and safeguards that we must comply with when we study children.
Our failure, however, to study children in a careful controlled setting will end up subjecting all children to a large and uncontrolled experiment with no prospect of learning which treatments are safe for children and which ones are harmful.
Excluding troublemakers
A new trend in medical research is to treat all patients with a placebo for a short time and then exclude from the study anyone who responds too strongly to the placebo. The idea is that if you remove these patients from the sample, the response rate in the placebo group for the full study might be lower, which increases the difference between the placebo group and the treatment group. This is a very active area of research, and there is some data to suggest that placebo responders differ in important ways from other patients (Leuchter 2002).
If this sounds like cheating, some people would agree with you. As a practicing clinician, you have no way of telling which patients would respond well to a placebo, and even if you did, you would not refuse to treat such patients. Furthermore, there is empirical evidence that excluding placebo responders does not enhance the apparent effectiveness of a treatment (Lee 2004).
Another purpose of the short term placebo administration to all patients is to see who is capable of meeting the informational demands and the logistical requirements of the research. Researchers will identify patients who cannot fill out a diary regularly or who are haphazard in their collection of data. These patients are dropped from the study before they can do any harm to the research.
There are both practical and ethical arguments against a preliminary evaluation of a placebo in all patients (Senn 1997; Evans 2000). The ethical concerns involve the intentional deception of the patient. Notice that this differs from a blinded study. In a blinded study, you tell the patients that they will not know what treatment they receive until after the study is over. The patients know that you are intentionally withholding this information in order to improve the validity of the science, and if they are uncomfortable with this, they can refuse to participate in the research. An initial short term placebo evaluation differs markedly, since a doctor is hardly likely to say: "Take this ineffective substance for the next month and record your symptoms daily in this diary." (Senn 1997).
From a practical perspective, you don't care whether a patient is sloppy in filling out a diary. You treat that patient the same as any other patient. More importantly, there may be reason to believe that patients who make lousy research subjects might have a worse prognosis than patients who are more meticulous. If this is true, excluding the troublemakers is putting on a pair of rose-colored glasses.
Other important exclusions
Sometimes the exclusions in a research study are subtle. A commonly repeated story (though I'm not sure if it is true or not) involves a researcher who compared the IQ scores of prisoners to the IQ scores of the general public. Noting a large gap, the researcher concluded that criminals have lower IQs than honest people. This comparison, though, used a sample not of criminals, but of criminals who got caught.
If you wanted to study adolescent drug use, you might consider a survey of high school students. This survey, though, would exclude anyone who dropped out of school. The dropouts have a far higher rate of drug usage than teenagers who stay in school. If you are interested in all adolescents, but your research design excludes dropouts, you will seriously underestimate drug use (Swaim 1997). In a different situation, of course, this might not be a terrible problem. It depends on your perspective. A principal trying to understand patterns of drug use in his/her high school, for example, might actually prefer to exclude dropouts.
A rather clever understanding of these subtle exclusions appeared in an article on selection bias (Wainer 1998) as well as on the famous American radio show, Car Talk. The hosts of the Car Talk program, Tom and Ray Magliozzi, offer a puzzle each week for their listeners. Most of the time it relates to auto mechanics, but this particular puzzle involved a nameless mathematician who was asked during World War II to help with a military problem.** A lot of bombers were not returning from their missions, so the Royal Air Force wanted to put armor on the bombers. But where to put it? They couldn't put it everywhere because the bombers would be so heavy that they couldn't take off. So this mathematician looked at the planes that returned and noted where they had holes from enemy fire. These holes were distributed more or less randomly throughout the plane except for two regions where there was nothing. His recommendation was to place the armor only in those two areas where no enemy fire was found. This seems counterintuitive, which is why it makes such a good puzzle.
This mathematician hypothesized that any plane hit in those regions did not survive to return. The other areas could be hit and the plane could still limp back to safety. This is an example of selection bias. The bombers in the study were not a random sample of all bombers; they were a sample of bombers that returned safely.
If you read the account in Wainer 1998, you will learn that the nameless mathematician was Abraham Wald. This article also has several other amusing examples of subtle and not so subtle exclusions, including research into the most dangerous occupation of all. An occupation where the average life expectancy is only 20.7 years. And what is this dangerous occupation? Student.
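If the bomber puzzle still feels counterintuitive, a small simulation may help. The sketch below is not Wald's actual analysis. The region names, the number of hits per plane, and the rule that any hit to a critical region downs the plane are all assumptions made up for illustration.

```python
import random

REGIONS = ["wings", "fuselage", "tail", "engines", "cockpit"]
# Assumption for illustration: any hit in these regions downs the plane.
CRITICAL = {"engines", "cockpit"}

def simulate(n_planes=1000, hits_per_plane=3, seed=0):
    """Count hits by region over all planes and over returning planes only."""
    rng = random.Random(seed)
    all_hits = {region: 0 for region in REGIONS}
    returned_hits = {region: 0 for region in REGIONS}
    for _ in range(n_planes):
        # Enemy fire lands on every region with equal probability.
        hits = [rng.choice(REGIONS) for _ in range(hits_per_plane)]
        for region in hits:
            all_hits[region] += 1
        # Only planes with no critical hit make it back to be inspected.
        if not CRITICAL.intersection(hits):
            for region in hits:
                returned_hits[region] += 1
    return all_hits, returned_hits

all_hits, returned_hits = simulate()
for region in REGIONS:
    print(f"{region:9s} all planes: {all_hits[region]:4d}   returned: {returned_hits[region]:4d}")
```

Across all planes, every region is hit about equally often. Among the planes that returned, the critical regions show no holes at all, which is exactly the pattern the mathematician saw.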
Exclusions: What to look for.
Not all exclusions are bad. Here are some issues to consider.
- Are the excluded patients likely to have a worse prognosis?
- Are any important demographic groups left out or seriously underrepresented?
- Are any of the exclusion criteria artificial and unrepresentative of the patients that you normally see?
2.2 Who refused to join the study?
Quite often, the only patients we are able to study are those who volunteer to help out. The use of volunteers, however, may exclude important segments of the patient population.
Volunteers may differ from the normal population in several important ways. Volunteers for a study involving cash payments may come more often from economically challenged environments. If a free health check-up is included, volunteers may come more often from people worried about their health status. Volunteers for lengthy studies are less likely to be employed.
Smokers who volunteer for a smoking cessation study are quite different than smokers in general (Hughes 1997). It should be obvious, but sometimes it is easy to forget this important distinction. Sometimes you are interested in generalizing to all smokers and sometimes you are interested in generalizing to all smokers who are interested in trying to quit.
Volunteers for painful procedures
Recruiting controls is especially troublesome in a study that involves a painful procedure. A Swedish study documents volunteer bias in a study of personality (Gustavsson 1997). In this study, the researchers wanted to analyze cerebrospinal fluid in order to "examine the associations between personality traits and biochemical variables."
Now, how do you get cerebrospinal fluid? The technical term is lumbar puncture, but it's also called a spinal tap. A spinal tap is rather painful, I'm told, and it carries a small risk of some serious side effects. What sort of person would volunteer to submit to a spinal tap?
In this study, the subjects they recruited had already completed a full personality profile in a previous research study. Of the 87 subjects, 48 declined to participate. There was one personality trait that was quite different between the "volunteers" and the "refusers". Can you guess what it is?
It turns out that the volunteers had scores roughly a half standard deviation higher on impulsiveness. They did not differ on other personality traits such as socialization and detachment. The large difference in the impulsiveness measurement would obviously cloud any attempt to correlate personality traits and biochemical measurements in spinal fluids among those who volunteered.
Professional volunteers
Many drug companies pay good money for healthy volunteers to test new drugs. If the study involves extensive observation and/or invasive procedures, the amount of money offered can add up. Some volunteers will return repeatedly for different studies. No one gets rich this way, and the amount of money offered cannot be so large as to be coercive. But serving as a research volunteer can still help pay a few bills and supplement your income.
Do these professional volunteers differ from you and me? You might suspect that these volunteers are poorer and less likely to have a full time job. There are some subtle differences, though, that are even more important.
Example: When genetic testing was done on a group of professional volunteers, there were almost no instances of a genetic variation that was associated with slow metabolism of certain drugs (Chen 1997). This slow metabolism would tend to be associated with a greater chance of side effects. This may not be too surprising. If you have a bad outcome with your first research study, you'll probably not come back for the next study. Unfortunately, this means that studies on professional volunteers could understate the likelihood and severity of side effects, as compared to the general population.
Nonresponse
A form of volunteer bias can also occur in survey studies. People who volunteer to return a questionnaire are frequently quite different from those who refuse to fill out the survey. In particular, the non-responders tend to be more apathetic. Return rates for surveys vary by the type of survey, but if fewer than half of the subjects returned the survey, any results are of very limited value. Again, look for efforts to minimize non-response and/or efforts to characterize the demographics of non-responders.
Example: Two researchers examined general practitioners who routinely failed to return mail surveys (Stocks 2000). A follow-up telephone call assessed demographic characteristics of this group. They were older, less likely to have post graduate qualifications and were less likely to be involved with a teaching practice.
Volunteer bias can be especially troublesome when you are examining issues that are considered by some people to be embarrassing or personal. Two American researchers examined the characteristics of people who were willing and unwilling to volunteer for studies about sexuality (Strassberg 1995). Volunteers had a more positive attitude towards sex, less guilt, and more sexual experiences.
Refusals: What to look for.
Most studies use volunteers, so you can't just pooh pooh a study for this reason alone. Here are some questions you should ask.
- Are any incentives for participating related to important prognostic factors?
- What are the disincentives for participating? Are any of these important?
- Were the researchers able to characterize various aspects of those who did not volunteer? How similar were the volunteers and non-volunteers?
2.3 Who stopped or switched therapies?
When you give a new drug to your patients, unless you watch them as they swallow the pill, you have no guarantee that they took the drug. This is also true for most research studies. The research subjects may not comply with the demands of the study. They may take only some of the medication, may stop taking the medication entirely, or may even switch to the competing medicine. Issues involving compliance are difficult to handle and there is no perfect way to analyze these patients.
Problems with compliance will usually end up diluting the impact of the new therapy. At the extreme, if 100% of your patients are non-compliant in both arms of the study, then you will surely see no difference between any two drugs. Although I discuss compliance from the perspective of a drug study, it is also an issue in non-drug studies. If a patient fails to show up for therapy sessions, or forgoes a required operation, it has the same issues and problems as noncompliance with a drug regimen.
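A back-of-the-envelope calculation shows how this dilution works. The sketch below assumes that non-compliant patients get no benefit at all and otherwise resemble compliant patients, which is a simplification. The function name and the 0.20 difference are hypothetical.

```python
def observed_difference(true_difference, compliance):
    """Expected difference between arms when only a fraction of patients comply.

    Simplifying assumption: non-compliant patients receive no benefit and
    otherwise have the same outcomes as compliant patients.
    """
    if not 0.0 <= compliance <= 1.0:
        raise ValueError("compliance must be between 0 and 1")
    return true_difference * compliance

# Hypothetical drug that lowers the event rate by 0.20 when actually taken.
for compliance in (1.0, 0.75, 0.5, 0.0):
    diff = observed_difference(0.20, compliance)
    print(f"compliance {compliance:.0%}: observed difference {diff:.2f}")
```

Under these assumptions the observed difference shrinks in direct proportion to compliance and vanishes when no one complies.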
Intention to treat
The intuitive approach is to remove from your study any patients who fail to comply with the protocol. This approach has its merits, but is generally avoided. What most researchers use instead is an "intention to treat" (ITT) approach. With ITT, the patients are analyzed in the groups to which they were originally randomized regardless of how much or how little medication they have taken. In fact, if some of the patients have the opportunity to switch to the competing drug (or therapy) and do so, with ITT, you still analyze them as if they took the drug they were originally assigned to.
There are several reasons why many researchers use ITT. First, researchers will often go to a lot of trouble to ensure randomized assignment in the study. Researchers in surgery have been known to take a sterilized coin into the operating room to choose which surgery to perform (Hollis 1999). When you go to such great lengths to use randomization, you don't want to abandon it without a fight. And when patient choices about whether they comply with the protocol start to determine who gets analyzed in which group, you lose randomization and all the benefits that it confers.
Second, with ITT, you get a more realistic picture of the new drug or therapy. If a drug or therapy is difficult to comply with, then that difficulty ought to be considered as part of the whole package. If noncompliance for a difficult to tolerate drug dilutes the impact of that drug, then that's worth knowing. Keep the noncompliant patients in because you will likely encounter the same patients among those who you regularly treat.
Third, ITT can prevent some serious biases in the research. Consider a new surgical therapy which is being compared to a standard non-surgical therapy. Some patients randomized to the surgical therapy might die prior to receiving the therapy. This is the most extreme form of non-compliance. These patients should still be analyzed as part of the surgical therapy group. Otherwise the rapidly dying patients will be excluded from the treatment group, but not from the control group, leading to serious bias.
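A few made-up numbers show how large this bias can be. In the sketch below, every count is hypothetical and chosen so that surgery has no true effect on mortality.

```python
def mortality(deaths, patients):
    """Proportion of patients who died."""
    return deaths / patients

# Hypothetical counts, identical in both arms, so surgery has NO true effect.
randomized_per_arm = 100
early_deaths_per_arm = 10   # die before surgery could have been performed
later_deaths_per_arm = 18   # die after that point

# Intention to treat: everyone is analyzed in the arm they were randomized to.
itt_surgery = mortality(early_deaths_per_arm + later_deaths_per_arm, randomized_per_arm)
itt_control = mortality(early_deaths_per_arm + later_deaths_per_arm, randomized_per_arm)

# Alternative: drop surgical-arm patients who died before receiving surgery.
# The control arm has no comparable exclusion, so its early deaths remain.
excl_surgery = mortality(later_deaths_per_arm, randomized_per_arm - early_deaths_per_arm)
excl_control = itt_control

print(f"ITT:       surgery {itt_surgery:.0%} vs control {itt_control:.0%}")
print(f"Exclusion: surgery {excl_surgery:.0%} vs control {excl_control:.0%}")
```

With ITT, both arms show 28% mortality, as they should. After the exclusion, surgery appears to lower mortality to 20%, a benefit that comes entirely from who was left out.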
As a general rule, noncompliant patients will usually have worse outcomes than compliant patients. In fact, there is solid evidence that patients who fail to comply with a placebo have worse outcomes than patients who comply with a placebo (Coronary Drug Project Research Group 1980; Horwitz 1990). I was quite amazed when I first saw evidence of this, but it actually makes sense. Patients who comply poorly with a placebo probably have other poor self care habits.
Intention to treat: What to look for.
When you are looking at compliance issues, consider the following issues:
- Was any attempt made to assess compliance?
- Was the compliance level similar to patients seen in your practice?
- Would additional analysis using the treatment actually received answer a different, but still important question?
2.4 Who dropped out during the study?
It is inevitable that some patients will drop out during the study. If the number is more than a few, this is a cause for concern. Dropouts often have a different prognosis than those who stay. Ignoring the dropouts will often paint a rosier picture of the outcome. Was there any effort (financial inducement, follow-up reminders) made to minimize dropouts? Were the authors able to characterize the demographics of the dropouts?
Is the dropout caused by the treatment itself or a poor prognosis?
When the reason for dropping out is unrelated to the study, then you can ignore the dropouts without any serious problem. You lose a little bit of power and precision, but are otherwise okay.
If on the other hand, dropouts are related to prognosis, be careful. If someone drops out of a cancer study to take laetrile treatments down in Mexico, that's often because the therapy assigned as part of the research is not working well.
You might be tempted to think that dropping out because of a move out of town is unrelated to prognosis. Often it is, but keep in mind that you will see more mobility among poorer patients. These patients will often have to move for economic reasons. So if you leave these patients out, then you are excluding patients who are on the lower rungs of the socioeconomic ladder. These patients will often not do as well for a variety of reasons, and their loss will end up making a rosier and more optimistic sample than what you would encounter in the real world.
At what level should the number of dropouts be a concern?
There is no simple answer to this question. Smaller is better, of course, but there are no firm guidelines. I've seen some suggestions that if the rate is 10% or less then dropouts are not a serious issue. There is no empirical justification for this value, but it seems reasonable enough to me. The larger the rate, the more chance for problems. A dropout rate of 50% or more is almost always a sign of serious problems.
Inferring outcomes for dropouts.
Sometimes you can infer or impute a value for the patient who dropped out of the study.*** All of these methods for inferring outcomes for dropouts are imperfect. While these approaches can sometimes compensate for a small number of dropouts, they cannot make a silk purse out of a sow's ear.
In some contexts, you can infer the status of dropouts as treatment failures. For example, if someone stops attending a smoking cessation program, you have fairly strong justification for treating such a patient as if they were smoking again. In a study of weight loss programs, dropouts could be assumed to have regained any weight that they may have lost. This is not a perfect assumption, but it should work well in practice.
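To see how much this choice matters, compare a completers-only quit rate with one that counts every dropout as a failure. The counts in the sketch below are hypothetical.

```python
def rate_completers_only(successes, completers):
    """Success rate that ignores dropouts entirely."""
    return successes / completers

def rate_dropouts_as_failures(successes, enrolled):
    """Success rate that counts every dropout as a treatment failure."""
    return successes / enrolled

# Hypothetical smoking cessation program.
enrolled = 100     # started the program
completers = 60    # still attending at the end
quit_smoking = 30  # completers who were not smoking at the end

optimistic = rate_completers_only(quit_smoking, completers)
conservative = rate_dropouts_as_failures(quit_smoking, enrolled)
print(f"Completers only:      {optimistic:.0%}")
print(f"Dropouts as failures: {conservative:.0%}")
```

The same program has a 50% quit rate if you ignore the dropouts and a 30% quit rate if you treat them as smoking again.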
If someone drops out part of the way through the study, one option is Last Observation Carried Forward. This option takes the intermediate outcome forward and treats it as the final outcome under the assumption that the final outcome would have been about the same (Mallinckrodt 2004). Another approach is to incorporate whatever information is available in a mixed model. In its simplest form, this model fits a trend line for each individual subject and pools those trend lines across groups of subjects. Those subjects with complete data contribute more to the estimate of time trends, but all subjects with one or more intermediate values will still contribute a limited amount of information.
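Last Observation Carried Forward is simple enough to sketch in a few lines. The example below covers only that option, not the mixed model. The patient labels and weights are hypothetical, and None marks a missed visit.

```python
def locf(visits):
    """Carry the last observed value forward over missing (None) visits.

    Missing values before the first observation stay None, because there
    is nothing yet to carry forward.
    """
    filled = []
    last_seen = None
    for value in visits:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled

# Hypothetical weights (kg) at four visits of a weight loss study.
weights = {
    "A": [92.0, 90.5, 89.0, 88.0],   # attended every visit
    "B": [101.0, 99.0, None, None],  # dropped out after the second visit
    "C": [85.0, None, 84.0, None],   # missed one visit, then dropped out
}

final_weight = {patient: locf(values)[-1] for patient, values in weights.items()}
print(final_weight)
```

Patient B's final weight is taken to be 99.0 and patient C's to be 84.0. Whether those values are believable depends on the assumption that little would have changed after the last visit.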
There are two more sophisticated approaches for inferring outcomes for dropouts: hot deck imputation and multiple imputation. With hot deck imputation, you divide your data into relatively homogenous subgroups. When you encounter a missing data value, select a random subject from the same homogenous subgroup and substitute his/her value for the missing value. With multiple imputation, you infer a distribution for the outcome variable using information about the interrelationships between the outcome variable and other variables measured in the analysis. For patients with missing outcomes, a random value is selected from this inferred distribution. This creates a new data set, which you analyze as normal. Now do this again five or ten more times. Each time, analyze your data. Now pool the results of these multiple analyses. Both of these approaches require a lot of work, but they have been proven to work well in practice.
Example: In a study of a quality of life measure, the AMC Linear Disability Score (Holman 2004), patients were asked to rate certain activities as either "I could carry out the activity" or "I could not carry out the activity." But if the patient never had a chance to carry out a particular activity, they rated their response as not applicable. Guidance on how to handle the not applicable response ranged from treating it as a negative response to using an average of the responses on the other items in the score. These researchers showed that hot deck imputation performed better than these simplistic approaches.
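The multiple imputation loop can be sketched in a deliberately simplified form. This sketch makes three assumptions that go beyond the description above. It draws missing values from a normal distribution fitted to the observed outcomes alone, ignoring other variables. It ignores the uncertainty in that fitted distribution. And it pools the results with the commonly used Rubin's rules. The data are hypothetical.

```python
import random
import statistics

def impute_once(values, rng):
    """Fill each missing (None) value with a draw from a fitted normal."""
    observed = [v for v in values if v is not None]
    mu = statistics.mean(observed)
    sigma = statistics.stdev(observed)
    return [v if v is not None else rng.gauss(mu, sigma) for v in values]

def multiple_imputation_mean(values, m=10, seed=0):
    """Estimate the mean outcome from m imputed data sets, then pool."""
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(m):
        completed = impute_once(values, rng)
        estimates.append(statistics.mean(completed))
        # Squared standard error of the mean for this completed data set.
        variances.append(statistics.variance(completed) / len(completed))
    pooled = statistics.mean(estimates)
    within = statistics.mean(variances)      # average within-imputation variance
    between = statistics.variance(estimates) # variance between imputations
    total_variance = within + (1 + 1 / m) * between
    return pooled, total_variance

# Hypothetical outcome scores; None marks a patient with a missing outcome.
outcomes = [4.1, 5.0, None, 3.8, 4.6, None, 5.2, 4.4]
pooled, total_variance = multiple_imputation_mean(outcomes)
print(f"pooled mean {pooled:.2f}, standard error {total_variance ** 0.5:.2f}")
```

The between-imputation term is what makes the pooled standard error honest. It adds back the uncertainty that comes from not knowing the missing values.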
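Hot deck imputation is also easy to sketch. The code below is a minimal illustration, not the procedure used by Holman and colleagues. The field names, the age subgroups, and the scores are hypothetical.

```python
import random

def hot_deck(records, group_key, value_key, seed=0):
    """Replace each missing value with a random donor from the same subgroup."""
    rng = random.Random(seed)
    # Collect the observed values (the donors) within each subgroup.
    donors = {}
    for record in records:
        if record[value_key] is not None:
            donors.setdefault(record[group_key], []).append(record[value_key])
    completed = []
    for record in records:
        new_record = dict(record)  # copy, so the original data stay untouched
        if new_record[value_key] is None:
            pool = donors.get(new_record[group_key])
            if pool:  # leave the value missing if the subgroup has no donors
                new_record[value_key] = rng.choice(pool)
        completed.append(new_record)
    return completed

# Hypothetical item responses: 1 = could carry out the activity, 0 = could not.
patients = [
    {"id": 1, "age_group": "under 65", "score": 1},
    {"id": 2, "age_group": "under 65", "score": 1},
    {"id": 3, "age_group": "under 65", "score": None},
    {"id": 4, "age_group": "65 and over", "score": 0},
    {"id": 5, "age_group": "65 and over", "score": None},
]
completed = hot_deck(patients, "age_group", "score")
for record in completed:
    print(record)
```

The quality of the imputed values depends entirely on how homogenous the subgroups really are, so the choice of grouping variable deserves more thought than the code itself.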
Dropouts: What to look for.
It would be a rare research study that had absolutely no dropouts, so you don't want to be too fussy.
- First, you need to look for the proportion of patients who drop out.
- Second, look for a description of who dropped out. Is this group different from those who completed the study?
- Third, can you infer something about the dropouts and impute a reasonable value for their outcome?
Counterpoint: Intention to treat is not all that it is cracked up to be.
The demand for an intention to treat analysis has become almost reflexive in the research community. Authors of systematic overviews will often cite the failure to use an intention to treat analysis as a methodological flaw (see Lawlor 2001, for example).
Nevertheless, there still is a place for the analysis that excludes noncompliant patients. This analysis answers the question, what will happen if I prescribe this drug to a group of patients who all take the drug regularly? The ITT analysis answers a different question: what will happen if I prescribe this drug to a group of patients that includes both compliant and noncompliant patients? It may help to know the answers to both questions.
Example: The MRFIT trial was a randomized comparison of a special intervention to usual care (Cutler 1991). The special intervention encouraged smoking cessation and dietary changes. A comparison of the groups as they were randomized represented a comparison of special encouragement to change. A comparison of the groups that actually changed represented a different comparison, because some of the people in the special intervention ignored the advice and some of the people in the usual care group changed their habits on their own. This second comparison was of nonrandomized groups, since the patients themselves determined which group they belonged to. Nevertheless, it was interesting, because it involved a comparison, not of the encouragement itself, but of the actual changes that were being encouraged.
Since noncompliant patients can dilute the impact of a new drug, one dubious approach that researchers take is not to let these noncompliant patients into the study at all. A placebo drug is given to all patients during a single blind run-in period, and anyone who does not comply with the placebo is excluded from the study. This is the same philosophy of excluding troublemakers discussed earlier and it has the exact same problems.
The intent of this exclusion seems good on the surface. Problems with compliance will tend to dilute the effectiveness of a new therapy. At the extreme of 0% compliance, there is no possible way to distinguish effectiveness. So excluding noncompliant patients before the study starts will avoid this dilution effect.
The problem is that the researchers have jumped from the frying pan of compliance problems into the fire of poor generalizability. Unlike the researchers, you do not have the option of only treating patients who are compliant. And you will not have any reasonable way to screen out those noncompliant patients for special handling. So excluding noncompliant patients causes the same problems as excluding children, women, or the elderly.
Example: In a study of allergy shots (Adkinson 1997), children with asthma were evaluated during a run-in phase that lasted an average of 400 days. This lengthy phase was intended to ensure stability of the disease. Patients were included in the full study only if they used asthma medications on a daily basis or bronchodilators five to seven days per week. The value of an asthma shot, however, is that you can ensure compliance because it is done in your office. So even though these shots were not demonstrated to be effective in a population of children who comply with their other medications, perhaps they might still be effective in a broader population of children that sometimes forget to take their pills on time.
2.5 Summary - Who was left out?
Exclusion of subjects can make the study biased or less generalizable.
Who was excluded at the start of the study? Excessively strict entry criteria in a research study can make it difficult to extrapolate to the types of patients that you normally see.
Who refused to join the study? Do the volunteers differ substantially from refusers in ways that might influence the outcome of the study?
Who stopped or switched therapies? If there are compliance issues, handle the non-compliant patients carefully.
Who dropped out during the study? Did these dropouts have a different prognosis?
On your own
1. Review the inclusion and exclusion criteria of the following study. The abstract and the relevant portions of the methods section are reproduced below:
The risk of menstrual abnormalities after tubal sterilization: a case control study. Shobeiri MJ, Atashkhoii S. BMC Womens Health 2005: 5(1); 5.
ABSTRACT Background: Tubal sterilization is the method of family planning most commonly used. The existence of the post-tubal-ligation syndrome of menstrual abnormalities has been the subject of debate for decades. Methods: In a cross-sectional study, 112 women with the history of Pomeroy type of tubal ligation achieved by minilaparatomy as the case group and 288 women with no previous tubal ligation as the control group were assessed for menstrual abnormalities. Results: Menstrual abnormalities were not significantly different between the case and control groups (p = 0.824). The abnormal uterine bleeding frequency differences in two different age groups (30–39 and 40–45 years old) were statistically significant (p = 0.0176). Conclusion: Tubal sterilization does not cause menstrual irregularities.
METHODS: This cross sectional case control study has been carried out on 500 women at Al-zahra hospital during 1999 to 2001 to assess the effect of tubal sterilization on the menstrual cycle. 260 women with abnormal uterine bleeding referred for diagnostic curettage, and 240 healthy women under the coverage of the hospital family planning center were selected randomly, and all were assessed for tubal ligation.
All women aged 30 to 46 were selected from a low-income urban population, with body weight between 50 to 90 kg. In the abnormal uterine bleeding group, those who had intrauterine device (IUD), leiomyoma on sonography, uterine size of greater than 9 cm or suffered from medical disorders were excluded from the study. Of 260 patients with menstrual irregularities, 30 subjects were excluded from the study. From the remaining 230 subjects, assessed for tubal sterilization, 87 patients had tubal ligation. Of 240 healthy women assessed for tubal ligation, 95 had previous tubal ligation. Totally 182 subjects with previous tubal ligation (case) and 288 subjects with no history of previous tubal ligation (control) were compared for abnormal uterine bleeding. Those subjects in the case group who had menstrual abnormalities, IUD, medical disorders or were on hormonal contraception, during the first year prior to the sterilization were excluded from the study. Those who were at least 30 and at most 40 years of age by the time of tubal ligation and had Pomeroy type of interval tubal ligation via minilaparatomy were included the study. Finally, considering the exclusion and inclusion criterias, 112 subjects remained in the case group and 288 with no tubal ligation in the control group were evaluated for menstrual abnormalities.
This is an open access publication. The full free text is available at www.biomedcentral.com/1472-6874/5/5.
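Before judging the criteria, it may help to tally the participant flow. The sketch below only repeats the arithmetic, using counts copied from the methods section quoted above; the variable names are my own, and it makes no judgment about whether the exclusions were appropriate.

```python
# Counts copied from the quoted methods section.
bleeding_referred = 260       # women referred with abnormal uterine bleeding
bleeding_excluded = 30        # excluded from the bleeding group
bleeding_with_ligation = 87   # of those remaining, had tubal ligation
healthy_selected = 240        # healthy women from the family planning center
healthy_with_ligation = 95    # of those, had tubal ligation
final_cases = 112             # cases left after the later exclusions

bleeding_assessed = bleeding_referred - bleeding_excluded
cases_before_exclusion = bleeding_with_ligation + healthy_with_ligation
controls = ((bleeding_assessed - bleeding_with_ligation)
            + (healthy_selected - healthy_with_ligation))
cases_dropped = cases_before_exclusion - final_cases
share_dropped = cases_dropped / cases_before_exclusion

print(f"Bleeding group assessed for ligation: {bleeding_assessed}")
print(f"Cases before further exclusions: {cases_before_exclusion}")
print(f"Controls: {controls}")
print(f"Cases removed by the later criteria: {cases_dropped} "
      f"({share_dropped:.0%} of the case group)")
```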
2. Review the inclusion and exclusion criteria of the following study. The abstract and the relevant portions of the methods section are reproduced below:
Comparison of energy-restricted very low-carbohydrate and low-fat diets on weight loss and body composition in overweight men and women. Volek J, Sharman M, Gomez A, Judelson D, Rubin M, Watson G, Sokmen B, Silvestre R, French D, Kraemer W. Nutr Metab (Lond) 2004: 1(1); 13.
ABSTRACT Objective: To compare the effects of isocaloric, energy-restricted very low-carbohydrate ketogenic (VLCK) and low-fat (LF) diets on weight loss, body composition, trunk fat mass, and resting energy expenditure (REE) in overweight/obese men and women. Design: Randomized, balanced, two diet period clinical intervention study. Subjects were prescribed two energy-restricted (-500 kcal/day) diets: a VLCK diet with a goal to decrease carbohydrate levels below 10% of energy and induce ketosis and a LF diet with a goal similar to national recommendations (%carbohydrate:fat:protein = ~60:25:15%). Subjects: 15 healthy, overweight/obese men (mean ± s.e.m.: age 33.2 ± 2.9 y, body mass 109.1 ± 4.6 kg, body mass index 34.1 ± 1.1 kg/m2) and 13 premenopausal women (age 34.0 ± 2.4 y, body mass 76.3 ± 3.6 kg, body mass index 29.6 ± 1.1 kg/m2). Measurements: Weight loss, body composition, trunk fat (by dual-energy X-ray absorptiometry), and resting energy expenditure (REE) were determined at baseline and after each diet intervention. Data were analyzed for between group differences considering the first diet phase only and within group differences considering the response to both diets within each person. Results: Actual nutrient intakes from food records during the VLCK (%carbohydrate:fat:protein = ~9:63:28%) and the LF (~58:22:20%) were significantly different. Dietary energy was restricted, but was slightly higher during the VLCK (1855 kcal/day) compared to the LF (1562 kcal/day) diet for men. Both between and within group comparisons revealed a distinct advantage of a VLCK over a LF diet for weight loss, total fat loss, and trunk fat loss for men (despite significantly greater energy intake). The majority of women also responded more favorably to the VLCK diet, especially in terms of trunk fat loss. 
The greater reduction in trunk fat was not merely due to the greater total fat loss, because the ratio of trunk fat/total fat was also significantly reduced during the VLCK diet in men and women. Absolute REE (kcal/day) was decreased with both diets as expected, but REE expressed relative to body mass (kcal/kg), was better maintained on the VLCK diet for men only. Individual responses clearly show the majority of men and women experience greater weight and fat loss on a VLCK than a LF diet. Conclusion: This study shows a clear benefit of a VLCK over LF diet for short-term body weight and fat loss, especially in men. A preferential loss of fat in the trunk region with a VLCK diet is novel and potentially clinically significant but requires further validation. These data provide additional support for the concept of metabolic advantage with diets representing extremes in macronutrient distribution.
METHODS: A total of twenty-eight healthy volunteers (15 men and 13 women) were recruited by flyers and word-of-mouth. Subjects were between 20 and 55 y, nonsmokers, and greater than 25 percent body fat determined via dual-energy X-ray absorptiometry (DEXA). Subjects went through a thorough screening procedure to ensure they would be committed to completing the study. Exclusion criteria included a body mass >145 kg (because of technical difficulties in performing DEXA), post-menopausal women, overt diabetes, cardiovascular, respiratory, gastrointestinal, thyroid or any other metabolic disease, weight change ± 2 kg over the last month, adherence to special diets, use of nutritional supplements (except a daily multi-vitamin/mineral), and use of medications to control blood lipids or glucose. The majority of subjects were sedentary and were instructed not to start an exercise program during the study. Those who were active were instructed to maintain the same level of physical activity throughout the study.
This is an open access publication. The full free text is available at www.nutritionandmetabolism.com/content/1/1/13.
3. Review the reasons listed for dropping out in the following study.
Participant characteristics associated with withdrawal from a large randomized trial of spermicide effectiveness. Raymond EG, Chen PL, Pierre-Louis B, Luoto J, Barnhart KT, Bradley L, Creinin MD, Poindexter A, Wan L, Martens M, Schenken R, Nicholas CF, Blackwell R. BMC Med Res Methodol 2004: 4(1); 23.
The abstract is reproduced below. Discuss the extent to which these dropouts compromise the integrity of the research study.
Background: In most recent large efficacy trials of barrier contraceptive methods, a high proportion of participants withdrew before the intended end of follow-up. The objective of this analysis was to explore characteristics of participants who failed to complete seven months of planned participation in a trial of spermicide efficacy. Methods: Trial participants were expected to use the assigned spermicide for contraception for 7 months or until pregnancy occurred. In bivariable and multivariable analyses, we assessed the associations between failure to complete the trial and 17 pre-specified baseline characteristics. In addition, among women who participated for at least 6 weeks, we evaluated the relationships between failure to complete, various features of their first 6 weeks of experience with the spermicide, and characteristics of the study centers and population. Results: Of the 1514 participants in this analysis, 635 (42%) failed to complete the study for reasons other than pregnancy. Women were significantly less likely to complete if they were younger or unmarried, had intercourse at least eight times per month, or were enrolled at a university center or at a center that enrolled fewer than 4 participants per month. Noncompliance with study procedures in the first 6 weeks was also associated with subsequent early withdrawal, but dissatisfaction with the spermicide was not. However, many participants without these risk factors withdrew early. Conclusions: Failure to complete is a major problem in barrier method trials that seriously compromises the interpretation of results. Targeting retention efforts at women at high risk for early withdrawal is not likely to address the problem sufficiently.
This is an open access publication. The full free text is available at www.biomedcentral.com/1471-2288/4/23.
Bibliography
Adkinson NF, Jr., Eggleston PA, Eney D, Goldstein EO, Schuberth KC, Bacon JR, Hamilton RG, Weiss ME, Arshad H, Meinert CL, Tonascia J, Wheeler B. A controlled trial of immunotherapy for asthma in allergic children. New England Journal of Medicine 1997: 336(5); 324-31. The full free text of this reference is available at content.nejm.org/cgi/content/full/336/5/324.
Bayer A, Tadd W. Unjustified exclusion of elderly people from studies submitted to research ethics committee for approval: descriptive study. British Medical Journal 2000: 321(7267); 992-3. The full free text of this reference is available at bmj.com/cgi/content/full/321/7267/992.
Chen S, Kumar S, Chou WH, Barrett JS, Wedlund PJ. A genetic bias in clinical trials? Cytochrome P450-2D6 (CYP2D6) genotype in general vs selected healthy subject populations [letter]. Br J Clin Pharmacol 1997: 44(3); 303-4.
Cutler JA, Grandits GA, Grimm RH, Thomas HE, Billings JH, Wright NH. Risk Factor Changes after Cessation of Intervention in the Multiple Risk Factor Intervention Trial. Preventive Medicine 1991: 20(2); 183-196.
Evans M. Justified deception? The single blind placebo in drug research. J Med Ethics 2000: 26(3); 188-93. The full free text of this reference is available at jme.bmjjournals.com/cgi/content/full/26/3/188.
Godwin M, Ruhland L, Casson I, MacDonald S, Delva D, Birtwhistle R, Lam M, Seguin R. Pragmatic controlled clinical trials in primary care: the struggle between external and internal validity. BMC Med Res Methodol 2003: 3(1); 28. The full free text of this reference is available at www.biomedcentral.com/1471-2288/3/28.
Gustavsson JP, Asberg M, Schalling D. The healthy control subject in psychiatric research: impulsiveness and volunteer bias. Acta Psychiatr Scand 1997: 96(5); 325-8.
Hollis S, Campbell F. What is meant by intention to treat analysis? Survey of published randomised controlled trials. British Medical Journal 1999: 319(7211); 670-674. The full free text of this reference is available at bmj.bmjjournals.com/cgi/content/full/319/7211/670.
Holman R, Glas CA, Lindeboom R, Zwinderman AH, de Haan RJ. Practical methods for dealing with 'not applicable' item responses in the AMC Linear Disability Score project. Health Qual Life Outcomes 2004: 2(1); 29. The full free text of this reference is available at www.hqlo.com/content/2/1/29.
Horwitz RI, Viscoli CM, Berkman L, Donaldson RM, Horwitz SM, Murray CJ, Ransohoff DF, Sindelar J. Treatment adherence and risk of death after a myocardial infarction. Lancet 1990: 336(8714); 542-5.
Hughes JR, Giovino GA, Klevens RM, Fiore MC. Assessing the generalizability of smoking studies. Addiction 1997: 92(4); 469-72.
Lawlor DA, Hopker SW. The effectiveness of exercise as an intervention in the management of depression: systematic review and meta-regression analysis of randomised controlled trials. British Medical Journal 2001: 322(7289); 763. The full free text of this reference is available at bmj.com/cgi/content/full/322/7289/763.
Lee S, Walker JR, Jakul L, Sexton K. Does Elimination of Placebo Responders in a Placebo Run-In Increase the Treatment Effect in Randomized Clinical Trials? A Meta-Analytic Evaluation. Depress Anxiety 2004: 19(1); 10-9.
Leuchter AF, Cook IA, Witte EA, Morgan M, Abrams M. Changes in Brain Function of Depressed Subjects During Treatment With Placebo. Am J Psychiatry 2002: 159(1); 122 - 9. The full free text of this reference is available at ajp.psychiatryonline.org/cgi/reprint/159/1/122.
Mallinckrodt CH, Raskin J, Wohlreich MM, Watkin JG, Detke MJ. The Efficacy of Duloxetine: A Comprehensive Summary of Results From MMRM and LOCF_ANCOVA in Eight Clinical Trials. BMC Psychiatry 2004: 4(1); 26. The full free text of this reference is available at www.biomedcentral.com/1471-244x/4/26.
Meinert CL, Gilpin AK. Estimation of gender bias in clinical trials. Stat Med 2001: 20(8); 1153-64.
Ramasubbu K, Gurm H, Litaker D. Gender bias in clinical trials: do double standards still apply? J Womens Health Gend Based Med 2001: 10(8); 757-64.
Schellevis FG, van der Velden J, van de Lisdonk E, van Eijk JT, van Weel C. Comorbidity of chronic diseases in general practice. J Clin Epidemiol 1993: 46(5); 469-73.
Senn S. Are placebo run ins justified? British Medical Journal 1997: 314(7088); 1191-3. The full free text of this reference is available at bmj.com/cgi/content/full/314/7088/1191.
Siderowf AD. Evidence from Clinical Trials: Can We Do Better? Neurorx 2004: 1(3); 363-371. The full free text of this reference is available at www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=15717039.
Smith TA, House RF, Jr., Croghan IT, Gauvin TR, Colligan RC, Offord KP, Gomez-Dahl LC, Hurt RD. Nicotine patch therapy in adolescent smokers. Pediatrics 1996: 98(4 Pt 1); 659-67.
Stocks N, Gunnell D. What are the characteristics of general practitioners who routinely do not return postal questionnaires: a cross sectional study. J Epidemiol Community Health 2000: 54(12); 940-1. The full free text of this reference is available at jech.bmjjournals.com/cgi/reprint/54/12/940.
Strassberg DS, Lowe K. Volunteer bias in sexuality research. Arch Sex Behav 1995: 24(4); 369-82.
Swaim RC, Beauvais F, Chavez E, Oetting E. The Effect of School Dropout Rates on Estimates of Adolescent Substance Use among Three Racial/Ethnic Groups. American Journal of Public Health 1997: 87(1); 51-55.
Wainer H, Pamer S, Bradlow ET. A Selection of Selection Anomalies. Chance 1998: 11(2); 3-7.
Footnotes
* My supervisor, Ralph Kauffman, gave some excellent testimony about this before a Congressional subcommittee looking at FDA approval of drugs for children. You can read his comments at www.aap.org/advocacy/washing/offlabel.htm.
** You can read the original on the Car Talk website, both the question (cartalk.com/content/puzzler/transcripts/199838/index.html) and the answer (cartalk.com/content/puzzler/transcripts/199839/answer.html).
*** An excellent introductory guide for inferring outcomes is on the Health Economics Resource Center website (www.herc.research.med.va.gov/FAQ_I9.htm).
This webpage was written on (unknown date), edited by Steve Simon, and was last modified on 2008-07-08. Category: Statistical evidence