Blind Testing | Science Exposed

Background

Blind testing is experimentation on participants who are “blind” to (unaware of) whether they are in the experimental group or the control group. They are also usually unaware of what the independent and dependent variables are. The experimental group is the group exposed to the independent variable. The independent variable is the variable that the researchers purposely change to see its effect on the dependent variable, which is what is observed. The control group is not exposed to the independent variable and so provides a comparison for the experimental group. If the dependent variable is observed to differ between the two groups, then a correlation between the independent variable and the dependent variable can be shown. Double-blind testing keeps the participants “blind” in the same way, but it also uses researchers who are unaware of which group is the control group and which is the experimental group.
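
The assignment step described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `assign_blinded`, the participant IDs, and the "KIT-001" style labels are hypothetical and do not come from any of the studies cited here. The idea is that the key linking participants to groups is kept apart from the neutral labels that participants (and, in a double-blind study, the observing researchers) actually see.

```python
import random

def assign_blinded(participant_ids, seed=None):
    """Randomly split participants into two groups and hide the groups behind neutral codes."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2

    # Sealed key: the only record of who is in which group.
    # In a double-blind study it would be held by someone who never observes participants.
    sealed_key = {
        pid: ("experimental" if position < half else "control")
        for position, pid in enumerate(ids)
    }

    # Neutral kit codes are shuffled separately so the code number reveals nothing about the group.
    codes = [f"KIT-{n:03d}" for n in range(1, len(ids) + 1)]
    rng.shuffle(codes)
    blinded_view = dict(zip(ids, codes))

    return sealed_key, blinded_view

sealed_key, blinded_view = assign_blinded(["P1", "P2", "P3", "P4", "P5", "P6"], seed=42)
print(blinded_view)  # what participants and observers are allowed to see
```

In a single-blind study the researchers could consult `sealed_key` while the participants could not; in a double-blind study neither would see it until the data had been collected.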

The Complication and Where it Occurred

Blind testing and double-blind testing are often thought of by the public simply as a way to eliminate the placebo effect from a study. The placebo effect is a phenomenon in which a fake treatment or stimulation, known as a placebo, elicits a response from a participant simply because that participant expects it to. Both blind testing and double-blind testing are effective at removing this bias because they take advantage of the placebo effect itself. This is useful when determining the effectiveness of a medication or drug. The participants are exposed either to the drug itself (experimental group) or to a “fake drug,” the placebo (control group), so that all participants believe that they have taken a drug and will experience its effects. This keeps the expectation of an effect constant across all groups, even though some groups did not receive the real drug, and so eliminates a possible bias from the placebo effect. An example is a study of the effectiveness of nicotine patches in decreasing tobacco withdrawal symptoms and helping users quit smoking (Daughton et al., 1990). In this study, participants were separated into three groups: one group was given patches that administered nicotine for 24 hours, one was given patches that administered nicotine for 18 hours (the wakeful hours), and one was given patches that administered no nicotine (placebo). The participants were then observed over 6 months, and it was determined that 22% of the participants in the 24-hour group, 31% in the 18-hour (wakeful hours) group, and 8% in the placebo group were able to refrain from smoking. Because the study was double-blind and the participants all believed they were given nicotine, a valid conclusion about the effectiveness of the nicotine patches could be drawn by comparing the control (placebo) group with the two experimental groups.
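
To make the comparison with the placebo group concrete, the short Python sketch below works through the arithmetic using only the three percentages quoted above. The function name `compare_to_placebo` and the group labels are illustrative choices, not terms from the study. The group sizes are not given here, so the sketch says nothing about statistical significance.

```python
def compare_to_placebo(quit_rates_percent, placebo_name):
    """Return (percentage-point difference, ratio) versus placebo for each treated group."""
    placebo_rate = quit_rates_percent[placebo_name]
    comparison = {}
    for group, rate in quit_rates_percent.items():
        if group == placebo_name:
            continue
        comparison[group] = (rate - placebo_rate, rate / placebo_rate)
    return comparison

# Six-month abstinence rates as quoted in the text (in percent).
quit_rates_percent = {"24-hour patch": 22, "18-hour patch": 31, "placebo patch": 8}

comparison = compare_to_placebo(quit_rates_percent, "placebo patch")
for group, (points, ratio) in comparison.items():
    print(f"{group}: +{points} percentage points, {ratio:.2f}x the placebo rate")
```

Because every group expected an effect, the 14 and 23 percentage-point gaps over the placebo group can be attributed to the nicotine rather than to expectation alone.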

However, blind and double-blind testing are also used in the scientific community to remove observer bias. Observer bias is inaccurate observation in a study that arises from the researchers’ expectations of how a participant will behave in a certain situation. Double-blind testing is able to remove this bias because it does not allow the researchers to form expectations about the behavior of a participant, since they are unaware of which group (control or experimental) the participant is in. This can be seen in the study conducted by Snyder and Frankel (1976) of how men interpreted two videotapes of separate interviews, both without audio, when they were informed of the videos’ content. Two groups of men were each shown the two videos, each video containing an interview of a woman, and were led to believe that one of the interviews was on the topic of sex and the other was on the topic of politics. However, the first group was told this information before they watched the two videos, and the second group was not told until after. The second group in this scenario mimics the situation of a double-blind study. Both groups were then asked to make qualitative observations about the two videos. It was found that the first group viewed the women being interviewed as tense in the situation they thought would be tense (the interview on the topic of sex). They then attributed this tension to the women’s disposition toward anxiety, which the researchers regarded as a misattribution. In the second group, the observers were not affected by knowledge of the topic or by their own predisposition to view a certain topic as anxiety-provoking. Because of this, they observed the women in the discussion about politics to be more emotionally involved than the women in the interview about sex, which was accepted by the researchers. This shows that knowledge of a participant’s situation can affect the observations made of that participant, and that a double-blind method (like the condition the second group was placed in) is effective at obtaining the most accurate observations even when no placebo is involved.

Similarly, the experimenter expectancy effect is another bias avoided by blind and double-blind testing in the scientific community. This bias also involves the researchers’ expectations, except that here the researchers’ expectations change the behavior of the participants. This can be seen in a study of how researchers’ beliefs about a group of rats affected how efficiently those rats learned a maze (Rosenthal & Fode, 1963). Two groups of researchers were each given a group of rats along with incorrect information about them: one group was told that their rats had been specially bred to perform well in mazes (group one), and the other group was told that their rats were genetically inferior and would perform poorly in mazes (group two). In reality, the rats were randomly assigned to each group and had not been bred to excel or fail at learning mazes. However, the experiment found that the rats from group one were more efficient at learning the mazes than the rats in group two. These results support the experimenter expectancy effect because the beliefs the researchers held about their rats, while incorrect, altered the rats’ performance. Double-blind testing is able to decrease the impact of this bias on a study because the observers do not have expectations for the participants, so they cannot influence the participants’ behavior through those expectations. So, if the researchers had not been given information about their rats (Rosenthal & Fode, 1963), it is likely that all the rats would have learned the maze with similar efficiency.

Another common misconception is that a study conducted with blind or double-blind testing must automatically be completely accurate. This is incorrect, because blind testing does not make an experiment that lacks control valid. For data from an experiment to be considered accurate by the scientific community, the researchers must have control in the experiment, meaning that they need to make sure the only variable affecting the results is the independent variable. If uncontrolled variables other than the independent variable(s) affect a study, the conclusion of that study is not credible. These other variables affecting the results of a study are known as confounds. Confounds can be seen in a study of the effect of caffeine on alertness (Zwyghuizen-Doorenbos, Roehrs, Lipschutz, Timms, & Roth, 1990). In this study, two groups of men were given either a caffeine tablet or a placebo tablet. The researchers did not know who was given caffeine and who was given the placebo, and the participants all assumed they were receiving caffeine, so the experiment was a double-blind test. Both groups were given their respective tablets (one group received the caffeine tablet, the other the placebo) twice a day for two days, and their alertness was tested four times a day for those two days. Then, on the third day, both groups were given the placebo, and their alertness was once again tested four times. However, the participants were not constantly monitored throughout the three days of the trial. This means that other, unmonitored factors (such as physical activity during the day, the amount and type of food consumed, quality of sleep, and many other variables) could have affected the alertness of the participants. Making the experiment a double-blind test did not remove these confounds because they arose outside of the administration of the independent variable and the recording of the dependent variable, which makes the data from this study not completely reliable.

Blind testing also does not make a study completely valid because it does not compensate for inadequate procedures. This is especially relevant for blind tests in which the control group is a different size than the experimental group, such as a study conducted to determine the effectiveness of a stent (a tubular support placed into a blood vessel) that releases a drug versus a stent that does not (Pfisterer et al., 2006). This trial divided the participants in a 2:1 ratio, meaning that for every 2 participants in the experimental group (stent that released a drug), there was 1 participant in the control group (stent that did not release anything). The participants were randomly assigned to each group and were unaware of which type of stent they were receiving (blind test). The participants were then observed over a long period of time, the main observations being the health of their hearts and their life spans. While the difference between the size of the control group and the size of the experimental group was not extreme in this instance (2:1), it can still affect the validity of the results: the smaller group contains less variation among its participants, so its outcome is measured less precisely, and the two groups can differ for reasons unrelated to the treatment. This effect grows as the difference in size between the control group and the experimental group increases. Blind testing cannot compensate for this because blind testing is not a method for removing or adding variation.
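
One way to put a rough number on the effect of unequal group sizes is to look at the standard error of the difference between two group averages, which is proportional to sqrt(1/n1 + 1/n2) when both groups are assumed to have the same spread in outcomes. The Python sketch below compares an uneven split with an even split of the same total number of participants. The function name, the total of 300 participants, and the equal-spread assumption are all illustrative; none of them is taken from the stent study.

```python
import math

def relative_standard_error(ratio, total=300):
    """Standard error of a group difference for a ratio:1 split, relative to a 1:1 split.

    Assumes both groups have the same outcome variance, so the standard error
    of the difference is proportional to sqrt(1/n_large + 1/n_small).
    """
    n_small = total / (ratio + 1)
    n_large = total - n_small
    unequal = math.sqrt(1 / n_large + 1 / n_small)
    equal = math.sqrt(2 / (total / 2))
    return unequal / equal

for ratio in (1, 2, 4, 9):
    print(f"{ratio}:1 split -> {relative_standard_error(ratio):.3f}x the standard error of a 1:1 split")
```

Under these assumptions a 2:1 split costs only about 6% in precision compared with an even split, while a 9:1 split costs roughly 67%. This is consistent with the point above that the problem is modest at 2:1 and grows as the groups become more unequal.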

Finally, blind testing also cannot compensate for the “cherry picking” that occasionally occurs in studies. “Cherry picking” is deliberately excluding certain results from a study so that the researchers are able to reach their desired conclusion. This practice discredits the conclusion of a study because the conclusion is not based on all of the results from that study. Blind and double-blind testing are not able to add validity to these kinds of studies because they are only able to remove biases that occur before and during the collection of data; “cherry picking” occurs after the data has been collected.
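
The following Python sketch shows how “cherry picking” can inflate an apparent effect even when the data were collected under perfect blinding. All of the scores are made-up numbers chosen for illustration and do not come from any study discussed here.

```python
from statistics import mean

# Hypothetical improvement scores collected in a properly blinded study.
treatment = [4, -1, 3, 0, 5, -2, 1, 2]
control = [1, 0, 2, -1, 1, 0, 2, 1]

def effect(treated, untreated):
    """Apparent treatment effect: difference between the two group averages."""
    return mean(treated) - mean(untreated)

# Honest analysis: every collected result is used.
full_effect = effect(treatment, control)

# Cherry-picked analysis: treated participants who did not improve are quietly dropped.
picked_treatment = [score for score in treatment if score > 0]
picked_effect = effect(picked_treatment, control)

print(f"all results:   {full_effect}")
print(f"cherry-picked: {picked_effect}")
```

The filtering step runs entirely after data collection, which is why no amount of blinding during the experiment can prevent or detect it.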

There are also certain scenarios where blind and double-blind testing cannot be applied because it is too obvious to the researchers or participants who is in the experimental group and who is in the control group. For example, double-blind testing cannot be used in some studies involving people with brain damage as the experimental group. Here, it is often clear to the observer who has brain damage based on their responses to the tests. Once observers know who has brain damage, they are no longer blind to the control and experimental groups. However, blind tests can still be conducted in this situation because the participants will not know that brain damage determines the experimental and control groups. An instance where blind testing cannot be used is when the participant experiences a noticeable physical effect when the independent variable is administered that cannot be reproduced by a placebo. For example, if electrical stimulation is delivered to certain parts of the brain through electrodes in a headband, the participant will know whether the headband is delivering stimulation because there is a noticeable sensation when the stimulation first begins. Because of this sensation, the participant cannot be blind to the stimulation, so if the stimulation is the independent variable, a blind test cannot be applied. This type of test can still be a partially blind study if the control group is kept separate from the experimental group and is not aware of the administration of the independent variable. However, this will leave the study vulnerable to the placebo biases discussed earlier, since the experimental group would have a headband with electrodes and the control group would not.

An example of an instance where neither double-blind nor blind testing could be used was a study of the effect noise has on the ability to remember short pieces of text (Hygge, Boman, & Enmarker, 2003). In this study, high school students were asked to remember certain words that they read after silence and then to remember certain words after a noise (a meaningful but irrelevant conversation or traffic noise) was played. This study could not be performed as a double-blind study because the researchers had to administer the noise, so the experimental condition was known to them. This study also could not be a blind experiment because the participants were in both the control and experimental groups, so the two groups were not independent of each other. The independent variable was also physically noticeable when it was administered, so the participants were aware when it had been introduced.

Because blind and double-blind testing involve experimentation on participants who are unaware of their own exposure to an independent variable, they can lead people to worry about the safety of the participants. Both kinds of testing are, however, safe and not intrusive for the participants because of the Institutional Review Boards (IRBs) and debriefings that usually accompany blind and double-blind experiments. IRBs are committees put together to approve, monitor, and review studies in the scientific community that involve humans as participants. These committees ensure the safety of the participants, so even though participants in a blind or double-blind study are unaware of their exposure to an independent variable, they are no less safe than participants in a study that does not use blind or double-blind testing. Researchers often debrief the participants as well, to let them know what was being tested, how it was being tested, and whether they were in the experimental group. This way the participants are able to understand what happened, which removes any anxiety or ambiguity that may have arisen during the study.

How should blind and double-blind testing be viewed in the future?

The public should not assume that blind and double-blind testing remove only placebo effects from an experiment. Both of these types of testing are effective in removing other types of biases, and because of this, the public should be wary of an experiment that was not conducted using blind or double-blind testing. This does not mean that all studies conducted without blind or double-blind testing are inaccurate, but the public should consider whether the results could have been affected by a bias that blind or double-blind testing would have removed.

Also, the public should be particularly skeptical of a drug that was not tested using blind or double-blind studies. Certain drug companies do not release the results from a blind or double-blind test because they show that their drug is only as effective as a placebo. It is also important to remember, however, that not all studies can use blind or double-blind testing, so that limitation should be considered as well.

The public should also not consider every study that was conducted using one of these two methods to be completely accurate. While studies that used blind testing or double-blind testing most likely do not contain placebo effects, observer bias, or confirmation bias, the results may have been altered by a lack of control in the experiment or even by “cherry picking” the results. It is still important for the public to decide whether any uncontrolled variable affected the dependent variable or whether the conclusion of an experiment was not based on all of the results. The number of participants in the experimental group compared to the number of participants in the control group should also be noted in all blind or double-blind studies. Remember, the larger the difference in size between the two groups, the lower the validity of the results.

Finally, it is important that people are not worried about their safety when deciding to become a participant in a blind or double-blind test. These types of tests are important for accurate research in many different areas of study, and because of IRBs the studies are safe for humans to participate in. What occurs in the study is also not kept secret from the participants for very long, as they will most likely be debriefed, so people do not have to worry about participating in a study and never knowing what has happened to them.

References

Daughton, D., Heatley, S., Prendergast, J., Causey, D., Knowles, M., Rolf, C., . . . Rennard, S. (1990). Effect of Transdermal Nicotine Delivery as an Adjunct to Low-Intervention Smoking Cessation Therapy. Archives of Internal Medicine, 151, 749-752.

Hygge, S., Boman, E., & Enmarker, I. (2003). The effects of road traffic noise and meaningful irrelevant speech on different memory systems. Scandinavian Journal of Psychology, 44(1), 13-21.

Pfisterer, M., Brunner-La Rocca, H., Buser, P., Rickenbacher, P., Hunziker, P., Mueller, C., . . . Kaiser, C. (2006). Late Clinical Events After Clopidogrel Discontinuation May Limit the Benefit of Drug-Eluting Stents: An Observational Study of Drug-Eluting Versus Bare-Metal Stents. Journal of the American College of Cardiology, 48(12), 2592-2595.

Rosenthal, R., & Fode, K. (1963). The effect of experimenter bias on performance of the albino rat. Behavioral Science, 8, 183-189.

Snyder, M., & Frankel, A. (1976). Observer bias: A stringent test of behavior engulfing the field. Journal of Personality and Social Psychology, 34(5), n.p.

Zwyghuizen-Doorenbos, A., Roehrs, T.A., Lipschutz, L., Timms, V., & Roth, T. (1990). Effects of caffeine on alertness. Psychopharmacology (Berlin), 100(1), 36-39.
