Subject Allocation
As we have noted, in experimental studies, the researcher controls whether the person receives a treatment or some other intervention. Just as subjects can be selected for the study in various ways, they can be assigned or allocated to the different groups in a number of ways.
Sometimes these two steps are combined; as subjects are selected from the population, they are assigned to groups. In other instances the two steps are explicitly differentiated; a sample is derived, and then a separate procedure is used to allocate the subjects to the various groups. However, it is important to be aware of these two steps because, many times, the first step (subject selection) is only implicit in the study. For example, although patients in a hospital can be randomly allocated to receive conventional therapy or a new treatment, there is actually an initial stage that may not have been acknowledged, namely, the selection of the hospitals where the study was carried out. In many instances this initial selection procedure was not random.
Unfortunately, the similarity of terms used to describe subject selection and allocation can lead to considerable confusion for the uninitiated or unwary reader and offers an area of potential mischief for unscrupulous researchers (a group that fortunately does not include epidemiologists—often). In the above example the sample was randomly assigned to the treatment groups, but it was selected haphazardly. Describing the procedure as randomized, without adequately delineating the somewhat suspect origins of the sample, can be misleading.
Randomized Allocation
With random allocation, all subjects in the sample have the same probability of being assigned to the experimental group or to the control group. (This is not the same as a specific subject having an equal probability of being assigned to the groups; for design reasons, one group may be deliberately larger than the other, so the probability of ending up in that group is higher. However, the probability would be the same for all subjects.) This ensures that in the long run (i.e., with a large number of subjects) any underlying factors that may affect the outcome are equivalent for each group.
Bear in mind that random selection and random allocation have different aims. As stated earlier, random selection of subjects is used to ensure the generalizability of the results from the sample to the population. The purpose of random allocation is to ensure the validity of any cause-effect interpretations we make from the data.
The subjects are allocated to groups by a randomization device or scheme. If there are only two groups that are equal in size, this can be accomplished by a simple coin toss: if heads, then the first group, or if tails, the other group. However, it is more common to use a table of random numbers, which can be found in most introductory statistics books. These tables consist of many numbers, often listed in groups of five for the sake of readability, which are generated in a completely random fashion. An example of a small portion of a table of random numbers would look something like this:
92778 07201 92632 93521 18235
83855 98335 11980 90040 22843
85527 62908 55960 80310 46765
34606 20883 66096 23610 00765
37375 68228 49966 20361 57424
81839 59252 91022 94233 93928
67018 85005 03174 89887 94262
To assign subjects to two groups, the table is entered at random; if the first number is odd, for example, the subject is allocated to Group A, and if it is even, to Group B. The second subject is assigned in the same way on the basis of the next number in the table; “next” can mean moving your finger right, left, up, or down. When there are three groups, the subject is assigned to the first group if the number is 1, 2, or 3; to the second group if the number is 4, 5, or 6; and to the last group if the number is 7, 8, or 9. If a zero is encountered, it is simply ignored, and the next nonzero number is used. Groups of unequal sizes can be created in the same way. If Group A is to be twice the size of B, then numbers 1 through 6 can be used to allot subjects to Group A and 7 to 9 to Group B.
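The digit rules just described can be sketched in a few lines of Python. This is only an illustration: the names assign_from_table, THREE_GROUPS, and TWO_TO_ONE are invented here, and the sketch enters the table at the top left and reads rightward so that the result is reproducible, whereas a real table would be entered at a randomly chosen point.

```python
# Digits copied from the sample table of random numbers.
TABLE = [
    "92778 07201 92632 93521 18235",
    "83855 98335 11980 90040 22843",
    "85527 62908 55960 80310 46765",
    "34606 20883 66096 23610 00765",
    "37375 68228 49966 20361 57424",
    "81839 59252 91022 94233 93928",
    "67018 85005 03174 89887 94262",
]

# Three equal groups: 1-3 -> A, 4-6 -> B, 7-9 -> C (zero is not listed, so it is skipped).
THREE_GROUPS = {d: "ABC"[(d - 1) // 3] for d in range(1, 10)}
# Group A twice the size of Group B: 1-6 -> A, 7-9 -> B.
TWO_TO_ONE = {d: ("A" if d <= 6 else "B") for d in range(1, 10)}


def digits(rows):
    """Yield the table's digits left to right, top to bottom, ignoring spaces."""
    for row in rows:
        for ch in row:
            if ch.isdigit():
                yield int(ch)


def assign_from_table(n_subjects, digit_to_group, rows=TABLE):
    """Assign subjects one at a time from successive digits of the table."""
    stream = digits(rows)
    assigned = []
    while len(assigned) < n_subjects:
        d = next(stream, None)
        if d is None:
            raise ValueError("table exhausted before all subjects were assigned")
        if d in digit_to_group:  # digits with no rule (here, zero) are ignored
            assigned.append(digit_to_group[d])
    return assigned


print("".join(assign_from_table(8, THREE_GROUPS)))  # CACCCCAA
print("".join(assign_from_table(8, TWO_TO_ONE)))    # BABBBBAA
```

Note the run of four subjects in a row landing in the same group; with simple randomization, that sort of streak is perfectly legitimate.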
Now that you’ve mastered the arcane art of using tables of random numbers, the good news is that you probably won’t need to do it because most computers can easily produce random numbers. There are a number of programs that capitalize on this and produce lists of random assignments according to your specifications: equal numbers in all groups, one group twice the size of the others, and so on. However, they’re based on the same principles as those of the random number table, so your mental effort was not in vain.
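As a rough idea of what such a program does, here is a minimal sketch using Python's standard random module. The function name random_allocation and its parameters are assumptions made for this illustration, not a description of any particular package.

```python
import random


def random_allocation(n_subjects, weights, seed=None):
    """Simple randomization: each subject is assigned independently.

    weights maps each group name to its relative size,
    e.g. {"A": 2, "B": 1} makes Group A twice as likely as Group B.
    """
    rng = random.Random(seed)  # a seed makes the list reproducible
    groups = list(weights)
    return rng.choices(groups, weights=[weights[g] for g in groups], k=n_subjects)


equal = random_allocation(12, {"A": 1, "B": 1, "C": 1}, seed=2009)
two_to_one = random_allocation(12, {"A": 2, "B": 1}, seed=2009)
print(equal)
print(two_to_one)
```

Because every subject is assigned independently, the weights set the probability of landing in each group, not the final head count; the groups will match the specified proportions only approximately.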
Block Randomization
Block randomization is a modification of random allocation in which subjects are allocated in small blocks that usually consist of two to four times the number of groups (Fig. 1229). If there are three groups, then the block size is often six, nine, or 12 subjects.
The subjects in the first block are randomly assigned so that there are equal numbers in each group (or, if the groups are not to be equal, they are assigned in proportion to the size of each group). The subjects in the succeeding blocks are then randomized in turn until the final sample size is achieved (Fig. 1230).
Figure 1229 – (Figure 3-8) Allocation of subjects into blocks

Streiner DL, Norman GR. PDQ Epidemiology-Second Edition, 1996, BC Decker Inc., Hamilton, Ontario.
Figure 1230 – (Figure 3-9) Block randomization

Block randomization ensures that, even if the study ends prematurely, there will be nearly equal numbers in all groups. With simple randomization it is possible to have a “run” of subjects assigned to one group; if the study ends at this point, an imbalance could result that would tend to reduce the efficiency of most statistical tests.
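A minimal sketch of block randomization might look like the following. The function block_randomize and its parameters (ratio, block_multiple) are hypothetical names chosen for this example; each block is built with the desired number of slots per group and then shuffled.

```python
import random


def block_randomize(n_subjects, ratio, block_multiple=2, seed=None):
    """Allocate subjects in shuffled blocks that respect the group ratio.

    ratio maps group name to relative size, e.g. {"A": 1, "B": 1, "C": 1}.
    With three equal groups and block_multiple=2, each block holds six subjects.
    """
    rng = random.Random(seed)
    template = [g for g, share in ratio.items() for _ in range(share * block_multiple)]
    allocation = []
    while len(allocation) < n_subjects:
        block = template[:]   # fresh copy of the balanced block
        rng.shuffle(block)    # random order within the block
        allocation.extend(block)
    return allocation[:n_subjects]  # the last block may be cut short


schedule = block_randomize(20, {"A": 1, "B": 1, "C": 1}, block_multiple=2, seed=1996)
for start in range(0, len(schedule), 6):
    print("".join(schedule[start:start + 6]))
```

Every complete block of six contains exactly two subjects per group, so if the study stops partway through a block, no two groups can differ by more than two subjects.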
Stratified Allocation
The aim of stratified allocation is slightly different from that of stratified selection. In the selection phase, stratification is used to ensure that the sample has certain desired characteristics. These characteristics may demand that the sample (1) matches the population on certain key variables, (2) includes sufficient numbers of subjects in all strata to permit subanalyses, or (3) has a normal distribution. The purpose of stratified allocation is simpler; it ensures that the groups do not differ appreciably on the stratification variables.
Stratified allocation is done when it is believed that the stratification variables may affect the outcome. If the groups are not balanced, any difference in outcome may result from these “nuisance” variables rather than from our intervention. For instance, if response to treatment is related to the patient’s age, we do not want the experimental and control groups to differ on this factor.
For logistical reasons it is often impractical to have more than two or three stratifying variables, unless the available population is large in relation to the sample size. Variables for stratification are chosen on the basis of their potential to affect the outcome. For example, if we believe that response to treatment is related to age but not to sex, only the former variable should be considered as a stratifying variable. If both age and duration of illness affect the outcome, but only one can be used as a stratification variable because of sample size limitations, the one that is more strongly associated with the outcome would be the variable to choose.
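One common way to carry out stratified allocation is to run a separate block randomization within each stratum; the text does not prescribe a mechanism, so treat the following as a sketch under that assumption. The age bands, the function names, and the subject records are all hypothetical.

```python
import random
from collections import defaultdict


def age_band(subject):
    """Hypothetical strata; the cut points are illustrative only."""
    age = subject["age"]
    if age < 40:
        return "under 40"
    if age < 65:
        return "40 to 64"
    return "65 and over"


def stratified_allocate(subjects, stratum_of, groups=("experimental", "control"),
                        block_multiple=2, seed=None):
    """Block-randomize separately within each stratum.

    subjects is an iterable of (subject_id, record) pairs in order of entry.
    """
    rng = random.Random(seed)
    open_blocks = defaultdict(list)  # stratum -> unused slots in its current block
    allocation = {}
    for subject_id, subject in subjects:
        stratum = stratum_of(subject)
        if not open_blocks[stratum]:  # start a new shuffled block for this stratum
            block = list(groups) * block_multiple
            rng.shuffle(block)
            open_blocks[stratum] = block
        allocation[subject_id] = open_blocks[stratum].pop()
    return allocation


# Made-up subjects, purely to exercise the function.
subjects = [(i, {"age": 20 + (i * 7) % 60}) for i in range(12)]
for subject_id, group in stratified_allocate(subjects, age_band, seed=5).items():
    print(subject_id, age_band(subjects[subject_id][1]), group)
```

Because each age band keeps its own blocks, the experimental and control groups stay close in size within every band, which is what keeps age from becoming a “nuisance” variable. Each additional stratifying variable multiplies the number of strata, which is one way to see why two or three variables is usually the practical limit.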
Minimization
Minimization is a relatively recent and sophisticated method of assigning subjects to groups and is used when there are many variables on which they should be matched. To keep matters simple, let’s assume that we want to match the groups on only two variables: age and parity. The first few subjects are assigned to the groups by simple randomization. When a new person comes along, she is tentatively placed into each group in turn, and we compute what the mean age and parity level would be if she were in that group. The group to which she is ultimately assigned is based on minimizing the age and parity differences among the groups.
To deal with both continuous and discrete variables simultaneously and with a large number of them, continuous variables such as age are broken up into categories. Then we tentatively place the subject in the first group, count the subjects in that group who fall into each of the new subject’s categories across all of the variables, and subtract the corresponding counts for the second group; we then repeat the calculation with the subject tentatively placed in the second group. The subject is allocated to whichever group results in the smaller sum of differences, reflecting the minimum difference between the groups. Taves used this technique for 15 variables simultaneously, showing that it can be done. However, it does not seem to have caught on widely yet as a replacement for simple random assignment.
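The counting procedure can be sketched as follows. This is a simplified illustration, not Taves' published procedure: the function name, the category labels, and the choice to break ties at random (which also takes care of the very first subject) are all assumptions made for the example.

```python
import random
from collections import Counter


def minimize(subjects, factors, groups=("A", "B"), seed=None):
    """Assign each subject to the group that leaves the smallest total imbalance.

    subjects is a list of dicts holding a categorical value for each factor.
    """
    rng = random.Random(seed)
    counts = {g: Counter() for g in groups}  # counts[g][(factor, level)]
    allocation = []
    for subject in subjects:
        keys = [(f, subject[f]) for f in factors]

        def imbalance_if(candidate):
            """Sum, over factors, of the spread in counts if placed in candidate."""
            total = 0
            for key in keys:
                tallies = [counts[g][key] + (1 if g == candidate else 0) for g in groups]
                total += max(tallies) - min(tallies)
            return total

        scores = {g: imbalance_if(g) for g in groups}
        best = min(scores.values())
        chosen = rng.choice([g for g in groups if scores[g] == best])  # random tie-break
        for key in keys:
            counts[chosen][key] += 1
        allocation.append(chosen)
    return allocation


# Hypothetical subjects, with age already broken into categories.
women = [
    {"age_band": "under 30", "parity": "0"},
    {"age_band": "under 30", "parity": "0"},
    {"age_band": "30 and over", "parity": "1 or more"},
    {"age_band": "30 and over", "parity": "1 or more"},
]
print(minimize(women, factors=("age_band", "parity"), seed=3))
```

In this tiny example the second woman always goes to the group opposite the first, and the fourth to the group opposite the third, so each group ends up with one woman of each type whatever the seed.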
Nonrandom (Haphazard) Allocation
Nonrandom allocation refers to situations in which subjects end up in the various groups by some manner other than having been randomly assigned. Let’s assume that we wanted to compare the mean Apgar scores of kids whose mothers worked with video display terminals (VDTs) against a group of kids whose mothers did not use VDTs. Although we could select mothers at random from these two groups, the allocation would not have been random; they would have selected themselves to work or not to work with the terminals.
The difficulty here is that there may be other factors on which these two groups of people differ. Some factors to be taken into consideration include the following:
- Working women may be healthier than women in general (see the discussion on subject selection biases in Threats to Validity).
- They may be working because they are poorer than other women (or become richer because they are working) and therefore provide a different prenatal environment.
- Even if we match for working status, those who have been chosen to be moved from typewriters to computers may be the brighter women.
In brief, the investigator has no control over factors that may, on the one hand, determine group membership and, on the other hand, affect the outcome.
The problem is even more acute in therapy trials. Clinical factors, which are also related to outcome, may have dictated whether the patient received medical or surgical treatment for his or her condition or was given one drug rather than another. So simply comparing the success rates of these haphazardly selected groups may lead to erroneous results, because we conclude that the difference between the groups was caused by the intervention rather than by the factors that originally placed the subject in one group rather than in the other.
Content on this page was last changed on March 19, 2009.
© 2002 BC Decker Inc.
Streiner DL, Norman GR. PDQ Epidemiology. 2nd ed. Hamilton, Ontario: BC Decker Inc.; 1996.