The way in which we select a sample of
individuals to be research participants is critical. How we select participants
(random sampling) will determine the population to which we may generalize our
research findings. The procedure that we use for assigning participants to different
treatment conditions (random assignment) will determine whether bias exists in
our treatment groups.
Before describing sampling procedures, we need to define a few key terms. The term
population means all members that meet a set of specifications or a specified
criterion. For example, the population of the United States is defined as all
people residing in the United States. The population of New Orleans means all
people living within the city’s limits or boundary. A population of inanimate
objects can also exist, such as all automobiles manufactured in Michigan in the
year 2003. A single member of any given population is referred to as an
element. When only some elements are selected from a population, we refer to
that as a sample; when all elements are included, we call it a census.
Data derived from a sample are treated statistically. Using sample data, we
calculate various statistics, such as the mean and standard deviation. These
sample statistics summarize (describe) aspects of the sample data. These data,
when treated with other statistical procedures, allow us to make certain
inferences. From the sample statistics, we make corresponding estimates of the
population. Thus, from the sample mean, we estimate the population mean; from
the sample standard deviation, we estimate the population standard deviation.
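As a minimal sketch of estimating population values from sample statistics, the Python snippet below draws a sample from a simulated population and compares the two. The population of 10,000 scores, the sample size of 200, and all variable names are assumptions made for illustration, not data from any study.

```python
import random
import statistics

# Hypothetical population: 10,000 simulated scores (illustrative only).
rng = random.Random(42)
population = [rng.gauss(100, 15) for _ in range(10_000)]

# Select 200 elements at random, without replacement.
sample = rng.sample(population, 200)

# Sample statistics describe the sample and serve as estimates
# of the corresponding population values.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # uses n - 1 in the denominator

# The population is simulated, so its true values can be computed directly.
population_mean = statistics.mean(population)
population_sd = statistics.pstdev(population)

print(f"mean: sample {sample_mean:.2f}, population {population_mean:.2f}")
print(f"sd:   sample {sample_sd:.2f}, population {population_sd:.2f}")
```

In real research the population values are unknown, which is why the sample statistics are needed as estimates; the comparison is possible here only because the population was generated in code.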
Types of Sampling
Simple Random Sampling
Researchers use two major sampling techniques:
probability sampling and nonprobability sampling. With probability sampling, a
researcher can specify the probability of an element’s (participant’s) being
included in the sample. With nonprobability sampling, there is no way of
estimating the probability of an element’s being included in a sample. If the
researcher’s interest is in generalizing the findings derived from the sample
to the general population, then probability sampling is far more useful and
precise. Unfortunately, it is also much more difficult and expensive than
nonprobability sampling. Probability sampling is also referred to as random
sampling or representative sampling. The
word random describes the procedure used to select elements (participants,
cars, test items) from a population.
When random sampling is used, each element in the population has an
equal chance of being selected (simple random sampling) or a known probability
of being selected (stratified random sampling). The sample is referred to as
representative because the characteristics of a properly drawn sample represent
the parent population in all ways.
One caution before we begin our description of simple random sampling: Random
sampling is different from random assignment. Random assignment describes the
process of placing participants into different experimental groups.
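To make the distinction concrete, here is a small sketch of random assignment, assuming the participants have already been selected. The function name `randomly_assign`, the participant labels, and the two group names are hypothetical.

```python
import random

def randomly_assign(participants, n_groups=2, seed=None):
    """Shuffle the participants, then deal them into n_groups in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    return [shuffled[i::n_groups] for i in range(n_groups)]

# Twenty participants who are already in the study (labels are made up).
participants = [f"P{i:02d}" for i in range(1, 21)]
treatment, control = randomly_assign(participants, n_groups=2, seed=1)

print("treatment:", treatment)
print("control:  ", control)
```

Nothing in this step touches the population list. Random sampling decides who enters the study; random assignment decides which condition each participant receives once in it.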
Step 1. Defining the Population
Before a sample is taken, we must first define
the population to which we want to generalize our results. The population of
interest may differ for each study we undertake. It could be the population of
professional football players in the United States or the registered voters in
Bowling Green, Ohio. It could also be all college students at a given
university, or all sophomores at that institution. It could be female students,
or introductory psychology students, or 10-year-old children in a particular school,
or members of the local senior citizens center. The point should be clear: the
sample should be drawn from the population to which you want to generalize—the
population in which you are interested.
It is unfortunate that many researchers fail to make explicit their population of
interest. Many investigators use only college students in their samples, yet
their interest is in the adult population of the United States. To a large
extent, the generalizability of sample data depends on what is being studied
and the inferences that are being made.
Step 2. Constructing a List
Before a sample can be chosen randomly, it is
necessary to have a complete list of the population from which to select. In
some cases, the logistics and expense of constructing a list of the entire
population are simply too great, and an alternative procedure is forced upon the
investigator. We could avoid this problem by restricting our population of
interest—by defining it narrowly. However, doing so might increase the
difficulty of finding or constructing a list from which to make our random
selection. For example, you would have no difficulty identifying female
students at any given university and then constructing a list of their names
from which to draw a random sample. It would be more difficult to identify female students coming from a
three-child family, and even more difficult if you narrowed your interest to
firstborn females in a three-child family. Moreover, defining a population narrowly also means
generalizing results narrowly.
Caution must be exercised in compiling a list
or in using one already constructed. The population list from which you intend
to sample must be both recent and exhaustive. If not, problems can occur. By an
exhaustive list, we mean that all members of the population must appear on the
list. Voter registration lists, telephone directories, homeowner lists, and
school directories are sometimes used, but these lists may have limitations.
They must be up to date and complete if the samples chosen from them are to be
truly representative of the population. In addition, such lists may provide
very biased samples for some research questions we ask.
Step 3. Drawing the Sample
After a list of population members has been
constructed, various random sampling options are available. Some common ones
include tossing dice, flipping coins, spinning wheels, drawing names out of a
rotating drum, using a table of random numbers, and using computer programs.
Except for the last two methods, most of the techniques are slow and
cumbersome. Tables of random numbers are easy to use, accessible, and truly
random.
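Drawing the sample by computer might look like the sketch below. It assumes the population list from Step 2 is already held in memory; the roster of 1,200 student identifiers and the function name `simple_random_sample` are hypothetical.

```python
import random

def simple_random_sample(population_list, n, seed=None):
    """Select n distinct elements; every element has an equal chance."""
    rng = random.Random(seed)
    return rng.sample(population_list, n)  # sampling without replacement

# Hypothetical population list: 1,200 student identifiers.
roster = [f"student_{i:04d}" for i in range(1, 1201)]

chosen = simple_random_sample(roster, 50, seed=2024)
print(chosen[:5])
```

Fixing the seed makes the draw reproducible, which can help when the selection procedure has to be documented. Omitting it gives a different sample on each run.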
Step 4. Contacting Members of a Sample
Researchers using random sampling procedures must be prepared to encounter difficulties at
several points. As we noted, the starting point is an accurate statement that
identifies the population to which we want to generalize. Then we must obtain a
listing of the population, accurate and up-to-date, from which to draw our
sample. Further, we must decide on the random selection procedure that we wish
to use. Finally, we must contact each of those selected for our sample and
obtain the information needed. Failing to contact all individuals in the sample
can be a problem, and the representativeness of the sample can be lost at this
point.
Stratified Random Sampling
The procedure known as stratified random
sampling is also a form of probability sampling. To stratify means to classify
or to separate people into groups according to some characteristics, such as
position, rank, income, education, sex, or ethnic background. These separate
groupings are referred to as subsets or subgroups. For a stratified random
sample, the population is divided into groups or strata. A random sample is
selected from each stratum based upon the percentage that each subgroup
represents in the population. Stratified random samples are generally more
accurate in representing the population than are simple random samples. They
also require more effort, and there is a practical limit to the number of
strata used. Because participants are to be chosen randomly from each stratum,
a complete list of the population within each stratum must be constructed.
Stratified sampling is generally used in two different ways. In one, primary
interest is in the representativeness of the sample for purposes of commenting
on the population. In the other, the focus of interest is comparison between
and among the strata.
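Both uses can be sketched with one small routine. In the example below, the population of 1,000 students, the three class-year strata, and the function names are assumptions made for illustration. Proportional allocation serves the first use (representing the population), and equal allocation serves the second (comparing strata).

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, sizes, seed=None):
    """Draw a simple random sample of sizes[s] elements from each stratum s."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in population:
        strata[stratum_of(element)].append(element)
    return {s: rng.sample(strata[s], k) for s, k in sizes.items()}

def proportional_sizes(population, stratum_of, n):
    """Split n across strata according to each stratum's population share.
    Rounding means the sizes may not always add up to exactly n."""
    counts = defaultdict(int)
    for element in population:
        counts[stratum_of(element)] += 1
    return {s: round(n * c / len(population)) for s, c in counts.items()}

def class_year(student):
    return student[1]

# Hypothetical population: (id, class year) pairs in a 60/30/10 split.
population = (
    [(i, "freshman") for i in range(600)]
    + [(i, "sophomore") for i in range(600, 900)]
    + [(i, "junior") for i in range(900, 1000)]
)

# Use 1: represent the population (sizes follow population percentages).
sizes = proportional_sizes(population, class_year, 100)
representative = stratified_sample(population, class_year, sizes, seed=3)

# Use 2: compare strata (the same number from every stratum).
equal = {"freshman": 30, "sophomore": 30, "junior": 30}
comparison = stratified_sample(population, class_year, equal, seed=3)

print(sizes)
print({s: len(members) for s, members in comparison.items()})
```

Note that the routine needs the full membership of every stratum before it can sample, which mirrors the requirement that a complete list be constructed for each stratum.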
Stratified samples are sometimes used to optimize group comparisons. In this case, we are
not concerned about representing the total population. Instead, our focus is on
comparisons involving two or more strata. If the groups involved in our
comparisons are equally represented in the population, a single random sample
could be used. When this is not the case, a different procedure is necessary.
For example, if we were interested in making comparisons between whites and
blacks, a simple random sample of 100 people might include about 85 to 90
whites and only 10 to 15 blacks. This is hardly a satisfactory sample for
making comparisons. With a stratified random sample, we could randomly choose
50 whites and 50 blacks and thus optimize our comparison. Whenever strata
rather than the population are our primary interest, we can sample in different
proportions from each stratum. Although random sampling is optimal from a
methodological point of view, it is not always possible from a practical point
of view.
Convenience Sampling
Convenience sampling is used because it is
quick, inexpensive, and convenient. Convenience samples are useful for certain
purposes, and they require very little planning. Researchers simply use
participants who are available at the moment. The procedure is casual and easy,
relative to random sampling. Contrast using any available participants with
random sampling, where you must (1) have a well-defined population, (2) construct
a list of members of the population if one is not available, (3) sample
randomly from the list, and (4) contact and use as many individuals from the
list as possible. Convenience sampling requires far less effort. However, such
convenience comes with potential problems, which we will describe. Convenience
samples are nonprobability samples. Therefore, it is not possible to specify
the probability of any population element’s being selected for the sample.
Indeed, it is not possible to specify the population from which the sample was
drawn. In shopping malls or airports, for example, individuals are selected as they pass a
certain location and interviewed concerning issues, candidates, or other topics.
Quota Sampling
In many large-scale applications of sampling
procedures, it is not always possible or desirable to list all members of the
population and randomly select elements from that list. The reasons for using
any alternative procedures include cost, timeliness, and convenience. One
alternative procedure is quota sampling.
This technique is often used by market researchers and those taking political polls.
Usually, when this technique is used, the population of interest is large and
there are no ready-made lists of names available from which to sample randomly.
The Gallup Poll is one of the best-known, well-conducted polls to use quota
sampling. This poll frequently reports on major public issues and on
presidential elections. The results of the poll are syndicated for a fee that
supports it. In this quota sampling procedure, localities are selected and
interviewers are assigned a starting point, a specified direction, and a goal
of trying to meet quotas for subsets (ethnic origins, political affiliations,
and so on) selected from the population. Although some notable exceptions have
occurred, predictions of national elections over the past few years have been
relatively accurate—certainly, much more so than guesswork.
In the quota sampling procedure, we first decide which subgroups of the population
interest us. This, in turn, is dictated by the nature of the problem being
investigated (the question being asked). For issues of national interest (such
as abortion, drug use, or political preference), frequently used subsets are
age, race, sex, socioeconomic level, and religion. The intent is to select a
sample whose frequency distribution of characteristics reflects that of the
population of interest. Obviously, it is necessary to know the percentage of
individuals making up each subset of the population if we are to match these
percentages in the sample. For example, if you were interested in ethnic groups
such as Italians, Germans, Russians, and so on, and knew their population
percentages, you would select your sample so as to obtain these percentages.
Within each subset, participants are not chosen randomly. This is simply because there
are usually no ready-made lists from which the researcher can select randomly.
Often individuals are selected in the sample on the basis of availability. For
this reason, quota sampling is less expensive. It would not be so if lists of
the population of interest had to be constructed. However, if exhaustive
ready-made lists were conveniently available for the population of interest,
then choosing participants randomly would be possible and preferable. In the
absence of such lists, it is much more convenient to select quotas by knocking
on doors, telephoning numbers, or sending mailings until the sample percentages
for subsets match those of the population. Obviously, even though the quotas
may be achieved and the sample may match the population percentages in terms of
subsets, the sample may still not represent (reflect) the population to which
we wish to generalize.
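The quota procedure described above can be sketched as follows. The age groups, the population percentages, the simulated stream of passers-by, and the function name `quota_sample` are all hypothetical. The point of the sketch is that people are accepted in the order they happen to be available, not drawn at random from a list.

```python
import random

def quota_sample(stream, group_of, quotas):
    """Accept people in order of availability until every quota is met."""
    remaining = dict(quotas)
    selected = []
    for person in stream:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            selected.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled
    return selected

# Assumed population percentages for three age groups.
population_percent = {"18-34": 30, "35-54": 35, "55+": 35}
n = 100
quotas = {g: round(n * pct / 100) for g, pct in population_percent.items()}

# Simulated passers-by: younger people are far more likely to be
# available at this location than their population share suggests.
rng = random.Random(11)
groups = list(population_percent)
passers_by = [(i, rng.choices(groups, weights=[6, 3, 1])[0]) for i in range(2000)]

selected = quota_sample(passers_by, lambda person: person[1], quotas)
print(quotas, len(selected))
```

The finished sample matches the population percentages on age, yet everyone in it was someone who happened to pass by. Any other characteristic tied to being at that location remains uncontrolled, which is why meeting the quotas does not guarantee a representative sample.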
Interviewers, for sampling purposes, concentrate on areas where large numbers
of people are likely to be. This could bias the findings. As we noted earlier,
samples taken in airports may overrepresent high-income groups, whereas those
at bus or rail depots may overrepresent low-income groups. Samples at either
place may underrepresent those who seldom travel. Also, people who are home
during the day, and are therefore available for house-to-house interviews or
telephone calls, may be quite different in important ways from those who are not
home. In this respect, quota sampling and convenience sampling are similar. In
spite of these difficulties, the quota system is widely used and will
unquestionably continue to be so for economic and logistic reasons.
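Table No. 1
Simple random sampling. Advantages: representative of the population. Disadvantages: may be difficult to obtain the list; may be more expensive.
Stratified random sampling. Advantages: representative of the population. Disadvantages: may be difficult to obtain the list; may be more expensive.
Convenience sampling. Advantages: no member list needed. Disadvantages: may not be representative of population.
Quota sampling. Advantages: no member list needed. Disadvantages: may not be representative of population.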
Error can occur during the sampling process.
Sampling error can include both systematic sampling error and random sampling
error. Systematic sampling error is the fault of the investigation, but random
sampling error is not. When errors are systematic, they bias the sample in one
direction. Under these circumstances, the sample does not truly represent the
population of interest. Systematic error occurs when the sample is not drawn
properly, as in the poll conducted by Literary Digest magazine. It can also
occur if names are dropped from the sample list because some individuals were
difficult to locate or uncooperative. Individuals dropped from the sample could
be different from those retained. Those remaining could quite possibly produce
a biased sample. Political polls often have special problems that make
prediction difficult. Random sampling error, as contrasted to systematic
sampling error, is often referred to as chance error. Purely by chance, samples
drawn from the same population will rarely provide identical estimates of the
population parameter of interest. These estimates will vary from sample to
sample.
When we conduct research, we are generally interested in drawing some conclusion
about a population of individuals that have some common characteristic.
However, populations are typically too large to allow observations on all
individuals, and we resort to selecting a sample. In order to make inferences
about the population, the sample must be representative. Thus, the manner in
which the sample is drawn is critical. Probability sampling uses random
sampling in which each element in the population (or a subgroup of the
population with stratified random sampling) has an equal chance of being
selected for the sample. This technique is considered to be the best means of
obtaining a representative sample. When probability sampling is not possible,
nonprobability sampling must be used. Convenience sampling involves using
participants who are readily available (such as introductory psychology students).
It is the easiest technique but the poorest from a methodological standpoint.
Quota sampling is essentially convenience sampling in which there is an effort
to better represent the population by sampling a certain percentage of
participants from subgroups that correspond to the prevalence of those
subgroups in the population.
By their very nature, samples do not perfectly
match the population from which they are drawn. There is always some degree of
sampling error, and the degree of error is inversely related to the size of the
sample. Larger samples are more likely to accurately represent characteristics
of the population, and smaller samples are less likely to accurately represent
characteristics of the population. Therefore, researchers strive for samples
that are large enough to reduce sampling error to an acceptable level. Even
when samples are large enough, it is important to evaluate the specific method
by which the sample was drawn. We are increasingly exposed to information
obtained from self-selected samples that represent only a very narrow subgroup
of individuals. Much of such information is meaningless because the subgroup is
difficult to identify.
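The inverse relation between sample size and random sampling error can be illustrated with a short simulation. For simple random samples, the standard error of the mean is approximately the population standard deviation divided by the square root of the sample size, so the spread of sample means should shrink as the sample grows. The simulated population, the three sample sizes, and the number of repetitions below are assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population of 20,000 scores (illustrative only).
population = [rng.gauss(50, 10) for _ in range(20_000)]

def spread_of_sample_means(n, repetitions=300):
    """Draw many random samples of size n and measure how much their
    means vary from sample to sample (random sampling error)."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repetitions)]
    return statistics.stdev(means)

spreads = {n: spread_of_sample_means(n) for n in (10, 100, 1000)}

for n, spread in spreads.items():
    print(f"n = {n:4d}: sample means vary with a standard deviation of {spread:.2f}")
```

This only addresses random sampling error. A larger sample does nothing to remove systematic error, so a large self-selected sample can still misrepresent the population.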