Much of the data needed for information purposes are collected through *sampling*.
A sample is a set of values taken from a *population* of those values to represent
that population without the need for taking them all. Sampling saves time,
and in many cases it is impossible to identify all the individuals in a population anyway.
While the results, or parameters, of a sample may or may not represent the parameters of the
population *exactly*, given the right sample size and an unbiased sample the results
can be very close to those of the whole population.

The first action is to define the sampling frame, i.e. to identify the population from which the sample will be taken. The population may be, for example, people, manufactured items, or a working period (as in activity sampling).

*Random sampling* is akin to a lottery and is the best method in statistical
terms as it is the most likely to reduce the effects of bias. There are several ways in which
a random sample can be taken, such as drawing well-mixed numbers from a container without
deliberate selection, or using random numbers taken from tables published in textbooks
or generated by computer spreadsheets.
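As a computer-based counterpart to drawing numbers from a container, the following is a minimal sketch using Python's standard `random` module. The function name `simple_random_sample` and the list `item_codes` are hypothetical, chosen only for illustration; the `seed` argument is there so that a draw can be repeated.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n items without replacement; every item is equally likely."""
    rng = random.Random(seed)  # a seeded generator makes the draw repeatable
    return rng.sample(list(population), n)


# Hypothetical population of 31 item codes, used only for illustration.
item_codes = list("ABCDEFGHIJKLMNOPQRSTUVWXYZ") + ["α", "β", "γ", "δ", "ε"]

sample = simple_random_sample(item_codes, 10, seed=1)
print(sample)
```

Because the draw is made without replacement, no item can appear in the sample twice.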

*Systematic sampling* (the constant skip method) is a non-random method.
To reduce the chance of bias *something* must be random within the sampling frame,
so if the sampling method is not random then the data themselves must be random.
Systematic sampling is a way of taking every *n*th value.

For example, in activity sampling (see Related Topics), a snap observation of a task being performed may be made every twenty minutes, as long as this interval does not coincide with cycles of the work (e.g. the work cycle is not a multiple of five minutes).
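The constant skip method, and the risk of an observation interval coinciding with the work cycle, can be sketched as below. The function names `systematic_sample` and `observed_phases` are hypothetical, and the cycle lengths of 5 and 7 minutes are assumed values chosen only to show the effect.

```python
def systematic_sample(items, step, start=0):
    """Take every step-th item, beginning at index start."""
    if step < 1:
        raise ValueError("step must be at least 1")
    return list(items)[start::step]


def observed_phases(interval_min, cycle_min, n_obs):
    """Minutes into the work cycle at which each observation falls."""
    return [(i * interval_min) % cycle_min for i in range(n_obs)]


# Every third item, starting from the second one.
print(systematic_sample("ABCDEFGHIJ", step=3, start=1))  # ['B', 'E', 'H']

# Observing every 20 minutes against an assumed 5-minute work cycle:
# every observation lands on the same point in the cycle.
print(observed_phases(20, 5, 6))  # [0, 0, 0, 0, 0, 0]

# Against an assumed 7-minute cycle the observations spread over the cycle.
print(observed_phases(20, 7, 7))  # [0, 6, 5, 4, 3, 2, 1]
```

When every observation falls at the same point in the cycle, the sample only ever sees one part of the task, which is the bias the text warns about.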

*Stratified sampling* is a method of using natural divisions of a sampling frame
such as age, social class, type of machine or equipment, or time periods (such as days).
This ensures that all sections of the population are represented.
Taking the activity sampling example again, if random times are used the situation could
arise where certain hours of the day had, by chance, significantly more observations than others.
If this were a problem then the *strata* could be "hours in the day", with
each hour allocated a fixed number of random or systematic observation times.
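A minimal sketch of the hourly strata idea follows, assuming a working day of 09:00 to 17:00 and three observations per hour; both figures, and the function name `stratified_observation_times`, are hypothetical. Random minutes are drawn separately within each hour, so no hour can end up with more observations than another.

```python
import random


def stratified_observation_times(hours, per_hour, seed=None):
    """Pick per_hour random minutes within each hour (one stratum per hour)."""
    rng = random.Random(seed)
    times = []
    for hour in hours:
        # Sample distinct minutes inside this hour only.
        minutes = sorted(rng.sample(range(60), per_hour))
        times.extend((hour, minute) for minute in minutes)
    return times


# Assumed working day: the hours starting at 09:00 through 16:00.
schedule = stratified_observation_times(range(9, 17), per_hour=3, seed=42)
for hour, minute in schedule:
    print(f"{hour:02d}:{minute:02d}")
```

Replacing the random draw inside the loop with evenly spaced minutes would give the systematic variant mentioned in the text.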

The results of sampling can produce a picture of the whole population, say in the form of a frequency distribution or, more specifically, its parameters. These parameters include averages (e.g. mean, median, mode and others) and measures of how the distribution is dispersed (e.g. range, mean deviation, standard deviation; see Related Topics).

Because a sample is usually a relatively small fraction of the population, it may not always
describe the population and its parameters very accurately, so estimates of those parameters will contain statistical
errors. For example, the mean of the sample may not mirror the mean of the population exactly
because we did not measure every value in the population.
In this case there will be an error associated with the sample mean.
This is called the *standard error* of the mean.

The form of the standard error depends on the type of data (see Related Topics).

An example statement describing the estimated population mean is:

*"the estimated population mean = the sample mean ± X standard errors"*

The constant X is the number of standard errors needed to give the required level of confidence in the result; it is obtained from statistical tables published in textbooks on *Statistical Method*.
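The statement above can be illustrated with a short calculation. This is a sketch under stated assumptions: it uses the common form of the standard error for the mean of measured data (the sample standard deviation divided by the square root of the sample size), it takes X = 1.96, which corresponds to roughly 95% confidence under a normal approximation, and the data values are invented for illustration. For small samples the tabulated value of X would be larger.

```python
import math
import statistics


def mean_with_interval(sample, x=1.96):
    """Return the sample mean, its standard error, and mean ± x standard errors."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Assumed form: sample standard deviation divided by sqrt(n).
    std_error = statistics.stdev(sample) / math.sqrt(n)
    return mean, std_error, (mean - x * std_error, mean + x * std_error)


# Hypothetical measurements, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mean, std_error, (low, high) = mean_with_interval(data)
print(f"sample mean = {mean:.2f}, standard error = {std_error:.3f}")
print(f"estimated population mean lies between {low:.2f} and {high:.2f}")
```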

A schematic diagram to illustrate and compare two methods of sampling

Task: To take a sample of ten items from a random population by (a) random and (b) systematic sampling. An X marks each item selected.

| item codes | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | α | β | γ | δ | ε |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| random | | | X | | | | | X | | X | | | X | | | | | | X | | | X | X | | X | | | X | | X | |
| systematic | | X | | | X | | | X | | | X | | | X | | | X | | | X | | | X | | | X | | | X | | |

Ten items sampled:

| method | items sampled |
|---|---|
| random | C, H, J, M, S, V, W, Y, β, δ |
| systematic | B, E, H, K, N, Q, T, W, Z, γ |
