Statistics [20]: Experimental Design

11 minute read

Published:

Summary of some of the commonly used experimental design methods, including completely randomized designs, randomized block designs, full factorial designs and fractional factorial designs.


Completely Randomized Designs

For completely randomized designs, the levels of the primary factor are randomly assigned to the experimental units. By randomization, we mean that the run sequence of the experimental units is determined randomly.

For example, if there are 3 levels of the primary factor with each level to be run 2 times, then there are possible run sequences in total. Because of replication, the number of unique ordering would be .

All completely randomized designs with one primary factor are defined by 3 numbers:

  • : number of factors
  • : number of levels
  • : number of replications

and the total sample size would be .

Assume , then . The randomized sequence of a trial might look like:

X1
3
1
4
2
2
3
1
4
1
3
2
4

There are in total = 369600 ways to run the experiment, all equally likely to be picked by a randomization procedure.


Randomized Block Designs

For randomized block designs, there is one factor or variable that is of primary interest. However, there are also several other nuisance factors, which are those that may affect the measured result, but are not of primary interest.

When we can control nuisance factors, an important technique known as blocking can be used to reduce or eliminate the contribution to experimental error contributed by nuisance factors. The basic concept is to create homogeneous blocks in which the nuisance factors are held constant and the factor of interest is allowed to vary. Within blocks, it is possible to assess the effect of different levels of the factor of interest without having to worry about variations due to changes of the block factors, which are accounted for in the analysis.

Example

Suppose engineers at a semiconductor manufacturing facility want to test whether different wafer implant material dosages have a significant effect on resistivity measurements after a diffusion process taking place in a furnace. They have four different dosages they want to try and enough experimental wafers from the same lot to run three wafers at each of the dosages.

The nuisance factor they are concerned with is “furnace run” since it is known that each furnace run differs from the last and impacts many process parameters.

An ideal way to run this experiment would be to run all the wafers in the same furnace run. That would eliminate the nuisance furnace factor completely. However, regular production wafers have furnace priority, and only a few experimental wafers are allowed into any furnace run at the same time.

A non-blocked way to run this experiment would be to run each of the twelve experimental wafers, in random order, one per furnace run. That would increase the experimental error of each resistivity measurement by the run-to-run furnace variability and make it more difficult to study the effects of the different dosages.

The blocked way to run this experiment, assuming you can convince manufacturing to let you put four experimental wafers in a furnace run, would be to put four wafers with different dosages in each of three furnace runs. The only randomization would be choosing which of the three wafers with dosage 1 would go into furnace run 1, and similarly for the wafers with dosages 2, 3 and 4.

Let be dosage “level” and be the blocking factor furnace run. Then the experiment can be described as follows:

  • : 2 factors
  • : 4 levels of factor
  • : 3 levels of factor
  • : 1 replication per cell
  • runs

Before randomization, the design trials look like:

X1 X2
1  1
1  2
1  3
2  1
2  2
2  3
3  1
3  2
3  3
4  1
4  2
4  3

This can also be represented using matrix.

Latin Square

For Latin square designs, there are 2 nuisance factors, the 2 nuisance factors are divided into a tabular grid with the property that each row and each column receive each treatment exactly once.

Note:

  • The number of levels of each blocking variable must equal the number of levels of the treatment factor.
  • The Latin square model assumes that there are no interactions between the blocking variables or between the treatment variable and the blocking variable.

Below are designs for 3- and 4-level factors. When using any of these designs, be sure to randomize the treatment units and trial order, as much as the design allows.

For three-level factors:

  • : 3 factors (2 blocking factors and 1 primary factor)
  • : 3 levels of factor
  • : 3 levels of factor
  • : 3 levels of factor
  • : 1 replication per cell
  • runs
    X1 X2 X3
    1  1  1
    1  2  2
    1  3  3 
    2  1  3
    2  2  1
    2  3  2
    3  1  2
    3  2  3 
    3  3  1
    

    This can alternately be represented as

    A B C
    C A B
    B C A
    

For four-level factors:

  • : 3 factors (2 blocking factors and 1 primary factor)
  • : 4 levels of factor
  • : 4 levels of factor
  • : 4 levels of factor
  • : 1 replication per cell
  • runs
    X1 X2 X3
    1  1  1
    1  2  2
    1  3  4 
    1  4  3
    2  1  4
    2  2  3
    2  3  1
    2  4  2
    3  1  2
    3  2  4 
    3  3  3
    3  4  1
    4  1  3
    4  2  1
    4  3  2
    4  4  4
    

    This can alternately be represented as

    A B D C
    D C A B
    B D C A
    C A B D
    

Graeco-Latin Square

For Graeco-Latin square designs there are 3 nuisance factors, a Graeco-Latin square design is a tabular grid in which is the number of levels of the treatment factor. However, it uses 3 blocking variables instead of the 2 used by the standard Latin square design.

For three-level factors:

  • : 4 factors (3 blocking factors and 1 primary factor)
  • : 3 levels of factor
  • : 3 levels of factor
  • : 3 levels of factor
  • : 3 levels of factor
  • : 1 replication per cell
  • runs
    X1 X2 X3 X4
    1  1  1  1 
    1  2  2  2
    1  3  3  3
    2  1  2  3
    2  2  3  1
    2  3  1  2
    3  1  3  2
    3  2  1  3
    3  3  2  1
    

    This can alternately be represented as (A, B, and C represent the treatment factor and 1, 2, and 3 represent the blocking factor)

    A1 B2 C3
    C2 A3 B1
    B3 C1 A2
    

Hyper-Graeco-Latin Square

For Hyper-Graeco-Latin square designs there are 4 nuisance factors. A Hyper-Graeco-Latin square design is also a tabular grid with denoting the number of levels of the treatment factor. However, it uses 4 blocking variables instead of the 2 used by the standard Latin square design.

For four-level factors:

  • : 5 factors (4 blocking factors and 1 primary factor)
  • : 4 levels of factor
  • : 4 levels of factor
  • : 4 levels of factor
  • : 4 levels of factor
  • : 4 levels of factor
  • : 1 replication per cell
  • runs
    X1 X2 X3 X4 X5
    1  1  1  1  1
    1  2  2  2  2
    1  3  3  3  3
    1  4  4  4  4
    2  1  4  2  3 
    2  2  3  1  4
    2  3  2  4  1
    2  4  1  3  2
    3  1  2  3  4
    3  2  1  4  3
    3  3  4  1  2
    3  4  3  2  1
    4  1  3  4  2
    4  2  4  3  1
    4  3  1  2  4
    4  4  2  1  3
    

    This can alternately be represented as (A, B, C and D represent the treatment factor and 1, 2, 3 and 4 represent the blocking factor)

    A11 B22 C33 D44
    C42 D31 A24 B13
    D23 C14 B41 A32
    B34 A43 D12 C21
    

Full Factorial Designs

A design in which every setting of every factor appears with every setting of every other factor is a full factorial design.

A common experimental design is one with all input factors set at two levels each. These levels are called high' and low’, respectively. A design with all possible high/low combinations of all the input factors is called a full factorial design in two levels. If there are factors, each at 2 levels, a full factorial design has runs.

Consider the two-level, full factorial design for three factors, in tabular form, this design is given by:

run  X1  X2  X3
1    -1  -1  -1
2     1  -1  -1
3    -1   1  -1
4     1   1  -1
5    -1  -1   1
6     1  -1   1
7    -1   1   1
8     1   1   1

Fractional Factorial Designs

A factorial experiment in which only an adequately chosen fraction of the treatment combinations required for the complete factorial experiment is selected to be run.

Consider the two-level, full factorial design for three factors, namely the design. This implies eight runs (not counting replications or center points).

run  X1  X2  X3  Y
1    -1  -1  -1  y1=33
2     1  -1  -1  y2=63
3    -1   1  -1  y3=41
4     1   1  -1  y4=57
5    -1  -1   1  y5=57
6     1  -1   1  y6=51
7    -1   1   1  y7=59
8     1   1   1  y8=53

The right-most column of the table lists through to indicate the responses measured for the experimental runs when listed in standard order.

From the entries in the table we are able to compute all ‘effects’ such as main effects, first-order ‘interaction’ effects, etc. For example, to compute the main effect estimate of factor ,

Suppose, however, that we only have enough resources to do four runs. Is it still possible to estimate the main effect for ? Or any other main effect?

For example, using these four runs (1, 4, 6 and 7), we can still compute as follows:

The half fraction we used is a new design written as . Note that , which is the number of runs in this half-fraction design.

Construct the Fractional Factorial Design

To construct the fractional factorial design, we start from the full fractional design.

    X1  X2
1   -1  -1
2   +1  -1
3   -1  +1
4   +1  +1

We can add as the third column,

    X1  X2  X1*X2
1   -1  -1   +1
2   +1  -1   -1
3   -1  +1   -1
4   +1  +1   +1

We may now substitute in place of in the table.

    X1  X2  X3
1   -1  -1  +1
2   +1  -1  -1
3   -1  +1  -1
4   +1  +1  +1

Table of Contents

Comments