ANOVA
Model and Matrix Computations
Notation
The following notation is used throughout this chapter unless otherwise stated:
$N$         Number of cases
$F$         Number of factors
$CN$        Number of covariates
$k_i$       Number of levels of factor $i$
$Y_k$       Value of the dependent variable for case $k$
$Z_{jk}$    Value of the $j$th covariate for case $k$
$w_k$       Weight for case $k$
$W$         Sum of weights of all cases
The Model
A linear model with covariates can be written in matrix notation as
$$Y = X\beta + ZC + e \tag{1}$$
where
$Y$         $N \times 1$ vector of values of the dependent variable
$X$         Design matrix ($N \times p$) of rank $q < p$
$\beta$     Vector of parameters ($p \times 1$)
$Z$         Matrix of covariates ($N \times CN$)
$C$         Vector of covariate coefficients ($CN \times 1$)
$e$         Vector of error terms ($N \times 1$)
Constraints
To reparametrize equation (1) to a full rank model, a set of non-estimable
conditions is needed. The constraint imposed on non-regression models is that all
parameters involving level 1 of any factor are set to zero.
For the regression model, the constraints are that the analysis of variance
parameter estimates for each main effect and each order of interaction sum to
zero. The interaction parameters must also sum to zero over each level of each subscript.
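For a standard two-way ANOVA model with the main effects $\alpha_i$ and $\beta_j$, and
interaction parameter $\gamma_{ij}$, the constraints can be expressed as
$$
\begin{aligned}
\alpha_1 = \beta_1 = \gamma_{1j} = \gamma_{i1} &= 0 && \text{(non-regression)} \\
\alpha_\bullet = \beta_\bullet = \gamma_{i\bullet} = \gamma_{\bullet j} &= 0 && \text{(regression)}
\end{aligned}
$$
where $\bullet$ indicates summation.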
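One consequence of the regression constraints, spelled out here as a worked step (it is implied by the sums above rather than stated in the source), is that the level-1 parameters are determined by the others. For the two-way model, with $k_1$ and $k_2$ levels:

$$
\alpha_1 = -\sum_{i=2}^{k_1} \alpha_i, \qquad
\beta_1 = -\sum_{j=2}^{k_2} \beta_j, \qquad
\gamma_{1j} = -\sum_{i=2}^{k_1} \gamma_{ij}, \qquad
\gamma_{i1} = -\sum_{j=2}^{k_2} \gamma_{ij}
$$

In both parametrizations, therefore, only the parameters for levels 2 and above need to be estimated.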
Computation of Matrices
X'X
Non-regression Model
The $X'X$ matrix contains the sum of weights of the cases that contain a particular
combination of parameters. All parameters that involve level 1 of any of the factors
are excluded from the matrix. For a two-way design with $k_1 = 2$ and $k_2 = 3$, the
symmetric matrix would look like the following:
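$$
\begin{array}{c|ccccc}
 & \alpha_2 & \beta_2 & \beta_3 & \gamma_{22} & \gamma_{23} \\ \hline
\alpha_2 & N_{2\bullet} & N_{22} & N_{23} & N_{22} & N_{23} \\
\beta_2 & & N_{\bullet 2} & 0 & N_{22} & 0 \\
\beta_3 & & & N_{\bullet 3} & 0 & N_{23} \\
\gamma_{22} & & & & N_{22} & 0 \\
\gamma_{23} & & & & & N_{23}
\end{array}
$$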
The elements $N_{i\bullet}$ or $N_{\bullet j}$ on the diagonal are the sums of weights of cases that
have level $i$ of $\alpha$ or level $j$ of $\beta$. Off-diagonal elements are sums of weights of cases
cross-classified by parameter combinations. Thus, $N_{\bullet 3}$ is the sum of weights of
cases in level 3 of main effect $\beta_3$, while $N_{22}$ is the sum of weights of cases with
$\alpha_2$ and $\beta_2$.
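The accumulation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming plain 0/1 indicator coding with every level-1 parameter dropped; the function name `xtx_nonregression`, the label tuples, and the sample weights are hypothetical and do not come from the source.

```python
from itertools import product

def xtx_nonregression(cases, k1, k2):
    """Accumulate X'X for a two-way non-regression design (sketch).

    cases: iterable of (level_a, level_b, weight) with 1-based levels.
    Parameters involving level 1 of either factor are excluded.
    Returns (labels, xtx) with xtx stored as a full symmetric matrix.
    """
    # Column order: main effects of A, main effects of B, interactions.
    labels = [("a", i) for i in range(2, k1 + 1)]
    labels += [("b", j) for j in range(2, k2 + 1)]
    labels += [("ab", i, j)
               for i, j in product(range(2, k1 + 1), range(2, k2 + 1))]

    def indicator(label, a, b):
        # 1 if the case carries this parameter, 0 otherwise.
        if label[0] == "a":
            return 1 if a == label[1] else 0
        if label[0] == "b":
            return 1 if b == label[1] else 0
        return 1 if (a, b) == (label[1], label[2]) else 0

    n = len(labels)
    xtx = [[0.0] * n for _ in range(n)]
    for a, b, w in cases:
        row = [indicator(lab, a, b) for lab in labels]
        for r in range(n):
            for c in range(n):
                xtx[r][c] += row[r] * row[c] * w
    return labels, xtx

# Hypothetical 2 x 3 design: one case per cell, weights 1..6.
cases = [(1, 1, 1.0), (1, 2, 2.0), (1, 3, 3.0),
         (2, 1, 4.0), (2, 2, 5.0), (2, 3, 6.0)]
labels, xtx = xtx_nonregression(cases, k1=2, k2=3)
```

With these weights the first diagonal entry is the total weight of the cases at level 2 of the first factor, and the cell pairing the two level-specific parameters of the second factor stays at zero because no case can be in two levels of the same factor at once.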
Regression Model
A row of the design matrix $X$ is formed for each case. The row is generated as
follows:
If a case belongs to one of the levels 2 to $k_i$ of factor $i$, a code of 1 is placed in
the column corresponding to that level and 0 in all the other columns among the
$k_i - 1$ columns associated with factor $i$. If the case belongs to the first level of
factor $i$, $-1$ is placed in all the $k_i - 1$ columns associated with factor $i$. This is
repeated for each factor. The entries for the interaction terms are obtained as
products of the entries in the corresponding main effect columns. This vector of
dummy variables for a case will be denoted as $d(i)$, $i = 1, \ldots, NC$, where $NC$ is the
number of columns in the reparametrized design matrix. After the vector $d$ is
generated for case $k$, the $ij$th cell of $X'X$ is incremented by $d(i)\,d(j)\,w_k$, where
$i = 1, \ldots, NC$ and $j \ge i$.
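The row-generation rule and the upper-triangle update can be sketched as follows for a two-factor design. This is an illustrative sketch only: the helper names `effect_row` and `accumulate_upper`, the column ordering (main effects, then interactions), and the two sample cases are assumptions made for the example.

```python
def effect_row(levels, n_levels):
    """Build the dummy vector d for one case under the regression coding.

    levels:   (level of factor 1, level of factor 2), 1-based.
    n_levels: (k_1, k_2).
    Main-effect columns come first, then the two-way interaction columns,
    each an entry-by-entry product of main-effect columns.
    """
    mains = []
    for level, k in zip(levels, n_levels):
        if level == 1:
            cols = [-1] * (k - 1)          # first level: -1 in every column
        else:
            cols = [1 if level == m else 0 for m in range(2, k + 1)]
        mains.append(cols)
    interaction = [a * b for a in mains[0] for b in mains[1]]
    return mains[0] + mains[1] + interaction

def accumulate_upper(xtx, d, w):
    """Add d(i) * d(j) * w to every cell (i, j) with j >= i."""
    nc = len(d)
    for i in range(nc):
        for j in range(i, nc):
            xtx[i][j] += d[i] * d[j] * w

# Hypothetical 2 x 3 design with two weighted cases.
k = (2, 3)
nc = (k[0] - 1) + (k[1] - 1) + (k[0] - 1) * (k[1] - 1)
xtx = [[0.0] * nc for _ in range(nc)]
for a, b, w in [(1, 1, 1.0), (2, 3, 2.0)]:
    accumulate_upper(xtx, effect_row((a, b), k), w)
```

Only the upper triangle is filled, matching the condition that the column index is at least the row index; the lower triangle follows by symmetry if a full matrix is needed.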
Checking and Adjustment for the Mean
After all cases have been processed, the diagonal entries of $X'X$ are examined.
Rows and columns corresponding to zero diagonals are deleted and the number of
levels of a factor is reduced accordingly. If a factor has only one level, the analysis
is terminated with a message. If the first specified level of a factor is missing,
the first non-empty level is deleted from the matrix for the non-regression model.
For regression designs, the first level cannot be missing. All entries of $X'X$ are
subsequently adjusted for means.
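The deletion of rows and columns with zero diagonals can be illustrated with a short sketch. The function name `drop_zero_diagonals` and the sample matrix are hypothetical, and the sketch covers only the deletion step, not the level bookkeeping or the mean adjustment.

```python
def drop_zero_diagonals(xtx, labels):
    """Remove every row and column whose diagonal entry is zero.

    xtx:    full symmetric matrix as a list of lists.
    labels: one label per column, kept in step with the matrix.
    Returns (reduced_matrix, kept_labels).
    """
    keep = [i for i in range(len(xtx)) if xtx[i][i] != 0]
    reduced = [[xtx[i][j] for j in keep] for i in keep]
    return reduced, [labels[i] for i in keep]

# Hypothetical example: the second parameter never occurred in the data.
xtx = [[4.0, 0.0, 2.0],
       [0.0, 0.0, 0.0],
       [2.0, 0.0, 3.0]]
reduced, kept = drop_zero_diagonals(xtx, ["a2", "b2", "b3"])
```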
The highest order of interactions in the model can be selected. This will affect
the generation of $X'X$. If no such option is chosen, the program will
generate the highest order of interactions allowed by the number of factors. If
submatrices corresponding to main effects or interactions in the reparametrized model
are not of full rank, a message is printed and the order of the model is reduced
accordingly.
Cross-Product Matrices for Continuous Variables
A provisional means algorithm is used to compute the adjusted-for-the-means
cross-product matrices.
Matrix of Covariates Z'Z
The covariance of covariates m and l after case k has been processed is
$$
Z'Z_{ml}(k) = Z'Z_{ml}(k-1) +
\frac{w_k \left( W_k Z_{lk} - \sum_{j=1}^{k} w_j Z_{lj} \right)
          \left( W_k Z_{mk} - \sum_{j=1}^{k} w_j Z_{mj} \right)}
     {W_k W_{k-1}}
$$
where $W_k$ is the sum of weights of the first $k$ cases.
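A minimal sketch of this update for a single pair of variables is given below, together with a two-pass computation used only as a check. The function names are hypothetical, and skipping the update while the previous weight sum is zero is an assumption made here to avoid dividing by zero on the first case; the source does not say how that case is handled.

```python
def provisional_cross_product(xs, ys, ws):
    """Weighted, mean-adjusted cross product via the one-pass update."""
    cp = 0.0
    sum_w = 0.0    # W_k, running sum of weights
    sum_wx = 0.0   # running sum of w_j * x_j
    sum_wy = 0.0   # running sum of w_j * y_j
    for x, y, w in zip(xs, ys, ws):
        w_prev = sum_w              # W_{k-1}
        sum_w += w
        sum_wx += w * x
        sum_wy += w * y
        if w_prev > 0.0:            # assumption: no update while W_{k-1} = 0
            cp += (w * (sum_w * x - sum_wx) * (sum_w * y - sum_wy)
                   / (sum_w * w_prev))
    return cp

def two_pass_cross_product(xs, ys, ws):
    """Reference value: compute weighted means first, then the sum."""
    total_w = sum(ws)
    mean_x = sum(w * x for x, w in zip(xs, ws)) / total_w
    mean_y = sum(w * y for y, w in zip(ys, ws)) / total_w
    return sum(w * (x - mean_x) * (y - mean_y)
               for x, y, w in zip(xs, ys, ws))

# Hypothetical data for two variables and case weights.
xs = [1.0, 2.0, 4.0]
ys = [2.0, 1.0, 5.0]
ws = [1.0, 2.0, 1.0]
one_pass = provisional_cross_product(xs, ys, ws)
reference = two_pass_cross_product(xs, ys, ws)
```

Passing the same list for both variables yields the corrected sum of squares for that variable, which is the special case used for the dependent variable.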
The Vector Z'Y
The covariance between the mth covariate and the dependent variable after case k
has been processed is
$$
Z'Y_{m}(k) = Z'Y_{m}(k-1) +
\frac{w_k \left( W_k Y_k - \sum_{j=1}^{k} w_j Y_j \right)
          \left( W_k Z_{mk} - \sum_{j=1}^{k} w_j Z_{mj} \right)}
     {W_k W_{k-1}}
$$
The Scalar Y'Y
The corrected sum of squares for the dependent variable after case k has been
processed is
$$
Y'Y(k) = Y'Y(k-1) +
\frac{w_k \left( W_k Y_k - \sum_{j=1}^{k} w_j Y_j \right)^{2}}
     {W_k W_{k-1}}
$$
The Vector X'Y
$X'Y$ is a vector with $NC$ rows. The $i$th element is
$$
X'Y_i = \sum_{k=1}^{N} Y_k w_k \delta_k ,
$$
where, for the non-regression model, $\delta_k = 1$ if case $k$ has the factor combination in
column $i$ of $X'X$, and $\delta_k = 0$ otherwise. For the regression model, $\delta_k = d(i)$, where
$d(i)$ is the dummy variable for column $i$ of case $k$. The final entries are adjusted
for the mean.
Matrix X'Z
The $(i, m)$th entry is
$$
X'Z_{im} = \sum_{k=1}^{N} Z_{mk} w_k \delta_k
$$
where $\delta_k$ has been defined previously. The final entries are adjusted for the mean.
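The weighted sums for these two cross products share one form, so a single sketch covers both. The source does not give the formula for the mean adjustment; the version below assumes the usual centering, in which the product of the two weighted sums divided by the total weight is subtracted from the raw sum. The function name `adjusted_cross_product` and the data are hypothetical.

```python
def adjusted_cross_product(deltas, values, weights):
    """Raw weighted cross product of a design column with a variable,
    followed by an assumed adjustment for the mean.

    deltas:  design-column entries for each case (0/1 or -1/0/1).
    values:  dependent variable or covariate values for each case.
    weights: case weights.
    """
    raw = sum(v * w * d for d, v, w in zip(deltas, values, weights))
    total_w = sum(weights)
    sum_wd = sum(w * d for d, w in zip(deltas, weights))
    sum_wv = sum(w * v for v, w in zip(values, weights))
    # Assumed adjustment: subtract (sum w*delta)(sum w*value) / W.
    return raw - sum_wd * sum_wv / total_w

# Hypothetical column of indicators, variable values, and weights.
deltas = [1, 0, 1]
values = [2.0, 4.0, 6.0]
weights = [1.0, 1.0, 2.0]
entry = adjusted_cross_product(deltas, values, weights)
```

Under this assumption the result equals the weighted sum of products of deviations from the weighted means, which is the same quantity the provisional means updates produce for the continuous variables.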
Computation of ANOVA Sum of Squares
The full rank model with covariates
$$Y = X\beta + ZC + e$$
can also be expressed as
$$Y = X_k b_k + X_m b_m + ZC + e$$
where $X$ and $\beta$ are partitioned as
$$
X = [\, X_k \mid X_m \,] \quad \text{and} \quad
\beta = \begin{bmatrix} b_k \\ b_m \end{bmatrix} .
$$