Post by Sonny
Thanks a lot everyone.
I'll do a principal component analysis on all the items using their
correlation coefficients. I'll instruct SPSS to extract only the first two
principal components, since this is what the items are intended for. I'll use
an oblique rotation to align the components, since I expect these two attitude
constructs to be somewhat related to each other. Hopefully the results will
show that 1) much of the variance in the original items has been accounted
for by these two factors, and 2) the items load high on their own
factor and low on the other. If not, I'll check which items
cross-load, remove those items, and repeat the process if necessary until I
have the two intended factors. Then I'll compute factor scores using the
regression or Bartlett method for my next-step regression analysis.
What do you think?
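As a rough illustration of the first steps of this plan outside SPSS, here is a minimal Python sketch, assuming NumPy is available. The function names, the simulated data and the 0.4 cutoff are hypothetical choices made for this example only. The sketch extracts two principal components from the item correlation matrix and reports the share of variance they account for; it does not perform the oblique rotation, and the cross-loading check is meant to be applied to rotated loadings such as those SPSS would report.

```python
import numpy as np

def principal_components(R, k):
    """Loadings of the first k principal components of a correlation matrix R."""
    vals, vecs = np.linalg.eigh(R)              # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]          # indices of the k largest
    loadings = vecs[:, order] * np.sqrt(vals[order])
    explained = vals[order].sum() / R.shape[0]  # trace of R equals the item count
    return loadings, explained

def cross_loading_items(loadings, cutoff=0.4):
    """Row indices with an absolute loading >= cutoff on more than one column."""
    return np.where((np.abs(loadings) >= cutoff).sum(axis=1) > 1)[0]

# Hypothetical data: 300 respondents, 6 items driven by two correlated constructs.
rng = np.random.default_rng(1)
f = rng.multivariate_normal([0, 0], [[1.0, 0.3], [0.3, 1.0]], size=300)
pattern = np.array([[.8, 0], [.7, 0], [.6, 0], [0, .8], [0, .7], [0, .6]])
X = f @ pattern.T + 0.6 * rng.standard_normal((300, 6))
R = np.corrcoef(X, rowvar=False)

loadings, explained = principal_components(R, k=2)
print(f"share of variance accounted for by two components: {explained:.2f}")

# Made-up rotated loadings for illustration; the second row loads on both columns.
rotated = np.array([[.75, .10], [.55, .50], [.70, .05], [.10, .80]])
print(cross_loading_items(rotated))
```

Unrotated components usually show a general first component on which most items load, so flagging cross-loadings only makes sense after rotation.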
Well, Rich Ulrich's answer reminds me that
my own (long) answer drifted a bit away from your
last question.
I understand you are in an exploratory situation,
looking for factors that best separate two groups of
items, and also exploring which
items should be included. I think your idea to
explore it this way is OK so far, and surely
interesting. But note that factors found by
arbitrarily removing and including items this
way may later be reported only as what they are: a pure
heuristic, possibly reflecting only a chance combination of
items and sample features. They merely suggest
a hypothesis, which must be investigated
later with different datasets and contrasted with
competing models.
But you seem to be in such an exploratory situation,
so I think that your proposed approach is meaningful
and interesting.
Gottfried Helms
P.S.:
Concerning that exploration I may add one comment.
For a flexible exploratory approach to such questions,
I wrote a DOS-based program some years ago which allows
you to find structures similar to yours by interactively
rotating and selecting the interesting items, plus
some enhancements.
The idea is to first include all variables, the items
to be factored plus the (metric) items to which
the factors should later be related, in a common vector
space.
Say the loadings matrix looks like this, with unknown values
indicated by *, unknown small loadings by ., uninteresting loadings
by ?, and zero loadings by -:
common itemspecific
factors variances
1 2 3 4 5 6 7 8 9
------------------------------------
co-variate items
age * * * * . - - - - -
scaleitems part a ------------------
it_a1 * . . . . - * - - - - -
it_a2 * * . . . - - * - - - -
it_a3 * . . . . - - - * - - -
scaleitems part b ------------------
it_b1 . * . . . - - - - * - -
it_b2 . * . . . - - - - - * -
it_b3 * * . . . - - - - - - *
where the (metric) co-variate items are already
included and the computation of factor scores
is not needed.
Improving toward varimax factors, in which you are
interested, would be done by "deactivating" the
cross-loading items to find
common itemspecific
factors factors/variances
1 2 3 4 5 6 7 8 9 10 11
------------------------------------
co-variate items
age * * * ? * - - - - - -
scaleitems part a ------------------
it_a1 ** . . . . - * - - - - -
it_a2 ? ? . . . - - * - - - -
it_a3 ** . . . . - - - * - - -
scaleitems part b ------------------
it_b1 . ** . . . - - - - * - -
it_b2 . ** . . . - - - - - * -
it_b3 ? ? . . . - - - - - - *
interactively, where the factors are rotated to
optimize the criterion for the "active" items only. This can
be improved interactively by just activating/
deactivating appropriate scale items and re-rotating.
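The idea of rotating for the "active" items only can be sketched in a few lines of Python. This is not the code of the DOS program, just a minimal illustration assuming NumPy. It uses raw varimax (no Kaiser normalization) by pairwise planar rotations, computes each rotation angle from the active rows alone, and applies the rotation to all rows. The function name varimax_active and the example loadings are hypothetical.

```python
import numpy as np

def varimax_active(loadings, active, sweeps=50, tol=1e-10):
    """Raw varimax via pairwise planar rotations (Kaiser's angle formula).

    Angles are computed from the rows flagged in `active` only;
    the resulting rotation is applied to every row.
    """
    L = np.array(loadings, dtype=float)
    act = np.asarray(active, dtype=bool)
    p = act.sum()
    k = L.shape[1]
    T = np.eye(k)                       # accumulated orthogonal rotation
    for _ in range(sweeps):
        largest = 0.0
        for i in range(k - 1):
            for j in range(i + 1, k):
                x, y = L[act, i], L[act, j]
                u, v = x**2 - y**2, 2 * x * y
                num = 2 * np.sum(u * v) - 2 * u.sum() * v.sum() / p
                den = np.sum(u**2 - v**2) - (u.sum()**2 - v.sum()**2) / p
                phi = 0.25 * np.arctan2(num, den)
                largest = max(largest, abs(phi))
                c, s = np.cos(phi), np.sin(phi)
                G = np.eye(k)
                G[i, i], G[i, j], G[j, i], G[j, j] = c, -s, s, c
                L = L @ G
                T = T @ G
        if largest < tol:               # no pair needed a noticeable rotation
            break
    return L, T

# Hypothetical loadings: it_a1, it_a2, it_a3, it_b1, it_b2, it_b3, age.
# it_a2 and it_b3 cross-load; age is a covariate. All three are deactivated.
S = np.array([[.8, 0], [.5, .5], [.7, 0], [0, .8], [0, .7], [.5, .5], [.3, .2]])
active = [True, False, True, True, True, False, False]
theta = np.deg2rad(30)                  # hide the structure by an arbitrary rotation
Q = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
A = S @ Q

L, T = varimax_active(A, active)
print(np.round(L, 3))
```

Because the rotation is orthogonal, the communalities and reproduced correlations of all rows stay the same; only the criterion that picks the rotation ignores the deactivated rows.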
Unfortunately I couldn't implement oblique rotations
like promax and others when I wrote that program.
But see below for "oblique factors" as "latent variables".
The inclusion of the co-variate items, which don't
influence the rotation criteria, allows you to see
their correlations with the found factors in one shot,
without the need to estimate factor scores.
This concept of including the covariates in the
overall vector space also lets you ensure that the
found factors are uncorrelated with the item-specific
variance in the covariates (here "age") as well,
which could not be achieved with common procedures,
except by subsequent factor analyses of the
factor scores together with the covariates.
Since that program cannot do oblique rotations,
it would be possible instead to proceed from the
above configuration and first find a principal
component for the interesting scale items "a", then
include that as a new "latent variable" in the set:
common itemspecific
factors factors/variances
1 2 3 4 5 6 7 8 9 10 11
------------------------------------
co-variate items
age * * * ? * - - - - - -
scaleitems part a ------------------
it_a1 ** . . . . - * - - - - -
it_a2 ? ? . . . - - * - - - -
it_a3 ** . . . . - - - * - - -
scaleitems part b ------------------
it_b1 ? ? . . . - - - - * - -
it_b2 ? ? . . . - - - - - * -
it_b3 ? ? . . . - - - - - - *
pc of common variance of "good" scaleitems a
pc_a 1 - - - - - - - - - - -
Then do a new rotation for the principal component
of the interesting scale items b and add this
component as another new latent variable to the
set:
common itemspecific
factors factors/variances
1 2 3 4 5 6 7 8 9 10 11
------------------------------------
co-variate items
age * * * ? * - - - - - -
scaleitems part a ------------------
it_a1 ? ? . . . - * - - - - -
it_a2 ? ? . . . - - * - - - -
it_a3 ? ? . . . - - - * - - -
scaleitems part b ------------------
it_b1 ** - . . . - - - - * - -
it_b2 ** - . . . - - - - - * -
it_b3 ? ? . . . - - - - - - *
pc of common variance of "good" scaleitems a and b
pc_a ? ? - - - - - - - - - -
pc_b 1 - - - - - - - - - - -
and the two new latent variables are representatives
of oblique factors, approximating those which would
be found by an oblique rotation.
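How strongly the two latent variables correlate can be read from the correlation matrix alone. The following Python sketch, assuming NumPy, takes the first principal component of each item subset and computes their correlation as w_a' R_ab w_b / sqrt(lambda_a * lambda_b). As a simplification it works on the full item correlations rather than on the common variance only, and the simulated data and function names are hypothetical.

```python
import numpy as np

def first_pc(R_sub):
    """Weights and eigenvalue of the first principal component of R_sub."""
    vals, vecs = np.linalg.eigh(R_sub)   # ascending eigenvalues
    w = vecs[:, -1]
    if w.sum() < 0:                      # fix the arbitrary sign of the eigenvector
        w = -w
    return w, vals[-1]

def latent_correlation(R, idx_a, idx_b):
    """Correlation between the first PCs of two item subsets, from R alone."""
    w_a, lam_a = first_pc(R[np.ix_(idx_a, idx_a)])
    w_b, lam_b = first_pc(R[np.ix_(idx_b, idx_b)])
    return w_a @ R[np.ix_(idx_a, idx_b)] @ w_b / np.sqrt(lam_a * lam_b)

# Hypothetical data for the "good" items it_a1, it_a3, it_b1, it_b2.
rng = np.random.default_rng(0)
f = rng.multivariate_normal([0, 0], [[1.0, 0.4], [0.4, 1.0]], size=500)
pattern = np.array([[.8, 0], [.7, 0], [0, .8], [0, .7]])
X = f @ pattern.T + 0.6 * rng.standard_normal((500, 4))
R = np.corrcoef(X, rowvar=False)

idx_a, idx_b = [0, 1], [2, 3]
r_ab = latent_correlation(R, idx_a, idx_b)
print(f"correlation between pc_a and pc_b: {r_ab:.3f}")
```

The same number results from computing both component scores case by case and correlating them, so no scores need to be stored for this step.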
Rotating to get the factor "age" into the
first column, for instance, to find the
correlations of the found factors with "age"
(remember, an item-specific error in "age" was
also excluded from the beginning), like
common itemspecific
factors factors/variances
1 2 3 4 5 6 7 8 9 10 11
------------------------------------
co-variate items
age 1 - - - - - - - - - -
scaleitems part a ------------------
it_a1 ? ? . . . ? * - - - - -
it_a2 ? ? . . . ? - * - - - -
it_a3 ? ? . . . ? - - * - - -
scaleitems part b ------------------
it_b1 ** - . . . ? - - - * - -
it_b2 ** - . . . ? - - - - * -
it_b3 ? ? . . . ? - - - - - *
pc of common variance of "good" scaleitems a and b
pc_a x * - - - ? - - - - - -
pc_b y * . - - ? - - - - - -
then gives the correlations of age with the latent
variables pc_a, pc_b (which represent oblique
principal factors of it_a1, it_a3 and of
it_b1, it_b2 respectively) as the values
x and y in the above table. This may be conceptually
superior to any factor solution of a canned procedure:
- the assumption of an item-specific variance even in
the covariates can be respected in defining the
factors
- the oblique factors found this way have an intuitive
definition as principal components of the (interesting)
common variance of the separate subsets of items,
and are treated as "latent" variables, completely
analogously to any other item in this configuration.
- the correlations with the covariates need no
intermediate estimation of factor scores (in fact, they
behave as if the scores had been estimated by
regression)
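The last rotation step can be sketched as follows in Python, assuming NumPy. Every variable is a row vector in a common space whose inner products reproduce the correlations, the latent variables are appended as extra rows, and an orthogonal transformation puts "age" on the first axis so that x and y appear in the first column. The correlation values and function names are hypothetical. As a simplification the sketch factors the full correlation matrix and does not split off item-specific variances as the tables above do.

```python
import numpy as np

def coordinates(R):
    """Rows are variables as vectors in a common space, so that A @ A.T == R."""
    vals, vecs = np.linalg.eigh(R)
    return vecs * np.sqrt(np.clip(vals, 0.0, None))

def first_pc_row(A, idx):
    """Unit-length row vector of the first principal component of rows idx."""
    sub = A[idx]
    vals, vecs = np.linalg.eigh(sub @ sub.T)   # correlations among those items
    w = vecs[:, -1]
    if w.sum() < 0:                            # fix the arbitrary eigenvector sign
        w = -w
    return (w @ sub) / np.sqrt(vals[-1])

def rotate_to_first_axis(A, row):
    """Orthogonal transformation (Householder reflection) mapping A[row] to axis 1."""
    a = A[row] / np.linalg.norm(A[row])
    e1 = np.zeros_like(a)
    e1[0] = 1.0
    v = a - e1
    if v @ v < 1e-24:                          # already on the first axis
        return A.copy()
    H = np.eye(a.size) - 2.0 * np.outer(v, v) / (v @ v)
    return A @ H

# Hypothetical correlations among age, it_a1, it_a3, it_b1, it_b2.
F = np.array([[.3, .2], [.8, .1], [.7, .0], [.1, .8], [.2, .7]])
R = F @ F.T
np.fill_diagonal(R, 1.0)

A = coordinates(R)
A = np.vstack([A, first_pc_row(A, [1, 2]), first_pc_row(A, [3, 4])])  # pc_a, pc_b
rotated = rotate_to_first_axis(A, row=0)       # age becomes (1, 0, 0, ...)

x, y = rotated[5, 0], rotated[6, 0]            # correlations of age with pc_a, pc_b
print(f"x = {x:.3f}, y = {y:.3f}")
```

Since the transformation is orthogonal, all inner products between rows are unchanged, so the first column simply lists each variable's correlation with age.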
If, after this monster post, you are still reading... :-)
and are interested in this type of factor exploration,
you may email me and get the program from my server.
Gottfried Helms