Dr. Weaver asks "why", and so do I (although it seems he has had his
coffee and put the question in a nicer way).
< on soapbox >
In the policy arena, statisticians and methodologists preface "median
split" with an adjective such as "dreaded", "infernal", or "invidious".
As one psychologist wag put it, "friends don't let friends do median splits".
Once upon a time there were two kinds of analysts/researchers: good,
moral, praiseworthy ones who did ANOVA, and bad, immoral, disreputable
ones who did correlation/regression. Then, thanks to Cohen and others,
it became widely known that ANOVA is just a special case of
correlation/regression (the general linear model, GLM).
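To make the "special case" point concrete, here is a minimal sketch in
Python (standard library only). The scores are made up, and `ols_fit`
is a hypothetical helper written for this illustration. It regresses
the outcome on a 0/1 group dummy: the slope is the difference between
the group means, which is exactly the contrast a two-group ANOVA tests.

```python
from statistics import mean

def ols_fit(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

# Made-up scores for two groups (illustrative only).
control = [4.0, 5.0, 6.0, 5.0]
treated = [7.0, 8.0, 6.0, 9.0]

# Stack the groups and code membership as a 0/1 dummy variable.
group = [0] * len(control) + [1] * len(treated)
score = control + treated

intercept, slope = ols_fit(group, score)
mean_diff = mean(treated) - mean(control)

# The intercept is the mean of the group coded 0, and the slope is the
# difference between the two group means.
print(intercept, slope, mean_diff)
```

With more than two groups the same idea holds with several dummy
variables, one per group minus a reference group.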
People used median splits (and some still do) to force-fit their
analysis into an ANOVA rhetorical model/schema.
"Please pass the hammer, I want to put this nail with the twisty thing
around it into this board."
< off soapbox >
Art
Post by Bruce Weaver
Post by tohenning
Hey everybody,
I have some problems with SPSS syntax again.
In principle, my problem is simple: I want to do a median split on a
variable and obtain a new variable which equals 0 if the case is below
the median and equals 1 if the case is above the median.
I have tried a lot, but can't solve this problem.
Maybe someone can help me; that would be great.
thanks in advance,
thomas
You can use AGGREGATE to write the median you want into the working data
file, and then compute a flag for below the median versus equal to or
greater than the median. Here's an example of how to get a median split
on the horsepower variable (horse) from the cars data file that ships
with SPSS. (This code requires a version of SPSS that allows AGGREGATE
to write the aggregated variable straight into the working data file.
For older versions, you'd have to write the aggregated variable out to
another file, then use MATCH FILES to bring it into the working data file.)
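* ---------------------------------------- .
GET FILE='C:\Program Files\SPSS\Cars.sav'.
* Constant break variable so that every case falls in one group .
compute allrecs = 1.
* Append the overall median of horse to every case .
aggregate
  /outfile = * mode = addvariables
  /break = allrecs
  /medhorse = median(horse)
.
* Flag is 1 if horse is at or above the median, 0 if below .
compute GEmedian = (horse GE medhorse).
* Check the split by summarizing horse within each group .
MEANS
  horse BY GEmedian
  /CELLS MEAN COUNT STDDEV MIN MAX MEDIAN
.
* ---------------------------------------- .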
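One thing to watch with a "greater than or equal to" flag is ties:
every case sitting exactly on the median lands in group 1, so the two
groups need not be the same size. The sketch below shows the same
logic in Python (standard library only) on made-up horsepower values.
`median_split` is a hypothetical helper for illustration, not part of
SPSS, and it does not handle missing values.

```python
from statistics import median

def median_split(values):
    """Return 0/1 flags: 1 where the value is >= the median, else 0."""
    cut = median(values)
    return [int(v >= cut) for v in values]

# Made-up values with several cases tied at the median (90).
horsepower = [70, 90, 90, 90, 110, 150]
flags = median_split(horsepower)

# Count how many cases fall in each group.
n_high = sum(flags)
n_low = len(flags) - n_high
print(flags, n_low, n_high)
```

Here only one case falls below the median and five fall at or above
it, so it is worth checking the group counts (as the MEANS output
does) before using the flag.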
Why do you want the median split, by the way? I ask because people
often use the resulting dichotomous variable as a predictor variable in
a regression model when they would be far better off using the original
continuous variable.