Post by d***@gmail.com
Post by Rich Ulrich
Post by d***@gmail.com
hi All
I have a data set where people were asked to rate the extent to which 10 words accurately described a sound (on a 5-point scale).
That is, they listen to the sound and then rate how well each word describes the sound.
I want to get a sense of whether some words were rated as more applicable than others. If possible, I'm more interested in comparing the words that people rated as most applicable to the sound than I am in the average rating for each word. The first step I took was to convert the ratings to indicate which word was rated highest by each person. The trick is that a person might have given more than one word an equally high score. To my knowledge, this rules out a chi-square goodness-of-fit test, because I don't have independence of observations (a person might have multiple words tied as highest rated for a given sound). Is there an alternative test for that type of situation?
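[A minimal sketch of the "highest-rated word per person" step described above, with made-up ratings, showing how ties arise and why they break one-observation-per-person counting:]

```python
# Hypothetical example: find each rater's top-rated descriptor(s) for one
# sound, keeping ties (a rater may have several words at the shared maximum).
ratings = {
    "rater1": {"harsh": 5, "melodic": 2, "pulsing": 5},
    "rater2": {"harsh": 3, "melodic": 4, "pulsing": 1},
}

top_words = {}
for rater, words in ratings.items():
    best = max(words.values())
    # All words tied at the maximum are kept; a rater can therefore
    # contribute more than one count, which is what violates the
    # independence assumption of a chi-square goodness-of-fit test.
    top_words[rater] = sorted(w for w, r in words.items() if r == best)

print(top_words)  # rater1 ties: ['harsh', 'pulsing']; rater2: ['melodic']
```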
ALTERNATIVELY
if I do just look at the mean ratings for each word, is it appropriate to use a one-way repeated-measures ANOVA? That is, can I treat the words as a within-subjects factor?
Happy to provide more info. Thanks for any advice
Overall - Do you have hypotheses? What do you hope to
be able to say, in the end?
From the phrases, "each sound" and "more than one sound",
I conclude that there are multiple sounds. That was not
obvious in the first paragraph. "They repeat these ratings for
(X-number) of sounds." So - How many sounds are there,
and how many raters?
Describe what effects you want/expect to show, and figure out
the testing from that starting point.
Also - With a large number of people, it might also be true that
there are different "styles" of responding that might be discernible.
That is, if there are positive/negative reactions to one sound, those
reactions might vary across raters. That's a further complication
for displaying results.
I'm also asking myself if there is a way to use factor analysis to
reduce "Words" as variables to a smaller number of "factor scores".
That could be done legitimately if there are a large number of
sounds - so you would factor-analyze the averages across raters.
If there are no more than a couple of dozen sounds, an exploratory
factoring of all scores is possible, and might be suggestive; but it
is not an analysis that journals want to report on. What's useful
about factoring depends on what your hypotheses are.
--
Rich Ulrich
thanks both for these replies - much appreciated.
But you still have not described what you want to show, to SAY.
Post by d***@gmail.com
@Rich - to your questions,
The overall research question is whether people tend to agree that particular words better describe specific sounds (in this instance, in case you are curious, they are bird sounds) :-)
"Sound A is characterized by squeaky and high; Sound B is pulsing and
melodic; .... " - and show the mean scores for each type of sound,
for each adjective.
Post by d***@gmail.com
So, I want to be able to show that some words are chosen as most applicable at a greater-than-chance level (hence my first thought was the chi-square analysis, but I didn't account for the issue with independence of observations, i.e., people being able to indicate two words as equally most applicable).
First, "greater than chance" seems like a criterion that is barely
interesting to anyone but you. If they aren't different, you
presumably could have chosen different adjectives, sounds, or
test conditions. An impressive difference among the display of
means should serve the purpose of showing differences (or, not).
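[The suggested display of means can be sketched as follows; the array shapes match the study (10 sounds, 12 descriptors), but the data here are random placeholders:]

```python
import numpy as np

# Hypothetical data: ratings[rater, sound, descriptor] on the 1-5 scale.
rng = np.random.default_rng(0)
n_raters, n_sounds, n_desc = 30, 10, 12
ratings = rng.integers(1, 6, size=(n_raters, n_sounds, n_desc))

# Mean rating for each sound x descriptor cell, averaged over raters;
# this is the table of means to display for each sound and adjective.
means = ratings.mean(axis=0)          # shape (10, 12)

# Descriptor with the highest mean for each sound (argmax returns the
# first index in case of exact ties).
best_desc = means.argmax(axis=1)      # one index per sound
```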
Moving from "what to show/say" to "what to test" -
If you show the means, you can do a paired t-test between two
descriptors for a sound, or a repeated-measures ANOVA for one sound,
across the set of 12 descriptors. That would show "this sound" has
particular descriptors.
You can also do similar paired and RM analyses between two sounds
for a descriptor. That would show "this descriptor" is used more
often for particular sounds.
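[The two tests above can be sketched like this, with random placeholder data for one sound (30 raters, 12 descriptors). SciPy provides the paired t-test directly; it has no repeated-measures ANOVA, so the F statistic is computed by hand from the usual sums of squares (statsmodels' `AnovaRM` would be an alternative):]

```python
import numpy as np
from scipy import stats

# Hypothetical data: 30 raters x 12 descriptors for ONE sound (1-5 scale).
rng = np.random.default_rng(1)
x = rng.integers(1, 6, size=(30, 12)).astype(float)

# Paired t-test between two descriptors for this sound
# (same raters rate both, so the samples are paired).
t, p = stats.ttest_rel(x[:, 0], x[:, 1])

# One-way repeated-measures ANOVA across all 12 descriptors.
n, k = x.shape
gm = x.mean()
ss_cond = n * ((x.mean(axis=0) - gm) ** 2).sum()    # between descriptors
ss_subj = k * ((x.mean(axis=1) - gm) ** 2).sum()    # between raters
ss_err = ((x - gm) ** 2).sum() - ss_cond - ss_subj  # residual
F = (ss_cond / (k - 1)) / (ss_err / ((n - 1) * (k - 1)))
p_anova = stats.f.sf(F, k - 1, (n - 1) * (k - 1))
```

Swapping the roles of sounds and descriptors (12 raters' scores for one descriptor across 10 sounds) gives the second family of tests the same way.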
Second, "chi-squared analysis" typically means a contingency
table, which wastes the magnitude-data of the scores; and
vastly undercuts testing by increasing the d.f.
Third, the question of "independence" has to do with the fact
that your ratings of different sounds are the same people;
your /testing/, if you were testing, would be weaker if you
didn't take that into account. You have dependency whether
or not there are ties. "Ties" make certain "exact", non-parametric
tests awkward, but you should not be doing those, anyway.
Post by d***@gmail.com
To your other point, we ask them to rate 10 bird sounds in total. They are asked to rate the extent (1-5) to which 12 descriptors apply to each bird sound. The same 12 descriptors each time. I had not been interested in styles of responding and would be happy to ignore that for the sake of answering the first question. That said, if the appropriate analysis answered both questions, that might prove interesting.
I don't think I want to factor-analyse the descriptors - it's the applicability of the individual words that we are focused on.
Thanks again - keen to hear your further thoughts...
--
Rich Ulrich