How best to code ranked data in SPSS


Fergal O'Hanlon

2019-07-26 09:42:26 UTC

Hi,

My SurveyMonkey questionnaire asked each user to choose their top three items. It uses 3 drop-downs: the first drop-down is the first preference, the second is the second preference and the third is the third preference. Each drop-down contains a list of 25 choices.

At the moment in SPSS these are imported as 3 variables, so I can run descriptives, frequencies etc. on these three variables.

For example, this is what the 3 variables look like at the minute. Each number corresponds to one of the 25 items in each drop-down. var1 is the first preference, var2 is the second preference and var3 is the third preference.

Respondent  var1  var2  var3
1.             2    13    22
2.             6     7    12
3.             1     8     2
4.             3    11     4
5.             6     4    23
6.             2
7.             1    22
8.             6    11     3

Each respondent could have a number from 1 to 25 in any of the three variables. A small few (9 out of over 500 responses) who filled in a paper version did not answer this question fully, so for a few participants these variables are empty, some entered only a first preference, and others only a first and second preference.

I'm not sure if this is the right way to store these variables. For analysis of the data, would it be better to create 25 variables instead (one for each of the 25 items in the drop-downs) and enter 1, 2 or 3 in the relevant variable for each participant, according to the preference they gave that item? Would I leave them empty if not chosen?

So, taking the first row above as an example, new var2 would have a 1, new var13 a 2 and new var22 a 3.
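
The restructuring described here can be sketched in a few lines. This is only an illustration in Python, not SPSS syntax: the names `responses`, `to_item_ranks` and `wide` are made up for the example, `None` stands in for a system-missing value, and respondent 7's 22 is assumed to be a second preference.

```python
# Hypothetical sketch: turn three "preference" columns into 25 item columns.
N_ITEMS = 25

# (var1, var2, var3) per respondent, copied from the table above.
# None marks a drop-down that was left unanswered (assumed for rows 6 and 7).
responses = [
    (2, 13, 22),
    (6, 7, 12),
    (1, 8, 2),
    (3, 11, 4),
    (6, 4, 23),
    (2, None, None),
    (1, 22, None),
    (6, 11, 3),
]

def to_item_ranks(prefs, n_items=N_ITEMS):
    """Return n_items values: the rank (1, 2 or 3) given to each item, or None."""
    ranks = [None] * n_items
    for rank, item in enumerate(prefs, start=1):
        if item is not None:
            ranks[item - 1] = rank  # items are numbered from 1
    return ranks

wide = [to_item_ranks(p) for p in responses]

first = wide[0]
print(first[1], first[12], first[21])  # new var2, var13, var22 -> 1 2 3
```

Items that were not chosen stay empty (`None`) here, which matches the "leave them empty" option. Whether empty or 0 is the better code depends on the analysis planned.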

Is this a better approach for further analysis of the data?

Thanks for your help.

Rich Ulrich

2019-07-26 21:07:16 UTC

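On Fri, 26 Jul 2019 02:42:26 -0700 (PDT), "Fergal O'Hanlon" wrote:

Post by Fergal O'Hanlon
At the moment in spss these are imported as 3 variables. So I can run descriptives, frequencies etc on these three variables.

Or, better, you can run Mult-Response (the MULT RESPONSE procedure).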
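
To show the kind of table a multiple-response tally gives, here is a rough Python sketch (not SPSS output). It pools the three preference variables and counts how often each item is mentioned, as a share of all responses and of all cases. The data are the eight example rows from the original post, and the variable names are invented for the illustration.

```python
from collections import Counter

# (var1, var2, var3) per respondent; None marks an unanswered drop-down.
responses = [
    (2, 13, 22), (6, 7, 12), (1, 8, 2), (3, 11, 4),
    (6, 4, 23), (2, None, None), (1, 22, None), (6, 11, 3),
]

# Pool all three variables: one count per item, whatever the rank.
mentions = Counter(
    item for prefs in responses for item in prefs if item is not None
)

# Cases = respondents who answered at least one drop-down.
n_cases = sum(1 for prefs in responses if any(v is not None for v in prefs))
n_responses = sum(mentions.values())

for item, count in sorted(mentions.items()):
    print(
        f"item {item:2d}: {count} mentions, "
        f"{100 * count / n_responses:.1f}% of responses, "
        f"{100 * count / n_cases:.1f}% of cases"
    )
```

Percent of cases can add up to more than 100 because each respondent may mention up to three items.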


And what do you hope to learn from further analysis of data?

Without your asking, I will offer the opinion that this style
("show your top three") is a crappy way of collecting data
unless you only have questions that particularly fit. Or, it
is okay for a /start/ if you go ahead and ask (for instance)
for a separate scoring of each of the 25: "Never heard of it",
"no opinion", hate, don't like, okay, like, love.

Rating of 1+2+3 is "Mentioned at all" - is that a useful concept?

How many of the 25 will have at least 10 mentions? Should
the rest be dropped from further consideration at all, or else
grouped as "other"?
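
A small Python sketch of that screening step, under stated assumptions: `items_to_keep` and `collapse` are hypothetical helper names, the data are the eight example rows from the original post, and the cutoff is lowered to 3 only because the toy data are too small for a cutoff of 10.

```python
from collections import Counter

# (var1, var2, var3) per respondent; None marks an unanswered drop-down.
responses = [
    (2, 13, 22), (6, 7, 12), (1, 8, 2), (3, 11, 4),
    (6, 4, 23), (2, None, None), (1, 22, None), (6, 11, 3),
]

def items_to_keep(rows, min_mentions):
    """Items mentioned (at any rank) at least min_mentions times."""
    counts = Counter(v for prefs in rows for v in prefs if v is not None)
    return {item for item, n in counts.items() if n >= min_mentions}

def collapse(prefs, keep):
    """Recode rarely mentioned items as 'other'; leave missing values alone."""
    return tuple(
        None if v is None else (v if v in keep else "other") for v in prefs
    )

keep = items_to_keep(responses, min_mentions=3)  # use 10 on the real data
recoded = [collapse(p, keep) for p in responses]

print(sorted(keep))
print(recoded[0])
```

With the full data set, `min_mentions=10` would give the "at least 10 mentions" cut suggested above.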

Going to 25 variables is the obvious step if you want to look
at a correlation matrix (say) across mentions.
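
As an illustration of what "correlation across mentions" could look like, the Python sketch below builds 0/1 "mentioned at all" indicators for two items and correlates them. It uses the eight example rows from the original post, treats an unanswered drop-down as "not mentioned" (an assumption), and `mentioned` and `pearson` are hypothetical helper names.

```python
from math import sqrt

# (var1, var2, var3) per respondent; None marks an unanswered drop-down.
responses = [
    (2, 13, 22), (6, 7, 12), (1, 8, 2), (3, 11, 4),
    (6, 4, 23), (2, None, None), (1, 22, None), (6, 11, 3),
]

def mentioned(item, prefs):
    """1 if the item appears at any rank for this respondent, else 0."""
    return 1 if item in prefs else 0

def pearson(xs, ys):
    """Pearson correlation; None if either variable has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)

item2 = [mentioned(2, p) for p in responses]
item6 = [mentioned(6, p) for p in responses]

r = pearson(item2, item6)
print(round(r, 3))  # -0.6 for these eight rows
```

Repeating this for every pair of the 25 indicators gives the full matrix. Items that nobody mentions have no variance, so their correlations are undefined, which is another reason to drop or group rare items first.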

Hope this helps.

--
Rich Ulrich

Fergal O'Hanlon

2019-07-26 23:37:40 UTC

Thx Rich, much appreciated.
