Proper Route of Analysis?

Discussion:


DemPyro

2019-07-10 19:07:08 UTC

I'm trying to test whether Outcome 1 depends on the variables Operation, Sex and Age (all categorical). The data are aggregated: the frequency of each combination is listed in a separate column, so the file looks like the following.

Column 1: Outcome 1 is yes or no (coded 1 for yes, 0 for no)
Column 2: Operation is yes or no (coded 1 for yes, 0 for no)
Column 3: Sex is male or female (coded 1 for male, 0 for female)
Column 4: Age is 4 buckets of 10 years each (0 for youngest, 1 for next, 2 for next, 3 for oldest age bucket).
Column 5: Number of people with that combination of the four values above.

I'm looking to calculate adjusted odds ratios and 95% confidence intervals. In SPSS Statistics 25, would I do the following?

Analyze > Regression > Binary Logistic
Set-Up:

Dependent: Outcome 1
Covariates: Operation (cat, reference category first), Sex (cat, reference category first), Age (cat, reference category first)
Selecting CI for exp(B): 95%

Would the results of interest be: Block 0 Variables not in the Equation (for p-values to tell which adjusted covariate is significantly impacting the dependent variable), and Block 1 Variables in the Equation Exp(B) and the numbers beside it for 95% CI?

Bruce Weaver

2019-07-10 20:14:46 UTC


Look up the WEIGHT command.

https://www.ibm.com/support/knowledgecenter/en/SSLVMB_26.0.0/statistics_reference_project_ddita/spss/base/syn_weight.html

Here is an example.

* Generate some fake data to show the file structure.
* NOTE that the N-variable will hold randomly generated values,
* and that you will have to replace them with the correct values.
SET RNG=MT MTINDEX=-12345.
NEW FILE.
DATASET CLOSE ALL.
INPUT PROGRAM.
NUMERIC Y Op Male AgeGrp N (F5.0).
LEAVE Y Op Male.
LOOP Y = 0 TO 1.
LOOP Op = 0 TO 1.
LOOP Male = 0 TO 1.
LOOP AgeGrp = 0 TO 3.
COMPUTE N = TRUNC(RV.UNIFORM(25,100)).
END CASE.
END LOOP.
END LOOP.
END LOOP.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXECUTE.

* NOTE: This shows the structure of the needed data file,
* but the N variable holds a randomly generated number.
* You'll need to replace those values with the correct cell counts.

WEIGHT BY N. /* NOTICE the WEIGHT command here.

CROSSTABS Op by Y by Male by AgeGrp.

LOGISTIC REGRESSION VARIABLES Y
  /METHOD=ENTER Op Male AgeGrp
  /CONTRAST (Op)=Indicator(1)
  /CONTRAST (Male)=Indicator(1)
  /CONTRAST (AgeGrp)=Indicator(1)
  /PRINT=CI(95)
  /CRITERIA=PIN(0.05) POUT(0.10) ITERATE(20) CUT(0.5).

* Notice that the Case Processing Summary table for LOGISTIC REGRESSION
* shows the unweighted cases. The Classification Table further down in the
* output will show the total number of cases used in the analysis.

* Alternatively, use GENLIN with binomial error distribution and logit link to estimate the model.
* The case processing summary from GENLIN shows both weighted and unweighted N.
* Generalized Linear Models.
GENLIN Y (REFERENCE=FIRST) BY Op Male AgeGrp (ORDER=DESCENDING)
  /MODEL Op Male AgeGrp INTERCEPT=YES DISTRIBUTION=BINOMIAL LINK=LOGIT
  /EMMEANS TABLES=Op SCALE=TRANSFORMED
  /EMMEANS TABLES=Male SCALE=TRANSFORMED
  /EMMEANS TABLES=AgeGrp SCALE=TRANSFORMED
  /MISSING CLASSMISSING=EXCLUDE
  /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION (EXPONENTIATED).

DemPyro

2019-11-04 04:40:43 UTC


Hi Bruce,

Thanks for the explanation. How should the regression change to take into account that the age buckets are ordinal? Or would I simply treat them as nominal? Is there another type of regression that allows a dichotomous dependent variable with both nominal and ordinal independent variables?

Thanks

Bruce Weaver

2019-11-04 12:30:12 UTC


You could use polynomial contrasts to examine your age category variable. That will be easier, I think, if you estimate your model via UNIANOVA rather than REGRESSION. In the former, treat categorical variables as "fixed factors" and quantitative variables as "covariates". IIRC, UNIANOVA has a built-in polynomial contrast option.
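
For a binary outcome, one way to stay within the logistic model is to request the polynomial contrast from LOGISTIC REGRESSION itself, since its CONTRAST subcommand accepts POLYNOMIAL as well as INDICATOR. The following is only a sketch under assumptions: it reuses the variable names from the fake-data example earlier in the thread (Y, Op, Male, AgeGrp, N) and has not been run against the real cell counts.

```spss
* Sketch only: polynomial contrast for the 4-level AgeGrp factor.
* Assumes the aggregated file from the earlier example, one row per cell.
WEIGHT BY N.

LOGISTIC REGRESSION VARIABLES Y
  /METHOD=ENTER Op Male AgeGrp
  /CONTRAST (Op)=Indicator(1)
  /CONTRAST (Male)=Indicator(1)
  /CONTRAST (AgeGrp)=Polynomial
  /PRINT=CI(95)
  /CRITERIA=PIN(0.05) POUT(0.10) ITERATE(20) CUT(0.5).
```

With four age buckets this yields three AgeGrp terms (linear, quadratic, cubic). Their Exp(B) values are not odds ratios against a reference bucket, so the Wald tests are the part to read for trend. If only a linear trend in the log-odds is of interest, a simpler alternative is to leave AgeGrp off the CONTRAST subcommand altogether, so that it enters as a quantitative covariate with a single odds ratio per one-bucket increase.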

HTH.
