Erin Holloway
2020-07-30 04:30:24 UTC
My data looks a bit like this:
sp_counts_1
1|2|4
1
4
5|7
8|9|2|4
I just need the first number and want to delete the rest. I have been running the syntax below to extract the number before the vertical bar '|'. It works, but it crashes my computer because the dataset is large (500,000 cases).
Because there are so many variables (thousands, though split across files), I would prefer to overwrite the existing variables rather than create new ones.
It gets stuck on the 'LIST' command and never finishes.
Any ideas on how I could simplify the syntax so it's happier to run?
Thanks in advance
STRING new1 TO new10 (A10).
DO REPEAT old = sp_counts_1 TO sp_counts_10 / new = new1 TO new10.
COMPUTE #L = CHAR.INDEX(old,"|") - 1.
IF #L EQ -1 #L = LENGTH(RTRIM(old)).
COMPUTE new = CHAR.SUBSTR(old,1,#L).
END REPEAT.
LIST.
DELETE VARIABLES sp_counts_1 TO sp_counts_10.
RENAME VARIABLES (new1 TO new10 = sp_counts_1 TO sp_counts_10).
EXECUTE.
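For comparison, here is a rough sketch of the in-place version I am aiming for. It is untested on the full data and rests on two assumptions: that sp_counts_1 TO sp_counts_10 are already string variables, and that CHAR.INDEX returns 0 when no '|' is present. It overwrites each variable directly and leaves out LIST, since LIST prints every case to the output window. The names v and #pos are placeholders of my own.

```spss
* Sketch only: truncate each value at the first "|" in place, with no new variables.
* Assumes sp_counts_1 TO sp_counts_10 are existing string variables.
DO REPEAT v = sp_counts_1 TO sp_counts_10.
COMPUTE #pos = CHAR.INDEX(v, "|").
IF (#pos GT 0) v = CHAR.SUBSTR(v, 1, #pos - 1).
END REPEAT.
EXECUTE.
```

Values without a '|' (such as 1 or 4 above) are left unchanged, so the separate length check would no longer be needed.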