Discussion:
RTRIM - Simplifying syntax so my computer doesn't crash
Erin Holloway
2020-07-30 04:30:24 UTC
My data looks a bit like this:
sp_counts_1
1|2|4
1
4
5|7
8|9|2|4

I just need the first number and want to delete the rest. I have been running this syntax to extract the number before the vertical bar '|'. Whilst it works, it is crashing my computer because of the large dataset (500,000 cases).

Due to the high number of variables, I prefer to replace the existing variables (1000s, but split into files) rather than make new ones.

It is getting stuck on the 'list' and never makes it through.

Any ideas on how I could simplify the syntax so it's happier to run?
Thanks in advance

STRING new1 to new10 (A10).
DO REPEAT old = sp_counts_1 TO sp_counts_10 / new = new1 TO new10.
COMPUTE #L = CHAR.INDEX(old,"|") - 1.
IF #L EQ -1 #L = LENGTH(RTRIM(old)).
COMPUTE new = CHAR.SUBSTR(old,1,#L).
END REPEAT.
LIST.
Delete variables sp_counts_1 TO sp_counts_10.
RENAME VARIABLES (new1 to new10 = sp_counts_1 TO sp_counts_10).
Execute.
Rich Ulrich
2020-07-30 07:07:11 UTC
On Wed, 29 Jul 2020 21:30:24 -0700 (PDT), Erin Holloway
Post by Erin Holloway
sp_counts_1
1|2|4
1
4
5|7
8|9|2|4
I just need the first number and want to delete the rest. I have been running this syntax to extract the number before the vertical bar '|'. Whilst it works, it is crashing my computer because of the large dataset (500,000 cases).
Are you saying that yes, it does work when there
are few lines, or when you specify LIST CASES TO 100?
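
A small-sample check along those lines could look like the sketch below. It assumes the DO REPEAT block from the original post has already been run, so that new1 exists; the choice of one old/new pair and of 100 cases is only illustrative.

```spss
* List only the first 100 cases of one old/new pair instead of the whole file.
LIST VARIABLES = sp_counts_1 new1
  /CASES = FROM 1 TO 100.
```

If this returns promptly while the unrestricted LIST does not, the amount of listed output is the more likely culprit rather than the string functions.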

The first legitimate failure I can think of is running out of disk space.
That seems abnormal and wrong. 500,000 cases is no longer
too huge to process.

What are the symptoms of your crashes?
Post by Erin Holloway
Due to the high number of variables, I prefer to replace the existing variables (1000s, but split into files) rather than make new ones.
Thousands of variables to list on one line? That might cause
some upset if you are writing a HUGE format across. The
default used to be to WRAP, which I learned to avoid.
Post by Erin Holloway
It is getting stuck on the 'list' and never makes it through.
Any ideas on how I could simplify the syntax so it's happier to run?
Thanks in advance
STRING new1 to new10 (A10).
DO REPEAT old = sp_counts_1 TO sp_counts_10 / new = new1 TO new10.
COMPUTE #L = CHAR.INDEX(old,"|") - 1.
IF #L EQ -1 #L = LENGTH(RTRIM(old)).
COMPUTE new = CHAR.SUBSTR(old,1,#L).
END REPEAT.
LIST.
Delete variables sp_counts_1 TO sp_counts_10.
RENAME VARIABLES (new1 to new10 = sp_counts_1 TO sp_counts_10).
Execute.
I wonder why you are showing us the Delete vars and Rename.
Once upon a time, LIST was not a procedure; it set a switch to LIST
after a procedure caused cases to be read. If your SPSS is that old,
then just running the syntax down through LIST will do nothing --
SPSS will sit there waiting for a procedure or EXE.

If your SPSS is that old, I don't know what happens when variables
are renamed between LIST and EXE, since LIST in that case would not
be "performed" in the order written in syntax.
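
In a current SPSS, where LIST is a procedure, one way to avoid printing every case is to let EXECUTE force the data pass instead. The sketch below is the original poster's syntax with only that swap made; whether the listing output is what actually caused the hang is an assumption, not something the thread confirms.

```spss
STRING new1 TO new10 (A10).
DO REPEAT old = sp_counts_1 TO sp_counts_10 / new = new1 TO new10.
COMPUTE #L = CHAR.INDEX(old,"|") - 1.
IF (#L EQ -1) #L = LENGTH(RTRIM(old)).
COMPUTE new = CHAR.SUBSTR(old,1,#L).
END REPEAT.
* EXECUTE runs the pending transformations without writing cases to the Viewer.
EXECUTE.
DELETE VARIABLES sp_counts_1 TO sp_counts_10.
RENAME VARIABLES (new1 TO new10 = sp_counts_1 TO sp_counts_10).
```

The EXECUTE has to come before DELETE VARIABLES, because that command cannot run while transformations are still pending.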

For a newer SPSS, please describe the symptoms of "crash".


I find the variation of the syntax below easier to read.
* append | to the value so that one is always found.
STRING #temp(A11) , new1 to new10(A10).
DO REPEAT old = sp_counts_1 TO sp_counts_10 / new = new1 TO new10.
COMPUTE #temp= CONCAT( RTRIM(old), '|' ) .
COMPUTE #L = CHAR.INDEX(#temp,"|") - 1.
COMPUTE new = CHAR.SUBSTR(#temp,1,#L).
END REPEAT.

COMMENT + will concatenate in file names, maybe not in COMPUTE.
COMMENT If not, use the concatenation function to combine.
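
Because the goal is to replace the existing variables, the same append-a-bar idea might also be applied in place, which would avoid the new variables and the later delete/rename step. This is a sketch under that assumption and not something posted in the thread; the stand-in name v is arbitrary, and the original values are overwritten, so it is best tried on a copy of the file first.

```spss
* Overwrite each string variable with the text before its first "|".
DO REPEAT v = sp_counts_1 TO sp_counts_10.
* Appending "|" guarantees that CHAR.INDEX always finds one.
COMPUTE #L = CHAR.INDEX(CONCAT(RTRIM(v),"|"),"|") - 1.
COMPUTE v = CHAR.SUBSTR(v,1,#L).
END REPEAT.
EXECUTE.
```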
--
Rich Ulrich
Erin Holloway
2020-08-20 05:17:58 UTC
Sorry for my late response, I have had some time off.
Thanks for that, Rich; that works perfectly.

With regard to your questions: the 1000s of variables are split into files, so that won't have an impact, but I just don't want to rename them all. That is also why I included the delete/rename syntax.

Bruce, it can be numeric; there is no need for a string for the new variable.

I have another one you guys might be able to help with, but I'll post a new thread.

Erin
Bruce Weaver
2020-07-30 13:12:22 UTC
Hi Erin. Why would you not make your new variable numeric rather than string? E.g.,

NEW FILE.
DATASET CLOSE ALL.
DATA LIST LIST / sp_counts_1 sp_counts_2 (2A10).
BEGIN DATA
"1|2|4" "2|3|5"
"1" "2"
"4" "5"
"5|7" "6|8"
"8|9|2|4" "9|0|3|5"
"10|11|12" "13"
END DATA.
LIST.

RENAME VARIABLES (sp_counts_1 TO sp_counts_2 = old1 to old2).
DO REPEAT old = old1 TO old2 / new = sp_counts_1 TO sp_counts_2.
COMPUTE #L = CHAR.INDEX(old,"|") - 1.
IF #L EQ -1 #L = LENGTH(RTRIM(old)).
COMPUTE new = NUMBER(CHAR.SUBSTR(old,1,#L),F5.0).
END REPEAT.
FORMATS sp_counts_1 TO sp_counts_2 (F5.0).
LIST.
DELETE VARIABLES old1 to old2.

Output from the final LIST command:

old1       old2       sp_counts_1 sp_counts_2

1|2|4      2|3|5                1           2
1          2                    1           2
4          5                    4           5
5|7        6|8                  5           6
8|9|2|4    9|0|3|5              8           9
10|11|12   13                  10          13


Number of cases read:  6    Number of cases listed:  6
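
If the string variables have already been cut down to the leading number, ALTER TYPE may offer a way to make them numeric without the rename, compute, and delete steps. The one-line sketch below assumes that starting point and an SPSS version that has ALTER TYPE; the F5.0 format simply mirrors the example above.

```spss
* Convert the string variables to numeric in place, keeping their names.
ALTER TYPE sp_counts_1 TO sp_counts_10 (F5.0).
```

Any value that cannot be read as a number should end up system-missing, so a quick FREQUENCIES or DESCRIPTIVES check afterwards is sensible.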