Converting only part of string to numeric variable
Merel Bakker
2022-05-30 13:42:27 UTC
I want to convert part of a string variable into a numeric variable.
Not the easy way, with "Recode into Different Variables", where for example mild -> 0, moderate -> 1, severe -> 2, and where I can then attach the text to the numbers with value labels.
Instead, I have a string variable with a width of 2427 characters, containing mostly long sentences. From those long sentences I want to extract one word and recode that into a numeric variable. For example, given strings such as "there is an infection with toxoplasmosis" or "infection with parvo", I want the system to select only the strings that contain the word "toxoplasmosis". So far I have not been successful :)... Do I need to write specific syntax for that?

Many many thanks in advance!

Merel
Rich Ulrich
2022-06-03 18:01:47 UTC
Yes, you probably need to write specific syntax. See the documentation
for CHAR.INDEX( ), which returns the position of the first match in the
string, or 0 if there is none. Beware that the matches are case-sensitive.
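
As a rough sketch of what such syntax could look like: the variable name "diagnosis" is only a placeholder for the long string variable, and "toxo" and "infection" are made-up names for the new numeric variables. Lowercasing the text with LOWER( ) before searching is one way around the case sensitivity.

```spss
* Assumption: diagnosis is the long string variable. All names are placeholders.
* Binary flag: 1 if the word occurs anywhere in the text, 0 otherwise.
COMPUTE toxo = (CHAR.INDEX(LOWER(diagnosis), 'toxoplasmosis') > 0).
VALUE LABELS toxo 0 'not mentioned' 1 'toxoplasmosis mentioned'.

* Optional: one numeric code per keyword. A later IF overrides an earlier one.
COMPUTE infection = 0.
IF (CHAR.INDEX(LOWER(diagnosis), 'toxoplasmosis') > 0) infection = 1.
IF (CHAR.INDEX(LOWER(diagnosis), 'parvo') > 0) infection = 2.
VALUE LABELS infection 0 'other/none' 1 'toxoplasmosis' 2 'parvo'.
EXECUTE.
```

To restrict later analyses to the flagged cases, something like FILTER BY toxo. should work. Note that this is a plain substring match, so misspellings in the text would be missed.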

The one exception I can think of would be if your long strings were
computer-generated, so that two long strings with the same clue word
would be identical. In that case, you could use "Autorecode"
to obtain a set of numbers, 1 to n, for the n unique responses. I expect
that the value labels, which Autorecode creates from the string values,
would be truncations of the long strings.
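
A minimal sketch of that route, again assuming the string variable is called "diagnosis" ("diagnosis_code" is a made-up name for the new numeric variable):

```spss
* Assumption: diagnosis is the long string variable. Both names are placeholders.
* Each distinct string gets its own code, 1 to n, in alphabetical order.
AUTORECODE VARIABLES=diagnosis
  /INTO diagnosis_code
  /PRINT.
```

The /PRINT table lists which original string ended up under which code, so you can check whether the strings really are identical before relying on the codes.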
--
Rich Ulrich