Discussion:
Combining information from duplicate cases into one case
Jacob
2010-02-10 13:21:32 UTC
Hi

I have a problem with a very large dataset containing information on
blood samples acquired at different dates/times from 4500 patients.
Because each patient can provide several samples, the total number of
entries is approx. 14000.

What I need to do is to join information from multiple rows with the
same unique identifier into one "case" with information on all
samples. I need to do this in order to merge the information with
another dataset with cases identified by the same unique ID.

Example:

What I have:

ID      Result  Sample date
010160  1.3     010209 12:30
010160  2.9     010209 22:30
150347  0.9     130909 15:00
150347  0.8     130909 23:45
200820  3.2     310809 09:30
200820  4.1     310809 18:00



What I need:

ID      Result1 Sample date1    Result2 Sample date2
010160  1.3     010209 12:30    2.9     010209 22:30
150347  0.9     130909 15:00    0.8     130909 23:45
200820  3.2     310809 09:30    4.1     310809 18:00

Each patient can have from 1 to 12 blood samples drawn over the course
of a year.

Can someone please help?

Thank you!

Jacob Sorensen
Bruce Weaver
2010-02-10 13:57:55 UTC
Look up the CASESTOVARS command. You should find examples in the Help
files. You could also take a look at this tutorial:

www.ats.ucla.edu/stat/spss/modules/reshapew115.htm
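
For readers who want to see what this kind of long-to-wide restructuring does to the data, here is a minimal sketch. It is written in plain Python rather than SPSS syntax and is not the CASESTOVARS implementation; the function name cases_to_vars and the column name SampleDate are made up for illustration, and the rows mirror the example in the question with times written uniformly as HH:MM.

```python
from collections import defaultdict

# Long format: one row per blood sample (ID, result, sample date/time).
rows = [
    ("010160", 1.3, "010209 12:30"),
    ("010160", 2.9, "010209 22:30"),
    ("150347", 0.9, "130909 15:00"),
    ("150347", 0.8, "130909 23:45"),
    ("200820", 3.2, "310809 09:30"),
    ("200820", 4.1, "310809 18:00"),
]


def cases_to_vars(rows):
    """Collapse rows that share an ID into one record with numbered columns.

    Hypothetical helper: samples keep their input order, so sort the rows
    by ID and date first if the numbering should follow time.
    """
    grouped = defaultdict(list)
    for patient_id, result, sampled in rows:
        grouped[patient_id].append((result, sampled))

    wide = []
    for patient_id, samples in grouped.items():
        case = {"ID": patient_id}
        for i, (result, sampled) in enumerate(samples, start=1):
            case[f"Result{i}"] = result
            case[f"SampleDate{i}"] = sampled
        wide.append(case)
    return wide


wide = cases_to_vars(rows)
for case in wide:
    print(case)
```

A patient with fewer samples simply ends up with fewer numbered columns in this sketch, which corresponds to missing values in the wide file.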

--
Bruce Weaver
***@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/Home
"When all else fails, RTFM."
Jacob
2010-02-10 14:44:40 UTC
Thank you very much.

I had a few problems with missing data, but everything is sorted out
now.


Jacob
m***@gmail.com
2018-01-22 06:50:04 UTC
Hello, I am having the same problem. How did you fix it?
Yours,
Mohamed
David Marso
2018-01-22 21:00:25 UTC
You would do better to describe your specific problem in more depth, and perhaps start a new thread rather than hopping onto a very old post. As Bruce stated, CASESTOVARS addresses the original poster's question.
Rich Ulrich
2010-02-10 17:13:10 UTC
Post by Jacob
What I need to do is to join information from multiple rows with the
same unique identifier into one "case" with information on all
samples. I need to do this in order to merge the information with
another dataset with cases identified by the same unique ID.

No, you do not need to put all the data into one row in order
to do matching on ID.

You can use the FILE= and TABLE= subcommands of MATCH FILES in order
to match one record in one file to several records in the other with
the same ID.
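
The idea of a one-to-many match can be sketched as follows. This is plain Python, not SPSS syntax, and only illustrates the logic of a table lookup: every sample row keeps its own line and picks up the patient-level fields for its ID. The function name match_with_table and the Group field are hypothetical, invented for this example.

```python
# Long file: one row per blood sample (several rows per patient).
samples = [
    {"ID": "010160", "Result": 1.3},
    {"ID": "010160", "Result": 2.9},
    {"ID": "150347", "Result": 0.9},
    {"ID": "200820", "Result": 3.2},
]

# Lookup table: exactly one entry per patient. "Group" is a made-up field.
patients = {
    "010160": {"Group": "A"},
    "150347": {"Group": "B"},
}


def match_with_table(samples, table):
    """Attach the table entry for each row's ID to that row.

    Rows whose ID is absent from the table are kept unchanged (their table
    fields are simply missing); table entries with no sample row are dropped.
    """
    merged = []
    for row in samples:
        extra = table.get(row["ID"], {})
        merged.append({**row, **extra})
    return merged


merged = match_with_table(samples, patients)
for row in merged:
    print(row)
```

The number of rows is unchanged by the match, so no restructuring is needed when the analysis works on one line per sample.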

*If* it is better to have the separate lines, then don't
bother making a single line for each case. Of course, if the
new file format is better for you, then you do want to use
cases-to-vars.
--
Rich Ulrich