Home > Software engineering >  How to group unique IDs to a single ID and put into new column?
How to group unique IDs to a single ID and put into new column?

Time:09-28

I have a dataframe that look like this:

HR O2Sat Temp DBP Resp P_ID
96 99.2 36.50 60.0 10.0 1
95 100.0 39.50 68.5 12.0 1
110 85.8 37.95 58.5 19.2 2
100 95.5 35.45 45.9 11.5 2
89 98.0 38.10 65.4 10.5 3
98 100.0 36.50 49.5 15.8 3
102 100.0 37.45 60.0 16.0 4
115 95.0 38.05 55.8 14.5 4

I want to have the 1st two rows (P_ID = 1) under single ID (ID = 1), and so on. So that when I do analysis on basis of patient ID, it will consider all rows having that ID as a group, instead of dealing them as separate rows. Like this: pic

How I can do that? Please help.


Update:

This will be my desired output of my dataframe. This is the time-series data from PhysioNet Sepsis Challenge 2019. I considered every row as a separate patient. But this is not good approach. I want to do my analysis patient wise no matter how many entries (rows) that patient has. Not considering individual row as a separate patient, but the individual row as a data of that patient having unique ID.

wd

CodePudding user response:

Let's imagine the following DataFrame:

  col1  col2
0    A     1
1    A     2
2    A     3
3    B     4
4    B     5
5    C     6
6    C     7
7    D     8

You can mask the duplicated keys:

df['col1'] = df.mask(df['col1'].duplicated(), '')

output:

  col1  col2
0    A     1
1          2
2          3
3    B     4
4          5
5    C     6
6          7
7    D     8

NB. be aware that the content of the dataframe changes, now you cannot use the value A in row 2 for instance

CodePudding user response:

You can try:

df.loc[df['P_ID'].eq(df['P_ID'].shift()), 'P_ID'] = ''
df = df.set_index('P_ID').reset_index()

Output:

>>> df
  P_ID   HR  O2Sat   Temp   DBP  Resp
0    1   96   99.2  36.50  60.0  10.0
1        95  100.0  39.50  68.5  12.0
2    2  110   85.8  37.95  58.5  19.2
3       100   95.5  35.45  45.9  11.5
4    3   89   98.0  38.10  65.4  10.5
5        98  100.0  36.50  49.5  15.8
6    4  102  100.0  37.45  60.0  16.0
7       115   95.0  38.05  55.8  14.5
  • Related