How to group unique IDs to a single ID and put into new column?-CodePudding

I have a dataframe that look like this:

HR	O2Sat	Temp	DBP	Resp	P_ID
96	99.2	36.50	60.0	10.0	1
95	100.0	39.50	68.5	12.0	1
110	85.8	37.95	58.5	19.2	2
100	95.5	35.45	45.9	11.5	2
89	98.0	38.10	65.4	10.5	3
98	100.0	36.50	49.5	15.8	3
102	100.0	37.45	60.0	16.0	4
115	95.0	38.05	55.8	14.5	4

I want to have the 1st two rows (P_ID = 1) under single ID (ID = 1), and so on. So that when I do analysis on basis of patient ID, it will consider all rows having that ID as a group, instead of dealing them as separate rows. Like this:

How I can do that? Please help.

Update:

This will be my desired output of my dataframe. This is the time-series data from PhysioNet Sepsis Challenge 2019. I considered every row as a separate patient. But this is not good approach. I want to do my analysis patient wise no matter how many entries (rows) that patient has. Not considering individual row as a separate patient, but the individual row as a data of that patient having unique ID.

CodePudding user response：

Let's imagine the following DataFrame:

  col1  col2
0    A     1
1    A     2
2    A     3
3    B     4
4    B     5
5    C     6
6    C     7
7    D     8

You can mask the duplicated keys:

df['col1'] = df.mask(df['col1'].duplicated(), '')

output:

  col1  col2
0    A     1
1          2
2          3
3    B     4
4          5
5    C     6
6          7
7    D     8

NB. be aware that the content of the dataframe changes, now you cannot use the value A in row 2 for instance

CodePudding user response：

You can try:

df.loc[df['P_ID'].eq(df['P_ID'].shift()), 'P_ID'] = ''
df = df.set_index('P_ID').reset_index()

Output:

>>> df
  P_ID   HR  O2Sat   Temp   DBP  Resp
0    1   96   99.2  36.50  60.0  10.0
1        95  100.0  39.50  68.5  12.0
2    2  110   85.8  37.95  58.5  19.2
3       100   95.5  35.45  45.9  11.5
4    3   89   98.0  38.10  65.4  10.5
5        98  100.0  36.50  49.5  15.8
6    4  102  100.0  37.45  60.0  16.0
7       115   95.0  38.05  55.8  14.5