I have time-series data for 5864 ICU patients, and my dataframe looks like this. Each row is one hour of a patient's ICU stay.
HR | SBP | DBP | ICULOS | Sepsis | P_ID |
---|---|---|---|---|---|
92 | 120 | 80 | 1 | 0 | 0 |
98 | 115 | 85 | 2 | 0 | 0 |
93 | 125 | 75 | 3 | 1 | 0 |
95 | 130 | 90 | 4 | 1 | 0 |
102 | 120 | 80 | 1 | 0 | 1 |
109 | 115 | 75 | 2 | 0 | 1 |
94 | 135 | 100 | 3 | 0 | 1 |
97 | 100 | 70 | 4 | 1 | 1 |
85 | 120 | 80 | 5 | 1 | 1 |
88 | 115 | 75 | 6 | 1 | 1 |
93 | 125 | 85 | 1 | 0 | 2 |
78 | 130 | 90 | 2 | 0 | 2 |
115 | 140 | 110 | 3 | 0 | 2 |
102 | 120 | 80 | 4 | 0 | 2 |
98 | 140 | 110 | 5 | 1 | 2 |
I want to select, for each patient ID, the ICULOS at which Sepsis first equals 1 (first hour only). For example, for P_ID = 0, Sepsis first equals 1 at ICULOS = 3. I did this for a single patient (a dataframe holding only that patient's data) using the code:
x = df[df['Sepsis'] == 1]["ICULOS"].values[0]
print("ICULOS at which Sepsis Label = 1 is:", x)
# Output
ICULOS at which Sepsis Label = 1 is: 46
If I want to check this for each P_ID, I would have to do it 5864 times. Can someone help me with code that uses a loop? The loop should go through each P_ID and give the ICULOS where Sepsis = 1. Looking forward to your help.
CodePudding user response:
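for x in df['P_ID'].unique():
    septic = df.query('P_ID == @x and Sepsis == 1')['ICULOS']
    if not septic.empty:  # skip patients who never have Sepsis == 1
        print(septic.iloc[0])  # .iloc[0] is positional; [0] would look up label 0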
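One detail worth checking when taking the first element of a filtered Series: with an integer index, `[0]` looks up the row labelled 0 rather than the first row. A minimal sketch using only patient 0's rows from the sample table in the question (the variable names `septic`, `first_hour` and `label_lookup_failed` are just for illustration):

```python
import pandas as pd

# Rows for patient 0 from the sample table in the question.
df = pd.DataFrame({
    "ICULOS": [1, 2, 3, 4],
    "Sepsis": [0, 0, 1, 1],
    "P_ID": [0, 0, 0, 0],
})

septic = df.query("P_ID == 0 and Sepsis == 1")["ICULOS"]

# The filtered Series keeps the original row labels (2 and 3 here),
# so the label 0 no longer exists.
print(list(septic.index))

# Positional access returns the first matching row regardless of its label.
first_hour = septic.iloc[0]
print(first_hour)

# Label-based access with [0] raises KeyError because no row is labelled 0.
try:
    septic[0]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True
print(label_lookup_failed)
```

Using `.iloc[0]` (or `.values[0]`, as in the question) avoids depending on the row labels.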
CodePudding user response:
First, filter the rows that have Sepsis = 1. This also drops every P_ID that never has Sepsis = 1, so the filtered frame contains fewer patients.
df1 = df[df.Sepsis == 1]
for pid in df.P_ID.unique():
    # Check membership in the values; `in` on a Series tests the index
    if pid not in df1.P_ID.values:
        print(f"P_ID: {pid} - no ICULOS with Sepsis Label = 1")
    else:
        iculos = df1[df1.P_ID == pid].ICULOS.values[0]
        print(f"P_ID: {pid} - ICULOS at which Sepsis Label = 1 is: {iculos}")
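The same per-patient result can probably be obtained without an explicit loop by grouping the filtered rows. The following is only a sketch under a few assumptions: it uses a subset of the sample table (patients 0 and 1), adds a hypothetical patient 3 who never has Sepsis = 1, and takes the smallest ICULOS per patient instead of the first row, so it does not rely on the rows being sorted by hour. The name `onset` is made up for illustration.

```python
import pandas as pd

# Patients 0 and 1 from the sample table in the question,
# plus a hypothetical patient 3 who never develops sepsis.
df = pd.DataFrame({
    "ICULOS": [1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 1, 2],
    "Sepsis": [0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0],
    "P_ID":   [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 3, 3],
})

# Keep only the septic rows, then take the smallest ICULOS per patient.
onset = df[df["Sepsis"] == 1].groupby("P_ID")["ICULOS"].min()

# Reindex over all patients so those without sepsis appear as NaN.
onset = onset.reindex(df["P_ID"].unique())
print(onset)
```

The result is a Series indexed by P_ID, so `onset.loc[pid]` gives the onset hour for one patient and `onset.dropna()` keeps only the patients who had sepsis.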