I am trying to define a function with a for loop that will iterate over the Weight column and return a list of the names of patients who have a weight >= 150. I'm honestly just confused about how I should go about this. Any help will be much appreciated.
df:
   Patient  Weight  LDL
0      Rob     200  100
1      Bob     150  150
2     John     184  102
3     Phil     120  200
4  Jessica     100  143
import pandas as pd

# List of tuples: (name, weight, LDL)
Patients = [('Rob', 200, 100),
            ('Bob', 150, 150),
            ('John', 184, 102),
            ('Phil', 120, 200),
            ('Jessica', 100, 143)]
# Create a DataFrame object
df = pd.DataFrame(Patients, columns=['Patient', 'Weight', 'LDL'],
                  index=['0', '1', '2', '3', '4'])
df
def greater_150(df, outcome = 'Weight'):
    new_list = []
    patient = df['Patient']
    for column in df[['Patient', 'Weight']]:
        if outcome <= 150:
            new_list.append(patient)
    return new_list
Ideally, the output I want is:
['Rob', 'Bob', 'John']
Instead, I get this error:
TypeError: '<=' not supported between instances of 'str' and 'int'
CodePudding user response:
Here's a simple approach that avoids iteration (as is typically ideal when pandas is involved).
df[df["Weight"] >= 150].Patient
This returns the following pandas Series:
0 Rob
1 Bob
2 John
Name: Patient, dtype: object
If you want, you can make this into a list with df[df["Weight"] >= 150].Patient.tolist(), which yields ['Rob', 'Bob', 'John'].
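To see why this works, it can help to split the expression into two steps: the comparison first produces a boolean mask, and the mask then selects rows. The following is only a sketch that rebuilds the sample data from the question; the names mask and heavy are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question (default integer index).
df = pd.DataFrame(
    [("Rob", 200, 100), ("Bob", 150, 150), ("John", 184, 102),
     ("Phil", 120, 200), ("Jessica", 100, 143)],
    columns=["Patient", "Weight", "LDL"],
)

# Step 1: element-wise comparison gives one True/False per row.
mask = df["Weight"] >= 150

# Step 2: keep only the rows where the mask is True, then take the names.
heavy = df.loc[mask, "Patient"].tolist()

print(mask.tolist())  # [True, True, True, False, False]
print(heavy)          # ['Rob', 'Bob', 'John']
```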
CodePudding user response:
Generally avoid iterations, as the answer by Ben points out. But if you want to learn how to do it, here's your function modified to iterate through the rows (not the columns!):
def greater_150(df, outcome = 'Weight'):
    new_list = []
    for index, data in df.iterrows():
        if data[outcome] >= 150:
            new_list.append(data["Patient"])
    return new_list
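For context, the TypeError in the question comes from comparing the string 'Weight' itself (the value of outcome) with 150, rather than comparing the values stored in that column. If a loop is still wanted, pairing the two columns with zip is one possible alternative to iterrows. This is a sketch under my own assumptions: the function name at_least and the threshold parameter are hypothetical additions, not something from the question.

```python
import pandas as pd

def at_least(df, outcome="Weight", threshold=150):
    """Return patient names whose `outcome` value is >= threshold."""
    names = []
    # zip walks both columns in parallel, one row at a time.
    for name, value in zip(df["Patient"], df[outcome]):
        if value >= threshold:
            names.append(name)
    return names

# Sample data copied from the question.
df = pd.DataFrame(
    [("Rob", 200, 100), ("Bob", 150, 150), ("John", 184, 102),
     ("Phil", 120, 200), ("Jessica", 100, 143)],
    columns=["Patient", "Weight", "LDL"],
)

print(at_least(df))         # ['Rob', 'Bob', 'John']
print(at_least(df, "LDL"))  # ['Bob', 'Phil']
```

Passing the column name as a parameter keeps the same shape as the original function while making sure the comparison is applied to the column's values.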
CodePudding user response:
Try the following:
def greater_150(df):
    return df.loc[df["Weight"] >= 150].Patient.tolist()