I have a problem: instead of the updated table, the output file contains only the last item. Please help.
import random
import pandas as pd

df = pd.read_csv('patient_data_set_copy_test1.csv')
for index, row in df.iterrows():
    if row['sex'] == 'Men':
        # df1 = pd.DataFrame(colors)
        row['height_p'] = random.randint(149, 192)
        row.to_csv('patient_data_set_copy_test1.csv', header=False)
This is the starting CSV file:
id,sex,age,weight_p,height_p,BMI,Smoke,Smoke_Years,Smoke_amount_day,Chol_All,LDL,HDL,Sugar1,Sugar2,Sugar3,Systolic_pressure,Diastolic_presurre,Likelihood_of_obesity,Likelihood_of_diabetes,Likelihood_of_coronary_heart_disease
0,Woman,45,,,,Nie,,,,,,,,,,,,,
1,Man,41,,,,Nie,,,,,,,,,,,,,
2,Woman,26,,,,Tak,,,,,,,,,,,,,
3,Men,72,,,,Nie,,,,,,,,,,,,,
4,Woman,69,,,,Tak,,,,,,,,,,,,,
.
.
.
11342, Man,41,,,,Nie,,,,,,,,,,,,,
This is the result:
id,11357
sex,Men
age,82.0
weight_p,
height_p,173
BMI,
Smoke,Tak
Smoke_Years,
Smoke_amount_day,
Chol_All,
LDL,
HDL,
Sugar1,
Sugar2,
Sugar3,
Systolic_pressure,
Diastolic_presurre,
Likelihood_of_obesity,
Likelihood_of_diabetes,
Likelihood_of_coronary_heart_disease,
I would like to find the exact index of each selected man, update that row, and then write the result back to the CSV file. Thanks for all the replies.
CodePudding user response:
IIUC, there is no need for a loop here; you can use numpy.where with pandas.Series.eq.
Try this:
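import pandas as pd
import numpy as np

df = pd.read_csv('patient_data_set_copy_test1.csv')
# Draw one height per row; np.random.randint excludes the upper bound, so 193 allows 192
heights = np.random.randint(149, 193, size=len(df))
# Rows that are not 'Men' keep whatever height_p they already had
df['height_p'] = np.where(df['sex'].eq('Men'), heights, df['height_p'])
df.to_csv('patient_data_set_copy_test1.csv', index=False)  # overwrite the old file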
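The same idea can also be written with a boolean mask and DataFrame.loc, which touches only the matching rows. Below is a minimal sketch on a small in-memory frame. The sample rows and the seed are made up for illustration, and it assumes that only the exact label 'Men' should match (the sample file also contains 'Man').

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the CSV file
df = pd.DataFrame({
    "id": [0, 1, 2, 3, 4],
    "sex": ["Woman", "Man", "Woman", "Men", "Men"],
    "height_p": [np.nan] * 5,
})

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
mask = df["sex"].eq("Men")      # True for the rows to update

# One independent draw per matching row; endpoint=True makes 192 inclusive
df.loc[mask, "height_p"] = rng.integers(149, 192, size=mask.sum(), endpoint=True)

print(df)
```

The index labels of the selected rows are available as df.index[mask] if you need them.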
CodePudding user response:
To update the height_p column for all rows in your DataFrame that have a sex of "Men" with a random integer between 149 and 192:
import random
import pandas as pd

df = pd.read_csv('patient_data_set_copy_test1.csv')
# Iterate over the rows of the DataFrame
for index, row in df.iterrows():
    if row['sex'] == 'Men':
        # Update the height_p column with a random integer between 149 and 192
        df.loc[index, 'height_p'] = random.randint(149, 192)
# Save the updated DataFrame to the CSV file; index=False avoids writing an extra index column
df.to_csv('patient_data_set_copy_test1.csv', index=False)
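As for why the original attempt produced a single record: each row yielded by iterrows is a pandas Series, and calling to_csv on a Series writes only that one row, as one label,value line per column. Writing the whole DataFrame keeps the table layout. A small sketch that shows the difference, using a made-up two-row frame and in-memory buffers instead of real files:

```python
import io
import pandas as pd

# Made-up two-row frame standing in for the patient file
df = pd.DataFrame({"id": [0, 1], "sex": ["Woman", "Men"], "height_p": [None, None]})

# Writing a single row (a Series) gives one "label,value" line per column
row = df.iloc[1]
row_buf = io.StringIO()
row.to_csv(row_buf, header=False)
row_lines = row_buf.getvalue().splitlines()

# Writing the DataFrame gives one header line plus one line per row
df_buf = io.StringIO()
df.to_csv(df_buf, index=False)
df_lines = df_buf.getvalue().splitlines()

print(row_lines)
print(df_lines)
```

That is the same shape as the result shown in the question, so the fix is to modify df itself and call df.to_csv once.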