How to assign new items to a selected column in an existing CSV file with pandas

I have a problem with the code below: the resulting CSV file contains only the last matching item. Please help.

df = pd.read_csv('patient_data_set_copy_test1.csv')
for index, row in df.iterrows():
    if row['sex'] == 'Men':
        # df1 = pd.DataFrame(colors)
        row['height_p'] : random.randint(149, 192)
        row.to_csv('patient_data_set_copy_test1.csv', header=False)

This is the starting CSV file:

id,sex,age,weight_p,height_p,BMI,Smoke,Smoke_Years,Smoke_amount_day,Chol_All,LDL,HDL,Sugar1,Sugar2,Sugar3,Systolic_pressure,Diastolic_presurre,Likelihood_of_obesity,Likelihood_of_diabetes,Likelihood_of_coronary_heart_disease
0,Woman,45,,,,Nie,,,,,,,,,,,,,
1,Man,41,,,,Nie,,,,,,,,,,,,,
2,Woman,26,,,,Tak,,,,,,,,,,,,,
3,Men,72,,,,Nie,,,,,,,,,,,,,
4,Woman,69,,,,Tak,,,,,,,,,,,,,
.
.
.
11342, Man,41,,,,Nie,,,,,,,,,,,,,

This is the result:

id,11357
sex,Men
age,82.0
weight_p,
height_p,173
BMI,
Smoke,Tak
Smoke_Years,
Smoke_amount_day,
Chol_All,
LDL,
HDL,
Sugar1,
Sugar2,
Sugar3,
Systolic_pressure,
Diastolic_presurre,
Likelihood_of_obesity,
Likelihood_of_diabetes,
Likelihood_of_coronary_heart_disease,

I would like to get the exact index of each selected man and then update the CSV file. Thanks for all the replies.

CodePudding user response:

IIUC, there is no need for a loop here; you can use numpy.where with pandas.Series.eq.

Try this:

import pandas as pd
import numpy as np

df = pd.read_csv('patient_data_set_copy_test1.csv')

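# Draw one height per row (size=len(df)); np.random.randint excludes the upper bound, so 193 gives 149..192.
# Rows whose sex is not 'Men' keep their existing height_p value.
df['height_p'] = np.where(df['sex'].eq('Men'), np.random.randint(149, 193, size=len(df)), df['height_p'])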

df.to_csv('patient_data_set_copy_test1.csv', index=False) #to overwrite the old file
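
Since the question also asks for the exact index of the selected rows, here is a minimal sketch of the same vectorized idea using a boolean mask with df.loc. The sample rows, the names mask, men_index and rng, and the fixed seed are assumptions made for illustration; the in-memory CSV stands in for the real file.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample rows standing in for patient_data_set_copy_test1.csv
csv_text = """id,sex,age,height_p
0,Woman,45,
1,Man,41,
2,Woman,26,
3,Men,72,
4,Men,69,
"""
df = pd.read_csv(io.StringIO(csv_text))

# Boolean mask that is True for every row whose sex is exactly 'Men'
mask = df['sex'].eq('Men')

# Index labels of the matching rows
men_index = df.index[mask]
print(list(men_index))  # → [3, 4]

# One random height per matching row; the upper bound of integers() is exclusive
rng = np.random.default_rng(seed=0)
df.loc[mask, 'height_p'] = rng.integers(149, 193, size=int(mask.sum()))

print(df)
```

Rows that do not match the mask are left untouched, so any heights already present for other patients would survive the update.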

CodePudding user response:

To update the height_p column for all rows in your DataFrame that have a sex of "Men" with a random integer between 149 and 192:

import random
import pandas as pd

df = pd.read_csv('patient_data_set_copy_test1.csv')

# Iterate over the rows of the DataFrame
for index, row in df.iterrows():
    if row['sex'] == 'Men':
        # Update the height_p column with a random integer between 149 and 192
        df.loc[index, 'height_p'] = random.randint(149, 192)

# Save the updated DataFrame to the CSV file; index=False avoids writing the row index as an extra column
df.to_csv('patient_data_set_copy_test1.csv', index=False)
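
As for why the code in the question ends up with a single record: with mixed column types like these, the row yielded by iterrows() is a copy, so changing it does not touch df, and row.to_csv(...) writes only that one Series, overwriting the file on every pass, which leaves the last matching row. The sketch below illustrates the copy behaviour on a tiny hypothetical frame (the two sample rows and the value 170 are assumptions for illustration).

```python
import pandas as pd

# Tiny hypothetical frame with mixed column types (string and float)
df = pd.DataFrame({
    'sex': ['Woman', 'Men'],
    'height_p': [float('nan'), float('nan')],
})

for index, row in df.iterrows():
    if row['sex'] == 'Men':
        # This modifies only the temporary Series, not the DataFrame
        row['height_p'] = 170

# The DataFrame still has no heights filled in
unchanged = bool(df['height_p'].isna().all())
print(unchanged)  # → True

# Writing through df.loc changes the DataFrame itself
df.loc[df['sex'] == 'Men', 'height_p'] = 170
print(df)
```

Writing the whole DataFrame once, after all updates, keeps every row in the file.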