I have a dataset in which I want to create a new column by dividing one existing column by another, using a for-loop with if-conditions.
This is the dataset, with the empty 'solo_fare' column created beforehand.
The task is to loop through each row and divide 'Fare' by 'relatives' to get the per-passenger fare. However, there are conditions to follow: passengers in this category should see per-passenger prices between 3 and 8.
The code I have tried doesn't seem to fill in the 'solo_fare' rows at all. It returns an empty column (the same as the df above).
for i in range(0, len(fare_result)):
    p = fare_result.iloc[i]['Fare']/fare_result.iloc[i]['relatives']
    q = fare_result.iloc[i]['Fare']
    r = fare_result.iloc[i]['relatives']
    # if relatives == 0, return original Fare amount
    if (r == 0):
        fare_result.iloc[i]['solo_fare'] = q
    # if the divided fare is below 3 or more than 8, return original Fare amount again
    elif (p < 3) or (p > 8):
        fare_result.iloc[i]['solo_fare'] = q
    # else, return the divided fare to get solo_fare
    else:
        fare_result.iloc[i]['solo_fare'] = p
How can I get this to work?
CodePudding user response:
You probably shouldn't use a loop for this; use loc instead.
If you first create the 'solo_fare' column and give every row the default value from 'Fare', you can then overwrite the value only for the rows that meet the conditions you have set out:
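# Start with the original fare as the default for every row
fare_result['solo_fare'] = fare_result['Fare']
# Per-passenger fare; rows with relatives == 0 give inf/NaN and never pass the mask
per_person = fare_result['Fare'] / fare_result['relatives']
# Overwrite only where the divided fare lies between 3 and 8 (inclusive)
mask = (per_person >= 3) & (per_person <= 8)
fare_result.loc[mask, 'solo_fare'] = per_person[mask]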
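To check the idea on a small case, here is a minimal sketch of the same vectorised approach, written with Series.where and Series.between instead of loc. The sample values and the per_person name are made up for illustration; only the 'Fare', 'relatives' and 'solo_fare' column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; the real fare_result has more rows and columns.
fare_result = pd.DataFrame({
    "Fare": [7.25, 15.0, 20.0, 8.0, 30.0],
    "relatives": [0, 3, 10, 1, 2],
})

# Per-passenger fare; relatives == 0 yields inf (or NaN when Fare is also 0).
per_person = fare_result["Fare"] / fare_result["relatives"]

# Keep the divided fare where it lies in [3, 8]; fall back to Fare elsewhere.
fare_result["solo_fare"] = per_person.where(
    per_person.between(3, 8), fare_result["Fare"]
)

print(fare_result)
```

Because inf and NaN never fall inside the 3 to 8 range, the rows with zero relatives keep their original fare without needing a separate branch.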
CodePudding user response:
Did you try initializing the new column first?
By that I mean that the statement fare_result.iloc[i]['solo_fare'] = q
only assigns the value q to the solo_fare field
of row i.
The issue is that, at that point, row i does not have a solo_fare
key, so you are only filling the last value of your table.
To solve this, try declaring the solo_fare
column before the for loop, like:
import numpy as np
fare_result['solo_fare'] = np.nan
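If the column already exists and the loop still leaves it empty, another likely cause is the chained indexing itself: fare_result.iloc[i] typically returns a new Series for that row, so assigning into it does not write back to fare_result. A minimal sketch of a loop that writes in a single indexing step with .at is below. The sample data and the local variable names are hypothetical, and this is only one way to keep the loop working, not necessarily the fastest.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real fare_result.
fare_result = pd.DataFrame({
    "Fare": [7.25, 15.0, 20.0, 30.0],
    "relatives": [0, 3, 10, 2],
})
fare_result["solo_fare"] = np.nan  # create the column up front

for idx in fare_result.index:
    fare = fare_result.at[idx, "Fare"]
    relatives = fare_result.at[idx, "relatives"]
    if relatives == 0:
        # No relatives: keep the original fare and avoid dividing by zero.
        solo = fare
    else:
        per_person = fare / relatives
        # Use the divided fare only when it lies between 3 and 8 (inclusive).
        solo = per_person if 3 <= per_person <= 8 else fare
    # One indexing step on the DataFrame itself, so the write is not lost on a copy.
    fare_result.at[idx, "solo_fare"] = solo

print(fare_result)
```

fare_result.loc[idx, 'solo_fare'] = solo would work the same way; the point is that the row and the column are selected in one call rather than two.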