In Python, I'm working with sales data for about 21,000 houses in a pandas DataFrame. Each house (row) has the columns 'sqft_living' (int64) and 'grade' (int64), among others. I want to increase the 'grade' of every house by 2 if its sqft_living > 400, or by 1 otherwise. How do I do this?
CodePudding user response:
import pandas as pd

df = pd.DataFrame()
df['sqft_living'] = range(396, 405)
df['grade'] = range(9)
df
So we have:
sqft_living grade
0 396 0
1 397 1
2 398 2
3 399 3
4 400 4
5 401 5
6 402 6
7 403 7
8 404 8
We apply the condition (add 2 when sqft_living > 400, add 1 otherwise, so a house with exactly 400 gets 1):
df['grade'] = df.apply(lambda row: row.grade + 2 if row.sqft_living > 400 else row.grade + 1, axis=1)
df
New dataframe:
sqft_living grade
0 396 1
1 397 2
2 398 3
3 399 4
4 400 5
5 401 7
6 402 8
7 403 9
8 404 10
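A row-wise apply works, but with around 21,000 rows a vectorized version is usually faster and arguably easier to read. Below is a minimal sketch of the same rule using numpy.where, assuming the same toy frame as above (the sample values are illustrative, not the real sales data):

```python
import numpy as np
import pandas as pd

# Same toy data as in the example above.
df = pd.DataFrame({"sqft_living": range(396, 405), "grade": range(9)})

# np.where builds an array of increments: 2 where sqft_living > 400, else 1.
increment = np.where(df["sqft_living"] > 400, 2, 1)

# Add the increments to the whole column at once, with no Python-level loop.
df["grade"] = df["grade"] + increment
print(df)
```

This should give the same grades as the apply version. On the real data, only the column names 'sqft_living' and 'grade' from the question are needed.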