I'm looking at this question but can't work out how to apply it to my case: Change Value of a Dataframe Column Based on a Filter
I need to cap the values in medianIncome: anything 0.4999 or lower should become 0.4999, and anything 15.0001 or higher should become 15.0001.
Here's sample data:
           id  longitude_x  latitude  ocean_proximity  longitude_y  state  medianHouseValue  housingMedianAge  totalBedrooms  totalRooms  households  population  medianIncome
     0      1      -122.23     37.88         NEAR BAY      -122.23     CA           452.603              45.0          131.0       884.0       130.0       323.0       83252.0
     1    396      -122.34     37.88         NEAR BAY      -122.23     CA           350.004              41.0          930.0      3063.0       926.0      2560.0       17375.0
     2    398      -122.29     37.88         NEAR BAY      -122.23     CA           216.703              54.0          263.0      1211.0       230.0       525.0       38672.0
     3    401      -122.28     37.88         NEAR BAY      -122.23     CA           261.303              55.0          333.0      1845.0       335.0       772.0       42614.0
     4    424      -122.26     37.88         NEAR BAY      -122.23     CA           391.803              53.0          418.0      2553.0       404.0       898.0       62425.0
   ...    ...          ...       ...              ...          ...    ...               ...               ...            ...         ...         ...         ...           ...
929044   9476      -123.38     39.37           INLAND      -121.24     CA           124.601              20.0          813.0      3947.0       732.0      1902.0       26424.0
929045   9494      -123.75     39.37           INLAND      -121.24     CA           151.403              20.0          299.0      1377.0       282.0       830.0       32500.0
929046  10065      -121.03     39.37           INLAND      -121.24     CA            85.000              15.0          327.0      1338.0       310.0      1174.0       26341.0
929047  10074      -120.10     39.37           INLAND      -121.24     CA           117.301              34.0          411.0      2328.0       373.0      1016.0       45208.0
929048  21558      -121.24     39.37           INLAND      -121.24     CA            89.401              18.0          616.0      2787.0       532.0      1387.0       23886.0
It shows:
np.where((df['x'] > 0) & (df['y'] < 10), 1, 0)
So I'm at:
np.where(housing['medianIncome'] > 15.0001
And I'm stuck on the rest. I can only use pandas and numpy, and I can't use lambda.
I'm expecting a result that doesn't raise an error. So far I don't have any result.
CodePudding user response:
Use Series.clip:
import pandas as pd

housing = pd.DataFrame({'medianIncome': [20, 5, 0.07]})
# Values below `lower` become `lower`; values above `upper` become `upper`
housing['medianIncome'] = housing['medianIncome'].clip(lower=0.4999, upper=15.0001)
print(housing)
   medianIncome
0       15.0001
1        5.0000
2        0.4999
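Since the question started from np.where, here is a minimal sketch of the same capping written with two nested np.where calls. It assumes the same three-row toy frame as above rather than the real housing data, and the variable name income is just a local alias for illustration.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real data (assumption for illustration)
housing = pd.DataFrame({'medianIncome': [20, 5, 0.07]})
income = housing['medianIncome']

# Outer where caps values at the top; inner where raises values at the bottom;
# anything in between is passed through unchanged.
housing['medianIncome'] = np.where(income >= 15.0001, 15.0001,
                                   np.where(income <= 0.4999, 0.4999, income))
print(housing)
```

This should give the same result as clip for this input, but clip is shorter and easier to read. If you ever combine several comparisons with & or | inside np.where, wrap each comparison in parentheses, because & binds more tightly than > and <.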
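Alternatively, use numpy.select if you need to set different values for each condition:
import numpy as np

# Start again from the original values, since clip above already changed the column
housing = pd.DataFrame({'medianIncome': [20, 5, 0.07]})
housing['medianIncome'] = np.select([housing['medianIncome'].lt(0.4999),
                                     housing['medianIncome'].gt(15.0001)],
                                    [0, 1],
                                    default=housing['medianIncome'])
print(housing)
   medianIncome
0           1.0
1           5.0
2           0.0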
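Another option that stays within plain pandas is boolean masks with .loc. This is only a sketch on the same toy frame (an assumption, not the real housing data), and the names low and high are made up for the example.

```python
import pandas as pd

# Toy frame standing in for the real data (assumption for illustration)
housing = pd.DataFrame({'medianIncome': [20, 5, 0.07]})

# Build both masks first, before any value is overwritten
low = housing['medianIncome'] <= 0.4999
high = housing['medianIncome'] >= 15.0001

# Assign the boundary values only to the rows selected by each mask
housing.loc[low, 'medianIncome'] = 0.4999
housing.loc[high, 'medianIncome'] = 15.0001
print(housing)
```

This form is handy when you want to inspect or count the affected rows first, for example with low.sum() and high.sum(), before changing anything.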