I'm trying to convert kilometer values in one column of a dataframe to mile values. I've tried various things and this is what I have now:
def km_dist(column, dist):
length = len(column)
for dist in zip(range(length), column):
if (column == data["dist"] and dist in data.loc[(data["dist"] > 25)]):
return dist / 5820
else:
return dist
data = data.apply(lambda x: km_dist(data["dist"], x), axis=1)
The dataset I'm working with looks something like this:
past_score dist income lab score gender race income_bucket plays_sports student_id lat long
0 8.091553 11.586920 67111.784934 0 7.384394 male H 3 0 1 0.0 0.0
1 8.091553 11.586920 67111.784934 0 7.384394 male H 3 0 1 0.0 0.0
2 7.924539 7858.126614 93442.563796 1 10.219626 F W 4 0 2 0.0 0.0
3 7.924539 7858.126614 93442.563796 1 10.219626 F W 4 0 2 0.0 0.0
4 7.726480 11.057883 96508.386987 0 8.544586 M W 4 0 3 0.0 0.0
With my code above, I'm trying to loop through all the "dist" values and if those values are in the right column ("data["dist"]") and greater than 25, divide those values by 5820 (the number of feet in a kilometer). More generally, I'd like to find a way to operate on specific elements of dataframes. I'm sure this is at least a somewhat common question, I just haven't been able to find an answer for it. If someone could point me towards somewhere with an answer, I would be just as happy.
CodePudding user response:
Instead your solution filter rows with mask and divide column dist
by 5820
:
data.loc[data["dist"] > 25, 'dist'] /= 5820
Working same like:
data.loc[data["dist"] > 25, 'dist'] = data.loc[data["dist"] > 25, 'dist'] / 5820
data.loc[data["dist"] > 25, 'dist'] /= 5820
print (data)
past_score dist income lab score gender race \
0 8.091553 11.586920 67111.784934 0 7.384394 male H
1 8.091553 11.586920 67111.784934 0 7.384394 male H
2 7.924539 1.350194 93442.563796 1 10.219626 F W
3 7.924539 1.350194 93442.563796 1 10.219626 F W
4 7.726480 11.057883 96508.386987 0 8.544586 M W
income_bucket plays_sports student_id lat long
0 3 0 1 0.0 0.0
1 3 0 1 0.0 0.0
2 4 0 2 0.0 0.0
3 4 0 2 0.0 0.0
4 4 0 3 0.0 0.0