Operations on specific elements of a dataframe in Python

Time:12-13

I'm trying to convert kilometer values in one column of a dataframe to mile values. I've tried various things and this is what I have now:

def km_dist(column, dist):
    length = len(column)
    for dist in zip(range(length), column):
        if (column == data["dist"] and dist in data.loc[(data["dist"] > 25)]):
            return dist / 5820
        else:
            return dist
    
data = data.apply(lambda x: km_dist(data["dist"], x), axis=1)

The dataset I'm working with looks something like this:

   past_score         dist        income  lab      score  gender  race  income_bucket  plays_sports  student_id  lat  long
0    8.091553    11.586920  67111.784934    0   7.384394    male     H              3             0           1  0.0   0.0
1    8.091553    11.586920  67111.784934    0   7.384394    male     H              3             0           1  0.0   0.0
2    7.924539  7858.126614  93442.563796    1  10.219626       F     W              4             0           2  0.0   0.0
3    7.924539  7858.126614  93442.563796    1  10.219626       F     W              4             0           2  0.0   0.0
4    7.726480    11.057883  96508.386987    0   8.544586       M     W              4             0           3  0.0   0.0

With the code above, I'm trying to loop through all the "dist" values and, if a value is in the right column (data["dist"]) and greater than 25, divide it by 5820 (the conversion factor I'm using). More generally, I'd like to find a way to operate on specific elements of a dataframe. I'm sure this is at least a somewhat common question; I just haven't been able to find an answer to it. If someone could point me to an existing answer, I would be just as happy.

CodePudding user response:

Instead of your solution, filter the rows with a boolean mask and divide the dist column by 5820:

data.loc[data["dist"] > 25, 'dist'] /= 5820

This works the same as:

data.loc[data["dist"] > 25, 'dist'] = data.loc[data["dist"] > 25, 'dist'] / 5820
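
Both forms build the boolean mask data["dist"] > 25 and pass it to .loc as the row selector. A minimal sketch of the same idea with the mask stored in a variable, assuming a small made-up frame that only has the dist column; the names mask and DIVISOR are illustrative, and 5820 is simply the divisor from the question:

```python
import pandas as pd

# Small made-up frame; only the "dist" column matters here.
data = pd.DataFrame({"dist": [11.586920, 7858.126614, 11.057883]})

DIVISOR = 5820  # divisor taken from the question as-is

# Boolean Series: True for the rows that should be changed.
mask = data["dist"] > 25

# .loc[row_selector, column] reads and assigns only the selected cells.
data.loc[mask, "dist"] = data.loc[mask, "dist"] / DIVISOR
```

Computing the mask once also avoids evaluating the comparison twice and keeps both sides of the assignment aligned on the same rows.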

data.loc[data["dist"] > 25, 'dist'] /= 5820
print (data)
   past_score       dist        income  lab      score gender race  \
0    8.091553  11.586920  67111.784934    0   7.384394   male    H   
1    8.091553  11.586920  67111.784934    0   7.384394   male    H   
2    7.924539   1.350194  93442.563796    1  10.219626      F    W   
3    7.924539   1.350194  93442.563796    1  10.219626      F    W   
4    7.726480  11.057883  96508.386987    0   8.544586      M    W   

   income_bucket  plays_sports  student_id  lat  long  
0              3             0           1  0.0   0.0  
1              3             0           1  0.0   0.0  
2              4             0           2  0.0   0.0  
3              4             0           2  0.0   0.0  
4              4             0           3  0.0   0.0  
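
For the more general case of operating on specific elements without changing the original column, Series.mask can be used as an alternative to the in-place .loc assignment. This is only a sketch: the helper scale_above, the column name dist_scaled, and the sample values are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

def scale_above(series, threshold, divisor):
    """Return a new Series with values above `threshold` divided by `divisor`."""
    # Series.mask replaces entries where the condition is True
    # and leaves all other entries unchanged.
    return series.mask(series > threshold, series / divisor)

# Small made-up frame; only the "dist" column matters here.
data = pd.DataFrame({"dist": [11.586920, 7858.126614, 11.057883]})

# Write the result to a new (hypothetical) column; "dist" stays as it was.
data["dist_scaled"] = scale_above(data["dist"], threshold=25, divisor=5820)
```

Passing the divisor as a parameter makes it easy to swap in a different constant. If a true unit conversion is intended, the value is worth double-checking: one mile is 1.609344 km or 5280 feet, and 5820 matches neither.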