How to create a new column based on calculation done with the existing column in Pandas?

Time:09-23

import haversine as hs
import numpy as np

The input data has the latitude and longitude of the customer pincode and the branch pincode.

Input:

Customer_lat_lon | branch_lat_lon
(28.682,77.175)  | (28.599,77.334)
(19.126,72.865)  | (19.104,72.863)

I am writing a function that calculates the distance between the two columns.

def calc_distance(x, y):
    # x and y are (lat, lon) tuples; return NaN if the distance cannot be computed
    try:
        return hs.haversine(x, y)
    except Exception:
        return np.nan

Now I need a new column, distance, that holds the distance between the two columns for each row, computed with this function.

Example:

calc_distance(df['Customer_lat_lon'][0], df['branch_lat_lon'][0])

gives me a result of 18.0612

How can I do this for all the records? I have 1000 records for which the distance needs to be calculated.

Expected output:

Customer_lat_lon | branch_lat_lon  | distance
(28.682,77.175)  | (28.599,77.334) | 18.0612

CodePudding user response:

Use df.apply() to apply a function along an axis. With axis=1, the function is called once per row and receives that row as a Series.

import numpy as np
import pandas as pd
from haversine import haversine


def calc_distance(s):
    try:
        return haversine(s.customer_lat_lon, s.branch_lat_lon)
    except Exception:
        return np.nan


df = pd.DataFrame(
    {
        "customer_lat_lon": [(28.682, 77.175), (19.126, 72.865)],
        "branch_lat_lon": [(28.599, 77.334), (19.104, 72.863)],
    }
)

df["Distance"] = df.apply(calc_distance, axis=1)
print(df)

Output:

   customer_lat_lon    branch_lat_lon   Distance
0  (28.682, 77.175)  (28.599, 77.334)  18.054029
1  (19.126, 72.865)  (19.104, 72.863)   2.455300
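
df.apply with axis=1 calls the Python function once per row, which is fine for 1000 records. If the data grows, one possible alternative is to compute all the distances in a single vectorized NumPy call. The sketch below is only an illustration under stated assumptions: haversine_km is a hypothetical helper written here (it is not part of the haversine package), and EARTH_RADIUS_KM is an assumed mean Earth radius chosen so the results should agree closely with the apply-based answer.

```python
import numpy as np
import pandas as pd

# Assumed mean Earth radius in km (hypothetical constant for this sketch).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; accepts scalars or NumPy arrays in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (
        np.sin((lat2 - lat1) / 2) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


df = pd.DataFrame(
    {
        "customer_lat_lon": [(28.682, 77.175), (19.126, 72.865)],
        "branch_lat_lon": [(28.599, 77.334), (19.104, 72.863)],
    }
)

# Unpack each column of (lat, lon) tuples into an (n, 2) float array.
cust = np.array(df["customer_lat_lon"].tolist(), dtype=float)
branch = np.array(df["branch_lat_lon"].tolist(), dtype=float)

# One vectorized call computes the distance for every row at once.
df["distance"] = haversine_km(cust[:, 0], cust[:, 1], branch[:, 0], branch[:, 1])
print(df)
```

Unlike calc_distance, this sketch has no try/except fallback: a row with a missing or malformed tuple would make the array conversion fail, so such rows would need to be filtered or filled beforehand.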