How to bring DataFrame values into a certain range

Time:02-15

Here is my dataset

Address                                                 Latitude        Longitude
Bandar Udara Sultan Aji Muhammad Sulaiman               -12.658.151     1.168.977.258
Bandar Udara Halim Perdanakusuma                        -62.652.088     1.068.863.459
Bandar Udara Internasional Sultan Mahmud Badaruddin II  -28.947.992     1.047.046.471
Bandar Udara Internasional Zainuddin Abdul Madjid        -8.761.939     1.162.735.566
Bandar Udara Internasional Sultan Syarif Kasim II         4.645.874     1.014.477.217

Latitude should be between -10 and 10, and Longitude between 100 and 200, so I want to rescale the values (move the decimal point) to turn them into standard latitude and longitude values.

Here's my expected dataset

Address                                                 Latitude        Longitude
Bandar Udara Sultan Aji Muhammad Sulaiman                -1.2658151     116.8977258
Bandar Udara Halim Perdanakusuma                         -6.2652088     106.8863459
Bandar Udara Internasional Sultan Mahmud Badaruddin II   -2.8947992     104.7046471
Bandar Udara Internasional Zainuddin Abdul Madjid        -8.761939      116.2735566
Bandar Udara Internasional Sultan Syarif Kasim II         4.645874      101.4477217

CodePudding user response:

This is a very custom solution that relies on your coordinates falling in specific ranges: longitude always has three digits before the decimal point, and latitude always has one (the minus sign is optional).

Basically, you first remove all the dots, then insert a single dot after the first three digits for the longitude and after the first digit for the latitude.
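
The code that follows uses the `.str` accessor, so both coordinate columns have to hold strings rather than numbers. A minimal sketch for building the sample frame that way, assuming the values are typed inline exactly as shown in the question (if they come from a CSV file, they would need to be read as text, for example with `dtype=str`):

```python
import pandas as pd

# Sample rows from the question, kept as strings so every dot survives as typed.
df = pd.DataFrame({
    "Address": [
        "Bandar Udara Sultan Aji Muhammad Sulaiman",
        "Bandar Udara Halim Perdanakusuma",
        "Bandar Udara Internasional Sultan Mahmud Badaruddin II",
        "Bandar Udara Internasional Zainuddin Abdul Madjid",
        "Bandar Udara Internasional Sultan Syarif Kasim II",
    ],
    "Latitude": [
        "-12.658.151", "-62.652.088", "-28.947.992", "-8.761.939", "4.645.874",
    ],
    "Longitude": [
        "1.168.977.258", "1.068.863.459", "1.047.046.471",
        "1.162.735.566", "1.014.477.217",
    ],
})

print(df)
```

The answer's code operates on a `df` shaped like this one.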

import numpy as np
import pandas as pd

# Longitude
# Remove every dot, then put a single one after the first three digits
df['Longitude'] = df['Longitude'].str.replace(r'\.', '', regex=True)
df['Longitude'] = (df['Longitude'].str[:3] + '.' + df['Longitude'].str[3:]).astype(float)

# Latitude
negative_lat = df['Latitude'].str.startswith('-')
# Remove every dot and the minus sign, then put a single dot after the first digit
df['Latitude'] = df['Latitude'].str.replace(r'\.|-', '', regex=True)
df['Latitude'] = (df['Latitude'].str[:1] + '.' + df['Latitude'].str[1:]).astype(float)
df['Latitude'] = np.where(negative_lat, -1 * df['Latitude'], df['Latitude'])

print(df.dtypes)
print(df)
# Address       object
# Latitude     float64
# Longitude    float64
# dtype: object
#                                              Address  Latitude   Longitude
# 0          Bandar Udara Sultan Aji Muhammad Sulaiman -1.265815  116.897726
# 1                   Bandar Udara Halim Perdanakusuma -6.265209  106.886346
# 2  Bandar Udara Internasional Sultan Mahmud Badar... -2.894799  104.704647
# 3  Bandar Udara Internasional Zainuddin Abdul Madjid -8.761939  116.273557
# 4  Bandar Udara Internasional Sultan Syarif Kasim II  4.645874  101.447722
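
The same steps can be wrapped in one small function so that latitude and longitude share the logic. This is only a sketch: the name `fix_coordinate` and its `int_digits` parameter are hypothetical, and it still assumes that the number of digits before the decimal point is known for each column (1 for latitude, 3 for longitude in this dataset). The two-row frame is a shortened copy of the question's data.

```python
import pandas as pd


def fix_coordinate(values: pd.Series, int_digits: int) -> pd.Series:
    """Drop every dot, then place one decimal point after `int_digits` digits.

    Hypothetical helper; assumes each value is a string of digits and dots
    with an optional leading minus sign.
    """
    text = values.astype(str).str.strip()
    negative = text.str.startswith("-")
    # Keep the digits only: this removes the dots and the minus sign.
    digits = text.str.replace(r"[^0-9]", "", regex=True)
    fixed = (digits.str[:int_digits] + "." + digits.str[int_digits:]).astype(float)
    # Restore the sign on the rows that started with a minus.
    return fixed.where(~negative, -fixed)


df = pd.DataFrame({
    "Latitude": ["-12.658.151", "4.645.874"],
    "Longitude": ["1.168.977.258", "1.014.477.217"],
})
df["Latitude"] = fix_coordinate(df["Latitude"], int_digits=1)
df["Longitude"] = fix_coordinate(df["Longitude"], int_digits=3)

# Sanity check against the ranges stated in the question.
assert df["Latitude"].between(-10, 10).all()
assert df["Longitude"].between(100, 200).all()
print(df)
```

The range check at the end is cheap insurance: if a row ever has a different number of integer digits, the assertion fails instead of silently producing a wrong coordinate.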