I have the following pandas dataframe:
import pandas as pd
df = pd.DataFrame({'NAME': ['Paris', 'New York', 'Rio'],
                   'GEO': ['POINT (48.85 2.31647)',
                           'POINT (40.731499671618 -73.993457389558)',
                           'POINT (-22.9 -43.2)']})
print(df)
       NAME                                       GEO
0     Paris                     POINT (48.85 2.31647)
1  New York  POINT (40.731499671618 -73.993457389558)
2       Rio                       POINT (-22.9 -43.2)
I need to split the GEO column into two columns: one to store the latitude and another to store the longitude.
So, based on the code in "Adding Lat Lon coordinates to separate columns (python/dataframe)", I implemented the following:
df['GEO'].str('POINT ()').str.strip(' ', expand=True).rename(columns={0:'LAT', 1:'LONG'})
However, it is giving the error: "TypeError: 'StringMethods' object is not callable"
I would like the output to be:
NAME GEO LAT LONG
Paris POINT (48.85 2.31647) 48.85 2.31647
New York POINT (40.731499671618 -73.993457389558) 40.731499671618 -73.993457389558
Rio POINT (-22.9 -43.2) -22.9 -43.2
CodePudding user response:
You could use a regex:
df2 = df.join(df['GEO'].str.extract(r'(?P<LAT>-?\d+\.\d+) (?P<LONG>-?\d+\.\d+)'))
output:
NAME GEO LAT LONG
0 Paris POINT (48.85 2.31647) 48.85 2.31647
1 New York POINT (40.731499671618 -73.993457389558) 40.731499671618 -73.993457389558
2 Rio POINT (-22.9 -43.2) -22.9 -43.2
or, to get floats:
df2 = df.join(df['GEO'].str.extract(r'(?P<LAT>-?\d+\.\d+) (?P<LONG>-?\d+\.\d+)')
              .astype(float))
output:
NAME GEO LAT LONG
0 Paris POINT (48.85 2.31647) 48.8500 2.316470
1 New York POINT (40.731499671618 -73.993457389558) 40.7315 -73.993457
2 Rio POINT (-22.9 -43.2) -22.9000 -43.200000
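For completeness, here is a self-contained sketch of the regex approach above. It differs from the answer's pattern in two ways, both of which are my own assumptions rather than something the question requires: the parentheses are anchored, and the fractional part is optional so that whole-number coordinates such as "POINT (48 2)" would also match. The names pattern and coords are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'NAME': ['Paris', 'New York', 'Rio'],
                   'GEO': ['POINT (48.85 2.31647)',
                           'POINT (40.731499671618 -73.993457389558)',
                           'POINT (-22.9 -43.2)']})

# Named groups become the column names of the extracted frame.
# (?:\.\d+)? makes the decimal part optional (an assumption, see above).
pattern = r'\((?P<LAT>-?\d+(?:\.\d+)?) (?P<LONG>-?\d+(?:\.\d+)?)\)'

# extract() returns strings, so convert to float before joining.
coords = df['GEO'].str.extract(pattern).astype(float)
df2 = df.join(coords)

print(df2[['NAME', 'LAT', 'LONG']])
```

Rows that do not match the pattern would come back as NaN in both columns, so a quick df2['LAT'].isna().any() check can flag malformed input.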
CodePudding user response:
You were very close, but .str is an accessor, not a function, so you cannot call it as .str(). Modify your code to this and it works (though it won't be quite as sleek as your one-liner):
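df[['POINT', 'LAT', 'LONG']] = df['GEO'].str.split(' ', expand=True).rename(columns={0: 'POINT', 1: 'LAT', 2: 'LONG'})
# regex=False treats the parentheses as literal characters; older pandas versions default to regex=True and would fail on '('
df['LAT'] = df['LAT'].str.replace('(', '', regex=False)
df['LONG'] = df['LONG'].str.replace(')', '', regex=False)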
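You can then drop the helper column, for example with df = df.drop(columns='POINT'). Note that LAT and LONG are still strings at this point; use .astype(float) if you need numbers.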
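If you would rather keep it close to the one-liner you were aiming for, something like the sketch below might work. It assumes every value has exactly the form "POINT (lat lon)" with a single space between the two numbers; the names coords and df3 are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'NAME': ['Paris', 'New York', 'Rio'],
                   'GEO': ['POINT (48.85 2.31647)',
                           'POINT (40.731499671618 -73.993457389558)',
                           'POINT (-22.9 -43.2)']})

# str.strip('POINT ()') removes any of the characters P, O, I, N, T,
# space, ( and ) from both ends, leaving e.g. "48.85 2.31647".
coords = (df['GEO'].str.strip('POINT ()')
                   .str.split(' ', expand=True)
                   .rename(columns={0: 'LAT', 1: 'LONG'})
                   .astype(float))

# join() aligns on the index, so GEO is kept and LAT/LONG are appended.
df3 = df.join(coords)
print(df3[['NAME', 'LAT', 'LONG']])
```

This is safe here only because the coordinates themselves consist of digits, a minus sign and a dot, none of which are in the set of stripped characters.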