How to convert pandas dataframe to geopandas?

Time:09-19

I'm trying to load an Excel file and convert it to a GeoDataFrame.

import pandas as pd 
import geopandas as gpd
df = pd.read_excel('Centroids.xlsx')
df.head()

servicename servicecentroid
0   Mönchengladbach, Kreisfreie Stadt   POINT (4070115.425463234 3123463.773862813)
1   Mettmann, Kreis POINT (4109488.971501033 3131686.7549837814)
2   Düsseldorf, Kreisfreie Stadt    POINT (4098292.026333667 3129901.416880203)

Then I try to convert it to a GeoDataFrame, but the following error occurs:

gdf = gpd.GeoDataFrame(df, geometry='servicecentroid')
TypeError: Input must be valid geometry objects: POINT (4070115.425463234 3123463.773862813)

What is wrong with my data? Any help is appreciated.

Thank you.

CodePudding user response:

Are the values in servicecentroid actual Point objects? To create a GeoDataFrame, you need a 'geometry' column that contains actual Point objects. For example:

from shapely.geometry import Point

df = pd.DataFrame({'servicename':['Mönchengladbach, Kreisfreie Stadt', 'Mettmann, Kreis', 'Düsseldorf, Kreisfreie Stadt'], 'geometry':[Point(4070115.425463234, 3123463.773862813), Point(4109488.971501033, 3131686.7549837814), Point(4098292.026333667, 3129901.416880203)]})

gdf = gpd.GeoDataFrame(df)
print(gdf.dtypes)

This will output (notice the geometry dtype):

servicename      object
geometry       geometry
dtype: object

Note that there is a comma separating the Point values, so:

Point(4070115.425463234, 3123463.773862813)

... instead of:

Point(4070115.425463234 3123463.773862813)

Edit: To make your life even easier, you can run the following code to turn the strings in your original dataframe into actual Point objects. It takes the original values, splits them, and rebuilds them as Points.

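import re
from shapely.geometry import Point

def my_func(x):
    # Take the text between the parentheses and split it into x and y
    l = re.search(r'\((.*?)\)', x).group(1).split(' ')
    return Point(float(l[0]), float(l[1]))

# The original dataframe keeps the strings in 'servicecentroid'
df['geometry'] = df['servicecentroid'].transform(my_func)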
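
To see what the parsing step does without shapely, here is a minimal standard-library sketch. The helper name parse_point_wkt is hypothetical (it is not part of shapely or geopandas), and it assumes plain 2D values of the form "POINT (x y)" as shown in the question.

```python
import re

# Hypothetical helper for illustration only: it accepts 2D WKT points such as
# "POINT (x y)" and rejects anything else instead of guessing.
POINT_WKT = re.compile(r"^\s*POINT\s*\(\s*([^\s()]+)\s+([^\s()]+)\s*\)\s*$")


def parse_point_wkt(text):
    """Return the (x, y) coordinates of a 2D WKT point as floats."""
    match = POINT_WKT.match(text)
    if match is None:
        raise ValueError(f"not a 2D WKT point: {text!r}")
    return float(match.group(1)), float(match.group(2))


# Value taken from the first row of the question's data.
x, y = parse_point_wkt("POINT (4070115.425463234 3123463.773862813)")
print(x, y)
```

The two floats are what ends up being passed to Point(x, y). Raising on input that does not match makes malformed rows visible early rather than letting them fail later with an IndexError.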

CodePudding user response:

  • it appears that servicecentroid is a column of WKT strings
  • the geometry argument of GeoDataFrame() accepts a column name or a list/array/series, but in either case the values must be geometry objects, not strings
  • hence it is simple to convert the series of WKT strings to a series of geometry objects using shapely
import pandas as pd
import io
import shapely.wkt
import geopandas as gpd

df = pd.read_csv(
    io.StringIO(
        """servicename  servicecentroid
0   Mönchengladbach, Kreisfreie Stadt   POINT (4070115.425463234 3123463.773862813)
1   Mettmann, Kreis  POINT (4109488.971501033 3131686.7549837814)
2   Düsseldorf, Kreisfreie Stadt    POINT (4098292.026333667 3129901.416880203)"""
    ),
    sep=r"\s{2,}",  # raw string; split on runs of two or more whitespace characters
    engine="python",
)


# NB CRS is missing,  looks like it is a UTM CRS....
gpd.GeoDataFrame(df, geometry=df["servicecentroid"].apply(shapely.wkt.loads))
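
As an alternative sketch, assuming geopandas 0.9 or newer, GeoSeries.from_wkt can parse the whole column in one call instead of applying shapely.wkt.loads row by row. The column names follow the question, and the small dataframe is rebuilt inline only to keep the example self-contained. The CRS is left unset because it is not stated anywhere in the question.

```python
import pandas as pd
import geopandas as gpd

# Same three rows as in the question, with the geometry still stored as text.
df = pd.DataFrame(
    {
        "servicename": [
            "Mönchengladbach, Kreisfreie Stadt",
            "Mettmann, Kreis",
            "Düsseldorf, Kreisfreie Stadt",
        ],
        "servicecentroid": [
            "POINT (4070115.425463234 3123463.773862813)",
            "POINT (4109488.971501033 3131686.7549837814)",
            "POINT (4098292.026333667 3129901.416880203)",
        ],
    }
)

# Parse the WKT strings into geometry objects and store them in a new column.
df["geometry"] = gpd.GeoSeries.from_wkt(df["servicecentroid"])

# The column now holds geometry objects, so it can be passed by name.
gdf = gpd.GeoDataFrame(df.drop(columns="servicecentroid"), geometry="geometry")

# The CRS is unknown here; once it is known it can be attached with set_crs().
print(gdf)
```

Passing the column by name only works here because the values were converted first; the TypeError in the question comes from the values still being strings.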