This may be a naive question, but can someone help me with it? Thanks in advance. Q: How can I use groupby to group by ['id', 'geometry']? Assume the GeoPandas data looks like this: pts =
id prix agent_code geometry
0 922769 3000 15 POINT (3681922.790 1859138.091)
1 1539368 3200 26 POINT (3572492.838 1806124.643)
2 922769 50 15 POINT (3681922.790 1859138.091)
3 1539368 200 26 POINT (3572492.838 1806124.643)
I have used something like this:
pts = pts.groupby(['id', 'geometry']).agg(prom_revenue=('prix', np.mean)).reset_index()
However Python raises the following error:
TypeError: '<' not supported between instances of 'Point' and 'Point'
Thanks for your help, dudes!
CodePudding user response:
Use to_wkt on the geometry column to convert the shapes to plain text:
out = pts.groupby(['id', pts['geometry'].to_wkt()], as_index=False) \
.agg(prom_revenue=('prix', np.mean))
print(out)
# Output
id prom_revenue
0 922769 1525.0
1 1539368 1700.0
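As for why the text conversion helps: the error message suggests that something tried to order the Point objects with `<`, which they do not support, whereas plain strings sort fine. Here is a minimal standard-library sketch of that idea. `Pt` is a hypothetical stand-in for a point geometry (not shapely's class), and the grouping loop only imitates what groupby does; it is not how pandas implements it.

```python
from collections import defaultdict
from statistics import mean


class Pt:
    """Hypothetical stand-in for a point: hashable, but with no ordering."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Pt) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))

    def to_text(self):
        # WKT-like text form, used only as a sortable grouping key.
        return f"POINT ({self.x:.3f} {self.y:.3f})"


# (id, prix, geometry) rows copied from the question.
rows = [
    (922769, 3000, Pt(3681922.790, 1859138.091)),
    (1539368, 3200, Pt(3572492.838, 1806124.643)),
    (922769, 50, Pt(3681922.790, 1859138.091)),
    (1539368, 200, Pt(3572492.838, 1806124.643)),
]

# Sorting the distinct points fails because Pt defines no "<".
try:
    sorted({p for _, _, p in rows})
    sort_error = None
except TypeError as exc:
    sort_error = str(exc)

# Grouping on (id, text) keys works: ints and strings can be ordered.
groups = defaultdict(list)
for id_, prix, p in rows:
    groups[(id_, p.to_text())].append(prix)

result = {key: mean(values) for key, values in sorted(groups.items())}
```

Here `sort_error` holds a message of the same shape as the one in the question, and `result` maps each (id, text) pair to its mean prix.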
CodePudding user response:
If you take Corralien's answer a step further, you can restore the points again like this:
out = pts.groupby(['id', pts['geometry'].to_wkt()]).agg(prom_revenue=('prix', np.mean)).reset_index()
out.columns = ['id', 'geometry', 'pro_revenue']
out['geometry'] = gp.GeoSeries.from_wkt(out['geometry'])
out = gp.GeoDataFrame(out)
print(out)
Output:
id geometry pro_revenue
0 922769 POINT (3681922.790 1859138.091) 1525.0
1 1539368 POINT (3572492.838 1806124.643) 1700.0
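Another option, if every id always has the same geometry (as in the sample data), might be to group by id alone and carry the geometry along with 'first', so the points never become group keys. A rough sketch under that assumption, using plain pandas with a hypothetical `Pt` class in place of real shapely points; I have not run it against an actual GeoDataFrame, where you may still need to wrap the result in gp.GeoDataFrame.

```python
import pandas as pd


class Pt:
    """Hypothetical stand-in for a point geometry; it defines no ordering."""

    def __init__(self, x, y):
        self.x, self.y = x, y


a = Pt(3681922.790, 1859138.091)
b = Pt(3572492.838, 1806124.643)

# Same rows as in the question, with Pt objects in the geometry column.
pts = pd.DataFrame({
    "id": [922769, 1539368, 922769, 1539368],
    "prix": [3000, 3200, 50, 200],
    "agent_code": [15, 26, 15, 26],
    "geometry": [a, b, a, b],
})

# Group on id only; take the first geometry per id and the mean of prix.
out = pts.groupby("id", as_index=False).agg(
    geometry=("geometry", "first"),
    prom_revenue=("prix", "mean"),
)
```

This only makes sense when id determines geometry. If one id can have several points, grouping on the text form as in the answers above is the safer route.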