!unzip https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_ua10_500k.zip
I'm using the dataset above to explode rows in a GeoPandas dataframe.
import geopandas as gpd
# Read shapefile
test = gpd.read_file("cb_2018_us_ua10_500k")
# Split NAME10 column to extract city & state (n must be passed by keyword in pandas 2.0 and later)
test[['city', 'state_names']] = test['NAME10'].str.split(',', n=1, expand=True)
# Remove trailing & leading spaces
test[['city', 'state_names']] = test[['city', 'state_names']].apply(lambda x: x.str.strip())
test.head()
UACE10 AFFGEOID10 GEOID10 NAME10 LSAD10 UATYP10 ALAND10 AWATER10 geometry city state_names
0 88732 400C100US88732 88732 Tucson, AZ 75 U 915276150 2078695 MULTIPOLYGON (((-110.81345 32.11910, -110.7987... Tucson AZ
1 01819 400C100US01819 01819 Alturas, CA 76 C 4933312 16517 MULTIPOLYGON (((-120.54610 41.51264, -120.5459... Alturas CA
2 22366 400C100US22366 22366 Davenport, IA--IL 75 U 357345121 21444164 MULTIPOLYGON (((-90.36678 41.53636, -90.36462 ... Davenport IA--IL
3 93322 400C100US93322 93322 Waynesboro, PA--MD 76 C 45455957 88872 MULTIPOLYGON (((-77.50746 39.71577, -77.50605 ... Waynesboro PA--MD
4 02548 400C100US02548 02548 Angola, IN 76 C 23646957 3913803 MULTIPOLYGON (((-85.01157 41.59300, -85.00589 ... Angola IN
I'm trying to explode state_names into one row per state:
test.assign(state=test["state_names"].str.split("--")).explode('state')
Error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-47-5532b7b6cbdf> in <module>
----> 1 test.assign(state=city_geo["state_names"].str.split("--")).explode('state')
TypeError: explode() takes 1 positional argument but 2 were given
When I try the same thing without the geometry column, it works:
test = test[['UACE10', 'AFFGEOID10', 'GEOID10', 'NAME10', 'LSAD10', 'UATYP10',
'ALAND10', 'AWATER10', 'city', 'state_names']].head()
test.assign(state=test["state_names"].str.split("--")).explode('state')
UACE10 AFFGEOID10 GEOID10 NAME10 LSAD10 UATYP10 ALAND10 AWATER10 city state_names state
0 88732 400C100US88732 88732 Tucson, AZ 75 U 915276150 2078695 Tucson AZ AZ
1 01819 400C100US01819 01819 Alturas, CA 76 C 4933312 16517 Alturas CA CA
2 22366 400C100US22366 22366 Davenport, IA--IL 75 U 357345121 21444164 Davenport IA--IL IA
2 22366 400C100US22366 22366 Davenport, IA--IL 75 U 357345121 21444164 Davenport IA--IL IL
3 93322 400C100US93322 93322 Waynesboro, PA--MD 76 C 45455957 88872 Waynesboro PA--MD PA
3 93322 400C100US93322 93322 Waynesboro, PA--MD 76 C 45455957 88872 Waynesboro PA--MD MD
4 02548 400C100US02548 02548 Angola, IN 76 C 23646957 3913803 Angola IN IN
How can I explode a GeoPandas dataframe that has a geometry column?
CodePudding user response:
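The error message shows that, in the GeoPandas version you are running, GeoDataFrame.explode() takes no column argument: it overrides the pandas method in order to split multi-part geometries. In this case, the data can be read in as a plain pandas data frame, exploded there, and then converted back to a geopandas data frame.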
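import geopandas as gpd
import pandas as pd
url = 'https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_ua10_500k.zip'
test = gpd.read_file(url)
# Work on a plain pandas DataFrame so that pandas' explode() is used
df = pd.DataFrame(test)
# Split NAME10 into city and state(s), then remove leading & trailing spaces
df[['city', 'state_names']] = df['NAME10'].str.split(',', n=1, expand=True)
df[['city', 'state_names']] = df[['city', 'state_names']].apply(lambda x: x.str.strip())
# One row per state; the geometry is repeated for each state
df = df.assign(state=df["state_names"].str.split("--")).explode('state')
# convert df to gdf
test = gpd.GeoDataFrame(df, geometry='geometry')
test.crs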
test.crs
Output:
<Geographic 2D CRS: EPSG:4269>
Name: NAD83
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: North America - onshore and offshore: Canada - Alberta; British Columbia; Manitoba; New Brunswick; Newfoundland and Labrador; Northwest Territories; Nova Scotia; Nunavut; Ontario; Prince Edward Island; Quebec; Saskatchewan; Yukon. Puerto Rico. United States (USA) - Alabama; Alaska; Arizona; Arkansas; California; Colorado; Connecticut; Delaware; Florida; Georgia; Hawaii; Idaho; Illinois; Indiana; Iowa; Kansas; Kentucky; Louisiana; Maine; Maryland; Massachusetts; Michigan; Minnesota; Mississippi; Missouri; Montana; Nebraska; Nevada; New Hampshire; New Jersey; New Mexico; New York; North Carolina; North Dakota; Ohio; Oklahoma; Oregon; Pennsylvania; Rhode Island; South Carolina; South Dakota; Tennessee; Texas; Utah; Vermont; Virginia; Washington; West Virginia; Wisconsin; Wyoming. US Virgin Islands. British Virgin Islands.
- bounds: (167.65, 14.92, -47.74, 86.46)
Datum: North American Datum 1983
- Ellipsoid: GRS 1980
- Prime Meridian: Greenwich
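If you want to check the approach without downloading the shapefile, here is a minimal sketch on a toy GeoDataFrame. The square polygons and the explode_states helper are made up for illustration; only the NAME10 column name and the EPSG:4269 CRS are taken from the question.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Toy stand-in for the census data: hypothetical square geometries,
# same NAME10 format and CRS as in the question.
square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
gdf = gpd.GeoDataFrame(
    {"NAME10": ["Tucson, AZ", "Davenport, IA--IL"]},
    geometry=[square, square],
    crs="EPSG:4269",
)


def explode_states(gdf):
    """Hypothetical helper: return one row per state, keeping the geometry."""
    # Plain DataFrame, so that pandas' explode() is the one being called
    df = pd.DataFrame(gdf)
    # Split "city, states" once, then strip the surrounding spaces
    parts = df["NAME10"].str.split(",", n=1, expand=True)
    df = df.assign(city=parts[0].str.strip(), state_names=parts[1].str.strip())
    # Turn "IA--IL" into ["IA", "IL"] and give each entry its own row
    df = df.assign(state=df["state_names"].str.split("--")).explode("state")
    # Convert back to a GeoDataFrame using the original geometry column
    return gpd.GeoDataFrame(df, geometry=gdf.geometry.name)


result = explode_states(gdf)
print(result[["city", "state"]])
print(result.crs)
```

Davenport shows up twice in the result, once for IA and once for IL, and both rows carry the same geometry and index label. Add reset_index(drop=True) if you need a unique index afterwards.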