How to concatenate back dataframes that were divided with np.array_split and processed in a loop?
I have a dataframe loaded from a shapefile (a csv would be the same, except for the geometry part) and some other dataframes. To speed up the whole process, I split df into 10 parts, then run a loop and get 10 separate dataframes.
After that, I could export each dataframe as a shp or csv file, then write code that loops through the directory, finds the corresponding files and merges them, but I would like to do that without exporting files, directly after the loop ends. Can this be done?
import geopandas as gpd
import pandas as pd
import numpy as np

# geopandas reads shapefiles with read_file (there is no gpd.read_csv)
df = gpd.read_file(r'E:\...\Polygons.shp')
some_other_df = gpd.read_file(r'E:\...\Small_polygon.shp')
points = gpd.read_file(r'E:\...\points.shp')
df_split = np.array_split(df, 10)

for i, v in enumerate(df_split, 0):
    # do something here
    points_clip = gpd.clip(points, v)
    some_other_df_Clip = gpd.clip(some_other_df, v)
    new_dataframe = ...
    # here I get 10 separate dataframes
    new_dataframe.to_file(fr'W:\...\final_{i}.shp')

# how to merge all 10 new_dataframe to one?
CodePudding user response:
IIUC, collect each result in a list inside the loop and concatenate them once after the loop ends:

list_of_dataframes = []
for i, v in enumerate(df_split, 0):
    # Your logic goes here
    list_of_dataframes.append(new_dataframe)

df_final = pd.concat(list_of_dataframes, ignore_index=True)
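
For reference, here is a minimal self-contained sketch of the same collect-then-concat pattern. It assumes plain pandas instead of geopandas so that it runs without any shapefiles. `process_chunk` and the `id`/`value` columns are hypothetical stand-ins for the clipping logic in the question. The sketch splits row positions with `np.array_split` and selects each chunk with `iloc`, rather than passing the DataFrame itself to `np.array_split`. That choice is an assumption on my side: it should avoid the deprecation warning that newer pandas versions can emit for the latter.

```python
import numpy as np
import pandas as pd


def process_chunk(chunk: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical per-chunk work; stands in for the gpd.clip logic."""
    out = chunk.copy()
    out["doubled"] = out["value"] * 2
    return out


# Sample data in place of the shapefile (assumption for illustration only)
df = pd.DataFrame({"id": range(25), "value": range(100, 125)})

# Split the row positions into 10 nearly equal groups
position_groups = np.array_split(np.arange(len(df)), 10)

results = []
for positions in position_groups:
    chunk = df.iloc[positions]            # one part of the original frame
    results.append(process_chunk(chunk))  # keep the result instead of exporting it

# One concat after the loop; ignore_index=True rebuilds a clean 0..n-1 index
df_final = pd.concat(results, ignore_index=True)
```

With GeoDataFrames that share a CRS, `pd.concat` should likewise give back a GeoDataFrame, so a single `df_final.to_file(...)` at the end can replace the ten per-chunk exports.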