Concatenate dataframes split with `np.array_split` back together after processing them in a loop in Python

Time:12-24

How can I concatenate dataframes that were split with np.array_split and then processed in a loop?

I have a dataframe loaded from a shapefile (a csv would work the same way, apart from the geometry part) and some other dataframes. To speed up the whole process, I split df into 10 parts, then run a loop and get 10 separate dataframes.

After that, I could export each dataframe as a shp or csv file, then write code that loops through the directory, finds the corresponding files, and merges them. But I would like to do this without exporting files, directly after the loop ends. Can this be done?

import geopandas as gpd
import pandas as pd
import numpy as np

df = gpd.read_file(r'E:\...\Polygons.shp')
some_other_df = gpd.read_file(r'E:\...\Small_polygon.shp')
points = gpd.read_file(r'E:\...\points.shp')

df_split = np.array_split(df, 10)

for i, v in enumerate(df_split, 0):

    # do something here
    points_clip = gpd.clip(points, v)
    some_other_df_Clip = gpd.clip(some_other_df, v)

    new_dataframe = ...
    # here I get 10 separate dataframes

    new_dataframe.to_file(fr'W:\...\final_{i}.shp')

    # how to merge all 10 new_dataframe to one?

CodePudding user response:

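IIUC, you can append each result to a list inside the loop and concatenate the list once after the loop ends: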

list_of_dataframes = []

for i, v in enumerate(df_split, 0):
    #Your logic goes here
    list_of_dataframes.append(new_dataframe)

df_final = pd.concat(list_of_dataframes, ignore_index=True)
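
For a version that runs without any shapefiles, here is a minimal sketch of the same collect-then-concatenate pattern on plain pandas data. The toy columns (`id`, `value`) and the per-chunk step are hypothetical stand-ins for the real clipping logic. It also splits the row positions rather than the frame itself, which is a swapped-in choice: depending on library versions, passing a DataFrame directly to `np.array_split` may emit a deprecation warning.

```python
import numpy as np
import pandas as pd

# Toy data standing in for the real polygons (hypothetical values).
df = pd.DataFrame({"id": range(25), "value": np.arange(25) * 2})

# Split the row positions into 10 nearly equal chunks.
chunks = np.array_split(np.arange(len(df)), 10)

list_of_dataframes = []
for i, rows in enumerate(chunks):
    part = df.iloc[rows]
    # Placeholder for the real per-chunk work (clipping, joins, ...).
    new_dataframe = part.assign(chunk=i, doubled=part["value"] * 2)
    list_of_dataframes.append(new_dataframe)

# One concat after the loop; ignore_index=True rebuilds a clean 0..n-1 index.
df_final = pd.concat(list_of_dataframes, ignore_index=True)
```

Concatenating once at the end is generally cheaper than calling `pd.concat` inside the loop, since each call copies the data accumulated so far.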