Divide a large dataframe into smaller sub dataframes in order

Time:05-09

Is there a way to divide a very large DataFrame into 5 smaller sub-DataFrames of equal size? I cannot use a train/test split because it does not keep the data in order. The existing solution, "Split a large pandas dataframe", does not serve my purpose: I tried it, and it produces the output shown below, which is not what I want. The input is:

import pandas as pd

new_dict1 = {'ABW':{'ABR':1,'BPR':1,'CBR':1,'DBR':0},'BCW':{'ABR':0,'BPR':0,'CBR':1,'DBR':0},
    'CBW':{'ABR':1,'BPR':1,'CBR':0,'DBR':0},'MCW':{'ABR':1,'BPR':1,'CBR':0,'DBR':1},
    'DBW':{'ABR':0,'BPR':0,'CBR':1,'DBR':0},'MNW':{'ABR':0,'BPR':0,'CBR':1,'DBR':0},
    'RBW':{'ABR':0,'BPR':0,'CBR':1,'DBR':0},'EBW':{'ABR':0,'BPR':0,'CBR':1,'DBR':0},
    'GBW':{'ABR':0,'BPR':0,'CBR':1,'DBR':0},'HBW':{'ABR':0,'BPR':0,'CBR':1,'DBR':0}}
df2 = pd.DataFrame.from_dict(new_dict1, orient="index")

The output I got is:

  [  ABR  BPR  CBR  DBR
     ABW    1    1    1    0
     BCW    0    0    1    0
     CBW    1    1    0    0
     MCW    1    1    0    1
     DBW    0    0    1    0,      ABR  BPR  CBR  DBR
     MNW    0    0    1    0
     RBW    0    0    1    0
     EBW    0    0    1    0
     GBW    0    0    1    0
     HBW    0    0    1    0]

This is not the desired output. The desired output is the large DataFrame divided into five sub-DataFrames.

CodePudding user response:

Following up on my comment, here is an example. Note that it is probably not the best approach:

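import numpy as np
# Split df2 into 5 consecutive chunks; row order is preserved.
dfs = np.array_split(df2, 5)
# Bind each chunk to a global name df0 ... df4. Note that index 2 rebinds the
# name df2, so the original frame is replaced by the third chunk.
for index, df in enumerate(dfs):
    globals()['df%s' % index] = pd.DataFrame(df)

df3  # fourth chunk: rows RBW and EBW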
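
As a possible alternative to writing into globals(), the chunks can be kept in a plain list and sliced with iloc, which also keeps the rows in their original order. The sketch below is only an illustration: the helper name split_in_order and the small demo frame are assumptions made for this example and do not come from the question or the answer above.

```python
import pandas as pd

def split_in_order(df, n_parts):
    """Split df into n_parts consecutive pieces, preserving row order.

    Hypothetical helper for illustration only.
    """
    base, extra = divmod(len(df), n_parts)
    parts = []
    start = 0
    for i in range(n_parts):
        # When the length is not evenly divisible, the first `extra`
        # parts each receive one additional row.
        stop = start + base + (1 if i < extra else 0)
        parts.append(df.iloc[start:stop])
        start = stop
    return parts

# Stand-in frame with 10 rows, the same row count as df2 in the question.
demo = pd.DataFrame({"CBR": range(10)}, index=list("ABCDEFGHIJ"))
parts = split_in_order(demo, 5)

for part in parts:
    print(part, end="\n\n")
```

With this layout the fourth chunk is simply parts[3], and no existing variable name is overwritten. Calling split_in_order(df2, 5) on the frame from the question should give five two-row pieces in the same way.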