DataFrame append to DataFrame row by row and reset if condition is matched

Time:05-14

I have a DataFrame that I want to split into many DataFrames by adding rows one at a time and checking a condition. If the condition is met (the sum of the rows' items > 50k), start a new DataFrame.

Here is an example of how it might look:

CodePudding user response:

Take the cumulative sum of Score, floor-divide it by 50,000, and shift the result up one cell (since you want each group to be > 50,000 and not < 50,000).

import pandas as pd
import numpy as np

# Generating DataFrame with random data
df = pd.DataFrame(np.random.randint(1,60000,15))

# Creating new column that's a cumulative sum with each
# value floor divided by 50000
df['groups'] = df[0].cumsum() // 50000

# Values shifted up one and missing values filled with the maximum value
# so that values at the bottom are included in the last DataFrame slice
df['groups'] = df['groups'].shift(-1, fill_value=df['groups'].max())

Then, as per this answer, you can use pandas.DataFrame.groupby in a list comprehension to return a list of split DataFrames.

df_list = [df_slice for _, df_slice in df.groupby('groups')]
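
One caveat: the cumulative sum above never restarts, so rows are grouped against fixed multiples of 50,000 rather than against a fresh count per group. For example, with the values 49000, 2000, 49000, 2000 it puts the first row in a group of its own (sum 49,000). If a literal reset is needed, a plain loop is one possible alternative. The following is only a sketch of that idea, not the method from the answer above: the function name split_on_running_sum and the column name Score are assumptions made for illustration.

```python
import pandas as pd


def split_on_running_sum(df, column, threshold):
    """Split df into consecutive slices, closing a slice as soon as
    its own running sum of `column` exceeds `threshold`.
    (Hypothetical helper, not part of pandas.)"""
    labels = []
    group_id = 0
    running = 0
    for value in df[column]:
        labels.append(group_id)   # current row joins the open group
        running += value
        if running > threshold:   # condition met: close group, reset
            group_id += 1
            running = 0
    # Group by the label Series (aligned on df's index) and collect slices
    keys = pd.Series(labels, index=df.index)
    return [piece for _, piece in df.groupby(keys)]


# Small fixed example instead of random data
df = pd.DataFrame({"Score": [49000, 2000, 49000, 2000, 10000]})
parts = split_on_running_sum(df, "Score", 50000)
for part in parts:
    print(part["Score"].tolist(), part["Score"].sum())
# [49000, 2000] 51000
# [49000, 2000] 51000
# [10000] 10000
```

The last slice holds whatever rows are left over, so it may sum to less than the threshold. The loop runs in Python, so it will be slower than the vectorized version on large frames.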