I have a dataframe that I want to slice into many dataframes by adding rows one at a time and checking a condition. If the condition is met (the sum of the rows' items is > 50k), then start a new frame.
Here is an example of how it might look:
CodePudding user response:
Sum Score cumulatively, floor-divide it by 50,000, and shift the result down one cell, so that the row that crosses each 50,000 boundary stays in the group it completes (since you want each group to be > 50,000 and not < 50,000).
import pandas as pd
import numpy as np
# Generating DataFrame with random data
df = pd.DataFrame(np.random.randint(1, 60000, 15))
# Creating new column that's a cumulative sum with each
# value floor divided by 50000
df['groups'] = df[0].cumsum() // 50000
# Values shifted down one so that the row crossing a boundary is kept in
# the group it completes; the first row has no predecessor, so it gets group 0
df['groups'] = df['groups'].shift(1, fill_value=0)
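Note that the cumulative sum above never resets when a new group starts, so any overshoot past 50,000 carries into the next group and individual slices are not guaranteed to exceed the threshold. If each slice has to pass 50,000 on its own, a minimal sketch with an explicit running total might look like the following. The helper name split_by_running_sum and the demo values are made up for illustration and are not from the question.

```python
import pandas as pd


def split_by_running_sum(df, column, threshold=50_000):
    """Hypothetical helper: close a slice as soon as its own sum exceeds threshold."""
    labels = []
    group_id = 0
    running = 0
    for value in df[column]:
        labels.append(group_id)
        running += value
        if running > threshold:
            # This row completed the slice: start a new one and reset the total.
            group_id += 1
            running = 0
    keys = pd.Series(labels, index=df.index)
    return [part for _, part in df.groupby(keys)]


# Small deterministic frame (illustrative values only).
demo = pd.DataFrame({0: [10_000, 10_000, 40_000, 60_000, 45_000, 10_000, 5_000]})
slices = split_by_running_sum(demo, 0)
sums = [int(part[0].sum()) for part in slices]
print(sums)  # [60000, 60000, 55000, 5000]
```

The last slice holds whatever rows are left over, so it can fall below the threshold. The loop is slower than the vectorized version on large frames, which is the trade-off for exact per-slice sums.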
Then, as per this answer, you can use pandas.DataFrame.groupby
in a list comprehension to return a list of split DataFrames.
# Grouping by the column name (rather than a one-element list) keeps the group keys as plain scalars
df_list = [df_slice for _, df_slice in df.groupby('groups')]
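To check the result, the sum of each slice can be compared against the threshold. The sketch below assumes a small made-up frame whose groups column is already filled in (the values are illustrative, not from the question), and it also drops the helper column from each slice.

```python
import pandas as pd

# Made-up data with group labels already assigned (assumption for illustration).
df = pd.DataFrame({0: [30_000, 30_000, 10_000, 45_000, 7_000],
                   'groups': [0, 0, 1, 1, 2]})

# Split on the helper column, then drop it from each resulting slice.
df_list = [part.drop(columns='groups') for _, part in df.groupby('groups')]

# Total of each slice, to see which ones actually pass the 50,000 threshold.
totals = [int(part[0].sum()) for part in df_list]
print(totals)  # [60000, 55000, 7000]
```

The final slice here stays below 50,000 because it only holds the leftover rows at the bottom of the frame.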