I am trying to find runs of 20 or more consecutive negative values in a column of a dataframe. Once the rows are grouped into these chunks of 20 or more, I want to append to each chunk the 30 rows of the original dataframe that follow it.
This is my attempt (got help from a posted question here):
n = df['Slope'].lt(0)
mask = n.ne(n.shift()).cumsum()[n]
dfL = [g for i, g in df.groupby(mask) if (len(g[g['Slope'] < 0]) >= 20)]
df_cn = pd.concat(dfL)
I get the chunks of consecutive negative values, but I do not know how to add the corresponding 30 rows after each chunk.
User response:
Next time, please try to provide a minimal reproducible example and a small sample of the desired output.
I created a random df of my own and your code worked well on it:
n = df['Slope'].lt(0)
mask = n.ne(n.shift()).cumsum()[n]
dfL = [g for i, g in df.groupby(mask) if (len(g[g['Slope'] < 0]) >= 20)]
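In case it is not obvious what the mask does, here is a minimal sketch on a toy column. The data and the names `s`, `run_id` and `run_lengths` are made up for illustration and are not part of the original code. Every time the sign flag changes, the cumulative sum goes up by one, so each run of consecutive negatives ends up with its own label:

```python
import pandas as pd

# Toy data, made up for illustration only.
s = pd.Series([1.0, -1.0, -2.0, 3.0, -4.0, -5.0, -6.0, 2.0], name='Slope')

n = s.lt(0)                        # True where the value is negative
run_id = n.ne(n.shift()).cumsum()  # goes up by 1 whenever the flag changes
mask = run_id[n]                   # keep the run labels of the negative rows only

# Length of each negative run, keyed by run label.
run_lengths = mask.value_counts().sort_index()

print(mask.to_dict())         # {1: 2, 2: 2, 4: 4, 5: 4, 6: 4}
print(run_lengths.to_dict())  # {2: 2, 4: 3}
```

Rows 1-2 share label 2 and rows 4-6 share label 4, so grouping by `mask` yields one group per negative run, and the length filter then keeps only the long ones.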
From here I created the code:
if len(dfL) > 0:  # make sure at least one chunk was found
    for x in range(len(dfL)):  # dfL is a list of dfs, one per chunk
        df_cn = dfL[x]  # select the chunk from dfL
        print('Chunk: df_cn_' + str(x) + ' created')  # feedback for testing
        idx = dfL[x].index  # index labels of the chunk; since the chunk size is >= 20, max(idx) gives its last row
        print('Chunk from ' + str(min(idx)) + ' to ' + str(max(idx)) + ' total ' + str(len(dfL[x])) + ' indexes in the chunk')  # feedback with size of chunk
        # get the next rows from the original df, starting right after the last index of the chunk
        # note: .loc slices by label and includes both endpoints, so this covers up to 31 rows; use max(idx) + 30 for exactly 30
        df_rest = df.loc[max(idx) + 1:max(idx) + 31]
        df_cn_ext = pd.concat([df_cn, df_rest])  # concatenate (along the row axis) the chunk and the following rows of the original df
        exec(f'df_cn_ext_{x} = df_cn_ext[:]')  # create a separate dataframe, named by suffix, for each chunk + following rows
        print('Dataframe df_cn_ext_' + str(x) + ' created from index ' + str(min(idx)) + ' to ' + str(max(idx) + 31))
else:
    print('no chunks in the df found')
Please note:
1- I've put each chunk + 30 rows into a new df named with a suffix (df_cn_ext_suffix)
2- If the last row of a chunk is near the end of df, it won't add 30 rows, only as many rows as are available.
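To illustrate note 2, here is a small sketch that takes the following rows by position instead of by label. The helper `rows_after` and the toy frame are hypothetical names and data I made up for this example. It assumes the index labels of df are unique, but not that they are consecutive integers:

```python
import pandas as pd

def rows_after(df, chunk, n_rows=30):
    """Return up to n_rows rows of df that directly follow chunk (hypothetical helper)."""
    # Position of the chunk's last row in df; assumes df's index labels are unique.
    last_pos = df.index.get_loc(chunk.index[-1])
    # iloc slices by position, excludes the end, and simply stops at the end of df.
    return df.iloc[last_pos + 1:last_pos + 1 + n_rows]

# Toy frame, made up for illustration: 50 rows, negative from row 10 to row 34.
df = pd.DataFrame({'Slope': [1.0] * 10 + [-1.0] * 25 + [1.0] * 15})
chunk = df.iloc[10:35]

tail = rows_after(df, chunk)
print(len(tail))                             # 15, only 15 rows remain after the chunk
print(len(rows_after(df, chunk, n_rows=5)))  # 5
```

Because the slice is positional, asking for 30 rows when only 15 are left returns those 15 without raising an error.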
Here is some output of my code:
Chunk: df_cn_0 created
Chunk from 3 to 39 total 37 indexes in the chunk
Dataframe df_cn_ext_0 created from index 3 to 70
Chunk: df_cn_1 created
Chunk from 41 to 66 total 26 indexes in the chunk
Dataframe df_cn_ext_1 created from index 41 to 97
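If you would rather not create variables with exec, one possible alternative is to collect the extended chunks in a dict keyed by chunk number. This is only a sketch: `extend_chunks`, its parameters and the toy data are hypothetical, and it assumes df has the default integer index (0, 1, 2, ...) so that label arithmetic with .loc works:

```python
import pandas as pd

def extend_chunks(df, col='Slope', min_len=20, n_after=30):
    """Map chunk number -> chunk plus up to n_after following rows (hypothetical helper)."""
    n = df[col].lt(0)
    mask = n.ne(n.shift()).cumsum()[n]  # one label per run of negative values
    chunks = [g for _, g in df.groupby(mask) if len(g) >= min_len]
    extended = {}
    for x, chunk in enumerate(chunks):
        last = chunk.index[-1]
        # .loc includes both endpoints, so last + 1 .. last + n_after is n_after rows.
        # Assumes the default integer index of df.
        rest = df.loc[last + 1:last + n_after]
        extended[x] = pd.concat([chunk, rest])
    return extended

# Toy data, made up for illustration: runs of 22, 10 and 21 negative values.
values = [1.0] * 5 + [-1.0] * 22 + [1.0] * 40 + [-1.0] * 10 + [1.0] * 5 + [-1.0] * 21 + [1.0] * 3
df = pd.DataFrame({'Slope': values})

extended = extend_chunks(df)
for x, ext in extended.items():
    print(x, ext.index[0], ext.index[-1], len(ext))
```

Each result is then available as extended[0], extended[1], and so on, which is easier to loop over than separately named dataframes.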