I'm looking for a way to split the following example data frame at each duplicate 0 in the index, and then save the info up to the split into a csv.
index ID Col1 Col2
0 0 a b
1 1 c d
2 2 e f
0 0 g h
1 1 i j
2 2 k l
0 0 m n
1 1 o p
2 2 q r
I've tried playing around with groupby but i'm just getting the zero's grouped together and the 1's grouped together etc.
What I'm aiming for would be to have the data frame split like this
index ID Col1 Col2
0 0 a b
1 1 c d
2 2 e f
index ID Col1 Col2
0 0 g h
1 1 i j
2 2 k l
so on and so forth. And saved to separate dataframes and csvs.
CodePudding user response:
Assuming "index" is a column:
group = df['index'].eq(0).groupby(df['index']).cumcount()
for name, d in df.groupby(group):
print(f'dataframe {name}')
print(d) # to save: d.to_csv(f'df_{name}.csv')
output:
dataframe 0
index ID Col1 Col2
0 0 0 a b
1 1 1 c d
2 2 2 e f
dataframe 1
index ID Col1 Col2
3 0 0 g h
4 1 1 i j
5 2 2 k l
dataframe 2
index ID Col1 Col2
6 0 0 m n
7 1 1 o p
8 2 2 q r
If "index" is the index:
group = df.index.to_series().eq(0).groupby(df.index).cumcount()
for name, d in df.groupby(group):
print(f'dataframe {name}')
print(d)
output:
dataframe 0
ID Col1 Col2
index
0 0 a b
1 1 c d
2 2 e f
dataframe 1
ID Col1 Col2
index
0 0 g h
1 1 i j
2 2 k l
dataframe 2
ID Col1 Col2
index
0 0 m n
1 1 o p
2 2 q r