Python: Split DataFrame into Dict of DataFrames, based on range of values

Time:10-28

I have a DataFrame df that I need to split according to which range the value in a specific column, ColB, falls into: 1-3, 3-5, 5-7, etc.

Input:

Time ColA ColB ColC
1    100  1.1  500
2    105  3.2  600
3    107  7.7  550
4    106  2.4  750
5    104  5.2  950
6    103  6.9  450

Desired Output:

Time ColA ColB ColC
1    100  1.1  500
4    106  2.4  750


Time ColA ColB ColC
2    105  3.2  600



Time ColA ColB ColC
3    107  7.7  550
5    104  5.2  950
6    103  6.9  450

Is there a nice way to do this without writing a loop in Python? Also, would it be more efficient to store the output as a list of DataFrames or a dictionary of DataFrames? I ask because it's a fairly large dataset.

CodePudding user response:

Use pandas.cut

https://pandas.pydata.org/docs/reference/api/pandas.cut.html

For example:

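import pandas as pd
# Bins are right-closed by default: (1, 3], (3, 5], (5, 7]. Rows outside every bin (e.g. ColB = 7.7) are dropped by groupby.
groups = pd.cut(df["ColB"], [1, 3, 5, 7])
[d for _, d in df.groupby(groups)]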
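
Since the question asks for a dict of DataFrames, here is a minimal sketch of the same pd.cut approach collected into a dict instead of a list. The variable name frames and the string labels "1-3", "3-5", "5-7" are illustrative choices, not part of the original answer. Note that ColB = 7.7 lies outside all three bins, so it appears in none of the groups, unlike in the question's desired output.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Time": [1, 2, 3, 4, 5, 6],
    "ColA": [100, 105, 107, 106, 104, 103],
    "ColB": [1.1, 3.2, 7.7, 2.4, 5.2, 6.9],
    "ColC": [500, 600, 550, 750, 950, 450],
})

# Edges 1, 3, 5, 7 give the right-closed bins (1, 3], (3, 5], (5, 7].
# The labels are hypothetical names used as dict keys.
groups = pd.cut(df["ColB"], bins=[1, 3, 5, 7], labels=["1-3", "3-5", "5-7"])

# observed=True keeps only the bins that actually contain rows.
frames = {label: part for label, part in df.groupby(groups, observed=True)}

for label, part in frames.items():
    print(label)
    print(part, "\n")
```

A dict is convenient when you want to look up a group by its range, e.g. frames["3-5"]; a list is enough if you only iterate over the groups in order.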

CodePudding user response:

You can try this:

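lst = [(1, 3), (3, 5), (5, 7)]
# between() includes both endpoints by default, so a value of exactly 3 or 5 would land in two groups.
result = [df[df['ColB'].between(a, b)] for a, b in lst]
for i in result:
    print(i, "\n")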
    
   Time  ColA  ColB  ColC
0     1   100   1.1   500
3     4   106   2.4   750 

   Time  ColA  ColB  ColC
1     2   105   3.2   600 

   Time  ColA  ColB  ColC
4     5   104   5.2   950
5     6   103   6.9   450 
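
If overlapping boundaries matter, one possible variation is to pass the inclusive argument of Series.between so that each range is half-open. This is a sketch that assumes pandas 1.3 or later, where inclusive accepts the strings "both", "neither", "left" and "right". The names ranges and boundary, and the choice of (a, b) tuples as dict keys, are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Time": [1, 2, 3, 4, 5, 6],
    "ColA": [100, 105, 107, 106, 104, 103],
    "ColB": [1.1, 3.2, 7.7, 2.4, 5.2, 6.9],
    "ColC": [500, 600, 550, 750, 950, 450],
})

ranges = [(1, 3), (3, 5), (5, 7)]

# inclusive="right" selects a < ColB <= b, so adjacent ranges do not overlap.
result = {
    (a, b): df[df["ColB"].between(a, b, inclusive="right")]
    for a, b in ranges
}

# Hypothetical boundary value (not in the question's data) to show the effect:
boundary = pd.Series([3.0])
in_first = boundary.between(1, 3, inclusive="right")   # 1 < 3.0 <= 3
in_second = boundary.between(3, 5, inclusive="right")  # 3 < 3.0 <= 5
print(in_first[0], in_second[0])
```

With the default inclusive="both", the boundary value 3.0 would match both (1, 3) and (3, 5).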