Home > Blockchain >  How to separate dataframe by consecutive months in column?
How to separate dataframe by consecutive months in column?

Time:01-26

           Ticker      Month  Price
0   ABC US EQUITY   1/1/2020     20
1   ABC US EQUITY   2/1/2020     20
2   ABC US EQUITY   3/1/2020     20
3   ABC US EQUITY   4/1/2020     20
4   ABC US EQUITY   5/1/2020     20
5   ABC US EQUITY   6/1/2020     20
6   ABC US EQUITY   7/1/2020     20
7   ABC US EQUITY   8/1/2020     20
8   ABC US EQUITY   9/1/2020     20
9   ABC US EQUITY  10/1/2020     20
10  ABC US EQUITY   1/1/2021     20
11  ABC US EQUITY   2/1/2021     20
12  ABC US EQUITY   3/1/2021     20
13  ABC US EQUITY   4/1/2021     20
14  ABC US EQUITY   5/1/2021     20
15  ABC US EQUITY   6/1/2021     20
16  ABC US EQUITY   7/1/2021     20
17  ABC US EQUITY   8/1/2021     20
18  ABC US EQUITY   9/1/2021     20

As you can see, I have a dataframe as above. I want to separate the dataframe into two by Month; one from 01/01/2020 to 10/01/2020 and the other one from 01/01/2021 to 09/01/2021. The rule should be that the rows with consecutive months should be together. how do I do it using pandas? Thanks.

CodePudding user response:

Code

# Convert the column to datetime
s = pd.to_datetime(df.Month, format='%m/%d/%Y')

# Convert to period and calculate diff
# Then check whether the diff is equal to offset of 1 Month    
m = s.dt.to_period('M').diff() == pd.offsets.MonthEnd()

# Use cumsum to create sequential counter
df['flag'] = (~m).cumsum()

Result

print(df)
           Ticker      Month  Price  flag
0   ABC US EQUITY   1/1/2020     20     1
1   ABC US EQUITY   2/1/2020     20     1
2   ABC US EQUITY   3/1/2020     20     1
3   ABC US EQUITY   4/1/2020     20     1
4   ABC US EQUITY   5/1/2020     20     1
5   ABC US EQUITY   6/1/2020     20     1
6   ABC US EQUITY   7/1/2020     20     1
7   ABC US EQUITY   8/1/2020     20     1
8   ABC US EQUITY   9/1/2020     20     1
9   ABC US EQUITY  10/1/2020     20     1
10  ABC US EQUITY   1/1/2021     20     2
11  ABC US EQUITY   2/1/2021     20     2
12  ABC US EQUITY   3/1/2021     20     2
13  ABC US EQUITY   4/1/2021     20     2
14  ABC US EQUITY   5/1/2021     20     2
15  ABC US EQUITY   6/1/2021     20     2
16  ABC US EQUITY   7/1/2021     20     2
17  ABC US EQUITY   8/1/2021     20     2
18  ABC US EQUITY   9/1/2021     20     2

print(df[df.flag == 1])
          Ticker      Month  Price  flag
0  ABC US EQUITY   1/1/2020     20     1
1  ABC US EQUITY   2/1/2020     20     1
2  ABC US EQUITY   3/1/2020     20     1
3  ABC US EQUITY   4/1/2020     20     1
4  ABC US EQUITY   5/1/2020     20     1
5  ABC US EQUITY   6/1/2020     20     1
6  ABC US EQUITY   7/1/2020     20     1
7  ABC US EQUITY   8/1/2020     20     1
8  ABC US EQUITY   9/1/2020     20     1
9  ABC US EQUITY  10/1/2020     20     1

  • Related