I've created this data frame -
Range = np.arange(0,9,1)
A={
0:2,
1:2,
2:2,
3:2,
4:3,
5:3,
6:3,
7:2,
8:2
}
Table = pd.DataFrame({"Row": Range})
Table["Intervals"]=(Table["Row"]%9).map(A)
Table
Row Intervals
0 0 2
1 1 2
2 2 2
3 3 2
4 4 3
5 5 3
6 6 3
7 7 2
8 8 2
I'd like to create another column that will be based on the intervals columns and will act as sort of a counter - so the values will be 1,2,1,2,1,2,3,1,2. The logic is that I want to count by the value of the intervals column. I've tried to use group by but the issue is that the values are displayed multiple times.
Logic: We have 2 different values - 2 and 3. Each value will occur in the intervals column as the value itself - so 2 for example will occur twice 2,2. And 3 will occur 3 times - 3,3,3. For the first 4 rows, the value 2 is displayed twice - that is why the new column should be 1,2 (counter of the first 2) and then again 1,2 (counter of the second 2). Afterward, there is 3, so the values are 1,2,3. And then once again 2, so the values are 1,2.
Hope I managed to explain myself.
Thanks in advance!
CodePudding user response:
You can use groupby.cumcount
combined with mod
:
group = Table['Intervals'].ne(Table['Intervals'].shift()).cumsum()
Table['Counter'] = Table.groupby(group).cumcount().mod(Table['Intervals']).add(1)
Or:
group = Table['Intervals'].ne(Table['Intervals'].shift()).cumsum()
Table['Counter'] = (Table.groupby(group)['Intervals']
.transform(lambda s: np.arange(len(s))%s.iloc[0] 1)
)
Output:
Row Intervals Counter
0 0 2 1
1 1 2 2
2 2 2 1
3 3 2 2
4 4 3 1
5 5 3 2
6 6 3 3
7 7 2 1
8 8 2 2