Home > Enterprise >  Find the last index with specific value for each group and complete all the above values in group wi
Find the last index with specific value for each group and complete all the above values in group wi

Time:10-10

I have the following data frame:

Group  value_1  value_2
     1      1         1
     1      1         1
     1      1         1
     1      0         1
     1      0         1
     1      1         1
     2      0         0
     2      0         0
     2      0         0
     2      0         0
     2      1         0
     3      1         1
     3      1         1
     3      0         1
     3      0         1
     3      0         1
     3      1         1
     3      1         1
     3      1         1
     3      1         1
     3      1         1
     4      1         1
     4      1         1
     4      1         1
     4      1         1
     5      1         1
     6      0         0

Currently, the value in the value_2 column is the most frequent value of each group in column value_1.

My desired output is:

Group  value_1  value_2
     1      1         0
     1      1         0
     1      1         0
     1      0         0
     1      0         0
     1      1         1
     2      0         0
     2      0         0
     2      0         0
     2      0         0
     2      1         0
     3      1         0
     3      1         0
     3      0         0
     3      0         0
     3      0         0
     3      1         1
     3      1         1
     3      1         1
     3      1         1
     3      1         1
     4      1         1
     4      1         1
     4      1         1
     4      1         1
     5      1         1
     6      0         0

If a group includes a value of 0 in the value_1 column, change the values in the indices between the last value of 0 to the first value in the group in the value_2 column to be 0.

CodePudding user response:

zeros_propagated = df["value_1"].replace(1, np.nan).groupby(df["Group"]).bfill()
mode_per_group   = df.groupby("Group")["value_1"].transform(lambda gr: 
                                                            gr.mode().iat[0])

df["value_2"] = zeros_propagated.fillna(mode_per_group, downcast="int")
  • backpropagate the 0 values per group; replacing 1 with NaN so that they will become 0
    • values after last 0, if any, will be NaN...
  • ...which we fill by combining it with the modes per group

to get

>>> df

    Group  value_1  value_2
0       1        1        0
1       1        1        0
2       1        1        0
3       1        0        0
4       1        0        0
5       1        1        1
6       2        0        0
7       2        0        0
8       2        0        0
9       2        0        0
10      2        1        0
11      3        1        0
12      3        1        0
13      3        0        0
14      3        0        0
15      3        0        0
16      3        1        1
17      3        1        1
18      3        1        1
19      3        1        1
20      3        1        1
21      4        1        1
22      4        1        1
23      4        1        1
24      4        1        1
25      5        1        1
26      6        0        0

CodePudding user response:

I guess the answer of @Mustafa looks more clean but here is my attempt. I wrote a little helper function which checks for the most frequent values, if there is a 0, search the last index, and return a list of values for each group conditionally.

def func(grp):
    most_common_val = grp.value_counts().idxmax()
    
    if 0 in grp.values and not most_common_val==0:
        last_zero = grp[grp==0].last_valid_index()
        return [most_common_val if idx>last_zero else 0 for idx, val in zip(grp.index, grp)]
    else:
        return [most_common_val for _ in range(len(grp))]

df['desired_value_2'] = df[['Group', 'value_1']].groupby('Group')['value_1'].transform(func)

Output:

    Group  value_1  desired_value_2
0       1        1                0
1       1        1                0
2       1        1                0
3       1        0                0
4       1        0                0
5       1        1                1
6       2        0                0
7       2        0                0
8       2        0                0
9       2        0                0
10      2        1                0
11      3        1                0
12      3        1                0
13      3        0                0
14      3        0                0
15      3        0                0
16      3        1                1
17      3        1                1
18      3        1                1
19      3        1                1
20      3        1                1
21      4        1                1
22      4        1                1
23      4        1                1
24      4        1                1
25      5        1                1
26      6        0                0
  • Related