How do you fill the rows of a column within each group with the group's maximum value?

I am trying to fill the values of a column in grouped data with the maximum value of the grouped data.

The following is a sample of the data

    
import pandas as pd

df1 = [[52, '1', '0'], [52, '1', '1'],
       [52, '1', '0'], [52, '2', '0'],
       [53, '2', '0'], [52, '2', '0']]

df = pd.DataFrame(df1, columns=['Cow', 'Lact', 'fail'])

Producing the following dataframe

   Cow Lact fail
0   52    1    0
1   52    1    1
2   52    1    0
3   52    2    0
4   53    2    0
5   52    2    0

In this example, I would like to replace the 0 values with 1 (the group's max value) for Cow = 52, Lact = 1:


   Cow Lact fail
0   52    1    1
1   52    1    1
2   52    1    1
3   52    2    0
4   53    2    0
5   52    2    0

I tried, without success, to adapt code from the question "Pandas groupby: change values in one column based on values in another column":

grouped = df.groupby(["Cow", "Lact"], as_index=False).max()['fail']
for i in grouped:
    if i == 1:
        df['fail'] =  1

Solutions and clarification re failure of my approach appreciated. Thanks

CodePudding user response：

You can combine groupby with transform('max'). I'm not sure whether you want to overwrite the 'fail' column or create a new one, but this should give you the expected result.

df['fail'] = df.groupby(['Cow', 'Lact'])['fail'].transform('max')
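
If you would rather keep the original column, the same call can write to a new one instead. A minimal sketch, assuming the sample frame from the question; the column name fail_max is made up for illustration:

```python
import pandas as pd

# Sample data from the question ('Lact' and 'fail' are stored as strings).
df = pd.DataFrame(
    [[52, '1', '0'], [52, '1', '1'],
     [52, '1', '0'], [52, '2', '0'],
     [53, '2', '0'], [52, '2', '0']],
    columns=['Cow', 'Lact', 'fail'],
)

# transform returns a Series aligned to df's index, so it can be assigned
# to any column. 'fail_max' is a hypothetical name; 'fail' is left untouched.
df['fail_max'] = df.groupby(['Cow', 'Lact'])['fail'].transform('max')

print(df)
```

Because the result has one value per original row, no merge or loop is needed to line it back up with the frame.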

CodePudding user response：

You were almost there; use transform('max') directly:

df['fail'] = df.groupby(["Cow", "Lact"])['fail'].transform('max')

output:

   Cow Lact fail
0   52    1    1
1   52    1    1
2   52    1    1
3   52    2    0
4   53    2    0
5   52    2    0
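
As for why the loop in the question does not work, here is one possible reading, sketched against the sample data. The aggregated result holds one value per group with no link back to the rows, the sample stores 'fail' as strings so the comparison with the integer 1 never matches, and df['fail'] = 1 would assign to the whole column rather than to one group. The names matches and df_int are introduced here only for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    [[52, '1', '0'], [52, '1', '1'],
     [52, '1', '0'], [52, '2', '0'],
     [53, '2', '0'], [52, '2', '0']],
    columns=['Cow', 'Lact', 'fail'],
)

# One value per (Cow, Lact) group; iterating over it yields bare values,
# not the rows that belong to each group.
grouped = df.groupby(["Cow", "Lact"], as_index=False).max()['fail']
print(grouped.tolist())

# The sample values are strings, and '1' == 1 is False in Python,
# so the if-branch in the original loop is never taken.
matches = [i == 1 for i in grouped]
print(matches)

# With integer data the comparison succeeds, but the assignment targets
# the entire column, so every row becomes 1, not just the matching group.
df_int = df.astype({'fail': int})
for i in df_int.groupby(["Cow", "Lact"])['fail'].max():
    if i == 1:
        df_int['fail'] = 1
print(df_int['fail'].tolist())
```

transform('max') avoids both problems because it broadcasts each group's maximum back onto that group's own rows. If 'fail' is meant to be numeric, converting it first (as with astype above) also avoids string comparison, where for example '10' sorts before '9'.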