Home > Enterprise >  How to get the max values based on first column, then index other columns values, if criteria not ma
How to get the max values based on first column, then index other columns values, if criteria not ma

Time:10-12

I have a set of data which i would like to get the maximum values based on a column with similar data first, and the outcome is the data of another column of the maximum. If the column data is empty, then i would like use another column with similar data to get the maximum values, else, it will just return to the original data value

Data set is as below

df = pd.DataFrame({'ID': ['M2','M7','M9','M4','M3','M6','M99','M8','M10','M10'],
                    'Color' : ['pink','blue','blue','yellow','blue','pink','red','pink','red','blue'],
                    'Number': [23255,1755,7897,6666,1311,78946,44893,148791,18783,487913],
                    'Brand':["Brand9","Brand9","Brand9","Brand7","Brand7","","Brand1","Brand1","",""]})

In[1]:
    ID   Color  Number   Brand
0   M2    pink   23255  Brand9
1   M7    blue    1755  Brand9
2   M9    blue    7897  Brand9
3   M4  yellow    6666  Brand7
4   M3    blue    1311  Brand7
5   M6    pink   78946        
6  M99     red   44893  Brand1
7   M8    pink  148791  Brand1
8  M10     red   18783        
9  M10    blue  487913       

First, i would like to find the Max Number based on same Brand, then index the color, and also the ID, those empty data will return empty first

i used df['Max'] = df.groupby('Brand')['Number'].transform('max') to get the max number but i stuck to get the color and the ID

expected outcome as below:

    ID   Color  Number   Brand     Max  Max Color Max ID
0   M2    pink   23255  Brand9   23255    pink     M2
1   M7    blue    1755  Brand9   23255    pink     M2
2   M9    blue    7897  Brand9   23255    pink     M2
3   M4  yellow    6666  Brand7    6666  yellow     M4
4   M3    blue    1311  Brand7    6666  yellow     M4
5   M6    pink   78946                               
6  M99     red   44893  Brand1  148791    pink     M8
7   M8    pink  148791  Brand1  148791    pink     M8
8  M10     red   18783                               
9  M10    blue  487913           

***************************************
Max    : Max Number based on brand
Max Color : Max Number based on brand index color
Max ID : Max Number based on ID

Then, for those empty, i will find the maximum based on the same ID, if there is no same ID or brand, it will return to the same data

the final expected result is as below:

     ID   Color  Number   Brand     Max Max Color Max ID Final ID Final Color
0   M2    pink   23255  Brand9   23255      pink     M2       M2        pink
1   M7    blue    1755  Brand9   23255      pink     M2       M2        pink
2   M9    blue    7897  Brand9   23255      pink     M2       M2        pink
3   M4  yellow    6666  Brand7    6666    yellow     M4       M4      yellow
4   M3    blue    1311  Brand7    6666    yellow     M4       M4      yellow
5   M6    pink   78946                                        M6        pink  ---- return to same data
6  M99     red   44893  Brand1  148791      pink     M8       M8        pink
7   M8    pink  148791  Brand1  148791      pink     M8       M8        pink  
8  M10     red   18783                                       M10        blue  ---- max based on ID
9  M10    blue  487913                                       M10        blue  ---- max based on ID

CodePudding user response:

Try:

mask = df["Brand"].eq("")

x = df.loc[df.groupby("Brand")["Number"].transform("idxmax")].reset_index(
    drop=True
)
y = df.loc[df[mask].groupby("ID")["Number"].transform("idxmax")]
z = x.copy()[["Number", "Color", "ID"]]
z[mask] = ""
x.loc[mask] = y.values

print(
    pd.concat(
        [df, z.add_prefix("Max "), x[["ID", "Color"]].add_prefix("Final ")],
        axis=1,
    )
)

Prints:

    ID   Color  Number   Brand Max Number Max Color Max ID Final ID Final Color
0   M2    pink   23255  Brand9      23255      pink     M2       M2        pink
1   M7    blue    1755  Brand9      23255      pink     M2       M2        pink
2   M9    blue    7897  Brand9      23255      pink     M2       M2        pink
3   M4  yellow    6666  Brand7       6666    yellow     M4       M4      yellow
4   M3    blue    1311  Brand7       6666    yellow     M4       M4      yellow
5   M6    pink   78946                                           M6        pink
6  M99     red   44893  Brand1     148791      pink     M8       M8        pink
7   M8    pink  148791  Brand1     148791      pink     M8       M8        pink
8  M10     red   18783                                          M10        blue
9  M10    blue  487913                                          M10        blue
  • Related