Home > other >  Update a column values based on two other column's complex conditions
Update a column values based on two other column's complex conditions

Time:11-03

For the DataFrame df below

df = pd.DataFrame([('Tesla','Model3', '-', 'Motor'),
                  ('Tesla', 'ModelS', '-', 'MotorMD3'),
                  ('Tesla', 'ModelX', '-', 'MotorMD3'),
                  ('Tesla', 'ModelY', '-', 'Motor'),
                  ('Jeep',  'Wrangler','Grouped','Engine'),
                  ('Jeep',  'Compass', 'Grouped','EngineMD3'),
                  ('Jeep',  'Patriot', 'Grouped','Engine'),
                  ('Jeep',  'Cherokee','Grouped','Engine'),
                  ('Ford',  'Mustang', 'Grouped','Engine'),
                  ('Ford',  'F150',     'Grouped','Engine')  ],columns=['Make','Model','Status','Type'])


df
     Make    Model        Status    Type
0   Tesla   Model3             -    Motor
1   Tesla   ModelS             -    MotorMD3
2   Tesla   ModelX             -    MotorMD3
3   Tesla   ModelY             -    Motor
4   Jeep    Wrangler     Grouped    Engine
5   Jeep    Compass      Grouped    EngineMD3
6   Jeep    Patriot      Grouped    Engine
7   Jeep    Cherokee     Grouped    Engine
8   Ford    Mustang      Grouped    Engine
9   Ford    F150         Grouped    Engine

I am trying to update the column Type with EngineMD3 for all same Make, if EngineMD3 is present in any of the Models in that Make, and if the Status is Grouped for that Make. But if the Status is not Grouped, Type should be kept as such for each Models. If 'EngineMD3' is not present the Type should be maintained as Engine.

For instance, Tesla is not Grouped, so each model keeps their Type the same. But Jeep is Grouped, and Compass is having its Type as EngineMD3, so EngineMD3 is updated as the Type for all Jeep Models. Ford is Grouped but none of the Models have type EngineMD3 so Type is kept as Engine

Expected output

     Make    Model    Status       Type
0   Tesla   Model3      -           Motor  #For Tesla Type is maintained for each model seperately since it is not grouped
1   Tesla   ModelS      -         MotorMD3
2   Tesla   ModelX      -         MotorMD3
3   Tesla   ModelY      -            Motor
4   Jeep    Wrangler   Grouped   EngineMD3 #Since Jeep is grouped, all its Type is changed to EngineMD3 since one of the model had EngineMD3
5   Jeep    Compass    Grouped   EngineMD3
6   Jeep    Patriot    Grouped   EngineMD3
7   Jeep    Cherokee   Grouped   EngineMD3
8   Ford    Mustang    Grouped      Engine #Even though Ford is grouped, since there is no EngineMD3 the Type is maintained as Engine.
9   Ford    F150       Grouped      Engine

In other words, The conditions are for all the makes(eg. Jeeps) If the make is grouped and if FD3 is appended to any of the model types, then all the grouped models in the same make will have the FD3 appended to them

I tried to use np.select to update the Type column with multiple conditions but I couldn't give a condition to select all same Make at once and update the Type. Please do help I am running out of options here.

CodePudding user response:

df.loc[df.Status.eq('Grouped'), 'Type'] = df[df.Status.eq('Grouped')].groupby('Make').Type.transform(
    lambda x: 'EngineMD3' if x.eq('EngineMD3').any() else x)
    Make     Model   Status       Type
0  Tesla    Model3        -      Motor
1  Tesla    ModelS        -   MotorMD3
2  Tesla    ModelX        -   MotorMD3
3  Tesla    ModelY        -      Motor
4   Jeep  Wrangler  Grouped  EngineMD3
5   Jeep   Compass  Grouped  EngineMD3
6   Jeep   Patriot  Grouped  EngineMD3
7   Jeep  Cherokee  Grouped  EngineMD3
8   Ford   Mustang  Grouped     Engine
9   Ford      F150  Grouped     Engine

CodePudding user response:

You can try this:

def process(g):
    if sum(g.Status=='Grouped')>0:
        g['Type'] = 'EngineMD3' if sum(g.Type=='EngineMD3')>0 else 'Engine'
    return g
df.groupby('Make').apply(process)

Output:

    Make    Model       Status  Type
0   Tesla   Model3      -       Motor
1   Tesla   ModelS      -       MotorMD3
2   Tesla   ModelX      -       MotorMD3
3   Tesla   ModelY      -       Motor
4   Jeep    Wrangler    Grouped EngineMD3
5   Jeep    Compass     Grouped EngineMD3
6   Jeep    Patriot     Grouped EngineMD3
7   Jeep    Cherokee    Grouped EngineMD3
8   Ford    Mustang     Grouped Engine
9   Ford    F150        Grouped Engine

CodePudding user response:

We can use the following function :

def trans(s_1, s_2, s_3):
    
    Make_values, new_col = [], []
          
    for i in range(len(s_1)):
        if s_3[i]=="EngineMD3" and s_2[i]=="Grouped":
            Make_values.append(s_1[i])
          
    for i in range(len(Make_values)):
          
        for j in range(len(s_3)):
          
            if s_1[j]==Make_values[i]:
          
                new_col.append("EngineMD3")
            else:
                new_col.append(s_3[j])
    return new_col

And we apply it:

df['Type']=trans(df['Make'], df['Status'], df['Type'])

CodePudding user response:

I'm no python expert, I'm sure there are more optimal solutions... But here is one.

  1. Checking the length of the query > 0.
  2. If it is > 0, then we store the value of Make that satisfies the conditions you mentioned in y variable.
  3. Then we replace the Type of that specific Make.
if len(df.query("Status == 'Grouped' & Type.str.startswith('EngineMD3')")) > 0:
    x = df.query("Status == 'Grouped' & Type.str.startswith('EngineMD3')")['Make'].values
    for i in range(len(x)):
        y = x[i]
        df['Type'].loc[df['Make'] == y] = 'EnginerMD3'
    display(df)
  • Related