Update columns based on multiple complex conditions in Pandas

Time:11-04

For the DataFrame df below

import pandas as pd

df = pd.DataFrame([('Tesla','Model3', '-', 'Tesla'),
                  ('Tesla', 'ModelS', '-', 'Tesla_MD3'),
                  ('Tesla', 'ModelX', '-', 'Tesla_MD3'),
                  ('Tesla', 'ModelY', '-', 'Tesla'),
                  ('Jeep',  'Wrangler','Grouped','Jeep'),
                  ('Jeep',  'Compass', 'Grouped','Jeep_MD3'),
                  ('Jeep',  'Patriot', 'Grouped','Jeep'),
                  ('Jeep',  'Cherokee','Grouped','Jeep'),
                  ('Ford',  'Mustang', 'Grouped','Ford'),
                  ('Ford',  'F150',     'Grouped','Ford')  ],columns=['Make','Model','Status','Type'])


df
     Make    Model        Status    Type
0   Tesla   Model3             -    Tesla
1   Tesla   ModelS             -    Tesla_MD3
2   Tesla   ModelX             -    Tesla_MD3
3   Tesla   ModelY             -    Tesla
4   Jeep    Wrangler     Grouped    Jeep
5   Jeep    Compass      Grouped    Jeep_MD3
6   Jeep    Patriot      Grouped    Jeep
7   Jeep    Cherokee     Grouped    Jeep
8   Ford    Mustang      Grouped    Ford
9   Ford    F150         Grouped    Ford

I am trying to update the column Type to Make_MD3 for every row of a Make, provided that Make_MD3 appears as the Type of any Model in that Make and the Status for that Make is Grouped. If the Status is not Grouped, each Model should keep its Type as is. If Make_MD3 is not present, the Type should remain Make.

For instance, Tesla is not Grouped, so each model keeps its Type. Jeep is Grouped and Compass has the Type Jeep_MD3, so Jeep_MD3 becomes the Type for all Jeep models. Ford is Grouped, but none of its models has the Type Ford_MD3, so the Type stays Ford.

Expected output

     Make    Model    Status       Type
0   Tesla   Model3      -           Tesla  #For Tesla, Type is maintained for each model separately since it is not grouped
1   Tesla   ModelS      -         Tesla_MD3
2   Tesla   ModelX      -         Tesla_MD3
3   Tesla   ModelY      -            Tesla
4   Jeep    Wrangler   Grouped     Jeep_MD3 #Since Jeep is grouped, all its Types are changed to Jeep_MD3 because one of the models had Jeep_MD3
5   Jeep    Compass    Grouped     Jeep_MD3
6   Jeep    Patriot    Grouped     Jeep_MD3
7   Jeep    Cherokee   Grouped     Jeep_MD3
8   Ford    Mustang    Grouped        Ford #Even though Ford is grouped, since there is no Ford_MD3 the Type is maintained as Ford.
9   Ford    F150       Grouped        Ford

In other words, the condition applies per make (e.g. Jeep): if the make is grouped and _MD3 is appended to the Type of any of its models, then all grouped models of that make get _MD3 appended. I tried the code below, but it doesn't work. Can someone fix this approach or suggest a new one?

df.loc[df.Status.eq('Grouped'), 'Type'] = df[df.Status.eq('Grouped')].groupby('Make').Type.transform(
    lambda x: f"{x.str.replace('_MD3','')}_{MD3}" if x.str.contains('_MD3').any() else x)

CodePudding user response:

Try:

df.loc[df.Status == "Grouped", "Type"] = (
    df[df.Status == "Grouped"]
    .groupby("Make")["Type"]
    .transform(
        lambda x: x.name + "_MD3" if x.str.endswith("_MD3").any() else x.name
    )
)

print(df)

Prints:

    Make     Model   Status       Type
0  Tesla    Model3        -      Tesla
1  Tesla    ModelS        -  Tesla_MD3
2  Tesla    ModelX        -  Tesla_MD3
3  Tesla    ModelY        -      Tesla
4   Jeep  Wrangler  Grouped   Jeep_MD3
5   Jeep   Compass  Grouped   Jeep_MD3
6   Jeep   Patriot  Grouped   Jeep_MD3
7   Jeep  Cherokee  Grouped   Jeep_MD3
8   Ford   Mustang  Grouped       Ford
9   Ford      F150  Grouped       Ford
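
As for why the attempt in the question fails: the f-string there formats the whole Series x into a single string, and {MD3} refers to a name MD3 that is never defined, so the Jeep group raises a NameError. If you would rather avoid the lambda altogether, the same rule can probably also be written with boolean masks. This is only a sketch that assumes the Type of a grouped make should be exactly the Make name plus an optional "_MD3" suffix, as in the sample data. The names grouped, has_md3 and new_type are just illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ("Tesla", "Model3", "-", "Tesla"),
        ("Tesla", "ModelS", "-", "Tesla_MD3"),
        ("Tesla", "ModelX", "-", "Tesla_MD3"),
        ("Tesla", "ModelY", "-", "Tesla"),
        ("Jeep", "Wrangler", "Grouped", "Jeep"),
        ("Jeep", "Compass", "Grouped", "Jeep_MD3"),
        ("Jeep", "Patriot", "Grouped", "Jeep"),
        ("Jeep", "Cherokee", "Grouped", "Jeep"),
        ("Ford", "Mustang", "Grouped", "Ford"),
        ("Ford", "F150", "Grouped", "Ford"),
    ],
    columns=["Make", "Model", "Status", "Type"],
)

# Rows that belong to a grouped make
grouped = df["Status"].eq("Grouped")

# True for every row whose Make has at least one grouped row ending in "_MD3"
has_md3 = (
    (df["Type"].str.endswith("_MD3") & grouped)
    .groupby(df["Make"])
    .transform("any")
)

# Candidate value: the Make itself, with "_MD3" appended where the make qualifies
new_type = df["Make"].where(~has_md3, df["Make"] + "_MD3")

# Overwrite Type only on grouped rows; all other rows keep their current Type
df["Type"] = df["Type"].where(~grouped, new_type)

print(df)
```

On the sample data this should give the same Type column as the groupby/transform version above. Like that version, it resets the Type of grouped makes without an _MD3 row to the plain Make name.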