For the DataFrame df
below
df = pd.DataFrame([('Tesla','Model3', '-', 'Motor'),
('Tesla', 'ModelS', '-', 'MotorMD3'),
('Tesla', 'ModelX', '-', 'MotorMD3'),
('Tesla', 'ModelY', '-', 'Motor'),
('Jeep', 'Wrangler','Grouped','Engine'),
('Jeep', 'Compass', 'Grouped','EngineMD3'),
('Jeep', 'Patriot', 'Grouped','Engine'),
('Jeep', 'Cherokee','Grouped','Engine'),
('Ford', 'Mustang', 'Grouped','Engine'),
('Ford', 'F150', 'Grouped','Engine') ],columns=['Make','Model','Status','Type'])
df
Make Model Status Type
0 Tesla Model3 - Motor
1 Tesla ModelS - MotorMD3
2 Tesla ModelX - MotorMD3
3 Tesla ModelY - Motor
4 Jeep Wrangler Grouped Engine
5 Jeep Compass Grouped EngineMD3
6 Jeep Patriot Grouped Engine
7 Jeep Cherokee Grouped Engine
8 Ford Mustang Grouped Engine
9 Ford F150 Grouped Engine
I am trying to update the column Type
with EngineMD3
for all same Make
, if EngineMD3
is present in any of the Models
in that Make
, and if the Status
is Grouped
for that Make
. But if the Status
is not Grouped
, Type
should be kept as such for each Models
. If 'EngineMD3' is not present the Type
should be maintained as Engine
.
For instance, Tesla
is not Grouped
, so each model keeps their Type
the same. But Jeep
is Grouped
, and Compass
is having its Type
as EngineMD3
, so EngineMD3
is updated as the Type
for all Jeep
Models
. Ford
is Grouped
but none of the Models
have type EngineMD3
so Type
is kept as Engine
Expected output
Make Model Status Type
0 Tesla Model3 - Motor #For Tesla Type is maintained for each model seperately since it is not grouped
1 Tesla ModelS - MotorMD3
2 Tesla ModelX - MotorMD3
3 Tesla ModelY - Motor
4 Jeep Wrangler Grouped EngineMD3 #Since Jeep is grouped, all its Type is changed to EngineMD3 since one of the model had EngineMD3
5 Jeep Compass Grouped EngineMD3
6 Jeep Patriot Grouped EngineMD3
7 Jeep Cherokee Grouped EngineMD3
8 Ford Mustang Grouped Engine #Even though Ford is grouped, since there is no EngineMD3 the Type is maintained as Engine.
9 Ford F150 Grouped Engine
In other words,
The conditions are for all the makes
(eg. Jeeps) If the make
is grouped and if FD3
is appended to any of the model types
, then all the grouped models in the same make
will have the FD3 appended to them
I tried to use np.select
to update the Type
column with multiple conditions but I couldn't give a condition to select all same Make
at once and update the Type
. Please do help I am running out of options here.
CodePudding user response:
df.loc[df.Status.eq('Grouped'), 'Type'] = df[df.Status.eq('Grouped')].groupby('Make').Type.transform(
lambda x: 'EngineMD3' if x.eq('EngineMD3').any() else x)
Make Model Status Type
0 Tesla Model3 - Motor
1 Tesla ModelS - MotorMD3
2 Tesla ModelX - MotorMD3
3 Tesla ModelY - Motor
4 Jeep Wrangler Grouped EngineMD3
5 Jeep Compass Grouped EngineMD3
6 Jeep Patriot Grouped EngineMD3
7 Jeep Cherokee Grouped EngineMD3
8 Ford Mustang Grouped Engine
9 Ford F150 Grouped Engine
CodePudding user response:
You can try this:
def process(g):
if sum(g.Status=='Grouped')>0:
g['Type'] = 'EngineMD3' if sum(g.Type=='EngineMD3')>0 else 'Engine'
return g
df.groupby('Make').apply(process)
Output:
Make Model Status Type
0 Tesla Model3 - Motor
1 Tesla ModelS - MotorMD3
2 Tesla ModelX - MotorMD3
3 Tesla ModelY - Motor
4 Jeep Wrangler Grouped EngineMD3
5 Jeep Compass Grouped EngineMD3
6 Jeep Patriot Grouped EngineMD3
7 Jeep Cherokee Grouped EngineMD3
8 Ford Mustang Grouped Engine
9 Ford F150 Grouped Engine
CodePudding user response:
We can use the following function :
def trans(s_1, s_2, s_3):
Make_values, new_col = [], []
for i in range(len(s_1)):
if s_3[i]=="EngineMD3" and s_2[i]=="Grouped":
Make_values.append(s_1[i])
for i in range(len(Make_values)):
for j in range(len(s_3)):
if s_1[j]==Make_values[i]:
new_col.append("EngineMD3")
else:
new_col.append(s_3[j])
return new_col
And we apply it:
df['Type']=trans(df['Make'], df['Status'], df['Type'])
CodePudding user response:
I'm no python expert, I'm sure there are more optimal solutions... But here is one.
- Checking the length of the query > 0.
- If it is > 0, then we store the value of Make that satisfies the conditions you mentioned in y variable.
- Then we replace the Type of that specific Make.
if len(df.query("Status == 'Grouped' & Type.str.startswith('EngineMD3')")) > 0:
x = df.query("Status == 'Grouped' & Type.str.startswith('EngineMD3')")['Make'].values
for i in range(len(x)):
y = x[i]
df['Type'].loc[df['Make'] == y] = 'EnginerMD3'
display(df)