I have a dataframe that looks like this:
import pandas as pd
all_data_set = [
('A','Area1','AA','A B D E','A B','D E'),
('B','Area1','AA','A B D E','A B','D E'),
('C','Area2','BB','C','C','C'),
('E','Area1','CC','A B D E','A B','D E'),
('F','Area3','BB','F G','G','F')
]
all_df = pd.DataFrame(data = all_data_set, columns = ['Name','Area','Type','Group','AA members','CC members'])
  Name   Area Type    Group AA members CC members
0    A  Area1   AA  A B D E        A B        D E
1    B  Area1   AA  A B D E        A B        D E
2    C  Area2   BB        C          C          C
3    E  Area1   CC  A B D E        A B        D E
4    F  Area3   BB      F G          G          F
The last row (row 4) is incorrect.
Anything that is Type BB should only have itself (F) in Group, AA members and CC members.
So it should look like this:
4    F  Area3   BB        F          F          F
To do this, I was trying to:
1. Check when Type is BB and the length of Group is 2 items, like this: df = (all_df.loc[all_df['Type'] == 'BB', 'Group'].str.split().str.len() == 2)
2. Iterate over every row to find the cases like this
3. Make a new df with all the dropped rows and set Group, AA members, CC members = Name
4. Drop the rows where that happens in all_df
5. Merge the df from step 3 back into all_df
Is there a better pandas way to do this?
CodePudding user response:
Try a boolean mask with loc:
# identify rows where Type is BB
m = all_df['Type'] == 'BB'
# for Type BB rows, replace Group, AA members and CC members values by Name
all_df.loc[m, ['Group', 'AA members', 'CC members']] = all_df.loc[m, 'Name']
print(all_df)
  Name   Area Type    Group AA members CC members
0    A  Area1   AA  A B D E        A B        D E
1    B  Area1   AA  A B D E        A B        D E
2    C  Area2   BB        C          C          C
3    E  Area1   CC  A B D E        A B        D E
4    F  Area3   BB        F          F          F
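If only the BB rows whose Group holds exactly two items should be touched, as in the first step of the question, the mask could be narrowed. This is just a sketch under that assumption; the name cols is mine, and the assignment is done one column at a time to keep it explicit.

```python
import pandas as pd

all_df = pd.DataFrame(
    data=[
        ('A', 'Area1', 'AA', 'A B D E', 'A B', 'D E'),
        ('B', 'Area1', 'AA', 'A B D E', 'A B', 'D E'),
        ('C', 'Area2', 'BB', 'C', 'C', 'C'),
        ('E', 'Area1', 'CC', 'A B D E', 'A B', 'D E'),
        ('F', 'Area3', 'BB', 'F G', 'G', 'F'),
    ],
    columns=['Name', 'Area', 'Type', 'Group', 'AA members', 'CC members'],
)

# columns to overwrite (cols is a name chosen for this sketch)
cols = ['Group', 'AA members', 'CC members']

# Type is BB AND Group has exactly two space-separated items
m = (all_df['Type'] == 'BB') & (all_df['Group'].str.split().str.len() == 2)

# overwrite each target column with Name on the matching rows only
for col in cols:
    all_df.loc[m, col] = all_df.loc[m, 'Name']

print(all_df)
```

Row 2 (C) is not matched by this narrower mask, but it already holds only itself, so the result is the same for this data.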
CodePudding user response:
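You can try iloc with a for loop.
# iloc is positional, so loop over positions rather than index labels
for row in range(len(all_df)):
    # column 2 is Type
    if all_df.iloc[row, 2] == "BB":
        # columns 3 onward are Group, AA members, CC members; column 0 is Name
        all_df.iloc[row, 3:] = all_df.iloc[row, 0]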
all_df
  Name   Area Type    Group AA members CC members
0    A  Area1   AA  A B D E        A B        D E
1    B  Area1   AA  A B D E        A B        D E
2    C  Area2   BB        C          C          C
3    E  Area1   CC  A B D E        A B        D E
4    F  Area3   BB        F          F          F
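A loop over positions depends on the column order, and mixing index labels with iloc only works with the default 0..n-1 index. If a loop is still preferred, a label-based variant with loc might be a bit more robust. A sketch, assuming a made-up non-default index (10 to 14) purely to illustrate the point; cols is also a name chosen here.

```python
import pandas as pd

all_df = pd.DataFrame(
    data=[
        ('A', 'Area1', 'AA', 'A B D E', 'A B', 'D E'),
        ('B', 'Area1', 'AA', 'A B D E', 'A B', 'D E'),
        ('C', 'Area2', 'BB', 'C', 'C', 'C'),
        ('E', 'Area1', 'CC', 'A B D E', 'A B', 'D E'),
        ('F', 'Area3', 'BB', 'F G', 'G', 'F'),
    ],
    columns=['Name', 'Area', 'Type', 'Group', 'AA members', 'CC members'],
    index=[10, 11, 12, 13, 14],  # hypothetical non-default index
)

cols = ['Group', 'AA members', 'CC members']

# iterate over the index labels of the BB rows only
for idx in all_df.index[all_df['Type'] == 'BB']:
    # loc selects by label, so column order and index values do not matter
    all_df.loc[idx, cols] = all_df.loc[idx, 'Name']

print(all_df)
```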