I have a dataframe with a column that looks like the first column, "WBS #". How can I use Python/pandas to label each row as either a parent or a child, as shown in the second column, "Parent/Child"?
df:
WBS # | Parent/Child |
---|---|
1 | Parent |
1.1 | Parent |
1.1.1 | Child |
1.1.2 | Parent |
1.1.2.1 | Parent |
1.1.2.1.1 | Parent |
1.1.2.1.1.1 | Child |
1.1.2.1.1.2 | Child |
1.1.2.1.2 | Child |
1.1.2.1.3 | Child |
1.1.2.2 | Child |
The goal is to be able to have each row properly labelled as either parent or child so all of the child elements can be rolled up appropriately.
CodePudding user response:
Remove the last part of each WBS to get the parents, then use isin combined with numpy.where:
import numpy as np

# Strip the trailing ".<number>" from each code; what remains is that row's parent.
parents = df['WBS #'].str.replace(r'\.\d+$', '', regex=True).unique()
# array(['1', '1.1', '1.1.2', '1.1.2.1', '1.1.2.1.1'], dtype=object)
df['Parent/Child'] = np.where(df['WBS #'].isin(parents), 'Parent', 'Child')
output:
WBS # Parent/Child
0 1 Parent
1 1.1 Parent
2 1.1.1 Child
3 1.1.2 Parent
4 1.1.2.1 Parent
5 1.1.2.1.1 Parent
6 1.1.2.1.1.1 Child
7 1.1.2.1.1.2 Child
8 1.1.2.1.2 Child
9 1.1.2.1.3 Child
10 1.1.2.2 Child
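One caveat with the approach above: a top-level code such as "1" has nothing to strip, so it maps to itself and is always labelled Parent, even if it had no children. The following is a minimal, self-contained sketch of one way to handle that, assuming the WBS codes are dot-separated strings. The "Parent WBS" column name is hypothetical and is only there to help with the roll-up mentioned in the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"WBS #": [
    "1", "1.1", "1.1.1", "1.1.2", "1.1.2.1", "1.1.2.1.1",
    "1.1.2.1.1.1", "1.1.2.1.1.2", "1.1.2.1.2", "1.1.2.1.3", "1.1.2.2",
]})

# Strip the trailing ".<digits>" segment to get each row's parent code.
parent_code = df["WBS #"].str.replace(r"\.\d+$", "", regex=True)

# Top-level codes are unchanged by the strip, so they would count as
# their own parent. Blank them out (NaN) instead.
parent_code = parent_code.where(parent_code != df["WBS #"])

# Hypothetical helper column: the code each row rolls up into.
df["Parent WBS"] = parent_code

# A row is a parent if some other row names it as its parent.
df["Parent/Child"] = np.where(
    df["WBS #"].isin(parent_code.dropna()), "Parent", "Child"
)
```

On the sample data this gives the same labels as the output above. With the parent code stored per row, child values could then be aggregated with something like df.groupby("Parent WBS"), depending on which columns need rolling up.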