I'm having trouble figuring out a workflow to identify items in a dataframe column that are equal to one another and, when they are, create a new column in which the values from the matching rows are joined together.
For example, I want to check whether the first item in the "name" column ("Chimney" in this example) is duplicated elsewhere in the column and, if it is, take the duplicate rows and join their values from the "Path" column.
I've tried using iloc with an if condition, but I was only able to evaluate the first set of matches based on the index. I also looked into the .duplicated() method, but had no luck joining items in a separate column once duplicates were found.
import pandas as pd

# Named "data" rather than "dict" to avoid shadowing the built-in dict type
data = {'name': ["Chimney", "Chimney", "Columbia", "Washington", "Washington", "Washington"],
        'Path': ["Path1", "Path2", "Path3", "Path 4", "Path 5", "Path 6"]}
df = pd.DataFrame(data)
         name    Path
0     Chimney   Path1
1     Chimney   Path2
2    Columbia   Path3
3  Washington  Path 4
4  Washington  Path 5
5  Washington  Path 6
What I am looking for:
         name             Append_Path
0     Chimney            Path1, Path2
2    Columbia                   Path3
3  Washington  Path 4, Path 5, Path 6
The code below is what I thought was on the right track, but I now feel that this logic won't accomplish what I'm trying to do.
n = 0
for i in df["name"]:
    n += 1
    if df["name"].iloc[n] == df["name"].iloc[n + 1]:
        df["Append_Path"] = df["Path"].iloc[n] + ", " + df["Path"].iloc[n + 1]
    else:
        pass
Any help on this is greatly appreciated.
CodePudding user response:
df.groupby("name").Path.apply(lambda x: ", ".join(x)).reset_index()
# name Path
# 0 Chimney Path1, Path2
# 1 Columbia Path3
# 2 Washington Path 4, Path 5, Path 6
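
If the joined column should be called Append_Path as in the question, a minimal sketch along the same groupby lines might look like the following. The variable names data, result and keep_index are only illustrative, and the sample frame is rebuilt from the question so the snippet runs on its own. The second variant is one possible way to keep the original row labels (0, 2, 3) shown in the desired output; it is a sketch, not the only way to do it.

```python
import pandas as pd

# Sample data copied from the question ("data" is an illustrative name).
data = {
    "name": ["Chimney", "Chimney", "Columbia", "Washington", "Washington", "Washington"],
    "Path": ["Path1", "Path2", "Path3", "Path 4", "Path 5", "Path 6"],
}
df = pd.DataFrame(data)

# Variant 1: one row per name, paths joined into a single string.
# sort=False keeps the groups in order of first appearance.
result = (
    df.groupby("name", sort=False)["Path"]
      .agg(", ".join)
      .reset_index(name="Append_Path")
)
print(result)

# Variant 2: same joined strings, but keep the row label of the
# first occurrence of each name instead of a fresh 0..n-1 index.
joined = df.groupby("name")["Path"].transform(lambda s: ", ".join(s))
keep_index = (
    df.assign(Append_Path=joined)
      .drop_duplicates("name")[["name", "Append_Path"]]
)
print(keep_index)
```

Both variants rely on groupby collecting every row with the same name, so no explicit loop or row-by-row comparison is needed.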