I have a data frame like so:
Input:
year ip type
2020 101 Missing
2021 101 Type 1
2022 101 Type 2
2020 102 Missing
2021 102 Missing
2020 103 Missing
2021 103 Type 2
2021 104 Type 1
2022 104 Type 2
2022 104 Type 2
How can I convert my data frame to the following:
Expected Output:
ip type
101 Missing/Type 1/Type 2
102 Missing
103 Missing/Type 2
104 Type 1/Type 2
That is, I want all unique types for each IP, joined with "/". How can I do this in Python pandas?
CodePudding user response:
In your case, call drop_duplicates before groupby.agg:
out = df.drop_duplicates(['ip','type']).groupby('ip')['type'].agg('/'.join).reset_index()
Out[638]:
ip type
0 101 Missing/Type 1/Type 2
1 102 Missing
2 103 Missing/Type 2
3 104 Type 1/Type 2
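For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample rows in the question, so the column dtypes (integer year and ip, string type) are an assumption about the real data.

```python
import pandas as pd

# Sample data copied from the question; dtypes are assumed.
df = pd.DataFrame({
    "year": [2020, 2021, 2022, 2020, 2021, 2020, 2021, 2021, 2022, 2022],
    "ip": [101, 101, 101, 102, 102, 103, 103, 104, 104, 104],
    "type": ["Missing", "Type 1", "Type 2", "Missing", "Missing",
             "Missing", "Type 2", "Type 1", "Type 2", "Type 2"],
})

out = (
    df.drop_duplicates(["ip", "type"])  # keep the first row of each (ip, type) pair
      .groupby("ip")["type"]            # only the 'type' column is aggregated
      .agg("/".join)                    # join each group's strings with "/"
      .reset_index()                    # turn the 'ip' index back into a column
)
print(out)
```

Types are joined in the order they first appear for each ip, so sort the frame beforehand if a different order is needed.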
CodePudding user response:
You can try drop_duplicates, then agg:
out = df.drop_duplicates(['ip', 'type']).groupby('ip', as_index=False)['type'].agg('/'.join)
print(out)
ip type
0 101 Missing/Type 1/Type 2
1 102 Missing
2 103 Missing/Type 2
3 104 Type 1/Type 2
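A possible variant, not taken from either answer, is to skip drop_duplicates and deduplicate inside the aggregation with Series.unique(), which keeps values in order of first appearance. The helper name join_unique_types and the small demo frame below are hypothetical and only for illustration.

```python
import pandas as pd

def join_unique_types(frame, sep="/"):
    """Return one row per ip with its unique 'type' values joined by sep.

    Hypothetical helper; assumes 'ip' and 'type' columns exist and
    that 'type' holds strings.
    """
    return (
        frame.groupby("ip")["type"]
             .agg(lambda s: sep.join(s.unique()))  # unique() preserves first-seen order
             .reset_index()
    )

# Small demo frame using rows for ip 102 and 104 from the question.
demo = pd.DataFrame({
    "year": [2020, 2021, 2021, 2022, 2022],
    "ip": [102, 102, 104, 104, 104],
    "type": ["Missing", "Missing", "Type 1", "Type 2", "Type 2"],
})

result = join_unique_types(demo)
print(result)
```

A Python lambda in agg is usually slower than the drop_duplicates approach on large frames, so this is mainly a readability trade-off.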