I have created a DataFrame:
import pandas as pd
df = pd.DataFrame({'Weather': [32, 45, 12, 18, 19, 27, 39, 11, 22, 42],
                   'Id': [1, 2, 3, 4, 5, 1, 6, 7, 8, 2]})
df.head()
You can see that the Id values at index 5 and 9 duplicate earlier rows. I want to append the string --duplicated to the Id at those two indices.
df.loc[df['Id'].duplicated()]
Output
Weather Id
5 27 1
9 42 2
Expected output
Weather Id
5 27 1--duplicated
9 42 2--duplicated
CodePudding user response:
Do you want a filtered DataFrame, i.e. your previous output modified using assign?
(df.loc[df['Id'].duplicated()]
.assign(Id=lambda d: d['Id'].astype(str).add('--duplicated'))
)
Output:
Weather Id
5 27 1--duplicated
9 42 2--duplicated
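As a quick check of the assign variant above, here is a minimal, self-contained sketch. The name out is only an illustrative variable, not something from the question. It shows that assign returns a new DataFrame and leaves df as it was:

```python
import pandas as pd

df = pd.DataFrame({'Weather': [32, 45, 12, 18, 19, 27, 39, 11, 22, 42],
                   'Id': [1, 2, 3, 4, 5, 1, 6, 7, 8, 2]})

# Filter the duplicated rows, then build a new DataFrame with the suffixed Id.
# "out" is a hypothetical name used only for this sketch.
out = (df.loc[df['Id'].duplicated()]
         .assign(Id=lambda d: d['Id'].astype(str).add('--duplicated')))

print(out['Id'].tolist())  # ['1--duplicated', '2--duplicated']
print(df['Id'].tolist())   # the original integers, unchanged
```

So this form fits when you only need the two flagged rows and want to keep the original data intact.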
Or an in-place modification of the original DataFrame with boolean indexing?
m = df['Id'].duplicated()
df.loc[m, 'Id'] = df.loc[m, 'Id'].astype(str) + '--duplicated'
Output:
Weather Id
0 32 1
1 45 2
2 12 3
3 18 4
4 19 5
5 27 1--duplicated
6 39 6
7 11 7
8 22 8
9 42 2--duplicated
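One caveat with the in-place form: it writes strings into an integer column, and as far as I know recent pandas versions may warn about (or reject) assigning values of an incompatible dtype. A possible workaround, sketched here under the assumption that it is acceptable for every Id to become a string, is to cast the whole column first and then use Series.mask. The name ids is just an illustrative variable:

```python
import pandas as pd

df = pd.DataFrame({'Weather': [32, 45, 12, 18, 19, 27, 39, 11, 22, 42],
                   'Id': [1, 2, 3, 4, 5, 1, 6, 7, 8, 2]})

m = df['Id'].duplicated()      # True for the second and later occurrences
ids = df['Id'].astype(str)     # cast the whole column to strings up front

# mask() replaces the values where m is True and keeps the others.
df['Id'] = ids.mask(m, ids + '--duplicated')

print(df['Id'].tolist())
# ['1', '2', '3', '4', '5', '1--duplicated', '6', '7', '8', '2--duplicated']
```

The difference from the output above is that the untouched Ids are now the strings '1', '2', ... rather than integers, so the column has one consistent type.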
CodePudding user response:
If you need to add a suffix to the filtered rows only, use DataFrame.loc with a boolean mask:
m = df['Id'].duplicated()
df.loc[m,'Id' ] = df.loc[m,'Id' ].astype(str).add('--duplicated')
print (df)
Weather Id
0 32 1
1 45 2
2 12 3
3 18 4
4 19 5
5 27 1--duplicated
6 39 6
7 11 7
8 22 8
9 42 2--duplicated
Or use boolean indexing to filter the rows first, and then add the suffix:
df1 = df[df['Id'].duplicated()].copy()
df1['Id'] = df1['Id'].astype(str) + '--duplicated'
print (df1)
Weather Id
5 27 1--duplicated
9 42 2--duplicated
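A side note on the mask itself: by default duplicated() uses keep='first', so only the second and later occurrences are flagged, which matches the expected output in the question. If you ever wanted to flag every row that shares an Id, keep=False should do that. A small sketch, where m_later and m_all are hypothetical names:

```python
import pandas as pd

df = pd.DataFrame({'Weather': [32, 45, 12, 18, 19, 27, 39, 11, 22, 42],
                   'Id': [1, 2, 3, 4, 5, 1, 6, 7, 8, 2]})

m_later = df['Id'].duplicated()          # default keep='first'
m_all = df['Id'].duplicated(keep=False)  # flag every member of a duplicate group

print(m_later[m_later].index.tolist())   # [5, 9]
print(m_all[m_all].index.tolist())       # [0, 1, 5, 9]
```

Either mask can be dropped into the loc or boolean-indexing code above in place of m.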