I need to rename values in a list contained in a pandas column and keep unique values if duplicates exist within the list. Please note that I want to apply this to a pandas column.
text                 matches
flowers are red.     [red,yellow,pink]
airplanes are blue.  [blue, indigo]
xxxxxxxx             [orange]
I need to replace pink and yellow with red, and indigo with blue. Since red and blue already exist in the lists, I want to keep just one red and one blue. My code looks like this:
df["rename"] = df["matches"].str.replace("pink","red")
I need my output to look like this:
text                 matches         rename        final
flowers are red.     [red,pink]      [red, red]    [red]
airplanes are blue.  [blue, indigo]  [blue, blue]  [blue]
xxxxxxxx             [orange]        [orange]      [orange]
Thank you in advance!
CodePudding user response:
Here is a workaround using apply and lambda:
df['rename'] = df['matches'].apply(lambda x: [i.replace('pink', 'red').replace('yellow', 'red').replace('indigo', 'blue') for i in x])
df['final'] = df['rename'].apply(lambda x: list(set(x)))
print(df)
                  text         matches        rename     final
0     flowers are red.     [red, pink]    [red, red]     [red]
1  airplanes are blue.  [blue, indigo]  [blue, blue]    [blue]
2             xxxxxxxx        [orange]      [orange]  [orange]
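If there are more than a couple of values to rename, a dictionary lookup may be easier to maintain than chained replace calls. The sketch below assumes the matches column really holds Python lists (not strings that only look like lists). The names RENAMES, rename_values and dedupe are made up for this example. Unlike str.replace, the lookup only matches whole values, and dict.fromkeys keeps the first-seen order, which set() does not guarantee.

```python
import pandas as pd

# Hypothetical mapping of old value -> new value; extend as needed.
RENAMES = {"pink": "red", "yellow": "red", "indigo": "blue"}


def rename_values(values):
    """Swap each element for its mapped name; unmapped values pass through."""
    return [RENAMES.get(v, v) for v in values]


def dedupe(values):
    """Drop duplicates while keeping first-seen order (dict keys are ordered)."""
    return list(dict.fromkeys(values))


# Sample data taken from the question.
df = pd.DataFrame({
    "text": ["flowers are red.", "airplanes are blue.", "xxxxxxxx"],
    "matches": [["red", "yellow", "pink"], ["blue", "indigo"], ["orange"]],
})

df["rename"] = df["matches"].apply(rename_values)
df["final"] = df["rename"].apply(dedupe)
print(df)
```

With the sample data from the question, final ends up as [red], [blue] and [orange].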