I have the following dataset with a NumPy array in column B, and I want to create "new Column" by removing the duplicated elements from each array in column B, as shown:
A  B                                 new Column
1  ["A","a","123","123","A"]         ["A","a","123"]
2  ["abc","a","1234","123","abc"]    ["abc","a","1234","123"]
3  ["abcd","abcd","abcd"]            ["abcd"]
4  ["hello","mello"]                 ["hello","mello"]
5  ["hi","hi","why"]                 ["hi","why"]
I am using the following code, but neither version gives the desired output. Please help.
def u_value(a):
    return np.unique(a)
or
def ddpe(a):
    a = list(dict.fromkeys(a))
    return a
CodePudding user response:
The problem is that the values are strings, not lists, so use ast.literal_eval to convert them to lists first:
import ast
def ddpe(a):
    return list(dict.fromkeys(ast.literal_eval(a)))
df['new Column'] = df['B'].apply(ddpe)
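To see why parsing matters, here is a minimal standalone sketch that uses only the standard library. The helper name dedupe_preserving_order is hypothetical, and the sketch assumes each cell is either a string such as '["A","a"]' or an actual sequence of hashable items:

```python
import ast

def dedupe_preserving_order(value):
    """Return the unique items of value in first-seen order."""
    # Assumption: a cell is either a string like '["A","a"]'
    # or an actual sequence (list, tuple, array) of hashable items.
    if isinstance(value, str):
        value = ast.literal_eval(value)
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(value))

raw = '["A","a","123","123","A"]'

# Without parsing, the string is deduplicated character by character,
# so brackets, quotes and commas end up in the result.
chars = list(dict.fromkeys(raw))

# Parsing first gives the intended items.
items = dedupe_preserving_order(raw)
print(items)  # ['A', 'a', '123']
```

The same function could be passed to df['B'].apply in place of ddpe. Note that np.unique returns its result sorted, so it would not keep the original order even on real lists.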