Create list of unique substrings in Column by delimiter

Time:06-08

How can I get a list of the unique values in the "OWNER" column of my pandas DataFrame (df), splitting each entry on the delimiter ";"? The dtype of OWNER is string.

Thanks a lot! <3

OWNER
"A"
"B;C"
"B;C"

The result should be: unique_value = ["A", "B", "C"]

CodePudding user response:

You can split by ';', then explode and find uniques:

>>> import pandas as pd
>>> df = pd.Series(['A', 'B;C', 'B;C'], name='OWNER').to_frame()
>>> df

  OWNER
0     A
1   B;C
2   B;C

>>> df['OWNER'].str.split(';').explode().unique().tolist()
['A', 'B', 'C']
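
If the column might also contain missing values, empty pieces, or stray whitespace around the delimiter (an assumption; the sample in the question has none of these), the same split/explode idea can be wrapped in a small helper. This is only a sketch: `unique_substrings` is a hypothetical name, and the extra rows in the sample frame are made up for illustration.

```python
import pandas as pd


def unique_substrings(column: pd.Series, sep: str = ";") -> list:
    """Return the unique non-empty substrings of `column`, in first-seen order.

    Hypothetical helper, not part of pandas.
    """
    parts = (
        column.dropna()      # skip missing rows
        .astype(str)
        .str.split(sep)      # each row becomes a list of substrings
        .explode()           # one substring per row
        .str.strip()         # trim whitespace around each substring
    )
    parts = parts[parts != ""]  # drop empty pieces, e.g. from "A;;B"
    return parts.unique().tolist()


# Sample data: the question's three rows plus two illustrative edge cases.
df = pd.DataFrame({"OWNER": ["A", "B;C", "B;C", None, " C ; D"]})
unique_value = unique_substrings(df["OWNER"])
print(unique_value)
```

Series.unique() keeps values in order of first appearance, so the result follows the order in which the owners first occur in the column.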

Alternatively, join all the column's values into one string with ';' and split that string on ';' again. Then use dict.fromkeys to drop the duplicates while preserving order, and convert the result to a list.

>>> list(dict.fromkeys(';'.join(df['OWNER']).split(';')))
['A', 'B', 'C']
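
One caveat with the join approach: str.join only accepts strings, so a missing value in the column raises a TypeError. A minimal sketch that drops missing values first, assuming a column with one missing entry (that row is added here purely for illustration):

```python
import pandas as pd

# The question's three rows plus one illustrative missing value.
df = pd.DataFrame({"OWNER": ["A", "B;C", "B;C", None]})

# ';'.join(df['OWNER']) would raise TypeError here because the missing
# value is not a str, so drop missing values before joining.
joined = ";".join(df["OWNER"].dropna())

# dict.fromkeys keeps the first occurrence of each key, preserving order.
unique_value = list(dict.fromkeys(joined.split(";")))
print(unique_value)
```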