How to join items from the same column using pandas in Python?

Time:10-05

print(dfs["Categorias"])

I'm getting this:

0                         wordpress, criação de sites
1                                    criação de sites
2             e-commerce, criação de sites, wordpress
3                           marketing digital, vendas

How can I remove the repeated items and join the unique values into a list?

Thank you

CodePudding user response:

You could use sets and itertools.chain:

from itertools import chain
set(chain(*df['Categorias'].str.split(r',\s+')))

Output:

{'criação de sites', 'e-commerce', 'marketing digital', 'vendas', 'wordpress'}

Optionally, as list:

>>> list(set(chain(*df['Categorias'].str.split(r',\s+'))))
['criação de sites', 'e-commerce', 'marketing digital', 'vendas', 'wordpress']
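
A set has no defined order, so the list above may come back in a different order on another run. If the order of first appearance matters, here is a minimal sketch using dict.fromkeys instead of set. It assumes the column looks like the sample printed in the question; the `df` built here is only a stand-in for the real data, and `unique_ordered` is a name made up for illustration.

```python
from itertools import chain

import pandas as pd

# Stand-in for the question's data (assumed from the printed sample).
df = pd.DataFrame({
    "Categorias": [
        "wordpress, criação de sites",
        "criação de sites",
        "e-commerce, criação de sites, wordpress",
        "marketing digital, vendas",
    ]
})

# Split each row on a comma followed by optional whitespace.
parts = df["Categorias"].str.split(r",\s*")

# dict keys are unique and keep insertion order (Python 3.7+),
# so this deduplicates while preserving first-appearance order.
unique_ordered = list(dict.fromkeys(chain.from_iterable(parts)))
print(unique_ordered)
# ['wordpress', 'criação de sites', 'e-commerce', 'marketing digital', 'vendas']
```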

CodePudding user response:

Are you looking for something like this?

Split each row into a list, explode the lists into rows, then take the unique values of the column.

>>> df['Categorias'].str.split(r',\s+').explode().unique().tolist()
['wordpress', 'criação de sites', 'e-commerce', 'marketing digital', 'vendas']

Step by step:

>>> df = df['Categorias'].str.split(r',\s+')
>>> df
0                [wordpress, criação de sites]
1                           [criação de sites]
2    [e-commerce, criação de sites, wordpress]
3                  [marketing digital, vendas]
Name: Categorias, dtype: object

>>> df = df.explode()
>>> df
0            wordpress
0     criação de sites
1     criação de sites
2           e-commerce
2     criação de sites
2            wordpress
3    marketing digital
3               vendas
Name: Categorias, dtype: object

>>> df.unique().tolist()
['wordpress', 'criação de sites', 'e-commerce', 'marketing digital', 'vendas']
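
If the real column has missing rows or inconsistent spacing around the commas (an assumption; the sample in the question shows neither), the same split/explode/unique chain can be made a bit more forgiving. A rough sketch, where the Series `s` is a hypothetical messy variant of the question's data:

```python
import pandas as pd

# Hypothetical messy variant of the question's column:
# one missing row and uneven spacing around the commas.
s = pd.Series(
    ["wordpress, criação de sites", None, " e-commerce ,criação de sites", "vendas"],
    name="Categorias",
)

unique_values = (
    s.dropna()         # skip missing rows
     .str.split(",")   # split on the literal comma only
     .explode()        # one item per row
     .str.strip()      # trim whitespace around each item
     .unique()         # keeps order of first appearance
     .tolist()
)
print(unique_values)
# ['wordpress', 'criação de sites', 'e-commerce', 'vendas']
```

Splitting on the bare comma and stripping afterwards avoids depending on how many spaces follow each comma.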

CodePudding user response:

One way is to convert the dataframe column to a list, remove duplicates using a set and then join them using string operations.

>>> ', '.join(set(df['Categorias'].str.split(', ').explode().tolist()))
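
Because the join above goes through a set, the order of the items in the resulting string is not guaranteed. Two possible ways to get a reproducible string, sketched against a stand-in `df` built from the sample in the question (the variable names are made up for illustration):

```python
import pandas as pd

# Stand-in for the question's data (assumed from the printed sample).
df = pd.DataFrame({
    "Categorias": [
        "wordpress, criação de sites",
        "criação de sites",
        "e-commerce, criação de sites, wordpress",
        "marketing digital, vendas",
    ]
})

# One item per row.
items = df["Categorias"].str.split(", ").explode()

# Option 1: sort the unique items (plain code-point order).
joined_sorted = ", ".join(sorted(set(items)))

# Option 2: keep the order in which items first appear.
joined_first_seen = ", ".join(items.drop_duplicates())

print(joined_sorted)
# criação de sites, e-commerce, marketing digital, vendas, wordpress
print(joined_first_seen)
# wordpress, criação de sites, e-commerce, marketing digital, vendas
```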