Convert pandas data frame column values into comma-separated strings

Time:06-17

I have a data frame that looks like this:

    Column1                      Column2

'['jjhjh', 'adads','adsd']',    'dwdwdqw'
'['adads','adsd']',             'dwdwdqw'
'['jjhjh', 'adads','adsd']',    'dwdwdqw'
'['adads','adsd']',             'dwdwdqw'
'['adads','adsd']',             'dwdwdqw'

Although the items in Column1 look like lists, they are strings. I want to remove the square brackets and the quotes from each string, so that every value in that column becomes a comma-separated string of the same items. My desired output would be:

   Column1                   Column2

'jjhjh', 'adads','adsd',    'dwdwdqw'
'adads','adsd',             'dwdwdqw'
'jjhjh', 'adads','adsd',    'dwdwdqw'
'adads','adsd',             'dwdwdqw'
'adads','adsd',             'dwdwdqw'

I tried the following function but it does not replace the elements:

def string_convert(column_name):
  lista=[]
  for i in column_name:
    i=i.strip("[]")
    i=eval(i)
    lista.append(i)
  for m in lista:
    if m == tuple:
      column_name = m[0] + ',' + m[1]
    else:
      column_name = m
  return df['other']

Can anyone help me with this? Thanks in advance.

CodePudding user response:

One option is to parse each list-like string into a real list and then join its items with commas:

import ast

import pandas as pd

df = pd.DataFrame(columns=['Column1','Column2'])
df['Column1'] = ['\'[\'jjhjh\', \'adads\',\'adsd\']\'','\'[\'jjhjh\', \'adads\',\'adsd\']\'','\'[\'jjhjh\', \'adads\',\'adsd\']\'']
df['Column2'] = ['dwdwdqw','dwdwdqw','dwdwdqw']

print('Before Changes:\n\n',df)

for i, col_1 in df['Column1'].items():
    # Drop the outer quotes, then parse the remaining list literal.
    # ast.literal_eval only accepts Python literals, unlike eval.
    a = ast.literal_eval(col_1[1:-1])
    separate_comma = ",".join(a)
    # Assign through .loc to avoid chained-indexing assignment.
    df.loc[i, 'Column1'] = separate_comma

print('\n\nAfter Changes:\n\n',df)

Output:

Before Changes:

                        Column1  Column2
0  '['jjhjh', 'adads','adsd']'  dwdwdqw
1  '['jjhjh', 'adads','adsd']'  dwdwdqw
2  '['jjhjh', 'adads','adsd']'  dwdwdqw


After Changes:

             Column1  Column2
0  jjhjh,adads,adsd  dwdwdqw
1  jjhjh,adads,adsd  dwdwdqw
2  jjhjh,adads,adsd  dwdwdqw
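
The same conversion can also be written without an explicit loop. This is only a sketch, assuming every cell has the same shape as the sample data (a list literal wrapped in one extra pair of single quotes). The helper name list_string_to_csv is made up for illustration, and ast.literal_eval is used in place of eval because it only parses Python literals:

```python
import ast

import pandas as pd

# Sample data in the same shape as the question: each cell is a string
# holding a list literal wrapped in an extra pair of single quotes.
df = pd.DataFrame({
    "Column1": ["'['jjhjh', 'adads','adsd']'", "'['adads','adsd']'"],
    "Column2": ["dwdwdqw", "dwdwdqw"],
})


def list_string_to_csv(cell):
    """Hypothetical helper: turn "'['a', 'b']'" into "a,b"."""
    # Remove surrounding whitespace and the outer quotes, leaving "['a', 'b']".
    inner = cell.strip().strip("'")
    # Parse the list literal without executing arbitrary code.
    items = ast.literal_eval(inner)
    # Join the parsed items into one comma-separated string.
    return ",".join(items)


# Apply the helper to every cell of Column1 and store the result back.
df["Column1"] = df["Column1"].map(list_string_to_csv)
print(df)
```

Series.map calls the helper once per cell, so a malformed cell would raise an error from ast.literal_eval instead of being silently mangled.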

Greetings from Brazil!

CodePudding user response:

This loop worked for me.

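for i, row in df.iterrows():
    # Strip quotes and brackets, then split on commas to build a list.
    tmp_val = row['Column1'].replace("'", "").replace("[", "").replace("]", "").split(',')
    # Write back through the DataFrame (object-dtype column assumed);
    # assigning to `row` is not guaranteed to update df.
    df.at[i, 'Column1'] = tmp_val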

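This loops through each row of the DataFrame, removes the unwanted characters (square brackets and single quotes) from the Column1 string with chained replace calls, and then splits the result on commas with .split(',') to create a list. The last line of code replaces the original value with that newly created list. Note that an item that followed a comma and a space keeps its leading space (for example ' adads'), and that the cell ends up holding a list rather than a single comma-separated string.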
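
If the goal is a comma-separated string rather than a list, the same character-stripping idea can be applied to the whole column with pandas string methods. This is a rough sketch, assuming the items themselves never contain commas, quotes, or square brackets. The sample frame below is rebuilt from the question's data:

```python
import pandas as pd

# Sample data in the same shape as the question.
df = pd.DataFrame({
    "Column1": ["'['jjhjh', 'adads','adsd']'", "'['adads','adsd']'"],
    "Column2": ["dwdwdqw", "dwdwdqw"],
})

cleaned = (
    df["Column1"]
    .str.replace(r"[\[\]']", "", regex=True)   # remove [, ] and single quotes
    .str.replace(r"\s*,\s*", ",", regex=True)  # drop spaces around commas
    .str.strip()                               # trim any outer whitespace
)
df["Column1"] = cleaned

print(df["Column1"].tolist())
# ['jjhjh,adads,adsd', 'adads,adsd']
```

This never parses the cell as Python code, but it is purely textual: any bracket or quote that is part of an item would be removed as well.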
