I have a DataFrame that looks like this:
Column1 Column2
'['jjhjh', 'adads','adsd']', 'dwdwdqw'
'['adads','adsd']', 'dwdwdqw'
'['jjhjh', 'adads','adsd']', 'dwdwdqw'
'['adads','adsd']', 'dwdwdqw'
'['adads','adsd']', 'dwdwdqw'
Although the items in Column1 look like lists, they are strings. I want to remove the square brackets and the quotes from each string, and replace every value in that column with the same items as a comma-separated string. My desired output would be:
Column1 Column2
'jjhjh', 'adads','adsd', 'dwdwdqw'
'adads','adsd', 'dwdwdqw'
'jjhjh', 'adads','adsd', 'dwdwdqw'
'adads','adsd', 'dwdwdqw'
'adads','adsd', 'dwdwdqw'
I tried the following function but it does not replace the elements:
def string_convert(column_name):
    lista=[]
    for i in column_name:
        i=i.strip("[]")
        i=eval(i)
        lista.append(i)
    for m in lista:
        if m == tuple:
            column_name = m[0] ',' m[1]
        else:
            column_name = m
    return df['other']
Can anyone help me with this? Thanks in advance.
CodePudding user response:
One option is to parse each list-like string into a real list and then join its values with commas:
import ast
import pandas as pd
df = pd.DataFrame(columns=['Column1','Column2'])
df['Column1'] = ['\'[\'jjhjh\', \'adads\',\'adsd\']\'','\'[\'jjhjh\', \'adads\',\'adsd\']\'','\'[\'jjhjh\', \'adads\',\'adsd\']\'']
df['Column2'] = ['dwdwdqw','dwdwdqw','dwdwdqw']
print('Before Changes:\n\n',df)
for i, col_1 in df['Column1'].items():
    # Drop the outer quotes, then parse the list literal (safer than eval)
    a = ast.literal_eval(col_1[1:-1])
    separate_comma = ",".join(a)
    # Assign through .loc to avoid chained indexing
    df.loc[i, 'Column1'] = separate_comma
print('\n\nAfter Changes:\n\n',df)
Output:
Before Changes:
Column1 Column2
0 '['jjhjh', 'adads','adsd']' dwdwdqw
1 '['jjhjh', 'adads','adsd']' dwdwdqw
2 '['jjhjh', 'adads','adsd']' dwdwdqw
After Changes:
Column1 Column2
0 jjhjh,adads,adsd dwdwdqw
1 jjhjh,adads,adsd dwdwdqw
2 jjhjh,adads,adsd dwdwdqw
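The same conversion can also be written without an explicit loop. This is only a minimal sketch, assuming every value in Column1 has the list-like form shown in the question. The helper name list_string_to_csv is hypothetical, and ast.literal_eval is used instead of eval so that only Python literals are parsed:

```python
import ast
import pandas as pd

def list_string_to_csv(value):
    # Hypothetical helper: drop any outer quotes, parse the list literal,
    # then join the items with commas.
    items = ast.literal_eval(value.strip("'"))
    return ",".join(items)

df = pd.DataFrame({
    "Column1": ["'['jjhjh', 'adads','adsd']'", "'['adads','adsd']'"],
    "Column2": ["dwdwdqw", "dwdwdqw"],
})

# Apply the helper to every cell of Column1 and store the result back
df["Column1"] = df["Column1"].apply(list_string_to_csv)
print(df)
```

Because strip("'") only removes quotes at the two ends, the helper should also accept values that are not wrapped in an extra pair of quotes.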
Greetings from Brazil!
CodePudding user response:
This loop worked for me.
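for i, row in df.iterrows():
    # Strip quotes and brackets, then split on commas and trim each item
    cleaned = row['Column1'].replace("'", "").replace("[", "").replace("]", "")
    tmp_val = [item.strip() for item in cleaned.split(',')]
    # Assigning to row only changes a copy, so write back through df.at
    df.at[i, 'Column1'] = ",".join(tmp_val)
This loops through each row of the DataFrame and cleans the cell in Column1 with string replacements that remove the characters you don't want (square brackets and single quotes). Then .split(',') creates the list of items, each item is trimmed, and the last line joins them into a comma-separated string and writes it back with df.at, because assigning to row inside iterrows() only changes a copy of the row.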
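The character removal can also be applied to the whole column at once with the pandas string accessor. The following is a sketch rather than a drop-in solution: it assumes the items themselves never contain spaces, brackets or single quotes, since the pattern removes those characters everywhere in the string:

```python
import pandas as pd

df = pd.DataFrame({
    "Column1": ["['jjhjh', 'adads','adsd']", "['adads','adsd']"],
    "Column2": ["dwdwdqw", "dwdwdqw"],
})

# Remove square brackets, single quotes and whitespace in one vectorized pass
df["Column1"] = df["Column1"].str.replace(r"[\[\]'\s]", "", regex=True)
print(df)
```

What remains in each cell is the items separated by commas, so no split or join step is needed.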