I want to replace several substrings at once, but nothing is replaced when the text to match contains a special character such as [ or ( or : or - . What should I do?
My sample DataFrame is below:
df
col1
0 ( red ) apple
1 [ 20220901 ] autumn
2 - gotohome
3 sample : salt bread
And I want to get the result below:
df
col1
0 red
apple
1 20220901
autumn
2 gotohome
3 sample
salt bread
My attempt is below, but it's not working.
change_word = {
'( red )' : 'red\n',
'[ 20220901 ]' : '20220901\n',
'- ' : '',
':' : '\n'
}
regex = r'\b(?:' + r'|'.join(change_word.keys()) + r')\b'
df["col1"] = df["col1"].str.replace(regex, lambda m: change_word[m.group()], regex=True)
CodePudding user response:
You could use something like this:
import re
badchars = '()[]\t-:'
df2 = (df['col1']
       .str.strip(badchars + ' ')  # strip unwanted chars at extremities
       .str.split(rf'\s*[{re.escape(badchars)}]+\s*')  # split on runs of badchars and surrounding spaces
       .explode().to_frame()  # explode as new rows
      )
Output:
col1
0 red
0 apple
1 20220901
1 autumn
2 gotohome
3 sample
3 salt bread
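As for why the attempt in the question matches nothing: the dictionary keys are joined into the pattern unescaped, so ( red ) is read as a capture group and [ 20220901 ] as a character class, and \b cannot match next to ( or [ at the edge of the string because there is no word character on either side. A minimal sketch of the dictionary approach with re.escape and without \b follows. It uses plain re so that it runs without pandas; the helper name replace_all and the choice to include the surrounding spaces in the keys are assumptions made for this illustration, not part of the original question.

```python
import re

# Keys are literal substrings, not regex patterns. Including the surrounding
# spaces in the keys is an assumption made here to get clean output.
change_word = {
    "( red ) ": "red\n",
    "[ 20220901 ] ": "20220901\n",
    "- ": "",
    " : ": "\n",
}

# re.escape makes (, ), [, ] and - match literally. Longer keys go first so
# that a short key cannot pre-empt a longer one starting at the same position.
keys = sorted(change_word, key=len, reverse=True)
pattern = re.compile("|".join(re.escape(k) for k in keys))


def replace_all(text: str) -> str:
    """Replace every dictionary key found in text with its mapped value."""
    return pattern.sub(lambda m: change_word[m.group()], text)


rows = [
    "( red ) apple",
    "[ 20220901 ] autumn",
    "- gotohome",
    "sample : salt bread",
]
result = [replace_all(row) for row in rows]
print(result)
```

With pandas, the same pattern string and callable should work on the column directly: df["col1"].str.replace(pattern.pattern, lambda m: change_word[m.group()], regex=True). This keeps the newline inside each cell, whereas the split/explode approach above produces one row per piece.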