I'm trying to split data on a multi-character delimiter. Below are a couple of lines from the data.
data = pandas.read_csv('data.csv')
data
001@|@02@|@ABC@|@IND123
002@|@02@|@ABC@|@IND223
003@|@02@|@ABC@|@IND333
004@|@02@|@ABC@|@IND443
I am trying the code below:
res = re.split('@|@',data)
print(res)
001@|@02@|@ABC@|@IND123
['001', '|', '02', '|', 'ABC', '|', 'IND123']
Please suggest.
CodePudding user response:
You should escape the pipe because in regex, x|y means x or y, so your regex splits your data at @ or @. You can also specify the regex separator as the sep argument to pd.read_csv and have pandas split your data correctly as it reads it.
pd.read_csv('data.csv', header = None, sep='@\\|@', engine = 'python')
   0  1    2       3
0  1  2  ABC  IND123
1  2  2  ABC  IND223
2  3  2  ABC  IND333
3  4  2  ABC  IND443
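If you would rather keep the re.split approach from the question, the same escaping applies. Here is a minimal sketch, assuming you split one line (a plain string) at a time; the variable names are only for illustration:

```python
import re

# One sample line taken from the question's data.
line = '001@|@02@|@ABC@|@IND123'

# Unescaped: '|' is alternation, so the pattern means "@ or @"
# and the pipes are left behind as separate fields.
fields_unescaped = re.split('@|@', line)
print(fields_unescaped)  # ['001', '|', '02', '|', 'ABC', '|', 'IND123']

# Escaped: '\|' matches a literal pipe, so the whole '@|@' is the delimiter.
fields_regex = re.split(r'@\|@', line)
print(fields_regex)  # ['001', '02', 'ABC', 'IND123']

# re.escape builds the escaped pattern from the literal delimiter for you.
fields_escape = re.split(re.escape('@|@'), line)

# For a fixed delimiter no regex is needed at all.
fields_plain = line.split('@|@')
```

Since the delimiter here is a fixed string rather than a pattern, str.split is probably the simplest option when you are working with individual lines.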