sorting for multi-character

Time:08-27

I'm trying to split rows on a multi-character delimiter (@|@). Below are a couple of lines from the data.

data = pandas.read_csv('data.csv')
data

001@|@02@|@ABC@|@IND123
002@|@02@|@ABC@|@IND223
003@|@02@|@ABC@|@IND333
004@|@02@|@ABC@|@IND443

I tried the code below:

res = re.split('@|@',data)
print(res)

001@|@02@|@ABC@|@IND123

['001', '|', '02', '|', 'ABC', '|', 'IND123']

Please suggest.

CodePudding user response:

You should escape the pipe. In a regex, x|y means "x or y", so the pattern @|@ means "@ or @" and splits your data at every @, which leaves each | behind as its own element. You can also pass the escaped regex as the sep argument to pd.read_csv and have pandas split the data correctly as it reads it. A regex separator requires engine='python'.

import pandas as pd
pd.read_csv('data.csv', header=None, sep=r'@\|@', engine='python')

    0   1   2   3
0   1   2   ABC IND123
1   2   2   ABC IND223
2   3   2   ABC IND333
3   4   2   ABC IND443
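As a rough sketch of both points, the snippet below applies the escaped pattern with re.split to one line, and then reads the same rows with pandas. The sample rows are copied from the question and fed in through io.StringIO instead of data.csv. The dtype=str argument is an addition that is not part of the answer above; it is assumed here only because the output shows 001 and 02 being read as the numbers 1 and 2.

```python
import io
import re

import pandas as pd

# Sample rows copied from the question (stand-in for data.csv).
raw = """001@|@02@|@ABC@|@IND123
002@|@02@|@ABC@|@IND223
"""

first_line = raw.splitlines()[0]

# Escaping the pipe makes the pattern match the literal text "@|@".
fields = re.split(r'@\|@', first_line)

# re.escape builds an equivalent pattern from the literal delimiter.
fields_escaped = re.split(re.escape('@|@'), first_line)

# dtype=str is an assumption added here: it keeps values such as "001" as text
# instead of letting pandas convert them to integers.
df = pd.read_csv(
    io.StringIO(raw),
    header=None,
    sep=r'@\|@',
    engine='python',
    dtype=str,
)

print(fields)
print(df)
```

If the leading zeros do not matter, dtype=str can be dropped and pandas will infer numeric columns as in the output above.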