Python complicated string needs regex

Time:10-19

I have a response from an API request that looks like this:

'224014@@@1;1=8.4=0;2=33=0;3=9.4=0@@@2;1=15=0;2=3.3=1;3=4.2=0;4=5.7=0;5=9.4=0;6=22=0@@@3;1=17=0;2=7.4=0;3=27=0@@@4;1=14=0;2=7.8=0;3=5.9=0;4=23=0;5=4.0=1'

I have split it up below for easier reading, with some explanation:

[1] The six-digit string at the start is the update time.
[2] Sections are separated by '@@@X', where X is the race number.
[3] For each race (after '@@@X'), there is one pattern per horse.
[4] For each horse, Horse_No, Odd and status are inside the pattern (e.g. 1=8.4=0), joined by '='.
[5] The number of races and the number of horses per race are not fixed (there may be more or fewer).

(UPDATE TIME)'224014
(Race 1)@@@1;1=8.4=0;2=33=0;3=9.4=0
(Race 2)@@@2;1=15=0;2=3.3=1;3=4.2=0;4=5.7=0;5=9.4=0;6=22=0
(Race 3)@@@3;1=17=0;2=7.4=0;3=27=0
(Race 4)@@@4;1=14=0;2=7.8=0;3=5.9=0;4=23=0;5=4.0=1'

Expected output using Python (I guess regex is necessary):

[
 {'Race_No':1,'Horse_No':1,"Odd":8.4,'status':0,'updatetime':224014},
 {'Race_No':1,'Horse_No':2,"Odd":33,'status':0,'updatetime':224014},
 {'Race_No':1,'Horse_No':3,"Odd":9.4,'status':0,'updatetime':224014},

 {'Race_No':2,'Horse_No':1,"Odd":15,'status':0,'updatetime':224014},
 {'Race_No':2,'Horse_No':2,"Odd":3.3,'status':1,'updatetime':224014},
 {'Race_No':2,'Horse_No':3,"Odd":4.2,'status':0,'updatetime':224014},
 {'Race_No':2,'Horse_No':4,"Odd":5.7,'status':0,'updatetime':224014},
 {'Race_No':2,'Horse_No':5,"Odd":9.4,'status':0,'updatetime':224014},
 {'Race_No':2,'Horse_No':6,"Odd":22,'status':0,'updatetime':224014},

 {'Race_No':3,'Horse_No':1,"Odd":17,'status':0,'updatetime':224014},
 {'Race_No':3,'Horse_No':2,"Odd":7.4,'status':0,'updatetime':224014},
 {'Race_No':3,'Horse_No':3,"Odd":27,'status':0,'updatetime':224014},

 {'Race_No':4,'Horse_No':1,"Odd":14,'status':0,'updatetime':224014},
 {'Race_No':4,'Horse_No':2,"Odd":7.8,'status':0,'updatetime':224014},
 {'Race_No':4,'Horse_No':3,"Odd":5.9,'status':0,'updatetime':224014},
 {'Race_No':4,'Horse_No':4,"Odd":23,'status':0,'updatetime':224014},
 {'Race_No':4,'Horse_No':5,"Odd":4.0,'status':1,'updatetime':224014} 
]

CodePudding user response:

You can do this with re and str.split, working from the line-per-race version shown in the question:

import re

data = """224014
(Race 1)@@@1;1=8.4=0;2=33=0;3=9.4=0
(Race 2)@@@2;1=15=0;2=3.3=1;3=4.2=0;4=5.7=0;5=9.4=0;6=22=0
(Race 3)@@@3;1=17=0;2=7.4=0;3=27=0
(Race 4)@@@4;1=14=0;2=7.8=0;3=5.9=0;4=23=0;5=4.0=1"""

data_list = data.split('\n')
updatetime = int(data_list[0])  # first line is the update time
result = []
for s in data_list[1:]:
    # Gives ['', race_no, 'h=odd=st;h=odd=st;...']
    split_val = re.split(r'.*@@@(\d+);', s)
    d = {'Race_No': int(split_val[1]), 'updatetime': updatetime}
    for i in split_val[2].split(';'):
        # Each horse entry is Horse_No=Odd=status
        h_no, odd, st = i.split('=')
        tmp = {'Horse_No': int(h_no), 'Odd': float(odd), 'status': int(st)}
        tmp.update(d)
        result.append(tmp)

Result:

[{'Horse_No': 1, 'Odd': 8.4, 'status': 0, 'Race_No': 1, 'updatetime': 224014},
 {'Horse_No': 2, 'Odd': 33.0, 'status': 0, 'Race_No': 1, 'updatetime': 224014},
 {'Horse_No': 3, 'Odd': 9.4, 'status': 0, 'Race_No': 1, 'updatetime': 224014},
 {'Horse_No': 1, 'Odd': 15.0, 'status': 0, 'Race_No': 2, 'updatetime': 224014},
 {'Horse_No': 2, 'Odd': 3.3, 'status': 1, 'Race_No': 2, 'updatetime': 224014},
 {'Horse_No': 3, 'Odd': 4.2, 'status': 0, 'Race_No': 2, 'updatetime': 224014},
 {'Horse_No': 4, 'Odd': 5.7, 'status': 0, 'Race_No': 2, 'updatetime': 224014},
 {'Horse_No': 5, 'Odd': 9.4, 'status': 0, 'Race_No': 2, 'updatetime': 224014},
 {'Horse_No': 6, 'Odd': 22.0, 'status': 0, 'Race_No': 2, 'updatetime': 224014},
 {'Horse_No': 1, 'Odd': 17.0, 'status': 0, 'Race_No': 3, 'updatetime': 224014},
 {'Horse_No': 2, 'Odd': 7.4, 'status': 0, 'Race_No': 3, 'updatetime': 224014},
 {'Horse_No': 3, 'Odd': 27.0, 'status': 0, 'Race_No': 3, 'updatetime': 224014},
 {'Horse_No': 1, 'Odd': 14.0, 'status': 0, 'Race_No': 4, 'updatetime': 224014},
 {'Horse_No': 2, 'Odd': 7.8, 'status': 0, 'Race_No': 4, 'updatetime': 224014},
 {'Horse_No': 3, 'Odd': 5.9, 'status': 0, 'Race_No': 4, 'updatetime': 224014},
 {'Horse_No': 4, 'Odd': 23.0, 'status': 0, 'Race_No': 4, 'updatetime': 224014},
 {'Horse_No': 5, 'Odd': 4.0, 'status': 1, 'Race_No': 4, 'updatetime': 224014}]
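The snippet above parses the annotated, one-race-per-line text. If the input is the raw single-line response from the question, a regex-only variant might look like the sketch below. The names RACE_RE, HORSE_RE and parse_odds are made up for illustration, and the patterns are only inferred from the sample string, so treat this as a starting point rather than a validated parser.

```python
import re

raw = ('224014@@@1;1=8.4=0;2=33=0;3=9.4=0'
       '@@@2;1=15=0;2=3.3=1;3=4.2=0;4=5.7=0;5=9.4=0;6=22=0'
       '@@@3;1=17=0;2=7.4=0;3=27=0'
       '@@@4;1=14=0;2=7.8=0;3=5.9=0;4=23=0;5=4.0=1')

# Hypothetical patterns, inferred from the sample data only.
# A race is '@@@<race_no>' followed by any number of ';<entry>' items.
RACE_RE = re.compile(r'@@@(\d+)((?:;[^;@]+)*)')
# A horse entry is ';<horse_no>=<odd>=<status>'.
HORSE_RE = re.compile(r';(\d+)=([\d.]+)=(\d+)')


def parse_odds(text):
    """Return one dict per horse from the raw '@@@'-delimited string."""
    # Everything before the first '@@@' is the update time.
    updatetime = int(text.split('@@@', 1)[0])
    rows = []
    for race in RACE_RE.finditer(text):
        race_no = int(race.group(1))
        for horse in HORSE_RE.finditer(race.group(2)):
            rows.append({
                'Race_No': race_no,
                'Horse_No': int(horse.group(1)),
                'Odd': float(horse.group(2)),
                'status': int(horse.group(3)),
                'updatetime': updatetime,
            })
    return rows


rows = parse_odds(raw)
print(len(rows))  # 17
print(rows[0])
```

Entries that do not match HORSE_RE are silently skipped here, so if malformed input is a concern it may be safer to compare the number of matches against the number of ';' separators.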

CodePudding user response:

I would do this without regular expressions:

s = "010013@@@1;1=8.5=0;2=35=0;3=9.6=0;4=3.4=1;5=21=0;6=9.8=0;7=13=0;8=10=0;9=14=0;10=6.9=0;11=19=0;12=11=0@@@2;1=16=0;2=3.3=1;3=4.4=0;4=5.6=0;5=8.9=0;6=22=0;7=40=0;8=12=0;9=9.7=0;10=23=0;11=24=0@@@3;1=17=0;2=7.4=0;3=29=0;4=9.8=0;5=9.2=0;6=5.2=0;7=8.3=0;8=18=0;9=2.7=1;10=19=0@@@4;1=13=0;2=8.0=0;3=6.0=0;4=25=0;5=4.3=1;6=8.8=0;7=37=0;8=12=0;9=6.4=0;10=9.3=0;11=34=0;12=15=0@@@5;1=19=0;2=5.4=0;3=16=0;4=27=0;5=11=0;6=5.1=0;7=12=0;8=19=0;9=4.0=1;10=6.4=0;11=36=0;12=25=0@@@6;1=15=0;2=9.6=0;3=7.9=0;4=24=0;5=16=0;6=4.1=1;7=31=0;8=4.2=0;9=12=0;10=18=0;11=28=0;12=7.2=0@@@7;1=14=0;2=7.7=0;3=10=0;4=11=0;5=12=0;6=7.8=0;7=14=0;8=14=0;9=14=0;10=5.2=1;11=9.0=0;12=8.7=0@@@8;1=4.2=1;2=5.8=0;3=14=0;4=7.1=0;5=15=0;6=7.1=0;7=27=0;8=22=0;9=10=0;10=20=0;11=14=0;12=10=0"

result = []
updatetime, *races = s.split("@@@")
updatetime = int(updatetime)
for race in races:
    raceid, *horses = race.split(";")
    raceid = int(raceid)
    for horse in horses:
        horseid, odd, state = horse.split("=")
        result.append({
            'Horse_No': int(horseid), 
            'Odd': float(odd), 
            'status': int(state),
            'Race_No': raceid, 
            'updatetime': updatetime
        })
print(result)
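One thing to watch with int(updatetime): a value such as "010013" becomes 10013, so the leading zero is lost. If the six digits are a clock time in HHMMSS order (an assumption; the question only calls it the update time), it may be preferable to keep the string or parse it into a time object. A small sketch, with parse_update_time as a made-up helper name:

```python
from datetime import datetime, time


def parse_update_time(stamp: str) -> time:
    # Assumption: stamp is exactly six digits in HHMMSS order.
    return datetime.strptime(stamp, "%H%M%S").time()


# Take the part before the first '@@@', as in the answer above.
stamp = "010013@@@1;1=8.5=0".split("@@@", 1)[0]
print(int(stamp))                # 10013 (leading zero lost)
print(parse_update_time(stamp))  # 01:00:13
```

strptime raises ValueError for input that does not fit the format, which also acts as a basic sanity check on the prefix.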