Hello world!
I web-scraped some data, but it all got jammed into one place, so I'm trying to split a list of strings. Each string is made up of letters followed by numbers. I want to split each one at the point where a number appears and build a data table out of the result.
Imagine there is a list of strings:
string0 = 'string123'; string1 = 'a12'; string2 = 'bob69'; ...
Does anyone have any ideas on how I can do that?
CodePudding user response:
You can split using a regex made up of only a lookbehind and a lookahead (see the re documentation for reference):
import re
re.split(r'(?<=\D)(?=\d)', string0)
output: ['string', '123']
NB. If you want to split on any change from non-digit to digit and vice versa:
re.split(r'(?<=\D)(?=\d)|(?<=\d)(?=\D)', 'abc123abc123')
## OR
re.findall(r'(\D+|\d+)', 'abc123abc123')
output: ['abc', '123', 'abc', '123']
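To get from a single split to the table the question asks for, the same lookbehind/lookahead pattern can be applied to every string in the list. This is only a minimal sketch: the list name `strings` and the helper `split_at_first_digit` are made up for illustration, and it assumes Python 3.7 or later, where re.split splits on empty matches.

```python
import re

# The example strings from the question; the list name is hypothetical.
strings = ['string123', 'a12', 'bob69']

def split_at_first_digit(s):
    """Split s once, at the first boundary between a non-digit and a digit."""
    return re.split(r'(?<=\D)(?=\d)', s, maxsplit=1)

# One row per input string: [text part, number part]
rows = [split_at_first_digit(s) for s in strings]
print(rows)
# [['string', '123'], ['a', '12'], ['bob', '69']]
```

Using maxsplit=1 keeps each row at two columns at most, even if a string contains further letter-to-digit boundaries later on.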
CodePudding user response:
What about using regex, i.e. the re module in Python, combined with its split function? Something like this could work:
import re
string = 'string01string02string23string4string500string'
strlist = re.split(r'(\d+)', string)
print(strlist)
['string', '01', 'string', '02', 'string', '23', 'string', '4', 'string', '500', 'string']
In your case, I think you would then need to join each text element with the number that follows it, so something like this:
cmb = [i + j for i, j in zip(strlist[::2], strlist[1::2])]
print(cmb)
['string01', 'string02', 'string23', 'string4', 'string500']
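As a possible alternative to splitting and then zipping, re.findall with two capture groups returns the (text, number) pairs directly. This is just a sketch under the same assumption as above, namely that every entry is a run of non-digits followed by a run of digits; the variable names are made up for illustration.

```python
import re

text = 'string01string02string23string4string500string'

# Each match is a (non-digits, digits) tuple; the trailing 'string'
# has no digits after it, so it is not matched at all.
pairs = re.findall(r'(\D+)(\d+)', text)
print(pairs)
# [('string', '01'), ('string', '02'), ('string', '23'), ('string', '4'), ('string', '500')]

# Join each pair back together if the combined form is needed.
joined = [letters + digits for letters, digits in pairs]
print(joined)
# ['string01', 'string02', 'string23', 'string4', 'string500']
```

Keeping the pairs as tuples may be more convenient if the text and the number should end up in separate columns.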
CodePudding user response:
You should look into regular expressions (regex), which Python provides through its built-in re module. Regex is a powerful tool for solving problems like this.
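For example, here is a rough sketch of how the re module could turn each string into a table row. The pattern, the group names `name` and `number`, and the helper `to_row` are all assumptions made for illustration, and the extra input 'nodigits' is only there to show what happens when a string does not fit the pattern.

```python
import re

# Hypothetical pattern: one or more non-digits followed by one or more digits.
PATTERN = re.compile(r'(?P<name>\D+)(?P<number>\d+)')

def to_row(s):
    """Return a dict for one table row, or None if s does not fit the pattern."""
    m = PATTERN.fullmatch(s)
    if m is None:
        return None
    return {'name': m.group('name'), 'number': int(m.group('number'))}

table = [to_row(s) for s in ['string123', 'a12', 'bob69', 'nodigits']]
print(table)
# [{'name': 'string', 'number': 123}, {'name': 'a', 'number': 12}, {'name': 'bob', 'number': 69}, None]
```

A list of dicts like this can be written out with csv.DictWriter or passed to a DataFrame constructor once the None entries are filtered out.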