I have a set of data where each line has the form: name, then features. The features are numbered as follows:
'A 1:0.5 2:0.4 3:0.6 4:0.2 5:0.1 6:0.6
B 1:0.0 2:0.9 3:1.0 4:0.3 5:0.1 6:0.3'
I would ideally like to have a list only with the features (as floats), for instance, for A:
mylist = [0.5,0.4,0.6,0.2,0.1,0.6]
After using split and removing the name, I am left with a list of strings that contains the "index" of the feature. This:
myliststr = ['1:0.5','2:0.4','3:0.6','4:0.2','5:0.1','6:0.6']
How can I effectively remove these "indexes" from the strings before converting them to floats?
CodePudding user response:
You can use a list comprehension with split:
myliststr = ['1:0.5','2:0.4','3:0.6','4:0.2','5:0.1','6:0.6']
output = [float(x.split(':')[1]) for x in myliststr]
print(output) # [0.5, 0.4, 0.6, 0.2, 0.1, 0.6]
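If the indexes are not guaranteed to arrive in order, one option is to sort by the integer index before dropping it. This is only a sketch: the sample data in the question is already ordered, and the out-of-order input below is made up for illustration.

```python
# Hypothetical out-of-order input (not from the question's data)
myliststr = ['3:0.6', '1:0.5', '2:0.4']

# Split each item once on the first colon into (index, value)
pairs = (x.split(':', 1) for x in myliststr)

# Sort numerically by index, then keep only the values
ordered = sorted((int(i), float(v)) for i, v in pairs)
output = [v for _, v in ordered]
print(output)  # [0.5, 0.4, 0.6]
```

Converting the index with int matters here: sorting the raw strings would put '10' before '2'.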
Likewise, you can process the original text as follows, by combining list comprehension and split appropriately:
s = '''A 1:0.5 2:0.4 3:0.6 4:0.2 5:0.1 6:0.6
B 1:0.0 2:0.9 3:1.0 4:0.3 5:0.1 6:0.3'''
output = [[float(x.split(':')[1]) for x in line.split()[1:]] for line in s.splitlines()]
print(output) # [[0.5, 0.4, 0.6, 0.2, 0.1, 0.6], [0.0, 0.9, 1.0, 0.3, 0.1, 0.3]]
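If you also want to keep track of which row each list belongs to, a possible variation is to build a dict keyed by name instead of discarding it. The function name parse_features is just an illustrative choice, and the sketch assumes every line starts with a name followed by index:value pairs.

```python
def parse_features(text):
    """Map each row name to its list of feature values (as floats)."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:  # skip blank lines
            continue
        name, *pairs = parts
        # partition(':') splits on the first colon, e.g. ('1', ':', '0.5')
        result[name] = [float(p.partition(':')[2]) for p in pairs]
    return result

s = '''A 1:0.5 2:0.4 3:0.6 4:0.2 5:0.1 6:0.6
B 1:0.0 2:0.9 3:1.0 4:0.3 5:0.1 6:0.3'''
features = parse_features(s)
print(features['A'])  # [0.5, 0.4, 0.6, 0.2, 0.1, 0.6]
```

With this, features['B'] gives the list for B without relying on the order of the rows.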