I have a file, say file1.txt, which looks something like this:
27,28,29,30,1,0.67
31,32,33,34,1,0.84
35,36,37,38,1,0.45
39,40,41,42,1,0.82
43,44,45,46,1,0.92
43,44,45,46,1,0.92
51,52,53,54,2,0.28
55,56,57,58,2,0.77
59,60,61,62,2,0.39
63,64,65,66,2,0.41
75,76,77,78,3,0.51
90,91,92,93,3,0.97
The last column is the fitness and the second-to-last column is the class. I read the file like this:
rule_file_name = 'file1.txt'
list1 = []
# the with block closes the file once reading is done
with open(rule_file_name) as rule_fp:
    for line in rule_fp:
        list1.append(line.replace("\n", "").split(","))
Then I create a defaultdict so that the rows are grouped by class:
from collections import defaultdict
classes = defaultdict(list)
for _list in list1:
    # the class is the second-to-last column
    classes[_list[-2]].append(_list)
Then the rows are paired up within each class using the logic below:
import random
from operator import itemgetter
random.seed(1)
for key, _list in classes.items():
    # sort by fitness (last column), highest first
    _list = sorted(_list, key=itemgetter(-1), reverse=True)
    length = len(_list)
    middle_index = length // 2
    first_half = _list[:middle_index]
    second_half = _list[middle_index:]
    # pair the i-th row of the top half with the i-th row of the bottom half
    result = list(zip(first_half, second_half))
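To make the pairing step concrete, here is a minimal sketch run on the six class-1 rows from the sample. The helper name `pair_by_fitness` is hypothetical, and it sorts on the fitness converted to float, which is an assumption (the code above compares the strings directly). Note that `zip` stops at the shorter half, so a class with an odd number of rows leaves one row unpaired.

```python
def pair_by_fitness(rows):
    """Sort rows by fitness (last column, descending) and pair the
    top half with the bottom half. Hypothetical helper for illustration."""
    ordered = sorted(rows, key=lambda r: float(r[-1]), reverse=True)
    middle = len(ordered) // 2
    # zip stops at the shorter half, so an odd row count drops one row
    return list(zip(ordered[:middle], ordered[middle:]))

class1 = [
    ['27', '28', '29', '30', '1', '0.67'],
    ['31', '32', '33', '34', '1', '0.84'],
    ['35', '36', '37', '38', '1', '0.45'],
    ['39', '40', '41', '42', '1', '0.82'],
    ['43', '44', '45', '46', '1', '0.92'],
    ['43', '44', '45', '46', '1', '0.92'],
]

pairs = pair_by_fitness(class1)
for first, second in pairs:
    print(first, second)
```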
Next, a third row is created from the two rows of each pair:
ans=[[random.choice(choices) for choices in zip(*item)] for item in result]
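The comprehension above amounts to a uniform crossover: `zip(*item)` walks the two parent rows column by column, and `random.choice` keeps one of the two values in each column. A small sketch on a single pair, using a hypothetical helper named `crossover`. Because the class and fitness columns are also picked this way, the child's fitness is copied from one parent rather than recomputed.

```python
import random

def crossover(pair):
    """Build one child row by picking, per column, the value of either parent.
    Hypothetical helper for illustration."""
    return [random.choice(values) for values in zip(*pair)]

parent_a = ['43', '44', '45', '46', '1', '0.92']
parent_b = ['39', '40', '41', '42', '1', '0.82']

random.seed(1)
child = crossover((parent_a, parent_b))
print(child)  # a flat list of six strings, one per column
```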
So if file1.txt initially has 12 rows, they form 6 pairs, and 6 new rows are created. I simply want to append those new rows to the original ones and write everything out, using the logic below:
list1.append(ans)
print(ans)
with open("output.txt", 'w') as out:
    new_rules = [list(map(str, i)) for i in list1]
    for item in new_rules:
        out.write("{}\n".format(",".join(item)))
        #out.write("{}\n".format(item))
But now my output.txt looks like:
27,28,29,30,1,0.67
31,32,33,34,1,0.84
35,36,37,38,1,0.45
39,40,41,42,1,0.82
43,44,45,46,1,0.92
43,44,45,46,1,0.92
51,52,53,54,2,0.28
55,56,57,58,2,0.77
59,60,61,62,2,0.39
63,64,65,66,2,0.41
75,76,77,78,3,0.51
90,91,92,93,3,0.97
['43', '44', '41', '46', '1', '0.82'],['27', '28', '45', '46', '1', '0.92'],['35', '36', '33', '38', '1', '0.84']
['55', '60', '57', '58', '2', '0.77'],['51', '64', '53', '66', '2', '0.28']
['75', '91', '77', '93', '3', '0.51']
But my desired outcome is:
27,28,29,30,1,0.67
31,32,33,34,1,0.84
35,36,37,38,1,0.45
39,40,41,42,1,0.82
43,44,45,46,1,0.92
43,44,45,46,1,0.92
51,52,53,54,2,0.28
55,56,57,58,2,0.77
59,60,61,62,2,0.39
63,64,65,66,2,0.41
75,76,77,78,3,0.51
90,91,92,93,3,0.97
43,44,41,46,1,0.82
27,28,45,46,1,0.92
35,36,33,38,1,0.84
55,60,57,58,2,0.77
51,64,53,66,2,0.28
75,91,77,93,3,0.51
CodePudding user response:
I would use NumPy; it is flexible and compact.
import numpy as np
fin = 'file1.txt'
col1, col2, col3, col4, jclass, fitness = np.loadtxt(fin, unpack=True, delimiter=',')
rows = np.column_stack((col1, col2, col3, col4, jclass, fitness))
print(rows[0])
print(rows[-1])
print(fitness)
Then apply your logic to the rows array. Note that np.loadtxt returns floats by default, so the values come back as 27.0 rather than 27.
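Independently of NumPy, the bracketed lines in output.txt appear to come from `list1.append(ans)`. Since `ans` is a list of rows, `append` adds it as a single element, and the later `",".join(...)` then joins the string form of whole rows. Using `extend` adds each new row separately. Below is a minimal sketch of that fix, assuming the rest of the logic stays as in the question. The names `add_children` and `write_rows` are hypothetical, the sample rows are inlined instead of read from file1.txt, and sorting on `float` fitness is an assumption.

```python
import random
from collections import defaultdict

def add_children(rows, seed=1):
    """Return the original rows followed by one crossover child per pair."""
    random.seed(seed)
    classes = defaultdict(list)
    for row in rows:
        classes[row[-2]].append(row)  # group by class column

    out = list(rows)
    for members in classes.values():
        members = sorted(members, key=lambda r: float(r[-1]), reverse=True)
        middle = len(members) // 2
        pairs = zip(members[:middle], members[middle:])
        children = [[random.choice(c) for c in zip(*pair)] for pair in pairs]
        out.extend(children)  # extend, not append: one row per child
    return out

def write_rows(rows, path):
    """Write each row as one comma-separated line."""
    with open(path, "w") as out:
        for row in rows:
            out.write(",".join(map(str, row)) + "\n")

sample = [line.split(",") for line in """27,28,29,30,1,0.67
31,32,33,34,1,0.84
35,36,37,38,1,0.45
39,40,41,42,1,0.82
43,44,45,46,1,0.92
43,44,45,46,1,0.92
51,52,53,54,2,0.28
55,56,57,58,2,0.77
59,60,61,62,2,0.39
63,64,65,66,2,0.41
75,76,77,78,3,0.51
90,91,92,93,3,0.97""".splitlines()]

all_rows = add_children(sample)
print(len(all_rows))  # 12 original rows + 6 children = 18
# write_rows(all_rows, "output.txt")  # uncomment to write the file
```

With `extend`, every element of the combined list is a flat row of six values, so each child ends up on its own comma-separated line in the output.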