I have a list of lists. Each row is a list with numbers. These numbers may repeat in the next rows.
I'd like to find a single number for each row which won't repeat in the rest of the list.
I saved the list in a .csv file to make it more portable. Here are the first rows of it.
10000241;10006041;102458567;102463076;102465209;102468399;102471447;;;;;;;;;;
10000241;10006041;102457597;102458567;102459006;102463076;102471447;;;;;;;;;;
10000241;10000311;10059021;102456340;102458959;102460803;102464618;102465620;;;;;;;;;
10000241;10000311;102459290;102464008;102464618;102467881;102468156;;;;;;;;;;
10000241;10000311;102457895;102458959;102459289;102459290;102461512;102464618;102468503;;;;;;;;
1000021;10000241;102457597;102458567;102466421;102466422;102475670;;;;;;;;;;
10000241;102468922;102470951;102471518;;;;;;;;;;;;;
10000241;102457537;102458526;102460609;102461735;102462564;102465464;102470554;102470715;;;;;;;;
So, for example, the first value of the first row (10000241) already appears in the second row, so this value shouldn't be chosen. I can see that the value 102465209 doesn't repeat in the next rows, so this value should be chosen. The same operation should be applied to every subsequent row.
The result should look like this:
102465209
102471447
...
I can see that there should be some type of iterator going through every element and every row checking for repetitions but I can't quite get the solution.
There should be exactly one value for each row; if a row has no such value, a warning message should be shown.
CodePudding user response:
Watchatcha.
You can try something like this, which collects every distinct value from all rows into a set:
lines_to_read = '''10000241;10006041;102458567;102463076;102465209;102468399;102471447;;;;;;;;;;
10000241;10006041;102457597;102458567;102459006;102463076;102471447;;;;;;;;;;
10000241;10000311;10059021;102456340;102458959;102460803;102464618;102465620;;;;;;;;;
10000241;10000311;102459290;102464008;102464618;102467881;102468156;;;;;;;;;;
10000241;10000311;102457895;102458959;102459289;102459290;102461512;102464618;102468503;;;;;;;;
1000021;10000241;102457597;102458567;102466421;102466422;102475670;;;;;;;;;;
10000241;102468922;102470951;102471518;;;;;;;;;;;;;
10000241;102457537;102458526;102460609;102461735;102462564;102465464;102470554;102470715;;;;;;;;'''
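# Split the text into rows, then split each row into its cells.
lines_to_read = lines_to_read.splitlines()
list_to_fill = []
for x in lines_to_read:
    list_to_fill.append(x.split(';'))
# Collect every non-empty cell from every row.
final_list = []
for element in list_to_fill:
    for item in element:
        if item:
            final_list.append(item)
# Converting to a set removes duplicates, leaving each distinct value once.
final_list = set(final_list)
print(final_list)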
If you already have a list of lists, you can skip the parsing steps and use it directly in place of list_to_fill.
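If the rows are still in the .csv file rather than in a string, they could be loaded with the standard csv module. This is only a sketch: read_rows is a made-up helper name, and io.StringIO stands in for the real file (whose name the question does not give), using two of the sample rows from the question.

```python
import csv
import io


def read_rows(handle):
    """Parse semicolon-separated lines into lists of ints, dropping empty cells."""
    reader = csv.reader(handle, delimiter=";")
    return [[int(cell) for cell in record if cell] for record in reader]


# io.StringIO stands in for something like open(<your file>, newline="").
sample = io.StringIO(
    "10000241;102468922;102470951;102471518;;;;;;;;;;;;;\n"
    "10000241;102457537;102458526;102460609;102461735;102462564;"
    "102465464;102470554;102470715;;;;;;;;\n"
)
rows = read_rows(sample)
print(rows)
```

Converting the cells to int is a choice made here for convenience; keeping them as strings works just as well for comparing values.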
Hope this was useful.
CodePudding user response:
You can use a Python set:
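l = [10000241, 10006041, 102458567, 102463076, 102465209, 102468399, 102471447]
l2 = [10000241, 10006041, 102457597, 102458567, 102459006, 102463076, 102471447]
# Values of l that do not appear in l2 (the order of a set is not guaranteed).
print(list(set(l) - set(l2)))
# e.g. [102465209, 102468399]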
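The same set idea can be stretched over the whole list of rows. The following is a minimal sketch, not a definitive solution: pick_unique_per_row is a hypothetical name, and it assumes that "doesn't repeat" means "does not occur in any later row" and that the first such value in a row is good enough. It walks from the last row upwards so the set of later values only has to be built once.

```python
import warnings


def pick_unique_per_row(rows):
    """For each row, pick the first value that appears in no later row.

    The result has one entry per row. An entry is None (and a warning is
    issued) when every value of that row shows up again further down.
    """
    picks = [None] * len(rows)
    seen_later = set()  # union of all rows below the current one
    # Walk from the last row upwards so seen_later grows incrementally.
    for index in range(len(rows) - 1, -1, -1):
        row = rows[index]
        candidates = [value for value in row if value not in seen_later]
        if candidates:
            picks[index] = candidates[0]
        else:
            warnings.warn(f"row {index}: every value repeats in a later row")
        seen_later.update(row)
    return picks


l = [10000241, 10006041, 102458567, 102463076, 102465209, 102468399, 102471447]
l2 = [10000241, 10006041, 102457597, 102458567, 102459006, 102463076, 102471447]
print(pick_unique_per_row([l, l2]))
# prints [102465209, 10000241]
```

The last row has no rows after it, so its first value always qualifies. On the full file the picks can differ from the ones obtained on a short excerpt, because values may repeat in rows that are not shown here.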