I have a list of triples,
posPools = [[13, 14, 15], [17, 19, 20], [16, 20, 22], [15, 16, 24], [15, 22, 23], [13, 20, 23], [15, 18, 19], [10, 15, 22], [7, 8, 9], [8, 10, 17], [15, 16, 17], [10, 15, 16], [8, 9, 15], [15, 16, 22], [7, 8, 9], [1, 8, 11], [1, 2, 4], [3, 6, 7], [10, 3, 1], [2, 5, 8]]
Each list within posPools represents a contaminated pool containing water from 3 wells (x, y, z). Only one of the three wells needs to be contaminated for the entire pool to be contaminated. There are 24 wells in total.
I have a flat list of potentially contaminated wells, and I know for sure that any well # not in this list is not contaminated.
potPosWell = [4, 6, 8, 9, 15, 18, 20, 23]
If one pool, e.g. posPools[0] = [13,14,15] contains only one element in common with potPosWell, e.g. #15, then I know for sure that well #15 is contaminated, and I want to append 15 to a new list posWell.
However, if a pool, e.g. posPools[8] = [7,8,9], contains more than one element in common with potPosWell, e.g. #8 and #9, then I don't want to append either 8 or 9 to posWell. (That is because I wouldn't know for sure whether well #8 or well #9 contaminated the pool.)
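The rule above can be illustrated on the two example pools. This is only a minimal sketch: `suspects` and the helper `overlap` are hypothetical names introduced here, not part of my existing code.

```python
# Data copied from the question.
potPosWell = [4, 6, 8, 9, 15, 18, 20, 23]

# Assumption: a set makes the "in common" check a simple intersection.
suspects = set(potPosWell)

def overlap(pool):
    """Return the wells in `pool` that also appear in potPosWell (hypothetical helper)."""
    return suspects & set(pool)

# One shared well: that well is certainly contaminated.
print(overlap([13, 14, 15]))

# Two shared wells: it is impossible to tell which one contaminated the pool.
print(overlap([7, 8, 9]))
```

A pool settles the question for a well only when its overlap has exactly one element.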
The output should be
posWell = [4,6,8,15,20]
I have no idea how to iterate over posPools to check whether each sub-list has only one element in common with potPosWell and then append that element to a new list posWell.
My working idea: First, find which triples in posPools share exactly one element with potPosWell and append those triples to a new nested list. Second, determine which element within each triple also appears in potPosWell and append it to posWell. Third, eliminate duplicate elements from posWell.
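The three steps of that working idea could be sketched roughly as follows. The names `singlePools` and `matches` are made up for illustration, and sorting the final result is an assumption made only so that it prints in the same order as the expected output.

```python
# Data copied from the question.
posPools = [[13, 14, 15], [17, 19, 20], [16, 20, 22], [15, 16, 24],
            [15, 22, 23], [13, 20, 23], [15, 18, 19], [10, 15, 22],
            [7, 8, 9], [8, 10, 17], [15, 16, 17], [10, 15, 16],
            [8, 9, 15], [15, 16, 22], [7, 8, 9], [1, 8, 11],
            [1, 2, 4], [3, 6, 7], [10, 3, 1], [2, 5, 8]]
potPosWell = [4, 6, 8, 9, 15, 18, 20, 23]

# Step 1: keep the triples that share exactly one well with potPosWell.
singlePools = [pool for pool in posPools
               if sum(well in potPosWell for well in pool) == 1]

# Step 2: from each kept triple, pull out the well that is in potPosWell.
matches = [well for pool in singlePools for well in pool if well in potPosWell]

# Step 3: drop duplicates (sorted only to get a stable, readable order).
posWell = sorted(set(matches))
print(posWell)  # [4, 6, 8, 15, 20]
```

Steps 1 and 2 each scan potPosWell as a list, which is fine at this size. For larger data, converting potPosWell to a set first would make each membership test cheaper.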
I'm not even sure if the data structures I have now are optimal for this sort of comparison.
These lists are small enough to do by hand, but it just clicked in my head that this would be something fun to program in Python. Alas, I barely knew enough Python to clean up the data to this stage. I would greatly appreciate it if someone could give me examples of how to implement what I'm thinking about.
*Edit: There is one pool in posPools, [10,3,1], that contains no wells from potPosWell. This means that it was a false positive when tested for contamination. None of the wells 10, 3, or 1 is actually contaminated, since none of them appears in potPosWell.
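Such false-positive pools could be picked out with a check along these lines. This is a sketch only: `suspects` and `falsePositives` are hypothetical names, and it relies on `set.isdisjoint`, which accepts any iterable.

```python
# Data copied from the question.
posPools = [[13, 14, 15], [17, 19, 20], [16, 20, 22], [15, 16, 24],
            [15, 22, 23], [13, 20, 23], [15, 18, 19], [10, 15, 22],
            [7, 8, 9], [8, 10, 17], [15, 16, 17], [10, 15, 16],
            [8, 9, 15], [15, 16, 22], [7, 8, 9], [1, 8, 11],
            [1, 2, 4], [3, 6, 7], [10, 3, 1], [2, 5, 8]]
potPosWell = [4, 6, 8, 9, 15, 18, 20, 23]

suspects = set(potPosWell)

# A pool with no well in potPosWell cannot really be contaminated.
falsePositives = [pool for pool in posPools if suspects.isdisjoint(pool)]
print(falsePositives)  # [[10, 3, 1]]
```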
CodePudding user response:
One way using set.intersection:
pot = set(potPosWell)
# sorted() gives a stable order; iteration order of a plain set is not guaranteed
sorted({i.pop() for pool in posPools if len(i := pot.intersection(pool)) == 1})
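If you are on a Python version older than 3.8 (no walrus operator):
pot = set(potPosWell)
res = set()  # a set, so a well found in several pools is kept only once
for pool in posPools:
    i = pot.intersection(pool)
    if len(i) == 1:
        # exactly one suspect well in this pool, so it must be the contaminated one
        res.add(i.pop())
res = sorted(res)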
Output:
[4, 6, 8, 15, 20]