I have a data set like this:
dfdict = {
'a' : 'd1',
'b' : 'd1',
'c' : 'd1',
'd' : 'd2',
'e' : 'd2',
'f' : 'd3',
'g' : 'd3'
}
df = pd.DataFrame(list(dfdict.items()), columns = ['staff', 'division'])
df
staff division
0 a d1
1 b d1
2 c d1
3 d d2
4 e d2
5 f d3
6 g d3
So there are staff in different divisions. How can I assign these staff to pairs of 2, let's say to form code review groups, with these conditions:
- Each pair has a primary and a secondary reviewer.
- Each staff will serve as the primary reviewer in 1 pair and secondary in another.
- A staff can only form a pair with another staff from a different division.
- Two staff can only form a pair once. That is, if there's already a pair with
a
as primary andd
as secondary, then whend
is primarya
can't be the secondary.
An outcome may look like this:
primary secondary
a d
b e
c g
d b
e f
f
g
I leave two last secondary values blank to describe a complication here. Up to the point where f
is primary, either c
or a
are eligible values for secondary. However, if a
is picked from the eligible pool, then the last pair is g
and c
, which violates the conditions as c
and g
is already a pair above.
CodePudding user response:
You may want to use some Constraint Programming tools. But since the constraints are very simple, we can handle them directly. The following code uses constraint propagation with a "Minimum Remaining Values" heuristic.
Credit: my code is heavily based on a Sudoku solver by Peter Norvig.
Warning: the code assumes staff codes are one character long: you will need to modify the len()
tests if you want longer codes or real names.
###Set constraints
staff_divs = {
'a' : 'd1',
'b' : 'd1',
'c' : 'd1',
'd' : 'd2',
'e' : 'd2',
'f' : 'd3',
'g' : 'd3'
}
pairings = {}
for pri,div in staff_divs.items():
pairings[pri] = ''.join([sec for sec, div2 in staff_divs.items() if div != div2])
### Constraint Propagation
def assign(pairings, pri, sec):
"""Eliminate all the other values (except sec) from pairings[pri] and propagate.
Return pairings, except return False if a contradiction is detected."""
other_pairings = pairings[pri].replace(sec, '')
if all(eliminate(pairings, pri, sec2) for sec2 in other_pairings):
return pairings
else:
return False
def eliminate(pairings, pri, sec):
"""Eliminate sec from pairings[pri]; propagate when only one left.
Return pairings, except return False if a contradiction is detected."""
if sec not in pairings[pri]:
return pairings ## Already eliminated
pairings[pri] = pairings[pri].replace(sec,'')
## If pri is reduced to one value sec2, then eliminate sec2 from the other pri's.
if len(pairings[pri]) == 0:
return False ## Contradiction: removed last value
elif len(pairings[pri]) == 1:
sec2 = pairings[pri]
if not all(eliminate(pairings, pri2, sec2) for pri2 in pairings if pri2 != pri):
return False
return pairings
### Search
def search(pairings):
"Using depth-first search and propagation, try all possible pairings."
if pairings is False:
return False ## Failed earlier
if all(len(sec) == 1 for sec in pairings.values()):
return pairings ## Solved!
## Chose the unfilled pri with the fewest possibilities
n,pri = min((len(pairings[pri]), pri) for pri in pairings if len(pairings[pri]) > 1)
return some(search(assign(pairings.copy(), pri, sec))
for sec in pairings[pri])
### Utilities
def some(seq):
"Return some element of seq that is true."
for e in seq:
if e: return e
return False
### Main
if __name__ == '__main__':
result = search(pairings)
for pri,sec in result.items():
print(f'{pri} : {sec}')
CodePudding user response:
Find a 2-factor of the graph
If you make a graph of a solution to your problem, where every staff member is connected to his two partners, you get a 2-regular graph. That means that every vertex will have degree 2, and the graph will consist of some number of disjoint simple cycles -- starting at any staff member, if you follow primary->secondary links for each pairing, then you'll eventually arrive back where you started.
In order to solve your problem, you can start by connecting each staff member with all his permissible partners (everyone in a different division), and then find a 2-regular subgraph that covers all the vertices. That is called a 2-factor of the graph. It's also known as a disjoint cycle cover.
The good news is that the problem can be solved in polynomial time, as explained here on the theoretical CS stack.
The bad news is that it's probably a lot more work than you were expecting.