I have a dataframe (though the data could also just be sets or lists):
Group Letter
1 {a,b,c,d,e}
2 {b,c,d,e,f}
3 {b,c,d,f,g}
4 {a,b,c,f,g}
5 {a,c,d,e,h}
I want to add a column with the cumulative intersection of groups 1-2, 1-2-3, 1-2-3-4, and 1-2-3-4-5. So it would look something like this:
Group Letter Intersection
1 {a,b,c,d,e} None
2 {b,c,d,e,f} {b,c,d,e}
3 {b,c,d,f,g} {b,c,d}
4 {a,b,c,f,g} {b,c}
5 {a,c,d,e,h} {c}
I've read about np.intersect1d and set.intersection, so I know how to intersect multiple sets. But I don't know how to do this in a smart way. Can someone help me with this problem?
CodePudding user response:
You might use itertools.accumulate for this task, as follows:
import itertools
letters = [{"a","b","c","d","e"},{"b","c","d","e","f"},{"b","c","d","f","g"},{"a","b","c","f","g"},{"a","c","d","e","h"}]
intersections = list(itertools.accumulate(letters, set.intersection))
print(intersections)
Output (the element order within each set may vary):
[{'e', 'a', 'b', 'c', 'd'}, {'b', 'e', 'c', 'd'}, {'b', 'c', 'd'}, {'b', 'c'}, {'c'}]
Note that the first element is {'e', 'a', 'b', 'c', 'd'} rather than None, so you would need to alter intersections in that regard.
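One possible way to make that adjustment is sketched below. It assumes the first group should simply get None, as in the desired table from the question; the name running is only introduced here for illustration.

```python
import itertools

letters = [
    {"a", "b", "c", "d", "e"},
    {"b", "c", "d", "e", "f"},
    {"b", "c", "d", "f", "g"},
    {"a", "b", "c", "f", "g"},
    {"a", "c", "d", "e", "h"},
]

# Running intersection: element i is the intersection of letters[0] .. letters[i].
running = list(itertools.accumulate(letters, set.intersection))

# The first group has nothing to intersect with, so replace its entry with None.
intersections = [None] + running[1:]

print(intersections[0])
for common in intersections[1:]:
    # sorted() is used only to make the printed order stable
    print(sorted(common))
```

If the data lives in a pandas dataframe, assigning the list as a new column (for example df["Intersection"] = intersections, where df is a hypothetical name for your dataframe) should work as long as the list has one entry per row.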