Select a random value from a pandas list column for each row, ensuring that no value gets picked again

I have the pandas DataFrame below:

import pandas as pd

df = pd.DataFrame({
    'poc': ["a", "b", "c", "d"],
    'school': ["school1", "school2", "school3", "school4"],
    'volunteers': [["sam", "mat", "ali", "mike", "guy", "john"],
                   ["sam", "mat", "ali", "mike"],
                   ["rose", "sam", "mike", "jorge"],
                   ["susan", "jack", "alex", "mat", "mike"]]
})

poc	school	volunteers
a	school1	['sam', 'mat', 'ali', 'mike', 'guy', 'john']
b	school2	['sam', 'mat', 'ali', 'mike']
c	school3	['rose', 'sam', 'mike', 'jorge']
d	school4	['susan', 'jack', 'alex', 'mat', 'mike']

I need to create a new column that holds a random pick from the volunteers column, one volunteer per school, ensuring that the same volunteer doesn't get picked twice.

So far I have tried:

import random

df["random_match"] = [random.choice(x) for x in df["volunteers"]]

but this just gives me a random volunteer without ensuring it is not repeated.

CodePudding user response：

This should work: accumulate the names already picked and remove them from each row's set of available choices. A row with no unused volunteer left keeps the default value pd.NA.

import random

df["random_match"] = pd.NA
already_picked = set()
for row_idx in df.index:
    # Candidates for this row that no earlier row has taken
    available_group = set(df.at[row_idx, "volunteers"]) - already_picked
    if available_group:
        # random.sample() rejects sets on Python 3.11+, so choose from a list instead
        chosen_name = random.choice(sorted(available_group))
        df.at[row_idx, "random_match"] = chosen_name
        already_picked.add(chosen_name)
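
The same accumulate-and-exclude idea can be wrapped in a small function, which makes it easier to reuse and to check. This is only a minimal sketch: `pick_unique` and its `seed` parameter are names made up for this illustration, and the seed is there only so that a run can be repeated.

```python
import random

import pandas as pd


def pick_unique(lists, seed=None):
    """Greedy pick: one unused name per list, or pd.NA when none is left."""
    rng = random.Random(seed)  # hypothetical seed parameter, only for repeatable runs
    already_picked = set()
    picks = []
    for names in lists:
        # Drop the names that earlier rows have already taken
        available = [name for name in names if name not in already_picked]
        if available:
            chosen = rng.choice(available)
            already_picked.add(chosen)
        else:
            chosen = pd.NA
        picks.append(chosen)
    return picks


df = pd.DataFrame({
    "school": ["school1", "school2", "school3", "school4"],
    "volunteers": [["sam", "mat", "ali", "mike", "guy", "john"],
                   ["sam", "mat", "ali", "mike"],
                   ["rose", "sam", "mike", "jorge"],
                   ["susan", "jack", "alex", "mat", "mike"]],
})
df["random_match"] = pick_unique(df["volunteers"], seed=0)
print(df[["school", "random_match"]])
```

Note that the pick is greedy, so row order matters: with the lists ["a", "b"] and ["a"], the second row ends up with pd.NA whenever the first row happens to take "a", even though a complete assignment exists.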

CodePudding user response：

You could try this:

from random import shuffle

selected = []
for i, list_of_volunteers in enumerate(df["volunteers"].values):
    # shuffle() reorders the list in place, so the "volunteers" column changes too
    shuffle(list_of_volunteers)
    for volunteer in list_of_volunteers:
        # Take the first shuffled name that no earlier row has used
        if volunteer not in selected:
            df.loc[i, "pick"] = volunteer
            selected.append(volunteer)
            break

print(df)
# Example output (picks vary between runs)
  poc   school                        volunteers  pick
0   a  school1  [mat, ali, mike, john, sam, guy]   mat
1   b  school2             [mike, sam, ali, mat]  mike
2   c  school3          [sam, mike, rose, jorge]   sam
3   d  school4    [mike, jack, alex, mat, susan]  jack