Using groupby to speed up random number generation in part of a dataframe

Time:01-02

I have a program that uses a mask, similar to the accepted answer to the question below, to create multiple sets of random numbers in a dataframe, df.

Create random.randint with condition in a group by?

My code:

for city in state:
    mask = df['City'] == city
    df.loc[mask, 'Random'] = np.random.randint(1, 200, mask.sum())

This takes longer and longer as the dataframe df grows. Is there a way to speed this up with groupby?

CodePudding user response:

You can try:

df['Random'] = df.assign(Random=0).groupby(df['City'])['Random'] \
                 .transform(lambda x: np.random.randint(1, 200, len(x)))
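
If every city draws from the same range, as in the loop in the question, grouping may not be needed at all: a single mask over all the cities and one randint call should give an equivalent result. A minimal sketch, assuming state is a list of city names; the sample data below is made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; `state` stands in for the list of cities in the question.
df = pd.DataFrame({"City": ["Austin", "Dallas", "Austin", "Houston", "Dallas", "El Paso"]})
state = ["Austin", "Dallas", "Houston"]

# One mask covering every city of interest, instead of one mask per city.
mask = df["City"].isin(state)

# A single randint call fills all matching rows; the upper bound (200) is exclusive.
df.loc[mask, "Random"] = np.random.randint(1, 200, mask.sum())
```

Rows whose city is not in state are left as NaN, which matches what the original loop does when the Random column does not exist yet.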