I'm trying to sample 1000 unique users from a DataFrame. These can be any 1000 users, but I want to extract all rows for those 1000 users.
Input
User_ID | Ship Date |
---|---|
A454 | 8/2/2019 |
A454 | 9/2/2019 |
G658 | 9/2/2019 |
G658 | 9/2/2019 |
from random import sample
df['User_ID'].sample(n=1000, random_state=1)
I tried the above code, but this just returns 1000 sampled User_ID values (possibly with repeats), not all rows for 1000 unique users.
CodePudding user response:
IIUC, get the unique values, sample them, and slice with isin and boolean indexing:
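from random import sample
# Pick 1000 distinct user IDs (raises ValueError if there are fewer than 1000 unique users)
users = sample(list(df['User_ID'].unique()), 1000)
# Keep every row that belongs to one of the sampled users
out = df[df['User_ID'].isin(users)]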
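If you'd rather stay within pandas and make the draw reproducible, the same idea can probably be written with drop_duplicates and Series.sample. This is only a sketch: the toy frame below is modelled on the question's input (the third user is made up), and n_users is a hypothetical name standing in for the 1000 from the question.

```python
import pandas as pd

# Toy data modelled on the question's input; user "K901" is invented for illustration
df = pd.DataFrame({
    "User_ID": ["A454", "A454", "G658", "G658", "K901"],
    "Ship Date": ["8/2/2019", "9/2/2019", "9/2/2019", "9/2/2019", "10/2/2019"],
})

# Hypothetical parameter: would be 1000 on the real data.
# It must not exceed the number of unique users, or sample() raises ValueError.
n_users = 2

# Sample distinct IDs (random_state makes the draw repeatable) ...
sampled_ids = df["User_ID"].drop_duplicates().sample(n=n_users, random_state=1)

# ... then keep every row that belongs to one of those IDs
out = df[df["User_ID"].isin(sampled_ids)]
```

Sampling from the de-duplicated IDs rather than from the raw column is what guarantees n_users distinct users; the isin mask then pulls in all of their rows.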