Pandas select rows based on randomly selected group from a specific column

Time:08-09

I have a dataframe:

df = C1. C2.  C3
     a.  1.   2
     a.  3.   5
     b.  6.   7
     c.  0.   1
     b.  2.   3
     a.  3.   1

I want to randomly select one value from C1 and take all the rows that have it. So if 'a' is selected, the result is:

df = C1. C2.  C3
     a.  1.   2
     a.  3.   5
     a.  3.   1

How can I do it? Thanks

CodePudding user response:

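Use Series.sample to draw a one-element Series holding a random value from column C1, then keep every row whose C1 matches it with Series.isin and boolean indexing. The result changes from run to run unless you pass random_state: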

df1 = df[df.C1.isin(df.C1.sample(n=1))]
print (df1)
   C1   C2  C3
0  a.  1.0   2
1  a.  3.0   5
5  a.  3.0   1
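
For anyone who wants to try this end to end, here is a minimal sketch. The frame is rebuilt from the question on the assumption that the values are plain letters and integers, and the helper name rows_for_random_value and its random_state argument are made up for illustration (random_state is only there so a run can be repeated).

```python
import pandas as pd

# Sample data rebuilt from the question (assumed: plain letters and integers).
df = pd.DataFrame({
    "C1": ["a", "a", "b", "c", "b", "a"],
    "C2": [1, 3, 6, 0, 2, 3],
    "C3": [2, 5, 7, 1, 3, 1],
})


def rows_for_random_value(frame, column="C1", random_state=None):
    """Pick one value of `column` at random and return every row that has it.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # One-element Series holding the randomly drawn value.
    picked = frame[column].sample(n=1, random_state=random_state)
    # isin builds a boolean mask that is True wherever the column equals that value.
    return frame[frame[column].isin(picked)]


df1 = rows_for_random_value(df, random_state=0)
print(df1)
```

The original row labels are kept in the result; call reset_index(drop=True) on it if you want them renumbered from 0.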

CodePudding user response:

Shuffle the dataframe with sample(frac=1), read the C1 value from the first row of the shuffled result, and keep every row that has that value. Note that iloc[0,0] assumes C1 is the first column:

df[df['C1'] == df.sample(frac=1).iloc[0,0]]
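
One thing to be aware of: drawing from the rows means a value that appears more often is picked more often (with the data in the question, 'a' fills 3 of 6 rows, so it comes up about half the time). If every distinct value should be equally likely, a possible variation is to sample from the de-duplicated column instead. This is a sketch, not part of the answers above; the frame is rebuilt from the question assuming plain letters and integers, and rows_for_random_group is a hypothetical name.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed: plain letters and integers).
df = pd.DataFrame({
    "C1": ["a", "a", "b", "c", "b", "a"],
    "C2": [1, 3, 6, 0, 2, 3],
    "C3": [2, 5, 7, 1, 3, 1],
})


def rows_for_random_group(frame, column="C1", random_state=None):
    """Choose one distinct value of `column` uniformly and return its rows.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # drop_duplicates leaves one entry per distinct value, so each group
    # has the same chance of being drawn regardless of its size.
    candidates = frame[column].drop_duplicates()
    picked = candidates.sample(n=1, random_state=random_state).iloc[0]
    # Plain equality is enough because `picked` is a single scalar.
    return frame[frame[column] == picked]


subset = rows_for_random_group(df, random_state=42)
print(subset)
```

Selecting by column name rather than position also avoids depending on C1 being the first column.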