Pandas: retain n% of data for unique values?


Time：07-10

I have a table of unique products and reviews:

ProductID  Comment
  1        Great product!
  2        Terrible
  2        Amazing!

The table (a CSV) has about 170,000 rows. I'd like to retain 5% of the comments for each unique ProductID. Is there functionality in pandas that will let me do this?

CodePudding user response：

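You could use groupby with sample (DataFrameGroupBy.sample, available in pandas 1.1.0 and later). With frac=.05, each ProductID group keeps a random 5% of its rows: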

df.groupby('ProductID').sample(frac=.05)
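
For a fuller picture, here is a minimal sketch on made-up data (the group sizes, the comment text, and the random_state value are assumptions for illustration, not taken from the question). It also shows one thing to watch for: the per-group sample size is rounded, so a product with only a handful of comments can end up with no rows at all. The second half is one possible workaround, not the only one, that keeps at least one comment per product.

```python
import pandas as pd

# Hypothetical data: product 1 has 40 comments, product 2 has 100,
# and product 3 has a single comment.
df = pd.DataFrame({
    "ProductID": [1] * 40 + [2] * 100 + [3],
    "Comment": [f"comment {i}" for i in range(141)],
})

FRAC = 0.05  # fraction of comments to keep per product

# Per-group random sample (needs pandas >= 1.1.0).
# random_state is only set here to make the run repeatable.
sampled = df.groupby("ProductID").sample(frac=FRAC, random_state=0)
counts = sampled["ProductID"].value_counts().to_dict()
# Product 3 is absent from `counts`: 5% of one row rounds down to zero.

# Possible workaround: shuffle once, then keep the first k rows of each
# group, where k is the rounded 5% share but never less than 1.
shuffled = df.sample(frac=1, random_state=0)
position = shuffled.groupby("ProductID").cumcount()
group_size = shuffled.groupby("ProductID")["ProductID"].transform("size")
keep = (group_size * FRAC).round().clip(lower=1)
at_least_one = shuffled[position < keep]
counts_min_one = at_least_one["ProductID"].value_counts().to_dict()
```

If dropping products with very few comments is acceptable, the one-line groupby/sample call is enough on its own.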

