When working with a dataset in a pandas DataFrame, the label categories are sometimes imbalanced.
Example: bike : car = 7 : 3
price | label |
---|---|
200 | bike |
100 | bike |
700 | bike |
300 | bike |
5500 | car |
400 | bike |
5200 | car |
310 | bike |
2000 | car |
20 | bike |
In this case, car and bike do not appear in equal numbers, so I want every category to have the same number of rows.
car appears only 3 times, so 4 bike rows are deleted, like this:
price | label |
---|---|
200 | bike |
300 | bike |
5500 | car |
5200 | car |
2000 | car |
20 | bike |
Order is not important. I just want every category to appear the same number of times.
What I did was count the car labels and the bike labels, check which label has fewer rows (here, car), and then read the rows one by one and move them into another DataFrame. This takes a lot of time, so it is inconvenient.
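The manual approach described above can be sketched roughly as follows. The column names `price` and `label` come from the example table; the variable names (`n_min`, `kept_rows`, `kept_per_label`, `balanced`) are only illustrative, and the original code may have looked different.

```python
import pandas as pd

# Example data from the table above.
df = pd.DataFrame({
    "price": [200, 100, 700, 300, 5500, 400, 5200, 310, 2000, 20],
    "label": ["bike", "bike", "bike", "bike", "car",
              "bike", "car", "bike", "car", "bike"],
})

# Count each label and find the size of the smallest category.
counts = df["label"].value_counts()
n_min = int(counts.min())

# Walk the rows one at a time and keep a row only while its label
# still has fewer than n_min rows kept. This Python-level loop is
# the slow part on large DataFrames.
kept_rows = []
kept_per_label = {}
for _, row in df.iterrows():
    label = row["label"]
    if kept_per_label.get(label, 0) < n_min:
        kept_rows.append(row)
        kept_per_label[label] = kept_per_label.get(label, 0) + 1

# Build the second DataFrame from the rows that were kept.
balanced = pd.DataFrame(kept_rows)
print(balanced["label"].value_counts())
```

This keeps the first rows it encounters for each label, which is fine here because the order does not matter.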
Is there an easier way to make the number of rows per label equal in a pandas DataFrame? Or do I just have to count each label and build another DataFrame?
Thank you.
CodePudding user response:
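One possible approach, as a minimal sketch rather than a definitive answer: downsample every category to the size of the smallest one with `groupby(...).sample(...)`. This assumes pandas 1.1 or later (where `DataFrameGroupBy.sample` is available) and the `price` / `label` columns from the question; `n_min`, `balanced`, and the `random_state` value are illustrative choices.

```python
import pandas as pd

# Example data from the question.
df = pd.DataFrame({
    "price": [200, 100, 700, 300, 5500, 400, 5200, 310, 2000, 20],
    "label": ["bike", "bike", "bike", "bike", "car",
              "bike", "car", "bike", "car", "bike"],
})

# Size of the smallest category (3 for this data, because car has 3 rows).
n_min = int(df["label"].value_counts().min())

# Randomly draw n_min rows from each label group, without a Python row loop.
# random_state is fixed only to make the sketch reproducible.
balanced = df.groupby("label").sample(n=n_min, random_state=0)

print(balanced["label"].value_counts())
```

If a random selection is not needed, `df.groupby("label").head(n_min)` should give the first `n_min` rows of each label instead, keeping the original row order.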