Expand number of dataframe rows based on sample count values

Time:08-24

I have a pandas dataframe with a column for each day of the week, and a column that counts the occurrences of some event:

import pandas as pd

# Initialize the data as a dict of lists.
data = {'Day': ['M', 'T', 'W', 'Th', 'F', 'Sa', 'Su'],
        'Count': [1, 0, 3, 1, 2, 4, 2]}

# Create DataFrame
df = pd.DataFrame(data)
print(df)

outputs:

  Day  Count
0   M      1
1   T      0
2   W      3
3  Th      1
4   F      2
5  Sa      4
6  Su      2

I want to create a Series where the number of rows is equal to the sum of the count column above. The series will have one row for every instance an event took place during a given day. So if an event took place 3 times on Wednesday and 1 time on Thursday, there would be 3 W rows and 1 Th row. This is my desired output:

   Day
0    M
1    W
2    W
3    W
4   Th
5    F
6    F
7   Sa
8   Sa
9   Sa
10  Sa
11  Su
12  Su

How can I achieve this?

CodePudding user response:

Use reindex with the index repeated according to the Count column. Rows with a count of 0 (such as T) are dropped:

out = df.reindex(df.index.repeat(df['Count']))
print(out)
  Day  Count
0   M      1
2   W      3
2   W      3
2   W      3
3  Th      1
4   F      2
4   F      2
5  Sa      4
5  Sa      4
5  Sa      4
5  Sa      4
6  Su      2
6  Su      2
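
The output above still carries the Count column and the repeated original index, while the desired output in the question shows only Day with a fresh 0..12 index. A minimal sketch of one way to close that gap, assuming the same `df` as in the question (the names `out` and `days` are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Day': ['M', 'T', 'W', 'Th', 'F', 'Sa', 'Su'],
                   'Count': [1, 0, 3, 1, 2, 4, 2]})

# Repeat each index label Count times, keep only the Day column,
# then renumber the rows from 0.
out = df.loc[df.index.repeat(df['Count']), ['Day']].reset_index(drop=True)

# If a Series is wanted rather than a one-column DataFrame,
# Series.repeat gives the same values directly.
days = df['Day'].repeat(df['Count']).reset_index(drop=True)

print(out)
```

`reset_index(drop=True)` discards the duplicated labels instead of keeping them as a new column, so the result has one row per event and a length equal to `df['Count'].sum()`.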

CodePudding user response:

Here is a way to do this kind of transformation by using: [image not recovered]
