Store series with different length in for loop

Time:04-15

The original df looks like this:

Hour  Count
0     15
0     0
0     0
0     17
0     18
0     12
1     55
1     0
1     0
1     0
1     53
1     51
...

I loop through this df hour by hour, remove the rows where Count is 0 in that hour, and draw a boxplot of Count for that hour. I end up with 24 separate graphs.

Can I put those 24 boxplots on the same graph while looping? For example, by building an output df2 like the one below and calling plt.boxplot(df2), although I'm not sure whether the NaN values will cause an error.

       Hour=0    Hour=1  ...
0      15        55
1      17        53
2      18        51
3      12        NaN

Another issue is that after removing the zeros, each hour will have a different number of Count values. How can I append this data to get a df2 like the one above?

You can use the code below to create the original df:

import pandas as pd

df = pd.DataFrame({
    'Hour': {0:1, 1:1, 2:1, 3:1, 4:1, 5:1, 6:2, 7:2, 8:2, 9:2, 10:2, 11:2},
    'Count': {0:15, 1:0, 2:0, 3:17, 4:18, 5:12, 6:55, 7:0, 8:0, 9:0, 10:53, 11:51}})

Here is the code for making hourly boxplots:

import matplotlib.pyplot as plt

for hour in sorted(df['Hour'].unique()):
    table1 = df[df['Hour'] == hour]
    table2 = table1[table1['Count'] != 0]  # drop the zero counts for this hour

    fig = plt.figure(1, figsize=(9, 6))
    plt.boxplot(table2['Count'])
    plt.show()

CodePudding user response:

One option is to pivot the filtered DataFrame and plot the boxplot:

(df.query('Count != 0')
   .assign(i=lambda x: x.groupby('Hour').cumcount())
   .pivot(index='i', columns='Hour', values='Count')
   .boxplot());
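To see what the pivot does, here is a rough step-by-step sketch of the same idea using the sample df from the question. The intermediate names nonzero and df2 are only for illustration; the helper column i is the one used in the one-liner above.

```python
import pandas as pd

# Sample data from the question (hours 1 and 2).
df = pd.DataFrame({
    'Hour':  [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
    'Count': [15, 0, 0, 17, 18, 12, 55, 0, 0, 0, 53, 51],
})

# 1. Drop the rows where Count is 0.
nonzero = df[df['Count'] != 0]

# 2. Number the remaining rows within each hour: 0, 1, 2, ...
nonzero = nonzero.assign(i=nonzero.groupby('Hour').cumcount())

# 3. Spread the hours into columns; hours with fewer rows are padded with NaN.
df2 = nonzero.pivot(index='i', columns='Hour', values='Count')
print(df2)
```

Here hour 1 keeps four values and hour 2 keeps three, so the last row of the hour 2 column is NaN. As far as I know, df2.boxplot() drops the NaN values in each column before drawing, so the padding should not be a problem for the plot.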

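If you would rather keep the loop, one possible approach (a sketch, not the only way) is to collect each hour's non-zero counts as a Series with a fresh 0-based index and join them side by side with pd.concat. The column labels 'Hour=1', 'Hour=2' and the dict name columns are just illustrative choices that mimic the df2 in the question.

```python
import pandas as pd

# Sample data from the question (hours 1 and 2).
df = pd.DataFrame({
    'Hour':  [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
    'Count': [15, 0, 0, 17, 18, 12, 55, 0, 0, 0, 53, 51],
})

columns = {}
for hour, group in df.groupby('Hour'):
    # Keep only the non-zero counts for this hour.
    counts = group.loc[group['Count'] != 0, 'Count']
    # Reset the index so every hour starts at row 0 and the columns line up.
    columns[f'Hour={hour}'] = counts.reset_index(drop=True)

# Align the Series on their index; shorter ones are padded with NaN.
df2 = pd.concat(columns, axis=1)
print(df2)
```

If only the plot is needed, the intermediate df2 is optional: plt.boxplot also accepts a list of 1-D arrays of different lengths, so the per-hour values could be appended to a list in the loop and plotted in a single call afterwards.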
