compute median every 12 values

Time:05-22

ex_array = [-8.23294593e-02, -4.07239507e-02,  6.08131029e-02,  2.72433402e-02,
   -4.73587631e-02,  5.15452252e-02,  1.32902476e-01,  1.22322232e-01,
    2.71845990e-02, -1.16927038e-01, -2.62239877e-01, -1.46526396e-01,
   -1.82859136e-01, -1.02089602e-01, -1.91863501e-04, -5.42572200e-02,
   -1.41798506e-01,  2.32538185e-02,  1.44525705e-01,  1.33945461e-01,
    5.01618120e-02, -1.32664337e-01, -2.97395262e-01, -1.02531532e-01,
   -7.80204566e-02, -5.46991495e-02,  1.05868862e-01,  7.25526818e-03,
    5.04192997e-02,  7.41281286e-02,  1.75069159e-01,  1.64488914e-01,
    7.55396024e-02, -6.23800645e-02, -1.76950023e-01, -5.91491004e-02,
   -4.00535768e-02,  6.59473071e-04,  5.98125666e-02, -1.49608356e-02,
   -1.45519585e-02,  1.49876707e-01,  1.92880709e-01,  2.33158881e-01,
    7.59751625e-02, -2.46659059e-02, -1.40025102e-01, -3.02416639e-02]

I need to compute the median over every 12th value. The values are monthly (January to December, repeated for each year), so I would like to obtain one median for each month of the year. Like this:

(Image: representation of what I'm asking)

Approaches:

  • I could convert the array to a dataframe and add a new column representing the month, then group by month and compute the median. But I feel there has to be an easier solution.

  • Another solution I thought of is to convert the array to a dataframe and slice every 12th value, each time starting from a different offset. It works, but I'm having problems obtaining a workable array. Here is an example for the first three months:

sol_array = []

sol_array.append(pd.DataFrame(ex_array).iloc[0::12].median().to_string())
sol_array.append(pd.DataFrame(ex_array).iloc[1::12].median().to_string())
sol_array.append(pd.DataFrame(ex_array).iloc[2::12].median().to_string())

But this is the outcome. The 0 and the apostrophes shouldn't be there.

['0   -0.075844',
 '0   -0.089111',
 '0    0.042705',
 '0    0.002147',
 '0   -0.010528',
 '0    0.109443',
 '0    0.198334',
 '0    0.20983',
 '0    0.075139',
 '0   -0.062405']

So, do you know another way to obtain the same outcome? I only have 120 values, so it is still viable to arrange the groups manually (only 10 groups), but I feel that's not an ideal solution.

Or, do you know how to correct the method above so that I obtain a workable array?

CodePudding user response:

Let us use numpy operations:

np.median(np.reshape(ex_array, (12, -1), 'F'), axis=1)

array([-0.08017496, -0.04771155,  0.06031283, -0.00385278, -0.03095536,
        0.06283668,  0.15979743,  0.14921719,  0.06285071, -0.08965355,
       -0.21959495, -0.08084032])
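
The 'F' argument fills the (12, -1) array column by column, so each row ends up holding one month across all years. An equivalent way to see it, assuming the length is a multiple of 12, is to reshape into one row per year and take the median down the columns. A minimal sketch; `values` is made-up illustration data, not the array from the question:

```python
import numpy as np

# Hypothetical data: 3 years x 12 months, values 0..35 (not the question's data)
values = np.arange(36, dtype=float)

# One row per year, one column per month (default C order)
by_year = values.reshape(-1, 12)
monthly_median = np.median(by_year, axis=0)

# Same grouping expressed with the Fortran-order reshape
same = np.median(np.reshape(values, (12, -1), order="F"), axis=1)

print(monthly_median)
```

If the length is not a multiple of 12, both reshapes raise a ValueError, so a partial final year would need to be trimmed or padded first.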

CodePudding user response:

A couple of alternatives:

list(pd.DataFrame(ex_array).groupby(lambda i: i % 12).median()[0])

or

import statistics
[statistics.median(ex_array[i] for i in range(j, len(ex_array), 12)) for j in range(12)]

In both cases the output (for the data in your question) is

[
 -0.08017495795, -0.0477115501, 0.06031283475, -0.00385278371,
 -0.030955360799999998, 0.0628366769, 0.15979743200000002, 0.14921718750000001,
 0.0628507072, -0.08965355124999999, -0.21959495, -0.0808403162
]
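
As for correcting the slicing approach from the question: the '0   -0.075844' strings most likely appear because DataFrame.median() returns a Series indexed by the column label 0, and to_string() then renders that Series as text. A possible fix, sketched here with a Series and made-up data (`values` is hypothetical, not the array from the question), keeps the medians as plain numbers:

```python
import pandas as pd

# Hypothetical data: 3 years x 12 months, values 0..35 (not the question's data)
values = [float(v) for v in range(36)]
series = pd.Series(values)

# Slice every 12th value starting at each month offset.
# Series.median() returns a scalar, so no to_string() is needed.
sol_array = [series.iloc[month::12].median() for month in range(12)]

print(sol_array)
```

The loop over range(12) also replaces the one append call per month.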