ex_array = [-8.23294593e-02, -4.07239507e-02, 6.08131029e-02, 2.72433402e-02,
-4.73587631e-02, 5.15452252e-02, 1.32902476e-01, 1.22322232e-01,
2.71845990e-02, -1.16927038e-01, -2.62239877e-01, -1.46526396e-01,
-1.82859136e-01, -1.02089602e-01, -1.91863501e-04, -5.42572200e-02,
-1.41798506e-01, 2.32538185e-02, 1.44525705e-01, 1.33945461e-01,
5.01618120e-02, -1.32664337e-01, -2.97395262e-01, -1.02531532e-01,
-7.80204566e-02, -5.46991495e-02, 1.05868862e-01, 7.25526818e-03,
5.04192997e-02, 7.41281286e-02, 1.75069159e-01, 1.64488914e-01,
7.55396024e-02, -6.23800645e-02, -1.76950023e-01, -5.91491004e-02,
-4.00535768e-02, 6.59473071e-04, 5.98125666e-02, -1.49608356e-02,
-1.45519585e-02, 1.49876707e-01, 1.92880709e-01, 2.33158881e-01,
7.59751625e-02, -2.46659059e-02, -1.40025102e-01, -3.02416639e-02]
I need to compute the median for every 12 values. Each value represents a month (from January to December), so I would like to obtain the median for each month of the year. Like this:
Approaches:
I could convert the array to a dataframe, and add a new column representing each month. Later, grouping by month and computing the median. But I feel it has to be an easier solution.
Another solution I thought to convert to a dataframe and slice every 12 values, each time starting from a different value. It works but I'm having problems obtaining a workable array. Adding example for the first three months:
'''
sol_array = []
sol_array.append(pd.DataFrame(ex_array).iloc[0::12].median().to_string())
sol_array.append(pd.DataFrame(ex_array).iloc[1::12].median().to_string())
sol_array.append(pd.DataFrame(ex_array).iloc[2::12].median().to_string())
But this is the outcome. The 0 and the apostrophes shouldn't be there.
['0 -0.075844',
'0 -0.089111',
'0 0.042705',
'0 0.002147',
'0 -0.010528',
'0 0.109443',
'0 0.198334',
'0 0.20983',
'0 0.075139',
'0 -0.062405']
So, do you know another way to obtain the same outcome. I only have 120 values, so it is still viable to arrange the groups manually (only 10 groups) but I feel it's not an ideal solution.
Or, do you know how to correct the above method I and obtain a workable array?
CodePudding user response:
Let us use numpy
operations:
np.median(np.reshape(ex_array, (12, -1), 'F'), axis=1)
array([-0.08017496, -0.04771155, 0.06031283, -0.00385278, -0.03095536,
0.06283668, 0.15979743, 0.14921719, 0.06285071, -0.08965355,
-0.21959495, -0.08084032])
CodePudding user response:
A couple of alternatives:
list(pd.DataFrame(ex_array).groupby(lambda i:i).median()[0])
or
import statistics
[statistics.median(ex_array[i] for i in range(j, len(ex_array), 12)) for j in range(12)]
In both cases the output (for the data in your question) is
[
-0.08017495795, -0.0477115501, 0.06031283475, -0.00385278371,
-0.030955360799999998, 0.0628366769, 0.15979743200000002, 0.14921718750000001,
0.0628507072, -0.08965355124999999, -0.21959495, -0.0808403162
]