How to calculate the median (without np.median) from frequency data?

Time:10-23

I am trying to create a function which receives an ordered array of values and associated frequencies as input and produces the median of the observations as output. My idea was to recreate the original data set by repeatedly adding each value, in order, to a new variable according to its frequency of occurrence. After that, I would just call a function I've already created for calculating the median of a set of raw observations.

For example, suppose we have:

severities = np.arange(7)
with_helmet = np.array([248, 58, 11, 3, 2, 8, 1])

Then I want my function to add zero 248 times, one 58 times, and so on. I'm new to numpy, and I'm embarrassed to say I'm not sure how to do this. A helpful function I found was

np.repeat(array, repeats)

but that duplicates each element a set number of times, whereas I want to duplicate each element in values the number of times it occurs (i.e. according to the corresponding frequency value).

Can anyone provide suggestions (in base Python and NumPy only)?

CodePudding user response:

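If I understand correctly, np.repeat already does what you want: repeats can be an array with one count per element, so each value is repeated according to its own frequency: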

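import numpy as np
import collections

severities = np.arange(7)
with_helmet = np.array([248, 58, 11, 3, 2, 8, 1])

# Repeat severities[i] exactly with_helmet[i] times (0 x 248, 1 x 58, ...)
ans = np.repeat(severities, with_helmet)
# Sanity check: the counts should match the original frequencies
counter = collections.Counter(ans)
print(ans)
print(counter)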
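
To get from the expanded array to the median itself, here is a minimal sketch. The function names median_from_frequencies and median_from_cumulative are made up for illustration and are not part of NumPy. Both assume the values are already sorted in ascending order and that the frequencies are non-negative integers with a positive total. The first expands the data with np.repeat and picks the middle element(s). The second is an alternative that avoids building the expanded array by looking up the middle positions in the cumulative frequencies.

```python
import numpy as np


def median_from_frequencies(values, freqs):
    """Median of the data set in which values[i] occurs freqs[i] times.

    Assumes `values` is sorted ascending and `freqs` holds non-negative
    integers whose sum is positive.
    """
    values = np.asarray(values)
    freqs = np.asarray(freqs)
    # Rebuild the raw observations; the result is sorted because values is.
    expanded = np.repeat(values, freqs)
    n = expanded.size
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle observation
        return float(expanded[mid])
    # Even count: average of the two middle observations
    return (expanded[mid - 1] + expanded[mid]) / 2


def median_from_cumulative(values, freqs):
    """Same result without materialising the expanded array."""
    values = np.asarray(values)
    cum = np.cumsum(freqs)
    n = cum[-1]
    # 0-based positions of the two middle observations (equal when n is odd)
    lo, hi = (n - 1) // 2, n // 2
    # side="right" finds the first value whose cumulative count exceeds the position
    i_lo = np.searchsorted(cum, lo, side="right")
    i_hi = np.searchsorted(cum, hi, side="right")
    return (values[i_lo] + values[i_hi]) / 2


severities = np.arange(7)
with_helmet = np.array([248, 58, 11, 3, 2, 8, 1])
print(median_from_frequencies(severities, with_helmet))
print(median_from_cumulative(severities, with_helmet))
```

With the data from the question there are 331 observations, so the median is the 166th one, which falls among the 248 zeros. The cumulative version is probably the better choice when the frequencies are large, since it never allocates one element per observation.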