I have a pandas dataframe and a column in the dataframe has these values.
df['column'] = [84.0, 85.0, 75.0, nan, 51.0, 50.0, 70.0, 85.0 ... ]
I am trying to count how many values fall within each interval, like this:
freq = {
15 : 40, # 40 values fall between 10 and 20 (keyed by the midpoint, 15)
25 : 47, # 47 values fall between 20 and 30 (keyed by the midpoint, 25)
...
}
Is there any specific function in pandas to do this kind of operation rather than making a for loop and checking each value and incrementing the count in the freq dictionary?
[Edit] My goal is to get a dictionary like this and then replace the NaN values with freq.keys(), in the ratio of freq.values().
Thank you
CodePudding user response:
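import pandas as pd

# create intervals (0, 10], (10, 20], ..., (90, 100]
bins = pd.interval_range(0, 100, freq=10)
# assign each value in df["column"] to a bin and count occurrences per bin;
# sort=False keeps the counts in bin order instead of sorting by frequency,
# so they line up with bins.mid below (NaN values are not counted)
counts = pd.cut(df["column"], bins).value_counts(sort=False)
# create a Series indexed by interval midpoints and convert it to a dictionary
freq = pd.Series(counts.values, index=bins.mid).to_dict()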
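For the edit in the question (replacing NaN with the keys in the ratio of the values), one possible approach is to treat the counts as sampling weights and draw a midpoint for each missing entry. The following is only a sketch under some assumptions: the sample data is made up to stand in for df["column"], the seed is arbitrary, and NumPy's random generator is my choice of tool here rather than anything the question asked for.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the question's df["column"]
df = pd.DataFrame(
    {"column": [84.0, 85.0, 75.0, np.nan, 51.0, 50.0, 70.0, 85.0, np.nan]}
)

# Build the midpoint -> count dictionary as above
bins = pd.interval_range(0, 100, freq=10)
counts = pd.cut(df["column"], bins).value_counts(sort=False)
freq = pd.Series(counts.values, index=bins.mid).to_dict()

# Turn the counts into sampling probabilities
keys = np.array(list(freq.keys()))
weights = np.array(list(freq.values()), dtype=float)
probs = weights / weights.sum()

# Draw one midpoint per missing value, weighted by how often each bin occurs
rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
mask = df["column"].isna()
df.loc[mask, "column"] = rng.choice(keys, size=int(mask.sum()), p=probs)
```

Because the fill values are sampled, the proportions only approximate the ratio in freq when there are few NaN entries; empty bins get probability 0 and are never drawn.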