Home > Back-end >  How to create a frequency table with partially individual and range count(python/pandas)
How to create a frequency table with partially individual and range count(python/pandas)

Time:05-21

if I have a data frame like this, I've tried binning and value count, cannot figure out how to put them together or there is some other way?

value
-4
-4
-3
-2 
0 
44 
59 
199

I need an output like this

value   value_cont
-4          2
-3          1
-2          1
0           1
1-200       3

CodePudding user response:

You have several options.

1- replace the values higher or equal to 1 with '1-200':

(df
 .assign(value=df['value'].mask(df['value'].ge(1), '1-200'))
 .groupby('value')['value'].agg(count='count')
 .reset_index()
)

output:

   value  count
0     -4      2
1     -3      1
2     -2      1
3      0      1
4  1-200      3

2- use pandas.cut with explicit bins:

bins = [-4,-3,-2,-1,0]
(pd
 .cut(df['value'], [float('-inf')] bins [float('inf')], labels=bins ['1-200'])
 .value_counts(sort=False).rename_axis('value')
 .reset_index(name='count')
)

output:

   value  count
0     -4      2
1     -3      1
2     -2      1
3     -1      0
4      0      1
5  1-200      3

CodePudding user response:

You can try select the bin rows then drop them from original dataframe and add the bin value.

out = (df['value'].value_counts()
       .to_frame('value_count')
       .rename_axis('value').reset_index())

df_ = out[out['value'].between(1, 200)]
out = out[~out.index.isin(df_.index)]
out.loc[len(out)] = ['1-200', df_['value_count'].sum()]
print(out)

   value  value_count
0     -4            2
1     -3            1
2     -2            1
3      0            1
4  1-200            3
  • Related