How to detect the number of grouped data based on their frequency

Time:05-07

I have a vector of numbers:

 x <- c(1,1,1,3,3,3,2,2,1,2,1,2,55,56,55,54,55,54,53,55,56,55,7,7,9,9,8,8,11,110,111,11,112,113,111,112,33)

If I plot x with hist(x), the histogram shows that the data fall into 4 different groups.

How can I obtain that information without the need to plot the data? Is there a way to obtain the density of x and count the number of groups?

I tried to use the density() function but I couldn't find a way to count the groups of data. I think a way to count the number of peaks would work as well. Maybe it would be useful to be able to set a frequency threshold to define the group.

I would like something that returns the number of groups: in this case 4, or 3 if we set the frequency threshold to 5.

CodePudding user response:

This uses hist(), but does not generate the plot:

x <- c(1,1,1,3,3,3,2,2,1,2,1,2,55,
       56,55,54,55,54,53,55,56,55,
       7,7,9,9,8,8,11,110,111,11,
       112,113,111,112,33)
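# Compute the histogram bins without drawing the plot
h <- hist(x, plot=FALSE)
# Assign each value to its bin; include.lowest=TRUE keeps the minimum in the first bin
newx <- cut(x, breaks=h$breaks, include.lowest=TRUE)
# Count the values that fall in each bin
table(newx)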
#> newx
#>    [0,20]   (20,40]   (40,60]   (60,80]  (80,100] (100,120] 
#>        20         1        10         0         0         6

Created on 2022-05-06 by the reprex package (v2.0.1)
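
To turn that table into a single number, one possible follow-up (a sketch, not part of the original answer) is to count the bins whose count reaches a threshold. The names count_groups and min_count are made up for this illustration. It assumes each histogram bin counts as one group, so two neighbouring occupied bins are counted separately and the result depends on the breaks that hist() picks.

```r
x <- c(1,1,1,3,3,3,2,2,1,2,1,2,55,
       56,55,54,55,54,53,55,56,55,
       7,7,9,9,8,8,11,110,111,11,
       112,113,111,112,33)

# Hypothetical helper: count the histogram bins holding at least
# `min_count` values. Assumes one bin == one group.
count_groups <- function(x, min_count = 1, ...) {
  h <- hist(x, plot = FALSE, ...)   # bin the data, no plot drawn
  sum(h$counts >= min_count)        # number of bins meeting the threshold
}

count_groups(x)                 # 4: bins with counts 20, 1, 10 and 6
count_groups(x, min_count = 5)  # 3: the bin holding only 33 is dropped
```

Extra arguments such as breaks are passed on to hist(), so the bin width can be changed if the default bins split or merge groups in a way you do not want.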
