I have questionnaire data with a column for year of birth. The range of years was too wide, and the plot became confusing. I'm now trying to group the years by decade and then chart them, but I don't know how to group them.
My data looks like this:
birth_year <- data.frame("years"=c(
"1920","1923","1930","1940","1932","1935","1942","1944","1952","1956","1996","1961",
"1962","1966","1978","1987","1998","1999","1967","1934","1945","1988","1976","1978",
"1951","1986","1942","1999","1935","1920","1933","1987","1998","1999","1931","1977",
"1920","1931","1977","1999","1967","1992","1998","1984"
))
I would like the data grouped like this:
birth_year count
(1920-1930]: 5
(1931-1940]: 8
(1941-1950]: 4
(1951-1960]: 3
(1961-1970]: 5
(1971-1980]: 5
(1981-1990]: 5
(1991-2000]: 9
and then plotted by those ranges.
CodePudding user response:
We can use cut() to group the data, and then plot with ggplot().
birth_year <- data.frame("years"=c(
"1920","1923","1930","1940","1932","1935","1942","1944","1952","1956","1996","1961",
"1962","1966","1978","1987","1998","1999","1967","1934","1945","1988","1976","1978",
"1951","1986","1942","1999","1935","1920","1933","1987","1998","1999","1931","1977",
"1920","1931","1977","1999","1967","1992","1998","1984"
))
# dig.lab = 4 keeps four-digit years in the interval labels
birth_year$yearGroup <- cut(as.integer(birth_year$years), breaks = 8, dig.lab = 4,
                            include.lowest = FALSE)
library(ggplot2)
ggplot(birth_year, aes(x = yearGroup)) +
  geom_bar()
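Note that passing a single number such as breaks = 8 asks cut() for eight equal-width intervals over the observed range, so the edges will generally not land on decade boundaries. If exact decades are wanted, one option is to pass the break points explicitly. The following is only a sketch: the names decade_breaks, decade and decade_counts are made up for illustration, and it assumes the bins should run from 1920 to 2000 in steps of 10.

```r
birth_year <- data.frame(years = c(
  "1920","1923","1930","1940","1932","1935","1942","1944","1952","1956","1996","1961",
  "1962","1966","1978","1987","1998","1999","1967","1934","1945","1988","1976","1978",
  "1951","1986","1942","1999","1935","1920","1933","1987","1998","1999","1931","1977",
  "1920","1931","1977","1999","1967","1992","1998","1984"
))

# Assumed decade edges: 1920, 1930, ..., 2000
decade_breaks <- seq(1920, 2000, by = 10)

birth_year$decade <- cut(as.integer(birth_year$years),
                         breaks = decade_breaks,
                         include.lowest = TRUE,  # keep 1920 in the first bin
                         dig.lab = 4)            # label as 1920, not 1.92e+03

# One row per decade with its count
decade_counts <- as.data.frame(table(decade = birth_year$decade),
                               responseName = "count")
print(decade_counts)
```

With these breaks the intervals are closed on the right, so 1930 is counted in the first bin and 1940 in the second, which matches the grouping asked for in the question.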
CodePudding user response:
library(dplyr)    # %>%, mutate(), count()
library(ggplot2)  # cut_width()
birth_year %>%
  mutate(val = cut_width(as.numeric(years), 10, boundary = 1920, dig.lab = -1)) %>%
  count(val)
          val n
1 [1920,1930] 5
2 (1930,1940] 8
3 (1940,1950] 4
4 (1950,1960] 3
5 (1960,1970] 5
6 (1970,1980] 5
7 (1980,1990] 5
8 (1990,2000] 9
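To get from this count table to the chart the question asks for, the counts could be passed to geom_col(). This is a rough sketch rather than part of the answer above: decade_counts, p and the axis labels are illustrative names, and dig.lab = 4 is used here only to make the label width explicit.

```r
library(dplyr)
library(ggplot2)

birth_year <- data.frame(years = c(
  "1920","1923","1930","1940","1932","1935","1942","1944","1952","1956","1996","1961",
  "1962","1966","1978","1987","1998","1999","1967","1934","1945","1988","1976","1978",
  "1951","1986","1942","1999","1935","1920","1933","1987","1998","1999","1931","1977",
  "1920","1931","1977","1999","1967","1992","1998","1984"
))

# Bin into 10-year intervals starting at 1920, then count rows per bin
decade_counts <- birth_year %>%
  mutate(val = cut_width(as.numeric(years), width = 10,
                         boundary = 1920, dig.lab = 4)) %>%
  count(val)

# geom_col() draws bars whose heights come from the precomputed n column
p <- ggplot(decade_counts, aes(x = val, y = n)) +
  geom_col() +
  labs(x = "Birth year (decade)", y = "Count")
print(p)
```

geom_col() is used instead of geom_bar() because the counts are already computed; geom_bar() on the unaggregated data with x = val would give the same bars.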