I want to create a new column that holds, for each row, the total number of rows sharing that row's value in another column.
x <- c("1", "1", "1", "1", "2", "2", "2", "3", "3", "3", "4", "4", "5", "6", "6", "6")
y <- c("Y", "Y", "Y", "Y", "N", "N", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "N", "Y", "Y")
df <- data.frame(x, y)
What I want is the following:
# x y z
#
# 1 Y 4
# 1 Y 4
# 1 Y 4
# 1 Y 4
# 2 N 3
# 2 N 3
# 2 Y 3
# 3 Y 3
# 3 Y 3
# 3 Y 3
# 4 Y 2
# 4 Y 2
# 5 Y 1
# 6 N 3
# 6 Y 3
# 6 Y 3
I have written the following script, but it is convoluted.
library(plyr)
library(dplyr)
library(purrr)
library(tidyverse)
library(ggtext)
library(stringr)
x <- c("1", "1", "1", "1", "2", "2", "2", "3", "3", "3", "4", "4", "5", "6", "6", "6")
y <- c("Y", "Y", "Y", "Y", "N", "N", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "N", "Y", "Y")
df <- data.frame(x, y)
unique_count <- as.data.frame(table(df$x))
colnames(unique_count)[1] <- "x"
colnames(unique_count)[2] <- "z"
df <- purrr::reduce(list(df,unique_count), dplyr::left_join, by = 'x')
CodePudding user response:
A possible solution:
library(dplyr)
df %>%
  add_count(x, name = "z")
#> x y z
#> 1 1 Y 4
#> 2 1 Y 4
#> 3 1 Y 4
#> 4 1 Y 4
#> 5 2 N 3
#> 6 2 N 3
#> 7 2 Y 3
#> 8 3 Y 3
#> 9 3 Y 3
#> 10 3 Y 3
#> 11 4 Y 2
#> 12 4 Y 2
#> 13 5 Y 1
#> 14 6 N 3
#> 15 6 Y 3
#> 16 6 Y 3
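For comparison, here is a minimal base R sketch of the same per-group count. It assumes the `df` from the question and uses `ave()`, so no packages are needed; treat it as one possible alternative rather than the recommended way.

```r
x <- c("1", "1", "1", "1", "2", "2", "2", "3", "3", "3", "4", "4", "5", "6", "6", "6")
y <- c("Y", "Y", "Y", "Y", "N", "N", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "N", "Y", "Y")
df <- data.frame(x, y)

# ave() applies FUN within each group of df$x and returns a vector
# aligned with the original rows, so every row receives its group's size.
df$z <- ave(seq_along(df$x), df$x, FUN = length)

df
```

Unlike the join in the question, this keeps the original row order and does not create an intermediate table.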
CodePudding user response:
This gives the same result as @Paul's answer, just spelled out explicitly with one extra line:
library(dplyr)
df %>%
  group_by(x) %>%
  mutate(z = n())
x y z
<chr> <chr> <int>
1 1 Y 4
2 1 Y 4
3 1 Y 4
4 1 Y 4
5 2 N 3
6 2 N 3
7 2 Y 3
8 3 Y 3
9 3 Y 3
10 3 Y 3
11 4 Y 2
12 4 Y 2
13 5 Y 1
14 6 N 3
15 6 Y 3
16 6 Y 3
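One thing to keep in mind with this form is that the result stays grouped by `x`, which can affect later `mutate()` or `summarise()` calls. A minimal sketch, assuming the `df` from the question, that drops the grouping afterwards:

```r
library(dplyr)

x <- c("1", "1", "1", "1", "2", "2", "2", "3", "3", "3", "4", "4", "5", "6", "6", "6")
y <- c("Y", "Y", "Y", "Y", "N", "N", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "N", "Y", "Y")
df <- data.frame(x, y)

res <- df %>%
  group_by(x) %>%
  mutate(z = n()) %>%  # n() is the number of rows in the current group
  ungroup()            # drop the grouping so later steps see the whole table

res
```

`add_count()` does not add a grouping to ungrouped input, which is one reason it is the shorter option here.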
CodePudding user response:
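This is covered by the following post: dplyr: put count occurrences into new variable
Grouping the data by both x and y counts the occurrences of each (x, y) combination. Note that this gives different values from the requested output for x = 2 and x = 6, where z counts rows per x only; to reproduce that output, group by x alone. With dplyr, the per-combination count is: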
library(dplyr)
x <- c("1", "1", "1", "1", "2", "2", "2", "3", "3", "3", "4", "4", "5", "6", "6", "6")
y <- c("Y", "Y", "Y", "Y", "N", "N", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "N", "Y", "Y")
df <- data.frame(x, y)
df %>%
  group_by(x, y) %>%
  mutate(z = n())
Output:
# A tibble: 16 × 3
# Groups: x, y [8]
x y z
<chr> <chr> <int>
1 1 Y 4
2 1 Y 4
3 1 Y 4
4 1 Y 4
5 2 N 2
6 2 N 2
7 2 Y 1
8 3 Y 3
9 3 Y 3
10 3 Y 3
11 4 Y 2
12 4 Y 2
13 5 Y 1
14 6 N 1
15 6 Y 2
16 6 Y 2