A = c(10009, 10009, 10009, 10009, 10011, 10011, ...)
B = c(23908, 230908, 230908, 230908, 23514, 23514, ...)
I have a data frame with the above two columns. How do I create a third column, C, that is B divided by the number of rows that contain the corresponding value in column A?
I tried the code below, but I get the error: "problem with mutate(), column C".
DF = DF %>%
  group_by(A) %>%
  mutate(C = B/n(A))
CodePudding user response:
n() doesn't accept any arguments; inside a grouped mutate() it already returns the number of rows in the current group. Try:
library(dplyr)
DF <- DF %>% group_by(A) %>% mutate(C = B/n()) %>% ungroup()
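For anyone who wants to run this end to end, here is a minimal sketch. It assumes the data consists only of the six rows visible in the question (the elided rows are not reproduced), and the name `result` is just an illustrative choice.

```r
library(dplyr)

# Assumption: only the six rows shown in the question; the "..." rows are omitted.
DF <- data.frame(
  A = c(10009, 10009, 10009, 10009, 10011, 10011),
  B = c(23908, 230908, 230908, 230908, 23514, 23514)
)

result <- DF %>%
  group_by(A) %>%
  mutate(C = B / n()) %>%  # n() takes no arguments: it is the size of the current group
  ungroup()                # drop the grouping so later steps are not affected by it

print(result)
```

Each value of B is divided by 4 for A = 10009 and by 2 for A = 10011, since those are the group sizes in this sample.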
CodePudding user response:
You meant length:
DF %>%
  group_by(A) %>%
  mutate(C = B / length(A))
Result on your example dataset:
      A      B     C
  <dbl>  <dbl> <dbl>
1 10009  23908  5977
2 10009 230908 57727
3 10009 230908 57727
4 10009 230908 57727
5 10011  23514 11757
6 10011  23514 11757
CodePudding user response:
Update: A longer version (maybe not the best) is to first use add_count() and then mutate(). With this version you can follow each step:
library(dplyr)
df %>%
  group_by(A) %>%
  add_count() %>%
  mutate(C = B/n) %>%
  ungroup() %>%
  select(-n)
Output:
      A      B     C
  <dbl>  <dbl> <dbl>
1 10009  23908  5977
2 10009 230908 57727
3 10009 230908 57727
4 10009 230908 57727
5 10011  23514 11757
6 10011  23514 11757
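As a possible shortening of the add_count approach, the grouping variable can be passed to add_count() directly, which avoids the separate group_by()/ungroup() pair. This is only a sketch: it again assumes just the six rows visible in the question, and `result2` is an illustrative name.

```r
library(dplyr)

# Assumption: only the six rows shown in the question.
DF <- data.frame(
  A = c(10009, 10009, 10009, 10009, 10011, 10011),
  B = c(23908, 230908, 230908, 230908, 23514, 23514)
)

result2 <- DF %>%
  add_count(A) %>%      # adds a column n holding the number of rows per value of A
  mutate(C = B / n) %>% # divide by that per-group count
  select(-n)            # remove the helper column

print(result2)
```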
My first answer, posted a few seconds after Ronak Shah's:
library(dplyr)
df %>%
  group_by(A) %>%
  mutate(C = B/n())
CodePudding user response:
Using data.table:
library(data.table)
setDT(DF)[, C := B / .N, by = A]
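A self-contained sketch of the data.table version, assuming only the six rows visible in the question. Here `.N` is the number of rows in each `by` group, and `:=` adds the column by reference, so no reassignment is needed. The name `DT` is illustrative.

```r
library(data.table)

# Assumption: only the six rows shown in the question.
DT <- data.table(
  A = c(10009, 10009, 10009, 10009, 10011, 10011),
  B = c(23908, 230908, 230908, 230908, 23514, 23514)
)

# .N is the row count of the current group; := modifies DT in place.
DT[, C := B / .N, by = A]

print(DT)
```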