I would like to create a column in my dataset that holds the positive sentiment minus the negative sentiment, both taken from my total column. So for user Alex, who has a positive sentiment sum of 80 and a negative sentiment sum of 13, the subtracted score will be 67.
The issue I am having is grouping the sentiment column in a way that allows me to perform this operation.
library(tidyverse)
# create mock dataframe
users <- c("Alex", "Alice", "Alexandra", "Andrew", "Alicia", "Alex", "Alice", "Alexandra", "Andrew", "Alicia")
sentiment <- c("positive", "negative", "positive","negative", "positive", "negative", "positive", "negative","positive", "negative")
total <- c(80, 70, 24, 74, 66, 13, 35, 94, 27, 94)
mockdataframe <- cbind(users,sentiment, total) %>% as_tibble()
mockdataframe$sentiment <- as.factor(mockdataframe$sentiment)
mockdataframe$total <- as.numeric(mockdataframe$total)
# using case_when() this way does not work
mockdataframe %>%
  mutate(Subtraction = case_when(
    sentiment == "positive" ~ (sentiment == "negative")/mockdataframe$total))
I am really struggling trying to solve this. Any help would be appreciated.
CodePudding user response:
Using tidyr::pivot_wider you could do:
library(tidyverse)
mockdataframe %>%
  pivot_wider(names_from = sentiment, values_from = total) %>%
  mutate(Subtraction = positive - negative)
#> # A tibble: 5 × 4
#>   users     positive negative Subtraction
#>   <chr>        <dbl>    <dbl>       <dbl>
#> 1 Alex            80       13          67
#> 2 Alice           35       70         -35
#> 3 Alexandra       24       94         -70
#> 4 Andrew          27       74         -47
#> 5 Alicia          66       94         -28
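If some user only has rows for one sentiment, the missing cell becomes NA after pivoting and so does the subtraction. A minimal sketch of one way around that, assuming a missing sentiment should count as 0: pass values_fill = 0 to pivot_wider. The `incomplete` tibble and the user "Bob" are made up for illustration and are not part of the question's data.

```r
library(tidyverse)

# Hypothetical data: "Bob" has a positive row but no negative row
incomplete <- tibble(
  users = c("Alex", "Alex", "Bob"),
  sentiment = c("positive", "negative", "positive"),
  total = c(80, 13, 40)
)

wide <- incomplete %>%
  # values_fill = 0 replaces the missing positive/negative cells with 0
  pivot_wider(names_from = sentiment, values_from = total, values_fill = 0) %>%
  mutate(Subtraction = positive - negative)

wide
```

Whether 0 is the right stand-in for a missing sentiment depends on your data; leaving it as NA may be more honest if the value is genuinely unknown.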
Or using group_by:
mockdataframe %>%
  group_by(users) %>%
  mutate(Subtraction = total[sentiment == "positive"] - total[sentiment == "negative"]) %>%
  ungroup()
#> # A tibble: 10 × 4
#>    users     sentiment total Subtraction
#>    <chr>     <fct>     <dbl>       <dbl>
#>  1 Alex      positive     80          67
#>  2 Alice     negative     70         -35
#>  3 Alexandra positive     24         -70
#>  4 Andrew    negative     74         -47
#>  5 Alicia    positive     66         -28
#>  6 Alex      negative     13          67
#>  7 Alice     positive     35         -35
#>  8 Alexandra negative     94         -70
#>  9 Andrew    positive     27         -47
#> 10 Alicia    negative     94         -28
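If you would rather end up with one row per user while staying in long format, a possible variant is to use summarise instead of mutate. This is only a sketch: it rebuilds the question's mock data directly with tibble(), and it wraps each side in sum(), which is my assumption about how several rows of the same sentiment for one user (or none at all) should be handled.

```r
library(tidyverse)

# Same mock data as in the question, built directly as a tibble
mockdataframe <- tibble(
  users = rep(c("Alex", "Alice", "Alexandra", "Andrew", "Alicia"), times = 2),
  sentiment = factor(c("positive", "negative", "positive", "negative", "positive",
                       "negative", "positive", "negative", "positive", "negative")),
  total = c(80, 70, 24, 74, 66, 13, 35, 94, 27, 94)
)

result <- mockdataframe %>%
  group_by(users) %>%
  summarise(
    # sum() of an empty selection is 0, so a missing sentiment counts as 0
    Subtraction = sum(total[sentiment == "positive"]) -
      sum(total[sentiment == "negative"]),
    .groups = "drop"
  )

result
```

With the mock data this gives the same five scores as the pivot_wider approach (67 for Alex, -35 for Alice, and so on), just without the positive and negative columns.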