I have a data frame that looks like this :
date | var | cat_low | dog_low | cat_high | dog_high | Love | Friend |
---|---|---|---|---|---|---|---|
2022-01-01 | A | 1 | 7 | 13 | 19 | NA | friend |
2022-01-01 | A | 2 | 8 | 14 | 20 | NA | friend |
2022-01-01 | A | 3 | 9 | 15 | 21 | NA | friend |
2022-02-01 | B | 4 | 10 | 16 | 22 | love | NA |
2022-02-01 | B | 5 | 11 | 17 | 23 | love | NA |
2022-02-01 | B | 6 | 12 | 18 | 24 | love | NA |
I want to select the columns related to columns Love and Friend. If the column Love is love to give the columns that starts with cat and if the column Friend is friend to give me the columns that start with dog.
ideally i want to look like this :
date | var | a | b |
---|---|---|---|
2022-01-01 | A | 7 | 19 |
2022-01-01 | A | 8 | 20 |
2022-01-01 | A | 9 | 21 |
2022-02-01 | B | 4 | 16 |
2022-02-01 | B | 5 | 17 |
2022-02-01 | B | 6 | 18 |
library(lubridate)
date = c(rep(as.Date("2022-01-01"),3),rep(as.Date("2022-02-01"),3))
var = c(rep("A",3),rep("B",3))
cat_low = seq(1,6,1)
dog_low = seq(7,12,1)
cat_high = seq(13,18,1)
dog_high = seq(19,24,1)
Friend = c(rep("friend",3),rep(NA,3))
Love = c(rep(NA,3),rep("love",3))
df = tibble(date,var,cat_low,dog_low,cat_high,dog_high,Love,Friend);df
Any help? How i can do that in R using dplyr ?
CodePudding user response:
There might be better ways, but here's one:
library(tidyr)
library(dplyr)
df %>%
pivot_longer(cols = starts_with(c("cat", "dog")),
names_to = c("animal", ".value"),
names_pattern = "(cat|dog)_(low|high)") %>%
filter((is.na(Love) & animal == "dog") |
(is.na(Friend) & animal == "cat")) %>%
select(date, var, low, high)
output
# A tibble: 6 × 4
date var low high
<date> <chr> <dbl> <dbl>
1 2022-01-01 A 7 19
2 2022-01-01 A 8 20
3 2022-01-01 A 9 21
4 2022-02-01 B 4 16
5 2022-02-01 B 5 17
6 2022-02-01 B 6 18
CodePudding user response:
With dplyr
try this.
The first summarise
filters for dog or cat, the second renames and puts the variables together.
library(dplyr)
df %>%
summarise(date, var,
across(starts_with("dog"), ~ .x[Friend == "friend"]),
across(starts_with("cat"), ~ .x[Love == "love"])) %>%
rename(a = dog_low, b = dog_high) %>%
summarise(date, var, a = ifelse(is.na(a), cat_low, a),
b = ifelse(is.na(b), cat_high, b))
date var a b
1 2022-01-01 A 7 19
2 2022-01-01 A 8 20
3 2022-01-01 A 9 21
4 2022-02-01 B 4 16
5 2022-02-01 B 5 17
6 2022-02-01 B 6 18