I have a data set like this
ID departure_airport arrival_airport n
1 ZRH BUD 1
2 ZRH MSY 2
3 MSY IAD 2
4 VIE ZRH 1
I would like to duplicate the rows with a value of 2 in column "n" and invert their airport codes in the departure and arrival columns. The output should look like this:
ID departure_airport arrival_airport n
1 ZRH BUD 1
2 ZRH MSY 2
3 MSY ZRH 2
4 MSY IAD 2
5 IAD MSY 2
6 VIE ZRH 1
Any suggestions on how I can get the resulting table? Thanks in advance.
CodePudding user response:
You can filter all rows where n equals 2 and rbind
it to your data.frame in a different order:
rbind(df1, df1[df1$n == 2, c(1, 3, 2, 4)])
CodePudding user response:
We may use uncount
library(dplyr)
library(tidyr)
df1 %>%
uncount(n, .remove = FALSE) %>%
group_by(ID) %>%
mutate(departure_airport = replace(departure_airport,
duplicated(departure_airport), first(arrival_airport)),
arrival_airport = replace(arrival_airport, duplicated(arrival_airport), first(departure_airport))) %>%
ungroup %>%
mutate(ID = row_number())
-output
# A tibble: 6 × 4
ID departure_airport arrival_airport n
<int> <chr> <chr> <int>
1 1 ZRH BUD 1
2 2 ZRH MSY 2
3 3 MSY ZRH 2
4 4 MSY IAD 2
5 5 IAD MSY 2
6 6 VIE ZRH 1
data
df1 <- structure(list(ID = 1:4, departure_airport = c("ZRH", "ZRH",
"MSY", "VIE"), arrival_airport = c("BUD", "MSY", "IAD", "ZRH"
), n = c(1L, 2L, 2L, 1L)), class = "data.frame",
row.names = c(NA,
-4L))
CodePudding user response:
The other answers are good - I think dplyr
offers several readable methods for doing this, but in this case I'd do:
library(dplyr)
dat |>
filter(n == 2) |>
rename(
departure_airport = arrival_airport,
arrival_airport = departure_airport
) |>
bind_rows(dat) |>
select(names(dat)) |>
arrange(ID)
# ID departure_airport arrival_airport n
# 1 1 ZRH BUD 1
# 2 2 MSY ZRH 2
# 3 2 ZRH MSY 2
# 4 3 IAD MSY 2
# 5 3 MSY IAD 2
# 6 4 VIE ZRH 1
Assumes your data is called dat
obviously.