I am trying to implement something that may be too complicated to code by hand, so I am asking here whether there is a better solution :)
Below is an example traffic origin-destination table (A, B and C are all locations):
data <- data.frame(origin = c("A", "A", "B", "B", "B", "C"),
                   destination = c("B", "C", "A", "B", "C", "A"),
                   number = c(7, 2, 3, 5, 6, 4))
The table looks like this:
  origin destination number
1      A           B      7
2      A           C      2
3      B           A      3
4      B           B      5
5      B           C      6
6      C           A      4
Ideally, every origin should have 3 rows of data: for example, A to A, A to B and A to C. But as you can see in the table, some rows are missing for origins A and C (A to A for origin A, and C to B and C to C for origin C).
What I want is for R to add the missing rows automatically and assign the value 1 in the number column.
That is, the final table should look something like this:
  origin destination number
1      A           A      1
2      A           B      7
3      A           C      2
4      B           A      3
5      B           B      5
6      B           C      6
7      C           A      4
8      C           B      1
9      C           C      1
Is there an existing R function that can do this? If not, do you have any suggestions for keeping the data processing code short and efficient? Thanks very much in advance for your help!
Answer:
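The complete() function from tidyr does this. It expands the data to every combination of origin and destination, and the fill argument sets the value used for number in the rows it adds.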
library(tidyr)
complete(data, origin, destination, fill = list(number = 1))
# # A tibble: 9 x 3
#   origin destination number
#   <chr>  <chr>        <dbl>
# 1 A      A                1
# 2 A      B                7
# 3 A      C                2
# 4 B      A                3
# 5 B      B                5
# 6 B      C                6
# 7 C      A                4
# 8 C      B                1
# 9 C      C                1
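
If you would rather not depend on tidyr, the same idea can be sketched in base R: build every origin-destination pair with expand.grid(), left-join the observed data with merge(), and replace the resulting NA values with 1. This is only a rough sketch, not a drop-in equivalent of complete(). The names locations, all_pairs and result are made up for illustration, and it assumes the set of locations is the union of everything seen in either column (complete() uses the values present in each column separately, which happens to give the same result for this data).

```r
data <- data.frame(origin = c("A", "A", "B", "B", "B", "C"),
                   destination = c("B", "C", "A", "B", "C", "A"),
                   number = c(7, 2, 3, 5, 6, 4))

# Assumption: any location seen as an origin or a destination counts
locations <- sort(unique(c(data$origin, data$destination)))

# Every possible origin-destination pair (9 rows for 3 locations)
all_pairs <- expand.grid(origin = locations, destination = locations,
                         stringsAsFactors = FALSE)

# Left join: pairs that are absent from `data` get NA in `number`
result <- merge(all_pairs, data, by = c("origin", "destination"),
                all.x = TRUE)

# Fill the missing counts with 1, then sort and reset the row names
result$number[is.na(result$number)] <- 1
result <- result[order(result$origin, result$destination), ]
rownames(result) <- NULL

print(result)
```

Using the union of locations means a place that only ever appears as a destination would still get its own origin rows; if that is not what you want, pass unique(data$origin) and unique(data$destination) to expand.grid() instead.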