I have a dataset like this:
FID | osmid | s | e | seg_length |
---|---|---|---|---|
0 | 4999 | 733 | 99 | 7.7 |
1 | 566 | 733 | 33 | 3.2 |
2 | 499 | 713 | 96 | 7.7 |
3 | 56 | 783 | 32 | 3.5 |
4 | 409 | 783 | 98 | 7.6 |
5 | 516 | 736 | 38 | 3.5 |
6 | 459 | 739 | 98 | 7.7 |
7 | 526 | 731 | 33 | 3.2 |
s stands for the starting point and e for the end point. Some FIDs share the same start point or the same end point. I want every start point and every end point to appear only once, so when two rows share a point I want to keep only the row with the shortest seg_length. I couldn't find good code for that.
For example, FID 0 and FID 1 share the same starting point, so only FID 1 should remain in the new dataset. Likewise, FID 4 and FID 6 share the same end point, so only FID 4 should remain.
CodePudding user response:
We may use slice_min after grouping by 's':
library(dplyr)
df1 %>%
  group_by(s) %>%                          # one group per start point
  slice_min(seg_length, n = 1) %>%         # keep the shortest segment in each group
  ungroup()
Output:
# A tibble: 6 × 5
    FID osmid     s     e seg_length
  <int> <int> <int> <int>      <dbl>
1     2   499   713    96        7.7
2     7   526   731    33        3.2
3     1   566   733    33        3.2
4     5   516   736    38        3.5
5     6   459   739    98        7.7
6     3    56   783    32        3.5
Data:
df1 <- structure(list(
  FID = 0:7,
  osmid = c(4999L, 566L, 499L, 56L, 409L, 516L, 459L, 526L),
  s = c(733L, 733L, 713L, 783L, 783L, 736L, 739L, 731L),
  e = c(99L, 33L, 96L, 32L, 98L, 38L, 98L, 33L),
  # FID 4 is 7.6, as in the question's table
  seg_length = c(7.7, 3.2, 7.7, 3.5, 7.6, 3.5, 7.7, 3.2)),
  class = "data.frame", row.names = c(NA, -8L))
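The code above removes duplicates on the start point only, while the question also asks for unique end points. One possible extension, offered as a sketch rather than a definitive solution, is to sort by seg_length and then keep the first row per start point and, among the survivors, the first row per end point. The name unique_points and the use of FID as a tie-breaker are assumptions made for illustration; the data are the values from the question's table.

```r
library(dplyr)

# Data as in the question's table (FID 4 has seg_length 7.6).
df1 <- data.frame(
  FID = 0:7,
  osmid = c(4999L, 566L, 499L, 56L, 409L, 516L, 459L, 526L),
  s = c(733L, 733L, 713L, 783L, 783L, 736L, 739L, 731L),
  e = c(99L, 33L, 96L, 32L, 98L, 38L, 98L, 33L),
  seg_length = c(7.7, 3.2, 7.7, 3.5, 7.6, 3.5, 7.7, 3.2)
)

# unique_points is a hypothetical name for the result.
unique_points <- df1 %>%
  arrange(seg_length, FID) %>%       # shortest first; FID breaks ties (assumption)
  distinct(s, .keep_all = TRUE) %>%  # keep the first (shortest) row per start point
  distinct(e, .keep_all = TRUE)      # then the first (shortest) row per end point

unique_points$FID
# 1 3 5 2 6
```

The result depends on the order of the two steps. Here FID 4 is already dropped in the start-point step (it shares start 783 with the shorter FID 3), so FID 6 is the only remaining row with end point 98 and is kept. If the end points should take priority, swapping the two distinct() calls is one way to express that.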
CodePudding user response:
We could group and then filter by min(seg_length):
library(dplyr)
df1 %>%
  group_by(s) %>%
  filter(seg_length == min(seg_length)) %>%
  ungroup()
    FID osmid     s     e seg_length
  <int> <int> <int> <int>      <dbl>
1     1   566   733    33        3.2
2     2   499   713    96        7.7
3     3    56   783    32        3.5
4     5   516   736    38        3.5
5     6   459   739    98        7.7
6     7   526   731    33        3.2
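One caveat: filtering with seg_length == min(seg_length) keeps every row that ties for the minimum, and slice_min() does the same by default (with_ties = TRUE). The example data has no ties within a start point, so the results above are unaffected. If exactly one row per start point is required even when lengths tie, a minimal sketch is to pass with_ties = FALSE. The toy data frame tied below is made up purely for illustration.

```r
library(dplyr)

# Hypothetical toy data: FID 0 and FID 1 share start point 1 and have equal lengths.
tied <- data.frame(
  FID = 0:2,
  s = c(1L, 1L, 2L),
  seg_length = c(3.2, 3.2, 5.0)
)

# filter() keeps all rows equal to the group minimum, so both tied rows survive.
kept_all_ties <- tied %>%
  group_by(s) %>%
  filter(seg_length == min(seg_length)) %>%
  ungroup()

# slice_min() with with_ties = FALSE returns exactly one row per group.
one_per_start <- tied %>%
  group_by(s) %>%
  slice_min(seg_length, n = 1, with_ties = FALSE) %>%
  ungroup()

nrow(kept_all_ties)  # 3
nrow(one_per_start)  # 2
```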