I have two lists of data frames, but one has an extra ID in it. I would like to remove the extra ID using the names assigned to each component of the list. My actual data set has a whole slew of IDs, so I would essentially like to create a function that removes any ID with a different name, using the names in another list as the index.
How could I go about doing this? In this case, I would be removing the ID D from the list, not because of the data frames in D, but because the names in july2 differ from the names in july.
I have tried using setdiff, but it just returns the list that I place in the first argument.
> setdiff(july, july2)
<list_of<
tbl_df<
date : date
x : double
y : double
ID : character
jDate: double
Month: double
new : date
>
>[12]>
$A
# A tibble: 16 x 7
date x y ID jDate Month new
<date> <dbl> <dbl> <chr> <dbl> <dbl> <date>
1 2010-07-04 60161. 800440. A 14794 7 2010-07-01
2 2010-07-08 61139. 825947. A 14798 7 2010-07-01
3 2010-07-04 60161. 800440. A 14794 7 2010-07-01
4 2010-07-08 61139. 825947. A 14798 7 2010-07-01
5 2010-07-04 60161. 800440. A 14794 7 2010-07-01
6 2010-07-08 61139. 825947. A 14798 7 2010-07-01
7 2010-07-04 60161. 800440. A 14794 7 2010-07-01
8 2010-07-08 61139. 825947. A 14798 7 2010-07-01
9 2010-07-04 60161. 800440. A 14794 7 2010-07-01
10 2010-07-08 61139. 825947. A 14798 7 2010-07-01
11 2010-07-04 60161. 800440. A 14794 7 2010-07-01
12 2010-07-08 61139. 825947. A 14798 7 2010-07-01
13 2010-07-04 60161. 800440. A 14794 7 2010-07-01
14 2010-07-08 61139. 825947. A 14798 7 2010-07-01
15 2010-07-04 60161. 800440. A 14794 7 2010-07-01
16 2010-07-08 61139. 825947. A 14798 7 2010-07-01
$A
# A tibble: 24 x 7
date x y ID jDate Month new
<date> <dbl> <dbl> <chr> <dbl> <dbl> <date>
1 2010-07-12 66502. 804956. A 14802 7 2010-07-11
2 2010-07-16 79728. 858097. A 14806 7 2010-07-11
3 2010-07-20 77342. 830852. A 14810 7 2010-07-11
4 2010-07-12 66502. 804956. A 14802 7 2010-07-11
5 2010-07-16 79728. 858097. A 14806 7 2010-07-11
6 2010-07-20 77342. 830852. A 14810 7 2010-07-11
7 2010-07-12 66502. 804956. A 14802 7 2010-07-11
8 2010-07-16 79728. 858097. A 14806 7 2010-07-11
9 2010-07-20 77342. 830852. A 14810 7 2010-07-11
10 2010-07-12 66502. 804956. A 14802 7 2010-07-11
# ... with 14 more rows
$A
# A tibble: 16 x 7
date x y ID jDate Month new
<date> <dbl> <dbl> <chr> <dbl> <dbl> <date>
1 2010-07-24 75483. 828763. A 14814 7 2010-07-21
2 2010-07-28 69508. 806470. A 14818 7 2010-07-21
3 2010-07-24 75483. 828763. A 14814 7 2010-07-21
4 2010-07-28 69508. 806470. A 14818 7 2010-07-21
5 2010-07-24 75483. 828763. A 14814 7 2010-07-21
6 2010-07-28 69508. 806470. A 14818 7 2010-07-21
7 2010-07-24 75483. 828763. A 14814 7 2010-07-21
8 2010-07-28 69508. 806470. A 14818 7 2010-07-21
9 2010-07-24 75483. 828763. A 14814 7 2010-07-21
10 2010-07-28 69508. 806470. A 14818 7 2010-07-21
11 2010-07-24 75483. 828763. A 14814 7 2010-07-21
12 2010-07-28 69508. 806470. A 14818 7 2010-07-21
13 2010-07-24 75483. 828763. A 14814 7 2010-07-21
14 2010-07-28 69508. 806470. A 14818 7 2010-07-21
15 2010-07-24 75483. 828763. A 14814 7 2010-07-21
16 2010-07-28 69508. 806470. A 14818 7 2010-07-21
$B
# A tibble: 24 x 7
date x y ID jDate Month new
<date> <dbl> <dbl> <chr> <dbl> <dbl> <date>
1 2010-07-01 72826. 888060. B 14791 7 2010-07-01
2 2010-07-05 67469. 807307. B 14795 7 2010-07-01
3 2010-07-09 77834. 868002. B 14799 7 2010-07-01
4 2010-07-01 72826. 888060. B 14791 7 2010-07-01
5 2010-07-05 67469. 807307. B 14795 7 2010-07-01
6 2010-07-09 77834. 868002. B 14799 7 2010-07-01
7 2010-07-01 72826. 888060. B 14791 7 2010-07-01
8 2010-07-05 67469. 807307. B 14795 7 2010-07-01
9 2010-07-09 77834. 868002. B 14799 7 2010-07-01
10 2010-07-01 72826. 888060. B 14791 7 2010-07-01
# ... with 14 more rows
$B
# A tibble: 16 x 7
date x y ID jDate Month new
<date> <dbl> <dbl> <chr> <dbl> <dbl> <date>
1 2010-07-13 74643. 845222. B 14803 7 2010-07-11
2 2010-07-17 78530. 807316. B 14807 7 2010-07-11
3 2010-07-13 74643. 845222. B 14803 7 2010-07-11
4 2010-07-17 78530. 807316. B 14807 7 2010-07-11
5 2010-07-13 74643. 845222. B 14803 7 2010-07-11
6 2010-07-17 78530. 807316. B 14807 7 2010-07-11
7 2010-07-13 74643. 845222. B 14803 7 2010-07-11
8 2010-07-17 78530. 807316. B 14807 7 2010-07-11
9 2010-07-13 74643. 845222. B 14803 7 2010-07-11
10 2010-07-17 78530. 807316. B 14807 7 2010-07-11
11 2010-07-13 74643. 845222. B 14803 7 2010-07-11
12 2010-07-17 78530. 807316. B 14807 7 2010-07-11
13 2010-07-13 74643. 845222. B 14803 7 2010-07-11
14 2010-07-17 78530. 807316. B 14807 7 2010-07-11
15 2010-07-13 74643. 845222. B 14803 7 2010-07-11
16 2010-07-17 78530. 807316. B 14807 7 2010-07-11
$B
# A tibble: 24 x 7
date x y ID jDate Month new
<date> <dbl> <dbl> <chr> <dbl> <dbl> <date>
1 2010-07-21 61332. 840310. B 14811 7 2010-07-21
2 2010-07-25 69102. 809024. B 14815 7 2010-07-21
3 2010-07-29 66088. 817887. B 14819 7 2010-07-21
4 2010-07-21 61332. 840310. B 14811 7 2010-07-21
5 2010-07-25 69102. 809024. B 14815 7 2010-07-21
6 2010-07-29 66088. 817887. B 14819 7 2010-07-21
7 2010-07-21 61332. 840310. B 14811 7 2010-07-21
8 2010-07-25 69102. 809024. B 14815 7 2010-07-21
9 2010-07-29 66088. 817887. B 14819 7 2010-07-21
10 2010-07-21 61332. 840310. B 14811 7 2010-07-21
# ... with 14 more rows
$C
# A tibble: 24 x 7
date x y ID jDate Month new
<date> <dbl> <dbl> <chr> <dbl> <dbl> <date>
1 2010-07-02 71110. 898586. C 14792 7 2010-07-01
2 2010-07-06 78769. 821287. C 14796 7 2010-07-01
3 2010-07-10 62446. 874366. C 14800 7 2010-07-01
4 2010-07-02 71110. 898586. C 14792 7 2010-07-01
5 2010-07-06 78769. 821287. C 14796 7 2010-07-01
6 2010-07-10 62446. 874366. C 14800 7 2010-07-01
7 2010-07-02 71110. 898586. C 14792 7 2010-07-01
8 2010-07-06 78769. 821287. C 14796 7 2010-07-01
9 2010-07-10 62446. 874366. C 14800 7 2010-07-01
10 2010-07-02 71110. 898586. C 14792 7 2010-07-01
# ... with 14 more rows
$C
# A tibble: 16 x 7
date x y ID jDate Month new
<date> <dbl> <dbl> <chr> <dbl> <dbl> <date>
1 2010-07-14 77316. 882468. C 14804 7 2010-07-11
2 2010-07-18 65028. 815016. C 14808 7 2010-07-11
3 2010-07-14 77316. 882468. C 14804 7 2010-07-11
4 2010-07-18 65028. 815016. C 14808 7 2010-07-11
5 2010-07-14 77316. 882468. C 14804 7 2010-07-11
6 2010-07-18 65028. 815016. C 14808 7 2010-07-11
7 2010-07-14 77316. 882468. C 14804 7 2010-07-11
8 2010-07-18 65028. 815016. C 14808 7 2010-07-11
9 2010-07-14 77316. 882468. C 14804 7 2010-07-11
10 2010-07-18 65028. 815016. C 14808 7 2010-07-11
11 2010-07-14 77316. 882468. C 14804 7 2010-07-11
12 2010-07-18 65028. 815016. C 14808 7 2010-07-11
13 2010-07-14 77316. 882468. C 14804 7 2010-07-11
14 2010-07-18 65028. 815016. C 14808 7 2010-07-11
15 2010-07-14 77316. 882468. C 14804 7 2010-07-11
16 2010-07-18 65028. 815016. C 14808 7 2010-07-11
$C
# A tibble: 24 x 7
date x y ID jDate Month new
<date> <dbl> <dbl> <chr> <dbl> <dbl> <date>
1 2010-07-22 65117. 866750. C 14812 7 2010-07-21
2 2010-07-26 78462. 823259. C 14816 7 2010-07-21
3 2010-07-30 69577. 848118. C 14820 7 2010-07-21
4 2010-07-22 65117. 866750. C 14812 7 2010-07-21
5 2010-07-26 78462. 823259. C 14816 7 2010-07-21
6 2010-07-30 69577. 848118. C 14820 7 2010-07-21
7 2010-07-22 65117. 866750. C 14812 7 2010-07-21
8 2010-07-26 78462. 823259. C 14816 7 2010-07-21
9 2010-07-30 69577. 848118. C 14820 7 2010-07-21
10 2010-07-22 65117. 866750. C 14812 7 2010-07-21
# ... with 14 more rows
$D
# A tibble: 16 x 7
date x y ID jDate Month new
<date> <dbl> <dbl> <chr> <dbl> <dbl> <date>
1 2010-07-03 77586. 819905. D 14793 7 2010-07-01
2 2010-07-07 76249. 848582. D 14797 7 2010-07-01
3 2010-07-03 77586. 819905. D 14793 7 2010-07-01
4 2010-07-07 76249. 848582. D 14797 7 2010-07-01
5 2010-07-03 77586. 819905. D 14793 7 2010-07-01
6 2010-07-07 76249. 848582. D 14797 7 2010-07-01
7 2010-07-03 77586. 819905. D 14793 7 2010-07-01
8 2010-07-07 76249. 848582. D 14797 7 2010-07-01
9 2010-07-03 77586. 819905. D 14793 7 2010-07-01
10 2010-07-07 76249. 848582. D 14797 7 2010-07-01
11 2010-07-03 77586. 819905. D 14793 7 2010-07-01
12 2010-07-07 76249. 848582. D 14797 7 2010-07-01
13 2010-07-03 77586. 819905. D 14793 7 2010-07-01
14 2010-07-07 76249. 848582. D 14797 7 2010-07-01
15 2010-07-03 77586. 819905. D 14793 7 2010-07-01
16 2010-07-07 76249. 848582. D 14797 7 2010-07-01
$D
# A tibble: 24 x 7
date x y ID jDate Month new
<date> <dbl> <dbl> <chr> <dbl> <dbl> <date>
1 2010-07-11 61531. 883305. D 14801 7 2010-07-11
2 2010-07-15 69514. 867063. D 14805 7 2010-07-11
3 2010-07-19 69178. 890183. D 14809 7 2010-07-11
4 2010-07-11 61531. 883305. D 14801 7 2010-07-11
5 2010-07-15 69514. 867063. D 14805 7 2010-07-11
6 2010-07-19 69178. 890183. D 14809 7 2010-07-11
7 2010-07-11 61531. 883305. D 14801 7 2010-07-11
8 2010-07-15 69514. 867063. D 14805 7 2010-07-11
9 2010-07-19 69178. 890183. D 14809 7 2010-07-11
10 2010-07-11 61531. 883305. D 14801 7 2010-07-11
# ... with 14 more rows
$D
# A tibble: 24 x 7
date x y ID jDate Month new
<date> <dbl> <dbl> <chr> <dbl> <dbl> <date>
1 2010-07-23 74554. 898077. D 14813 7 2010-07-21
2 2010-07-27 77455. 834715. D 14817 7 2010-07-21
3 2010-07-31 77461. 873993. D 14821 7 2010-07-21
4 2010-07-23 74554. 898077. D 14813 7 2010-07-21
5 2010-07-27 77455. 834715. D 14817 7 2010-07-21
6 2010-07-31 77461. 873993. D 14821 7 2010-07-21
7 2010-07-23 74554. 898077. D 14813 7 2010-07-21
8 2010-07-27 77455. 834715. D 14817 7 2010-07-21
9 2010-07-31 77461. 873993. D 14821 7 2010-07-21
10 2010-07-23 74554. 898077. D 14813 7 2010-07-21
# ... with 14 more rows
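library(dplyr)      # %>%, mutate, group_by, filter, group_split, if_else
library(lubridate)  # dmy, floor_date, day, days, month
ID <- rep(c("A","B","C", "D"), 1000)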
ID2 <- rep(c("A", "B", "C"), 1000)
date <- rep_len(seq(dmy("01-01-2010"), dmy("31-12-2013"), by = "days"), 500)
x <- runif(length(date), min = 60000, max = 80000)
y <- runif(length(date), min = 800000, max = 900000)
df <- data.frame(date = date,
                 x = x,
                 y = y,
                 ID)
df2 <- data.frame(date = date,
                  x = x,
                  y = y,
                  ID2)
df2$jDate <- julian(as.Date(df2$date), origin = as.Date("1970-01-01"))
df2$Month <- month(df2$date)
july <- df %>%
  # Creates a new column assigning the first day in the 10-day interval in which
  # the date falls under (e.g., 01-03-2021 would be in the first 10-day interval
  # so the `floor_date` assigned to it would be 01-01-2021)
  mutate(new = floor_date(date, "10 days")) %>%
  # For any month that has 31 days, the 31st day would normally be assigned its
  # own interval. The code below takes the 31st day and joins it with the
  # previous interval.
  group_by(ID) %>%
  mutate(new = if_else(day(new) == 31, new - days(10), new)) %>%
  group_by(new, .add = TRUE) %>%
  # Keep only the July rows, based on the `Month` column
  filter(Month == "7") %>%
  group_split()
july2 <- df2 %>%
  # Creates a new column assigning the first day in the 10-day interval in which
  # the date falls under (e.g., 01-03-2021 would be in the first 10-day interval
  # so the `floor_date` assigned to it would be 01-01-2021)
  mutate(new = floor_date(date, "10 days")) %>%
  # For any month that has 31 days, the 31st day would normally be assigned its
  # own interval. The code below takes the 31st day and joins it with the
  # previous interval.
  group_by(ID2) %>%
  mutate(new = if_else(day(new) == 31, new - days(10), new)) %>%
  group_by(new, .add = TRUE) %>%
  # Keep only the July rows, based on the `Month` column
  filter(Month == "7") %>%
  group_split()
names(july) <- sapply(july, function(x) paste(x$ID[1]))
names(july2) <- sapply(july2, function(x) paste(x$ID2[1]))
CodePudding user response:
It sounds like you're trying to remove the entries in the july list whose names don't appear in the july2 list. If that's the case, adding the code below to your script should do the trick:
# Get the unique names of the `july` list
jul_names <- unique(names(july))
# Find out which names are shared between the two lists
same_names <- jul_names[jul_names %in% unique(names(july2))]
# Subset the july list to only keep the entries with shared names
july <- july[names(july) %in% same_names]
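Since you mentioned wanting a function for this, here is a minimal sketch of how the subsetting above could be wrapped up. The helper name keep_shared_names and the toy lists a and b are made up for illustration; they aren't from your data or from any package:

```r
# Hypothetical helper (name is illustrative only):
# keep the elements of `x` whose names also appear in names(`ref`).
keep_shared_names <- function(x, ref) {
  x[names(x) %in% names(ref)]
}

# Toy lists standing in for july and july2; note the repeated names,
# just like the A/A/A, B/B/B, ... names in the real lists
a <- list(A = 1, A = 2, B = 3, D = 4, D = 5)
b <- list(A = 10, B = 20, C = 30)

kept <- keep_shared_names(a, b)
names(kept)
```

On your lists the call would presumably be july <- keep_shared_names(july, july2). Because it matches with %in%, every element carrying a shared name is kept, including repeated names.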
If that's not what you're hoping for, then we'll need a few more details about the problem. As a commenter pointed out, there are a couple of bugs in your reprex (for instance, df never gets a Month column, so filter(Month == "7") fails on it), so I made my best guess at what you were trying to do:
july <- df %>%
  mutate(new = floor_date(date, "10 days")) %>%
  mutate(new = if_else(day(new) == 31, new - days(10), new)) %>%
  group_by(ID, new) %>%
  filter(month(date) == 7) %>%
  group_split()
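As for why setdiff(july, july2) handed back the whole first list: as far as I can tell, setdiff on two lists compares the elements themselves (here, the data frames) and ignores the names, so nothing in july counts as "also in july2". Comparing the names instead gives you the extra IDs directly. A small sketch with made-up toy lists (a and b are placeholders, not your data):

```r
# Toy lists: same names for A and B, but different contents
a <- list(A = 1, B = 2, D = 3)
b <- list(A = 10, B = 20)

# Comparing the lists directly looks at the values, not the names,
# so all three elements of `a` come back
length(setdiff(a, b))

# Comparing the names shows which IDs are extra in `a`
extra <- setdiff(names(a), names(b))
extra

# Drop every element whose name is in `extra`
a_clean <- a[!names(a) %in% extra]
names(a_clean)
```

On your real lists that would be setdiff(names(july), names(july2)), run after the names have been assigned with the sapply lines from your question.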