With help from Stack Overflow I managed to finish my script. One of my problems was solved here: "How to select multiple, inner list elements with lapply in R?". Now that the script is ready, I tried to run it on a bigger nested list, and I am stuck again at the same line of code with a new error.
This is the nested list:
str(df_raw_comments_threads)
List of 89
$ tgg001 :List of 2
..$ comments:'data.frame': 992 obs. of 12 variables:
.. ..$ date_utc : chr [1:992] "2019-02-05" "2019-02-05" "2019-02-05" "2019-02-05" ...
.. ..$ timestamp : num [1:992] 1.55e+09 1.55e+09 1.55e+09 1.55e+09 1.55e+09 ...
.. ..$ subreddit : chr [1:992] "hardwareswap" "hardwareswap" "hardwareswap" "hardwareswap" ...
.. ..$ thread_author : chr [1:992] "aagarwal82" "aagarwal82" "aagarwal82" "[deleted]" ...
.. ..$ comment_author: chr [1:992] "tgg001" "tgg001" "tgg001" "tgg001" ...
.. ..$ thread_title : chr [1:992] "[USA-NC] [H] NVIDIA GTX 1080 TI MSI Duke [W] PayPal" "[USA-NC] [H] NVIDIA GTX 1080 TI MSI Duke [W] PayPal" "[USA-NC] [H] NVIDIA GTX 1080 TI MSI Duke [W] PayPal" "[USA-VA] [H] G.Skill Trident Z RGB 32GB 4 x 8GB DDR4-3200, Enermax Liqtech TR4 II 280mm, Asus X399 Zenith Extre"| __truncated__ ...
.. ..$ comment : chr [1:992] "Yes I paid for it" "SOLD TO ME W\017" "check pm" "Well I need the other 16gb.... OP can we split this?" ...
.. ..$ score : num [1:992] 1 1 1 1 1 1 1 1 1 2 ...
.. ..$ up : num [1:992] 1 1 1 1 1 1 1 1 1 2 ...
.. ..$ downs : num [1:992] 0 0 0 0 0 0 0 0 0 0 ...
.. ..$ golds : num [1:992] 0 0 0 0 0 0 0 0 0 0 ...
..$ threads :'data.frame': 62 obs. of 11 variables:
.. ..$ date_utc : chr [1:62] "2019-01-22" "2019-01-21" "2019-01-21" "2018-11-29" ...
.. ..$ timestamp: num [1:62] 1.55e+09 1.55e+09 1.55e+09 1.54e+09 1.54e+09 ...
.. ..$ subreddit: chr [1:62] "FortniteCompetitive" "FortNiteBR" "FortNiteBR" "Kanye" ...
.. ..$ author : chr [1:62] "tgg001" "tgg001" "tgg001" "tgg001" ...
.. ..$ title : chr [1:62] "Well rip my 4th consecutive solo win streak today. he had 1 kill and 50 health... clearly the better player" "15 kills later basically solo squad and this is how I die. Epic can we vault trees?" "I went AFK for 20 seconds and I came back at the perfect time lol" "=«>â Is he wavy? >\024 If not try to roast him with Kanye lyric-related comments" ...
.. ..$ text : chr [1:62] "" "" "" "" ...
.. ..$ golds : num [1:62] 0 0 0 0 0 0 0 0 0 0 ...
.. ..$ score : num [1:62] 0 3289 3115 0 41 ...
.. ..$ ups : num [1:62] 0 3289 3115 0 41 ...
.. ..$ downs : num [1:62] 0 0 0 0 0 0 0 0 0 0 ...
I want to select "subreddit" and "date_utc" from each df of each element in the nested list:
df_map_date_sub <- map2(
  df_raw_comments_threads |> map(~ .x$comments |> dplyr::select("subreddit", "date_utc")),
  df_raw_comments_threads |> map(~ .x$threads |> dplyr::select("subreddit", "date_utc")),
  ~ list(comments = .x, threads = .y)
)
Error in UseMethod("select") :
no applicable method for 'select' applied to an object of class "NULL"
At first I thought: this line already worked, so maybe I should restart R. Then I thought the list might contain some NULLs that I could handle with tryCatch, so I found this code to check for NULL entries:
group_errs = df_map_comments_threads %>% map(~ .x$comments |>
keep(~is.null(.x) ) ) %>%
names()
> length(group_errs)
[1] 89
I am very confused now. What does the error mean? Please help.
I also tried:
df_map_date_sub <- map2(
  df_raw_comments_threads |> map(.x$comments, ~ dplyr::select(., subreddit, date_utc)),
  df_raw_comments_threads |> map(.x$threads, ~ dplyr::select(., subreddit, date_utc)),
  ~ list(comments = .x, threads = .y)
)
Error in as_mapper(.f, ...) : object '.x' not found
Answer:
We could use purrr::safely() to provide a default empty data frame for comments and threads in case of an error:
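library(purrr)
library(dplyr)

# Wrap the extraction so that a failure for one user returns empty data frames
# instead of aborting the whole map().
extract_subreddit_date_safely <- safely(
  ~ list(
    comments = .x[["comments"]] |> dplyr::select("subreddit", "date_utc"),
    threads = .x[["threads"]] |> dplyr::select("subreddit", "date_utc")
  ),
  otherwise = list(comments = data.frame(), threads = data.frame())
)

# Each element of res_safe is a list with `result` and `error`.
res_safe <-
  df_raw_comments_threads %>%
  map(extract_subreddit_date_safely)

# Keep only the results.
res <-
  res_safe %>%
  map("result")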
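To see how the safe wrapper behaves, here is a minimal sketch on made-up data. The list `toy`, its contents, and the names `pick_cols`, `toy_safe`, `toy_res` and `failed` are hypothetical and only stand in for the real `df_raw_comments_threads`; the assumption is that at least one user lacks a `threads` (or `comments`) data frame.

```r
library(purrr)
library(dplyr)

# Hypothetical toy data: user "b" has no `threads` element,
# which is one plausible cause of the select() error on NULL.
toy <- list(
  a = list(
    comments = data.frame(subreddit = "r1", date_utc = "2019-02-05", score = 1),
    threads  = data.frame(subreddit = "r2", date_utc = "2019-01-22", score = 3)
  ),
  b = list(
    comments = data.frame(subreddit = "r3", date_utc = "2019-02-06", score = 2)
  )
)

# Same idea as the wrapper above: errors are captured instead of thrown.
pick_cols <- safely(
  ~ list(
    comments = .x[["comments"]] |> dplyr::select("subreddit", "date_utc"),
    threads  = .x[["threads"]] |> dplyr::select("subreddit", "date_utc")
  ),
  otherwise = list(comments = data.frame(), threads = data.frame())
)

toy_safe <- map(toy, pick_cols)

# Extract the results; failed elements get the `otherwise` value.
toy_res <- map(toy_safe, "result")

# Flag the elements whose extraction raised an error.
failed <- map_lgl(toy_safe, ~ !is.null(.x[["error"]]))
```

One thing to keep in mind with this design: when either data frame is missing, `otherwise` replaces the whole result, so a user with valid comments but no threads ends up with two empty data frames.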
Examine problematic records with:
df_raw_comments_threads[map_lgl(res_safe, ~ !is.null(.x[["error"]]))]
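As for what the error means: `select()` was handed `NULL` instead of a data frame, so most likely `.x$comments` or `.x$threads` is `NULL` for at least one of the 89 elements. The check in the question cannot show this, because `map()` returns one element per input, so `names()` of its result always has length 89 no matter what `keep()` finds. A more direct check could look like the sketch below. The list `toy` and the names `has_null`, `complete` and `res_complete` are hypothetical and only used for illustration.

```r
library(purrr)
library(dplyr)

# Hypothetical toy data standing in for df_raw_comments_threads.
toy <- list(
  a = list(
    comments = data.frame(subreddit = "r1", date_utc = "2019-02-05"),
    threads  = data.frame(subreddit = "r2", date_utc = "2019-01-22")
  ),
  b = list(
    comments = data.frame(subreddit = "r3", date_utc = "2019-02-06"),
    threads  = NULL
  )
)

# TRUE where either data frame is missing (NULL).
has_null <- map_lgl(toy, ~ is.null(.x$comments) || is.null(.x$threads))

# Names of the problematic elements.
names(toy)[has_null]

# Keep only the complete elements, then select the two columns
# from both data frames of each remaining element.
complete <- toy[!has_null]
res_complete <- map(
  complete,
  ~ map(.x[c("comments", "threads")], \(d) dplyr::select(d, "subreddit", "date_utc"))
)
```

The second error in the question (`object '.x' not found`) has a different cause: in `map(.x$comments, ~ ...)` the expression `.x$comments` sits outside the formula, so R tries to evaluate `.x` as an ordinary variable, and it only exists inside the `~` lambda.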