I have a data frame with a column "URLs" that contains 23k redirect URLs. I want to resolve each redirect to its final URL and store the results in a new column. However, some of the original URLs are no longer valid and throw an error, so I want to wrap the request in tryCatch. Since I am still a beginner in R, I am not sure how to write this correctly.
I ran dput on the first few rows of my "URLs" column and added one deliberately invalid URL at the end:
c("https://icoholder.com/en/v2/ico/ico-redirect/4321?to=https://sirinlabs.com?utm_source=icoholder",
"https://icoholder.com/en/v2/ico/ico-redirect/1036136?to=https://dash2trade.com?utm_source=icoholder",
"https://icoholder.com/en/v2/ico/ico-redirect/1035284?to=https://impt.io?utm_source=icoholder",
"https://icoholder.com/en/v2/ico/ico-redirect/1030235?to=https://calvaria.io?utm_source=icoholder",
"https://icoholder.com/en/v2/ico/ico-redirect/1011041?to=https://artyfact.art?utm_source=icoholder",
"https://icoholder.com/en/v2/ico/ico-redirect/1031430?to=https://www.projectnexus.app?utm_source=icoholder",
"https://icoholder.com/en/v2/ico/ico-redirect/1005962?to=https://seedon.io?utm_source=icoholder",
"https://icoholder.com/en/v2/ico/ico-redirect/1033498?to=https://vicuna.network?utm_source=icoholder",
"https://icoholder.com/en/v2/ico/ico-redirect/1036409?to=https://cryptoffer.io/?utm_source=icoholder",
"https://icoholder.com/en/v2/ico/ico-redirect/23905?to=http://www.bitcoin.org/?utm_source=icoholder",
"https://icoholder.com/en/v2/ico/ico-redirect/1450?to=https://ethereum.org?utm_source=icoholder",
"https://icoholder.com/en/v2/ico/ico-redirect/17581?to=https://telegram.org?utm_source=icoholder",
"https://icoholder.com/en/v2/ico/ico-redirect/1009688?to=https://egoco.in/?utm_source=icoholder",
"https://icoholder.com/en/v2/ico/ico-redirect/19163?to=https://lapo.io?utm_source=icoholder",
"https://icoholder.com/en/v2/ico/ico-redirect/20971?to=https://ingotcoin.io?utm_source=icoholder",
"https://icoholder.com/en/v2/ico/ico-redirect/26401?to=https://restotoken.org?utm_source=icoholder",
"https://icoholder.com/en/v2/ico/ico-redirect/4321?to=https://ccc"
)
The code I am currently playing around with looks like this:
library(httr)
df$URLs <- tryCatch(sapply(df$URLs, function(x) GET(x)$url), error = function(e) return(NULL))
I have seen questions such as "How to write trycatch in R" that explain how to use tryCatch, but I am not sure how to adapt them to my specific case. I would be grateful for any tips and code adaptations!
CodePudding user response:
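Instead of tryCatch(), I used possibly(), which comes with purrr and does much the same thing: if the wrapped function throws an error, possibly() returns the value given in otherwise (here NA) instead of stopping. Note that the column is named links in my example data rather than URLs.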
library(tidyverse)
library(httr)
df %>%
  mutate(final_url = map_chr(
    links,
    possibly(~ .x %>%
               GET() %>%
               pluck("url"),
             otherwise = NA_character_)
  ))
# A tibble: 17 x 2
links final~1
<chr> <chr>
1 https://icoholder.com/en/v2/ico/ico-redirect/4321?to=https:/~ https:~
2 https://icoholder.com/en/v2/ico/ico-redirect/1036136?to=https%~ https:~
3 https://icoholder.com/en/v2/ico/ico-redirect/1035284?to=https%~ https:~
4 https://icoholder.com/en/v2/ico/ico-redirect/1030235?to=https%~ https:~
5 https://icoholder.com/en/v2/ico/ico-redirect/1011041?to=https%~ https:~
6 https://icoholder.com/en/v2/ico/ico-redirect/1031430?to=https%~ https:~
7 https://icoholder.com/en/v2/ico/ico-redirect/1005962?to=https%~ https:~
8 https://icoholder.com/en/v2/ico/ico-redirect/1033498?to=https%~ https:~
9 https://icoholder.com/en/v2/ico/ico-redirect/1036409?to=https%~ https:~
10 https://icoholder.com/en/v2/ico/ico-redirect/23905?to=http:/~ https:~
11 https://icoholder.com/en/v2/ico/ico-redirect/1450?to=https:/~ https:~
12 https://icoholder.com/en/v2/ico/ico-redirect/17581?to=https:~ https:~
13 https://icoholder.com/en/v2/ico/ico-redirect/1009688?to=https%~ https:~
14 https://icoholder.com/en/v2/ico/ico-redirect/19163?to=https:~ https:~
15 https://icoholder.com/en/v2/ico/ico-redirect/20971?to=https:~ https:~
16 https://icoholder.com/en/v2/ico/ico-redirect/26401?to=https:~ http:/~
17 https://icoholder.com/en/v2/ico/ico-redirect/4321?to=https:/~ NA
# ... with abbreviated variable name 1: final_url
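For completeness, here is a rough base-R sketch of the same idea with tryCatch, since that is what the question asked about. The attempt in the question wraps the whole sapply() call, so a single failing URL makes tryCatch return NULL for everything, and assigning NULL to df$URLs removes the column. Putting tryCatch inside the per-URL function avoids that. The helper names safe_final_url and resolve_all, and the fetch argument, are made up for this illustration and are not part of httr.

```r
# Hypothetical helper (not part of httr): return the final URL after
# redirects, or NA_character_ if the request throws an error.
# `fetch` defaults to httr::GET; it is only a parameter so the helper
# can be tried out with a stand-in function and no network access.
safe_final_url <- function(url, fetch = httr::GET) {
  tryCatch(
    fetch(url)$url,                     # the response's url field holds the final URL
    error = function(e) NA_character_   # one bad URL yields NA instead of aborting
  )
}

# Apply the helper to each element. vapply() enforces one character value
# per input, so the result has the same length as `urls`.
resolve_all <- function(urls, fetch = httr::GET) {
  vapply(urls, safe_final_url, character(1), fetch = fetch, USE.NAMES = FALSE)
}
```

Assuming the data frame from the question, this would be used as df$final_url <- resolve_all(df$URLs), which keeps the original column and puts NA in the rows whose URL could not be fetched.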