I have these data:
u1 <- 'www.link1.com'
u2 <- 'www.link2.com'
u3 <- 'www.link3.com'
u4 <- 'www.link4.com'
I want to process each of them. How can I use a for loop to do this?
for (i in u1 : u4)
{
texti <- gettxt(i)
}
CodePudding user response:
You can use sapply with get to obtain the value of each object. The first argument of sapply depends on how many URLs you have.
sapply(1:4, function(x) gettext(get(paste0("u", x))))
[1] "www.link1.com" "www.link2.com" "www.link3.com" "www.link4.com"
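In case the get(paste0(...)) part is unclear, here is a small sketch of what it does step by step. The name var_names is only for illustration, and I've used vapply instead of sapply so the result is guaranteed to be a character vector; envir = environment() is passed explicitly so the lookup happens where the variables were defined.

```r
u1 <- 'www.link1.com'
u2 <- 'www.link2.com'
u3 <- 'www.link3.com'
u4 <- 'www.link4.com'

# Step 1: build the variable names as strings
var_names <- paste0("u", 1:4)   # "u1" "u2" "u3" "u4"

# Step 2: get() looks up an object by its name
second_url <- get("u2")

# Step 3: do the lookup for every name; vapply checks that each
# result is a single character string
urls <- vapply(var_names, get, character(1), envir = environment())
urls
```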
Updated
If you wish to export the gettxt results to individual vectors, you can use the following (note that lapply is used here instead of sapply):
library(htm2txt)   # provides gettxt()
library(magrittr)  # provides %>%
setNames(lapply(1:4, function(x) gettxt(get(paste0("u", x)))), paste0("u", 1:4, "_txt")) %>%
  list2env(envir = globalenv())
This creates four vectors named u1_txt to u4_txt:
grep("^u", ls(), value = TRUE)
[1] "u1" "u1_txt" "u2" "u2_txt" "u3" "u3_txt" "u4"
[8] "u4_txt"
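If you'd rather not create separate objects in the global environment, one alternative is to keep everything in a single named list. This is only a sketch: fetch_text is a hypothetical stand-in I made up so the code runs without network access, and you would swap in gettxt from htm2txt for real pages.

```r
# Hypothetical stand-in for htm2txt::gettxt(), so the sketch runs offline
fetch_text <- function(url) paste("text from", url)

# A named vector keeps each URL together with its label
urls <- c(u1 = "www.link1.com", u2 = "www.link2.com",
          u3 = "www.link3.com", u4 = "www.link4.com")

# lapply always returns a list and carries the names over
texts <- lapply(urls, fetch_text)
names(texts) <- paste0(names(urls), "_txt")

# Access one result by name instead of through a global variable
texts$u1_txt
```

The results are then reachable as texts$u1_txt to texts$u4_txt, and the list scales to any number of URLs without touching the global environment.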
Example
I've used some sample pages to test the code.
u1 <- "https://CRAN.R-project.org/package=htm2txt"
u2 <- "https://CRAN.R-project.org/package=dplyr"
u3 <- "https://CRAN.R-project.org/package=tidyverse"
u4 <- "https://CRAN.R-project.org/package=tidyr"
setNames(lapply(1:4, function(x) gettxt(get(paste0("u", x)))), paste0("u", 1:4, "_txt")) %>%
list2env(envir = globalenv())
grep("^u", ls(), value = TRUE)
[1] "u1" "u1_txt" "u2" "u2_txt" "u3" "u3_txt" "u4"
[8] "u4_txt"
u1_txt
[1] "htm2txt: Convert Html into Text\n\nConvert a html document to simple plain texts by removing all html tags. This package utilizes regular expressions to strip off html tags. It also offers gettxt() and browse() function, which enables you to get or browse texts at a certain web page.\n\nVersion: 2.1.1\n\nDepends: R (≥ 3.0.0)\n\nPublished: 2017-10-19\n\nAuthor: Sangchul Park [aut, cre]\n\nMaintainer: Sangchul Park <mail at sangchul.com>\n\nBugReports: https://github.com/sangchulpark/htm2txt/issues\n\nLicense: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]\n\nURL: https://github.com/sangchulpark\n\nNeedsCompilation: no\n\nIn views: WebTechnologies\n\nCRAN checks: htm2txt results\n\nDocumentation:\n\nReference manual: htm2txt.pdf\n\nDownloads:\n\nPackage source: htm2txt_2.1.1.tar.gz\n\nWindows binaries: r-devel: htm2txt_2.1.1.zip, r-release: htm2txt_2.1.1.zip, r-oldrel: htm2txt_2.1.1.zip\n\nmacOS binaries: r-release (arm64): htm2txt_2.1.1.tgz, r-release (x86_64): htm2txt_2.1.1.tgz, r-oldrel: htm2txt_2.1.1.tgz\n\nOld sources: htm2txt archive\n\nReverse dependencies:\n\nReverse imports: getDEE2\n\nLinking:\n\nPlease use the canonical form https://CRAN.R-project.org/package=htm2txt to link to this page."
CodePudding user response:
As pointed out by @slowowl and @r2evans, you don't need a loop as R is vectorized. Consider storing the URLs in a vector.
You can do that as follows:
URLs <- c('www.link1.com', 'www.link2.com', 'www.link3.com', 'www.link4.com')
And then you can use lapply over this vector as follows:
text <- lapply(URLs, gettxt)
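Since the question asked specifically about a for loop, here is roughly what the equivalent loop could look like once the URLs are in a vector. It is a sketch under assumptions: fetch_text is a hypothetical placeholder so the code runs offline, and you would replace it with gettxt for real pages.

```r
# Hypothetical placeholder for htm2txt::gettxt(), so the sketch runs offline
fetch_text <- function(url) paste("text from", url)

URLs <- c('www.link1.com', 'www.link2.com', 'www.link3.com', 'www.link4.com')

# Preallocate the result list, then fill it one element at a time
texts <- vector("list", length(URLs))
for (i in seq_along(URLs)) {
  texts[[i]] <- fetch_text(URLs[i])
}
```

Looping over seq_along(URLs) rather than over the variable names avoids the get() lookup entirely.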
To make a vector from individual variables you can use the following code:
vec <- c()
for (i in 1:4) {
  # look up u1, u2, ... by name and append the value
  vec <- c(vec, get(sprintf("u%i", i)))
}
vec
[1] "www.link1.com" "www.link2.com" "www.link3.com" "www.link4.com"
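A loop-free way to collect the individual variables is mget, which looks up several objects by name at once. A minimal sketch, assuming the variables are called u1 to u4 as in the question; envir = environment() is passed explicitly so the lookup happens where the variables were defined.

```r
u1 <- 'www.link1.com'
u2 <- 'www.link2.com'
u3 <- 'www.link3.com'
u4 <- 'www.link4.com'

# mget() returns a named list of the objects; unlist() flattens it
# into a character vector (names dropped here to match the loop's result)
vec <- unlist(mget(paste0("u", 1:4), envir = environment()), use.names = FALSE)
vec
```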