For loop in R: How to store all outputs in a loop instead of the last iteration?

Time:02-16

I am a beginner in both programming and this community, so if there is anything I neglected, please forgive me.

I am trying to save the output of a for loop in a data frame, but I only get the output of the last iteration. Here is my original code:

for (i in 1: 4){
  web <- read_html(html$html[i])
  news_title <- web %>% html_nodes('div.m-statement__quote a')
  title <- news_title %>% html_text()
  result_title <- data.frame(title)
}
View(result_title)
dim(result_title)

I am sure that the first three lines of code inside the loop work, because adding a print(title) line prints the titles from every iteration:

for (i in 1: 4){
  web <- read_html(html$html[i])
  news_title <- web %>% html_nodes('div.m-statement__quote a')
  title <- news_title %>% html_text()
  print(title)
}

The code above prints the results of all 4 iterations, but the index numbers in front of the output restart at [1] for each iteration instead of running consecutively. The incomplete data frame I got from the original code contains only the last two lines of output.

[1] "Girl Scouts support \"Planned Parenthood and pro-abortion politicians.\""                                                                                                                                            
 [2] "The Texas abortion law “provides at least six weeks for a person to be able to get an abortion.”"                                                                                                                  
 [3] "“Newly obtained emails show UCSF harvesting the clitorises, testicles and penises of murdered babies.”"                                                                                                            
 [4] "\"There is no law that deals specifically with the term ‘partial-birth abortion.’”"                                                                                                                               
 [5] "“Innocent lives will be saved” by ending taxpayer funding of Planned Parenthood."
 [1] "A provision in the first bill passed by House Democrats would \"allow the billions of dollars of aid we send to other countries to be used for abortions.\""                                                                   
 [2] "\"Right now, women are able to access an abortion in the later stages of their pregnancy under certain conditions with approval of their medical doctors. I’ve done nothing to change that.\""                                
 [3] "A New York law makes it \"now perfectly legal to murder\" a baby a minute before it would be born. "                                                                                                                           
 [4] "Says at his confirmation hearing Brett Kavanaugh said birth control methods could be considered \"abortion-inducing drugs.\""                                                                                                  
 [5] "\"Rauner made you pay for abortions in all nine months of pregnancy.\""   
[1] "Millennials are \"more pro-life than baby boomers and older Americans.\" "                                                                                                                                
 [2] "Girl Scouts USA’s curriculum \"promotes Margaret Sanger, founder of Planned Parenthood, Betty Friedan, founder of NARAL Prochoice, and other pro-abortion women as icons for our children to emulate.\"" 
 [3] "A bill backed by Sean Duffy and other House Republicans \"could actually require the Internal Revenue Service to conduct audits of rape victims\" who get an abortion. "                                  
 [4] "\"Wendy Davis opposes any limits on abortion.\""                                                                                                                                                          
 [5] "\"Every (personhood) bill I’ve ever support has either had language that says we’re conforming to the constitutional rulings of the Supreme Court or something to that effect.\"" 
[1] "Millennials are \"more pro-life than baby boomers and older Americans.\" "                                                                                                                                
 [2] "Girl Scouts USA’s curriculum \"promotes Margaret Sanger, founder of Planned Parenthood, Betty Friedan, founder of NARAL Prochoice, and other pro-abortion women as icons for our children to emulate.\""                

I guess the non-consecutive numbers in front of the output are the reason why I can't store the output of all iterations, but I don't know what to do next. Please enlighten me if you have any ideas on how to fix this. Thank you very much.

CodePudding user response:

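The problem is this line: result_title <- data.frame(title). It creates a new data.frame on every iteration and overwrites the previous one, so only the rows from the last iteration survive. Instead, create the data frame once before the loop and append to it: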

result_title <- data.frame()
for (i in 1: 4){
  web <- read_html(html$html[i])
  news_title <- web %>% html_nodes('div.m-statement__quote a')
  title <- news_title %>% html_text()
  this_result <- data.frame(title)
  result_title <- rbind(result_title, this_result)
}
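
To see the difference without scraping anything, here is a minimal sketch that replaces the read_html step with made-up data. The pages list is hypothetical and only stands in for the titles scraped from each page; the variable names overwritten and accumulated are also just for illustration.

```r
# Hypothetical stand-in for the scraped titles: one character vector per page.
pages <- list(
  c("title 1a", "title 1b"),
  c("title 2a", "title 2b", "title 2c"),
  c("title 3a")
)

# Overwriting: each iteration replaces the previous data frame,
# so only the last page is left after the loop.
for (i in seq_along(pages)) {
  title <- pages[[i]]
  overwritten <- data.frame(title)
}

# Accumulating: each iteration appends its rows to the running result.
accumulated <- data.frame()
for (i in seq_along(pages)) {
  title <- pages[[i]]
  accumulated <- rbind(accumulated, data.frame(title))
}

nrow(overwritten)  # 1 row, from the last page only
nrow(accumulated)  # 6 rows, from all pages
```

The numbers restarting at [1] in the printed output are not the cause of the problem. print() numbers the elements of each vector separately, so every call starts again at [1].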

CodePudding user response:

There is usually no need to create an empty data.frame in advance. A common approach is to use lapply to create a list of data.frames, one per page, and then rbind them together.

this_result <- lapply(1:4, function(i) {
  web <- read_html(html$html[i])
  news_title <- web %>% html_nodes('div.m-statement__quote a')
  title <- news_title %>% html_text()
  data.frame(title)
})

result_title <- do.call(rbind, this_result)
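
If it helps to know which page each title came from, the same pattern can carry an extra column. This is only a sketch under assumptions: pages is a hypothetical list standing in for the scraped titles, and the page column is an addition that is not part of the original code.

```r
# Hypothetical stand-in for the scraped titles: one character vector per page.
pages <- list(
  c("title 1a", "title 1b"),
  c("title 2a", "title 2b", "title 2c"),
  c("title 3a")
)

# Build one data frame per page, tagging each row with its page index.
per_page <- lapply(seq_along(pages), function(i) {
  title <- pages[[i]]
  data.frame(page = rep(i, length(title)), title = title)
})

# Stack the per-page data frames into a single result.
result_title <- do.call(rbind, per_page)

dim(result_title)  # 6 rows, 2 columns
```

Collecting the pieces in a list and binding once at the end also avoids growing the data frame inside a loop, which can get slow when there are many pages.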