rbind error while performing a for-loop: duplicate 'row.names' are not allowed

The code below tries to extract data from an API, but when I paginate and bind the rows, the row index is duplicated, which produces the following error:

**Error in `.rowNamesDF<-`(x, value = value) : duplicate 'row.names' are not allowed**
**In addition: Warning message: non-unique values when setting 'row.names':**

The code is:

library(tibble)

df = tibble()

for (i in seq(from = 0, to = 620, by = 24)) {
  linky = paste0("https://www.rightmove.co.uk/api/_search?locationIdentifier=REGION^94405&numberOfPropertiesPerPage=24&radius=0.0&sortType=2&index=",i,"&includeSSTC=false&viewType=LIST&channel=BUY&areaSizeUnit=sqft&currencyCode=GBP&isFetching=false")
  pge <- jsonlite::fromJSON(linky)
  props <- pge$properties
  print(linky)
  Sys.sleep(runif(1, 2.34, 6.19))
  
  df = rbind(df, tibble(props))
  
  print(paste("Page:", i))  
  
}

HA_area_ <- df

CodePudding user response：

The data frames can't be bound together with rbind because their columns differ from page to page. Below are the column names of the first two data frames.

[[1]]
 [1] "id"                          "bedrooms"                    "bathrooms"                   "numberOfImages"             
 [5] "numberOfFloorplans"          "numberOfVirtualTours"        "summary"                     "displayAddress"             
 [9] "countryCode"                 "location"                    "propertyImages"              "propertySubType"            
[13] "listingUpdate"               "premiumListing"              "featuredProperty"            "price"                      
[17] "customer"                    "distance"                    "transactionType"             "productLabel"               
[21] "commercial"                  "development"                 "residential"                 "students"                   
[25] "auction"                     "feesApply"                   "feesApplyText"               "displaySize"                
[29] "showOnMap"                   "propertyUrl"                 "contactUrl"                  "staticMapUrl"               
[33] "channel"                     "firstVisibleDate"            "keywords"                    "keywordMatchType"           
[37] "saved"                       "hidden"                      "onlineViewingsAvailable"     "lozengeModel"               
[41] "hasBrandPlus"                "propertyTypeFullDescription" "addedOrReduced"              "formattedDistance"          
[45] "heading"                     "enhancedListing"             "displayStatus"               "formattedBranchName"        
[49] "isRecent"                   

[[2]]
 [1] "id"                          "bedrooms"                    "bathrooms"                   "numberOfImages"             
 [5] "numberOfFloorplans"          "numberOfVirtualTours"        "summary"                     "displayAddress"             
 [9] "countryCode"                 "location"                    "propertyImages"              "propertySubType"            
[13] "listingUpdate"               "premiumListing"              "featuredProperty"            "price"                      
[17] "customer"                    "distance"                    "transactionType"             "productLabel"               
[21] "commercial"                  "development"                 "residential"                 "students"                   
[25] "auction"                     "feesApply"                   "feesApplyText"               "displaySize"                
[29] "showOnMap"                   "propertyUrl"                 "contactUrl"                  "staticMapUrl"               
[33] "channel"                     "firstVisibleDate"            "keywords"                    "keywordMatchType"           
[37] "saved"                       "hidden"                      "onlineViewingsAvailable"     "lozengeModel"               
[41] "hasBrandPlus"                "displayStatus"               "formattedBranchName"         "addedOrReduced"             
[45] "isRecent"                    "formattedDistance"           "propertyTypeFullDescription" "enhancedListing"            
[49] "heading"

Both pages return the same 49 column names, but from position 42 onward they appear in a different order.
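The comparison above can be reproduced programmatically. The following is a minimal sketch in base R; `page1` and `page2` are hypothetical toy data frames standing in for the `props` results of two requests, not real API output.

```r
# Toy stand-ins for two pages of results (hypothetical data, not real API output).
page1 <- data.frame(id = 1:2, heading = c("a", "b"), isRecent = c(TRUE, FALSE))
page2 <- data.frame(id = 3:4, isRecent = c(FALSE, TRUE), heading = c("c", "d"))

# Same names in the same positions?
same_order <- identical(names(page1), names(page2))
# Same names regardless of position?
same_set <- setequal(names(page1), names(page2))
# Positions at which the two name vectors disagree.
moved <- which(names(page1) != names(page2))

print(same_order)
print(same_set)
print(moved)
```

If `same_set` is TRUE while `same_order` is FALSE, the pages carry the same columns in a different order, which is the situation shown in the two listings.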

Instead of calling rbind inside the loop, we can use lapply and store each page's result in a list.

We shall create a function f1 that returns the required data frame, and then wrap it in possibly (from the purrr package) to skip any errors.

f1 = function(x){
  linky = paste0("https://www.rightmove.co.uk/api/_search?locationIdentifier=REGION^94405&numberOfPropertiesPerPage=24&radius=0.0&sortType=2&index=",x,"&includeSSTC=false&viewType=LIST&channel=BUY&areaSizeUnit=sqft&currencyCode=GBP&isFetching=false")
  pge <- jsonlite::fromJSON(linky)
  props <- pge$properties
  print(linky)
  Sys.sleep(runif(1, 2.34, 6.19))
  print(paste("Page:", x)) 
  return(props)
}

x = seq(from = 0, to = 620, by = 24)
# possibly() comes from purrr; a page that fails returns NA instead of stopping the run
df = lapply(x, purrr::possibly(f1, NA))
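After this step `df` is a list, and any page that failed is stored as NA rather than a data frame. The sketch below shows one way to drop those entries before combining the rest. It uses only base R so that it runs without extra packages: `fetch_page` is a hypothetical stand-in for `f1`, and `tryCatch()` plays the role that `possibly(f1, NA)` plays above.

```r
# Hypothetical stand-in for f1(): returns a small flat data frame,
# and fails on page 24 to simulate a bad request.
fetch_page <- function(i) {
  if (i == 24) stop("simulated request failure")
  data.frame(id = i + 1:2, page = i)
}

pages <- c(0, 24, 48)

# Errors are turned into NA, mirroring possibly(f1, NA).
results <- lapply(pages, function(i) {
  tryCatch(fetch_page(i), error = function(e) NA)
})

# Keep only the elements that really are data frames, then bind them.
ok <- Filter(is.data.frame, results)
combined <- do.call(rbind, ok)

print(combined)
```

The toy data frames here are flat, so `rbind` is sufficient. For the real API output, a binder that fills missing columns, such as `rbindlist(..., fill = TRUE)` from data.table, is likely the safer choice.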

CodePudding user response：

library(data.table)

dt <- lapply(seq(from = 0, to = 620, by = 24), function(i) {
  uri <- paste0("https://www.rightmove.co.uk/api/_search?locationIdentifier=REGION^94405&numberOfPropertiesPerPage=24&radius=0.0&sortType=2&index=", i,"&includeSSTC=false&viewType=LIST&channel=BUY&areaSizeUnit=sqft&currencyCode=GBP&isFetching=false")
  as.data.table(jsonlite::fromJSON(uri)$properties)
})

# fill = TRUE matches columns by name and pads any that are missing with NA
dt <- rbindlist(dt, fill = TRUE)
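To illustrate what the `fill` argument does, here is a small sketch with two hypothetical tables whose columns are in a different order and only partly overlap. The table contents are made up for illustration.

```r
library(data.table)

# Hypothetical pages: same columns in a different order, plus one extra column.
a <- data.table(id = 1:2, price = c(100, 200))
b <- data.table(price = 300, id = 3L, heading = "x")

# With fill = TRUE, columns are matched by name and missing cells become NA.
combined <- rbindlist(list(a, b), fill = TRUE)

print(combined)
```

The rows from `a` get NA in `heading`, and the `id` and `price` values from `b` land in the correct columns even though `b` lists them in the opposite order.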